content · Independent · ✓ Verified

Simple Eval for Legal Benchmarking

This workflow demonstrates a simple way to run evals on a set of test cases stored in a Google Sheet.

About

This workflow demonstrates a simple way to run evals on a set of test cases stored in a Google Sheet. The example comes from an information extraction dataset, on which we tested 6 different LLMs across 18 test cases. To get started, see our sample data in this spreadsheet. Once the workflow runs on our dataset, you can plug in your own test cases and pair them with different LLMs to see how it works with your own data.

How it works: It loads test cases from Google Sheet
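The evaluation loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow's actual implementation: the column names (id, input, expected), the inline CSV standing in for an exported Google Sheet, the exact-match scoring rule, and the two stand-in "models" are all hypothetical. In practice each model function would call a real LLM.

```python
import csv
import io
import re
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical stand-in for a Google Sheet exported as CSV.
# The real sheet's column names may differ.
SAMPLE_CSV = """id,input,expected
1,Contract dated 2021-03-01 between A and B,2021-03-01
2,Agreement signed on 2020-12-15,2020-12-15
"""


def load_test_cases(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into a list of row dictionaries keyed by header."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.strip().lower().split())


def run_evals(
    test_cases: List[Dict[str, str]],
    models: Dict[str, Callable[[str], str]],
) -> Dict[str, float]:
    """Run every model on every test case and return accuracy per model."""
    correct = defaultdict(int)
    for case in test_cases:
        for name, model in models.items():
            answer = model(case["input"])
            # Assumed scoring rule: exact match after normalization.
            if normalize(answer) == normalize(case["expected"]):
                correct[name] += 1
    total = len(test_cases)
    return {name: (correct[name] / total if total else 0.0) for name in models}


# Stand-in "models" for illustration; a real setup would call LLM APIs here.
def regex_model(text: str) -> str:
    match = re.search(r"\d{4}-\d{2}-\d{2}", text)
    return match.group(0) if match else ""


def always_wrong_model(text: str) -> str:
    return "unknown"


if __name__ == "__main__":
    cases = load_test_cases(SAMPLE_CSV)
    scores = run_evals(cases, {"regex": regex_model, "bad": always_wrong_model})
    for name, accuracy in scores.items():
        print(f"{name}: {accuracy:.0%}")
```

Exact match is the simplest possible scorer. For free-text outputs, a looser comparison or an LLM-based judge may be more appropriate, depending on the task.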

Pricing

Free


Marketplace

Independent

Category

content
