Author:
Alex Shaw, Mike Merrill
Summary:
Terminal-bench is a collection of tasks and an evaluation harness for measuring AI agents' ability to complete complex tasks in terminal environments.
Latest version:
0.2.18
Required dependencies:
anthropic | asciinema | boto3 | docker | inquirer | jinja2 | litellm | mcp | openai | pandas | psycopg2-binary | pydantic | ruamel-yaml | sqlalchemy | streamlit | supabase | tabulate | tenacity | tqdm | typer
Downloads last day:
1,534
Downloads last week:
5,380
Downloads last month:
31,369