Test fixture
Latency-sensitive tasks where concise correct output matters.
The model receives the prompt (and optional system message). The run uses scorer contains_all with the JSON configuration below. Pass/fail and partial credit are determined entirely by that scorer against the model output; no human grading.
You are documenting an internal REST API for a Task service. Produce a detailed Markdown specification with substantial technical depth (multiple sections, many paragraphs, and multiple fenced JSON examples).
Use exactly these Markdown section headings in this order:
## Overview
## Authentication
## Endpoints
## Error model
## Pagination
Under ## Endpoints include at minimum all of: GET /tasks, POST /tasks, GET /tasks/{id}, PATCH /tasks/{id}, DELETE /tasks/{id}. For each endpoint describe typical request and response JSON shapes using fenced code blocks.
Write in English. Do not stop early; expand each section thoroughly.{
"expected_contains": [
"## Overview",
"## Endpoints",
"GET /tasks",
"POST /tasks",
"PATCH /tasks",
"```"
]
}temperature
0
max_tokens
2048
timeout (s)
300
type
throughput
file
speed_perf_throughput_api_design.json