Test fixture

Speed Perf Throughput Api Design

Speed · v1 · Nutrition · hard · scorer: contains_all

Throughput and TTFT-focused generation tasks.

How it is scored

The model receives the prompt (and an optional system message). The run uses the contains_all scorer with the JSON configuration below. Pass/fail and partial credit are determined entirely by that scorer applied to the model output; there is no human grading.

User prompt

You are documenting an internal REST API for a Task service. Produce a detailed Markdown specification with substantial technical depth (multiple sections, many paragraphs, and multiple fenced JSON examples).

Use exactly these Markdown section headings in this order:

## Overview
## Authentication
## Endpoints
## Error model
## Pagination

Under ## Endpoints include at minimum all of: GET /tasks, POST /tasks, GET /tasks/{id}, PATCH /tasks/{id}, DELETE /tasks/{id}. For each endpoint describe typical request and response JSON shapes using fenced code blocks.

Write in English. Do not stop early; expand each section thoroughly.

Scorer config

{
  "expected_contains": [
    "## Overview",
    "## Endpoints",
    "GET /tasks",
    "POST /tasks",
    "PATCH /tasks",
    "```"
  ]
}
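To make the scoring rule concrete, here is a minimal Python sketch of what a contains_all check over this configuration might look like. The function name, the returned fields, the case-sensitive substring matching, and the partial-credit formula (fraction of expected strings found) are all assumptions for illustration; the actual scorer implementation is not shown in this fixture.

```python
FENCE = "`" * 3  # a literal Markdown code fence, built indirectly for readability

# Mirrors the scorer config above.
CONFIG = {
    "expected_contains": [
        "## Overview",
        "## Endpoints",
        "GET /tasks",
        "POST /tasks",
        "PATCH /tasks",
        FENCE,
    ]
}


def contains_all(output: str, config: dict) -> dict:
    """Hypothetical scorer: check that every expected substring occurs in the output."""
    expected = config["expected_contains"]
    # Assumption: plain, case-sensitive substring matching.
    missing = [s for s in expected if s not in output]
    matched = len(expected) - len(missing)
    # Assumption: partial credit is the fraction of expected strings found.
    score = matched / len(expected) if expected else 1.0
    return {"passed": not missing, "score": score, "missing": missing}


if __name__ == "__main__":
    sample = "## Overview\n## Endpoints\nGET /tasks\nPOST /tasks\n"
    print(contains_all(sample, CONFIG))
```

Under these assumptions, an output containing "PATCH /tasks/{id}" would satisfy the "PATCH /tasks" entry, since only a substring match is required.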

Run parameters

temperature:
max_tokens: 2048
timeout (s): 300
type: throughput
file: speed_perf_throughput_api_design.json
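
Because this is a throughput-type run in a TTFT-focused category, the sketch below shows one common way the two metrics could be derived from token arrival timestamps. The `StreamTiming` type and both function names are hypothetical, and the definitions used (time to first token measured from request send, throughput as output tokens divided by total elapsed time) are assumptions; the fixture does not state how the harness actually measures them.

```python
from dataclasses import dataclass


@dataclass
class StreamTiming:
    """Hypothetical record of one streamed generation (times in seconds)."""
    request_sent: float
    token_times: list[float]  # arrival time of each output token, in order


def time_to_first_token(t: StreamTiming) -> float:
    """Seconds between sending the request and receiving the first token."""
    if not t.token_times:
        raise ValueError("no tokens were received")
    return t.token_times[0] - t.request_sent


def tokens_per_second(t: StreamTiming) -> float:
    """Output tokens divided by total elapsed time, including the wait for the first token."""
    if not t.token_times:
        return 0.0
    elapsed = t.token_times[-1] - t.request_sent
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return len(t.token_times) / elapsed


if __name__ == "__main__":
    timing = StreamTiming(request_sent=0.0, token_times=[0.5, 1.0, 1.5, 2.0])
    print(time_to_first_token(timing), tokens_per_second(timing))
```

Some harnesses instead exclude the first-token wait from the throughput denominator; which convention applies here is not specified.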
