Test fixture
Roblox OpenGameEval tasks. Special category: visible in breakdowns, excluded from Overall.
The model receives the prompt (and an optional system message). The run uses the scorer roblox_open_eval with the JSON configuration below. Pass/fail and partial credit are determined entirely by that scorer against the model output; there is no human grading.
[Roblox OpenGameEval] Add an enclosed cave to the map. The cave should have a ceiling, floor, and walls forming a room-like structure that a player can walk into. Don't do it on terrain.
{
"input_script": "[embedded in CLI runner]",
"upstream_path": "Evals/094_village_add_cave.lua",
"upstream_sha256": "3c345a41f408b922df849cb6dd4043576aa6a66c4efdd1b4d64c3db127c750de",
"scenario_name": "094_village_add_cave",
"place": "village.rbxl",
"eval_kind": "codegen"
}
temperature: 0
max_tokens: 1
timeout (s): 900
type: scored
file: 094_village_add_cave.json
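
The upstream_sha256 field in the JSON configuration above suggests the upstream eval script is pinned by content hash. As a minimal sketch, assuming the field holds the hex SHA-256 digest of the file at upstream_path (the fixture does not state this explicitly), a local copy could be checked as follows. The helper names sha256_hex and matches_config are hypothetical and are not part of the roblox_open_eval scorer.

```python
import hashlib
import json

# Scorer configuration copied from the fixture above.
CONFIG = json.loads("""
{
  "input_script": "[embedded in CLI runner]",
  "upstream_path": "Evals/094_village_add_cave.lua",
  "upstream_sha256": "3c345a41f408b922df849cb6dd4043576aa6a66c4efdd1b4d64c3db127c750de",
  "scenario_name": "094_village_add_cave",
  "place": "village.rbxl",
  "eval_kind": "codegen"
}
""")


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def matches_config(config: dict, script_bytes: bytes) -> bool:
    """Hypothetical check: does the script content match the pinned hash?

    Assumes upstream_sha256 is the digest of the raw file bytes.
    """
    return sha256_hex(script_bytes) == config["upstream_sha256"].lower()
```

A caller would read the local copy of the file named by upstream_path in binary mode and pass its bytes to matches_config(CONFIG, data); a False result would indicate that the local script differs from the pinned upstream version, under the assumption stated above.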