
OpenAI Model Craft Challenge: Parameter Golf

A frontier model-engineering challenge: train the strongest language model that fits in a 16 MB artifact and trains in under 10 minutes on 8×H100s. Submissions are evaluated by compression performance on the FineWeb validation set, measured tokenizer-agnostically in bits per byte.
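Bits per byte follows directly from a model's summed negative log-likelihood over raw text: normalizing by bytes rather than tokens is what makes the metric tokenizer-agnostic. A minimal sketch (the function name and interface are my own, not the official harness):

```python
import math

def bits_per_byte(total_nll_nats: float, num_bytes: int) -> float:
    """Convert a summed negative log-likelihood (in nats) over a text
    span into bits per byte of the underlying UTF-8 text.

    Dividing by raw byte count, not token count, makes the score
    comparable across models with different tokenizers.
    """
    return total_nll_nats / (num_bytes * math.log(2))
```

For instance, a model assigning ln 2 total nats to a 1-byte string scores exactly 1.0 bits per byte.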

Overview

Parameter Golf is inspired by NanoGPT Speedrunning, but shifts the optimization target to an explicitly parameter-constrained regime. The objective is to discover architectures and training methods that maximize capability under strict size and runtime limits.

You can think of this as an L(N)-style optimization problem: achieve the lowest possible loss at a fixed parameter budget, with no restrictions on architectural creativity.
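The L(N) framing can be made concrete with a Chinchilla-style parametric fit. The coefficients below are the published Chinchilla estimates, used purely for illustration; they are not numbers from this challenge:

```python
def scaling_law_loss(n_params: float, E: float = 1.69,
                     A: float = 406.4, alpha: float = 0.34) -> float:
    """Chinchilla-style data-unconstrained loss fit: L(N) = E + A / N^alpha.

    E is the irreducible loss; A and alpha govern how quickly loss
    falls as the parameter count N grows.
    """
    return E + A / n_params ** alpha
```

With N pinned by the artifact budget, improving L(N) means finding methods that effectively lower A and raise alpha rather than scaling N.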

What Makes It Interesting

Research Surface Area

  • Test-time compute and depth recurrence
  • Aggressive parameter tying and low-rank training
  • Quantization, QAT, bit-level model formats, tokenizer innovation
  • Long-context evaluation and system-level kernel optimizations
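As one concrete instance of the low-rank direction above: factorizing a dense d_in × d_out weight into two rank-r matrices cuts its parameter count from d_in·d_out to r·(d_in + d_out). A sketch with illustrative names:

```python
def low_rank_param_count(d_in: int, d_out: int, rank: int):
    """Parameter counts for a dense layer W versus its rank-r
    factorization W ≈ U @ V, with U of shape (d_in, rank) and
    V of shape (rank, d_out)."""
    dense = d_in * d_out
    factored = d_in * rank + rank * d_out
    return dense, factored
```

Hypothetically, a 768×768 projection factored at rank 64 drops from 589,824 to 98,304 parameters, a 6× reduction.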

Two Tracks of Exploration

  • Record submissions: must satisfy the official 10-minute / 8×H100 bound.
  • Non-record submissions: unlimited-compute explorations are still welcome as a source of ideas and breakthroughs.

Core Constraints & Rules

  • Total artifact budget: 16,000,000 bytes (decimal MB), including code + compressed model.
  • Evaluation and training must be self-contained; no external downloads or network calls during eval.
  • Leaderboard records should beat SOTA by at least 0.005 nats with strong statistical evidence.
  • Validation data cannot be leaked into training beyond allowed test-time training rules.
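Since the 16,000,000-byte cap is in decimal bytes, a pre-submission size check is trivial. This helper is my own sketch, not an official tool:

```python
import os

ARTIFACT_BUDGET_BYTES = 16_000_000  # decimal bytes, per the rules

def check_artifact(path: str) -> bool:
    """Return True iff the packaged submission (code + compressed
    model) fits within the 16,000,000-byte artifact budget."""
    size = os.path.getsize(path)
    print(f"{path}: {size:,} bytes "
          f"({size / ARTIFACT_BUDGET_BYTES:.1%} of budget)")
    return size <= ARTIFACT_BUDGET_BYTES
```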

Leaderboard Snapshot

  • 1.1194 · LeakyReLU² + Score-First TTT + Parallel Muon (abaybektursun, 2026-03-23)
  • 1.1228 · 11L EMA + GPTQ-lite + warmdown3500 (signalrush, 2026-03-22)
  • 1.1248 · 11L Partial RoPE + LN Scale + EMA + XSA4 (jfprincz, 2026-03-21)

Getting Started

The official workflow supports both local iteration (e.g., Apple Silicon / MLX) and cloud GPU scaling (e.g., RunPod H100 pods). Typical setup includes cloning the repo, creating a fresh Python environment, downloading cached FineWeb shards, and launching baseline training with torchrun or MLX scripts.

OpenAI is also sponsoring $1,000,000 in compute credits to help participants bootstrap experiments through a compute grant process.

Timeline & Participation

  • Challenge window: March 18 – April 30.
  • Participant form is optional, but useful for attribution and OpenAI outreach.
  • Top technical submissions may stand out to researchers and recruiters.