# rinha4-back-end-dotnet
.NET 10 NativeAOT implementation for Rinha de Backend 2026.
The current build is optimized for latency first:
- raw socket HTTP/1 server
- Unix Domain Sockets behind the standalone `rinha4-lb-yolo-mode` proxy
- manual JSON request parsing
- prebuilt HTTP responses
- rounded int16 IVF2 fraud classifier
- archived official-like k6 results after each main build
- optional one-core CI contention probe for mismatch diagnosis
The project target is explicit: lead the .NET entries, hold the score at 6000, and keep failures at 0.
## Current signal
Latest CI benchmark history lives at /reports/.
The home page reads the latest official Rinha issue result from
docs/public/official/latest.json and the latest CI candidate result from
docs/public/reports/latest-candidate.json.
CI results are useful for regression tracking. They are not official Rinha
hardware results. Candidate CI runs keep the canonical docker-compose.yml
standalone-yolo layout; manual stress runs can pin all service containers to one
host CPU when diagnosing official-preview mismatch.
## Active lane
Transport is currently stable in CI. The active lane is the IVF approximate-nearest-neighbor index built from the allowed reference dataset and loaded at startup.
## Repository map
| Path | Purpose |
|---|---|
| `src/WebApi` | NativeAOT fraud-score server |
| `src/DataConverter` | Converts official reference data into `references.ivf.bin` |
| `data` | Allowed challenge datasets and normalization files copied into the API image |
| `tests` | Focused validation tests |
| `scripts` | Benchmark and report archive automation |
| `docs/public/reports` | Versioned benchmark JSON history |
## Challenge
Rinha de Backend 2026 scores implementations by latency, correctness, and request survival under the official k6 workload.
Required endpoints:
| Method | Path | Role |
|---|---|---|
| GET | `/ready` | readiness probe |
| POST | `/fraud-score` | fraud decision |
Default topology:
- reverse proxy on `9999`
- two API instances
- total container budget: 1.00 CPU / 350 MB
- no privileged container
- runnable through Docker Compose
Ranking pressure:
- lower p99 improves p99 score
- 0% failures preserves detection score
- HTTP errors destroy score quickly
This repository currently keeps transport errors at 0 in CI-like runs. The
candidate IVF2 path also replays the public payload with 0 false positives and
0 false negatives locally. Remaining ranking work is p99 reduction without
giving back correctness.
## Architecture
k6 / judge
|
v
rinha4-lb-yolo-mode :9999
|
+-- unix:/sockets/api1.sock -> WebApi NativeAOT
|
+-- unix:/sockets/api2.sock -> WebApi NativeAOT
## Request path
- The standalone yolo load balancer accepts TCP on port `9999` in proxy mode.
- It forwards bytes to API instances over Unix Domain Sockets.
- The compose file pins `webapi1` to cpuset `0`, `webapi2` to `1`, and `lb` to `2,3`. CPU quotas still total 1.00; cpuset reduces scheduler contention on the official host.
- `RawHttpServer` accepts the socket connection.
- `HttpWire` parses the method, path, headers, and `Content-Length`.
- `FraudRequestParser` reads only the required JSON fields.
- `FraudScorer` builds a normalized 14-dimensional vector.
- `FraudScorer` maps the vector through the IVF classifier.
- `HttpResponses` writes a prebuilt HTTP/JSON response.
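The parse-then-respond shape described above can be sketched in Python. This is an illustration only: the real hot path is C# NativeAOT, and `parse_request`, `serve_connection`, and the response body here are assumed names, not the repo's API.

```python
import socket

# Prebuilt response bytes, mirroring the "prebuilt HTTP responses" choice.
# The body content is hypothetical.
OK_BODY = b'{"decision":"approve"}'
OK_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: " + str(len(OK_BODY)).encode() + b"\r\n\r\n" + OK_BODY
)

def parse_request(buf: bytes):
    """Parse one HTTP/1 request from a byte buffer.

    Returns (method, path, body, remaining_bytes), or None while the
    headers or Content-Length body are still incomplete."""
    if b"\r\n\r\n" not in buf:
        return None                      # headers not complete yet
    head, _, rest = buf.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    method, path, _version = lines[0].split(b" ", 2)
    length = 0
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value)
    if len(rest) < length:
        return None                      # body not complete yet
    return method, path, rest[:length], rest[length:]

def serve_connection(conn: socket.socket) -> None:
    """One task per connection: buffer bytes, answer each parsed request
    with the prebuilt response, never re-serializing per request."""
    buf = b""
    while chunk := conn.recv(4096):
        buf += chunk
        while (req := parse_request(buf)) is not None:
            _method, _path, _body, buf = req
            conn.sendall(OK_RESPONSE)
```

The key property is that a partial read returns `None` and leaves the buffer untouched, so slow clients never corrupt framing.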
## Data pipeline
src/DataConverter converts data/references.json.gz into data/references.ivf.bin during image build.
The references.ivf.bin file stores:
- `IVF2` magic for the candidate default
- trained int16 centroids in dimension-major layout
- per-cluster int16 bounding boxes in dimension-major layout
- packed int16 vector blocks
- labels and original ids for deterministic top-five tie-breaking
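The exact binary layout is internal to `DataConverter`. Purely as an illustration of a magic-prefixed, dimension-major int16 block like the one listed above, a hypothetical writer/reader could look like this (field order, counts, and names are all assumptions):

```python
import struct
from array import array

MAGIC = b"IVF2"

def write_centroid_block(buf, centroids):
    """Hypothetical layout: magic, cluster count, dimension, then centroids
    stored dimension-major (all values of dim 0, then all of dim 1, ...)."""
    clusters, dim = len(centroids), len(centroids[0])
    buf.write(MAGIC)
    buf.write(struct.pack("<ii", clusters, dim))
    for d in range(dim):
        buf.write(array("h", (c[d] for c in centroids)).tobytes())

def read_centroid_block(buf):
    """Reject a bad magic loudly, matching the fail-on-invalid startup rule."""
    if buf.read(4) != MAGIC:
        raise ValueError("not an IVF2 index; startup should abort")
    clusters, dim = struct.unpack("<ii", buf.read(8))
    planes = []
    for _ in range(dim):
        plane = array("h")
        plane.frombytes(buf.read(2 * clusters))
        planes.append(plane)
    # transpose back to one row per centroid
    return [[planes[d][c] for d in range(dim)] for c in range(clusters)]
```

Dimension-major storage keeps each dimension's values contiguous, which is what makes per-dimension scans (for example bbox checks) cache-friendly.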
## Classifier
Default and only runtime mode uses IVF.
Startup loads the IVF index and runs nearest-cluster search. Current settings
target nprobe=1, one-pass full bbox repair, and rounded int16 squared L2
ranking. IVF2 uses int64 accumulation for accuracy. If the IVF file is missing
or invalid, startup fails.
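A minimal sketch of that search shape, with plain Python lists standing in for the int16 arrays (Python's arbitrary-precision ints play the role of the int64 accumulators; all function names are illustrative):

```python
def squared_l2(a, b):
    """Squared L2 over int16-scaled components with a wide accumulator.

    A single int16 difference squared can reach ~4.3e9, past int32 range,
    which is why the text notes IVF2 accumulates in int64."""
    acc = 0
    for x, y in zip(a, b):
        d = x - y
        acc += d * d
    return acc

def nearest_cluster(query, centroids):
    """nprobe=1: probe only the single nearest centroid's cluster."""
    return min(range(len(centroids)), key=lambda i: squared_l2(query, centroids[i]))

def top5(query, vectors, ids):
    """Rank a probed cluster by squared L2, breaking ties on original id
    for the deterministic top-five the index stores labels for."""
    order = sorted(range(len(vectors)),
                   key=lambda i: (squared_l2(query, vectors[i]), ids[i]))
    return [ids[i] for i in order[:5]]
```

With `nprobe=1` only one cluster is scanned up front; the bbox repair pass described later widens the search only when another cluster could still hold a closer vector.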
Runtime implementation is split into focused partial files:
- `IvfIndex.cs`: binary loading, validation, immutable arrays, and search dispatch
- `IvfIndex.Int64.cs`: IVF2 candidate path for `IVF_SCALE=10000`
## Startup readiness
Each API process recreates its Unix socket file on startup in the shared
sockets tmpfs volume. The standalone LB consumes /sockets/api1.sock and
/sockets/api2.sock and keeps the proxy layer byte-oriented.
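The recreate-on-startup step can be sketched as follows, with Python standing in for the C# listener (the function name is illustrative):

```python
import os
import socket

def bind_unix_socket(path: str) -> socket.socket:
    """Recreate the socket file on startup: drop any stale file left in the
    shared sockets volume by a previous process, then bind and listen."""
    try:
        os.unlink(path)          # stale /sockets/apiN.sock from a prior run
    except FileNotFoundError:
        pass                     # first start: nothing to remove
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(128)
    return srv
```

Unlinking first matters because `bind` on an existing Unix socket path fails with `EADDRINUSE`; a restarted container would otherwise never come up.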
## Rules
Allowed in this repo:
- preprocess `references.json.gz`
- preprocess `mcc_risk.json`
- preprocess `normalization.json`
- use any classifier built from allowed reference data
- build ANN or IVF indexes from `references.json.gz`
- run the public official k6 script in CI
- compare against the public ranking preview
Not allowed:
- using official test payloads as reference data
- hardcoding expected answers from preview runs
- building correction tables from misclassified test payloads
- letting the reverse proxy inspect fraud payloads or answer `/fraud-score`
The CI benchmark only mounts official test data into the k6 container. API containers do not receive test payload files.
The IVF scorer follows the same boundary: it trains and packs only
references.json.gz, reads no benchmark payload files, and fails startup when
the index is unavailable.
## Getting Started
Build the app:
dotnet build src/WebApi/WebApi.csproj --no-restore
Run local stack:
docker compose up --build
curl -i http://localhost:9999/ready
Tune IVF image-build parameters with IVF_CLUSTERS, IVF_TRAIN_SAMPLE,
IVF_ITERATIONS, and IVF_SCALE when testing alternatives. Runtime repair controls are
IVF_FAST_NPROBE, IVF_FULL_NPROBE, IVF_BBOX_REPAIR,
IVF_REPAIR_MIN_FRAUDS, and IVF_REPAIR_MAX_FRAUDS.
Generate IVF data without Docker:
dotnet run --project src/DataConverter/DataConverter.csproj -- data/
Run focused tests:
dotnet run --project tests/VectorizationTests/VectorizationTests.csproj --no-restore
Run official-like benchmark locally:
bash scripts/ci-official-benchmark.sh
Run docs locally:
cd docs
bun install
bun run dev

## Performance
Hot path choices:
- NativeAOT publish
- raw socket HTTP/1
- one task per client connection
- pooled read buffers
- Unix Domain Sockets behind the standalone yolo proxy
- manual request parsing
- no model binding
- prebuilt response bytes
- rounded int16 IVF nearest-neighbor ranking
- no fraud-payload parsing in the proxy layer
## Current bottleneck
Transport is fast enough for the current target. Recent yolo-LB CI runs have
shown 0 HTTP errors; p99 work is now inside IVF repair, vector scan cost, and
CPU split between the API containers and the standalone proxy.
The latest validated main build before this cleanup used image
ci-ecdcc3f1b0059842489ae32102763ac957cc2a36 and produced p99 0.40ms,
score 6000, 0 false positives, 0 false negatives, and 0 HTTP errors in
the automatic benchmark lane. A same-matrix comparison with that image was also
correct but narrowly trailed Danilo in that run (0.39ms vs 0.37ms).
## Accuracy experiments
Earlier non-candidate classifier paths were removed from production. Rounded IVF2 is the only runtime classifier now.
The current production lane is IVF approximate nearest-neighbor search:
- build centroids and compact vector blocks from `references.json.gz`
- load `references.ivf.bin` at startup
- scan the nearest cluster first with `IVF_FAST_NPROBE=1`
- use scalar bbox repair with early exit to scan only clusters whose bounding box can still beat the current top-five bound
- skip repair for first-cluster `0/5` approval and `5/5` denial candidates below tuned distance bounds
- rank candidates with rounded int16 squared L2 distance
- use one-pass full bbox repair for the accuracy candidate
This path is implemented, unit-tested on a synthetic boundary case, and under CI benchmarking as the submission default.
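The bbox early exit described above amounts to a lower-bound prune: compute the minimum distance any member of a cluster could have from the cluster's per-dimension box, and skip the cluster when even that bound cannot beat the current fifth-best distance. A sketch (hypothetical function names, distances in the same squared-L2 units):

```python
def bbox_lower_bound(query, lo, hi):
    """Smallest possible squared L2 from `query` to any point inside the
    per-dimension bounding box [lo, hi]."""
    acc = 0
    for q, l, h in zip(query, lo, hi):
        # distance to the box along this dimension: 0 if inside the range
        d = l - q if q < l else (q - h if q > h else 0)
        acc += d * d
    return acc

def clusters_to_repair(query, boxes, fifth_best):
    """Early exit: keep only clusters whose box could still beat the current
    fifth-best distance; everything else is skipped without a vector scan."""
    return [cid for cid, (lo, hi) in enumerate(boxes)
            if bbox_lower_bound(query, lo, hi) < fifth_best]
```

Because the bound is exact per dimension, pruning is safe: a skipped cluster provably cannot change the top five.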
Rejected A/Bs: AVX2 bbox repair raised p99 to 5.37ms; a cluster-major bbox
copy raised p99 to 6.89ms; 4096 clusters raised p99 to 16.69ms; 1024
clusters raised p99 to 19.78ms; removed experiments either missed labels or
lost to the current standalone-yolo path.
## Reverse proxy
The retained load balancer path is the standalone rinha4-lb-yolo-mode image in
LB_MODE=proxy. It keeps the proxy byte-oriented on port 9999 and forwards to
the API containers over Unix Domain Sockets.
The benchmark workflow runs the canonical root docker-compose.yml used by the
submission. The compose file allocates 0.42 CPU / 160 MB to each API container
and 0.16 CPU / 30 MB to the proxy while keeping the total at 1.00 CPU / 350 MB.
## CI/CD Pipeline
Main build flow:
- Build amd64 Docker image.
- Push the immutable `ci-${GITHUB_SHA}` tag to GHCR.
- Start Docker Compose with that exact image.
- Clone the official Rinha 2026 repo.
- Run the public `test/test.js` through k6.
- Upload raw benchmark artifacts.
- Archive summarized JSON into `docs/public/reports`.
- GitHub Pages deploys the docs site.
The automatic main-branch benchmark runs against the immutable image tag built in
the same workflow, not a locally rebuilt image. The canonical submission/runtime
shape is root docker-compose.yml: webapi1 on cpuset 0, webapi2 on 1,
and standalone lb on 2,3, while Docker resource limits remain active. Manual
runs can add a one-core overlay when diagnosing official mismatch, but that
stress mode is stricter than the candidate tracking run.
The build workflow also archives an official-calibrated run after the normal
candidate run. That lane can override service CPU quotas to screen splits such as
api=0.40 and proxy=0.20. It is a prediction/screening signal only; the
candidate/submission compose remains the source for official testing.
Manual Official-like Benchmark runs can archive experiment reports too. For
IVF, dispatch with report_kind=experiment, IVF_FAST_NPROBE=1,
IVF_FULL_NPROBE=1, bbox repair on, IVF_BOUNDARY_FULL=false, repair fraud
range 0..5, and the IVF_SCALE value under test.
Manual contention knobs:
- `benchmark_stack_cpuset=0`: pin the standalone LB and WebApi containers to one host CPU.
- `benchmark_k6_cpuset=0`: also pin k6 to that CPU. Use only when diagnosing host contention; it is intentionally harsher than normal candidate tracking.
- `benchmark_api_cpus` and `benchmark_proxy_cpus`: override service CPU quotas for calibrated or split-screening runs, for example `0.40` and `0.20`.
- `benchmark_repetitions`: run k6 multiple times and archive the median-p99 result, with raw repetition files uploaded as artifacts.
## Report files
| File | Purpose |
|---|---|
| `latest.json` | latest benchmark result |
| `latest-candidate.json` | latest default submission-stack result |
| `latest-calibrated.json` | latest official-calibrated prediction run |
| `latest-experiment.json` | latest non-default experiment result |
| `index.json` | sorted benchmark history |
| `rinha-benchmark-*.json` | immutable benchmark records |
| `rinha-benchmark-*.html` | k6 HTML reports when generated |
Uploaded workflow artifacts also include docker-state-*.txt with Docker
limits, cpuset, memory, and cgroup counters captured before and after k6. Use
those files to confirm which cpuset mode the run used.
The report archive commit is docs-only. The build workflow ignores docs/**, so
report commits do not trigger a new benchmark loop.
When benchmark reports change, the build workflow triggers the Pages workflow so
/reports/ refreshes without manual action. The manual benchmark workflow does
the same refresh after archiving a report.