Documentation Hub

Python 3.14 implementation for the Rinha de Backend 2024/Q1 challenge. This site documents how the fictional bank API keeps the HTTP layer thin, pushes balance consistency into PostgreSQL stored procedures, and keeps k6 evidence tied to committed report pages and release-workflow artifacts.

source-backed map

What to read, and why it matters

The public docs are organized around the real runtime: two Flask/Gunicorn API containers behind NGINX, one PostgreSQL 16.7 database, and historical k6 reports published from docs/public/reports.

Runtime: Python 3.14 (Flask 3.1 + Gunicorn 25.3)

Concurrency: 2 APIs (4 workers × 2 threads each)

State: PostgreSQL (atomic stored procedures)

Envelope: 1.5 CPU / 550MB (challenge-wide budget)

Fast Paths

01 · contract: Understand the challenge API. Endpoints, request rules, status codes, client IDs, and the resource budget.
02 · runtime: Trace a request through the stack. NGINX least_conn, Flask/Gunicorn workers, connection pooling, and database procedures.
03 · local loop: Run the stack locally. Compose commands, smoke requests, k6 execution, and troubleshooting checkpoints.
04 · evidence: Read the benchmark trail. How to interpret k6 reports and connect them back to architecture choices.

Wiki Pages

Page	Best for	Source of truth
Challenge	API contract and constraints	Original Rinha spec + `src/WebApi/app.py`
Architecture	Runtime topology and consistency model	`docker-compose.yml`, `nginx.conf`, SQL init scripts
Getting Started	Local execution and smoke checks	Docker Compose stack
Performance	Benchmark interpretation and report links	`docs/public/reports`, k6 artifact uploads
CI/CD Pipeline	Release, checks, Pages deployment	`.github/workflows/*`

Implementation Signals

The HTTP layer validates shape, client IDs, and request payloads before handing work to PostgreSQL.
InsertTransacao() performs balance updates and limit checks atomically with row-level locking.
GetSaldoClienteById() returns statement-ready JSON so the API can avoid heavy response shaping.
NGINX uses least_conn to distribute load between two identical API instances.
Release automation publishes GHCR images, runs container checks, executes k6, uploads the latest stress-test artifact, and deploys this documentation to GitHub Pages.

External Links

GitHub · Jonathan Peris

Architecture

Tech Stack

Technology	Version	Purpose
Python	3.14	Language runtime
Flask	3.1.3	Lightweight HTTP routing and JSON responses
Gunicorn	25.3.0	WSGI HTTP server (`4` workers, `2` threads)
psycopg2-binary	2.9.12	PostgreSQL adapter and connection pool
PostgreSQL	16.7	Database with stored procedures and row-level locking
NGINX	1.27	Reverse proxy / load balancer (`least_conn`)
Docker	-	Containerization (`python:3.14-slim` base)
k6	-	Load / stress testing

Runtime Topology

client / k6
   │
   ▼
NGINX (:9999, least_conn, 0.2 CPU, 20MB)
├── webapi1-python (:8080, 0.4 CPU, 100MB) — Gunicorn 4w × 2t
├── webapi2-python (:8080, 0.4 CPU, 100MB) — Gunicorn 4w × 2t
└── PostgreSQL 16.7 (0.5 CPU, 330MB)
    ├── InsertTransacao() — atomic balance update + limit validation
    └── GetSaldoClienteById() — statement projection with JSONB
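
The least_conn choice at the top of this topology can be pictured with a small in-memory model. This is only a sketch of the selection rule, not NGINX's implementation: the `Backend` class and `pick_least_conn` function are hypothetical names, and ties are resolved by taking the first candidate, whereas NGINX falls back to weighted round-robin among equally loaded servers.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical stand-in for one upstream API container."""
    name: str
    active: int = 0  # connections currently in flight


def pick_least_conn(backends):
    """Return the backend with the fewest active connections.

    min() keeps the first minimum on ties; this is a simplification.
    """
    return min(backends, key=lambda b: b.active)


upstreams = [Backend("webapi1-python"), Backend("webapi2-python")]

first = pick_least_conn(upstreams)
first.active += 1    # request 1 is now in flight on the first instance
second = pick_least_conn(upstreams)
second.active += 1   # request 2 lands on the instance that was still idle
```

With one request already in flight on the first instance, the second request goes to the idle one, which is the behavior the two-replica layout relies on.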

Request Flow

NGINX receives traffic on :9999 and routes to the API instance with the fewest active connections.
Flask validates the route, client ID, JSON payload shape, transaction type, and description length.
psycopg2 checks out a connection from a SimpleConnectionPool(1, 10) inside the API process.
PostgreSQL stored procedures perform the balance mutation or statement projection.
The API commits successful transactions, rolls back failures, returns the compact response, and releases the connection.
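
The checkout, commit-or-rollback, and release steps above can be sketched as a context manager. This is a minimal illustration under stated assumptions: `pooled_connection`, `FakePool`, and `FakeConnection` are hypothetical names, and the fakes only mimic the `getconn`/`putconn` pair that psycopg2 pools expose, so the example runs without a database.

```python
from contextlib import contextmanager


@contextmanager
def pooled_connection(pool):
    """Check out a connection, commit on success, roll back on error,
    and always hand the connection back to the pool."""
    conn = pool.getconn()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        pool.putconn(conn)


class FakeConnection:
    """Hypothetical stand-in that records what happened to it."""
    def __init__(self):
        self.events = []

    def commit(self):
        self.events.append("commit")

    def rollback(self):
        self.events.append("rollback")


class FakePool:
    """Hypothetical stand-in exposing getconn/putconn."""
    def __init__(self):
        self.conn = FakeConnection()
        self.checked_out = 0

    def getconn(self):
        self.checked_out += 1
        return self.conn

    def putconn(self, conn):
        self.checked_out -= 1
        conn.events.append("released")


pool = FakePool()

with pooled_connection(pool) as conn:
    conn.events.append("call procedure")   # success path

try:
    with pooled_connection(pool) as conn:
        raise ValueError("rejected")        # failure path
except ValueError:
    pass
```

The `finally` clause is what guarantees the release step on both paths, so a rejected transaction cannot leak a pooled connection.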

Services and Resource Envelope

Service	Role	CPU	RAM
`webapi1-python`	Python API instance (Gunicorn `4w × 2t`)	`0.4`	`100MB`
`webapi2-python`	Python API instance (Gunicorn `4w × 2t`)	`0.4`	`100MB`
`nginx`	Reverse proxy / load balancer (`least_conn`)	`0.2`	`20MB`
`db`	PostgreSQL 16.7 with stored procedures	`0.5`	`330MB`
`k6`	Load testing runner	not counted	not counted
`grafana`, `influxdb`, `prometheus`	Local observability loop	not counted	not counted

Database Responsibilities

Business logic is implemented in PostgreSQL stored procedures (InsertTransacao, GetSaldoClienteById). The database owns the race-sensitive parts of the challenge:

Atomic debit/credit application.
Credit-limit rejection for invalid debit outcomes.
Recent-transaction selection for statement responses.
JSON-shaped statement data returned to the API.

The physical schema is also tuned for the benchmark workload:

Element	Source-backed behavior
`Clientes` table	`CREATE UNLOGGED TABLE`, seeded with five fixed client IDs and limits.
`Transacoes` table	`CREATE UNLOGGED TABLE ... WITH (fillfactor = 90)` for write-heavy inserts.
`InsertTransacao()`	Uses `SELECT ... FOR UPDATE` on the client row to serialize concurrent balance changes for the same client.
`GetSaldoClienteById()`	Returns the current balance plus the latest `10` transactions as `jsonb`.
`IX_Transacoes_ClienteId_Id_Desc`	Composite index on `(ClienteId, Id DESC)` so statement reads can fetch recent transactions for one client efficiently.
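
To make the statement read pattern concrete, here is a sketch that swaps in SQLite (from the Python standard library) for PostgreSQL. It is not the project's schema or stored procedure: the lowercase table and column names and the `statement` function are assumptions, the `data_extrato` timestamp is omitted, and the shaping happens in Python rather than as `jsonb` in the database. It only illustrates the composite index plus the `ORDER BY id DESC LIMIT 10` lookup.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE clientes (id INTEGER PRIMARY KEY, limite INTEGER, saldo INTEGER);
    CREATE TABLE transacoes (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        cliente_id INTEGER, valor INTEGER, tipo TEXT, descricao TEXT
    );
    CREATE INDEX ix_transacoes_cliente_id_id_desc
        ON transacoes (cliente_id, id DESC);
    INSERT INTO clientes VALUES (1, 100000, 0);
    """
)

# Insert 12 credits so the LIMIT 10 cut-off is visible.
for n in range(1, 13):
    db.execute(
        "INSERT INTO transacoes (cliente_id, valor, tipo, descricao) "
        "VALUES (1, ?, 'c', ?)",
        (n, f"tx{n}"),
    )
    db.execute("UPDATE clientes SET saldo = saldo + ? WHERE id = 1", (n,))


def statement(conn, cliente_id):
    """Return the balance plus the ten most recent transactions."""
    saldo, limite = conn.execute(
        "SELECT saldo, limite FROM clientes WHERE id = ?", (cliente_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT valor, tipo, descricao FROM transacoes "
        "WHERE cliente_id = ? ORDER BY id DESC LIMIT 10",
        (cliente_id,),
    ).fetchall()
    return {
        "saldo": {"total": saldo, "limite": limite},
        "ultimas_transacoes": [
            {"valor": v, "tipo": t, "descricao": d} for v, t, d in rows
        ],
    }


extrato = statement(db, 1)
```

Of the twelve inserted rows only the newest ten come back, newest first, which is the access pattern the `(ClienteId, Id DESC)` index is meant to serve.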

For benchmark speed, the database is tuned with durability trade-offs:

synchronous_commit=0 — do not wait for WAL flush.
fsync=0 — skip fsync on writes.
full_page_writes=0 — skip full-page writes.

These settings are useful for the contest-style load test and are not production-safe for real banking data.

Gunicorn Configuration

--workers=4 --threads=2 --worker-class=sync --bind=0.0.0.0:8080 --timeout=30

Each API container therefore runs four worker processes with two threads each (Gunicorn switches a sync worker class to gthread when threads is greater than 1), while keeping the Python layer intentionally small: validation, stored-procedure calls, response mapping, and connection cleanup.
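
Gunicorn also reads these settings from a Python config file, so the same flags can be written out as below. The file name `gunicorn.conf.py` is an assumption, since the source passes the values as command-line flags. The slot arithmetic at the end is a back-of-the-envelope estimate that treats every thread as able to serve one request at a time.

```python
# gunicorn.conf.py (hypothetical file; the source uses CLI flags instead)
workers = 4             # --workers=4
threads = 2             # --threads=2
worker_class = "sync"   # --worker-class=sync
bind = "0.0.0.0:8080"   # --bind=0.0.0.0:8080
timeout = 30            # --timeout=30

# Rough concurrency estimate across the two API containers.
API_CONTAINERS = 2
slots_per_container = workers * threads
total_slots = API_CONTAINERS * slots_per_container
```

Under that assumption each container offers 8 request slots and the stack offers 16 in total.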

Challenge

Rinha de Backend 2024/Q1

The Rinha de Backend is a Brazilian backend programming challenge. The 2024/Q1 edition simulates a fictional bank called “Rinha Financeira” that manages five fixed clients, each seeded at startup with a credit limit and an initial balance.

The interesting part is not the endpoint count; it is the combination of high write contention, strict response semantics, and a shared container budget small enough to punish waste.

Endpoint Contract

Endpoint	Method	Success	Validation / rejection
`/clientes/{id}/transacoes`	`POST`	`200` with `limite` and updated `saldo`	`400` for a missing JSON payload, `404` for unknown client, `422` for invalid transaction fields or exceeded limit
`/clientes/{id}/extrato`	`GET`	`200` with `saldo` and recent transactions	`404` for unknown client
`/healthz`	`GET`	`200` with `Healthy`	Used by container and CI smoke checks

Transaction payload

{
  "valor": 1000,
  "tipo": "c",
  "descricao": "deposito"
}

Runtime validation in src/WebApi/app.py keeps the contract narrow:

valor must be a positive integer.
tipo must be c for credit or d for debit.
descricao must be a non-empty string with at most 10 characters.
Client IDs are fixed to 1 through 5.
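
The rules above can be sketched as a pair of small functions. This is not the code in src/WebApi/app.py, which is not reproduced here: `validate_transaction` and `is_known_client` are hypothetical names, and treating booleans as invalid for valor is an assumption made because `bool` is a subclass of `int` in Python.

```python
def validate_transaction(payload):
    """Return a list of problems; an empty list means the payload passes."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]

    errors = []

    valor = payload.get("valor")
    # bool is a subclass of int, so it is excluded explicitly (assumption).
    if not isinstance(valor, int) or isinstance(valor, bool) or valor <= 0:
        errors.append("valor must be a positive integer")

    if payload.get("tipo") not in ("c", "d"):
        errors.append("tipo must be 'c' or 'd'")

    descricao = payload.get("descricao")
    if not isinstance(descricao, str) or not 1 <= len(descricao) <= 10:
        errors.append("descricao must be 1 to 10 characters")

    return errors


def is_known_client(client_id):
    """Client IDs are fixed to 1 through 5."""
    return client_id in range(1, 6)


ok = validate_transaction({"valor": 1000, "tipo": "c", "descricao": "deposito"})
bad = validate_transaction({"valor": 1.5, "tipo": "x", "descricao": ""})
```

In terms of the endpoint contract, a non-empty error list would correspond to a `422` response and an unknown client ID to a `404`.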

The runtime client table is intentionally static and mirrors the seed data in docker-entrypoint-initdb.d/rinha.dump.sql:

Client ID	Credit limit	Initial balance
`1`	`100000`	`0`
`2`	`80000`	`0`
`3`	`1000000`	`0`
`4`	`10000000`	`0`
`5`	`500000`	`0`

Consistency Requirement

Debit requests must never push a client past their configured credit limit. This implementation keeps that invariant in the database by calling InsertTransacao() for the balance update and limit check in one atomic operation.
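
Why the check and the update must happen in one atomic step can be shown with an in-memory simulation. This is a sketch, not the stored procedure: the `Account` class is hypothetical, a `threading.Lock` plays the role of the database row lock, and the rejection rule (a debit is refused when the new balance would fall below the negative of the limit) follows the challenge specification.

```python
import threading


class LimitExceeded(Exception):
    pass


class Account:
    """Hypothetical in-memory model of one client row."""

    def __init__(self, limite, saldo=0):
        self.limite = limite
        self.saldo = saldo
        self._lock = threading.Lock()  # stands in for the row lock

    def apply(self, tipo, valor):
        # The check and the update share one critical section, so no
        # other thread can change saldo between them.
        with self._lock:
            novo = self.saldo + valor if tipo == "c" else self.saldo - valor
            if novo < -self.limite:
                raise LimitExceeded(f"saldo {novo} would pass limit {self.limite}")
            self.saldo = novo
            return self.saldo


account = Account(limite=100000)
accepted = []


def debit():
    try:
        account.apply("d", 10000)
        accepted.append(1)
    except LimitExceeded:
        pass


workers = [threading.Thread(target=debit) for _ in range(50)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Fifty concurrent debits of 10000 against a limit of 100000 leave exactly ten accepted and a final balance of -100000, regardless of thread scheduling.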

Constraints

The challenge imposes strict resource limits across all containers combined:

Resource	Total budget	Where it goes in this repo
CPU	`1.5`	API instances, PostgreSQL, and NGINX
Memory	`550MB`	API instances, PostgreSQL, and NGINX
Load	k6 stress script	Concurrent transactions and statement requests

The observability sidecars and the k6 runner are useful during local experiments, but they are not counted against the challenge budget, which covers only the application stack under test.

Source

Full specification: github.com/zanfranceschi/rinha-de-backend-2024-q1

Getting Started

Prerequisites

Docker with Docker Compose
A local clone of this repository

Clone and Run

git clone https://github.com/jonathanperis/rinha2-back-end-python.git
cd rinha2-back-end-python
docker compose up nginx -d --build

The nginx service depends on the two API containers, and each API waits for PostgreSQL health before starting.

Access

The API is available at http://localhost:9999.

Endpoint	Method	Description
`/clientes/{id}/transacoes`	`POST`	Submit debit or credit transaction
`/clientes/{id}/extrato`	`GET`	Get account balance statement
`/healthz`	`GET`	Health check

Smoke Check

curl http://localhost:9999/healthz

Expected response:

Healthy

Example Requests

Create Transaction

curl -X POST http://localhost:9999/clientes/1/transacoes \
  -H "Content-Type: application/json" \
  -d '{"valor": 1000, "tipo": "c", "descricao": "deposito"}'

The response includes the client’s credit limit and updated balance:

{
  "limite": 100000,
  "saldo": 1000
}
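
The same request can be built from Python with only the standard library. This is a sketch: `build_transaction_request` is a hypothetical helper, and the block constructs the request without sending it, so it runs even when the stack is down.

```python
import json
import urllib.request


def build_transaction_request(base_url, client_id, valor, tipo, descricao):
    """Build (but do not send) the POST for one transaction."""
    body = json.dumps({"valor": valor, "tipo": tipo, "descricao": descricao})
    return urllib.request.Request(
        url=f"{base_url}/clientes/{client_id}/transacoes",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_transaction_request("http://localhost:9999", 1, 1000, "c", "deposito")

# With the stack running, the request could be sent like this:
#   with urllib.request.urlopen(req, timeout=5) as resp:
#       print(resp.status, json.load(resp))
```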

Get Statement

curl http://localhost:9999/clientes/1/extrato

The statement response contains the current balance envelope and the recent transaction list:

{
  "saldo": {
    "total": 1000,
    "limite": 100000,
    "data_extrato": "2026-04-01T07:36:51"
  },
  "ultimas_transacoes": [
    { "valor": 1000, "tipo": "c", "descricao": "deposito" }
  ]
}

Run Stress Tests

docker compose up k6 --build --force-recreate

The compose file supports a local observability loop with InfluxDB, Prometheus, Grafana, postgres-exporter, and the k6 web dashboard. The k6 service defaults to MODE=dev for local dashboard/export behavior. The production compose file uses MODE=prod and writes ./prod/conf/stress-test/reports/stress-test-report.html, which main-release.yml uploads as the stress-test-report workflow artifact.

Dev vs Production Compose

Concern	`docker-compose.yml`	`prod/docker-compose.yml`
API image source	Builds from `./src/WebApi`	Pulls `ghcr.io/jonathanperis/rinha2-back-end-python:latest`
API host ports	`6665:8080` and `6666:8080` for direct instance access	`8081:8080` and `8082:8080` for direct instance access
Public ingress	NGINX on `9999:9999`	NGINX on `9999:9999`
k6 mode	`MODE=dev`, InfluxDB export enabled	`MODE=prod`, HTML report export enabled
Report output	local k6 dashboard / InfluxDB data	`./prod/conf/stress-test/reports/stress-test-report.html`

Local Observability Ports and Credentials

Service	Port	Notes
Grafana	`3000`	Anonymous admin is enabled for the local dashboard loop.
Prometheus	`9090`	Reads `./prometheus/prometheus.yml`.
InfluxDB	`8086`	Requires `INFLUXDB_PASSWORD` and `INFLUXDB_TOKEN`.
postgres-exporter	`9187`	Exposes PostgreSQL metrics to Prometheus.
k6 web dashboard	`5665`	Enabled by `K6_WEB_DASHBOARD=true`.

Set the InfluxDB values before starting k6/observability services, for example:

export INFLUXDB_PASSWORD=local-rinha-password
export INFLUXDB_TOKEN=local-rinha-token
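
A small preflight check can catch unset values before the services start. This is an optional sketch, not part of the repository: `missing_settings` is a hypothetical helper that reports which of the two variables are unset or empty.

```python
import os

REQUIRED = ("INFLUXDB_PASSWORD", "INFLUXDB_TOKEN")


def missing_settings(environ=os.environ, required=REQUIRED):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]


# Example with explicit dictionaries instead of the real environment.
none_set = missing_settings({})
all_set = missing_settings(
    {"INFLUXDB_PASSWORD": "local-rinha-password", "INFLUXDB_TOKEN": "local-rinha-token"}
)
```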

Troubleshooting Checklist

Symptom	Check
`nginx` does not respond	`docker compose ps` and API container logs
API containers restart	PostgreSQL health, `DATABASE_URL`, and init scripts
Transactions return `422`	Payload type/description rules or credit-limit rejection
k6 cannot connect	Confirm `BASE_URL=http://nginx:9999` inside the compose network
Observability stack fails	Provide the required InfluxDB environment values before starting those services

Stop the Stack

docker compose down

Add -v when you intentionally want to remove the PostgreSQL volume and start from a fresh seed state.

Performance

Resource Constraints

The challenge allows a total of 1.5 CPU and 550MB RAM across the application containers. This repository spends that budget on two API instances, PostgreSQL, and NGINX:

Component	CPU	RAM	Performance role
Two API containers	`0.8` total	`200MB` total	Parallel request handling and validation
PostgreSQL	`0.5`	`330MB`	Atomic balance mutation and statement projection
NGINX	`0.2`	`20MB`	Low-overhead load balancing
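
The budget arithmetic behind this table can be checked in a few lines. The per-container split comes from the Architecture page; CPU is expressed in millicores here (an illustrative choice) so that the sums stay in integers.

```python
# (millicores, megabytes) per container.
ALLOCATION = {
    "webapi1-python": (400, 100),
    "webapi2-python": (400, 100),
    "db": (500, 330),
    "nginx": (200, 20),
}
BUDGET = (1500, 550)  # 1.5 CPU and 550MB

cpu_used = sum(cpu for cpu, _ in ALLOCATION.values())
ram_used = sum(ram for _, ram in ALLOCATION.values())
headroom = (BUDGET[0] - cpu_used, BUDGET[1] - ram_used)
```

The allocation consumes the whole envelope with no headroom, so any increase for one container has to be taken from another.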

Optimization Shape

The implementation optimizes for the contest workload by keeping each layer focused:

Thin Python path: Flask validates input and delegates consistency-sensitive work to SQL.
Database-owned invariants: InsertTransacao() handles balance updates and limit rejection atomically.
Compact statement reads: GetSaldoClienteById() returns statement data already shaped for the API response, ordered by Id DESC and limited to the latest 10 rows.
Two API replicas: NGINX uses least_conn to send each request to the instance with fewer active connections, so one busy worker pool does not keep absorbing new traffic.
Durability trade-offs: PostgreSQL write-safety settings are relaxed for benchmark throughput, not for production banking data.
UNLOGGED hot tables: Clientes and Transacoes are unlogged, and Transacoes uses fillfactor = 90, trading crash durability for faster contest writes.
Targeted statement index: IX_Transacoes_ClienteId_Id_Desc backs the per-client latest-transaction lookup pattern.

Reading the Reports

Published reports are the source of truth for run-level performance evidence:

Stress test report index lists committed historical HTML reports from docs/public/reports/.
New non-doc mainline releases also upload a fresh stress-test-report artifact from the production k6 run; promote artifacts into docs/public/reports/ only when you want them to become part of the published historical archive.
Compare published reports and workflow artifacts by commit context and workflow timing, not just by a single latency number.

When evaluating a run, inspect:

Signal	Why it matters
HTTP failure rate	Validates that concurrency did not break the API contract
Request duration percentiles	Shows tail behavior under contention
Transaction throughput	Reveals whether app/DB coordination is saturated
Statement latency	Confirms reads stay responsive while writes are active
Report timestamp	Connects the result to the release workflow and code state
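
Two of these signals, failure rate and duration percentiles, can be computed from raw samples as sketched below. This is not how k6 produces its report: `percentile` and `summarize` are hypothetical helpers, the nearest-rank method is one of several percentile definitions, and treating the contract's `400`, `404`, and `422` responses as expected is an assumption.

```python
import math

# Status codes the endpoint contract allows (assumption: these are not failures).
EXPECTED = frozenset({200, 400, 404, 422})


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct%
    of the samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples, expected=EXPECTED):
    """samples: iterable of (status_code, duration_ms) pairs."""
    samples = list(samples)
    durations = [d for _, d in samples]
    failures = sum(1 for status, _ in samples if status not in expected)
    return {
        "failure_rate": failures / len(samples),
        "p50_ms": percentile(durations, 50),
        "p95_ms": percentile(durations, 95),
    }


summary = summarize([(200, 5), (200, 7), (422, 6), (200, 40), (500, 90)])
```

In this made-up sample the single `500` yields a 20% failure rate and also dominates the 95th percentile, which is why the two signals are best read together.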

Benchmark Interpretation

This is a deliberately constrained system, so performance conclusions should stay tied to the workload:

Local Docker and GitHub-hosted runners can be noisy.
PostgreSQL tuning favors contest throughput over durability.
The Python layer is intentionally not a business-logic engine; moving limit checks out of SQL would change the contention model.
A good report is both fast and correct: successful status codes and stable statement semantics matter as much as latency.

Useful Next Reads

Architecture for the runtime and stored-procedure boundaries.
CI/CD Pipeline for how reports are produced, uploaded, and published.
Reports for archived k6 HTML output.

CI/CD Pipeline

Workflows

This repository uses four GitHub Actions workflows:

Workflow	File	Trigger	Description
Build Check	`build-check.yml`	Pull requests to `main` and manual dispatch	Builds/starts the development Compose stack with `docker compose -f ./docker-compose.yml up nginx --wait`, then probes `/healthz`.
Main Release	`main-release.yml`	Pushes to `main` except `docs/**`, plus manual dispatch	Builds/pushes GHCR images, creates the multi-arch `latest` manifest, runs the production compose health check, runs k6, and uploads the HTML report as an artifact.
CodeQL	`codeql.yml`	Pushes to `main`, PRs to `main`, weekly schedule	Runs Python security-and-quality analysis.
Deploy to GitHub Pages	`deploy.yml`	Pushes to `main` and manual dispatch	Calls the shared `jonathanperis/.github` Pages workflow to build and deploy `docs/` with Bun.

Pull Request Gate

PRs validate the container path before merge:

Check out the full repository.
Build and start the development Compose stack with docker compose -f ./docker-compose.yml up nginx --wait.
Probe http://localhost:9999/healthz once after Compose reports the service ready.
Run CodeQL for Python security-and-quality analysis.

This catches container build failures, dependency regressions, and startup problems before they reach main.

Main Release

main-release.yml is the image-and-artifact workflow. On a non-doc push to main, it:

Builds and pushes the amd64 image as ghcr.io/jonathanperis/rinha2-back-end-python:latest.
Builds and pushes the arm64 image as ghcr.io/jonathanperis/rinha2-back-end-python:latest-arm64.
Merges both digests into the multi-architecture latest manifest.
Starts the production compose stack and retries /healthz up to 20 times.
Runs k6 with prod/docker-compose.yml (MODE=prod).
Uploads ./prod/conf/stress-test/reports/stress-test-report.html as the stress-test-report Actions artifact.

The workflow intentionally ignores docs/** pushes, so documentation-only merges deploy Pages without rebuilding images or running k6.
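
The bounded /healthz retry in the release steps can be sketched as a loop with an injectable probe. This is an illustration, not the workflow's script: `wait_until_healthy` and `fake_probe` are hypothetical names, the delay between attempts is an assumed value, and a real probe would wrap an HTTP call (connection errors raised by `urllib` derive from `OSError`).

```python
import time


def wait_until_healthy(check, attempts=20, delay_s=1.0, sleep=time.sleep):
    """Call check() until it returns True or the attempts run out.

    Returns the 1-based attempt that succeeded, or None on exhaustion.
    """
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return attempt
        except OSError:
            pass  # for example, connection refused while the stack starts
        if attempt < attempts:
            sleep(delay_s)
    return None


# Demo with a fake probe that only succeeds on its third call.
calls = {"n": 0}


def fake_probe():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionRefusedError("not up yet")
    return True


result = wait_until_healthy(fake_probe, sleep=lambda _: None)
```

Passing `sleep` as a parameter keeps the demo instant and makes the retry budget easy to test without waiting.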

Report Publication Model

There are two report lanes:

Lane	Source	Where to inspect
Latest release artifact	`main-release.yml` k6 job output	The `stress-test-report` artifact attached to the workflow run
Published historical archive	Committed files under `docs/public/reports/*.html`	Stress test reports on GitHub Pages

A release artifact does not automatically rewrite the committed Pages archive. Promote a new artifact into docs/public/reports/ only when it should become part of the public historical record.

Pages Deployment

deploy.yml publishes the static documentation site to GitHub Pages from the docs/ package. The published site includes:

The homepage proof narrative.
The docs hub and wiki pages.
The stress-test report index.
Committed historical k6 HTML reports.

After a docs PR merges to main, wait for the deployment run to complete before treating the live URL as updated.

Release Evidence Loop

PR branch
  ├─ build-check.yml validates container startup
  └─ codeql.yml scans the change
       │
       ▼
main merge
  ├─ main-release.yml builds images + runs k6 + uploads report artifact
  │   └─ skipped for docs/**-only pushes
  └─ deploy.yml publishes docs and committed report archive to Pages
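
The "skipped for docs/**-only pushes" branch can be modeled as a predicate over the changed file paths. This is a sketch of the general path-ignore idea, in which a run is skipped only when every changed file matches the ignored pattern. The actual filter in main-release.yml is not shown in this document, and `is_docs_path` and `should_run_release` are hypothetical names.

```python
from pathlib import PurePosixPath


def is_docs_path(path):
    """True when the path lives under the top-level docs/ directory."""
    return PurePosixPath(path).parts[:1] == ("docs",)


def should_run_release(changed_paths):
    """Run when at least one changed file is outside docs/."""
    return any(not is_docs_path(p) for p in changed_paths)


docs_only = should_run_release(["docs/index.md", "docs/public/reports/a.html"])
mixed = should_run_release(["docs/index.md", "src/WebApi/app.py"])
```

A push that touches both documentation and source still triggers the release lane, which matches the "non-doc push" wording used for the workflow.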

Where to Look

GitHub Actions for workflow status.
Stress test reports for committed historical k6 output.
GHCR package for published images.