Performance

Resource Constraints

The challenge allows a total of 1.5 CPU and 550MB of RAM across all containers, including the database and the load balancer. This repository spends that budget on two API instances, PostgreSQL, and NGINX:

Component	CPU	RAM	Performance role
Two API containers	`0.8` total	`200MB` total	Parallel request handling and validation
PostgreSQL	`0.5`	`330MB`	Atomic balance mutation and statement projection
NGINX	`0.2`	`20MB`	Low-overhead load balancing
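
The allocations in the table above can be checked against the limit with a few lines of arithmetic. This is a minimal sketch: only the numbers come from the table, while the names `BUDGET`, `ALLOCATIONS`, `total`, and `within_budget` are illustrative and not part of the repository.

```python
# Budget check for the resource table. The figures are copied from the table;
# every name in this block is illustrative, not taken from the repository.
BUDGET = {"cpu": 1.5, "ram_mb": 550}

ALLOCATIONS = {
    "api (two containers, total)": {"cpu": 0.8, "ram_mb": 200},
    "postgresql": {"cpu": 0.5, "ram_mb": 330},
    "nginx": {"cpu": 0.2, "ram_mb": 20},
}


def total(resource: str) -> float:
    """Sum one resource across all components, rounded to hide float noise."""
    return round(sum(item[resource] for item in ALLOCATIONS.values()), 3)


def within_budget() -> bool:
    """Return True when every resource total fits inside the challenge limit."""
    return all(total(name) <= limit for name, limit in BUDGET.items())
```

Both totals land exactly on the limit (1.5 CPU and 550MB), so any increase for one component has to be taken from another.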

Optimization Shape

The implementation optimizes for the contest workload by keeping each layer focused:

Thin Python path: Flask validates input and delegates consistency-sensitive work to SQL.
Database-owned invariants: InsertTransacao() handles balance updates and limit rejection atomically.
Compact statement reads: GetSaldoClienteById() returns statement data already shaped for the API response, ordered by Id DESC and limited to the latest 10 rows.
Two API replicas: NGINX balances with least_conn, so new requests go to the replica with fewer active connections instead of queueing behind a busy worker pool.
Durability trade-offs: PostgreSQL write-safety settings are relaxed for benchmark throughput, not for production banking data.
UNLOGGED hot tables: Clientes and Transacoes are unlogged, and Transacoes uses fillfactor = 90, trading crash durability for faster contest writes.
Targeted statement index: IX_Transacoes_ClienteId_Id_Desc backs the per-client latest-transaction lookup pattern.
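
The "thin Python path" idea can be sketched without Flask or a database. The example below is an assumption-heavy illustration, not the repository's handler: the field names (`valor`, `tipo`, `descricao`), the validation rules, the 422/200 status codes, and the `insert_transacao` callable (a stand-in for whatever wraps the InsertTransacao() call) are all hypothetical. What it aims to show is the split of responsibilities: Python rejects malformed input, and every balance or limit decision is left to the database call.

```python
from typing import Callable, Optional, Tuple

# Hypothetical validation rules, for illustration only.
VALID_TYPES = {"c", "d"}
MAX_DESCRIPTION_LENGTH = 10


def validate_payload(payload: object) -> Optional[dict]:
    """Return a cleaned payload, or None when the input is malformed."""
    if not isinstance(payload, dict):
        return None
    valor = payload.get("valor")
    tipo = payload.get("tipo")
    descricao = payload.get("descricao")
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(valor, bool) or not isinstance(valor, int) or valor <= 0:
        return None
    if not isinstance(tipo, str) or tipo not in VALID_TYPES:
        return None
    if not isinstance(descricao, str):
        return None
    if not 1 <= len(descricao) <= MAX_DESCRIPTION_LENGTH:
        return None
    return {"valor": valor, "tipo": tipo, "descricao": descricao}


def handle_transaction(
    client_id: int,
    payload: object,
    insert_transacao: Callable[[int, int, str, str], Optional[dict]],
) -> Tuple[int, dict]:
    """Validate in Python, then delegate the balance mutation to the database.

    `insert_transacao` is a hypothetical wrapper that returns the new balance
    data, or None when the database rejects the operation.
    """
    cleaned = validate_payload(payload)
    if cleaned is None:
        return 422, {}
    # No balance arithmetic here: the limit check stays inside the SQL call.
    result = insert_transacao(
        client_id, cleaned["valor"], cleaned["tipo"], cleaned["descricao"]
    )
    if result is None:
        return 422, {}
    return 200, result
```

Because the handler never reads a balance before writing, two replicas cannot race each other on a stale value; the only place a limit is evaluated is inside the atomic database operation.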

Reading the Reports

Published reports are the source of truth for run-level performance evidence:

Stress test report index lists committed historical HTML reports from docs/public/reports/.
New non-doc mainline releases also upload a fresh stress-test-report artifact from the production k6 run. Promote an artifact into docs/public/reports/ only when it should become part of the published historical archive.
Compare published reports and workflow artifacts by commit context and workflow timing, not just by a single latency number.

When evaluating a run, inspect:

Signal	Why it matters
HTTP failure rate	Validates that concurrency did not break the API contract
Request duration percentiles	Show tail behavior under contention
Transaction throughput	Reveals whether app/DB coordination is saturated
Statement latency	Confirms reads stay responsive while writes are active
Report timestamp	Connects the result to the release workflow and code state
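
To make the first two signals concrete, the sketch below derives a failure rate and nearest-rank percentiles from raw `(status_code, duration_ms)` samples. It is illustrative only: the sample format, the function names, and the `expected_statuses` parameter are assumptions, and the published k6 reports compute their own metrics, possibly with a different percentile method.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def percentile(sorted_values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of an ascending, non-empty sequence."""
    if not sorted_values:
        raise ValueError("need at least one sample")
    # Multiply before dividing to limit floating-point error in the rank.
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(
    samples: List[Tuple[int, float]],
    expected_statuses: Iterable[int],
) -> dict:
    """Summarize (status_code, duration_ms) samples into report-style signals.

    A sample counts as a failure when its status code is not in
    `expected_statuses`; which codes are expected is the caller's assumption.
    """
    expected = set(expected_statuses)
    durations = sorted(duration for _, duration in samples)
    failures = sum(1 for status, _ in samples if status not in expected)
    return {
        "failure_rate": failures / len(samples),
        "p50_ms": percentile(durations, 50),
        "p95_ms": percentile(durations, 95),
        "p99_ms": percentile(durations, 99),
    }
```

Reading the percentiles alongside the failure rate matters because a run can look fast simply because failed requests returned early.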

Benchmark Interpretation

This is a deliberately constrained system, so performance conclusions should stay tied to the workload:

Local Docker and GitHub-hosted runners can be noisy, so treat a single run as an indication rather than a precise measurement.
PostgreSQL tuning favors contest throughput over durability.
The Python layer is intentionally not a business-logic engine; moving limit checks out of SQL would change the contention model.
A good report is both fast and correct: successful status codes and stable statement semantics matter as much as latency.
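
One way to keep runner noise from driving conclusions is to compare several runs per commit rather than one. The helper below is a rough heuristic sketched for illustration, not something the repository or its workflows implement: it treats a latency change as meaningful only when the shift between median values is larger than the run-to-run spread seen on either side. The function names and the choice of p95 as input are assumptions.

```python
from statistics import median
from typing import Sequence


def spread(values: Sequence[float]) -> float:
    """Range of a set of repeated measurements (max minus min)."""
    return max(values) - min(values)


def looks_like_real_change(
    baseline_p95_ms: Sequence[float],
    candidate_p95_ms: Sequence[float],
) -> bool:
    """Return True when the median shift exceeds the larger run-to-run spread.

    Each argument holds the p95 latency from repeated runs of one commit.
    This is a coarse noise filter, not a statistical test.
    """
    shift = abs(median(candidate_p95_ms) - median(baseline_p95_ms))
    noise = max(spread(baseline_p95_ms), spread(candidate_p95_ms))
    return shift > noise
```

A latency comparison like this only makes sense for runs that are already correct, so check the failure rate and statement semantics first.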

Useful Next Reads

Architecture for the runtime and stored-procedure boundaries.
CI/CD Pipeline for how reports are produced, uploaded, and published.
Reports for archived k6 HTML output.