Architecture
Tech Stack
| Technology | Version | Purpose |
|---|---|---|
| Python | 3.14 | Language runtime |
| Flask | 3.1.3 | Web framework |
| Gunicorn | 25.3.0 | WSGI HTTP server (4 workers, 2 threads) |
| psycopg2-binary | 2.9.12 | PostgreSQL adapter |
| PostgreSQL | 16.7 | Database with stored procedures |
| NGINX | 1.27 | Reverse proxy / load balancer (least_conn) |
| Docker | - | Containerization (python:3.14-slim base) |
| k6 | - | Load / stress testing |
Overview
NGINX (:9999, least_conn)
├── webapi1-python (:8080, 0.4 CPU, 100MB): Gunicorn 4w x 2t
├── webapi2-python (:8080, 0.4 CPU, 100MB): Gunicorn 4w x 2t
└── PostgreSQL (0.5 CPU, 330MB)
    ├── InsertTransacao(): atomic balance update + validation
    └── GetSaldoClienteById(): statement with JSONB
Services
| Service | Role | CPU | RAM |
|---|---|---|---|
| webapi1 | Python API instance (Gunicorn 4w x 2t) | 0.4 | 100MB |
| webapi2 | Python API instance (Gunicorn 4w x 2t) | 0.4 | 100MB |
| nginx | Reverse proxy / load balancer (least_conn) | 0.2 | 20MB |
| postgresql | Database with stored procedures | 0.5 | 330MB |
| k6 | Load testing | (not counted) | (not counted) |
| grafana + influxdb | Observability dashboards | (not counted) | (not counted) |
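
The per-service limits above add up to a single resource budget for the stack. A minimal sketch of that arithmetic, using only the values from the table (the names `SERVICES` and `total_budget` are illustrative and do not come from the project):

```python
# Per-service limits copied from the Services table.
# k6, grafana and influxdb are listed as "not counted", so they are left out.
SERVICES = {
    "webapi1": {"cpu": 0.4, "ram_mb": 100},
    "webapi2": {"cpu": 0.4, "ram_mb": 100},
    "nginx": {"cpu": 0.2, "ram_mb": 20},
    "postgresql": {"cpu": 0.5, "ram_mb": 330},
}


def total_budget(services):
    """Return (total CPU, total RAM in MB) across all counted services."""
    cpu = round(sum(s["cpu"] for s in services.values()), 2)
    ram_mb = sum(s["ram_mb"] for s in services.values())
    return cpu, ram_mb


if __name__ == "__main__":
    cpu, ram_mb = total_budget(SERVICES)
    print(f"total: {cpu} CPU, {ram_mb}MB RAM")  # total: 1.5 CPU, 550MB RAM
```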
Load Balancing
NGINX uses the least_conn strategy to distribute requests across the two API instances: each new request goes to the instance with the fewest active connections.
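
The selection rule can be illustrated with a small simulation. This is a simplified sketch, not NGINX's implementation: it ignores server weights, and it breaks ties by list order, whereas NGINX falls back to weighted round-robin among equally loaded servers. The `Upstream` class and the helper functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Upstream:
    name: str
    active: int = 0  # requests currently in flight on this instance


def pick_least_conn(upstreams):
    """Return the upstream with the fewest active connections (first one wins ties)."""
    return min(upstreams, key=lambda u: u.active)


def dispatch(upstreams):
    """Route one request: choose an upstream and count the new connection."""
    chosen = pick_least_conn(upstreams)
    chosen.active += 1
    return chosen


def release(upstream):
    """Mark one request on this upstream as finished."""
    upstream.active -= 1


if __name__ == "__main__":
    apis = [Upstream("webapi1"), Upstream("webapi2")]
    first = dispatch(apis)   # both idle, so the first in the list is chosen
    second = dispatch(apis)  # webapi1 is busy, so webapi2 is chosen
    release(first)           # webapi1 finishes its request
    third = dispatch(apis)   # webapi1 now has 0 active vs 1 on webapi2
    print(first.name, second.name, third.name)  # webapi1 webapi2 webapi1
```

Compared with plain round-robin, this keeps a slow request on one instance from attracting further traffic while the other instance is idle.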
Database
Business logic is implemented in PostgreSQL stored procedures (InsertTransacao, GetSaldoClienteById). The database uses UNLOGGED tables and is tuned for write throughput at the cost of crash durability:
- synchronous_commit=0: do not wait for the WAL flush
- fsync=0: skip fsync on writes
- full_page_writes=0: skip full-page writes
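
One way to confirm that a running database picked up these settings is to read them back from the `pg_settings` view. The sketch below assumes a DB-API cursor such as the one psycopg2 provides; the helper names are hypothetical, and the expected value `off` assumes PostgreSQL reports these settings as `on`/`off` even when they were set with `0`.

```python
# Expected values as reported by pg_settings (assumption: 0 is shown as "off").
EXPECTED = {
    "synchronous_commit": "off",
    "fsync": "off",
    "full_page_writes": "off",
}


def read_settings(cursor, names):
    """Fetch the current value of each named setting as a {name: value} dict."""
    cursor.execute(
        "SELECT name, setting FROM pg_settings WHERE name = ANY(%s)",
        (list(names),),  # psycopg2 adapts a Python list to a SQL array
    )
    return dict(cursor.fetchall())


def mismatched_settings(cursor, expected=EXPECTED):
    """Return the settings whose live value differs from the expected one."""
    actual = read_settings(cursor, expected)
    return {
        name: actual.get(name)
        for name, want in expected.items()
        if actual.get(name) != want
    }
```

An empty dict from `mismatched_settings` means all three settings match; any entry in the result shows the live value that differs.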
Connection Management
Each API instance uses psycopg2's SimpleConnectionPool with a minimum of 1 and a maximum of 10 connections, so requests reuse open connections instead of reconnecting to PostgreSQL on every call.
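
A minimal sketch of how such a pool might be wrapped so that connections are always returned. The functions `make_pool` and `pooled_connection` are hypothetical and not taken from the project; the DSN is whatever connection string the deployment uses.

```python
from contextlib import contextmanager


def make_pool(dsn):
    """Create a pool with 1 to 10 connections, as described above."""
    # Imported lazily so pooled_connection() can be used without psycopg2 installed.
    from psycopg2.pool import SimpleConnectionPool

    return SimpleConnectionPool(1, 10, dsn)


@contextmanager
def pooled_connection(pool):
    """Borrow a connection, commit on success, roll back on error, always return it."""
    conn = pool.getconn()
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        pool.putconn(conn)
```

A request handler would then run its queries inside `with pooled_connection(pool) as conn:`. Note that psycopg2 documents SimpleConnectionPool as a pool that cannot be shared across threads; with two threads per worker, ThreadedConnectionPool (which takes the same constructor arguments) may be the safer choice.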
Gunicorn Configuration
--workers=4 --threads=2 --worker-class=sync --bind=0.0.0.0:8080 --timeout=30
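
The same flags can also be kept in a Gunicorn configuration file, which is plain Python. This is a sketch of the equivalent settings; the file name `gunicorn.conf.py` is an assumption and the project may pass the flags on the command line only.

```python
# gunicorn.conf.py (assumed file name): equivalent of the command-line flags above.
bind = "0.0.0.0:8080"   # --bind
workers = 4             # --workers: worker processes per API instance
threads = 2             # --threads: threads per worker process
worker_class = "sync"   # --worker-class
timeout = 30            # --timeout, in seconds
```

With 4 workers and 2 threads each, one API instance handles at most 4 x 2 = 8 requests at a time. According to the Gunicorn documentation, setting `threads` above 1 with the `sync` worker class causes the `gthread` worker to be used instead.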