┌─────────────┐
│ Flutter App │  (Mobile Client)
└──────┬──────┘
       │ HTTPS
       ▼
┌─────────────────────┐
│  Load Balancer/CDN  │  (AWS ALB / Cloudflare)
└──────────┬──────────┘
           │
     ┌─────┴──────┬────────────┬────────────┐
     │            │            │            │
     ▼            ▼            ▼            ▼
┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐
│Backend 1│  │Backend 2│  │Backend 3│  │Backend N│  (Rust Servers)
└────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘
     │            │            │            │
     └────────────┴────────────┴────────────┘
                  │
                  ▼
          ┌───────────────┐
          │ Redis Cluster │  (Caching Layer)
          └───────┬───────┘
                  │
                  ▼
          ┌───────────────┐
          │  Open-Meteo   │  (External API)
          │      API      │
          └───────────────┘
1. User opens app
2. App gets GPS coordinates (40.7128, -74.0060)
3. App calls: GET https://api.weathernote.com/api/v1/weather?lat=40.7128&lon=-74.0060
4. Backend receives request:
- Rounds coordinates: (40.7, -74.0)
- Creates cache key: "weather:lat:40.7:lon:-74.0"
- Checks Redis cache
5a. Cache HIT (99% case):
- Returns cached data instantly (<10ms)
- Response includes "cached": true
5b. Cache MISS (1% case):
- Calls Open-Meteo API
- Stores result in Redis (TTL: 10 minutes)
- Returns fresh data (~300ms)
6. App receives response and displays weather
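The rounding and key-construction step in the flow above can be sketched in Rust. This is a minimal illustration, not the service's actual code; the function name is ours, but the key format matches the one shown in step 4:

```rust
// Builds the Redis cache key used in step 4. Formatting with {:.1} both
// rounds the coordinate to one decimal place (~11 km grid) and fixes the
// textual form, so nearby coordinates collapse onto the same key.
fn cache_key(prefix: &str, lat: f64, lon: f64) -> String {
    format!("{prefix}:lat:{lat:.1}:lon:{lon:.1}")
}

fn main() {
    // The New York example from the request flow above.
    println!("{}", cache_key("weather", 40.7128, -74.0060));
    // -> weather:lat:40.7:lon:-74.0
}
```

Formatting with a fixed precision (rather than rounding the float and printing it) also avoids key mismatches from floating-point display quirks, e.g. `-74.0` printing as `-74`.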
Background Worker (runs every 10 minutes):
1. Iterate through TOP_CITIES list (100 cities)
2. For each city:
- Round coordinates
- Check if cache exists
- If expired, fetch from Open-Meteo
- Update cache
- Wait 100ms (to avoid rate limiting)
Result: Popular cities always have fresh cached data
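One pass of the background worker can be sketched as follows. The stubs for the Redis TTL check and the Open-Meteo fetch are placeholders (the real worker talks to Redis and the upstream API), and the two-city list stands in for the 100-entry TOP_CITIES list:

```rust
use std::{thread, time::Duration};

// Hypothetical two-entry subset of TOP_CITIES (the real list holds 100 cities).
const TOP_CITIES: &[(f64, f64)] = &[(40.7128, -74.0060), (51.5074, -0.1278)];

// Stub standing in for "does a fresh entry exist in Redis?"
fn cache_is_fresh(_key: &str) -> bool {
    false
}

// Stub standing in for "fetch from Open-Meteo, then SET with a 600 s TTL".
fn fetch_and_cache(key: &str) {
    println!("refreshing {key}");
}

// One pass over the city list; the scheduler reruns this every 10 minutes.
fn prewarm_once() -> Vec<String> {
    let mut refreshed = Vec::new();
    for (lat, lon) in TOP_CITIES {
        let key = format!("weather:lat:{lat:.1}:lon:{lon:.1}");
        if !cache_is_fresh(&key) {
            fetch_and_cache(&key);
            refreshed.push(key);
        }
        thread::sleep(Duration::from_millis(100)); // spread out upstream calls
    }
    refreshed
}

fn main() {
    prewarm_once();
}
```

Because the worker reuses the same rounded cache keys as the request path, a pre-warmed entry is a guaranteed cache hit for any user near that city.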
Without Rounding:
- 10,000 users in New York City area
- Each at slightly different GPS coordinates
- Result: 10,000 unique API calls
With 0.1° Rounding (~11km grid):
- 10,000 users in New York City area
- All rounded to (40.7, -74.0)
- Result: 1 API call (cached for 10 minutes)
- Reduction: 99.99%
Scenario: 500,000 active users
Without Optimization:
- 500,000 users checking weather once/hour
- 8 hours average usage/day
- = 4,000,000 API calls/day
With Our Architecture:
- Users distributed across ~2,000 unique rounded locations
- Each location cached for 10 minutes
- 2,000 locations × 6 updates/hour × 24 hours
- = 288,000 API calls/day
- With pre-warming: ~5,000 API calls/day
- Reduction: 99.875%
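The back-of-the-envelope numbers above can be checked directly (the 99.875% reduction compares the pre-warmed figure against the unoptimized baseline):

```rust
fn main() {
    // Without optimization: 500k users, one check per hour, 8 hours/day.
    let baseline: u64 = 500_000 * 8;
    assert_eq!(baseline, 4_000_000);

    // With caching alone: ~2,000 rounded locations, refreshed every 10 minutes.
    let cached: u64 = 2_000 * 6 * 24;
    assert_eq!(cached, 288_000);

    // With pre-warming, upstream calls drop to roughly 5,000/day.
    let reduction = 100.0 * (1.0 - 5_000f64 / baseline as f64);
    println!("reduction: {reduction:.3}%"); // 99.875%
}
```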
Format: "{prefix}:lat:{latitude}:lon:{longitude}"
Examples:
- weather:lat:40.7:lon:-74.0
- forecast:lat:40.7:lon:-74.0:days:7
- rate_limit:hash:abc123def

Current Weather: 600 seconds (10 minutes)
├─ Weather changes slowly
├─ 10 minutes is acceptable staleness
└─ Balances freshness vs API calls

Forecast: 600 seconds (10 minutes)
├─ Daily forecast rarely changes
└─ Can be extended to 30-60 minutes

Rate Limit: 60 seconds (1 minute)
├─ Fixed counting window (resets each minute)
└─ Auto-expires via TTL
1. Extract client identifier:
- Custom header: x-client-id (device UUID)
- Fallback: IP address (x-forwarded-for)
2. Hash identifier (privacy):
- BLAKE3 hash of client ID
- First 8 chars used as key
3. Redis increment:
- Key: rate_limit:hash:abc12345
- Increment counter
- Set TTL: 60 seconds
- If counter > limit: return 429 Too Many Requests

Layer 1: Load Balancer (DDoS protection)
└─ 1000 req/sec per IP

Layer 2: Backend Rate Limiter (fair usage)
├─ 60 req/minute per client
└─ Enforced by Rust middleware

Layer 3: Redis Circuit Breaker
├─ If Redis fails, allow requests
└─ Graceful degradation
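The per-client counting logic from steps 1-3 can be sketched in-process. Two loud assumptions: a `HashMap` stands in for Redis (the real counter is a Redis INCR with a 60-second TTL, shared across backends), and `DefaultHasher` stands in for BLAKE3:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::time::{Duration, Instant};

// In-memory stand-in for the shared Redis counter.
struct RateLimiter {
    limit: u32,
    window: Duration,
    counters: HashMap<String, (u32, Instant)>,
}

impl RateLimiter {
    // Hash the client ID for privacy; DefaultHasher is a stand-in for BLAKE3.
    fn key_for(client_id: &str) -> String {
        let mut h = DefaultHasher::new();
        client_id.hash(&mut h);
        format!("rate_limit:hash:{:08x}", h.finish() as u32) // 8 hex chars
    }

    fn allow(&mut self, client_id: &str) -> bool {
        let key = Self::key_for(client_id);
        let now = Instant::now();
        let entry = self.counters.entry(key).or_insert((0, now));
        if now.duration_since(entry.1) > self.window {
            *entry = (0, now); // window expired; Redis does this via TTL
        }
        entry.0 += 1;
        entry.0 <= self.limit // over the limit -> caller returns 429
    }
}

fn main() {
    let mut rl = RateLimiter {
        limit: 60,
        window: Duration::from_secs(60),
        counters: HashMap::new(),
    };
    let ok = (0..61).map(|_| rl.allow("device-uuid-1")).filter(|&a| a).count();
    println!("{ok} of 61 allowed"); // 60 of 61 allowed
}
```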
# redis.conf
# Maximum memory: 512MB for 500k users
maxmemory 512mb
# Eviction: Remove least recently used keys
maxmemory-policy allkeys-lru
# Persistence: AOF for durability
appendonly yes
appendfsync everysec
# Performance
tcp-keepalive 300
timeout 300
Per Cache Entry:
- Key: "weather:lat:40.7:lon:-74.0" (~30 bytes)
- Value: JSON response (~500 bytes)
- Total: ~600 bytes
Cache Capacity:
- 512MB / 600 bytes = ~850,000 entries
- Active cached locations: ~2,000
- Actual usage: ~1.2MB data + overhead
- Remaining: 500MB for rate limiting, etc.
Each backend instance:
├─ No local state
├─ Shares Redis connection
├─ Independent workers
└─ Can be added/removed dynamically

Load Balancer:
├─ Round-robin distribution
├─ Health check: /health endpoint
├─ Automatic failover
└─ Session-independent (no sticky sessions)
Auto-Scaling Rules:
Scale Up When:
- CPU > 70% for 5 minutes
- Memory > 80%
- Request latency > 500ms
Scale Down When:
- CPU < 30% for 15 minutes
- Traffic drops
- Minimum: 2 instances (high availability)
Max Instances: 20 (cost control)

Scenario 1: Cache Hit (Hot Data)
├─ Redis lookup: ~1-5ms
├─ Network latency: ~20-50ms
└─ Total: 30-60ms

Scenario 2: Cache Miss (Cold Data)
├─ Open-Meteo API call: ~200-400ms
├─ Redis store: ~5ms
├─ Network latency: ~20-50ms
└─ Total: 250-500ms

Scenario 3: Pre-warmed Data (Most Common)
├─ Redis lookup: ~1-5ms
├─ Network latency: ~20-50ms
└─ Total: 30-60ms
Single Backend Instance:
- CPU: 2 vCPU
- RAM: 2GB
- Capacity: ~1,000 req/sec (cached)
- Capacity: ~50 req/sec (fresh API calls)
3 Backend Instances:
- Combined: ~3,000 req/sec (cached)
- With 99% cache hit rate
- Effective: ~2,970 cached + 30 fresh
- Daily capacity: ~250M requests
1. Transport Layer:
├─ TLS 1.3 encryption
└─ SSL certificate (Let's Encrypt)

2. Application Layer:
├─ Rate limiting
├─ Input validation (lat/lon ranges)
└─ Client ID hashing

3. Infrastructure Layer:
├─ Firewall rules (only ports 80, 443, 22)
├─ VPC/Private networking
└─ Redis password protection

4. Monitoring Layer:
├─ Request logging
├─ Anomaly detection
└─ Alert on unusual patterns
Performance:
├─ Native code (no VM overhead)
├─ Zero-cost abstractions
└─ Memory efficient (~20MB per instance)

Reliability:
├─ Memory safety (no segfaults)
├─ Type system prevents bugs
└─ Production-ready ecosystem

Concurrency:
├─ Async/await (tokio runtime)
├─ Handles 10k concurrent connections
└─ No GC pauses

Speed:
├─ In-memory data structure store
├─ Single-digit millisecond latency
└─ 100k+ ops/sec on commodity hardware

Simplicity:
├─ Built-in TTL support
├─ Atomic operations (INCR for rate limiting)
└─ Easy replication/clustering

Cost:
├─ Low memory footprint
├─ Can run on small instances
└─ Managed services available ($50/month)

Advantages:
├─ Free tier (10,000 calls/day)
├─ No API key required
├─ Reliable uptime
└─ Global coverage

Our Usage:
├─ ~5,000 calls/day (with caching)
├─ Well within free tier
└─ Can scale to paid tier if needed
- Cache-Aside Pattern: Check cache first, populate on miss
- Circuit Breaker: Graceful degradation if Redis fails
- Rate Limiting: Fixed-window counter (Redis INCR + TTL)
- Pre-warming: Cache warming with background workers
- Coordinate Rounding: Spatial quantization for cache efficiency
- Stateless Services: Horizontal scalability
- Health Checks: Automatic failover detection
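The cache-aside pattern from the list above, in miniature. A `HashMap` stands in for Redis and a formatted string for the Open-Meteo response; the real entries also carry a 600-second TTL:

```rust
use std::collections::HashMap;

// Cache-aside: check the cache first; on a miss, fetch upstream and
// populate the cache so the next caller hits. Returns (value, was_cached).
fn get_weather(cache: &mut HashMap<String, String>, key: &str) -> (String, bool) {
    if let Some(v) = cache.get(key) {
        return (v.clone(), true); // cache hit
    }
    let fresh = format!("fresh-data-for-{key}"); // stand-in for the API call
    cache.insert(key.to_string(), fresh.clone()); // populate on miss
    (fresh, false)
}

fn main() {
    let mut cache = HashMap::new();
    let (_, hit1) = get_weather(&mut cache, "weather:lat:40.7:lon:-74.0");
    let (_, hit2) = get_weather(&mut cache, "weather:lat:40.7:lon:-74.0");
    println!("first cached: {hit1}, second cached: {hit2}"); // false, then true
}
```

The same shape applies to the `"cached": true` field in the API response: it is simply the `was_cached` flag surfaced to the client.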
Application Metrics:
- requests_total
- request_duration_seconds
- cache_hit_rate
- cache_miss_rate
- rate_limit_exceeded_total
- open_meteo_calls_total
Infrastructure Metrics:
- cpu_usage_percent
- memory_usage_percent
- redis_connected_clients
- redis_keyspace_hits
- redis_keyspace_misses
Business Metrics:
- active_users
- requests_per_user
- popular_locations
- cost_per_million_requests

- CDN Integration: Cache API responses at edge
- GraphQL API: Reduce over-fetching
- WebSocket: Real-time weather updates
- ML Pre-warming: Predict user locations
- Multi-region: Deploy closer to users
This implementation can easily handle 1M+ users with proper infrastructure scaling.