From 53 to 2 Million Locations
Executive Review • March 2026
Recommendation: Implement a Tiered Aggregation Architecture that pre-computes data at 4 levels of detail, reducing storage by 98% while maintaining sub-200ms response times.
Current PostgreSQL database with 53 locations cannot handle 2 million records with spatial queries.
Scale PostgreSQL vertically, add indexes, optimize queries
Verdict: Fails at >100K locations. Query times exceed 2 seconds.
Only load data visible in current map viewport
Verdict: Works but creates "loading holes" when panning. Complex implementation.
Pre-aggregate data into 4 tiers by zoom level (World → Country → City → Street)
Verdict: Predictable performance, 98% storage reduction, clean architecture.
| Data Tier | Record Count | Storage | Latency |
|---|---|---|---|
| Tier 1 (World) | 200 | ~50 KB | <10ms |
| Tier 2 (Country) | 5,000 | ~2 MB | <20ms |
| Tier 3 (City) | 100,000 | ~40 MB | <50ms |
| Tier 4 (Street) | 2,000,000 | ~1 GB | <100ms |
| Redis Cache | Hot data + tiles | ~5 GB | <5ms |
| Total | 2.1M records | ~12 GB |
98% reduction vs. naive approach (would require 600GB+)
| Metric | Target | Current | Status |
|---|---|---|---|
| Global zoom (Tier 1) | < 50ms | N/A | On Track |
| Country zoom (Tier 2) | < 100ms | N/A | On Track |
| City zoom (Tier 3) | < 150ms | N/A | On Track |
| Street zoom (Tier 4) | < 200ms | N/A | On Track |
| Cache hit rate | > 80% | 0% | On Track |
| Image load time | < 2s | 5-10s | On Track |
Database schema design, PostgreSQL partitioning, Redis setup
Build tiered aggregation pipeline, data transformation scripts
Refactor location endpoints, implement tier selection logic
Implement lazy loading, CDN integration, URL-only storage
Batch sync system, rate limiting, error handling
Load testing, performance validation, production rollout
| Component | Spec | Cost/Month |
|---|---|---|
| PostgreSQL | 4 vCPU, 8GB RAM, 100GB SSD | $60 |
| Redis Cache | 6GB memory, clustered | $40 |
| Application Server | 2 vCPU, 4GB RAM | $35 |
| CDN (Wikipedia) | Free tier via Wikimedia | $0 |
| Bandwidth | ~500GB/month estimated | $20 |
| Monitoring | Datadog/DataDog equivalent | $30 |
| Total | $185/month |
Estimated first-year cost: $2,220 + $5,000 one-time development
| Risk | Severity | Mitigation |
|---|---|---|
| Wikipedia API limits | HIGH | Implement exponential backoff, cache aggressively, request API key increase |
| Aggregation complexity | MEDIUM | Automated testing, gradual rollout, fallback to raw data |
| Cache invalidation | MEDIUM | Time-based TTL, versioned cache keys, manual flush capability |
| Image availability | LOW | Graceful degradation, placeholder images, fallback to description |
| Performance regression | MEDIUM | A/B testing, feature flags, rollback strategy |
Begin database schema migration, set up Redis cluster, establish monitoring baseline
• Finalize technical specification
• Request Wikipedia API key increase
• Set up staging environment
Review Phase 1 results after Week 2. Proceed to Phase 2 if performance targets met.
ExploreIt Scalability Architecture Proposal
From 53 to 2 Million Locations
Use arrow keys or buttons to navigate • ESC to exit