DB Scaling: Replication or Sharding?
Your startup took off. But joy turns into panic: the database is "suffocating". CPU is maxed out, queries hang for 5 seconds. Just "adding RAM" doesn't help anymore. It's time to choose the architectural pill: Replication or Sharding?
The Architect's Dilemma
Choosing a strategy depends on exactly where your "bottleneck" is: in reading (Read) or writing (Write).

Fig 1. Left: Master-Slave Replication. Right: Horizontal Sharding.
1. Replication: Scaling Reads
Essence: You have one "Boss" (Master) who accepts all changes, and many "Subordinates" (Slaves) who only serve data.
- When to use: 80-90% of the load is reading (Read-heavy). Typical for media, blogs, e-commerce catalogs.
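- Pros: Easy to configure (PostgreSQL streaming replication, MySQL binlog replication). Data is duplicated across servers, which gives you redundancy and failover. It does not replace backups, though: an accidental DELETE is replicated too.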
- Cons: Replication lag. With asynchronous replication, data written to the Master appears on a Slave only after a delay (100 ms, for example, and more under load). The user might not see their own comment immediately.
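One common way to soften replication lag is to send a user's reads to the Master for a short window after that user writes. The sketch below is only an illustration of that idea: the class name `ReplicatedRouter`, the `pin_seconds` parameter, and the server names are hypothetical, and real connection objects would take the place of the strings.

```python
import time
from itertools import cycle


class ReplicatedRouter:
    """Sends writes to the Master and spreads reads across Slaves.

    Assumption for this sketch: after a user writes, that user's reads are
    pinned to the Master for `pin_seconds`, so they see their own data even
    if the Slaves are lagging. Requires at least one Slave.
    """

    def __init__(self, master, slaves, pin_seconds=1.0, clock=time.monotonic):
        self.master = master
        self._slaves = cycle(slaves)  # simple round-robin over Slaves
        self.pin_seconds = pin_seconds
        self._clock = clock
        self._last_write = {}  # user_id -> time of that user's last write

    def for_write(self, user_id):
        # Remember when this user last wrote, then route to the Master.
        self._last_write[user_id] = self._clock()
        return self.master

    def for_read(self, user_id):
        last = self._last_write.get(user_id)
        if last is not None and self._clock() - last < self.pin_seconds:
            return self.master  # recent writer: read from the Master
        return next(self._slaves)  # everyone else: next Slave in turn


router = ReplicatedRouter("master", ["slave-1", "slave-2"], pin_seconds=1.0)
router.for_write(user_id=42)        # routed to "master"
print(router.for_read(user_id=42))  # still "master": inside the pin window
print(router.for_read(user_id=7))   # "slave-1": this user has not written
```

The pin window is a guess at the worst-case lag, not a guarantee. If the Slaves fall further behind than `pin_seconds`, the user can still see stale data.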
2. Sharding: Scaling Writes
Essence: A single Master can't cope with the write load, so we cut the database into pieces. Users A-M go to Server 1, N-Z to Server 2.
- When to use: The data is too large to fit on one server, or a single Master can't keep up with the writes (Write-heavy).
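- Pros: Theoretically unlimited scaling: you add servers as data and write load grow, provided the shard key spreads the load evenly.
- Cons: It hurts. You lose ordinary ACID transactions between shards (a cross-shard transaction needs a distributed protocol such as two-phase commit). You lose JOINs inside the database (how do you join a table on Server 1 with a table on Server 2?), so they move into application code. Backups become a nightmare.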
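To make the trade-offs concrete, here is a minimal sketch of shard routing and of a JOIN done in application code. The function names and the in-memory dicts and lists standing in for shard servers are assumptions made for illustration. `range_shard` follows the A-M / N-Z split from the text, and `hash_shard` shows a common alternative that spreads keys more evenly.

```python
import hashlib


def range_shard(username: str) -> int:
    """Range sharding as in the text: A-M -> shard 0, N-Z -> shard 1.

    Names that sort before "N" (including digits) land on shard 0.
    """
    first = username[:1].upper()
    return 0 if first <= "M" else 1


def hash_shard(key: str, shard_count: int) -> int:
    """Hash sharding: a stable hash of the key, modulo the shard count.

    Spreads keys evenly, but changing shard_count remaps most keys.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def orders_with_usernames(user_shards, order_shards):
    """Application-side JOIN: gather rows from every shard, match in memory.

    user_shards:  list of dicts {user_id: username}, one dict per shard
    order_shards: list of lists of (user_id, item), one list per shard
    """
    users = {}
    for shard in user_shards:
        users.update(shard)
    return [
        (users[user_id], item)
        for shard in order_shards
        for user_id, item in shard
        if user_id in users
    ]


print(range_shard("alice"), range_shard("nina"))  # 0 1
print(orders_with_usernames(
    [{1: "alice"}, {2: "nina"}],
    [[(2, "book")], [(1, "lamp")]],
))  # [('nina', 'book'), ('alice', 'lamp')]
```

The join touches every shard and pulls the rows into application memory, which is exactly the cost the "Cons" item describes.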
NineLab Verdict
Golden Rule: Postpone sharding until the last moment. It's a "nuclear button".
First, indexes. Then caching (Redis). Then replication. And only if your traffic is at the level of Telegram or Uber, sharding. Don't complicate the architecture prematurely.
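The caching step is often implemented with the cache-aside pattern: look in the cache first, fall back to the database on a miss, and store the result with a time-to-live. The sketch below is a rough illustration under assumptions. A plain dict stands in for Redis, and `CacheAside`, `load_from_db`, and `ttl_seconds` are hypothetical names, not part of any Redis client.

```python
import time


class CacheAside:
    """Cache-aside with a TTL. A dict stands in for Redis in this sketch."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db  # hypothetical callable: key -> value
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]  # cache hit: the database is not touched
        value = self._load(key)  # miss or expired: go to the database
        self._store[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call after a write so the next read reloads fresh data.
        self._store.pop(key, None)


db_calls = []

def load_article(key):
    db_calls.append(key)  # record each trip to the "database"
    return f"article body for {key}"

cache = CacheAside(load_article, ttl_seconds=60.0)
cache.get("post:1")
cache.get("post:1")
print(len(db_calls))  # 1: the second read was served from the cache
```

Each cache hit is a query the database never sees, which is why this step comes before replication on the ladder.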
FAQ for this topic
Traffic shape and data rarely match prod. You need scenarios, the same metrics as prod, and gradual ramp with rollback.
Often DB/query plans, connection pools, synchronous external calls, and queues are the first suspects for a quick checklist.
Not necessarily: invalidation, cold starts, and key skew can hurt. Cache is designed around read models and SLOs.
When vertical scaling and query tuning hit a ceiling and data growth is predictable along a shard key.