James Peralta

Why Scale?

Scalability is the capability of a system, process, or network to grow and manage increased demand. As demand increases, can your system handle it? If it can, it's scalable. Any distributed system that can continuously evolve to support a growing amount of work is considered scalable.

This is a good problem to have: it means your system has grown to the point where it needs to support more users. When that happens, it needs to be scalable.

The Single-Server Problem

Think back to the original single-server setup. If many clients send requests to one server, that server eventually becomes overloaded and cannot serve every request in time. Latency and response times go up, and users get a bad experience. To keep them low, you need to scale your server.
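To put rough numbers on how latency rises as one server fills up, here is a minimal sketch using the textbook M/M/1 queueing model. The model is an assumption for illustration (random arrivals, a single server, no timeouts), and the 100 requests-per-second capacity is a made-up figure, not a measurement from any real system.

```python
def mean_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time a request spends in an M/M/1 system, in seconds.

    Uses the standard result 1 / (service_rate - arrival_rate).
    Returns infinity once arrivals meet or exceed capacity, because
    the queue then grows without bound.
    """
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


# Hypothetical server that can complete 100 requests per second.
SERVICE_RATE = 100.0

for load in (50, 90, 99):
    ms = mean_response_time(load, SERVICE_RATE) * 1000
    print(f"{load} req/s -> {ms:.0f} ms")
# 50 req/s -> 20 ms
# 90 req/s -> 100 ms
# 99 req/s -> 1000 ms
```

Under these assumptions, going from 50% to 99% utilization raises the mean response time fiftyfold. That is why an overloaded single server feels slow well before it stops responding.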

There are two main ways to do that: vertically and horizontally.

Vertical Scaling

Vertical scaling (scale up) means making your existing server stronger. Add more CPU, more RAM—throw more compute at it. On the extreme end, you could have a supercomputer as your server.

Pros

Easier to do. In the cloud, you can often just flip a switch and move to a stronger instance. No architectural changes.
Simpler. One server, no coordination between machines.

Cons

Hardware ceiling and cost. You can only make a computer so strong. Eventually you hit a ceiling, and the stronger the machine gets, the more expensive it is.
Specialized parts and expertise. High-end hardware is harder to source and maintain. You need specialized people to run it.
Single point of failure. If you have one server and it goes down, your whole system goes down. Making changes (upgrades, maintenance) usually means downtime.

Horizontal Scaling

Horizontal scaling (scale out) means adding more servers instead of making one server stronger. Distribute the load across multiple machines.

Pros

Commodity hardware. You can use many weaker, cheaper machines instead of one supercomputer. Standard parts, no specialized equipment.
No single point of failure. If one server goes down, traffic can be routed to the others. You can avoid downtime during maintenance or failures.
Higher ceiling. You can keep adding servers as demand grows (within reason).
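As a rough illustration of sizing a horizontally scaled fleet, the sketch below estimates how many identical servers a given peak load needs, plus one spare so a single failure does not cause overload. The function name, the traffic figures, and the one-spare policy are all hypothetical choices for this example.

```python
import math


def servers_needed(peak_rps: float, per_server_rps: float, spare: int = 1) -> int:
    """Estimate the server count for a peak load, plus spare machines.

    peak_rps:       expected peak requests per second (assumed known)
    per_server_rps: requests per second one server handles comfortably
    spare:          extra servers kept so a failure does not overload the rest
    """
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    return math.ceil(peak_rps / per_server_rps) + spare


# Hypothetical numbers: 2,500 req/s at peak, 400 req/s per commodity server.
print(servers_needed(2500, 400))  # 8 (7 to carry the load, 1 spare)
```

When demand grows, you rerun the same calculation with a larger peak and add machines, rather than replacing the server with a bigger one.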

Cons

Coordination. You now have to coordinate between multiple servers. Load balancing, data consistency, communication—these become problems you need to solve. We'll get to that.
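To make the coordination problem concrete, here is a minimal sketch of round-robin load balancing that skips servers marked as down. It is not a production load balancer: the class name, the server names, and the manual mark_down call are hypothetical, and real systems usually detect failures through automated health checks.

```python
class RoundRobinBalancer:
    """Hand requests to servers in turn, skipping any marked unhealthy."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once, starting from the current position.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([lb.pick() for _ in range(4)])  # ['app-1', 'app-2', 'app-3', 'app-1']

lb.mark_down("app-2")  # simulate a failure
print([lb.pick() for _ in range(4)])  # ['app-3', 'app-1', 'app-3', 'app-1']
```

After app-2 is marked down, traffic keeps flowing to the remaining servers. The balancer itself is now a component you have to run and keep available, which is part of the coordination cost.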

Summary

	Vertical	Horizontal
Approach	Make the server stronger	Add more servers
Ease	Easier, often a config change	Harder, requires architectural changes
Failure	Single point of failure	Distributed, more resilient
Ceiling	Limited by hardware	Can add more machines
Trade-off	Simplicity vs. limits	Complexity vs. scale
