Why You Need Databases
You need databases because you need to persist data. You can't rely on data held in memory on the client or the application server: it is lost if the server reboots or the user closes their laptop or browser. A database's job is to store data durably and retrieve it efficiently, which is why almost every application uses one.
The two main types you'll work with are SQL and NoSQL. Neither is inherently better; each has its use cases. Pick the best one based on the problem.
SQL (Relational)
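Traditional relational databases such as PostgreSQL, MySQL, and MariaDB store data in tables with rows and columns. Think of a spreadsheet like Excel: structured and tabular. They're usually ACID compliant, meaning transactions are atomic (all changes apply or none do), consistent (constraints hold before and after), isolated (concurrent transactions don't see each other's partial work), and durable (committed data survives a crash).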
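To make the "atomic" part concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a server database like PostgreSQL. The `accounts` table, the `transfer` helper, and the sample balances are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is what makes an overdraft fail.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        # `with conn:` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the failed transfer, the credit to bob runs first and is then undone when the debit violates the constraint, so no money appears out of nowhere.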
Joins
Tables can be linked by keys (e.g., user_id in an orders table pointing to the users table). A join combines rows from two or more tables based on those relationships. For example, "get all orders with their user names" might join the orders table and the users table on user_id. This keeps data normalized (no duplication) but requires joins when you need related data.
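The "orders with their user names" query can be sketched like this, again using Python's built-in `sqlite3` so it runs anywhere. The column names other than `user_id` and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        user_id     INTEGER NOT NULL REFERENCES users(user_id),
        total_cents INTEGER NOT NULL
    );
    INSERT INTO users  VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO orders VALUES (10, 1, 2500), (11, 1, 900), (12, 2, 4000);
    """
)

# Each user's name is stored once in `users`; the join attaches it to every order.
rows = conn.execute(
    """
    SELECT o.order_id, u.name, o.total_cents
    FROM orders AS o
    JOIN users  AS u ON u.user_id = o.user_id
    ORDER BY o.order_id
    """
).fetchall()

for row in rows:
    print(row)
```

If Ada changes her name, only one row in `users` needs updating, and every order picks up the new name the next time the join runs.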
Schema and Structure
Data is highly structured and must follow a schema you define upfront. You specify columns, types, and constraints. If you need to change the schema (add a column, change a type), you create migrations—explicit, versioned changes to the database structure.
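A minimal sketch of versioned migrations, assuming a hand-rolled `schema_version` table and a hypothetical `MIGRATIONS` list. Real projects usually use a migration tool, but the idea is the same: an ordered list of changes, each applied exactly once.

```python
import sqlite3

# Ordered, versioned schema changes (contents are illustrative).
MIGRATIONS = [
    (1, "CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"),
    (2, "ALTER TABLE users ADD COLUMN display_name TEXT"),
]


def migrate(conn):
    """Apply every migration newer than the recorded version; return what was applied."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]
    applied = []
    for version, statement in MIGRATIONS:
        if version > current:
            conn.execute(statement)
            conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
            applied.append(version)
    conn.commit()
    return applied


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)   # applies versions 1 and 2
second_run = migrate(conn)  # nothing left to apply
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]

# The schema's constraints are enforced on every write.
conn.execute("INSERT INTO users (email) VALUES ('ada@example.com')")
try:
    conn.execute("INSERT INTO users (email) VALUES ('ada@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(first_run, second_run, columns, duplicate_rejected)
```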
Scaling
SQL databases are mostly vertically scalable: add more CPU, RAM, or disk to a single machine. Horizontal scaling (adding more machines, sharding) is possible but more challenging and time-consuming. Many production systems do it, but it's not as straightforward as with NoSQL.
NoSQL (Non-Relational)
NoSQL databases are non-relational. They emerged with a focus on scalability and availability for large, distributed systems. Again: not "better" than SQL—different trade-offs.
Characteristics
- Limited or no joins. Data is often denormalized: you duplicate data across documents or records so that a single read returns everything you need, instead of joining tables at query time. The trade-off is that duplicated data has to be kept in sync when it changes.
- Less structured. Many NoSQL stores are schemaless—you don't define columns upfront. You can add new fields as you go. Documents are often stored as JSON-like structures.
- Horizontally scalable. Many NoSQL databases are designed to scale out across many machines (sharding, replication) more easily than traditional SQL.
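The three characteristics above can be sketched in a few lines of plain Python. This is a toy model, not any particular database's API: the `shards` list stands in for separate machines, and `shard_for`, `put`, and `get` are hypothetical helpers.

```python
import hashlib

# Denormalized document: the user's name is copied into the order so one read
# returns everything, and `gift_note` is simply added with no schema change.
order_doc = {
    "order_id": 10,
    "total_cents": 2500,
    "user": {"user_id": 1, "name": "Ada"},
    "items": [{"sku": "book-1", "qty": 1}],
    "gift_note": "Happy birthday",
}

NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]  # each dict plays the role of one machine


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard with a stable hash, so the same key always lands in the same place."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def put(key: str, doc: dict) -> None:
    shards[shard_for(key)][key] = doc


def get(key: str):
    return shards[shard_for(key)].get(key)


put("order:10", order_doc)
print(shard_for("order:10"), get("order:10")["user"]["name"])
```

Simple modulo hashing like this moves most keys when the shard count changes, which is one reason real systems tend to use more elaborate partitioning schemes.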
Types of NoSQL
| Type | Description | Examples |
|---|---|---|
| Key-value | Store value by key; fastest for simple lookup | Redis, DynamoDB |
| Document | Store JSON-like documents; flexible schema | MongoDB, CouchDB |
| Column / wide-column | Store rows in column families; good for write-heavy workloads at large scale | Cassandra, HBase |
| Graph | Nodes and edges; optimized for relationships | Neo4j, Amazon Neptune |
Other Database Types
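Beyond general-purpose SQL and NoSQL stores, there are specialized databases for specific use cases. Some overlap with the NoSQL categories above (graph databases, for example):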
| Type | Use case | Examples |
|---|---|---|
| Graph | Modeling relationships—social graphs, recommendations, fraud detection | Neo4j, Amazon Neptune |
| Vector | Similarity search over embeddings for AI/ML ("find items similar to this") | Pinecone, Weaviate, pgvector (a PostgreSQL extension) |
| Time-series | Metrics, IoT, events—data indexed by time | InfluxDB, TimescaleDB |
| Search engine | Full-text search, fuzzy matching, faceted search | Elasticsearch, OpenSearch |
| In-memory | Caching, session store, real-time—ultra-fast reads | Redis, Memcached |
These often sit alongside your primary SQL or NoSQL database for specific workloads. For example: Postgres for transactional data, Redis for cache, Elasticsearch for search.
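One common way to combine a primary database with a cache is to check the cache first and fall back to the database on a miss (often called cache-aside). The sketch below uses stand-ins so it runs with the standard library alone: a plain dict plays the role of Redis and `sqlite3` plays the role of Postgres. The `get_user_name` helper and the key format are assumptions.

```python
import sqlite3

# Primary store (stand-in for Postgres).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.execute("INSERT INTO users VALUES (1, 'Ada')")
db.commit()

cache = {}                # stand-in for Redis
stats = {"db_reads": 0}   # counts how often we actually hit the database


def get_user_name(user_id):
    """Return the user's name, reading from the cache when possible."""
    key = f"user:{user_id}:name"
    if key in cache:
        return cache[key]
    stats["db_reads"] += 1
    row = db.execute("SELECT name FROM users WHERE user_id = ?", (user_id,)).fetchone()
    name = row[0] if row else None
    if name is not None:
        cache[key] = name  # populate the cache for later reads
    return name


first = get_user_name(1)   # cache miss: reads the database
second = get_user_name(1)  # cache hit: no database read
print(first, second, stats["db_reads"])
```

A real cache would also need expiry or invalidation so that updates in the primary database eventually reach readers.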