James Peralta

Why You Need Databases

You need databases because you need to persist data. Data that lives only in the client or in the application server's memory is lost when the server reboots or the user closes their laptop or browser. A database's job is to store data durably and retrieve it efficiently, which is why almost every application ends up using one.

The two main types you'll work with are SQL and NoSQL. Neither is inherently better; each has its use cases. Pick the best one based on the problem.

SQL (Relational)

Traditional relational databases—PostgreSQL, MySQL, MariaDB—store data in tables with rows and columns. Think of it like Excel: structured, tabular. They're usually ACID compliant, meaning transactions are atomic, consistent, isolated, and durable.
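To make the "atomic" part of ACID concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a server database like PostgreSQL. The accounts table, the transfer helper, and the sample balances are all made up for illustration. The transfer credits the receiver first, then debits the sender; if the debit would break the CHECK constraint, the whole transaction rolls back and the credit disappears with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Hypothetical helper: both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated CHECK (balance >= 0); the credit was rolled back too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Bob only has 50, so this transfer fails and leaves both balances untouched.
ok = transfer(conn, "bob", "alice", 80)
print(ok, balances(conn))
```

Without the transaction, the first UPDATE would have stuck and alice would have gained money that bob never paid.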

Joins

Tables can be linked by keys (e.g., user_id in an orders table pointing to the users table). A join combines rows from two or more tables based on those relationships. For example, "get all orders with their user names" joins the orders table to the users table on user_id. This keeps data normalized (each fact is stored once, so there is little duplication), at the cost of needing a join whenever you want related data together.
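Here is roughly what the "orders with their user names" query looks like, sketched with Python's built-in sqlite3 module. The table layout and sample rows are assumptions for illustration; only the users/orders/user_id naming comes from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users(user_id),
        item     TEXT NOT NULL
    );
    INSERT INTO users  VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 2, 'monitor'), (12, 1, 'mouse');
    """
)

# Each user's name is stored once in users; the join attaches it to every order.
rows = conn.execute(
    """
    SELECT orders.order_id, users.name, orders.item
    FROM orders
    JOIN users ON users.user_id = orders.user_id
    ORDER BY orders.order_id
    """
).fetchall()

for row in rows:
    print(row)
```

If Ada changes her name, you update one row in users and every order picks it up on the next query.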

Schema and Structure

Data is highly structured and must follow a schema you define upfront. You specify columns, types, and constraints. If you need to change the schema (add a column, change a type), you create migrations—explicit, versioned changes to the database structure.
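A migration can be as simple as an ordered list of schema changes plus a record of how many have been applied. The sketch below is a toy version of that idea, not how any particular migration tool works: MIGRATIONS, migrate, and column_names are hypothetical names, and it uses SQLite's user_version pragma to remember the current version.

```python
import sqlite3

# Hypothetical migration list: position in the list (1-based) is the schema version.
MIGRATIONS = [
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "ALTER TABLE users ADD COLUMN email TEXT",
]


def migrate(conn):
    """Apply any migrations that have not run yet; return the resulting version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        # PRAGMA values cannot be bound as parameters; version is always an int here.
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


def column_names(conn, table):
    """List a table's columns (table comes from trusted code, not user input)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = sqlite3.connect(":memory:")
print(migrate(conn), column_names(conn, "users"))
```

Running migrate a second time does nothing, which is the property you want: every environment converges on the same schema no matter where it started.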

Scaling

SQL databases are mostly vertically scalable: add more CPU, RAM, or disk to a single machine. Horizontal scaling (adding more machines, sharding) is possible but more challenging and time-consuming, largely because joins and transactions that span machines are hard to keep fast and consistent. Many production systems do it, but it's not as straightforward as with NoSQL.
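To get a feel for what sharding involves, here is a minimal sketch of hash-based shard routing. The shard names and both functions are hypothetical, and real systems usually use something smarter than plain modulo (consistent hashing, range partitioning, or a lookup service).

```python
import hashlib

SHARDS = ["db-0", "db-1", "db-2"]  # hypothetical database hosts


def shard_for(user_id, shards=SHARDS):
    """Pick the shard that owns a user by hashing the key."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def moved_keys(keys, old_shards, new_shards):
    """Count keys that land on a different shard after the shard list changes."""
    return sum(1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards))


print(shard_for(42))
print(moved_keys(range(1000), SHARDS, SHARDS + ["db-3"]))
```

Routing a single-user query is easy. The hard parts are everything else: a join between users on different shards now crosses machines, and with naive modulo hashing, adding a fourth shard reassigns most keys, which means moving most of the data.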

NoSQL (Non-Relational)

NoSQL databases are non-relational. They emerged with a focus on scalability and availability for large, distributed systems. Again: not "better" than SQL—different trade-offs.

Characteristics

Few or no joins. Most NoSQL stores either don't support joins or support them only in a limited way, so data is often denormalized: you duplicate data across documents or records for faster reads. Instead of joining tables, you store everything you need in one place.
Less structured. Many NoSQL stores are schemaless—you don't define columns upfront. You can add new fields as you go. Documents are often stored as JSON-like structures.
Horizontally scalable. Many NoSQL databases are designed to scale out across many machines (sharding, replication) more easily than traditional SQL.
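The first two characteristics are easy to see with plain Python dicts standing in for a document collection. This is only a sketch of the data shape, not any real database's API; the order documents, the gift_note field, and rename_user are invented for illustration.

```python
import json

# A dict stands in for a document collection: order_id -> document.
orders = {}

# Denormalized: the user's name is copied into every order document.
orders["o-10"] = {"user": {"user_id": 1, "name": "Ada"}, "items": ["keyboard"]}

# Schemaless: this document has a field the first one lacks, with no migration.
orders["o-12"] = {
    "user": {"user_id": 1, "name": "Ada"},
    "items": ["mouse"],
    "gift_note": "Happy birthday",
}


def rename_user(user_id, new_name):
    """The cost of duplication: a rename must touch every copy of the name."""
    updated = 0
    for doc in orders.values():
        if doc["user"]["user_id"] == user_id:
            doc["user"]["name"] = new_name
            updated += 1
    return updated


# Reading an order needs no join; everything is already in the document.
print(json.dumps(orders["o-12"], indent=2))
```

Reads are one lookup, but the rename has to find and rewrite every document that carries the old name. That is the trade you make for skipping joins.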

Types of NoSQL

Type	Description	Examples
Key-value	Store value by key; fastest for simple lookup	Redis, DynamoDB
Document	Store JSON-like documents; flexible schema	MongoDB, CouchDB
Column / wide-column	Store by column families; good for write-heavy, very large datasets	Cassandra, HBase
Graph	Nodes and edges; optimized for relationships	Neo4j, Amazon Neptune

Other Database Types

Beyond general-purpose SQL and NoSQL stores, there are specialized databases for specific use cases (some of them, such as graph databases, are also counted as NoSQL):

Type	Use case	Examples
Graph	Modeling relationships—social graphs, recommendations, fraud detection	Neo4j, Amazon Neptune
Vector	Similarity search, embeddings, AI/ML—"find items similar to this"	Pinecone, Weaviate, pgvector
Time-series	Metrics, IoT, events—data indexed by time	InfluxDB, TimescaleDB
Search engine	Full-text search, fuzzy matching, faceted search	Elasticsearch, OpenSearch
In-memory	Caching, session store, real-time—ultra-fast reads	Redis, Memcached

These often sit alongside your primary SQL or NoSQL database for specific workloads. For example: Postgres for transactional data, Redis for cache, Elasticsearch for search.
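A common way to combine a primary database with a cache is the cache-aside pattern: check the cache first, fall back to the database on a miss, and drop the cached entry when the data changes. The sketch below uses a plain dict in place of Redis and SQLite in place of Postgres so it runs anywhere; the function names, the key format, and the stats counter are assumptions for illustration.

```python
import sqlite3

# SQLite stands in for the primary database, a dict stands in for the cache.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.execute("INSERT INTO users VALUES (1, 'Ada')")
db.commit()

cache = {}
stats = {"db_reads": 0}  # counts how often we actually hit the database


def get_user_name(user_id):
    """Cache-aside read: try the cache, fall back to the database on a miss."""
    key = f"user:{user_id}:name"
    if key in cache:
        return cache[key]
    stats["db_reads"] += 1
    row = db.execute("SELECT name FROM users WHERE user_id = ?", (user_id,)).fetchone()
    if row is None:
        return None
    cache[key] = row[0]
    return row[0]


def rename_user(user_id, new_name):
    """Write to the database first, then invalidate the stale cache entry."""
    with db:
        db.execute("UPDATE users SET name = ? WHERE user_id = ?", (new_name, user_id))
    cache.pop(f"user:{user_id}:name", None)


get_user_name(1)
get_user_name(1)
print(stats["db_reads"])
```

The database stays the source of truth; the cache only saves repeat reads. A real cache would also set an expiry on each entry so that a missed invalidation eventually heals itself.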
