Popular SQL database engines include MySQL, PostgreSQL, SQLite, and MariaDB.
Here’s a simple example:
CustomerID & OrderID are primary keys, while CustomerID serves as a foreign key in Orders Table
Here’s the MySQL code to generate the Customers and Orders Tables:
You can perform CRUD Operations on these tables
CREATE: Add new data
READ: Retrieve data
UPDATE: Modify existing data
DELETE: Remove data
You can also Join two different tables, to join tables, they must share some common key:
NoSQL(Document Model)
Popular NoSQL database engines include MongoDB, Apache Cassandra, Redis, Amazon DynamoDB, and Neo4j.
There are many types of NoSQL databases, first, look at Document Stores.
Document Stores
Common Document Stores database engines include MongoDB and CouchDB.
Document stores save data in flexible, JSON-like documents. Each document can have a different structure.
A collection is a grouping of documents. Documents within a collection can have different fields. A collection is the equivalent of a table in a relational database system. A collection exists within a single database.
One can perform CRUD operations (ex: **db.collection.find(), db.collection.find())** in a Document Store DB, but native join operation is not supported, that is to say, the application code has to emulate the join operation with multiple queries.
Key-Value Stores
The most popular Key-Value Store is Redis.
Key-value stores are simple databases that store data as a collection of key-value pairs.
This type of DB serves as a cache in RAM to other persistence DB, while the data capacity is limited, the speed is ultra-fast.
Wide-Column Stores
Popular options include Cassandra, HBase.
Wide-Column Stores is just like adding another layer on top of Key-Value Stores, essentially it is a 2-dimensional key-value store, which makes it more efficient to query data compared to the full table scan required in SQL.
Here’s an example of data stored in Cassandra.
Cassandra provides Cassandra Query Language (CQL) for user to query data, it looks like SQL, but does not support Join.
Graph Databases
Popular options include Neo4j, Amazon Neptune.
A graph database is a systematic collection of data that emphasizes the relationships between the different data entities.
When to use this kind of DB? When many-to-many relationships are very common in your data.
Eventual consistency or BASE (Basically Available, Soft state, Eventually consistent)
Scalability
Vertical scaling (scale-up by increasing hardware capacity)
Horizontal scaling (scale-out by adding more servers)
Flexibility
Less flexible; changes to schema can be complex and disruptive
Highly flexible; easy to add new types of data or change schema
Data Relationships
Well-suited for complex relationships using joins
Less efficient for complex relationships; better suited for hierarchical or flat data
Examples
MySQL, PostgreSQL, Oracle, SQL Server
MongoDB, Cassandra, Redis, Neo4j (graph database)
The reason that it is more difficult to scale a SQL DB is because they typically promise strong ACID.
SQL(declarative query) has an internal optimizer, which is usually better than imperative query APIs.
The fact that SQL is more limited in functionality gives the database much more room for automatic optimizations.
Imperative code is very hard to parallelize across multiple cores and multiple machines because it specifies instructions that must be performed in a particular order.
Join operation is more costly in NoSQL
Does Join matters? Data has a tendency of becoming more interconnected as features are added to applications, and at that point, Join does matter.
MapReduce
Google’s MapReduce lies somewhere between Declarative Query and Imperative Query. Its “map” and “reduce”, where both have to be pure functions, which enable “map” and “reduce” to be run in any order and in anywhere.