14. Databases
A database is a logical grouping of data that supports the electronic storage and manipulation of that data by a computer system.
- Typically controlled by a database management system (DBMS)
Over the past five decades, databases have evolved from flat file-based systems to relational and non-relational databases. Historically, databases have been categorized into four groups:
- Flat file-based
- Consists of file systems with data maintained in files.
- Hierarchical file-based database
- Similar to a flat file-based system but the files share a parent-child relationship.
- Relational databases
- Organize data into tables
- Non-relational databases
- Customized databases
A. Relational Databases
Also known as a relational database management system (RDBMS), a relational database stores data in tables.
Tables use columns to help define the information being stored and rows to hold the actual data.
- Each table contains at least one column with unique values, which acts as the primary key
- When a table's primary key is used in a different table, the column in the second table is called a foreign key
SQL
- Structured Query Language (SQL) is used to interact with relational databases. It is the standard language for relational database management systems and supports data retrieval, querying, and manipulation.
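The ideas above (tables, primary keys, foreign keys, and SQL) can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a recommended schema: the customers and orders tables, their columns, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: each table has a primary key, and
# orders.customer_id is a foreign key pointing at customers.id.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " item TEXT NOT NULL)"
)

# Data manipulation: insert one customer and one order that refers to it.
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (id, customer_id, item) VALUES (10, 1, 'keyboard')")

# Data retrieval: join the two tables through the foreign key.
rows = conn.execute(
    "SELECT customers.name, orders.item FROM orders"
    " JOIN customers ON customers.id = orders.customer_id"
).fetchall()

# The foreign key rejects an order for a customer that does not exist.
try:
    conn.execute("INSERT INTO orders (id, customer_id, item) VALUES (11, 99, 'mouse')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Here rows holds the joined customer name and item, and rejected records whether the database refused the dangling reference.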
Examples:
- MySQL
- Oracle
- Microsoft SQL Server
- PostgreSQL
- MariaDB
Advantages
- Data accuracy - the use of primary and foreign keys helps connect and identify records across tables
- Simplified model
- Easy access to data
- Normalization
I. When to use a relational database
- Preferred option for housing data that has a fairly strong structure of rows and columns
- Prime candidates include data points with a consistent meaning that can be placed into categories and that have relationships to one another
- Better choice in scenarios where repeated data analysis requires constantly querying specific cross sections of the data
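As a rough illustration of querying the same structured data along different cross sections, the sketch below totals a hypothetical sales table by region and by product. The table, its columns, and the total_by helper are assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    " id INTEGER PRIMARY KEY, region TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, product, amount) VALUES (?, ?, ?)",
    [
        ("north", "widget", 10.0),
        ("north", "gadget", 5.0),
        ("south", "widget", 7.5),
        ("south", "widget", 2.5),
    ],
)

def total_by(column):
    """Sum the amounts grouped by one categorical column (hypothetical helper)."""
    # Column names cannot be bound as SQL parameters, so only
    # a fixed set of known column names is accepted here.
    if column not in {"region", "product"}:
        raise ValueError(f"unsupported column: {column}")
    query = (
        f"SELECT {column}, SUM(amount) FROM sales"
        f" GROUP BY {column} ORDER BY {column}"
    )
    return conn.execute(query).fetchall()

by_region = total_by("region")
by_product = total_by("product")
```

Because every row has the same columns with a consistent meaning, the same table can be sliced repeatedly without reshaping the data first.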
B. Non-relational Databases
Non-relational databases contain data stored in a non-tabular format and commonly use data structures such as documents or objects.
The two most common types of non-relational databases are:
- Document-based
- Stores data in documents
- Supports a variety of data types, such as strings, numbers, arrays, and objects
- Key-value
- Stores data in key-value pairs
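The two models can be approximated with plain Python data structures. This is only a conceptual sketch using a dict and the json module; it does not use the API of any real document or key-value database, and the keys and field names are hypothetical.

```python
import json

# Key-value model: a value is stored and looked up by its key alone.
kv_store = {}
kv_store["session:42"] = "user=ada;theme=dark"
session = kv_store["session:42"]

# Document model: one record is a nested structure that can mix
# strings, numbers, arrays, and objects.
document = {
    "name": "Ada",                  # string
    "age": 36,                      # number
    "tags": ["admin", "beta"],      # array
    "address": {"city": "London"},  # nested object
}

# Documents are commonly serialized as JSON or a similar format.
documents = {"user:1": json.dumps(document)}
restored = json.loads(documents["user:1"])
```

In the key-value case the store treats the value as opaque, while in the document case the nested fields remain available after the record is read back.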
Examples:
- MongoDB
- Amazon DynamoDB
- Redis
- Cassandra
- etcd
- Google Cloud Firestore
Advantages
- Simple data management
- Greater readability
- Enhanced ability
- Open-source options
- High-performance
- Better scalability
I. When to use a non-relational database
- Suitable for use cases in which there is a large amount of data that relates to a single topic
- Audience segments
- Unified customer profiles
- Industry-wide trend data
- Application databases
- Large collections of images, text or other data
- When the stored data needs to be flexible in size or shape
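To make "flexible in size or shape" concrete, the sketch below keeps records with different fields in one collection, which a fixed table schema would not allow without extra columns. The profiles list and the find_with_field helper are hypothetical and stand in for a real document store.

```python
# One collection holding records that do not share the same fields.
profiles = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "Lin", "segments": ["newsletter", "trial"]},
    {"id": 3, "name": "Sam", "image": {"width": 640, "height": 480}},
]

def find_with_field(collection, field):
    """Return the records that contain the given field (hypothetical helper)."""
    return [doc for doc in collection if field in doc]

segmented_ids = [doc["id"] for doc in find_with_field(profiles, "segments")]
named_count = len(find_with_field(profiles, "name"))
```

Each record carries only the fields it needs, so adding a new kind of attribute does not require changing the existing records.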