Relational Database Service (RDS)

RDS is a managed relational database service from AWS.
It uses SQL as its query language and supports the following database engines:

Postgres
MySQL
MariaDB
Oracle
Microsoft SQL Server
IBM Db2
Aurora (AWS proprietary service)

Why use a managed service?

AWS manages the whole service for you, including deployment of databases onto infrastructure. Because it’s managed, you don’t have access to the underlying infra, so you can’t SSH into the DB servers.

Advantages of this:

Automated provisioning, OS patches
Backups
Monitoring dashboards
Scaling (horizontal and vertical)
Multi AZ (recovery)
Read replicas (performance)

Storage Auto Scaling

AWS can scale DB storage automatically depending on usage, for example when the DB is running out of free space.

You have to set a Maximum Storage Threshold.

Auto scaling is good for apps with unpredictable DB operations.
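
As a rough illustration of the idea, the sketch below models the scaling decision in plain Python. The 10% free-space trigger and the 10% growth step are assumptions picked for the example, not values from these notes, and `next_allocation` is a hypothetical helper rather than an AWS API. The one thing it is meant to show is that storage never grows past the Maximum Storage Threshold.

```python
def next_allocation(allocated_gib: int, used_gib: int, max_gib: int) -> int:
    """Return the new allocated size in GiB (unchanged if no scaling is needed)."""
    free_ratio = (allocated_gib - used_gib) / allocated_gib
    if free_ratio >= 0.10:  # assumed trigger: scale only when under 10% is free
        return allocated_gib
    grown = allocated_gib + max(1, allocated_gib // 10)  # assumed step: grow by 10%
    return min(grown, max_gib)  # never exceed the Maximum Storage Threshold
```

For example, `next_allocation(100, 95, 200)` grows the allocation, while `next_allocation(100, 50, 200)` leaves it alone.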

Read Replicas vs Multi AZ

Read Replicas

Read replicas are DB replicas of the main DB that allow for more read operations (scalability).

They work by replicating the main DB and then allowing read operations to those replicas.

Replication is asynchronous, so the data between the main DB and replica DBs is eventually consistent: they will eventually have identical data, but there is a chance a read operation to a replica will receive back outdated data.

A replica can also be promoted to a standalone DB, after which it accepts writes and no longer replicates from the main DB.
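
To make the eventual consistency point concrete, here is a toy in-memory model. `ReplicatedDB` and its methods are hypothetical names for this sketch and have nothing to do with the RDS API. Writes go to the main DB, reads are spread over the replicas, and a replica only sees new data once replication has caught up.

```python
import itertools
from typing import Optional


class ReplicatedDB:
    """Toy model: one main DB, N read replicas, asynchronous replication."""

    def __init__(self, replica_count: int) -> None:
        self.main = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next = itertools.cycle(range(replica_count))

    def write(self, key: str, value: str) -> None:
        self.main[key] = value  # writes always go to the main DB

    def read(self, key: str) -> Optional[str]:
        replica = self.replicas[next(self._next)]  # round-robin over replicas
        return replica.get(key)  # may be stale until replicate() has run

    def replicate(self) -> None:
        for replica in self.replicas:
            replica.update(self.main)  # replicas catch up with the main DB
```

A read issued between `write()` and `replicate()` returns the old value (or nothing), which is exactly the outdated-read case described above.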


Network Costs for replicas

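There are network costs associated with replicas. Data transferred between AZs normally incurs a fee, but for RDS read replicas in the same region the replication traffic is free, even when the replica is in a different AZ. If you replicate to a different region, a fee is incurred.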

Multi AZ

A Multi AZ RDS deployment keeps a standby DB instance in another AZ, synchronously replicated from the master DB, ready to take over should something go wrong with the master.

This helps increase database availability.

Applications use a single DNS name. If there is a loss of network, an instance failure or a storage failure, the standby is promoted and read and write operations are automatically passed to it.
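
The failover behaviour can be pictured with a very small model. `MultiAZDeployment` and the AZ and DNS names are hypothetical, used only for this sketch. The point is that the application keeps using the same DNS name while the instance behind it changes.

```python
class MultiAZDeployment:
    """Toy model of one DNS name in front of a master and a standby."""

    def __init__(self, dns_name: str, master_az: str, standby_az: str) -> None:
        self.dns_name = dns_name
        self.active_az = master_az
        self.standby_az = standby_az

    def resolve(self) -> str:
        """Return the AZ of the instance the DNS name currently points at."""
        return self.active_az

    def fail_over(self) -> None:
        """Promote the standby; the DNS name itself does not change."""
        self.active_az, self.standby_az = self.standby_az, self.active_az
```

After `fail_over()`, `resolve()` returns the former standby's AZ and `dns_name` is untouched, so clients do not need reconfiguring.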


There is no downtime when converting a single AZ DB to Multi AZ; you just modify the DB and enable Multi AZ.

It is important to know that you can also use read replicas for Disaster Recovery (DR).

RDS Custom

While it’s been mentioned that RDS is a fully managed service, RDS Custom gives you OS access, including SSH into the instances, for two databases:

Oracle
Microsoft SQL Server

For these two DBs you can:

Configure settings
Install patches
Enable features
Access EC2 instance using SSH

RDS is a managed service, except for RDS Custom (Oracle and MS SQL Server), which allows customisation and EC2 access.

Amazon Aurora

This is Amazon’s proprietary database offering; it’s not open source. Aurora:

Is compatible with both Postgres and MySQL
Offers up to 5x the performance of MySQL and 3x the performance of Postgres on RDS, according to AWS
Grows storage automatically in 10GB increments, up to 128TB
Can have up to 15 read replicas
Fails over near-instantaneously

Aurora availability and read scaling

Aurora creates 6 copies of your data across 3 AZs.

The storage is self healing and auto expanding.

One instance takes writes (the master) and the data is replicated across the instances, which can be used for read operations.

Aurora Cluster

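An Aurora cluster has:

A shared storage volume
1 master writer, reached through the writer endpoint
Up to 15 readers behind a single reader endpoint, which load balances connections across them; the readers can scale automatically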

Advanced - Aurora replicas and auto scaling

If the reader endpoint receives much more traffic, replica auto scaling adds read replicas and the reader endpoint spreads connections across the larger set.

Advanced - replicas and custom endpoints

By default, replicas share the same reader endpoint, but you can define a custom endpoint over a subset of replicas, for example ones running on larger or more powerful instances, and point specific workloads at it.

Advanced - Aurora serverless

Serverless option from AWS that:

automatically scales depending on usage
no storage planning needed
pay per second pricing model

Advanced - global availability

For global availability you can:

Create Aurora cross region read replicas
Use an Aurora global database (recommended)

The global database has:

1 primary region (reads and writes)
Up to 5 additional read only regions
Up to 16 read replicas per read only region
Lower latency for users around the world
Cross region replication that typically takes less than 1s

In the exam when “cross region replication < 1s” is mentioned, it’s referring to Aurora Global

Advanced - Aurora Machine Learning

Aurora Machine Learning lets you call other AWS ML services, SageMaker and Comprehend, from SQL. Your application runs a SQL query, Aurora sends the relevant table data to the ML service and returns the predictions as query results, so everything is routed through Aurora.

RDS Backups

There are two ways to backup:

Automated Backups: a daily full backup plus transaction logs backed up every 5 minutes, so you can restore to any point in time from the oldest retained backup up to 5 minutes ago. Retention is 1 to 35 days.
Manual DB Snapshots: manually triggered and retained for as long as you want.
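
The restore window for automated backups can be worked out with a little date arithmetic. This is a sketch under stated assumptions: `restore_window` and `can_restore_to` are hypothetical helpers, and the window is assumed to start exactly `retention_days` ago, which only holds if the DB has existed for the whole retention period.

```python
from datetime import datetime, timedelta


def restore_window(now: datetime, retention_days: int):
    """Return (earliest, latest) restorable times for automated backups."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention must be between 1 and 35 days")
    earliest = now - timedelta(days=retention_days)  # assumed: oldest retained backup
    latest = now - timedelta(minutes=5)  # the last 5 minutes are not yet restorable
    return earliest, latest


def can_restore_to(target: datetime, now: datetime, retention_days: int) -> bool:
    earliest, latest = restore_window(now, retention_days)
    return earliest <= target <= latest
```

With 7 days of retention, a point one hour ago is restorable, while a point two minutes ago or eight days ago is not.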

To save money on a DB you rarely use, instead of stopping it (you will still pay for storage) you can create a snapshot, delete the DB and restore from the snapshot in the future.

Aurora backups

It’s similar, with automated backups and manual DB snapshots. Automated backups cannot be turned off!

Restoring operations

Two options:

Restore from an automated backup or a manual snapshot. This will create a new DB.
Restore MySQL / Aurora MySQL from S3. This involves creating a backup of your existing database, storing it on S3 and then restoring from there. For Aurora you have to use Percona XtraBackup to create the backup.

Aurora DB cloning

Creates new Aurora cluster from an existing one. You can do this to run DB testing, for example.

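Cloning is fast because no data is copied up front: the new cluster initially shares the original cluster’s data volume. Only when write operations are made is new storage allocated, and it is allocated only to the cluster that made the write.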
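
Allocating storage only when something is written is commonly called copy-on-write. The toy model below shows the idea with a dictionary of pages. `Cluster` and its methods are hypothetical names for this sketch and do not reflect how Aurora is implemented internally.

```python
class Cluster:
    """Toy copy-on-write cluster volume (hypothetical model, not an AWS API)."""

    def __init__(self, shared=None):
        # Pages shared with other clusters; never modified once shared.
        self._shared = shared if shared is not None else {}
        # Pages this cluster has written itself.
        self._own = {}

    def write(self, page, data):
        self._own[page] = data  # new storage is allocated only here

    def read(self, page):
        if page in self._own:
            return self._own[page]
        return self._shared.get(page)

    def clone(self):
        # Freeze the current contents as a shared base. Only the page table is
        # merged here; page contents are not duplicated.
        base = {**self._shared, **self._own}
        self._shared, self._own = base, {}
        return Cluster(shared=base)

    def own_pages(self):
        return len(self._own)
```

Right after `clone()`, the clone owns no pages of its own; a write on either side changes only that side's view, and the other cluster keeps reading the shared page.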

RDS Proxy

There is a fully managed proxy service with RDS. A proxy is a middle man between client applications and (in this case) a DB server.

Proxies help reduce load on DB servers by improving the performance, scalability and security of DB communication. They do this by: load balancing, pooling connections, caching, query optimisation etc.

The RDS proxy service:

Allows apps to pool and share DB connections
Improves DB efficiency by reducing stress on DB resources
Autoscales, multi AZ
Supports the SQL engines (Postgres, MySQL, MariaDB etc)
Can enforce IAM auth and store creds in AWS Secrets Manager
Is never publicly accessible (it must be accessed from within the VPC)
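
Connection pooling is the core idea here, and it can be sketched in a few lines. `ConnectionPool`, `connect` and `work` are hypothetical names for this example; it illustrates the general technique, not how RDS Proxy is built. Many requests share a small, fixed set of connections instead of each opening its own.

```python
import queue


class ConnectionPool:
    """Minimal pool: a fixed number of connections reused by every caller."""

    def __init__(self, connect, size: int) -> None:
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # open all connections up front

    def run(self, work):
        conn = self._idle.get()  # blocks while every connection is in use
        try:
            return work(conn)
        finally:
            self._idle.put(conn)  # hand the connection back for reuse
```

Running ten pieces of work through a pool of size 2 opens only two connections in total, which is the reduction in DB load the list above refers to.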