Aurora

Amazon Aurora is a relational database service compatible with either MySQL or PostgreSQL (chosen at creation time), with cluster volumes that can scale up to 128TiB. It's implemented as an RDS engine type.

Concepts

  • Clusters house one or more instances.
    • The primary cluster may contain both a writer and readers.
    • Secondary clusters are permitted only readers, and may be deployed to other regions to increase read performance and enable rapid failover. Writes from secondary clusters are forwarded from the local region to the cluster in the primary region before being replicated back, simplifying management of endpoints within applications.
  • Instances have an associated role:
    • Writer: only a single instance accepts writes at any time.
    • Readers: the remaining instances serve read-only queries.

Performance

  • < 1m to accept read/write workload after region failure.
  • < 1s cross-country replication lag (using physical replication).
  • <= 200k writes/second with negligible performance impact.
  • Supports up to 15 read replicas, for scalability and availability.

Design

  • Storage and compute decoupled to better take advantage of available capacity
  • Log-structured distributed storage layer
  • Six copies of the data stored across three availability zones
  • Write forwarding from secondary clusters to primary clusters (currently MySQL only!) allows applications to keep a single endpoint for all database operations. This option can be enabled on secondary clusters.
    • Three consistency levels:
      • Eventual
      • Session is required for write forwarding
      • Global (strongest)
    • No support for DDL operations, table locking, LOAD, XA, SAVEPOINT, ROLLBACK or UPDATE FROM TEMP TABLE statements.
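
The restrictions above can be sketched as a client-side pre-check before sending a statement to a secondary cluster. This is a rough illustration rather than Aurora's own logic: the function name can_forward and the first-keyword matching are assumptions made here, and the "UPDATE FROM TEMP TABLE" case is left out because it can't be detected from the leading keyword alone.

```python
# Leading keywords for the statement types the notes list as unsupported
# under write forwarding. Matching on the first keyword is a simplification;
# a real check would need a SQL parser.
UNSUPPORTED_KEYWORDS = {
    "CREATE", "ALTER", "DROP", "TRUNCATE", "RENAME",  # DDL
    "LOCK", "UNLOCK",                                  # table locking
    "LOAD",
    "XA",
    "SAVEPOINT",
    "ROLLBACK",
}


def can_forward(statement: str) -> bool:
    """Return False when the statement starts with an unsupported keyword."""
    tokens = statement.strip().rstrip(";").upper().split()
    if not tokens:
        return True  # nothing to send, so nothing to reject
    return tokens[0] not in UNSUPPORTED_KEYWORDS
```

An application might call this before choosing between the local endpoint and a direct connection to the primary cluster.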

Serverless

Aurora Serverless builds on Aurora to respond faster to changes in application load by scaling the compute layer independently of the storage beneath it. Scale out grabs a warm instance from a pool, while scale in freezes instances marked for removal (for anywhere from less than 1s to "a small number of seconds") to move session state to another instance before returning the instance to the pool. This happens transparently, without dropping sessions.

RDS

Amazon RDS provides managed databases across multiple engines, backed by EC2:

  • Aurora
  • MariaDB
  • MySQL
  • Oracle
  • PostgreSQL
  • SQL Server

The platform automates common administration tasks, which take place during a weekly scheduled maintenance window to minimise impact to client applications:

  • Management of underlying hardware.
  • OS and database configuration.
  • Database engine (and operating system) updates.
  • Backups (both scheduled and ad hoc).

Concepts

  • Instances are individual managed database engine instances.
  • Clusters group read replicas with their primary instance.
  • Parameter Groups configure core features of the database engine. They're engine and version specific, and each instance must be associated with one.
    • Parameters configure individual values, and have an associated apply type that determines whether they can be applied at run-time or require a reboot.
  • Option Groups provide access to engine-specific add-on features. They're optional.
    • Options represent individual features.
      • Option Settings configure the feature.
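
The role of the apply type can be sketched with a small data model. The class and function names below are made up for illustration, and the apply types attached to the example parameters are assumptions; the actual apply type should be read from the parameter group itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    """One parameter within a parameter group (hypothetical model)."""
    name: str
    value: str
    apply_type: str  # "dynamic": applied at run-time; "static": needs a reboot


def reboot_required(changes: list[Parameter]) -> bool:
    """A batch of changes needs a reboot if any changed parameter is static."""
    return any(p.apply_type == "static" for p in changes)


# Example values only; the apply types here are assumed for illustration.
pending = [
    Parameter("max_connections", "500", "dynamic"),
    Parameter("innodb_log_file_size", "268435456", "static"),
]
print(reboot_required(pending))
```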

Resizable capacity

Storage size limits differ by database engine and instance size:

  • MariaDB 20GiB-64TiB
  • MySQL 20GiB-64TiB
  • Oracle 20GiB-64TiB
  • PostgreSQL 20GiB-64TiB
  • SQL Server 20GiB-16TiB
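
The limits above can be captured in a lookup table for validating a requested size. This is a sketch only: the dictionary keys are illustrative labels rather than official engine identifiers, and the check ignores the instance-size dimension mentioned above.

```python
GIB_PER_TIB = 1024

# (minimum, maximum) allocated storage in GiB, taken from the list above.
STORAGE_LIMITS_GIB = {
    "mariadb": (20, 64 * GIB_PER_TIB),
    "mysql": (20, 64 * GIB_PER_TIB),
    "oracle": (20, 64 * GIB_PER_TIB),
    "postgresql": (20, 64 * GIB_PER_TIB),
    "sqlserver": (20, 16 * GIB_PER_TIB),
}


def storage_size_allowed(engine: str, size_gib: int) -> bool:
    """Check a requested size against the per-engine range."""
    low, high = STORAGE_LIMITS_GIB[engine]
    return low <= size_gib <= high
```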

Reserved Instances

Reserved Instances reduce the cost of instances in exchange for a time commitment (a 30-60% reduction on On-Demand pricing).
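
As a worked example of that range, the sketch below compares monthly costs at both ends of the discount. The hourly rate of 0.20 and the 730-hour month are assumptions chosen for illustration, not real prices.

```python
HOURS_PER_MONTH = 730  # assumed average month length


def monthly_cost(on_demand_hourly: float, discount: float = 0.0) -> float:
    """Monthly cost after applying a fractional discount (0.3 means 30%)."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in the range [0, 1)")
    return round(on_demand_hourly * (1 - discount) * HOURS_PER_MONTH, 2)


rate = 0.20  # hypothetical On-Demand hourly rate
for label, discount in [("on-demand", 0.0), ("reserved, 30%", 0.3), ("reserved, 60%", 0.6)]:
    print(f"{label}: {monthly_cost(rate, discount):.2f}")
```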

Maintenance

Each DB instance has a weekly 30-minute maintenance window, which is used for AWS platform maintenance and deferred DB instance modifications.
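
A window of this kind is usually written in a ddd:hh24:mi-ddd:hh24:mi form, such as sun:05:00-sun:05:30. Assuming that format, the helper below (the function names are invented here) computes a window's length in minutes, including windows that wrap around the end of the week.

```python
DAYS = ("mon", "tue", "wed", "thu", "fri", "sat", "sun")
MINUTES_PER_DAY = 24 * 60
MINUTES_PER_WEEK = 7 * MINUTES_PER_DAY


def _minute_of_week(point: str) -> int:
    """Convert 'ddd:hh:mm' into minutes since Monday 00:00."""
    day, hours, minutes = point.split(":")
    return DAYS.index(day.lower()) * MINUTES_PER_DAY + int(hours) * 60 + int(minutes)


def window_minutes(window: str) -> int:
    """Length of a 'ddd:hh:mm-ddd:hh:mm' window, wrapping past Sunday."""
    start, end = window.split("-")
    return (_minute_of_week(end) - _minute_of_week(start)) % MINUTES_PER_WEEK
```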

Backups

DB instances can be backed up via EBS volume snapshots. The first snapshot is full, then subsequent snapshots are incremental.

Automated backups are taken during a set backup window, provided the instance is in the AVAILABLE state. They can be retained for 0-35 days, and an RDS instance can be restored to any point in time within the retention period.
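
The retention rule can be sketched as a check on whether a requested restore time is still reachable. The function name is hypothetical, and the check is simplified: it treats "now" as the latest restorable time, whereas in practice the latest restorable point lags slightly behind.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35


def can_restore_to(target: datetime, now: datetime, retention_days: int) -> bool:
    """True if target falls inside the retention period ending at now."""
    if not 0 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention must be between 0 and 35 days")
    if retention_days == 0:
        return False  # a retention of 0 means automated backups are disabled
    earliest = now - timedelta(days=retention_days)
    return earliest <= target <= now


now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
print(can_restore_to(now - timedelta(days=3), now, retention_days=7))
```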

Monitoring

The service can be monitored via CloudWatch:

  • Events are raised for state changes.
  • Database log files are made available through the RDS console, CLI and API.
  • Enhanced Monitoring provides real-time metrics for the host OS of an RDS instance via CloudWatch Logs. It can provide insight into running processes and threads beyond what the default CloudWatch metrics offer.
  • Performance Insights provides high level overviews of the performance of SQL queries.
  • Recommendations suggest carrying out maintenance operations such as engine upgrades, pending operations or enabling backups.

Networking

DB instances belong to a VPC (unless they're legacy instances, which really ought to be migrated), though public accessibility can be enabled. A multi-AZ RDS instance requires the VPC to contain at least two different subnets, each in a different AZ. DB subnet groups can be created to group together subnets within a VPC for easier assignment of RDS instances.
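
The multi-AZ subnet requirement can be sketched as a quick pre-flight check. The subnet IDs and the function name below are hypothetical; the input is assumed to be a mapping from subnet ID to availability zone.

```python
def covers_multiple_azs(subnet_azs: dict[str, str], minimum: int = 2) -> bool:
    """True if the subnets span at least `minimum` distinct availability zones."""
    return len(set(subnet_azs.values())) >= minimum


# Hypothetical subnet group: two subnets share an AZ, one sits in another.
subnet_group = {
    "subnet-aaa": "eu-west-1a",
    "subnet-bbb": "eu-west-1a",
    "subnet-ccc": "eu-west-1b",
}
print(covers_multiple_azs(subnet_group))
```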

Security groups allow you to control traffic in and out of DB instances:

  • VPC security groups
  • DB security groups
  • EC2-Classic security groups

Scaling

  • Up (storage): increasing storage can happen online; scaling down requires downtime.
  • Storage type changes require a brief outage.
  • Up or down (compute): by changing the instance class.
  • Out: with read replicas, which are maintained asynchronously.

Multi-AZ

Multi-AZ enhances durability and reliability through automatic failover to standby instances in different availability zones, updating DNS entries automatically. This effectively doubles the instance cost, since you also pay for a hot standby.

Replication to standby instances takes place synchronously, imparting a slight performance penalty on writes. All engines use an AWS proprietary implementation for this, except SQL Server, which uses its own built-in Database Mirroring (DBM) functionality. Backup operations take place against a standby instance, offloading that work from the primary.

Failover is automatic, triggered by availability zone outages, routine maintenance, and customer intervention (either via promotion of a read replica or reboot with failover to another AZ).

Billing

  • Hosting instance (instance type: size, on-demand, reserved).
  • Storage utilisation, both in terms of:
    • Storage (per GB, per month).
    • Provisioned IOPS at each tier.
    • IO (per million requests).
    • Backup storage.
  • Transfer to/from the Internet and other AWS regions.

Connectivity

A few best practices:

  • Always use DNS, not instance IP address, allowing transparent failover.
  • RDS instances can be queried either in cleartext or over an encrypted tunnel.
    • MySQL instances have both per-user (ssl_type) and global (skip_ssl, require_secure_transport) flags for this.
    • PostgreSQL instances have a single rds.force_ssl parameter.
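
On the client side, both practices can be sketched for PostgreSQL by building a libpq-style connection string that refuses IP literals and asks for a verified TLS connection. The function name, the hostname and the CA bundle path are placeholders invented for this example; values containing spaces would need quoting, which this sketch does not handle.

```python
import ipaddress


def build_postgres_dsn(host: str, dbname: str, user: str,
                       ca_bundle: str, port: int = 5432) -> str:
    """Build a libpq keyword/value string that enforces DNS names and TLS."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        pass  # not an IP literal, so treat it as a DNS name
    else:
        raise ValueError("use the instance's DNS endpoint, not an IP address")

    parts = {
        "host": host,
        "port": str(port),
        "dbname": dbname,
        "user": user,
        "sslmode": "verify-full",  # encrypt and verify the server certificate
        "sslrootcert": ca_bundle,  # path to the CA bundle (placeholder)
    }
    return " ".join(f"{key}={value}" for key, value in parts.items())
```

Keeping the DNS name in the connection string is what lets a failover redirect clients without a configuration change.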
