DMS

AWS Database Migration Service (DMS) migrates data between relational databases, data warehouses, NoSQL databases and other data stores. It can migrate between combinations of RDS-hosted and on-premises instances.

Concepts

  • A Replication Instance extracts data from the source, formats it for the target and loads it. It is a managed EC2 instance.
    • Each replication instance can run one or more Replication Tasks.
    • Each task has two associated Endpoints: a source and target. These can be different database engines, e.g. an Oracle source can be migrated to a PostgreSQL target.

Phases

Replication tasks run through three major phases:

  • Full load of existing data, table by table. Changes made to a table while it is being loaded are captured and cached on the replication instance.
  • Upon completion of each table's full load, its cached changes are applied. Once this is complete, tables are transactionally consistent.
  • Ongoing replication begins once the cached changes have been applied, typically with some degree of lag due to the backlog of transactions. Once the migration reaches a steady state, applications can be pointed at the target database.
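The order of the three phases can be illustrated with a toy, in-memory model of a single table. This is only a sketch of the sequencing described above, not how DMS is implemented: the function names and the (key, value) change format are made up for illustration, and a value of None is assumed to mean "delete this row".

```python
def apply_change(target, key, value):
    """Apply one change to the target table; a value of None deletes the row."""
    if value is None:
        target.pop(key, None)
    else:
        target[key] = value


def migrate_table(source_rows, changes_during_load, changes_after_load):
    """Toy model of the three phases for one table (hypothetical helper)."""
    # Phase 1: the full load copies a snapshot of the existing rows.
    target = dict(source_rows)
    # Changes arriving while the table loads are cached, not applied yet.
    cached = list(changes_during_load)

    # Phase 2: once the load completes, cached changes are applied in order.
    for key, value in cached:
        apply_change(target, key, value)

    # Phase 3: ongoing replication applies later changes as they arrive.
    for key, value in changes_after_load:
        apply_change(target, key, value)
    return target


result = migrate_table(
    {1: "a", 2: "b"},            # rows present when the load starts
    [(2, "B"), (3, "c")],        # update and insert made during the load
    [(1, None)],                 # delete made after the load
)
print(result)
```

In this model the target only matches the source after phase 2 has finished, which mirrors the point above about when tables become transactionally consistent.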

Each task can be configured to perform a full load only, a full load followed by ongoing replication, or ongoing replication only (in which case the initial full load has to be completed separately).
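As a sketch, the three task modes above correspond to the MigrationType values full-load, full-load-and-cdc and cdc in the DMS API. The helper below only builds the arguments for a task; it does not call AWS. The function name, the mode keys, the task identifier and the ARNs are placeholders chosen for illustration, and the table mapping simply includes every table in every schema.

```python
import json

# MigrationType values accepted by the DMS API, keyed by hypothetical mode names.
MIGRATION_TYPES = {
    "full_load": "full-load",
    "full_load_and_replication": "full-load-and-cdc",
    "replication_only": "cdc",
}


def build_task_request(mode, task_id, source_arn, target_arn, instance_arn):
    """Build keyword arguments for a create_replication_task call."""
    # Selection rule that includes all tables in all schemas.
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-all",
                "object-locator": {"schema-name": "%", "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": MIGRATION_TYPES[mode],
        # The API expects the table mappings as a JSON string.
        "TableMappings": json.dumps(table_mappings),
    }


request = build_task_request(
    "full_load_and_replication",
    "example-task",                      # placeholder identifier
    "arn:aws:dms:example:source",        # placeholder, not a real ARN
    "arn:aws:dms:example:target",        # placeholder, not a real ARN
    "arn:aws:dms:example:instance",      # placeholder, not a real ARN
)
print(request["MigrationType"])
```

With boto3 installed and credentials configured, a dictionary like this could be passed as boto3.client("dms").create_replication_task(**request), using real ARNs in place of the placeholders.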

Schema objects migrated

DMS creates only the target schema objects necessary for the migration (tables, primary keys, and some unique indexes). Secondary indexes, non-primary key constraints and default values are not migrated.

Migrate these with the database engine's native tools if the migration is homogeneous (same engine on both sides); otherwise use the AWS Schema Conversion Tool (SCT).

Instance classes

Lower-tier instance classes include 50 GB of storage, and larger ones include 100 GB. This may not be enough for high-throughput environments, because changes captured by change data capture (CDC) are buffered to disk when the replication instance or the target is unable to keep pace.
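A rough back-of-the-envelope check for the storage concern above: if changes are captured faster than they are applied, the backlog grows at the difference between the two rates. The sketch below assumes constant rates and that all free storage is available for buffered changes, which is a simplification; the function name and the example numbers are illustrative only.

```python
def hours_until_full(storage_gb, used_gb, capture_gb_per_hour, apply_gb_per_hour):
    """Estimate hours until buffered changes fill the instance storage.

    Returns None when the target keeps pace and no backlog builds up.
    """
    backlog_growth = capture_gb_per_hour - apply_gb_per_hour
    if backlog_growth <= 0:
        return None
    return (storage_gb - used_gb) / backlog_growth


# 50 GB instance with 10 GB in use, capturing 5 GB/h but applying only 3 GB/h.
print(hours_until_full(50, 10, 5, 3))  # 20.0
```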