Midas

Midas, or MPM, is Google's internal package manager. It stores package metadata in Bigtable and data in Colossus. Transport of packages is peer to peer.

Packages

Packages comprise:

  • Contents.
  • A unique ID for each version, in the form of a secure hash.
  • Signatures, for verification and auditing.
  • Labels, for example:
    • canary
    • live
    • release_candidate_xxx
    • production=yyyy_mm_dd_xx
  • Pre-packaging commands.
  • Optional pre- and post-installation commands.

Creation

Packages are created from a definition file, generated by the build system, which includes:

  • A list of files included in the package.
  • Ownership and file modes.
  • Post-install and pre-uninstall commands.

Creation is idempotent, and the resulting package is immutable, though it is possible to modify the applied labels.
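
As a rough illustration of idempotent creation, the sketch below derives a version ID by hashing a canonicalised definition, so that creating the same package twice yields the same ID and stores nothing new. The definition layout, `version_id` and `PackageStore` are hypothetical names chosen for this example, not MPM's actual format or API.

```python
import hashlib
import json


def version_id(definition):
    """Hash a definition into a version ID (hypothetical scheme)."""
    # Canonical JSON so that key order does not change the hash.
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class PackageStore:
    """In-memory stand-in for a package store (assumption, not MPM's API)."""

    def __init__(self):
        self._definitions = {}  # version ID -> serialised, never rewritten
        self._labels = {}       # version ID -> mutable set of labels

    def create(self, definition):
        vid = version_id(definition)
        # Idempotent: an identical definition maps to the existing entry.
        if vid not in self._definitions:
            self._definitions[vid] = json.dumps(definition, sort_keys=True)
            self._labels[vid] = set()
        return vid

    def add_label(self, vid, label):
        # Labels are the only part of a package that may change.
        self._labels[vid].add(label)

    def labels(self, vid):
        return frozenset(self._labels[vid])
```

Any change to the definition produces a different ID, which is what keeps existing versions immutable in this sketch.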

Labelling

Packages can be pulled using labels in addition to names, allowing code promotion by simply re-assigning a label. Labels can also indicate where a version of a package sits in the development lifecycle.
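
Promotion by re-assigning a label can be pictured as a small lookup table from (package name, label) to version ID. The sketch below is a minimal model under that assumption; `LabelIndex`, `promote`, the package name and the version IDs are all hypothetical.

```python
class LabelIndex:
    """Maps (package name, label) to a version ID (illustrative only)."""

    def __init__(self):
        self._by_label = {}

    def set_label(self, name, label, version):
        # Re-assigning a label simply overwrites the previous target.
        self._by_label[(name, label)] = version

    def resolve(self, name, label):
        # Raises KeyError if the label has never been assigned.
        return self._by_label[(name, label)]


def promote(index, name, from_label, to_label):
    """Point to_label at whatever version from_label currently names."""
    version = index.resolve(name, from_label)
    index.set_label(name, to_label, version)
    return version


index = LabelIndex()
index.set_label("storage/server", "live", "v1")
index.set_label("storage/server", "canary", "v2")
promote(index, "storage/server", "canary", "live")
```

After the call to `promote`, anything pulling `storage/server` by the `live` label receives the version that was previously only the canary; no package contents are rebuilt or copied.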

File groups

File groups allow the files within a package to be subdivided into groups that can be fetched individually.

Encryption

Individual files can be encrypted within a package, with ACLs determining who can decrypt files. MPM servers aren't able to read encrypted data; all operations happen locally.

Signing

Packages can be signed either at build time or after the fact. A secure key escrow service generates signatures using the package name and metadata (including checksums of included files).

Signatures can later be verified using the package name and expected signer. Signed metadata is re-verified against package contents.
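
The two-step check described above (verify the signature over name and metadata, then re-verify the metadata against the contents) might look roughly like the sketch below. It uses HMAC-SHA256 from the Python standard library as a stand-in for whatever signature scheme the escrow service really uses, and `key` stands for the expected signer's key. The function names and the metadata layout are assumptions made for illustration.

```python
import hashlib
import hmac
import json


def file_checksums(files):
    """Map each path to the SHA-256 of its contents (files: path -> bytes)."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def signing_payload(name, metadata):
    # Canonical encoding of exactly what gets signed: name plus metadata.
    return json.dumps({"name": name, "metadata": metadata}, sort_keys=True).encode("utf-8")


def sign(name, metadata, key):
    """HMAC stand-in for the escrow service's signature (assumption)."""
    return hmac.new(key, signing_payload(name, metadata), hashlib.sha256).hexdigest()


def verify(name, metadata, signature, key, files):
    # Step 1: the signature must match the package name and metadata.
    if not hmac.compare_digest(sign(name, metadata, key), signature):
        return False
    # Step 2: the signed checksums must match the actual package contents.
    return metadata.get("checksums") == file_checksums(files)
```

The second step matters because a valid signature over the metadata says nothing about the files themselves unless their checksums are recomputed and compared.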

Pruning

Packages are garbage collected based on a durability level:

  • Test retains packages for 3 days.
  • Ephemeral packages live for 7 days, saving on storage resources whilst being ideal for short-lived content like configuration files.
  • Durable packages live for 3 months after their last use.
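
The retention rules above can be expressed as a small expiry check. This sketch assumes that "3 months" means 90 days and that test and ephemeral packages are aged from their creation time; neither detail is stated in these notes. `RETENTION` and `is_expired` are hypothetical names.

```python
from datetime import datetime, timedelta

# Retention per durability level; 90 days approximates "3 months" (assumption).
RETENTION = {
    "test": timedelta(days=3),
    "ephemeral": timedelta(days=7),
    "durable": timedelta(days=90),
}


def is_expired(durability, created, last_used, now):
    """Return True if a package is eligible for garbage collection."""
    # Durable packages age from last use; the others from creation (assumption).
    reference = last_used if durability == "durable" else created
    return now - reference > RETENTION[durability]
```

Under these assumptions, a durable package that is still being pulled is never collected, while a test package disappears three days after it was built regardless of use.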

Distribution

Distribution is pull-based, avoiding network congestion and allowing job owners to opt into receipt of new versions. This requires additional logic in the job to seek out new versions, or the ability to restart affected jobs easily. It can be difficult to identify which jobs use which versions.

If the local root server is unreachable, retrying is left to the client, which selects another root server based on geographic location.
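
Client-side retry of this kind could be sketched as trying root servers in order of distance until one responds. The server list, the distance figures and the `fetch` callable are all hypothetical; how real clients discover and rank root servers is not described here.

```python
def fetch_with_fallback(package, servers, fetch):
    """Try root servers nearest-first until one responds.

    servers: list of (name, distance_km) pairs (hypothetical).
    fetch:   callable(server_name, package) that raises ConnectionError
             when the server is unreachable (hypothetical).
    """
    tried = []
    for name, _distance in sorted(servers, key=lambda s: s[1]):
        try:
            return name, fetch(name, package)
        except ConnectionError:
            # This server is unreachable; fall through to the next nearest.
            tried.append(name)
    raise ConnectionError(f"no reachable root server for {package!r}; tried {tried}")
```

Sorting by distance means the local server (distance 0 in this model) is always attempted first, and the fallback order is decided entirely by the client.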

Package data is stored in a two-tier architecture, with frequently used packages cached locally. Replication and distribution use a torrent-like protocol.

Organisation and indexing

The package namespace is hierarchical.

Permissions

ACLs created at any level of the hierarchy can be inherited by any point below it. The system considers three levels of access:

  • Owners can create and delete packages, modify labels and alter ACLs.
  • Builders can create packages and add and modify labels.
  • Labellers can add and modify specific labels.
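
One way to model inherited ACLs is to walk from a package's path up through its parents and accept the first grant that permits the action. The sketch below assumes "/"-separated paths and that any matching grant at the package or an ancestor is sufficient; the action names, the ACL layout and `is_allowed` are hypothetical.

```python
# Actions each access level may perform, following the list above.
ROLE_ACTIONS = {
    "owner": {"create", "delete", "label", "acl"},
    "builder": {"create", "label"},
    "labeller": {"label"},
}


def ancestors(path):
    """Yield the path and each parent: 'a/b/c' -> 'a/b/c', 'a/b', 'a'."""
    parts = path.split("/")
    for end in range(len(parts), 0, -1):
        yield "/".join(parts[:end])


def is_allowed(acls, user, path, action, label=None):
    """acls: path -> user -> {"role": ..., "labels": set (labellers only)}."""
    for node in ancestors(path):
        grant = acls.get(node, {}).get(user)
        if grant is None or action not in ROLE_ACTIONS[grant["role"]]:
            continue
        # Labellers are restricted to the specific labels they were granted.
        if grant["role"] == "labeller" and label not in grant.get("labels", set()):
            continue
        return True
    return False


acls = {
    "storage": {"alice": {"role": "owner"}},
    "storage/server": {"bob": {"role": "labeller", "labels": {"canary"}}},
}
```

With these ACLs, `alice` may delete `storage/server/binary` through the grant inherited from `storage`, while `bob` may only move the `canary` label on packages under `storage/server`.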

Management

  • mpmdiff allows comparing any two packages, including most attributes.
  • A web interface lets users browse MPM packages and view their metadata.
