Protocol Buffers

Protocol buffers offer an alternative to XML and JSON. They're designed to offer several advantages:

  • Performance, with rapid serialisation and deserialisation.
  • Efficiency, in terms of network bandwidth, with a binary format that keeps field names out of the payload (they live in the schema).
  • Versioning through native deprecation warnings and field reservations.

Concepts

  • Definitions declare the schema.
  • Messages contain the data and references to the fields declared in a Definition.
  • Fields define values on Messages.
  • Packages namespace Messages.

Versions

  • proto1 was used within Google and was never released publicly.
  • proto2 is stable and still supported.
  • proto3 is also stable (released in 2016) and is the syntax used in the examples here.

Definitions

Definitions outline the schema of objects sent in the protocol buffer format.

They're stored in *.proto files:

syntax = "proto3";

message Person {
  enum Role {
    REGULAR = 0;
    MODERATOR = 1;
    ADMIN = 2;
  }

  string name = 1;
  repeated string email_addresses = 2;
  Role role = 3;
}

Messages

Messages are made up of a number of fields.

Fields

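Fields are made up of:

  • Rules, which define presence constraints:
    • required (proto2 only; removed in proto3).
    • optional (not part of the original proto3 syntax, where a field with no rule is singular and behaves much like an optional one; reintroduced in protobuf 3.15 to track explicit presence).
    • repeated allows multiple values (list/vector of type).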
  • Types declare the types of values:
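    • Scalar:
      • bool (defaults to false)
      • bytes (defaults to empty bytes)
      • float, double (default to 0)
      • int32, int64, uint32, uint64 (default to 0)
      • sint32, sint64 (default to 0)
      • fixed32, fixed64, sfixed32, sfixed64 (default to 0)
      • string (defaults to "").
    • Enumeration types (default to the first declared value, which must be 0 in proto3).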
    • Message types
    • oneof allows union types -- only one field within the block can be set at a time.
    • map<K, V> defines a map of values keyed by a string or integer type.
  • Names should be lowercase, with words delimited by underscores. Language conventions will be applied to getters and setters.
  • Field numbers set integer identifiers for use in messages to relate data back to the type. They must be unique within each message, within the range 1 to 2^29 - 1 (536,870,911), and outside the range 19,000-19,999, which is reserved for the protobuf implementation.
  • Field options can be specified in [name = value] form at the end of the line.
    • proto2 allowed setting default values overriding the type defaults, but this is no longer possible in proto3.
    • json_name defines the name in a JSON representation of a message.

Messages can be declared either globally or nested within a parent message.
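
As a rough sketch of how these pieces fit together, the definition below combines rules, scalar and message types, a map and a field option. The message and field names (Account, Address, labels and so on) are made up for illustration and don't come from any real schema.

```proto
syntax = "proto3";

// Hypothetical message, for illustration only.
message Account {
  // Nested message type, referenced from outside as Account.Address.
  message Address {
    string street = 1;
    string city = 2;
  }

  string display_name = 1;              // singular field, defaults to ""
  repeated string email_addresses = 2;  // zero or more values, order is preserved
  optional uint64 last_login = 3;       // explicit presence (assumes protoc 3.15 or later)
  Address address = 4;                  // message-typed field
  map<string, string> labels = 5;       // string-keyed map

  // Field option in square brackets at the end of the line.
  bool is_active = 6 [json_name = "active"];
}
```

Field numbers are unique within Account only; Address starts its own numbering from 1.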

Any type

The google.protobuf.Any type can stand in for any type. To use it, import google/protobuf/any.proto.
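
A minimal sketch of a message carrying an Any payload. Envelope and its field names are hypothetical; only the import path and the google.protobuf.Any type name come from protobuf itself.

```proto
syntax = "proto3";

import "google/protobuf/any.proto";

// Hypothetical envelope that can carry any message type as its payload.
message Envelope {
  string correlation_id = 1;
  google.protobuf.Any payload = 2;
}
```

An Any holds the serialised message alongside a type URL, so the receiver still needs the matching definition to unpack it.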

oneof

A oneof allows a union of multiple fields, of which at most one is set at a time. The block must be given a name:

message OneOf {
  oneof identifier {
    string guid = 1;
    string username = 2;
    string email = 3;
  }
}

RPCs

An RPC defines an interaction between a client and a server. RPCs are named, grouped into services, and each defines a request and a response message:

service Greetings {
  rpc Greet(GreetRequest) returns (GreetResponse) {}
}
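
The Greetings service refers to two messages it doesn't define. A complete file might look something like the sketch below; the fields on GreetRequest and GreetResponse are assumptions made for the example.

```proto
syntax = "proto3";

// Hypothetical request and response shapes for the Greet RPC.
message GreetRequest {
  string name = 1;
}

message GreetResponse {
  string greeting = 1;
}

service Greetings {
  // One request message in, one response message out.
  rpc Greet(GreetRequest) returns (GreetResponse);
}
```

Giving each RPC its own request and response message, rather than sharing them, leaves room to add fields to one call without affecting others.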

Enumerations

Enumerations can be declared either globally or within a message. Their values must all be integers, and in proto3 the first value must be 0.
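
A short sketch of both placements. Status, Ticket and Priority are hypothetical names; prefixing each value with the enum's name is a common convention rather than a requirement.

```proto
syntax = "proto3";

// Declared at file level, usable by any message in the file.
enum Status {
  STATUS_UNSPECIFIED = 0;  // proto3 requires the first value to be zero
  STATUS_ACTIVE = 1;
  STATUS_SUSPENDED = 2;
}

message Ticket {
  // Declared inside a message, referenced elsewhere as Ticket.Priority.
  enum Priority {
    PRIORITY_UNSPECIFIED = 0;
    PRIORITY_LOW = 1;
    PRIORITY_HIGH = 2;
  }

  Status status = 1;
  Priority priority = 2;
}
```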

Imports

The import statement allows importing messages defined in other *.proto files:

import "somefile.proto";

Packages

Packages provide namespaces, to avoid cluttering the global namespace with unrelated concerns.

package myapp.entities;
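
To show how packages and imports interact, here is a sketch of a second file that uses a type from the myapp.entities package. The file names, the myapp.orders package and the Order message are hypothetical, and the example assumes person.proto declares package myapp.entities and a Person message.

```proto
// orders.proto (hypothetical file)
syntax = "proto3";

package myapp.orders;

// Assumes person.proto declares `package myapp.entities;` and a Person message.
import "person.proto";

message Order {
  string id = 1;
  // Types from another package are referenced by their fully qualified name.
  myapp.entities.Person customer = 2;
}
```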

Options

Options define values related to the export process:

  • java_package defines the package namespace used for Java exports.
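
File-level options are written with the option keyword near the top of the file. A minimal sketch, where the Java package name is a made-up value:

```proto
syntax = "proto3";

package myapp.entities;

// Generated Java classes are placed in this Java package rather than
// one derived from the proto package above.
option java_package = "com.example.myapp.entities";

message Person {
  string name = 1;
}
```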

Versioning

Protocol buffers are designed to solve some of the pain points of API versioning.

Reservations

Fields can be reserved by both name and field number, either for future use or to signal that the fields were previously used and should not be reused:

message Reservations {
  reserved 1;
  reserved 2, 3, 4;
  reserved 5 to 10;
  // or, equivalently, in place of the three lines above:
  // reserved 1, 2, 3, 4, 5 to 10;

  reserved "id";
}

Deprecations

Fields you're in the process of winding down can be marked as deprecated:

message Deprecation {
  bool gendered_bathrooms = 1 [deprecated=true];
}

Compatible field types

Compatible field types allow you to change types without having to retire a field and create a new one:

 message CompatibleTypes {
-  bytes data = 1;
+  string data = 1;
 }

The following groups are interchangeable:

  • Integers: bool, int32, int64, uint32, uint64
  • Signed integers: sint32, sint64
  • Bytes: string, bytes (as long as the bytes are valid UTF-8)
  • 32-bit fixed size integers: fixed32, sfixed32
  • 64-bit fixed size integers: fixed64, sfixed64

Rules of thumb

In general:

  1. Avoid changing field numbers for existing fields.
  2. When removing obsolete fields, reserve both name and number to cover both JSON and binary forms.
  3. Only change a field's type within one of the compatible groups.
  4. Prefix old field names with OBSOLETE_.
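
Taken together, these rules might look something like the sketch below. Profile and its fields are hypothetical, and the history described in the comments is invented for the example.

```proto
syntax = "proto3";

// Hypothetical message with one removed field and one field being retired.
message Profile {
  // `nickname` (field 2) was removed, so both its number and its name
  // are reserved to stop later edits from reusing them.
  reserved 2;
  reserved "nickname";

  // Field number 1 has never changed since the field was introduced.
  string name = 1;

  // Renamed with the OBSOLETE_ prefix; the field number stays the same.
  string OBSOLETE_fax_number = 3 [deprecated = true];
}
```

Renaming a field leaves the binary format untouched because only the number goes on the wire, but it does change the name used in the JSON form.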

Building

To build all *.proto files in the current working directory, replace <lang> with the target language (for example, js for JavaScript):

protoc --<lang>_out=target/js *.proto

