Data Firehose

Kinesis Data Firehose is a fully managed service for capturing, transforming, and loading streaming data into AWS in near real time. The service scales automatically to match the throughput of the data and requires no ongoing administration.

Concepts

  • Data producers provide data records.
  • Data records are payloads of data up to 1,000 KiB each.
  • Destinations receive the transformed data. These can be inside AWS (Elasticsearch Service, Redshift, or S3) or any external provider that implements the Firehose HTTP endpoint delivery API.
  • Transformations can be used to mutate the data before it's written to the destination.
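
As a rough illustration of the producer side, the sketch below builds a data record and checks it against the size limit before sending it. The stream name `example-stream` and the helper names `make_record` and `send` are assumptions made for this example; only `put_record` is the real boto3 call, and it needs AWS credentials to run.

```python
import json

# Record size limit taken from the notes above (1,000 KiB).
MAX_RECORD_BYTES = 1000 * 1024


def make_record(payload: dict) -> dict:
    """Serialise a payload into the {"Data": bytes} shape that put_record expects."""
    # A trailing newline keeps records separable once they are batched together.
    data = (json.dumps(payload) + "\n").encode("utf-8")
    if len(data) > MAX_RECORD_BYTES:
        raise ValueError(f"record is {len(data)} bytes, limit is {MAX_RECORD_BYTES}")
    return {"Data": data}


def send(payload: dict, stream_name: str = "example-stream") -> str:
    """Send one record to a delivery stream (hypothetical stream name)."""
    import boto3  # imported lazily so make_record works without the AWS SDK

    firehose = boto3.client("firehose")
    response = firehose.put_record(
        DeliveryStreamName=stream_name,
        Record=make_record(payload),
    )
    return response["RecordId"]
```

Validating the size locally is a design choice for the sketch: it fails fast in the producer rather than waiting for the service to reject the record.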

Buffering

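Incoming data records are buffered before delivery so that they can be written to the destination in batches. Two settings control the buffer: a buffer size (in MB) and a buffer interval (in seconds). Firehose delivers the buffered data as soon as either threshold is reached.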
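
The size-or-interval rule can be sketched with a small in-memory buffer. This is only a model of the rule, not how Firehose is implemented; the `Buffer` class and its fields are hypothetical, and for simplicity it only checks the interval when a new record arrives, whereas a real service would flush on a timer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Buffer:
    max_bytes: int          # buffer size threshold
    max_seconds: float      # buffer interval threshold
    opened_at: float = 0.0  # time the current batch started
    records: list = field(default_factory=list)
    size: int = 0

    def add(self, record: bytes, now: float) -> Optional[list]:
        """Add a record; return the flushed batch if either threshold was hit."""
        if not self.records:
            self.opened_at = now  # first record of a new batch starts the clock
        self.records.append(record)
        self.size += len(record)
        size_hit = self.size >= self.max_bytes
        time_hit = now - self.opened_at >= self.max_seconds
        if size_hit or time_hit:
            batch, self.records, self.size = self.records, [], 0
            return batch
        return None
```

For example, with `Buffer(max_bytes=10, max_seconds=60)`, two 5-byte records flush on size, while two 1-byte records arriving 60 seconds apart flush on the interval.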

Encrypting data

Data held in a delivery stream can be encrypted at rest using server-side encryption (SSE). Note that enabling encryption does not encrypt records already in the stream, only records added afterwards.
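
A minimal sketch of turning on SSE for an existing stream with boto3 might look like the following. The helper names `encryption_request` and `enable_encryption` are made up for this example; the sketch assumes the stream accepts direct puts and that the caller has AWS credentials and permission to call `start_delivery_stream_encryption`.

```python
from typing import Optional


def encryption_request(stream_name: str, key_arn: Optional[str] = None) -> dict:
    """Build the arguments for start_delivery_stream_encryption."""
    if key_arn:
        # Use a customer-managed KMS key when an ARN is supplied.
        config = {"KeyType": "CUSTOMER_MANAGED_CMK", "KeyARN": key_arn}
    else:
        # Otherwise fall back to a key owned and managed by AWS.
        config = {"KeyType": "AWS_OWNED_CMK"}
    return {
        "DeliveryStreamName": stream_name,
        "DeliveryStreamEncryptionConfigurationInput": config,
    }


def enable_encryption(stream_name: str, key_arn: Optional[str] = None) -> None:
    """Enable SSE on the stream; only records added afterwards are encrypted."""
    import boto3  # imported lazily so the request builder works without the AWS SDK

    firehose = boto3.client("firehose")
    firehose.start_delivery_stream_encryption(**encryption_request(stream_name, key_arn))
```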

Transformations

Buffers of data records (up to 3MB by default) are sent in batches to a Lambda function for transformation. Firehose then delivers the transformed data to the destination.

Records for which transformation fails can be sent to an S3 bucket. Source records can also be retained in an S3 "backup" bucket.
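
A transformation Lambda receives a batch of base64-encoded records and returns one result per record, keyed by `recordId`. The sketch below assumes the payloads are JSON objects; the `processed` flag is a placeholder transformation invented for this example. Records that cannot be parsed are marked `ProcessingFailed`, which is what lets Firehose treat them as failed rather than delivering them.

```python
import base64
import json


def handler(event, context):
    """Transform each record in the batch and report a per-record result."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["processed"] = True  # hypothetical transformation
            data = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("ascii")
            output.append(
                {"recordId": record["recordId"], "result": "Ok", "data": data}
            )
        except (ValueError, TypeError):
            # Undecodable or non-object payload: hand the original data back as failed.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Every incoming `recordId` must appear in the response, so the handler appends exactly one entry per record on both the success and failure paths.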

