<h1 id="amazon-redshift">Amazon Redshift<a aria-hidden="true" class="anchor-heading icon-link" href="#amazon-redshift"></a></h1>
<p>Redshift is a data warehouse designed to hold data used for analysis. It uses a PostgreSQL-like engine and offers native redundancy. Scaling up or out is possible, but requires user intervention.</p>
<h1 id="concepts">Concepts<a aria-hidden="true" class="anchor-heading icon-link" href="#concepts"></a></h1>
<ul>
<li><strong>Clusters</strong> represent individual data warehouses.</li>
<li><strong>Databases</strong> house the data.</li>
<li><strong>Users</strong> are granted access to databases. By default only an initial administrative user is created.</li>
<li><strong>Schemas</strong> provide namespaces within a database, and can in turn contain tables, views and functions.</li>
<li><strong>Data shares</strong> allow sharing live data across Redshift clusters, <abbr title="Amazon Web Services">AWS</abbr> accounts, and <abbr title="Amazon Web Services">AWS</abbr> regions (preview).</li>
</ul>
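<p>As a rough illustration of how databases, users and schemas relate, the sketch below builds the SQL that would create one of each and grant the user access to the schema. The helper names (<code>quote_ident</code>, <code>bootstrap_statements</code>) and the example object names are hypothetical. It only generates statements and does not connect to a cluster.</p>
<pre><code class="language-python">def quote_ident(name):
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def bootstrap_statements(database, schema, user):
    """Return SQL creating a database, a user, and a schema the user may use.

    Assumption: the statements after CREATE DATABASE are run while connected
    to the new database, since schemas live inside a database.
    """
    db, sch, usr = (quote_ident(n) for n in (database, schema, user))
    return [
        f"CREATE DATABASE {db};",
        # PASSWORD DISABLE avoids embedding a credential in this sketch.
        f"CREATE USER {usr} PASSWORD DISABLE;",
        f"CREATE SCHEMA {sch};",
        f"GRANT USAGE ON SCHEMA {sch} TO {usr};",
    ]


if __name__ == "__main__":
    # All three names here are made up for illustration.
    for statement in bootstrap_statements("analytics", "sales", "report_reader"):
        print(statement)
</code></pre>
<p>Generating the statements separately from executing them keeps the example independent of any particular client library. In practice they would be run by the cluster's administrative user.</p>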
<h1 id="spectrum">Spectrum<a aria-hidden="true" class="anchor-heading icon-link" href="#spectrum"></a></h1>
<p>Spectrum allows running queries directly against external data (files in <a href="/notes/0e7fb4d2-045e-45d7-a27a-adc8580d02ea">S3</a> buckets). Queries are executed on separate Redshift servers that are independent of the cluster, so compute-intensive operations can be offloaded to Spectrum (potentially to thousands of nodes), saving cluster resources.</p>
<p>To use Spectrum, an external schema must be created to house the external tables. External table schemas can be sourced from either <a href="/notes/4b0e7107-7505-4336-9830-311d1d30c68d">Athena</a> or the <a href="/notes/37d68ae2-6377-4be5-9c75-e005ba298b9b">Glue</a> data catalog.</p>
<p>Spectrum databases may be viewed in <a href="/notes/4b0e7107-7505-4336-9830-311d1d30c68d">Athena</a>.</p>
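<p>A minimal sketch of the external schema step, assuming the table definitions live in a Glue data catalog database. The function names, the schema and catalog names, and the IAM role ARN are all placeholders invented for illustration. The code only builds a <code>CREATE EXTERNAL SCHEMA</code> statement as a string.</p>
<pre><code class="language-python">def quote_literal(value):
    """Single-quote a string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def spectrum_schema_sql(schema, catalog_database, iam_role_arn):
    """Build CREATE EXTERNAL SCHEMA for a data catalog database.

    The role is assumed to grant access to the catalog and the S3 data.
    """
    if not schema.isidentifier():
        raise ValueError("schema must be a simple identifier: " + repr(schema))
    return "\n".join([
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}",
        "FROM DATA CATALOG",
        f"DATABASE {quote_literal(catalog_database)}",
        f"IAM_ROLE {quote_literal(iam_role_arn)};",
    ])


if __name__ == "__main__":
    # Placeholder names and ARN; substitute real values before use.
    print(spectrum_schema_sql(
        "spectrum",
        "sales_catalog",
        "arn:aws:iam::123456789012:role/ExampleSpectrumRole",
    ))
</code></pre>
<p>Once such a schema exists, tables in it can be queried with ordinary <code>SELECT</code> statements by prefixing the table name with the external schema name.</p>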
<h1 id="federated-queries">Federated queries<a aria-hidden="true" class="anchor-heading icon-link" href="#federated-queries"></a></h1>
<p>Federated queries allow querying live data from <a href="/notes/df4cac1d-19bb-47a2-96c3-ed7850913740">RDS</a> and <a href="/notes/94aaaaeb-fde1-4486-b142-b0ffd77387f9">Aurora</a> without copying it.</p>
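<p>Federated sources are also exposed through an external schema. The sketch below assumes a PostgreSQL-compatible source whose credentials are stored in a Secrets Manager secret. The function names, endpoint, database name and ARNs are hypothetical placeholders, and the code only assembles the statement text.</p>
<pre><code class="language-python">def quote_literal(value):
    """Single-quote a string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def federated_schema_sql(schema, database, remote_schema, endpoint,
                         iam_role_arn, secret_arn):
    """Build CREATE EXTERNAL SCHEMA for a PostgreSQL-compatible source."""
    if not schema.isidentifier():
        raise ValueError("schema must be a simple identifier: " + repr(schema))
    return "\n".join([
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}",
        "FROM POSTGRES",
        f"DATABASE {quote_literal(database)} SCHEMA {quote_literal(remote_schema)}",
        f"URI {quote_literal(endpoint)}",
        f"IAM_ROLE {quote_literal(iam_role_arn)}",
        f"SECRET_ARN {quote_literal(secret_arn)};",
    ])


if __name__ == "__main__":
    # Every value below is a placeholder for illustration only.
    print(federated_schema_sql(
        "orders_live",
        "shop",
        "public",
        "example-db.cluster-abc.eu-west-1.rds.amazonaws.com",
        "arn:aws:iam::123456789012:role/ExampleFederatedRole",
        "arn:aws:secretsmanager:eu-west-1:123456789012:secret:example",
    ))
</code></pre>
<p>Tables in the remote schema can then be joined with local Redshift tables in a single query, which is what avoids copying the data.</p>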
<h1 id="limits">Limits<a aria-hidden="true" class="anchor-heading icon-link" href="#limits"></a></h1>
<ul>
<li>8 PB of storage</li>
</ul>
<hr>
<strong>Backlinks</strong>
<ul>
<li><a href="/notes/e5640e01-6e06-4daa-929d-fa5ddbaeb82e">Data Analytics (public)</a></li>
<li><a href="/notes/410a428a-4828-4e82-8d84-a3f3279635c6">Data Firehose (public)</a></li>
</ul>
