[Redshift Week] Principles
An overview of Amazon Redshift principles: MPP / shared-nothing architecture, clusters, and distributed computing.
Since last week, we have been an official Technology Partner in the Amazon Web Services Partner Network. It is a good occasion to explain how we work with Amazon Redshift from a technical point of view.
Julien Theulier, Data Plumber in Chief at Squid, has done brilliant work explaining Redshift. As it is a long and technical blog post, we are going to publish one piece of the analysis every day.
Let’s start with the principles behind Amazon Redshift:
MPP / Shared Nothing Architecture:
Key concept I : Cluster
Leader node: manages communications between clients and compute nodes, computes the execution plan from the SQL, distributes data and execution code to the compute nodes, and aggregates the results.
Compute nodes: each compute node has its own dedicated CPU, memory, and attached disk storage. It executes the code on its share of the distributed data and sends the result back to the leader node.
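The division of labour between the leader and the compute nodes can be sketched in a few lines of Python. This is a toy model, not Redshift code: the function names (compute_node_sum, leader_sum) and the way rows are split are assumptions made for illustration, and threads stand in for separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_node_sum(rows):
    """Work done on one (hypothetical) compute node: aggregate local rows only."""
    return sum(rows)


def leader_sum(rows, num_nodes=3):
    """Toy leader: distribute rows, run the same code on every node,
    then aggregate the partial results."""
    # Distribute data: node i receives rows i, i + num_nodes, i + 2 * num_nodes, ...
    partitions = [rows[i::num_nodes] for i in range(num_nodes)]
    # Each node works on its own partition, in parallel with the others.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(compute_node_sum, partitions))
    # The final aggregation happens on the leader.
    return sum(partials)


print(leader_sum(list(range(1, 101))))  # 5050
```

The point of the sketch is that no node ever sees another node's rows: only the small partial results travel back to the leader, which is the essence of a shared-nothing design.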
Key concept II : Distributed Computing
A compute node is partitioned into slices, one slice for each core of the node’s multi-core processor. The leader node manages the distribution of data to the slices, which then work in parallel to complete the operation.
Data are distributed to slices according to the table’s distribution style: by column, random, or all (Redshift names these styles KEY, EVEN, and ALL).
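To make the three distribution policies concrete, here is a minimal Python sketch of how rows could end up on slices. It is an illustration under stated assumptions, not Redshift's implementation: the distribute function is hypothetical, CRC32 stands in for whatever hash Redshift really uses, "random" is modelled as round-robin (which is how Redshift's EVEN style is documented to behave), and "all" simply gives every slice a full copy.

```python
import zlib


def distribute(rows, num_slices, style, key=None):
    """Return one list of rows per slice for the given distribution style."""
    if style not in ("column", "random", "all"):
        raise ValueError(f"unknown distribution style: {style}")
    slices = [[] for _ in range(num_slices)]
    for i, row in enumerate(rows):
        if style == "all":
            # Every slice keeps a full copy of the table.
            for s in slices:
                s.append(row)
        elif style == "random":
            # Round-robin: rows are spread evenly, whatever their content.
            slices[i % num_slices].append(row)
        else:
            # "column": hash the distribution column so that equal values
            # always land on the same slice (CRC32 is a stand-in hash).
            h = zlib.crc32(str(row[key]).encode("utf-8"))
            slices[h % num_slices].append(row)
    return slices


rows = [{"user_id": i % 3, "amount": 10 * i} for i in range(6)]

print([len(s) for s in distribute(rows, 4, "random")])  # [2, 2, 1, 1]
print([len(s) for s in distribute(rows, 4, "all")])     # [6, 6, 6, 6]
for n, s in enumerate(distribute(rows, 4, "column", key="user_id")):
    print(n, [r["user_id"] for r in s])
```

With the column policy, all rows sharing a user_id sit on the same slice, so a join or aggregation on that column can run locally. The trade-off is that a skewed column can leave some slices with far more rows than others.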
Next blog post tomorrow – stay tuned!