Logo_Squid_blanc_600x250_

[Redshift Week] Maintenances and Performance

Vaccum, analyze and backup.

Vaccum

  • Redshift handles record modification like Postgres
    • Tuples that are deleted or obsoleted by an update are not physically removed from their table;
    • They remain present until a VACUUM is done.
    • Therefore it’s necessary to do VACUUM periodically, especially on frequently-updated tables (upsert operations).
  • Vacuum on Redshift has a specific option: sort only.
    • That made to re-order sorted tables according to their key.
  • We recommend to
    • Use vacuum (delete only) upserted tables, every loading
    • Use vacuum (sort only) sorted append only tables every loading (copy)
    • Add to insert statements ” order by sort key” on sorted tables to avoid vacuum
    • Use vacuum (full) upserted tables once a month if possible (it may be too costly)

 

Analyze

  • Analyzing data is the operation of computing table & columns statistics for the query planner.
  • Analyzing tables is mandatory on any new table & at least once a week (ex fact tables) or best every loading (dimension data).

 

Backup

  • We recommend to put in place a batch job to backup the database with the following policy:
    • Keep one daily snapshot
    • Keep 4 last weekly snapshot
    • Keep last 1st of last month snapshot
  • AWS API allows to perform such operations (Java, python, ruby).