Several factors affect the Amazon Redshift bill in an AWS account. Among them, the following are worth considering:
A. Regions
Depending on the region, the price of the same node type can vary significantly (prices are per hour, in US dollars).
Choosing an unsuitable region may add unnecessary cost to the Redshift cluster.
B. Types of Nodes
When creating a Redshift cluster, it is important to choose the right node type (prices as of October 4, 2018).
Redshift has 2 types of nodes (https://aws.amazon.com/es/redshift/pricing):
- Dense Compute (dcX.XXXX): 30% to 60% cheaper than Dense Storage, optimized for faster queries, and generally recommended for data sets no larger than 500 GB.
- Dense Storage (dsX.XXXX): more expensive than Dense Compute, but optimized for storing large data sets. It is usually recommended for data sets larger than 500 GB.
C. Snapshots
Suppose we have a dc2.large cluster running around the clock in the N. Virginia region (cost per hour = US$0.25). An ordinary month (30 days) has 720 hours, which results in a monthly bill of US$180. But what if the same cluster ran ONLY during working hours (8 hours per day, 5 days per week, 4 weeks per month)?
That is 160 hours, or US$40 per month, which represents a saving of about 78%!
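The arithmetic behind that figure can be checked with a short shell calculation. This is only a sketch: the hourly price and the schedule come from the example above, the variable and function names are illustrative, and `awk` is assumed to be available for the decimal math.

```shell
# Figures taken from the example: dc2.large in N. Virginia.
hourly_price=0.25                 # US$ per hour
always_on_hours=$((30 * 24))      # 720 hours in a 30-day month
working_hours=$((8 * 5 * 4))      # 160 hours: 8 h/day, 5 days/week, 4 weeks

# Prints both monthly costs and the relative saving.
savings_summary() {
  awk -v p="$hourly_price" -v full="$always_on_hours" -v part="$working_hours" 'BEGIN {
    full_cost = p * full
    part_cost = p * part
    printf "always-on: $%.2f, working hours: $%.2f, savings: %.0f%%\n",
           full_cost, part_cost, (1 - part_cost / full_cost) * 100
  }'
}

savings_summary
```

Changing `hourly_price` or the schedule gives the same comparison for another node type or region.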
However, at the time of writing, Redshift does not allow a cluster to be stopped and resumed. An alternative is the following process:
- At the end of every working day, create a snapshot of the cluster
- Then delete the cluster
- Before the start of every working day, create a cluster from the snapshot generated.
The following command deletes a cluster, creating a final snapshot first:
aws redshift delete-cluster --cluster-identifier motest --final-cluster-snapshot-identifier motest-daily-snapshot
While the snapshot is being generated, the cluster remains active. Once the snapshot is complete, the cluster deletion begins.
To restore the cluster, use this command:
aws redshift restore-from-cluster-snapshot --cluster-identifier motest --snapshot-identifier motest-daily-snapshot
This process can be monitored from the AWS console until the cluster is completely restored.
These commands can be scheduled as administrative tasks, depending on the operating system: on Windows through the Task Scheduler, or on Linux using crontab.
For Windows, we can create a PowerShell script similar to this one:
# Configure credentials and region for the AWS CLI
aws configure set aws_access_key_id xxxx
aws configure set aws_secret_access_key yyyy
aws configure set default.region zzzz
# Delete the cluster, taking a final snapshot first
aws redshift delete-cluster --cluster-identifier aaaa --final-cluster-snapshot-identifier bbbb
Where:
- xxxx = access key ID
- yyyy = secret access key
- zzzz = cluster region
- aaaa = cluster name
- bbbb = snapshot name
Both the access key ID and the secret access key must belong to a user with sufficient permissions to run these commands through the AWS CLI.
The rest of the process is the same as creating any scheduled task in Windows. Keep in mind, however, that the program must be powershell.exe and the .ps1 file must be passed as the task's argument.
The task that restores the cluster can be created in the same way, but this time the script must contain:
# Configure credentials and region for the AWS CLI
aws configure set aws_access_key_id xxxx
aws configure set aws_secret_access_key yyyy
aws configure set default.region zzzz
# Start restoring the cluster from the snapshot
aws redshift restore-from-cluster-snapshot --cluster-identifier aaaa --snapshot-identifier bbbb
# Poll every 30 seconds until the cluster reports "available"
Do
{
    $ClusterJSON = aws redshift describe-clusters --cluster-identifier aaaa | ConvertFrom-Json
    Start-Sleep -s 30
} While ($ClusterJSON.Clusters.ClusterStatus -ne 'available')
# Associate the restored cluster with the proper security group
aws redshift modify-cluster --cluster-identifier aaaa --vpc-security-group-ids ssss
Where:
- xxxx = access key ID
- yyyy = secret access key
- zzzz = cluster region
- aaaa = cluster name
- bbbb = snapshot name
- ssss = security group (the ID, not the name: sg…..)
Note that a cluster restored from a snapshot has the same configuration as the original cluster from which the snapshot was created, EXCEPT for the security group. That is why the last command updates the restored cluster to associate it with the proper security group; this change can only be applied once the cluster is available.
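On Linux, the same two tasks could be sketched as a single shell script scheduled with crontab. The following is a minimal sketch, not a tested deployment: it assumes the AWS CLI is installed and already configured with credentials and a region, it uses the CLI's `wait cluster-available` subcommand in place of the polling loop, and the script path and the 08:00/18:00 schedule in the comments are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a Linux counterpart to the two PowerShell scripts.
# Placeholders match the ones used in the scripts above.
CLUSTER_ID="aaaa"          # cluster name
SNAPSHOT_ID="bbbb"         # snapshot name
SECURITY_GROUP_ID="ssss"   # security group ID (not the name)

# End of the working day: take a final snapshot, then delete the cluster.
# Assumption: no snapshot named $SNAPSHOT_ID exists yet; an older snapshot
# with the same name would have to be removed or renamed first.
stop_cluster() {
  aws redshift delete-cluster \
    --cluster-identifier "$CLUSTER_ID" \
    --final-cluster-snapshot-identifier "$SNAPSHOT_ID"
}

# Start of the working day: restore, wait, then attach the security group.
start_cluster() {
  aws redshift restore-from-cluster-snapshot \
    --cluster-identifier "$CLUSTER_ID" \
    --snapshot-identifier "$SNAPSHOT_ID" || return 1
  # Blocks until the cluster status is "available".
  aws redshift wait cluster-available \
    --cluster-identifier "$CLUSTER_ID" || return 1
  aws redshift modify-cluster \
    --cluster-identifier "$CLUSTER_ID" \
    --vpc-security-group-ids "$SECURITY_GROUP_ID"
}

# Dispatch on the first argument; do nothing when it is missing.
case "${1:-}" in
  start) start_cluster ;;
  stop)  stop_cluster ;;
esac

# Hypothetical crontab entries (Monday to Friday), assuming the script is
# saved as /opt/redshift/schedule.sh and is executable:
#   0 8  * * 1-5 /opt/redshift/schedule.sh start
#   0 18 * * 1-5 /opt/redshift/schedule.sh stop
```

Keeping both actions in one script means the placeholders are defined once, and the crontab only decides when each action runs.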
Content generated by Morris & Opazo team