Cloudability’s Rightsizing Recommendations allow you to optimize cost by building and scaling your cloud infrastructure tightly around your needs. Since cloud vendors bill you for the services you provision rather than for what you consume, underutilized resources cause you to incur excess spend that you can avoid by moving to resources that better match your workload. We provide a time trend of the utilization metrics per resource so you can evaluate our recommendations and take the action that best suits your resource demands. We’ll also show you multiple scenarios with associated risk and savings, giving you the data you need to make rightsizing decisions. This article walks through the Rightsizing Recommendations feature, starting with the overview page.
On the main table for EC2 and EBS, we display resources sorted by Savings 10 Days. For each resource, in addition to resource and account information, you’ll see the Current resource type and its associated Idle Score. Note that Cost (Total) 10 Days shows your spend over the time period, which potentially includes RIs and Spot. Next we’ll show you our top Recommendation:
- Rightsize - resize to the New resource type
- Terminate - the resource is predominantly idle
- Autoscale - set up autoscaling for the resource
- No Action - do not take any action on the resource at this time
- Missing Permissions - ensure your policy is up to date with the latest permissions
Export overview table to see additional details
Additional details for all resources are available via export on the main page. On the spreadsheet you can find Account ID, Availability Zone, and Operating System as well as the Effective Rate for the current and new resource type. Tags available per resource are present as well.
Exports show the top 5 recommendations per resource with default preferences. If you need to adjust default preferences, you will need to use the API and do the filtering client-side. The key to the export is that resources are grouped by Resource ID and ranked by Recommendation Order ascending.
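As a minimal sketch of working with the export client-side, the snippet below groups rows by Resource ID and ranks each group by Recommendation Order ascending, as described above. The exact CSV header names, the sample rows, and the helper `group_recommendations` are assumptions for illustration only; check them against the headers in your own export.

```python
import csv
import io
from collections import defaultdict


def group_recommendations(csv_text):
    """Group export rows by Resource ID, ranked by Recommendation Order ascending.

    Header names are assumed; adjust them to match your export file.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["Resource ID"]].append(row)
    for rows in grouped.values():
        # Recommendation Order 1 is the top recommendation for the resource.
        rows.sort(key=lambda r: int(r["Recommendation Order"]))
    return dict(grouped)


# Made-up sample data standing in for an exported spreadsheet.
SAMPLE = """Resource ID,Recommendation Order,Recommendation,New resource type
i-aaa,2,Rightsize,m5.large
i-aaa,1,Rightsize,m5.xlarge
i-bbb,1,Terminate,
"""

by_resource = group_recommendations(SAMPLE)
# The first row in each group is the top recommendation for that resource.
top = {rid: rows[0] for rid, rows in by_resource.items()}
```

From here, any further filtering (for example, keeping only Rightsize rows) can be applied to `by_resource` before picking the first row in each group.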
When you select Details for a resource, you’ll see time trends for the utilization metrics relevant to each service. For EC2, for example, we plot CPU (%), Network (Mbps), Disk (MB/s), and Memory (%) over the past 10 days. A ten-day window captures the most recent performance trends, which are more predictive of future resource use than older data. We plot maximum values so that the recommendations we model account for peak utilization, which is crucial for determining viable sizes.
On the details page, we display the top recommendations above the utilization metrics charts. For each recommendation option selected, we’ll show the capacity of the recommended resource (relative to the capacity of the original resource) as a dashed yellow line on the chart for each metric. Select additional Recommendation options to model them against actual utilization over the past ten days.
The metric maximums previously mentioned ensure that we show the peak of the resource’s utilization. Relying on averages may result in ‘clipping’ if you move your workload to an instance that does not have the capacity you’ll need.
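To illustrate why peaks matter, here is a small sketch that compares sizing against the average with sizing against the maximum. The utilization numbers, the 50% capacity figure, and the helper `fits` are all made up for illustration; this is not Cloudability's actual model.

```python
from statistics import mean


def fits(capacity, samples, use_peak=True):
    """Return True if the candidate's capacity covers the observed demand.

    capacity and samples are both percentages of the current resource's capacity.
    """
    demand = max(samples) if use_peak else mean(samples)
    return demand <= capacity


# Hypothetical CPU utilization (%) for ten days, with one busy day.
cpu = [10, 12, 15, 11, 90, 14, 13, 12, 10, 11]

# Hypothetical smaller instance offering half the CPU of the current one.
half_size = 50.0

looks_ok_on_average = fits(half_size, cpu, use_peak=False)  # average is 19.8%
ok_at_peak = fits(half_size, cpu, use_peak=True)            # peak is 90%
```

Judged on the average, the smaller instance appears to fit; judged on the peak, the workload would be clipped on the busy day.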
Our rightsizing recommendations give you the opportunity to balance risk against savings, where risk measures the likelihood of resource ‘clipping.’ We arrange the recommendations from left to right, with the left-most option being the top recommendation: the highest-savings option in the lowest risk category. We’ll also model additional recommendations that return higher savings, though with higher risk. As you move right through the recommendation options, we’ll show the highest-savings options in the higher risk categories before returning to the second-highest-savings options in the lower risk categories.
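One way to read the left-to-right ordering is as a round-robin across risk categories: the best-savings option from each category in order of increasing risk, then the second-best from each, and so on. The sketch below implements that reading only; the tuple layout, the numeric risk categories, and the function `display_order` are assumptions, and the product's actual ordering logic may differ.

```python
from collections import defaultdict
from itertools import zip_longest


def display_order(options):
    """Order options round-robin across risk categories.

    options: list of (name, risk_category, savings), lower risk_category = safer.
    """
    by_risk = defaultdict(list)
    for opt in options:
        by_risk[opt[1]].append(opt)
    # Within each risk category, put the highest savings first.
    tiers = [
        sorted(by_risk[risk], key=lambda o: o[2], reverse=True)
        for risk in sorted(by_risk)
    ]
    ordered = []
    # Take the best from every category, then the second best, and so on.
    for round_ in zip_longest(*tiers):
        ordered.extend(o for o in round_ if o is not None)
    return ordered


# Made-up options: (name, risk category, savings over 10 days).
options = [("A", 0, 100), ("B", 0, 60), ("C", 1, 150), ("D", 2, 200), ("E", 1, 120)]
names = [o[0] for o in display_order(options)]
```

With this sample data, the lowest-risk, highest-savings option "A" comes first, followed by the best option in each riskier category, before the list returns to the second-best low-risk option.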
How are the recommendations calculated?
To help your teams act on the rightsizing recommendations, let’s review how we calculate each of them. For any particular resource, we first find all candidate recommendations. With an EC2 instance, for example, this would include all instance types that are available for the particular Region and OS used by the original workload. We then limit the candidate set to those with a lower On-Demand price than the effective rate paid over the prior ten days (which potentially reflects savings from RIs and Spot).
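The candidate-selection step described above can be sketched as a simple filter. The `Candidate` type, the `candidate_set` function, and the prices in the sample catalog are hypothetical and exist only to show the shape of the logic.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    instance_type: str
    region: str
    os: str
    on_demand_hourly: float  # On-Demand price per hour (illustrative values)


def candidate_set(catalog, region, os, effective_hourly_rate):
    """Keep types available for the Region and OS that cost less On-Demand
    than the effective rate currently being paid."""
    return [
        c for c in catalog
        if c.region == region
        and c.os == os
        and c.on_demand_hourly < effective_hourly_rate
    ]


# Made-up catalog; prices are not real AWS prices.
catalog = [
    Candidate("type-small", "us-east-1", "Linux", 0.05),
    Candidate("type-medium", "us-east-1", "Linux", 0.10),
    Candidate("type-large", "us-east-1", "Linux", 0.20),
    Candidate("type-small", "us-west-2", "Linux", 0.05),
    Candidate("type-small", "us-east-1", "Windows", 0.07),
]

# Effective rate paid over the prior ten days (may already reflect RIs or Spot).
candidates = candidate_set(catalog, "us-east-1", "Linux", effective_hourly_rate=0.12)
candidate_types = [c.instance_type for c in candidates]
```

Because the comparison uses the effective rate rather than the list price, a resource already covered by a discount has a smaller candidate set than the same resource billed at On-Demand rates.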
We then use proprietary algorithms to model the performance of the existing workload on each element in the candidate set of recommendations, looking at key performance characteristics as well as theoretical and empirical performance limits. Our algorithms return a ranked list of recommendations, with explicit scores for both Savings and Risk, so that you can run your infrastructure with maximum efficiency.
Note that for EC2, we use baseline performance for burstable resources (e.g. T2 instance types) and do not account for the burst in determining the recommendations. When we recommend that you move an EC2 instance to an instance type with no local storage, the recommendation includes adding EBS, and we account for both in the savings. Lastly, we approximate EC2 Network and Disk throughput, since AWS does not report maximum theoretical values.
Idle Score for EC2 is the time spent below 2% CPU, on a scale of 1-100. Idle Score for EBS is the percent of hours with zero IOPS.
Spend is instance usage spend for EC2 and RDS; for Redshift, it excludes data transfer; for S3 and EBS, it is based on GB-Months.
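The Idle Score definitions above can be sketched as simple percentages. This is an approximation under stated assumptions: hourly samples for EC2, a plain 0-100 percentage rather than the product's exact 1-100 scaling, and hypothetical function names.

```python
def ec2_idle_score(hourly_cpu_percent, threshold=2.0):
    """Percent of samples with CPU below the threshold (2% per the definition)."""
    if not hourly_cpu_percent:
        return 0.0
    idle = sum(1 for v in hourly_cpu_percent if v < threshold)
    return 100.0 * idle / len(hourly_cpu_percent)


def ebs_idle_score(hourly_iops):
    """Percent of hours in which the volume recorded zero IOPS."""
    if not hourly_iops:
        return 0.0
    idle = sum(1 for v in hourly_iops if v == 0)
    return 100.0 * idle / len(hourly_iops)


# Made-up samples: two of four CPU readings are below 2%.
ec2_score = ec2_idle_score([1.0, 0.5, 3.0, 50.0])
# Made-up samples: three of four hours had zero IOPS.
ebs_score = ebs_idle_score([0, 0, 0, 5])
```

A high score on either measure suggests the resource is a candidate for the Terminate recommendation rather than a resize.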