Cloudability’s Rightsizing Recommendations help you optimize cost by building and scaling your cloud infrastructure tightly around your needs. Because cloud vendors bill you for the services you provision rather than what you consume, underutilized resources incur excess spend that you can avoid by moving to resources that better match your environment. We provide a time trend of the utilization metrics for each resource so you can evaluate our recommendations and take the action that best suits your resource demands. We also show multiple scenarios with their associated risk and savings, giving you the data to make rightsizing decisions. This article walks through the Rightsizing Recommendations feature, starting with the overview page.
Access the Rightsizing feature via the Optimize dropdown in the top navigation bar. Select your desired cloud vendor and service. Note that recommendations are not yet available for Amazon S3, RDS, and Redshift.
The main table shows all resources with spend in the specified time period. Currently we set our analysis to evaluate the last 10 days. Read our FAQ to understand why we chose 10 days. We display resources sorted descending by Cost Savings. By starting at the top of the list, you focus your attention on underutilized resources that can save you the most money.
For each resource, in addition to resource and account information, you’ll see the Current resource type and its associated Idle value. Read our FAQ to understand how we define Idle. Cost (Total) shows your spend over the time period, which may include committed use discounts, custom pricing, and, for Amazon EC2, any Spot usage in the billing for that resource. Additional detail on how we determine spend is in the FAQ. Next you’ll see our top Recommendation and, for a Rightsize recommendation, what the New resource type should be. The possible recommendations are:
- Rightsize - resize to the New resource type
- Terminate - the resource is predominantly idle
- Autoscale - set up autoscaling for the resource
- No Action - do not take any action on the resource at this time
- Incomplete Util Data - (1) ensure your policy is up to date with the latest permissions, or (2) determine if resources were active for less than an hour as they may not appear in the Describe Data cache from AWS
Export overview table to see additional details
Additional details for all resources are available via export on the main page. On the spreadsheet you can find additional account information, region, operating system as well as the Effective Rate for the current and new resource type. Tags available per resource are present as well.
Details page for utilization time trends
To know how to rightsize your resources, you’ll need just-in-time information on the utilization of each resource. Select the Details button at the right of each row in the main table to get time trends of the utilization metrics relevant to each service. For compute instances (for example Amazon EC2 or Azure Compute), we plot CPU (%), Network (Mbps), Disk (MB/s), and Memory (%). Read our FAQ to find out why our analysis runs on a 10-day window, and our glossary to understand the source of each metric.
We use maximum values for each utilization metric. Why is the maximum, rather than the average, the best practice? Maximums show the peak of the resource’s utilization. Averages hide those peaks, so relying on them may result in ‘clipping’ if you move your workload to an instance that does not have the capacity you’ll need at peak.
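The difference between the two views can be illustrated with a small sketch. The utilization samples, the `capacity_ratio`, and the `projected` helper below are all made up for illustration; this is not how Cloudability models a workload, only a way to see why an average can hide a peak.

```python
from statistics import mean

# Hypothetical hourly CPU utilization (%) on the current instance.
cpu_percent = [12, 15, 10, 14, 88, 11, 13, 9]

# Hypothetical candidate with half the CPU capacity of the current instance.
capacity_ratio = 0.5


def projected(utilization, ratio):
    """Express current utilization (%) against a candidate with `ratio` of the capacity."""
    return utilization / ratio


avg_projected = projected(mean(cpu_percent), capacity_ratio)
max_projected = projected(max(cpu_percent), capacity_ratio)

# A projected value above 100% means the candidate would clip the workload.
fits_by_average = avg_projected <= 100
fits_by_maximum = max_projected <= 100

print(f"average view: {avg_projected:.1f}% -> fits: {fits_by_average}")
print(f"maximum view: {max_projected:.1f}% -> fits: {fits_by_maximum}")
```

In this made-up series the average suggests the smaller instance is comfortable, while the single peak shows the workload would exceed its capacity.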
To give you the greatest flexibility when rightsizing your infrastructure, we provide multiple recommendations and allow you to model each recommended resource against your current utilization. We display up to 5 recommendations on the details page. Click a recommendation to toggle its yellow dashed line on the chart.
We know our customers want the ability to update the recommendations based on the requirements of their organization. For example, for Amazon EC2 we recommend the newest generation instances by default, but you may have a pricing or infrastructure need that requires you to resize only to prior generation instances. Likewise, you may prefer to stay within the same instance family for ease of resizing, as is the case for some Azure Compute workloads. Because of this, we allow you to select options on the details page that will update the recommendations.
For Amazon EC2, the options are:
- Show newest generation instance recommendations to ensure the recommendation is restricted to the latest instance families
- Remain in the same instance family if you have a need to keep your workload on the particular family
- Model equivalent memory capacity for cases where the memory metric is not available for the resource
Our rightsizing recommendations give you the opportunity to balance risk (the likelihood of resource ‘clipping’) against savings. We arrange the recommendations from left to right. The left-most option is the top Cloudability recommendation: the highest savings option in the lowest risk category. We also model additional recommendations that return higher savings, though with higher risk. As you move right through the options, we first show the highest savings option in each higher risk category, then return to the lowest risk category for the second highest savings options.
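One way to read the left-to-right ordering described above is as a round-robin across risk categories. The sketch below follows that reading only; the tuple layout, the numeric risk levels, and the `display_order` function are hypothetical and are not Cloudability's actual ranking logic.

```python
from itertools import zip_longest


def display_order(recommendations, limit=5):
    """Order (name, risk_level, savings) tuples for display.

    Assumption: a lower risk_level means lower risk. The best-savings option
    of each risk category comes first (lowest risk first), then the
    second-best of each category, and so on.
    """
    by_risk = {}
    for rec in recommendations:
        by_risk.setdefault(rec[1], []).append(rec)

    # One column per risk category, each sorted by savings, highest first.
    columns = [
        sorted(by_risk[risk], key=lambda rec: rec[2], reverse=True)
        for risk in sorted(by_risk)
    ]

    ordered = []
    for tier in zip_longest(*columns):
        ordered.extend(rec for rec in tier if rec is not None)
    return ordered[:limit]


# Hypothetical recommendations: (name, risk_level, monthly savings in USD).
recs = [
    ("type-a", 1, 10.0),
    ("type-b", 1, 6.0),
    ("type-c", 2, 20.0),
    ("type-d", 2, 15.0),
    ("type-e", 3, 30.0),
]

print([name for name, _, _ in display_order(recs)])
```

The `limit` default of 5 mirrors the maximum number of recommendations shown on the details page.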
How are the recommendations calculated?
To help your teams act on the rightsizing recommendations, let’s review how we calculate each of them. For any particular resource, we first find all candidate recommendations. For a compute instance, for example, this includes all instance types that are available for the particular Region and OS used by the original workload. We then limit the candidate set to those with a lower hourly (On-Demand or Pay-as-you-go) price than the effective rate paid over the prior ten days.
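The candidate filtering step can be sketched roughly as follows. The `InstanceType` record, the `candidate_set` function, and the catalog entries with their prices are hypothetical placeholders for illustration, not real instance types or real pricing data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    region: str
    os: str
    hourly_price: float  # On-Demand / Pay-as-you-go list price, USD per hour


def candidate_set(catalog, region, os, effective_rate):
    """Keep types available for the same Region and OS that cost less per hour
    than the effective rate paid for the current resource."""
    return [
        t for t in catalog
        if t.region == region and t.os == os and t.hourly_price < effective_rate
    ]


# Hypothetical catalog; names and prices are invented.
catalog = [
    InstanceType("size-small", "us-east-1", "Linux", 0.05),
    InstanceType("size-medium", "us-east-1", "Linux", 0.10),
    InstanceType("size-large", "us-east-1", "Linux", 0.20),
    InstanceType("size-small", "us-east-1", "Windows", 0.07),
    InstanceType("size-small", "eu-west-1", "Linux", 0.06),
]

# Assume the current resource paid an effective rate of $0.15/hour.
candidates = candidate_set(catalog, "us-east-1", "Linux", effective_rate=0.15)
print([c.name for c in candidates])
```

Comparing list prices against the effective rate, rather than against the current type's list price, means a resource already covered by discounts has a smaller candidate set.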
We then use proprietary algorithms to model the performance of the existing workload on each element in the candidate set of recommendations, looking at key performance characteristics as well as theoretical and empirical performance limits. Our algorithms return a ranked list of recommendations, with explicit scores for both Savings and Risk, so that you can run your infrastructure with maximum efficiency.
Why is the recommendation time period 10 days?
Ten days captures the most recent performance trends and is more predictive of future resource use.
How is Idle defined?
Idle for EC2 is the time spent below 0.5% CPU, on a scale of 1-100. Idle for EBS is percent of hours with zero IOPS.
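As a minimal sketch of these two definitions, assuming one utilization sample per hour, Idle could be computed along these lines. The function names and sample values are hypothetical, and the sampling interval is an assumption not stated above.

```python
def ec2_idle_percent(cpu_samples, threshold=0.5):
    """Percent of samples with CPU utilization (%) below the threshold."""
    if not cpu_samples:
        return None  # no utilization data available
    idle = sum(1 for cpu in cpu_samples if cpu < threshold)
    return 100.0 * idle / len(cpu_samples)


def ebs_idle_percent(hourly_iops):
    """Percent of hours in which the volume recorded zero IOPS."""
    if not hourly_iops:
        return None  # no utilization data available
    idle = sum(1 for iops in hourly_iops if iops == 0)
    return 100.0 * idle / len(hourly_iops)


# Hypothetical samples.
print(ec2_idle_percent([0.1, 0.3, 5.0, 0.4]))
print(ebs_idle_percent([0, 0, 10, 0, 25]))
```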
How do you determine spend?
For EC2 and RDS, spend is instance usage spend. For Redshift, spend excludes data transfer. For S3 and EBS, spend is the GB-Month charges.
Why do I see (not set) under State for EBS?
The account may have insufficient permissions (ensure your policy is up to date with the latest permissions), or the resource was active for less than an hour, in which case it may not appear in the Describe Data cache from AWS.
For EC2, do the recommendations take into account burst?
We use baseline performance for burstable resources (e.g. T2 instance types) and do not account for the burst in determining the recommendations. This ensures we make conservative recommendations that won't result in resource ‘clipping.’
For EC2, how do you determine Network and Disk?
AWS does not report EC2 Network and Disk throughput limits. Furthermore, the throughput capacity will vary based on both workload (e.g. sequential versus random read and write) and transfer (e.g. data transfer within a region versus across regions). We use common observed limits across our customer portfolio to approximate capacity.
Note: When we recommend that you move an EC2 instance type to an instance with no local storage, we incorporate into the recommendation that you add EBS, and account for both in the savings.