Enterprises migrating their infrastructure to the cloud is no longer news in 2023. However, as comprehensive and often expensive cloud services become widely accessible, one aspect that often gets overlooked is the cost associated with them. This is where cloud cost optimization comes into play.
2023, thus far, has been the year of cloud cost optimization. Companies across industries are focused on reducing their cloud spend, and the CEOs of major cloud providers (AWS, Azure, GCP) have openly discussed, while presenting financial results, how customers are focusing on optimization and how the providers are helping them do it.
To effectively optimize cloud costs, organizations must have a thorough understanding of their cloud usage patterns, resource allocation, and pricing models. By analyzing these factors, businesses can identify areas where costs can be reduced, such as eliminating idle resources, rightsizing instances, or leveraging discounts and reserved instances.
Amreth Chandrasehar co-created the Conducktor platform, built Observability platforms at T-Mobile and Informatica, and is now exploring the possibilities in the domain of ML Engineering (ML Ops and ML Infrastructure). The Conducktor platform hosts applications that handle T-Mobile’s 100+ million customers and is considered to be a state-of-the-art solution.
Together, his projects have saved more than $50 million at T-Mobile and Informatica in infrastructure, tool, and license costs. He has also built some of the largest Kubernetes and Elasticsearch clusters, which serve critical tools in large organizations.
Cloud cost optimization can be achieved in many ways. The easy wins come from identifying and eliminating unused and underutilized resources: for example, deleting unused storage volumes, buckets in object storage (e.g. AWS S3), managed services such as databases and streaming services, load balancers, and VMs. In addition, by carefully using Observability data such as CPU, memory, and network traffic patterns for VMs (e.g. AWS EC2 instances), databases, and streaming services, nodes can be resized to reduce cost. Auto scaling features for VMs and for container platforms such as Kubernetes help scale on demand, provided the right policies are set.
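To illustrate how Observability data can drive these decisions, here is a minimal sketch in Python. It is not the tooling used at T-Mobile or Informatica. The `InstanceUsage` record, the `recommend` function, and both thresholds are hypothetical names and values chosen for the example, and a real policy would also look at memory, network, and longer observation windows.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InstanceUsage:
    """Hypothetical record of CPU samples collected for one VM."""
    name: str
    vcpus: int
    cpu_percent: List[float]  # CPU utilization samples, 0-100


def recommend(usage: InstanceUsage,
              idle_below: float = 5.0,
              downsize_below: float = 40.0) -> str:
    """Return 'eliminate', 'downsize', or 'keep' for one instance.

    The decision uses peak CPU rather than the average, so a VM that is
    mostly quiet but occasionally busy is not shrunk by mistake.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    peak = max(usage.cpu_percent)
    if peak < idle_below:
        return "eliminate"   # never meaningfully used in the window
    if peak < downsize_below and usage.vcpus > 1:
        return "downsize"    # headroom even at peak: try a smaller size
    return "keep"


if __name__ == "__main__":
    fleet = [
        InstanceUsage("batch-old", 4, [1.0, 2.0, 3.0]),
        InstanceUsage("api-1", 8, [10.0, 25.0, 30.0]),
        InstanceUsage("db-1", 8, [20.0, 85.0]),
    ]
    for vm in fleet:
        print(vm.name, recommend(vm))
```

Basing the decision on the peak is a deliberately conservative choice; teams comfortable with auto scaling might instead use a high percentile.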
Cost can also be reduced by moving to different instance types, such as those based on AWS Graviton (ARM) or AMD processors, which can cut the cost of VMs and database services by up to 40%. In AWS, by migrating from gp2 to gp3 volumes, we were able to reduce EBS cost by 20%. Cloud platform teams should keep looking for newer generations of VM, network, and storage services to optimize cost and get more performance for the same dollars spent. Cloud governance policies and automated reports that track cloud usage are key to ensuring that teams do not drift away from the savings already achieved, and that they stay focused on spending less on cloud while continuing to innovate to provide the best customer experience. These changes have enabled organizations to reduce cloud spend by millions of dollars.
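The 20% EBS figure is consistent with a simple per-GB price comparison. The sketch below assumes list prices of $0.10 per GB-month for gp2 and $0.08 per GB-month for gp3. These are illustrative values that vary by region and over time, and the function names are hypothetical. It also ignores any IOPS or throughput provisioned above the gp3 baseline, which is billed separately.

```python
# Assumed per-GB-month storage prices (illustrative; check current
# regional pricing before relying on them).
GP2_PRICE_PER_GB_MONTH = 0.10
GP3_PRICE_PER_GB_MONTH = 0.08


def monthly_storage_cost(size_gb: float, price_per_gb_month: float) -> float:
    """Storage-only monthly cost of a volume, in dollars."""
    return size_gb * price_per_gb_month


def gp3_savings_fraction(size_gb: float) -> float:
    """Fraction of storage cost saved by moving a volume from gp2 to gp3."""
    gp2 = monthly_storage_cost(size_gb, GP2_PRICE_PER_GB_MONTH)
    gp3 = monthly_storage_cost(size_gb, GP3_PRICE_PER_GB_MONTH)
    return (gp2 - gp3) / gp2


if __name__ == "__main__":
    size_gb = 1000
    print(f"gp2: ${monthly_storage_cost(size_gb, GP2_PRICE_PER_GB_MONTH):.2f}/month")
    print(f"gp3: ${monthly_storage_cost(size_gb, GP3_PRICE_PER_GB_MONTH):.2f}/month")
    print(f"savings: {gp3_savings_fraction(size_gb):.0%}")
```

Because the saving is a ratio of per-GB prices, it does not depend on volume size under these assumptions.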
Amreth Chandrasehar is a cloud computing veteran who has been in the business for many years. He is currently the Director of ML Engineering, Observability and SRE at Informatica. He has been an AWS Community Builder since 2022 and is regarded as one of the finest software engineers in cloud computing. His expertise is reflected in a number of academic papers, including “Generative AI and LLM Optimizing Techniques for Developing Cost Effective Enterprise Applications” and “Solving complexity and improving storage efficiency in managing Elasticsearch clusters using Kubernetes and ECK operators”.
Discussing cloud cost optimization strategies that can save an enterprise substantial sums, Amreth shared his expertise with us:
Optimizing cloud cost helps organizations in a number of ways. Organizations can see improved profits, because reducing cloud cost improves their bottom line. It also makes them more competitive, since they can lower the price of their products and deliver greater value to their customers. More importantly, it helps foster innovation: the savings from optimization can be invested in new product development and in improving the experience for internal and external customers. In the long term, this can potentially become a new line of business. One example is how Slack was born from a gaming company, Tiny Speck.
In the organizations I have worked for, the savings from cost optimization have been crucial to several innovative projects. At T-Mobile, the Conducktor platform was developed to keep cost flat while we continued to onboard dozens of new applications and workloads to the cloud from on-premises. The critical need to provide a great customer experience (T-Mobile’s Uncarrier moves) while keeping cloud cost from growing exponentially pushed us to innovate by developing a new platform named Conducktor for container management, and Kardio.io for endpoint monitoring and Observability. At Informatica, moving to an in-house Observability solution helped save $20 million over 3 years. We continued to optimize, reducing the cost per GB of data stored in Observability by 20% every year for the last 3 years. This enabled the team to invest in developing Machine Learning platforms for ML Infrastructure, ML Ops solutions for Generative AI, and AIOps solutions that help operations teams debug issues faster (improving MTTx metrics) and reduce alert fatigue.
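To get a sense of what a 20% yearly reduction in cost per GB adds up to, the short sketch below compounds it over three years. It assumes each year's 20% is measured against the previous year's cost, which the account above does not specify. The function name and the normalized starting cost of 1.0 are hypothetical.

```python
def compounded_unit_cost(initial_cost: float,
                         yearly_reduction: float,
                         years: int) -> float:
    """Unit cost after applying the same fractional reduction every year.

    Assumes each reduction is relative to the previous year's cost.
    """
    return initial_cost * (1.0 - yearly_reduction) ** years


if __name__ == "__main__":
    start = 1.0  # normalized cost per GB in year 0 (illustrative)
    end = compounded_unit_cost(start, 0.20, 3)
    print(f"cost per GB after 3 years: {end:.3f} of the original")
    print(f"cumulative reduction: {1.0 - end / start:.1%}")
```

Under that assumption, the cost per GB after three years is 0.8 × 0.8 × 0.8 = 0.512 of the starting value, a cumulative reduction of roughly 49%.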