TechLevity

Cloud Cost Optimisation: A 90-Day Guide to 30–50% Savings

Cloud costs spiralling faster than revenue? Here is the 90-day programme TechLevity uses to achieve 30–50% cost reduction for UK PE-backed scale-ups — without touching performance.

The cloud bill arrives at the end of the month. It is 40% higher than last month. No one can explain why. The ops team blame the growth in traffic. The engineering team blame the ops team. The CFO is asking whether this is sustainable.

It is not.

Cloud costs that grow faster than revenue are one of the most common and most avoidable problems in PE-backed scale-ups. The good news: in TechLevity's experience across UK PE portfolio companies, 30–50% cost reduction is achievable in 90 days, without touching performance, without a migration, and without a budget for new tooling. The savings are already in your estate — they just need to be found.

This guide covers how to find them. It is the methodology behind TechLevity's cloud cost optimisation programme — a 90-day engagement that identifies waste in Week 1, implements quick wins by Day 30, and leaves you with a FinOps framework that keeps costs controlled as you scale.

Whether you are a CTO trying to explain a growing bill to the board, a PE operating partner looking for EBITDA improvement levers, or an engineering lead who knows there is waste but cannot quantify it — this guide gives you the framework to act.

Why Cloud Bills Spiral Out of Control

Cloud cost overruns are rarely the result of a single bad decision. They are the accumulated consequence of dozens of reasonable decisions made under pressure, never revisited, and never assigned an owner. Four patterns account for 80% of the waste we find.

Over-Provisioning During Rapid Scaling

When companies scale quickly — whether through organic growth or post-acquisition — engineering teams default to over-provisioning. Larger instances, more replicas, wider load balancer capacity. The reasoning is sound: availability is non-negotiable, and the cost of an outage outweighs the cost of unused capacity. The problem is that the "temporary" over-provisioning never gets reviewed. Six months later, you are running production workloads on instances three times larger than they need to be.

Development and Staging Environments That Never Get Switched Off

Development environments are typically provisioned at 30–50% of production capacity. They are also typically left running 24/7, 365 days a year. For a company spending £50k/month on production, that is £15k–£25k/month on infrastructure that is used for roughly eight hours a day, five days a week, or about a quarter of the hours it is billed for. The fix takes an afternoon.
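To make the arithmetic concrete, here is a minimal sketch of the saving from running non-production environments only during working hours. It assumes cost scales linearly with running hours, which holds for on-demand compute but not for attached storage; the function name and defaults are illustrative.

```python
HOURS_PER_WEEK = 24 * 7  # 168 billable hours in a week


def scheduled_saving(monthly_cost: float,
                     hours_per_day: float = 8,
                     days_per_week: int = 5) -> float:
    """Estimated monthly saving from switching an environment off outside working hours.

    Assumption: cost is proportional to running hours (on-demand compute only).
    """
    used_fraction = (hours_per_day * days_per_week) / HOURS_PER_WEEK
    return monthly_cost * (1 - used_fraction)


# Non-production at 30-50% of a £50k/month production bill
low = scheduled_saving(15_000)
high = scheduled_saving(25_000)
print(f"Estimated saving: £{low:,.0f} to £{high:,.0f} per month")
```

Forty hours out of 168 is roughly 24% utilisation, so about three quarters of the always-on compute bill is avoidable under these assumptions.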

No Cost Ownership Culture

In most scale-ups, cloud costs are treated as an infrastructure budget line owned by the CTO or head of platform. Product and engineering teams have no visibility into what their features cost to run and no incentive to optimise. A recommendation system that runs unnecessarily on GPU instances because no one checked whether CPU would be sufficient is a product decision with a £15k/month price tag — but the product team never sees that bill.

Post-Acquisition Cloud Sprawl

PE buy-and-build strategies create a specific cloud cost problem: multiple acquisitions running separate cloud accounts, separate tooling, separate reserved instance commitments, and no consolidated view. Cross-portfolio rationalisation is one of the highest-leverage interventions available to a PE operating partner — but it requires someone to own it. For a detailed approach to technology consolidation after acquisitions, see our guide on post-M&A technology integration.

⚠️ The most expensive cloud architecture decision is the one made under time pressure during a scaling event and never revisited. We consistently find 20–30% of cloud spend at PE portfolio companies is directly attributable to provisioning decisions made 12–24 months earlier that were never reviewed.

The Cloud Cost Audit: What to Measure First

The audit is where the money is found. It is not a strategic exercise — it is a forensic one. You are looking for resources that cost money and deliver no value. Structure the audit in four phases.

Resource Inventory

Start with a complete inventory of every compute, storage, networking, and managed service resource across all accounts and regions. Use cloud-native tooling (AWS Cost Explorer, GCP Cloud Asset Inventory, Azure Resource Graph) supplemented by a resource tagging audit. Any resource without a cost allocation tag is a red flag — it means the cost has no owner and no business justification.

Waste Identification

Flag every resource that is either stopped (but still accruing storage costs), idle (running but serving minimal traffic), or severely over-provisioned. Common findings: stopped EC2 instances still paying for EBS volumes; RDS databases running in multi-AZ mode for workloads that do not require it; NAT gateways in VPCs with no traffic; Elastic IPs left allocated but unattached after their instances were terminated.
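The triage above can be sketched as a simple classifier over an exported inventory. The `Resource` record, its field names, and both CPU thresholds are assumptions for illustration; in practice you would populate the records from your provider's billing and monitoring exports and tune the thresholds to your workloads.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    resource_id: str
    state: str            # "running" or "stopped"
    avg_cpu_pct: float    # 30-day average CPU utilisation
    monthly_cost: float   # includes attached storage


IDLE_CPU_PCT = 5.0               # assumed threshold, tune per workload
OVERPROVISIONED_CPU_PCT = 20.0   # assumed threshold, tune per workload


def classify(resource: Resource) -> str:
    """Return a waste category for one resource, or "ok"."""
    if resource.state == "stopped":
        # Stopped compute still pays for attached volumes and snapshots.
        return "stopped" if resource.monthly_cost > 0 else "ok"
    if resource.avg_cpu_pct < IDLE_CPU_PCT:
        return "idle"
    if resource.avg_cpu_pct < OVERPROVISIONED_CPU_PCT:
        return "over-provisioned"
    return "ok"


inventory = [
    Resource("vm-a", "stopped", 0.0, 40.0),
    Resource("vm-b", "running", 2.0, 300.0),
    Resource("vm-c", "running", 12.0, 900.0),
    Resource("vm-d", "running", 55.0, 900.0),
]
findings = {r.resource_id: classify(r) for r in inventory}
```

CPU alone is a crude signal; memory, I/O, and network utilisation should be checked before acting on any individual finding.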

Rightsizing Analysis

For each compute resource, compare the provisioned instance size against actual CPU, memory, and I/O utilisation over a 30-day period. Most cloud platforms have built-in rightsizing recommendations, but they tend to be conservative. In practice, teams can often right-size more aggressively for non-production workloads. The key is to separate production from non-production: production right-sizing requires monitoring windows and rollback plans; non-production can be resized immediately.
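As a rough illustration of the comparison, the sketch below derives a target vCPU count from 95th-percentile CPU utilisation. The 60% target and the power-of-two size steps are assumptions, not a provider rule, and the function only ever recommends shrinking.

```python
def recommended_vcpus(current_vcpus: int,
                      p95_cpu_pct: float,
                      target_pct: float = 60.0) -> int:
    """Smallest power-of-two vCPU count that keeps p95 CPU at or below target_pct.

    Assumptions: CPU is the binding constraint and load scales linearly with vCPUs.
    """
    needed = current_vcpus * p95_cpu_pct / target_pct
    size = 1
    while size < needed:
        size *= 2  # instance sizes usually double from one step to the next
    return min(size, current_vcpus)  # never recommend growing here


# A 16-vCPU instance peaking at 20% CPU fits comfortably on 8 vCPUs
print(recommended_vcpus(16, 20.0))
```

Memory-bound or I/O-bound workloads need the same check on their own limiting metric before any resize.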

Reserved Instance and Savings Plan Mapping

Review what proportion of your compute spend is on on-demand pricing versus reserved instances or savings plans. Committed pricing is typically 40–60% cheaper than the equivalent on-demand rate, depending on term length and payment option. For any workload you expect to keep running continuously for the next 12 months or more, a 1-year reserved instance or savings plan commitment should be the default. Use our technology assessment checklist to structure your cloud cost audit and ensure nothing is missed.
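A minimal sketch of the coverage calculation, assuming you can export compute line items labelled by pricing model. The labels, the 40% discount, and the steady-state share are illustrative inputs, not measured figures.

```python
def commitment_summary(line_items: list[tuple[str, float]],
                       steady_state_fraction: float = 0.5,
                       assumed_discount: float = 0.40) -> tuple[float, float]:
    """Return (coverage, potential_monthly_saving) for compute line items.

    line_items: (pricing_model, monthly_cost) pairs; "on_demand" marks uncommitted spend.
    steady_state_fraction: assumed share of on-demand spend that runs continuously.
    """
    total = sum(cost for _, cost in line_items)
    on_demand = sum(cost for model, cost in line_items if model == "on_demand")
    coverage = (total - on_demand) / total if total else 0.0
    potential_saving = on_demand * steady_state_fraction * assumed_discount
    return coverage, potential_saving


coverage, saving = commitment_summary([
    ("on_demand", 30_000.0),
    ("savings_plan", 10_000.0),
])
print(f"Coverage: {coverage:.0%}, potential saving: £{saving:,.0f}/month")
```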

💡 Most companies find 15–20% of their cloud waste in the first week of the audit: in stopped resources still paying for storage, dev environments running overnight, and instances sized for traffic peaks that happened 18 months ago. Start there. The quick wins pay for the deeper work.

Compute Optimisation: Right-Sizing, Reserved Instances, and Spot Pricing

Compute is typically 50–70% of cloud spend. It is also where the largest single-action savings are available.

Right-Sizing in Practice

Right-sizing is not a one-time event. Workloads evolve, traffic patterns change, and instance requirements shift with product features. Build right-sizing reviews into your monthly FinOps cadence. For production workloads, move conservatively — one instance family at a time, with a two-week monitoring window after each change. For non-production workloads, be aggressive: development environments rarely need more than 20% of production capacity during business hours.

AWS-specific: use Compute Optimizer recommendations. The default lookback is 14 days, which captures weekly patterns but can miss monthly cycles; it can be extended to 32 days at no charge, or to 93 days with the paid enhanced infrastructure metrics feature. GCP: use the Recommender API. Azure: use Azure Advisor. All three platforms offer free rightsizing recommendations, so the main cost is the engineering time to implement them.

Reserved Instances and Savings Plans

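Commitments are the single highest-leverage action for most companies spending more than £10k/month on compute. A 1-year no-upfront commitment typically saves around 40% over on-demand for equivalent capacity, and a 3-year commitment around 60%; exact discounts vary by instance family, region, and payment option. The key is selecting the right commitment type: Compute Savings Plans (most flexible: they apply across instance families and regions, and also cover Lambda and Fargate), EC2 Instance Savings Plans (deeper discount, but tied to one instance family in one region), or Standard Reserved Instances (comparable discount to EC2 Instance Savings Plans, but the least flexible).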
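One way to decide whether a workload belongs under a commitment is its break-even utilisation. A commitment is billed for every hour of the term, so it only beats on-demand if the workload runs for more than (1 − discount) of the time. The sketch below assumes a flat discount and ignores upfront-payment effects.

```python
def break_even_utilisation(discount: float) -> float:
    """Fraction of the term a workload must run for a commitment to beat on-demand."""
    return 1.0 - discount


def commitment_is_cheaper(running_fraction: float, discount: float) -> bool:
    """Compare on-demand (pay per hour run) with a commitment (pay for every hour)."""
    return running_fraction > break_even_utilisation(discount)


# With a 40% discount, the workload must run more than 60% of the time
print(break_even_utilisation(0.40))
print(commitment_is_cheaper(1.00, 0.40))  # always-on production baseline
print(commitment_is_cheaper(0.24, 0.40))  # dev environment on a working-hours schedule
```

This is why commitments should cover the always-on baseline, while scheduled non-production environments and bursty workloads stay on on-demand or spot.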

For PE portfolio companies running multiple AWS accounts, consolidated billing with a management account unlocks volume discounts and makes reserved instance sharing across accounts significantly simpler. This is one of the fastest value creation levers available to a PE operating partner — and one we implement in the first 30 days of every engagement.

Spot and Preemptible Instances

For batch workloads, CI/CD jobs, data processing pipelines, and any workload that can tolerate interruption, Spot Instances (AWS), Spot VMs (GCP, formerly preemptible VMs), and Spot VMs (Azure) can deliver savings of up to 70–90% over on-demand, depending on instance type and current demand. The interruption risk is manageable for the right workloads: use spot for anything that can checkpoint and restart, never for stateful production databases.

A practical approach: run your CI/CD pipeline on spot instances with a fallback to on-demand if spot capacity is unavailable. Most CI jobs complete in under 30 minutes — well within typical spot instance lifespans. The savings on CI alone can be £2k–£8k/month for a mid-sized engineering team.
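The fallback pattern can be sketched independently of any particular CI system. Here `launch_spot` and `launch_on_demand` are hypothetical callables standing in for whatever your runner or autoscaler exposes, and `SpotUnavailable` is an illustrative exception rather than a real SDK error.

```python
class SpotUnavailable(Exception):
    """Raised by the (hypothetical) spot launcher when no capacity is available."""


def run_with_fallback(job, launch_spot, launch_on_demand, max_spot_attempts=2):
    """Try spot capacity first; fall back to on-demand after repeated failures."""
    for _ in range(max_spot_attempts):
        try:
            return "spot", launch_spot(job)
        except SpotUnavailable:
            continue  # retry spot before paying on-demand rates
    return "on-demand", launch_on_demand(job)


def always_interrupted(job):
    raise SpotUnavailable()


print(run_with_fallback("build-123", always_interrupted, lambda job: f"{job} done"))
```

Tracking how often the fallback fires tells you whether the spot pool you have chosen is deep enough for your pipeline.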

Data Transfer and Storage Costs: The Hidden Budget Drains

Data transfer and storage are where cloud bills hide their most invisible costs. They rarely appear in architecture discussions, they are almost never included in feature cost estimates, and they accumulate silently until someone notices the line item has doubled.

Egress Costs

Transferring data into a cloud region is free. Transferring data between availability zones within a region is not (typically $0.01–$0.02/GB). Transferring data between regions is more expensive. Transferring data to the internet is the most expensive of all. Applications that were architected assuming all traffic stays within a single AZ, then deployed multi-AZ for resilience, generate data transfer costs the original design never anticipated.

Audit your inter-AZ data transfer first; it is consistently the largest unbudgeted cost category we find. A microservices architecture with 10 services communicating across AZs at 100 requests/second generates meaningful data transfer costs that appear nowhere in the service design documentation.
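A back-of-envelope estimate shows why this matters. The 50 KB average payload, the assumption that every call crosses an AZ boundary, and the $0.02/GB round-trip rate are all illustrative; substitute figures from your own traffic metrics and price list.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month


def monthly_inter_az_cost(services: int,
                          rps_per_service: float,
                          kb_per_request: float,
                          cross_az_fraction: float,
                          usd_per_gb: float) -> float:
    """Rough monthly inter-AZ transfer cost in USD (decimal GB)."""
    gb_per_second = services * rps_per_service * kb_per_request / 1_000_000
    return gb_per_second * SECONDS_PER_MONTH * cross_az_fraction * usd_per_gb


# 10 services at 100 requests/second each, 50 KB per request/response pair
cost = monthly_inter_az_cost(10, 100, 50, cross_az_fraction=1.0, usd_per_gb=0.02)
print(f"${cost:,.0f} per month")
```

Under these assumptions the traffic comes to roughly 130 TB a month, a cost that no individual service owner would see in their own design.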

Storage Tier Mismatches

S3/GCS/Blob storage has multiple tiers: frequent access (highest storage cost, lowest retrieval cost), infrequent access, archive, and deep archive. Most companies have the majority of their data in frequent access regardless of how often it is actually accessed. A lifecycle policy that moves objects older than 30 days to infrequent access and older than 90 days to archive can cut storage costs by 50–70% for most data lake and backup workloads.

The implementation is straightforward: S3 Intelligent-Tiering automates this with no retrieval fees, in exchange for a small per-object monitoring charge; objects under 128KB are not auto-tiered. For GCP and Azure, lifecycle policies require manual configuration but take under an hour to set up per bucket or container.
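For teams that prefer an explicit policy over Intelligent-Tiering, the 30/90-day rule described above can be expressed as an S3 lifecycle configuration. The sketch below only builds the configuration dictionary in the shape boto3 expects; the rule ID and bucket name are placeholders, and the choice of `GLACIER` as the archive class is an assumption.

```python
def lifecycle_configuration(ia_after_days: int = 30,
                            archive_after_days: int = 90,
                            prefix: str = "") -> dict:
    """Build an S3 lifecycle configuration that tiers down ageing objects."""
    if ia_after_days < 30:
        # S3 requires objects to be at least 30 days old before moving to STANDARD_IA.
        raise ValueError("ia_after_days must be at least 30")
    if archive_after_days <= ia_after_days:
        raise ValueError("archive_after_days must be later than ia_after_days")
    return {
        "Rules": [{
            "ID": "tier-down-ageing-objects",  # placeholder rule name
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},      # empty prefix applies to the whole bucket
            "Transitions": [
                {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                {"Days": archive_after_days, "StorageClass": "GLACIER"},
            ],
        }]
    }


config = lifecycle_configuration()
# Applying it (not executed here) would look like:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```

Check retrieval patterns before applying this to any bucket: archive tiers carry retrieval fees and minimum storage durations that can outweigh the saving for data that is read back regularly.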

Unoptimised Data Transfer Architecture

Services that make unnecessary round trips between microservices, or that fetch large datasets from databases when they only need a subset, generate data transfer costs that are invisible until you see the bill. Use VPC endpoints (AWS) or Private Service Connect (GCP) to keep traffic to cloud provider services on the private network rather than routing it through NAT gateways or the public internet. For cross-region replication, audit whether every replicated dataset is actually consumed in the target region; we regularly find replication jobs that were set up for a feature that was later decommissioned.

FinOps as a Practice: From Cost Centre to Shared Accountability

FinOps is not a tool. It is an organisational practice that makes cloud costs visible, attributable, and therefore controllable. Without it, every cost reduction you achieve will drift back within 6–12 months.

Tagging Strategy

Every cloud resource should be tagged with: team or squad owner, product or feature, environment (production/staging/development), and cost centre. Enforce tagging via infrastructure-as-code policy — untagged resources should fail CI. A well-implemented tagging strategy takes 2–3 weeks to roll out and makes every subsequent cost analysis 10× faster.

Start with four mandatory tags: team, environment, product, and cost-centre. Add optional tags for project and temporary-resource (with an expiry date). Audit compliance weekly — any resource missing mandatory tags should generate an alert to the owning team.
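A weekly compliance audit can be as simple as the sketch below, which assumes you have already exported each resource's tags as a dictionary. The resource identifiers are made up for illustration.

```python
MANDATORY_TAGS = ("team", "environment", "product", "cost-centre")


def missing_tags(tags: dict[str, str]) -> list[str]:
    """Return the mandatory tags that are absent or blank on one resource."""
    return [name for name in MANDATORY_TAGS if not tags.get(name, "").strip()]


def non_compliant(resources: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Map each non-compliant resource ID to its missing mandatory tags."""
    report = {rid: missing_tags(tags) for rid, tags in resources.items()}
    return {rid: missing for rid, missing in report.items() if missing}


report = non_compliant({
    "db-orders": {"team": "payments", "environment": "production",
                  "product": "checkout", "cost-centre": "CC-104"},
    "vm-scratch": {"team": "data", "environment": ""},
})
print(report)
```

The same check can run in CI against planned infrastructure changes, so that untagged resources are rejected before they are created.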

Showback vs Chargeback

Showback: show engineering teams and product squads what their infrastructure costs. Chargeback: allocate those costs to their budgets. Start with showback. Most organisations find that visibility alone drives 10–15% cost reduction — teams optimise once they can see the bill for their decisions. Move to chargeback only when your tagging is mature enough to be accurate and your teams have enough context to make informed trade-offs.

The Monthly FinOps Review

Hold a 60-minute monthly review with engineering leads, the CFO, and the CTO (or fractional CTO) covering: total cloud spend vs budget, top 5 cost drivers, month-on-month changes and explanations, reserved instance utilisation rate, and upcoming commitment renewals. This single meeting creates more cost awareness than any tooling investment.

Automated Cost Alerting

Set budget alerts at 80% and 100% of monthly budget, with Slack or Teams notification to engineering leads. Set anomaly detection alerts for any service whose daily cost increases by more than 20% over the 7-day rolling average. Both capabilities are available natively (AWS Budgets and AWS Cost Anomaly Detection, GCP budget alerts, Azure Cost Management), and both take under 30 minutes to configure.

The goal is not to prevent all cost increases — some are justified by growth. The goal is to ensure no cost increase goes unnoticed and unexplained for more than 48 hours.
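If you want to reproduce the anomaly rule yourself, for example against a billing export, a minimal sketch of the 20%-over-7-day-average check looks like this. It assumes one cost figure per day for a single service, in chronological order.

```python
def cost_anomalies(daily_costs: list[float],
                   window: int = 7,
                   threshold: float = 0.20) -> list[int]:
    """Return indexes of days whose cost exceeds the prior rolling average by `threshold`."""
    anomalies = []
    for i in range(window, len(daily_costs)):
        baseline = sum(daily_costs[i - window:i]) / window  # average of the previous `window` days
        if baseline > 0 and daily_costs[i] > baseline * (1 + threshold):
            anomalies.append(i)
    return anomalies


costs = [100.0] * 7 + [115.0, 130.0, 100.0]
print(cost_anomalies(costs))
```

Day 7 (a 15% rise) passes unflagged, while day 8 is flagged because 130 exceeds the trailing average of about 102 by more than 20%.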

Cross-Portfolio Cloud Cost Rationalisation for PE Buy-and-Build Strategies

For PE firms running buy-and-build strategies, cloud cost rationalisation is one of the fastest value creation levers available — and one of the most consistently overlooked.

The typical buy-and-build scenario: three acquisitions in 24 months, each running its own AWS or Azure account, its own reserved instance commitments, its own tooling stack. The PE operating partner has a consolidated P&L view but no consolidated cloud view. Each portfolio company is negotiating separately with cloud providers and paying separate prices.

Four levers for cross-portfolio rationalisation:

  1. Consolidated billing: Link all portfolio company accounts under a single AWS Organization or GCP Cloud Billing account. This gives the PE firm a single consolidated view and unlocks volume discount thresholds that individual companies cannot reach on their own.
  2. Enterprise discount programmes: AWS Enterprise Discount Program (EDP), GCP Committed Use Discounts at an organisation level, and Azure Enterprise Agreements offer 5–15% additional discounts for committed annual spend across a portfolio. Negotiate these at the portfolio level, not the company level.
  3. Reserved instance sharing: Reserved instances purchased at the management account level can be shared across all member accounts. Portfolio companies with variable workloads can benefit from reserved instance capacity purchased for the portfolio's aggregate baseline.
  4. Cross-portfolio benchmarking: Compare cloud cost efficiency metrics across portfolio companies (cost per transaction, cost as percentage of revenue, cost per active user). This identifies outliers and creates healthy competitive pressure within the portfolio.

The most common question I get from PE operating partners is "why are all three of our portfolio companies paying different prices for the same AWS services?" The answer is always the same: they are negotiating separately, committing separately, and leaving cross-portfolio leverage on the table. Consolidating billing and committing at the portfolio level typically saves 10–20% on top of whatever individual-company optimisations you have already done.
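The benchmarking lever lends itself to a simple calculation. The sketch below computes cloud cost as a share of revenue for each company and flags anything well above the portfolio median; the company names, figures, and the 1.5× tolerance are hypothetical.

```python
from statistics import median


def cost_outliers(companies: dict[str, tuple[float, float]],
                  tolerance: float = 1.5) -> dict[str, float]:
    """Flag companies whose cloud cost / revenue ratio exceeds tolerance x the median.

    companies maps name -> (monthly_cloud_cost, monthly_revenue).
    """
    ratios = {name: cloud / revenue for name, (cloud, revenue) in companies.items()}
    benchmark = median(ratios.values())
    return {name: ratio for name, ratio in ratios.items() if ratio > benchmark * tolerance}


portfolio = {
    "Company A": (50_000.0, 1_000_000.0),
    "Company B": (40_000.0, 1_000_000.0),
    "Company C": (120_000.0, 1_000_000.0),
}
print(cost_outliers(portfolio))
```

An outlier is a prompt for a conversation, not a verdict: a data-intensive product may legitimately run at a higher ratio than the rest of the portfolio.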

The 90-Day Cloud Cost Reduction Programme

Based on TechLevity's cloud cost engagements across UK PE portfolio companies. Typical outcome: 30–50% reduction, with savings identified in Week 1 and quick wins implemented by Day 30.

Week 1 — Audit and Triage

  • Complete resource inventory across all accounts and regions
  • Flag all stopped resources still accruing costs (delete or snapshot)
  • Identify all untagged resources (ownership triage)
  • Pull rightsizing recommendations from cloud-native tools
  • Review reserved instance coverage and upcoming commitment renewals
  • Produce a prioritised findings report with estimated savings per item

Days 8–30 — Quick Wins Implementation

  • Switch off development and staging environments outside business hours
  • Delete stopped EC2 instances and their EBS volumes (snapshot first where the data may still be needed)
  • Purchase 1-year reserved instances or savings plans for baseline compute
  • Implement S3 lifecycle policies for infrequent-access and archive tiers
  • Enable spot instances for CI/CD and batch workloads
  • Roll out tagging enforcement via infrastructure-as-code

Days 31–60 — Architecture Changes

  • Right-size production workloads (conservatively, one change at a time)
  • Implement VPC endpoints to cut NAT gateway data processing charges, and keep chatty service-to-service traffic within a single AZ where resilience allows
  • Optimise database instance sizing and backup retention periods
  • Review managed service tier selection (e.g. RDS multi-AZ vs single-AZ for non-critical databases)
  • Set up automated cost alerting and anomaly detection

Days 61–90 — FinOps Framework Rollout and Board Report

  • Launch the monthly FinOps review cadence
  • Roll out showback dashboards by team, product, and environment
  • Negotiate enterprise discount programmes if portfolio-level consolidation is in scope
  • Produce the board-ready cloud cost reduction report: before/after spend comparison, savings breakdown by category, ongoing optimisation roadmap, FinOps governance structure

TechLevity runs cloud cost audits for PE-backed UK scale-ups. We identify the savings in Week 1, implement the quick wins in 30 days, and leave you with a FinOps framework that keeps costs controlled without requiring dedicated headcount.

To discuss your cloud cost situation, book a cloud cost audit — no commitment, just an honest assessment of where the savings are.
