Building CloudBurn: What Worked, What's Still Unsolved, and What's Next

Lessons from building a tool to shift FinOps left into code review. What succeeded, what challenges remain, and how you can help shape what comes next.


Most teams discover AWS cost problems in production—after the infrastructure is already running and the money is being spent. By then, fixing it means refactoring live systems, which is expensive and risky.

CloudBurn changes this. It's a GitHub App that automatically calculates AWS infrastructure costs and posts them as comments on pull requests. Developers see cost impact during code review, before anything deploys to production.

I built CloudBurn to shift FinOps left—catch expensive decisions when they're easy to fix, not when they're expensive to change.

Here's what worked, what didn't, and what I'm working on next.

What Worked

1. Progressive Enhancement Over Perfection

I initially wanted to track everything: EC2, RDS, Lambda invocations, S3 storage, data transfer, CloudWatch logs: the complete AWS bill reconstructed from infrastructure code.

That would have taken 6 months to build and would have required estimating usage patterns (how many Lambda invocations? how much S3 data?). Get those estimates wrong, and developers stop trusting the numbers.

What I did instead: Focus on baseline provisioned costs only: the 38 CloudFormation resource types with fixed hourly rates.

EC2 instances, RDS databases, ElastiCache clusters, EKS clusters. Resources that cost money whether you use them or not.

This covers about 80% of most AWS bills and requires zero estimation. A t3.large costs $61.30/month in us-east-1. That's deterministic.
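Because baseline costs are just a fixed hourly rate times hours in a month, the calculation needs no usage estimation at all. Here is a minimal sketch of that idea; the rate table and function names are illustrative, not CloudBurn's internals, and real rates would come from the AWS Pricing API:

```python
# Illustrative sketch: baseline cost for a fixed-rate resource is just
# hourly rate x hours in a month. No usage estimation involved.
HOURS_PER_MONTH = 730  # AWS's usual monthly-hours convention

# Hypothetical rate table; real numbers come from the AWS Pricing API.
ON_DEMAND_HOURLY = {
    ("t3.large", "us-east-1"): 0.0832,
    ("m5.xlarge", "us-east-1"): 0.192,
}

def monthly_baseline_cost(instance_type: str, region: str) -> float:
    """Deterministic monthly cost for a provisioned resource."""
    return ON_DEMAND_HOURLY[(instance_type, region)] * HOURS_PER_MONTH

print(round(monthly_baseline_cost("t3.large", "us-east-1"), 2))
```

The point is that every input is known at code-review time, so the output is the same every time.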

The result: Launched in 3 weeks instead of 6 months. Developers trusted the accuracy immediately, and I could iterate based on real feedback instead of theoretical requirements.

The lesson: Ship the high-value subset first. You can always expand later. Better to be precisely right about 80% than approximately right about 100%.

2. Meet Developers Where They Are

I considered building:

  • A dashboard showing cost trends
  • A Slack bot for cost notifications
  • A CLI tool for local cost checks
  • An IDE extension

All of these would have been useful. None of them would have gotten used.

What I did instead: GitHub PR comments. That's it.

No new tool to learn. No dashboard to remember to check. No CLI to install. No extra step in the workflow.

Infrastructure changes trigger a webhook. CloudBurn posts a comment. Developers see it automatically during code review.
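That webhook-to-comment flow can be sketched as a single handler. This is a simplified, hypothetical shape, not CloudBurn's real code: in particular, the `changed_files` key is an illustration (real GitHub `pull_request` payloads don't inline the file list; you fetch it from the API):

```python
# Hedged sketch of the webhook -> PR comment flow. Names and payload
# shape are illustrative, not CloudBurn's actual handlers.
def handle_pull_request_event(payload: dict):
    """Given a simplified pull_request webhook payload, build a comment or skip."""
    if payload.get("action") not in {"opened", "synchronize"}:
        return None  # only react to new or updated PRs

    changed = payload.get("changed_files", [])
    # Only infrastructure-as-code changes warrant a cost comment.
    iac_files = [f for f in changed if f.endswith((".tf", ".yaml", ".ts"))]
    if not iac_files:
        return None

    # The real app would diff resources and price them here; this just
    # formats a placeholder comment body.
    return f"CloudBurn: {len(iac_files)} infra file(s) changed; estimating cost impact"

print(handle_pull_request_event(
    {"action": "opened", "changed_files": ["stacks/db.ts", "README.md"]}
))
```

Everything happens inside the review the developer was already doing; there is no separate surface to check.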

The result: 100% adoption from day one. Because there was nothing to adopt.

The lesson: The best developer tool is the one that requires zero behavior change. Augment existing workflows instead of creating new ones.

3. Resilience Over Reliability

Early versions of CloudBurn would fail completely if:

  • AWS Pricing API returned unexpected data
  • Instance type wasn't found
  • Property extraction from source code failed
  • Network timeout occurred

Every failure meant no cost information for that PR. Developers would shrug and merge anyway.

What I built instead: Progressive degradation at every layer.

  • Can't extract the instance type from source code? Show the resource type without pricing.
  • Can't find exact pricing? Show an approximate range or similar instances.
  • AWS Pricing API times out? Fall back to cached historical prices.
  • Everything fails? Show: "CloudBurn couldn't determine cost, review manually."
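That degradation ladder is easy to express in code. This is an assumed shape, not CloudBurn's real internals; the cache contents and messages are illustrative:

```python
# Sketch of the degradation ladder: each failure falls through to a less
# precise but still useful message. Names are assumed, not real internals.
PRICE_CACHE = {"t3.large": 60.74}  # hypothetical cached historical prices

def cost_comment(resource_type, instance_type=None, live_price=None):
    if instance_type is None:
        # Layer 1: extraction failed -> show the resource type, no pricing.
        return f"{resource_type}: instance type unknown, cost not shown"
    # Layer 2: prefer a live price, fall back to the cache.
    price = live_price if live_price is not None else PRICE_CACHE.get(instance_type)
    if price is None:
        # Final layer: everything failed -> explicit "review manually" message.
        return f"{instance_type}: CloudBurn couldn't determine cost, review manually"
    return f"{instance_type}: ~${price:.2f}/month"

print(cost_comment("AWS::EC2::Instance", "t3.large"))  # cache fallback path
print(cost_comment("AWS::EC2::Instance"))              # extraction-failed path
```

Every path returns a comment; the only thing that varies is how precise it is.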

The result: CloudBurn always provides some information, even if incomplete.

The lesson: For feedback tools, partial information beats no information. Developers will work with "approximately right" but ignore "frequently absent."

4. The 80/20 Rule Applies to Everything

Resources: 38 baseline-cost types (about 2.6% of CloudFormation's resource types) account for ~80% of AWS bills.

Pricing queries: 20% of services (EC2, RDS, ElastiCache) account for 80% of queries.

Code patterns: 20% of CDK constructs (Instance, DatabaseInstance, Cluster) appear in 80% of PRs.

Architecture decisions: 20% of instance type choices (t3.medium, t3.large, t3.xlarge) cover 80% of use cases.

Optimizing for the common 20% covered the vast majority of real-world usage. The remaining 80% of edge cases could be handled with "good enough" fallbacks.

What I did: Built robust handling for the top 20%, acceptable handling for the rest.

The lesson: Identify your critical path and make it bulletproof. Everything else can be "good enough."

What Surprised Me

1. Developers Wanted This More Than I Expected

I worried that CloudBurn would be seen as:

  • Another tool adding noise to PRs
  • FinOps surveillance
  • Gatekeeping disguised as "visibility"

The reality: Developers loved it immediately.

They didn't see it as restriction. They saw it as finally having the information they needed to make good decisions.

Multiple people said variations of: "I always wondered how much things cost but was too lazy to look it up."

The insight: Developers want to do the right thing. They just need the information. When you remove friction to doing the right thing, they'll do it naturally.

2. Cost Discussions Improved Architecture

I expected: "This is too expensive, use something smaller."

What actually happened: "Why is this architecture so expensive?" led to "Do we need this architecture at all?"

Example conversation from a PR:

  • CloudBurn: "This change adds 3 RDS instances, $320/month"
  • Reviewer: "Why three databases?"
  • Developer: "One for each microservice"
  • Discussion: "Can't they share a database with separate schemas?"
  • Result: Simpler architecture, one database, $110/month

Cost became a catalyst for simplification. Expensive architectures are often unnecessarily complex. Optimizing for cost often means optimizing for clarity.

The insight: Cost awareness doesn't just reduce spending. It improves design by forcing justification of complexity.

3. The Compound Interest of Prevention Is Massive

Before CloudBurn:

  • Make decisions without cost visibility
  • Discover problems in production
  • Fix them reactively (one-time save)

After CloudBurn:

  • Make decisions with cost visibility
  • Prevent problems during code review
  • Savings compound forever

Every prevented cost mistake saves money in perpetuity.

One RDS instance downsized from db.m5.4xlarge to db.t3.large during code review saves $973/month, or $11,676/year. That's not a one-time optimization. That's permanent.

Multiply this across dozens of decisions per month. Small preventions compound into massive savings over time.
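A back-of-envelope illustration makes the compounding concrete. Assume (purely for illustration) one prevented mistake per month, each worth the $973/month from the example above. Every prevention keeps saving in every subsequent month:

```python
# Back-of-envelope sketch: each prevention keeps saving every month after
# it lands, so year-one savings grow quadratically, not linearly.
monthly_savings_per_prevention = 973   # the RDS downsize from the example
preventions_per_month = 1              # assumption for illustration

total = 0
for month in range(1, 13):
    active_preventions = month * preventions_per_month
    total += active_preventions * monthly_savings_per_prevention

print(total)  # cumulative year-one savings
```

One catch per month at that size adds up to roughly $76k in the first year alone, because early preventions pay out in every later month.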

The insight: Prevention has infinite ROI because the savings never stop. It's the compound interest of FinOps.

4. Education Through Exposure Works

I didn't build any training materials. No documentation about AWS pricing. No workshops about cost optimization.

Yet after 3 months, developers had internalized AWS pricing basics:

  • "t3.large is about $60/month"
  • "RDS is expensive compared to DynamoDB"
  • "NAT Gateways add up if you have many VPCs"

How? They saw pricing information daily in their natural workflow. Osmosis.

It's the same principle by which developers learn API response times from APM tools, or memory patterns from profilers. Constant exposure creates intuition without explicit teaching.

The insight: The best education is ambient, not formal. Put information where people naturally encounter it repeatedly.

5. Autonomy + Information Beats Control

I never added:

  • Cost approval workflows
  • Budget enforcement
  • Required FinOps review
  • Blocked PRs for expensive changes

CloudBurn just shows information. The team decides what to do with it.

This created more cost-conscious behavior than any policy would have.

Why? Because developers owned the decisions. They weren't following rules; they were making informed tradeoffs.

Sometimes they'd see "$500/month" and say "worth it, this feature needs performance." That's fine! The point is they made a conscious decision.

The insight: Trust + information is more powerful than control + restriction. People make better decisions when empowered than when constrained.

What's Still Unsolved

Usage-Based Costs Are Hard

EC2 instances have fixed hourly costs. Easy to predict.

Lambda has per-invocation costs. How many invocations? No idea until runtime.

S3 has storage + request costs. How much data? How many requests? Depends on usage patterns.

The challenge: Can't accurately predict usage-based costs from infrastructure code alone.

Current approach: Don't show them. Only show baseline provisioned costs.

Future approach: Maybe estimate ranges based on historical patterns? "Similar Lambdas cost $20-80/month based on past usage."
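One way that range estimation might work is a simple band around historical costs of similar resources. This is a sketch of the idea only; it is a possible future approach, not shipped behavior, and the spread parameter is an assumption:

```python
# Hypothetical sketch of usage-based range estimation from historical
# bills of "similar" resources. Not shipped behavior.
import statistics

def cost_range(historical_monthly_costs, spread=1.0):
    """Return a (low, high) monthly-cost band around historical costs."""
    mid = statistics.median(historical_monthly_costs)
    dev = statistics.pstdev(historical_monthly_costs)
    low = max(0.0, mid - spread * dev)
    high = mid + spread * dev
    return round(low), round(high)

# e.g. monthly costs of similar Lambdas pulled from past billing data
print(cost_range([22, 35, 48, 61, 74]))
```

The hard part isn't the statistics; it's deciding what "similar" means and how wide a band developers will still trust.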

But getting this wrong destroys trust. Better to show nothing than wrong information.

Cost Trends Lack Context

CloudBurn shows: "This PR adds $100/month."

But what if your costs have been growing 20% month-over-month? That context matters.

Missing feature: "Your infrastructure costs grew from $500 to $2,000 over the last quarter. Here are the PRs that added the most cost."

This would connect individual decisions to overall trends. Show compound effects more clearly.

What I'm Working on Next

I'm actively working on improving CloudBurn based on what I've learned. You can see the full roadmap and track progress here:

CloudBurn Roadmap

Currently In Progress

Better property extraction: More CDK patterns, Terraform module support, Pulumi support

Historical trends: Connect PR costs to overall growth patterns and show cost evolution over time

Smarter caching: Longer TTL, cache warming, predictive refresh
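The "smarter caching" item is essentially a TTL cache that prefers serving a stale price over serving nothing, which matches the resilience principle from earlier. Here is one assumed design, not the real implementation:

```python
# Sketch of a TTL price cache that serves stale entries when a refresh
# fails. Assumed design for the "smarter caching" item, not real code.
import time

class PriceCache:
    def __init__(self, fetch, ttl_seconds=24 * 3600):
        self.fetch = fetch          # callable hitting the AWS Pricing API
        self.ttl = ttl_seconds
        self._store = {}            # key -> (price, fetched_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry and time.time() - entry[1] < self.ttl:
            return entry[0]         # fresh cache hit
        try:
            price = self.fetch(key)
            self._store[key] = (price, time.time())
            return price
        except Exception:
            # Refresh failed: a stale price beats no price.
            return entry[0] if entry else None

cache = PriceCache(fetch=lambda k: {"t3.large": 60.74}[k])
print(cache.get("t3.large"))
```

Cache warming and predictive refresh would then just be calls to `get` made ahead of demand, before any PR actually needs the price.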

What Should I Prioritize?

I'd love to hear what features would be most valuable to you. Some ideas I'm considering:

  • Multi-cloud support (Azure ARM/Bicep, GCP Deployment Manager)
  • Usage-based cost estimation for Lambda, S3, and data transfer
  • Architecture recommendations to simplify expensive designs
  • Auto-generated optimization PRs based on actual usage patterns
  • Budget integration with alerts for team spending

What would help your team most? Leave a comment below. Your feedback directly influences what gets built next.

Get Involved

CloudBurn is open source and built collaboratively. Here's how you can help shape what comes next:

Try CloudBurn

Want to see AWS costs in your pull requests?

Contribute

Want to help build CloudBurn?

  • Browse issues: Find good first issues in the GitHub repository
  • Submit PRs: Contributions are welcome—add support for new IaC tools, improve cost calculations, or enhance the developer experience
  • Report bugs: Found something broken? Open an issue

The goal is simple: shift FinOps left. Give developers cost visibility during code review and prevent expensive mistakes before they reach production.

If you're tired of optimizing AWS costs reactively, let's build something better together.

