Amazon Athena pricing looks simple on paper: $5 per TB scanned, pay only for what you query, no servers to manage. Then you get the bill.
I've seen teams move their log analytics from CloudWatch to S3 + Athena specifically to save money — and end up with a larger bill because their data was unpartitioned, stored as CSV, and scattered across millions of tiny files. The Athena line item was fine. The S3 GET charges were not.
The core rate is $5.00 per TB scanned. But the Athena line item in Cost Explorer is not the complete picture. S3 GET requests, Glue Data Catalog calls, and Lambda invocations for federated queries all add to your actual spend — and none of them show up under the Athena service.
There is also no free tier for Athena SQL queries. The 10 MB minimum means even a trivially small query costs $0.00005. That's not a problem on its own, but it's worth knowing before you wire up a dashboard that polls Athena every 30 seconds.
By the end of this article you'll understand every billing dimension, know which optimization to tackle first, and have the math to evaluate whether Capacity Reservations make sense for your workload. Let's start with what Athena actually charges for.
What Does Athena Actually Cost?
Athena has three distinct billing models, and the choice happens at the workgroup level — not the account level. This means you can run on-demand and reserved capacity side by side within the same account, with different workgroups assigned to each. Most teams start on on-demand SQL, which is the simplest model and requires no upfront commitment.
Here's a quick overview before diving into each model:
| Feature | Pricing Model | Rate |
|---|---|---|
| SQL queries (on-demand) | Per TB scanned | $5.00 / TB |
| SQL queries — minimum charge | Per query | 10 MB (rounds up to nearest MB) |
| SQL Capacity Reservations | Per DPU-hour reserved | $0.30 / DPU-hour |
| Capacity Reservations minimum interval | Per billing period | 1 minute |
| Capacity Reservations minimum size | Per reservation | 4 DPUs |
| Apache Spark (driver + workers) | Per DPU-hour | $0.35 / DPU-hour |
| DDL queries | No charge | Free |
| Query results storage (S3) | Standard S3 rates | ~$0.023 / GB-month (us-east-1) |
| AWS Glue Data Catalog (first million) | No charge | Free tier |
| AWS Lambda (federated connectors) | Standard Lambda rates | Free tier applies |
| Setup fee | None | $0 |
| Minimum monthly fee | None | $0 |
No setup fees, no minimum monthly charge, and no idle compute charges for on-demand SQL. You pay for queries that run, not for the time in between. For the latest rates, check the current Athena pricing page. Now let's look at each model in detail.
SQL On-Demand: $5 Per TB Scanned
The on-demand model bills per query. On Athena engine version 3, the engine measures how much data it reads from S3 to execute the query, rounds up to the nearest MB, applies a 10 MB minimum, and charges $5.00 per TB.
A few important nuances that trip people up:
- DDL queries are free. CREATE TABLE, DROP TABLE, ALTER TABLE, SHOW, DESCRIBE — all free, regardless of how much metadata they touch.
- Failed queries still cost money. If a query fails partway through, you pay for data scanned up to the point of failure. Same applies to canceled queries. It's unclear from the documentation whether the 10 MB minimum applies to partially executed canceled queries, so treat canceled queries as if they will hit the minimum.
- Query results in S3 are not counted as scanned data. Athena writes results to an S3 bucket you specify; that storage is billed at S3 rates, not Athena rates.
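These rounding rules are easy to encode. Here's a minimal sketch (the helper name `query_cost` is mine, not an AWS API) that mirrors the billing math: round scanned bytes up to the nearest MB, floor at the 10 MB minimum, and bill $5.00 per TB:

```python
import math

MB = 1024 ** 2
TB = 1024 ** 4
RATE_PER_TB = 5.00   # on-demand SQL rate, USD
MINIMUM_MB = 10      # per-query minimum

def query_cost(scanned_bytes: int) -> float:
    """Approximate on-demand charge for one Athena SQL query:
    round up to the nearest MB, apply the 10 MB floor, bill per TB."""
    billable_mb = max(math.ceil(scanned_bytes / MB), MINIMUM_MB)
    return billable_mb * MB / TB * RATE_PER_TB

# A 1 KB query still bills the 10 MB minimum: about $0.0000477.
tiny = query_cost(1024)

# A dashboard polling every 30 seconds issues 2,880 queries/day;
# at the minimum charge that is roughly $4.12/month.
monthly = tiny * 2880 * 30
```

Roughly $4/month in minimum charges is harmless on its own, but the same polling pattern at higher frequency, or multiplied across many dashboards, adds up.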
The official AWS pricing examples show the impact clearly. Take a 3 TB table with 4 columns, querying on 1 column:
- Uncompressed CSV: Athena reads the full 3 TB. Cost: $15.00
- GZIP-compressed CSV: File compresses to ~1 TB. Athena still reads the full file. Cost: $5.00
- GZIP + Parquet: Parquet is columnar, so Athena reads only the 1 queried column — 0.25 TB total. Cost: $1.25
Single-query examples like these consistently understate the financial impact. Here's what those same formats look like for a team running 100 queries per day on that same 3 TB table:
| Data Format | Single Query Cost | Monthly Cost (100 queries/day, 30 days) | Savings vs. CSV |
|---|---|---|---|
| Uncompressed CSV | $15.00 | $45,000 | - |
| GZIP-compressed CSV | $5.00 | $15,000 | 67% |
| Parquet (columnar only) | ~$3.75 | $11,250 | 75% |
| Parquet + GZIP/Snappy | $1.25 | $3,750 | 92% |
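You can reproduce the table's math directly. A quick sketch, assuming 100 queries/day against the 3 TB, 4-column table from the AWS example (the compression ratios are the example's, not guarantees):

```python
RATE = 5.00  # $/TB scanned

# Effective TB read per query for each storage choice (from the AWS example):
scan_tb = {
    "csv": 3.0,                 # full table, uncompressed
    "csv_gzip": 1.0,            # full file, but ~3:1 compression
    "parquet": 3.0 / 4,         # 1 of 4 columns, uncompressed
    "parquet_snappy": 1.0 / 4,  # 1 of 4 columns of the compressed size
}

# 100 queries/day over a 30-day month:
monthly = {fmt: tb * RATE * 100 * 30 for fmt, tb in scan_tb.items()}
# → csv: $45,000  csv_gzip: $15,000  parquet: $11,250  parquet_snappy: $3,750
```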
That's $45,000/month versus $3,750/month for the same queries on the same data — just stored differently. On-demand is the right starting point for most teams. But if your workload is predictable and concurrent, Capacity Reservations can change the economics entirely.
Capacity Reservations: $0.30 Per DPU-Hour
Capacity Reservations swap the per-scan billing model for reserved compute capacity. You provision a number of DPUs (Data Processing Units), and queries running on those DPUs pay no per-TB charge. Instead, you pay $0.30 per DPU-hour for the capacity you hold.
One DPU provides approximately 4 vCPUs and 16 GB of RAM. DML queries (SELECT, INSERT, etc.) consume between 4 and 124 DPUs dynamically based on complexity — Athena allocates them automatically. DDL queries always consume 4 DPUs.
As of February 10, 2026, Capacity Reservations now support 1-minute billing intervals and a 4 DPU minimum per reservation. Previously, the higher minimums made this model impractical for short workloads. With 1-minute granularity, AWS claims up to 95% cost savings for short-duration workloads compared to on-demand.
A few limits to be aware of: 1,000 DPUs maximum per account per region, 100 reservations maximum, and up to 20 workgroups per reservation. If your reserved capacity is fully utilized and more queries arrive, those queries queue for up to 10 hours before timing out.
Capacity Reservations are not available in Israel (Tel Aviv), Middle East (UAE), Middle East (Bahrain), or Asia Pacific (New Zealand).
Apache Spark: $0.35 Per DPU-Hour
Athena for Apache Spark is a separate engine with its own billing model. You pay $0.35 per DPU-hour for both the driver and worker DPUs, with per-second granularity.
The billing structure has two parts:
- Driver: 1 DPU per session, charged for the full session duration. If you open a notebook and leave it idle for 2 hours, that's $0.70 in driver charges.
- Workers: Auto-scaled per calculation, charged only for the duration of each calculation.
The official example: a 1-hour notebook session running 6 calculations, each using 20 worker DPUs for 1 minute, produces 3.0 total DPU-hours at $0.35 = $1.05. That's for actual computation. An idle session costs only the driver DPU — $0.35/hour.
As of November 2025, Athena Spark is available inside Amazon SageMaker Unified Studio notebooks, running on Spark 3.5.6 with support for Apache Iceberg and Delta Lake.
The Costs Athena Doesn't Show You
Here's what surprises most teams: the Athena line item in Cost Explorer is not the complete cost of running Athena. Three other AWS services contribute to your real spend, and they're billed separately. Understanding S3 storage costs, GET request pricing, and lifecycle policies is worth doing before you finalize any Athena cost estimate.
| Cost Source | What Triggers It | Rate | Significant When |
|---|---|---|---|
| Athena SQL on-demand | Data scanned per query | $5.00/TB | Always |
| S3 storage (results) | Query results written to S3 | $0.023/GB-month (us-east-1) | High result volume |
| S3 GET requests | Reading data files during query | $0.40/million requests | Many small files |
| S3 LIST requests | Listing partitions/prefixes | $5.00/million requests | Tables with many partitions |
| Glue Data Catalog | Metadata reads beyond 1M free | Glue catalog rates | Large partition counts |
| Glue crawlers | Schema discovery runs | Glue DPU-hour rates | Frequent crawl schedules |
| Lambda connectors | Federated query execution | Standard Lambda rates | Federated workloads |
S3 storage costs are usually predictable. The GET charges are where teams get surprised.
S3 Storage and GET Request Charges
Every Athena query issues GET requests to S3 to read data. Those requests are billed by S3 at $0.40 per million requests — not by Athena, which is why they don't show up on the Athena service line in Cost Explorer.
The math becomes a problem at scale. Consider this scenario: you have 1 million files, each 1 MB, making a 1 TB total dataset. Athena scan cost for one query: $5.00. S3 GET cost to read those 1 million files: $0.40. That seems manageable — until you query this dataset 100 times per day. The S3 GET charges compound to $40/day, $1,200/month, independent of compression or column format.
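The GET-request arithmetic is worth making explicit. A sketch, assuming S3 Standard's $0.40 per million GET requests and at least one GET per file per query:

```python
GET_RATE = 0.40 / 1_000_000  # $ per GET request (S3 Standard)

def s3_get_cost(num_files: int, queries_per_day: int, days: int = 30) -> float:
    """Monthly S3 GET spend for queries that each touch every file."""
    return num_files * GET_RATE * queries_per_day * days

# 1 million small files queried 100x/day: $0.40/query, ~$1,200/month,
# regardless of file format or compression.
cost = s3_get_cost(1_000_000, 100)
```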
Athena's own guidance recommends a minimum file size of 128 MB. Below that threshold, file header parsing and metadata overhead dominate actual read time, and S3 throttling becomes a real risk — S3 enforces a default limit of 5,500 GET requests per second per prefix. A heavily partitioned table with thousands of small files can hit this limit and start returning throttling errors.
The fix is CTAS compaction. This merges many small files into fewer large ones in a single step:
```sql
CREATE TABLE compacted_table
WITH (
  format = 'PARQUET',
  external_location = 's3://your-bucket/compacted/'
)
AS SELECT * FROM fragmented_table;
```
After compaction, drop the old table definition and re-register the compacted location. For streaming pipelines that naturally produce small files (Kinesis Firehose, MSK to S3), set buffer intervals to at least 10 minutes and target 128 MB+ per output file.
AWS Glue Data Catalog Charges
Glue charges apply when Athena uses the Glue Data Catalog to resolve table metadata and partition locations. The first 1 million metadata objects and 1 million requests per month are free — most teams with a reasonable number of tables and partitions stay within this limit.
Beyond the free tier, Glue charges standard catalog rates. The distinction worth knowing: Glue crawlers are an entirely separate cost from catalog lookups. Crawlers are billed per DPU-hour by Glue, not by Athena, and they don't appear anywhere in Athena billing. Teams that run crawlers on a frequent schedule against large datasets can generate significant Glue charges that have nothing to do with how many Athena queries they run.
For tables with highly regular partition schemes — date-based logs are the classic example — partition projection calculates partition values from table configuration without querying the catalog at all. This keeps Glue request counts low and speeds up query planning for heavily partitioned tables.
Lambda Charges for Federated Queries
Federated queries let Athena query non-S3 data sources like Amazon DynamoDB, RDS, CloudWatch Logs, and DocumentDB. The mechanism is Lambda: Athena invokes a Lambda-based data source connector for each federated data source in the query. Those Lambda invocation charges are billed at standard Lambda rates on top of the Athena charge.
The Athena charge still applies at $5.00 per TB scanned, aggregated across all data sources. For querying DynamoDB through Athena federated connectors, factor in the DynamoDB read costs as well.
The Lambda free tier (1 million requests and 400,000 GB-seconds per month) means moderate federated query usage may generate zero Lambda charges. But high-frequency federated workloads can push past this quickly. One path around per-TB charges entirely: federated queries running on Capacity Reservations are not subject to per-TB data scanned charges — the DPU reservation replaces scan billing.
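To gauge whether a federated workload stays inside the Lambda free tier, a rough estimator helps. This is a sketch only: invocations per query, duration, and connector memory are assumptions that vary by connector and by how Athena splits the scan:

```python
FREE_REQUESTS = 1_000_000     # Lambda free tier: requests/month
FREE_GB_SECONDS = 400_000     # Lambda free tier: compute/month

def exceeds_free_tier(queries_per_day, invocations_per_query,
                      avg_seconds=2.0, memory_gb=0.5, days=30):
    """True if estimated monthly connector usage exceeds the free tier."""
    requests = queries_per_day * invocations_per_query * days
    gb_seconds = requests * avg_seconds * memory_gb
    return requests > FREE_REQUESTS or gb_seconds > FREE_GB_SECONDS

# 500 federated queries/day at ~10 connector invocations each stays free
# (150,000 requests, 150,000 GB-seconds):
moderate = exceeds_free_tier(500, 10)
# 20,000 queries/day blows past the request limit (6M requests):
heavy = exceeds_free_tier(20_000, 10)
```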
How to Reduce Your Athena Bill
The most common mistake I see is teams optimizing in the wrong order. They convert to Parquet — good. They don't add partition filters — expensive. They end up with beautiful columnar files that Athena still reads in full because no partition pruning is happening.
Order your optimizations by impact:
- Data format (up to 96% reduction) — biggest single lever, requires no application code changes
- Partitioning (workload-dependent, often 50-90% reduction for time-series data) — requires data reorganization but not schema changes
- Small file compaction (prevents compounding S3 GET costs) — one-time CTAS operation per table
- Query tuning (10-40% reduction depending on query patterns) — ongoing discipline
If you're scanning more than 1 TB per day, format conversion alone will likely show ROI within the first week.
Convert to Parquet or ORC
Parquet and ORC are columnar formats. Athena reads only the columns your query references, skipping the rest. For a 4-column table queried on 1 column, that's a 75% reduction in data read before compression even enters the picture.
Two mechanisms drive this:
- Column projection: Athena skips entire column chunks that aren't in the SELECT list.
- Predicate pushdown: Block-level min/max statistics let Athena skip data blocks that can't contain rows matching your WHERE clause — without reading those blocks.
GZIP compression on CSV reduces the file size, but Athena must still decompress and read the entire file. With Parquet + compression, you get both column skipping and smaller files — up to 96% reduction from uncompressed CSV baseline.
Convert existing data with CTAS:
```sql
CREATE TABLE your_table_parquet
WITH (
  format = 'PARQUET',
  parquet_compression = 'SNAPPY',
  external_location = 's3://your-bucket/parquet-data/'
)
AS SELECT * FROM your_table_csv;
```
ORC is worth considering for tables with complex types (structs, maps, arrays) where ORC's type-aware compression outperforms Parquet. For everything else, Parquet with Snappy or GZIP is the default recommendation.
Partition Your Data
Partitions map to S3 prefixes. A date-partitioned table stores data like s3://bucket/logs/year=2026/month=03/day=15/. When a query filters on WHERE year=2026 AND month=03 AND day=15, Athena only reads that one prefix and skips every other day's data.
The gotcha that drives the most community questions: filtering on a non-partitioned column does not prune partitions. A query like WHERE user_id = '12345' on a table partitioned by date still reads every partition unless user_id is also a partition key. This is the #1 reason teams see high bills despite having "WHERE clauses everywhere."
A few partitioning decisions that matter:
- Choose columns that are commonly used as query filters. Date fields are the canonical choice.
- Avoid high-cardinality partition keys. Partitioning by `user_id` creates millions of tiny prefixes — which circles back to the small-files problem.
- Use `EXPLAIN ANALYZE` after running a query to verify partition pruning is actually occurring. The output shows how many partitions were scanned vs. skipped.
For tables with a regular partition scheme, partition projection configures Athena to calculate partition values mathematically rather than looking them up in the Glue catalog. This eliminates Glue catalog calls per partition and speeds up query planning.
Fix the Small Files Problem
Even with Parquet and partitioning in place, small files can still drive costs up through S3 GET charges. Each file in S3 generates at least one GET request when Athena reads it, regardless of how small the file is.
Streaming pipelines are the most common source: Kinesis Firehose buffering every 60 seconds writes a new file every minute, so an hourly partition collects 60 files. A table partitioned by hour across 30 days is 720 partitions × 60 files = 43,200 files for a single delivery stream; fan in a couple dozen streams, or keep a year of history, and you're past a million files. Every query without a partition filter touches all of them.
The fix is CTAS compaction. Run once per partition (or schedule it nightly for active tables):
```sql
CREATE TABLE compacted
WITH (
  format = 'PARQUET',
  external_location = 's3://your-bucket/compacted/'
)
AS SELECT * FROM fragmented;
```
Drop the old table definition, re-register the compacted location, and you're done. For new data arriving from streaming pipelines, configure your Firehose buffer to 128 MB or 10 minutes — whichever comes first — to produce files that stay above the 128 MB threshold.
Write Better Queries
Data organization handles most of the cost. Query discipline handles the rest.
SELECT * defeats column pruning on columnar formats. Athena reads every column in every row, turning a Parquet optimization into a wasted investment. Always name the columns you need.
ORDER BY without LIMIT runs a full distributed sort across the entire dataset — every row participates. Add a LIMIT unless you genuinely need all rows sorted.
For distinct value estimation, approx_distinct(column) uses HyperLogLog with approximately 2.3% standard error. It uses far less memory than COUNT(DISTINCT column), which requires reading every value. For cardinality-heavy columns, the approximation is usually close enough.
Order GROUP BY columns from highest cardinality to lowest. This minimizes memory use on worker nodes and reduces intermediate shuffle data.
A window function without a PARTITION BY clause requires the full dataset on a single node before any computation happens. Filter rows first and partition the window where you can, rather than windowing the raw table.
Athena does not cache query results by default. Re-running identical queries on static datasets costs the same every time. Enable query result reuse at the workgroup level to avoid redundant scans on unchanged data.
Athena Capacity Reservations: When They Make Sense
Capacity Reservations make economic sense when the cost of reserving DPUs falls below the equivalent per-TB charge for your workload. The February 2026 update — 1-minute billing intervals and a 4 DPU minimum — significantly lowered the break-even point. Workloads that would have been uneconomical under the older higher-minimum model should be recalculated.
Best candidates: BI dashboard backends with predictable peak windows, high-concurrency reporting workloads running on a schedule, and applications where query prioritization matters.
Poor candidates: ad-hoc analysis, dev/test environments, and workloads that run only a few times per week where idle DPU time would exceed the cost of paying on-demand.
According to the AWS Big Data Blog post on the February 2026 capacity reservation updates, short-duration workloads can achieve up to 95% cost savings versus on-demand.
How Capacity Reservations Are Billed
You pay for the DPUs you hold in the reservation, not for the DPUs queries actually use. If you reserve 160 DPUs and a query only needs 40, you still pay for 160 during that period. Idle capacity in a reservation costs $0.30 per DPU-hour — real money if your peak window is short and you don't scale down afterward.
DML queries consume 4 to 124 DPUs automatically based on complexity. DDL queries always consume 4 DPUs. This means a reservation sized for DML concurrency requirements will handle DDL without issue.
If your reservation is fully utilized and more queries arrive, those queries queue for up to 10 hours before timing out. This queuing behavior is predictable for batch workloads, but can cause user-facing latency issues for interactive BI if your reservation is undersized.
To minimize idle DPU costs, use AWS Step Functions to orchestrate scale-down and scale-up based on CloudWatch utilization metrics. Scale down to the 4 DPU minimum during off-hours, scale up 2 minutes before peak begins.
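The payoff from scheduled scale-down is easy to estimate. A sketch, assuming a 160 DPU reservation held for an 8-hour business window versus held around the clock (the schedule itself is illustrative):

```python
DPU_HOUR_RATE = 0.30  # $ per DPU-hour reserved

def daily_cost(schedule):
    """schedule: list of (dpus, hours) segments covering one day."""
    return sum(dpus * hours * DPU_HOUR_RATE for dpus, hours in schedule)

always_on = daily_cost([(160, 24)])          # $1,152.00/day
scheduled = daily_cost([(160, 8), (4, 16)])  # $403.20/day, 4 DPU floor off-hours
savings = always_on - scheduled              # $748.80/day
```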
Break-Even Analysis: On-Demand vs. Reserved
The official AWS example: a BI application with 20 concurrent queries, each consuming 8 DPUs, requires a 160 DPU reservation.
Peak period (15 minutes): 160 DPU × $0.30/DPU-hour × 0.25 hours = $12.00
Off-peak period (45 minutes): 4 concurrent queries, 16 DPU reservation × $0.30 × 0.75 hours = $3.60
Total per hour: $15.60
Whether this beats on-demand depends on how much data those 20 queries scan. If each query scans 100 GB, 20 concurrent queries scan 2 TB per peak cycle = $10.00 on-demand just for the scans. Run 4 cycles per hour and on-demand costs $40/hour vs. $15.60 for the reservation.
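That comparison generalizes into a few lines of math. A sketch using the article's numbers:

```python
SCAN_RATE = 5.00  # $/TB on-demand
DPU_RATE = 0.30   # $/DPU-hour reserved

# Reservation: 160 DPUs for the 15-min peak, 16 DPUs for the 45-min trough.
reserved_per_hour = 160 * DPU_RATE * 0.25 + 16 * DPU_RATE * 0.75  # $15.60

# On-demand: 20 queries x 100 GB per peak cycle, 4 cycles per hour.
tb_per_cycle = 20 * 100 / 1000                       # 2 TB scanned per cycle
on_demand_per_hour = tb_per_cycle * SCAN_RATE * 4    # $40.00
```

Swap in your own concurrency, DPU sizing, and per-query scan volume; the reservation wins whenever `reserved_per_hour` stays below `on_demand_per_hour`.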
The new 1-minute billing intervals mean you're no longer losing an hour of reservation cost when a peak window is only 15 minutes long. Under the old model, a 15-minute peak window consumed a full hour of billing. Now it consumes 15 minutes.
Rule of thumb: if your peak concurrency regularly exceeds 10 simultaneous queries on a predictable schedule, run the numbers with the Athena pricing calculator. Reservations are likely cheaper, especially now.
Athena for Apache Spark Pricing
Spark in Athena is a different engine with different use cases. The $0.35/DPU-hour rate applies to both the driver and all worker DPUs, billed per second.
The driver runs for the full session duration. An engineer who opens a notebook, writes some exploratory code over 90 minutes, and then closes it pays $0.35 × 1.5 hours = $0.525 in driver charges — even if the actual computation took 10 minutes. This is the most important cost driver to understand for interactive notebook usage. Teams with idle notebook habits should consider setting session timeouts.
Workers auto-scale per calculation. Each time you run a cell, Athena allocates workers for that specific calculation and charges for that duration only. The official example — a 1-hour session, 6 calculations, 20 workers each for 1 minute — works out to:
- Driver: 1 DPU × 1 hour = 1.0 DPU-hours
- Workers: 6 calculations × 20 DPUs × (1/60 hour) = 2.0 DPU-hours
- Total: 3.0 DPU-hours × $0.35 = $1.05
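The same arithmetic, as a sketch you can adapt to your own session patterns:

```python
SPARK_DPU_RATE = 0.35  # $/DPU-hour, driver and workers alike

def spark_session_cost(session_hours, calculations, workers_per_calc,
                       minutes_per_calc):
    """Driver billed for the whole session; workers only per calculation."""
    driver_dpu_hours = 1 * session_hours
    worker_dpu_hours = calculations * workers_per_calc * minutes_per_calc / 60
    return (driver_dpu_hours + worker_dpu_hours) * SPARK_DPU_RATE

# The official example: 1-hour session, 6 calculations, 20 workers for
# 1 minute each = 3.0 DPU-hours, $1.05.
cost = spark_session_cost(1, 6, 20, 1)
```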
The 160 DPU total concurrency limit per account per region is not adjustable. Per session, the limit is 60 DPUs. For organizations running multiple data scientists simultaneously, this constraint matters.
Since November 2025, Athena Spark is available in Amazon SageMaker Unified Studio notebooks, running on Spark 3.5.6 with Apache Iceberg and Delta Lake support. Athena's Apache Iceberg support appears to be billed under the standard $5/TB model, but AWS hasn't documented a separate pricing tier for ACID operations specifically. Starting with Spark 3.5, cost allocation tags can be specified at session start and will appear in Cost Explorer — useful for attributing Spark costs to individual teams or projects.
Best use cases: Python/PySpark ETL pipelines, ML preprocessing, and interactive data exploration where you want Spark semantics without managing an EMR cluster.
Controlling Costs with Workgroups
Workgroups are Athena's primary cost control mechanism. Every query runs in a workgroup, and workgroups carry two types of controls: per-query scan limits (which automatically cancel queries) and per-workgroup usage alerts (which notify but don't cancel). These are safety nets, not substitutes for efficient data organization. Set them before production workloads go live.
For monitoring, Athena publishes two key CloudWatch metrics per workgroup:
- `DataScannedInBytes` — total data read across all queries in the workgroup
- `DPUConsumed` — capacity usage for workgroups on Capacity Reservations
Cost allocation tags can be applied to workgroups, data catalogs, and capacity reservations — up to 50 tags each. Activate them in AWS Billing to get per-team or per-project cost attribution in Cost Explorer.
Before building dashboards and alarms around these metrics, it's worth reviewing CloudWatch metrics pricing: the CloudWatch billing adds a small but real overhead of its own.
Per-Query Scan Limits
A per-query scan limit sets a maximum amount of data any single query in a workgroup can read. If a query hits the limit, Athena cancels it automatically. The charge for data scanned up to the cancellation point still applies — the limit prevents further charges, not the charges already incurred.
Only one per-query limit can be set per workgroup. Minimum: 10 MB. Maximum: 7 EB.
Configure via AWS CLI:
```bash
aws athena update-work-group \
  --work-group my-workgroup \
  --configuration-updates "BytesScannedCutoffPerQuery=10737418240"
```
That sets a 10 GB limit (10,737,418,240 bytes). A reasonable starting point for teams with large tables: set the limit to 10x the expected scan size for your largest legitimate query, then tighten it over time as you understand your workload patterns. See the AWS documentation on Athena workgroup data controls for the full configuration reference.
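Computing the cutoff in bytes by hand invites off-by-a-zero mistakes; a tiny helper (the function name is mine) makes the conversion explicit:

```python
def scan_cutoff_bytes(gib: float) -> int:
    """BytesScannedCutoffPerQuery value for a limit expressed in GiB."""
    return int(gib * 1024 ** 3)

# 10 GiB → 10737418240, the value used in the CLI example above.
cutoff = scan_cutoff_bytes(10)
```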
Workgroup-Level Usage Alerts
Per-workgroup alerts fire an SNS notification when aggregate data scanned across all queries exceeds a threshold over a specified time period (hourly or daily). Unlike per-query limits, these do not cancel queries — they alert and leave the response to you.
You can configure multiple thresholds per workgroup. A common pattern: alert at 500 GB/day for awareness, alert again at 1 TB/day to trigger action.
After the SNS notification fires, you have options: invoke a Lambda to automatically disable the workgroup if the threshold is breached, or page the on-call engineer for manual review. The DataScannedInBytes CloudWatch metric underlies these alerts — you can also query it directly for custom dashboards or more granular alarms than the built-in alert configuration supports.
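Here's a minimal sketch of the disable-on-alert pattern, assuming the alert's SNS topic triggers the Lambda. The factory just makes the boto3 client injectable for testing; `update_work_group` with `State='DISABLED'` is the real Athena API call, while the workgroup name is a placeholder:

```python
def make_alert_handler(athena_client, workgroup: str):
    """Build a Lambda-style handler that disables `workgroup` when a
    workgroup usage alert arrives via SNS."""
    def handler(event, context=None):
        handled = 0
        for record in event.get("Records", []):
            # The alert details arrive as the SNS message body.
            message = record["Sns"]["Message"]
            # Reject new queries in this workgroup; in-flight queries finish.
            athena_client.update_work_group(WorkGroup=workgroup,
                                            State="DISABLED")
            handled += 1
        return {"workgroup": workgroup, "alerts_handled": handled}
    return handler

# In the real Lambda:
# handler = make_alert_handler(boto3.client("athena"), "analytics")
```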
This distinction matters: per-query limits cancel. Workgroup alerts notify. Teams that set only an alert and expect it to stop runaway queries will be surprised when it doesn't.
Athena vs. Redshift vs. BigQuery
The right service depends entirely on query frequency and concurrency. No single service wins across all workloads.
| Dimension | Athena | Redshift Serverless | BigQuery |
|---|---|---|---|
| Base pricing model | $5.00/TB scanned | Per RPU-hour (base capacity fee) | $5.00/TB scanned (on-demand) |
| Idle cost | $0 (on-demand) | Base capacity applies | $0 (on-demand) |
| Best fit | Intermittent, ad-hoc | Sustained, high-concurrency | Intermittent or scalable capacity |
| Infrastructure management | None | None | None |
| Data lake integration | Native (S3) | External tables possible | Native (GCS, also supports S3) |
| Concurrency model | Per-TB or DPU reservation | RPU auto-scaling | Slot-based or on-demand |
Athena and BigQuery have identical base scan rates on-demand: $5.00 per TB. The economics diverge at scale. BigQuery sells reserved capacity through its Editions model with autoscaling slot pricing; Athena has Capacity Reservations at $0.30/DPU-hour. Both favor similar workloads, just within their respective ecosystems.
Redshift Serverless makes sense when query loads are sustained and concurrent. A team running 8+ hours of queries per day against the same data set will usually find Redshift's per-RPU-hour model cheaper than Athena at $5/TB, because the fixed capacity cost is amortized across more queries. Athena wins for workloads that are bursty, infrequent, or span many datasets.
Snowflake's credit model is harder to compare directly but typically favors warehouse-scale analytics and becomes expensive for infrequent or low-volume queries where credit minimums apply.
The rough decision framework: if your team runs queries less than 8 hours per day against evolving datasets on S3, Athena on-demand is hard to beat. Once queries consistently run 12+ hours per day against the same data, a provisioned warehouse often comes out cheaper once you account for the per-scan cost compounding.
Athena's structural advantage is real: no infrastructure to manage, zero idle costs on-demand, and it queries data that's already in your S3 data lake without requiring an import step.
Note: Redshift and BigQuery pricing figures should be verified against current pricing pages before making budget decisions. Athena's on-demand rate can also vary by region; use the AWS Pricing Calculator to confirm rates for your target region.
Frequently Asked Questions
Does Amazon Athena have a free tier?
How much does Amazon Athena cost per month?
Does Athena charge for failed queries?
Can Athena query data in S3 Glacier?
What are the Athena query concurrency limits?
How do I find which queries are driving my Athena costs?
Is Athena Capacity Reservations available in all regions?
What do Lambda charges look like for Athena federated queries?
The Bottom Line on Athena Costs
Amazon Athena pricing is genuinely simple for on-demand SQL: $5.00 per TB scanned. But the bill that lands at the end of the month reflects more than that one rate.
Key takeaways:
- The Athena line item in Cost Explorer is not the complete cost. S3 GET requests, Glue crawler charges, and Lambda invocations for federated queries all contribute — often invisibly.
- Converting to Parquet with compression is the single highest-ROI optimization available: up to 96% scan reduction, no application code changes required, one CTAS query per table.
- Capacity Reservations became significantly more practical in February 2026 with 1-minute intervals and a 4 DPU minimum. If you evaluated them under the older model and passed, run the numbers again.
- Workgroup per-query scan limits are the safety net that prevents a single ad-hoc query from generating an outsized bill. Set them before production workloads go live, not after the first billing surprise.
- Athena wins for infrequent and ad-hoc queries against S3. Evaluate Redshift Serverless or BigQuery once sustained daily query hours consistently exceed 8-12 hours against the same data.
The specific next action: pull the last 7 days of DataScannedInBytes from CloudWatch, identify the top 3 workgroups by scan volume, and check whether those datasets are in Parquet with partition filters applied. That single check will tell you where the money is going and which optimization to tackle first. To estimate your monthly bill, try the Athena pricing calculator. For a deeper look at the S3 costs that affect your total Athena spend, see the Amazon S3 pricing breakdown.
If you want to embed cost awareness earlier (before the query runs rather than after the bill arrives), CloudBurn can help with that.
CloudBurn
Catch Expensive AWS Patterns in Code Review
CloudBurn scans your Terraform and CloudFormation for cost anti-patterns directly in CI. Same rules also run against your live AWS account to find what's already wasting money. Open source, installs in seconds.