AWS Cost Explorer Architecture

The AWS Cost Explorer API (ce) is the acquisition surface that feeds aggregated, time-series billing telemetry into the first stage of a production cost pipeline. This page covers one specific mechanism inside the broader FinOps Architecture & Billing Fundamentals pipeline: how to drive GetCostAndUsage, GetCostForecast, and the Cost Categories APIs as a deterministic, idempotent ingestion source rather than a dashboard you click through. Unlike Cost and Usage Reports (CUR), which deliver line-item Parquet to S3 on a 24–48 hour delay, the ce API exposes a managed, pre-aggregated query layer optimized for daily reconciliation, anomaly detection, and showback. Three engineering constraints shape every decision below: a hard 1,000-row response ceiling, a per-request monetary cost, and a default throttle of 5 requests per second. None of them forgives naive polling. The sections that follow lead with those constraints, implement an ingestion engine that respects them, and instrument the failure modes that bite teams who treat ce like an unmetered database.

Architecture Context & Data-Flow Position

Within the four-stage pipeline (acquisition → normalization → allocation → persistence) defined by the parent FinOps Architecture & Billing Fundamentals reference, Cost Explorer occupies the acquisition stage and a thin slice of preliminary normalization. Its job is to hand the next stage a complete, ordered, de-duplicated set of cost records — or fail loudly — never a silently truncated page.

The flow is deterministic: scheduled trigger → query construction → cursor-based pagination loop → metric extraction → dimensional canonicalization → handoff to the normalization stage. Because the API returns data that AWS has already aggregated and reconciled, it is the right source for daily syncs and executive refreshes, and the wrong source for line-item forensic work — that belongs to CUR. A mature stack runs both: cheap ce aggregates nightly for fast reconciliation, reconciled against authoritative CUR totals.

When telemetry is standardized across providers, the ce output must align with the GCP Billing Export Configuration dataset and the Azure Cost Management Setup export so that one normalized model can drive cross-cloud cost allocation strategies. Concretely, that means mapping AWS SERVICE and USAGE_TYPE onto GCP service.description / sku.description and Azure meterCategory / meterName, then normalizing currency and aggregation windows before allocation runs.
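
The field mapping described above can be sketched as a small lookup table. This is a minimal illustration, not a complete schema: the provider field names come from the paragraph above, while the normalized names (`service`, `sku`), the `normalize_row` helper, and the sample values are assumptions made for the example.

```python
from typing import Dict

# Provider field -> normalized field. Provider names are taken from the text;
# the normalized names ("service", "sku") are illustrative assumptions.
FIELD_MAP: Dict[str, Dict[str, str]] = {
    "aws": {"SERVICE": "service", "USAGE_TYPE": "sku"},
    "gcp": {"service.description": "service", "sku.description": "sku"},
    "azure": {"meterCategory": "service", "meterName": "sku"},
}


def normalize_row(provider: str, row: Dict[str, str]) -> Dict[str, str]:
    """Rename provider-specific keys to the shared model; drop unmapped keys."""
    out = {"provider": provider}
    for src, dst in FIELD_MAP[provider].items():
        if src in row:
            out[dst] = row[src]
    return out


if __name__ == "__main__":
    # Sample values are placeholders, not real billing data.
    print(normalize_row("aws", {"SERVICE": "Amazon Simple Storage Service",
                                "USAGE_TYPE": "TimedStorage-ByteHrs"}))
    print(normalize_row("azure", {"meterCategory": "Storage",
                                  "meterName": "example-meter"}))
```

Keeping the mapping as data rather than branching code means adding a provider is a one-line change, and currency or aggregation-window normalization can run afterward on a single shape.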

API Surface Map

| Operation | Purpose | Key constraints |
| --- | --- | --- |
| `GetCostAndUsage` | Primary aggregated spend/usage retrieval | 1,000-row cap, `NextPageToken` pagination, per-request cost |
| `GetCostAndUsageWithResources` | Resource-level granularity | Last 14 days only, higher cost, opt-in |
| `GetCostForecast` | Forward spend projection | Needs at least a few weeks of history; returns prediction intervals |
| `GetAnomalies` / `GetAnomalyMonitors` | Cost Anomaly Detection feed | Eventual consistency on recent windows |
| `ListCostCategories` / `UpdateCostCategoryDefinition` | Business-context dimensioning | Definition changes apply forward, not retroactively |

Core Implementation Patterns

1. IAM & Least Privilege

Attach only the read actions the pipeline needs — ce:GetCostAndUsage, ce:GetCostForecast, ce:GetAnomalies, and ce:ListCostCategories — to the execution role. The ce API is global and account-scoped, so resource ARNs are coarse; compensate with explicit deny guardrails and condition keys rather than relying on resource-level restriction. For an Organizations estate, deploy the ingestion role in the management (payer) account, because only the payer can see linked-account spend, and use sts:AssumeRole with an external ID for any cross-account orchestration. Centralize credential rotation there to keep one audit trail.

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "ce:GetCostAndUsage",
      "ce:GetCostForecast",
      "ce:GetAnomalies",
      "ce:ListCostCategories"
    ],
    "Resource": "*"
  }]
}
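
One way to keep the role from drifting beyond those four read actions is a small check that runs in CI against the policy document. The sketch below is an assumption about how such a guard might look; `ALLOWED_ACTIONS` mirrors the policy above and `extra_actions` is a hypothetical helper, not an AWS API.

```python
import json
from typing import List, Set

# Mirrors the read-only policy above.
ALLOWED_ACTIONS: Set[str] = {
    "ce:GetCostAndUsage",
    "ce:GetCostForecast",
    "ce:GetAnomalies",
    "ce:ListCostCategories",
}


def extra_actions(policy_json: str) -> List[str]:
    """Return Allow-ed actions that fall outside the read-only allowlist."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement object is valid IAM
        statements = [statements]
    found: Set[str] = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):  # Action may be a string or a list
            actions = [actions]
        found.update(a for a in actions if a not in ALLOWED_ACTIONS)
    return sorted(found)
```

An empty list means the policy grants nothing beyond the allowlist; a wildcard such as `ce:*` is reported as an extra action because it is not an exact match.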

2. Query Construction & Metric Selection

Define TimePeriod with ISO-8601, half-open boundaries (Start inclusive, End exclusive), set Granularity to DAILY or MONTHLY, and choose Metrics deliberately. Use UnblendedCost for raw, undiscounted rate tracking; AmortizedCost for Reserved Instance and Savings Plans spend spread across the commitment term; and NetUnblendedCost for post-discount, post-credit cash actuals. Apply GroupBy on a DIMENSION (SERVICE, LINKED_ACCOUNT, USAGE_TYPE, REGION), a TAG key, or a COST_CATEGORY. The API allows at most two GroupBy entries per request — deeper breakdowns require either multiple passes or post-ingestion fan-out.

query = {
    "TimePeriod": {"Start": "2026-06-01", "End": "2026-06-27"},  # End is exclusive
    "Granularity": "DAILY",
    "Metrics": ["AmortizedCost"],
    "GroupBy": [
        {"Type": "DIMENSION", "Key": "LINKED_ACCOUNT"},
        {"Type": "DIMENSION", "Key": "SERVICE"},
    ],
}
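
Hard-coded dates are fine for illustration, but a scheduled job usually derives the window from the run date. The following is a minimal sketch of that idea; `build_query` and its parameters are hypothetical names, and it only handles DIMENSION groupings.

```python
from datetime import date, timedelta
from typing import Dict, List, Optional


def build_query(today: date, days: int, metric: str = "AmortizedCost",
                group_keys: Optional[List[str]] = None) -> Dict:
    """Build a DAILY query covering the `days` full days before `today`.

    The window is half-open: Start is inclusive, End (today) is exclusive.
    """
    group_keys = group_keys or []
    if len(group_keys) > 2:
        raise ValueError("at most two GroupBy entries per request")
    start = today - timedelta(days=days)
    query: Dict = {
        "TimePeriod": {"Start": start.isoformat(), "End": today.isoformat()},
        "Granularity": "DAILY",
        "Metrics": [metric],
    }
    if group_keys:
        query["GroupBy"] = [{"Type": "DIMENSION", "Key": k} for k in group_keys]
    return query


if __name__ == "__main__":
    print(build_query(date(2026, 6, 27), 26, group_keys=["LINKED_ACCOUNT", "SERVICE"]))
```

Because End is exclusive, passing the run date as End never includes the partial current day.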

3. Pagination & Rate-Limit Handling

The ce API returns a NextPageToken whenever a result set exceeds the 1,000-row per-response ceiling, and it throttles at roughly 5 requests per second with ThrottlingException. Production clients must loop on the token, persist pagination state so a restart resumes instead of re-fetching, and back off exponentially with jitter on throttling. Boto3’s adaptive retry mode handles transient throttling for you, but high-cardinality groupings (for example RESOURCE_ID across a large estate) still need an explicit, durable cursor.
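
Persisting pagination state could look like the sketch below. It is written against an injected `fetch_page` callable rather than a real client, and it assumes that a saved token is still accepted after a restart, which this page does not confirm; if the service rejects a stale token, the safe fallback is to delete the checkpoint and re-run the window. `paginate_with_checkpoint` and the checkpoint file layout are hypothetical.

```python
import json
import os
import tempfile
from typing import Callable, Dict, Iterator, Optional


def paginate_with_checkpoint(
    fetch_page: Callable[[Optional[str]], Dict],
    checkpoint_path: str,
) -> Iterator[Dict]:
    """Yield pages, saving NextPageToken once the caller has handled each page."""
    token: Optional[str] = None
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            token = json.load(fh).get("next_token")
    while True:
        page = fetch_page(token)
        yield page  # control returns here only after the caller asks for more
        token = page.get("NextPageToken")
        if not token:
            if os.path.exists(checkpoint_path):
                os.remove(checkpoint_path)  # run complete; clear the cursor
            return
        # Write-then-rename so a crash never leaves a half-written checkpoint.
        tmp = checkpoint_path + ".tmp"
        with open(tmp, "w") as fh:
            json.dump({"next_token": token}, fh)
        os.replace(tmp, checkpoint_path)


if __name__ == "__main__":
    # Fake three-page source standing in for get_cost_and_usage.
    fake_pages = {None: {"NextPageToken": "a"}, "a": {"NextPageToken": "b"}, "b": {}}
    cursor_file = os.path.join(tempfile.mkdtemp(), "cursor.json")
    for pg in paginate_with_checkpoint(fake_pages.__getitem__, cursor_file):
        print(pg)
```

The checkpoint always points at the page currently being handled, so a crash mid-page re-fetches that page on restart. That gives at-least-once delivery, which is why downstream writes still need to be idempotent.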

4. Metric & Dimension Mapping

The raw Keys/Metrics shape returned by ce is not a storage schema. Canonicalize each record into a unified dimensional model at the edge of acquisition: split multi-key groups, coerce Amount to a typed decimal, normalize the currency Unit, and stamp the amortization basis (which metric produced the value). This is also where AWS native tags are reconciled against centralized Cost Categories so allocation never depends on fragile tag inheritance.
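
The canonicalization steps above can be sketched as a single function. This is an illustration under stated assumptions: the output field names (`dimensions`, `amount`, `currency`, `basis`) and the `tag:` prefix are hypothetical, and the handling of TAG groups assumes keys arrive in the form `<tag key>$<value>` with an empty value for untagged spend.

```python
from decimal import Decimal
from typing import Dict, List


def canonicalize(group_by: List[Dict[str, str]], keys: List[str],
                 metrics: Dict[str, Dict[str, str]], metric: str) -> Dict:
    """Turn one Groups[] entry into a flat, typed record."""
    dims: Dict[str, str] = {}
    for spec, raw in zip(group_by, keys):
        if spec["Type"] == "TAG":
            # Assumption: TAG keys look like "team$platform"; "team$" = untagged.
            _, _, value = raw.partition("$")
            dims[f"tag:{spec['Key']}"] = value or "(untagged)"
        else:
            dims[spec["Key"]] = raw
    cell = metrics[metric]
    return {
        "dimensions": dims,
        "amount": Decimal(cell["Amount"]),   # typed decimal, not float
        "currency": cell["Unit"].upper(),    # normalize the currency code
        "basis": metric,                     # which metric produced the value
    }
```

Stamping `basis` on every record makes it possible to reject a report that accidentally mixes metrics further down the pipeline.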

Production-Grade Python Ingestion Engine

The module below is self-contained and runnable. It composes a retry decorator with structured logging, a typed dataclass record model, NextPageToken pagination, metric mapping, and a __main__ guard. The cursor is held in memory for the duration of a run; persisting it across restarts is left to the caller. It is designed to run as a scheduled Lambda or a containerized cron job.

import functools
import json
import logging
import random
import time
from dataclasses import dataclass, asdict
from decimal import Decimal
from typing import Callable, Dict, Iterable, List, Optional

import boto3
import botocore.exceptions
from botocore.config import Config

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
)
logger = logging.getLogger("ce_ingestor")

# Metrics that carry a monetary Amount + currency Unit in the ce response.
MONETARY_METRICS = {"UnblendedCost", "AmortizedCost", "NetUnblendedCost", "BlendedCost"}


def with_retries(max_attempts: int = 6, base_delay: float = 0.5) -> Callable:
    """Retry on throttling/transient errors with exponential backoff + jitter.

    Boto3 adaptive mode covers most throttling, but a high-cardinality paging
    loop can still exhaust it; this decorator is the durable outer guard.
    """
    transient = {"ThrottlingException", "TooManyRequestsException",
                 "RequestLimitExceeded", "ServiceUnavailable"}

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    return fn(*args, **kwargs)
                except botocore.exceptions.ClientError as exc:
                    code = exc.response.get("Error", {}).get("Code", "")
                    attempt += 1
                    # Re-raise non-transient errors and exhausted retries.
                    if code not in transient or attempt >= max_attempts:
                        raise
                    sleep = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.4)
                    logger.warning("Throttled (%s); retry %d/%d in %.2fs",
                                   code, attempt, max_attempts, sleep)
                    time.sleep(sleep)
        return wrapper
    return decorator


@dataclass(frozen=True)
class CostRecord:
    """Normalized, storage-ready cost row: the acquisition handoff contract."""
    period_start: str
    period_end: str
    metric: str
    amount: Decimal  # Decimal, not float, so ledger math stays exact
    currency: str
    dimensions: Dict[str, str]

    def fingerprint(self) -> str:
        """Stable key for idempotent, exactly-once persistence."""
        dims = "|".join(f"{k}={v}" for k, v in sorted(self.dimensions.items()))
        return f"{self.period_start}:{self.metric}:{dims}"


class CostExplorerIngestor:
    def __init__(self, region: str = "us-east-1",
                 assume_role_arn: Optional[str] = None,
                 external_id: Optional[str] = None) -> None:
        config = Config(retries={"max_attempts": 5, "mode": "adaptive"},
                        max_pool_connections=10)
        session = self._session(assume_role_arn, external_id)
        # ce is a global service; us-east-1 is the canonical endpoint.
        self.client = session.client("ce", region_name=region, config=config)

    @staticmethod
    def _session(assume_role_arn: Optional[str],
                 external_id: Optional[str] = None) -> boto3.Session:
        if not assume_role_arn:
            return boto3.Session()
        params = {"RoleArn": assume_role_arn, "RoleSessionName": "ce-ingestor"}
        if external_id:
            # External ID for cross-account orchestration, as recommended above.
            params["ExternalId"] = external_id
        creds = boto3.client("sts").assume_role(**params)["Credentials"]
        return boto3.Session(
            aws_access_key_id=creds["AccessKeyId"],
            aws_secret_access_key=creds["SecretAccessKey"],
            aws_session_token=creds["SessionToken"],
        )

    @with_retries()
    def _page(self, query: Dict) -> Dict:
        return self.client.get_cost_and_usage(**query)

    def fetch_costs(
        self,
        start_date: str,
        end_date: str,
        group_by: Iterable[Dict[str, str]] = ({"Type": "DIMENSION", "Key": "SERVICE"},),
        metric: str = "AmortizedCost",
        granularity: str = "DAILY",
    ) -> List[CostRecord]:
        if metric not in MONETARY_METRICS:
            raise ValueError(f"Unsupported metric: {metric}")

        group_by = list(group_by)
        if len(group_by) > 2:
            raise ValueError("ce allows at most two GroupBy entries per request")

        query: Dict = {
            "TimePeriod": {"Start": start_date, "End": end_date},  # End exclusive
            "Granularity": granularity,
            "Metrics": [metric],
        }
        if group_by:
            # Only send GroupBy when there is something to group on.
            query["GroupBy"] = group_by

        records: List[CostRecord] = []
        next_token: Optional[str] = None
        page = 0

        while True:
            if next_token:
                query["NextPageToken"] = next_token
            response = self._page(query)
            page += 1

            for day in response.get("ResultsByTime", []):
                period = day["TimePeriod"]
                if not group_by and day.get("Total"):
                    # Ungrouped query: the metric lives under Total.
                    records.append(self._record(period, metric, group_by,
                                                [], day["Total"]))
                for grp in day.get("Groups", []):
                    records.append(self._record(period, metric, group_by,
                                                grp["Keys"], grp["Metrics"]))

            next_token = response.get("NextPageToken")
            logger.info("page=%d running_total=%d", page, len(records))
            if not next_token:
                break

        logger.info("Ingested %d records for %s..%s", len(records), start_date, end_date)
        return records

    @staticmethod
    def _record(period: Dict, metric: str, group_by: List[Dict[str, str]],
                keys: List[str], metrics: Dict) -> CostRecord:
        # Pair each GroupBy key with its positional value; empty when ungrouped.
        dims = {gb["Key"]: key for gb, key in zip(group_by, keys)}
        cell = metrics[metric]
        return CostRecord(
            period_start=period["Start"],
            period_end=period["End"],
            metric=metric,
            amount=Decimal(cell["Amount"]),
            currency=cell["Unit"],
            dimensions=dims,
        )


if __name__ == "__main__":
    ingestor = CostExplorerIngestor()
    rows = ingestor.fetch_costs(
        start_date="2026-06-01",
        end_date="2026-06-27",  # exclusive
        group_by=[{"Type": "DIMENSION", "Key": "LINKED_ACCOUNT"},
                  {"Type": "DIMENSION", "Key": "SERVICE"}],
        metric="NetUnblendedCost",
    )
    # Emit newline-delimited JSON for the normalization stage to consume.
    # default=str serializes Decimal amounts as exact strings.
    for row in rows:
        print(json.dumps({**asdict(row), "_fp": row.fingerprint()}, default=str))

Schema Reference Table

The ce response is a nested ResultsByTime → Groups → {Keys, Metrics} structure. The mapping below collapses it into the normalized dimensional model shared with the other providers, which is what makes cross-cloud cost allocation strategies operate on one schema.

| AWS `ce` field | Normalized field | Type | Notes |
| --- | --- | --- | --- |
| `ResultsByTime[].TimePeriod.Start` | `period_start` | date | Inclusive lower bound |
| `ResultsByTime[].TimePeriod.End` | `period_end` | date | Exclusive upper bound |
| `Groups[].Keys[0]` | `dimensions.<group_by>` | string | First `GroupBy` key (e.g. `SERVICE`) |
| `Groups[].Keys[1]` | `dimensions.<group_by_2>` | string | Optional second `GroupBy` key |
| `Metrics.<metric>.Amount` | `amount` | decimal | Parse as `Decimal`, never `float`, for ledger math |
| `Metrics.<metric>.Unit` | `currency` | string | ISO-4217; normalize multi-currency payers |
| `Metrics` key name | `metric` | enum | Amortization basis: `UnblendedCost` / `AmortizedCost` / `NetUnblendedCost` |
| `NextPageToken` | (cursor state) | string | Persist for idempotent restart |

UnblendedCost is the undiscounted rate for baseline forecasting; AmortizedCost spreads upfront Reserved Instance and Savings Plans fees across the usage period per accrual accounting; NetUnblendedCost reflects active discounts and credits — the true cash impact. The commitment-spreading logic that reconciles these against purchase records is detailed in Reserved Instance Mapping Logic.
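
The schema table's advice to parse Amount as `Decimal` rather than `float` is easy to demonstrate with plain arithmetic. The amounts below are made-up strings standing in for values as they would arrive in a response.

```python
from decimal import Decimal

amounts = ["0.10"] * 10  # illustrative Amount strings, not real billing data

float_total = sum(float(a) for a in amounts)
decimal_total = sum(Decimal(a) for a in amounts)

print(float_total == 1.0)                # False: binary rounding error accumulates
print(decimal_total == Decimal("1.00"))  # True: decimal addition is exact here
```

Ten dimes do not add up to a dollar in binary floating point, and the same drift shows up when thousands of daily rows are summed into a monthly total.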

Operational Considerations

Rate limits: the ce API throttles at ~5 TPS and returns ThrottlingException; sustained pipelines should serialize requests and rely on adaptive retries plus the durable backoff above.
Per-request cost: each paginated request is billed (around $0.01 at time of writing). Wide GroupBy over high-cardinality dimensions multiplies pages — and cost — so prefer MONTHLY granularity for back-fills and DAILY only for the rolling window you actually reconcile.
Row ceiling: the 1,000-row response cap is per page, not per query; never assume a single call is complete without checking NextPageToken.
Eventual consistency: the trailing 24–72 hours of cost data is provisional and re-stated as AWS finalizes billing. Treat recent partitions as mutable and re-ingest them; only data older than the finalization window is authoritative.
Cost Category drift: a definition changed through UpdateCostCategoryDefinition applies going forward and does not restate already-finalized months, so re-categorizing will not rewrite history. Prefer centralized Cost Categories over tag inheritance for allocation; the multi-account patterns are covered in How to Structure AWS Cost Categories for Multi-Account Orgs.
Monitoring hooks: emit record counts, page counts, and per-run wall-clock to CloudWatch; alert when ingested totals deviate from the prior run by more than a tuned threshold. Schedule daily syncs during AWS billing-finalization off-peak windows (typically 02:00–05:00 UTC).
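
The interaction between the row ceiling and per-request billing can be estimated before a run. The sketch below is back-of-the-envelope arithmetic only: it assumes each (period, group) pair counts as one row toward the 1,000-row page, and it uses the approximate price quoted above, which should be checked against current pricing. `estimate_run` is a hypothetical helper.

```python
import math
from decimal import Decimal

ROWS_PER_PAGE = 1000                 # response ceiling cited above
PRICE_PER_REQUEST = Decimal("0.01")  # approximate figure cited above


def estimate_run(groups_per_period: int, periods: int) -> dict:
    """Estimate rows, paginated requests, and request cost for one query."""
    rows = groups_per_period * periods
    pages = max(1, math.ceil(rows / ROWS_PER_PAGE))  # even an empty result is one call
    return {"rows": rows, "pages": pages, "usd": pages * PRICE_PER_REQUEST}


if __name__ == "__main__":
    print(estimate_run(400, 30))  # 400 groups at DAILY granularity for 30 days
    print(estimate_run(400, 1))   # the same groups as a single MONTHLY period
```

Under these assumptions, 400 groups over 30 daily periods is 12,000 rows and 12 requests, while the same month at MONTHLY granularity fits in one request, which is the reasoning behind preferring MONTHLY for back-fills.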

Troubleshooting

1. Silent truncation — fewer rows than expected. Root cause: the pagination loop exits on the first page because NextPageToken is ignored. Detection: ingested row count is a suspiciously round number near a multiple of 1,000. Remediation: assert the loop only breaks when next_token is falsy, and log page and running_total every iteration (as the engine above does).

2. ThrottlingException storms during back-fill. Root cause: parallel workers or a tight retry loop exceed 5 TPS. Detection: spike in ThrottlingException in logs, elevated run duration. Remediation: serialize requests, keep boto3 mode="adaptive", and add jittered exponential backoff. For large back-fills, switch to MONTHLY granularity to collapse page counts.

3. Double-counted dollars after a restart. Root cause: the pipeline re-ran a window and appended instead of replacing. Detection: totals jump by ~2x for a re-processed period. Remediation: write with partition-scoped, idempotent upserts keyed on CostRecord.fingerprint(), so re-running a window replaces exactly that window’s rows.

4. Numbers don’t reconcile against the invoice. Root cause: mixing metrics — comparing UnblendedCost to an amortized invoice line, or summing across overlapping GroupBy passes. Detection: a consistent percentage gap between ce totals and CUR/invoice. Remediation: pin a single amortization basis per report, parse Amount as Decimal, and reconcile against authoritative CUR nightly as described in FinOps Framework Implementation.

5. Recent days keep changing. Root cause: querying the trailing 72-hour provisional window and treating it as final. Detection: yesterday’s total differs across two runs. Remediation: re-ingest the trailing window every run and only freeze partitions older than the finalization lag.
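
For item 3 above, the replace-instead-of-append write can be sketched with an in-memory dict standing in for the real store. The `_fp` and `period_start` fields match the newline-delimited JSON the engine emits; `replace_window` and the dict-backed store are assumptions for illustration.

```python
from typing import Dict, Iterable


def replace_window(store: Dict[str, dict], rows: Iterable[dict],
                   window_start: str, window_end: str) -> None:
    """Replace every stored row in [window_start, window_end) with `rows`.

    ISO-8601 date strings sort lexicographically, so string comparison works.
    """
    stale = [fp for fp, row in store.items()
             if window_start <= row["period_start"] < window_end]
    for fp in stale:
        del store[fp]          # drop rows the re-run no longer returns
    for row in rows:
        store[row["_fp"]] = row  # keyed write: re-running cannot duplicate
```

Running the same window twice leaves the store unchanged, and a row that disappears from a re-stated window is removed rather than left behind. In a database, the same effect usually comes from a delete-and-insert inside one transaction or a keyed upsert per partition.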

Frequently Asked Questions

When should I use the Cost Explorer API instead of Cost and Usage Reports?

Use the ce API for aggregated, on-demand reconciliation, dashboards, and showback where pre-aggregated daily or monthly totals are sufficient. Its data is refreshed roughly daily, so it is a fast query layer rather than a real-time feed. Use CUR for line-item forensic analysis, resource-level attribution, and authoritative monthly close, accepting its 24–48 hour delivery latency. Mature pipelines run both and reconcile the cheap ce aggregates against authoritative CUR totals.

What is the difference between UnblendedCost, AmortizedCost, and NetUnblendedCost?

UnblendedCost is the usage cost as charged on the day it is incurred, before negotiated discounts are netted out. AmortizedCost spreads upfront Reserved Instance and Savings Plans fees across the commitment term to align with accrual accounting. NetUnblendedCost applies active discounts and credits, reflecting true cash impact. Pin one basis per report; mixing them is the most common reconciliation bug.

How do I avoid throttling on the ce API?

Stay under the ~5 requests-per-second ceiling by serializing requests, enabling boto3 mode="adaptive", and adding jittered exponential backoff on ThrottlingException. For large historical back-fills, switch to MONTHLY granularity to collapse the number of paginated requests.
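
Serializing requests under the ceiling can be sketched with a minimum-interval limiter that is called before every request. This is an illustrative helper, not part of boto3; `MinIntervalLimiter` is a hypothetical name, and the clock and sleep functions are injectable only so the behavior can be tested without waiting.

```python
import time
from typing import Callable


class MinIntervalLimiter:
    """Space calls at least 1/max_per_second apart (single-threaded use)."""

    def __init__(self, max_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # Reserve the next slot relative to whichever is later.
        self._next_allowed = max(now, self._next_allowed) + self._interval
        return delay
```

Calling `limiter.wait()` before each paginated request, with `max_per_second` set somewhat below the throttle, keeps a single worker under the limit; the retry and backoff logic still handles the occasional ThrottlingException.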

Why does the trailing few days of cost data keep changing?

The most recent 24–72 hours of cost data is provisional and re-stated as AWS finalizes billing. Treat recent partitions as mutable, re-ingest them on every run, and only freeze partitions older than the finalization window.
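
Deciding which daily partitions to re-ingest and which to freeze can be sketched as a date comparison. The three-day default mirrors the 72-hour upper bound mentioned above and should be tuned to the lag actually observed; `split_partitions` is a hypothetical helper.

```python
from datetime import date, timedelta
from typing import List, Tuple


def split_partitions(days: List[date], today: date,
                     finalization_lag_days: int = 3) -> Tuple[List[date], List[date]]:
    """Return (frozen, mutable) daily partitions relative to `today`."""
    cutoff = today - timedelta(days=finalization_lag_days)
    frozen = [d for d in days if d < cutoff]    # older than the lag: treat as final
    mutable = [d for d in days if d >= cutoff]  # provisional: re-ingest every run
    return frozen, mutable


if __name__ == "__main__":
    window = [date(2026, 6, 20) + timedelta(days=i) for i in range(7)]
    frozen, mutable = split_partitions(window, today=date(2026, 6, 27))
    print("frozen:", [d.isoformat() for d in frozen])
    print("mutable:", [d.isoformat() for d in mutable])
```

Each run then queries only the mutable partitions plus any new day, which limits both request cost and the amount of data rewritten downstream.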

Related

FinOps Architecture & Billing Fundamentals: the parent reference that defines the acquisition → normalization → allocation → persistence pipeline this page plugs into.
GCP Billing Export Configuration: the equivalent acquisition surface on GCP, for aligning schemas across providers.
Azure Cost Management Setup: the Azure export pattern whose meterCategory fields map onto the same normalized model.
Reserved Instance Mapping Logic: the commitment-amortization reconciliation behind the AmortizedCost metric.
How to Structure AWS Cost Categories for Multi-Account Orgs: applying business context to raw ce dimensions at scale.
