Azure Cost Management Setup

Azure Cost Management is the acquisition surface that feeds Microsoft billing telemetry into the first stage of a production cost pipeline. This page covers two specific mechanisms inside the broader FinOps Architecture & Billing Fundamentals pipeline: the scheduled Exports feature that lands columnar billing data in blob storage, and the Microsoft.CostManagement/query REST API that drives deterministic, idempotent reconciliation runs. Azure’s billing model is hierarchical — Management Group → Subscription → Resource Group → Resource — and the same usage window is re-published, re-amortized, and re-paginated as Microsoft finalizes charges, so the engineering constraints that shape every decision below are its 24–72 hour eventual-consistency window, its 429-driven throttling on the query endpoint, and the tag-flattening behaviour of the export schema. Lead with those constraints, implement the ingestion engine that respects them, and instrument the failure modes that bite teams who treat Cost Management like a static report.

Architecture Context & Data-Flow Position

Within the four-stage pipeline (acquisition → normalization → allocation → persistence) defined by the parent FinOps Architecture & Billing Fundamentals reference, Azure Cost Management occupies the acquisition stage and a thin slice of preliminary normalization. Its job is to hand the next stage a complete, ordered, de-duplicated set of cost records keyed on a stable fingerprint — or fail loudly — never a silently truncated page or a half-resolved tag hierarchy.

Two acquisition paths coexist, and a mature stack runs both. Scheduled Exports deliver bulk, line-item history as CSV or Parquet to an Azure Storage container on a daily cadence; this is the authoritative source for monthly close and resource-level forensics, accepting a 24–48 hour delivery latency. The Query REST API returns pre-aggregated, near-real-time totals for fast nightly reconciliation, anomaly detection, and showback. The flow is deterministic: scheduled trigger → token acquisition → query construction → nextLink pagination loop → metric extraction → dimensional canonicalization → handoff to the normalization stage. The export path mirrors the intent of AWS Cost Explorer Architecture and GCP Billing Export Configuration, where teams standardize on daily Parquet to unify ingestion logic across providers.

API & Scope Map

The billing boundary you target determines both the IAM role and the URL scope segment. The query endpoint accepts any of these scopes verbatim:

Billing boundary	Scope path segment	Required role
Subscription	`subscriptions/{subId}`	`Cost Management Reader`
Resource group	`subscriptions/{subId}/resourceGroups/{rg}`	`Cost Management Reader`
Management group	`providers/Microsoft.Management/managementGroups/{mgId}`	`Cost Management Reader` (inherited)
EA billing account (enrollment)	`providers/Microsoft.Billing/billingAccounts/{enrollmentId}`	`Billing Account Reader` (Enterprise)
MCA billing profile	`providers/Microsoft.Billing/billingAccounts/{id}/billingProfiles/{pid}`	`Billing Profile Reader`

The query URL is always https://management.azure.com/{scope}/providers/Microsoft.CostManagement/query?api-version=2023-11-01. The choice between Enterprise Agreement (EA) and Microsoft Customer Agreement (MCA) scopes changes the field names you receive downstream — the EA-to-tag reconciliation is detailed in Mapping Azure EA Billing to FinOps Tags.
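A minimal sketch of how the scope segment and the fixed URL suffix combine. The helper name `build_query_url` and the prefix check are illustrative assumptions, not part of any Azure SDK; the base URL, provider path, and API version are the ones quoted above.

```python
from urllib.parse import urlencode

COST_MGMT_BASE = "https://management.azure.com"
API_VERSION = "2023-11-01"

# Scope prefixes taken from the scope table above. The comparison is
# case-sensitive here for simplicity; relax it if your callers vary the casing.
_VALID_PREFIXES = (
    "subscriptions/",
    "providers/Microsoft.Management/managementGroups/",
    "providers/Microsoft.Billing/billingAccounts/",
)


def build_query_url(scope: str, api_version: str = API_VERSION) -> str:
    """Return the Cost Management query URL for a scope path segment."""
    normalized = scope.strip("/")
    if not normalized.startswith(_VALID_PREFIXES):
        raise ValueError(f"Unrecognized billing scope: {scope!r}")
    query = urlencode({"api-version": api_version})
    return (
        f"{COST_MGMT_BASE}/{normalized}"
        f"/providers/Microsoft.CostManagement/query?{query}"
    )
```

Rejecting an unknown prefix before the call is made turns a mistyped scope into a local error instead of a 404 from ARM.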

Core Implementation Patterns

1. IAM & Least Privilege

Cost Management requires explicit read access scoped to the precise billing boundary your pipeline monitors. Assign Cost Management Reader for subscription- or resource-group-level visibility, or Billing Account Reader for EA enrollment-level aggregation, to the service principal or managed identity that runs the pipeline. Cross-subscription visibility is granted once at the Management Group level and inherits downward — never enumerate per-subscription role assignments by hand.

Avoid contributor-level roles on any billing scope; they grant write access you do not need and widen the blast radius of a leaked credential. Where the runner is an Azure-hosted workload (Data Factory, Databricks, AKS, a Function App), prefer a managed identity over a client-secret service principal, and provision the identity before configuring exports so the first run does not deadlock on authentication. Use Azure Policy to restrict export destinations to approved storage accounts, closing off exfiltration to an attacker-controlled container.

# Least-privilege role assignment via the Azure CLI (run once, out of band).
# az role assignment create \
#   --assignee-object-id "$IDENTITY_OBJECT_ID" \
#   --assignee-principal-type ServicePrincipal \
#   --role "Cost Management Reader" \
#   --scope "/subscriptions/$AZURE_SUBSCRIPTION_ID"

2. Configuring Scheduled Exports

In the portal, navigate to Cost Management → Exports → Add, or provision the export as code. Define the export scope, set the frequency to Daily, and target an Azure Storage container. Enable hierarchical namespace (ADLS Gen2) on the storage account: it is what makes partition pruning efficient and keeps Delta Lake or Apache Iceberg integration clean. Choose Parquet over CSV where the export offers it, for columnar compression, schema enforcement, and query acceleration. Export the amortized cost dataset (alongside actual cost if finance needs both) so that reservation and savings-plan fees are spread across the usage period for accurate showback and chargeback.

Lock down the path: if the storage account sits behind a firewall, allow trusted Azure services to reach it so the export can still land, and use private endpoints for your own readers if your organization enforces network isolation. The amortized basis here is the same accrual logic reconciled in Reserved Instance Mapping Logic.
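For the "export as code" route, the sketch below assembles an export definition as a plain dictionary without making any network call. The field names follow the Exports REST schema as best understood here (`schedule`, `deliveryInfo.destination`, `definition.dataSet`) and should be checked against the API version you pin. The helper name `build_export_body` is hypothetical, and Parquet is assumed to be accepted only by newer API versions.

```python
from typing import Any, Dict


def build_export_body(
    storage_account_id: str,
    container: str,
    root_folder: str,
    start: str,
    end: str,
    cost_type: str = "AmortizedCost",
    file_format: str = "Parquet",  # assumption: older API versions accept only "Csv"
) -> Dict[str, Any]:
    """Assemble a daily export definition; the caller sends it to the Exports API."""
    if cost_type not in ("ActualCost", "AmortizedCost"):
        raise ValueError(f"Unsupported cost type: {cost_type!r}")
    return {
        "properties": {
            "schedule": {
                "status": "Active",
                "recurrence": "Daily",
                "recurrencePeriod": {"from": start, "to": end},
            },
            "format": file_format,
            "deliveryInfo": {
                "destination": {
                    "resourceId": storage_account_id,
                    "container": container,
                    "rootFolderPath": root_folder,
                }
            },
            "definition": {
                "type": cost_type,
                "timeframe": "MonthToDate",
                "dataSet": {"granularity": "Daily"},
            },
        }
    }
```

Keeping the body in version control makes the export reviewable like any other infrastructure change, and one export per cost basis keeps ActualCost and AmortizedCost files from mixing in the same folder.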

3. Query Construction, Pagination & Rate-Limit Handling

The Query API is the right tool for daily reconciliation. A request body declares a type (ActualCost or AmortizedCost), a timeframe, a dataset.granularity, an aggregation, and up to two grouping dimensions. The response is a columnar properties.columns / properties.rows structure plus an optional properties.nextLink. Two rules keep the loop correct. First, the nextLink is a fully-formed URL that already carries the continuation token, so follow it as-is until it is absent. The engine on this page fetches it with GET and no body; because the query operation itself is a POST, confirm that behaviour against the API version you pin, and send the original body to the nextLink URL instead if the GET is rejected. Second, the endpoint throttles with HTTP 429 plus a Retry-After header that you must honour rather than retrying immediately. Treat the trailing window as mutable and re-query it every run.
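The request body described above can be sketched as a small builder that enforces the two-dimension grouping limit locally. The function name `build_query_payload` and its defaults are illustrative assumptions; the field names mirror the payload used by the ingestion engine on this page.

```python
from typing import Any, Dict, Sequence

MAX_GROUPINGS = 2  # grouping limit stated in the text above
COST_TYPES = ("ActualCost", "AmortizedCost")


def build_query_payload(
    start: str,
    end: str,
    cost_type: str = "ActualCost",
    granularity: str = "Daily",
    group_by: Sequence[str] = ("ResourceGroup", "ServiceName"),
) -> Dict[str, Any]:
    """Build a custom-timeframe query body, validating inputs before any call."""
    if cost_type not in COST_TYPES:
        raise ValueError(f"Unsupported cost type: {cost_type!r}")
    if len(group_by) > MAX_GROUPINGS:
        raise ValueError(f"At most {MAX_GROUPINGS} grouping dimensions are allowed")
    return {
        "type": cost_type,
        "timeframe": "Custom",
        "timePeriod": {"from": start, "to": end},
        "dataset": {
            "granularity": granularity,
            "aggregation": {"totalCost": {"name": "PreTaxCost", "function": "Sum"}},
            "grouping": [{"type": "Dimension", "name": name} for name in group_by],
        },
    }
```

Validating before the request keeps a malformed query from spending part of the throttling budget on a guaranteed 400.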

4. Metric & Dimension Mapping

Azure returns PreTaxCost or Cost in the billing currency of the scope, not necessarily USD, so multi-currency enrollments must normalize on an ISO-4217 code before any ledger math. Tags arrive flattened into a single serialized column in the export rather than as typed key/value fields. Parse that column first, then resolve each required tag through the inheritance chain (Resource → Resource Group → Subscription), defaulting to an Unallocated bucket when no level supplies a value. That canonicalization is what lets Azure records share one schema with the other providers in cross-cloud cost allocation strategies.
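The inheritance chain can be sketched as a lookup that walks the three levels in order. This is a minimal illustration that assumes the tag sets have already been parsed into dictionaries of strings; the function name `resolve_tags` and the case-insensitive key matching are assumptions made for the example.

```python
from typing import Dict, List, Mapping, Optional, Sequence

UNALLOCATED = "Unallocated"


def _normalized(tags: Optional[Mapping[str, str]]) -> Dict[str, str]:
    """Lower-case tag keys and drop empty values so lookups are case-insensitive."""
    return {
        key.strip().lower(): value.strip()
        for key, value in (tags or {}).items()
        if value and value.strip()
    }


def resolve_tags(
    required_keys: Sequence[str],
    resource_tags: Optional[Mapping[str, str]],
    resource_group_tags: Optional[Mapping[str, str]],
    subscription_tags: Optional[Mapping[str, str]],
) -> Dict[str, str]:
    """Resolve each key Resource -> Resource Group -> Subscription, else Unallocated."""
    layers: List[Dict[str, str]] = [
        _normalized(resource_tags),
        _normalized(resource_group_tags),
        _normalized(subscription_tags),
    ]
    resolved: Dict[str, str] = {}
    for key in required_keys:
        lookup = key.lower()
        resolved[key] = next(
            (layer[lookup] for layer in layers if lookup in layer), UNALLOCATED
        )
    return resolved
```

The most specific level wins, so a resource-level `env` overrides the resource group's value, while a key that no level defines lands in the Unallocated bucket instead of being dropped.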

Production-Grade Python Ingestion Engine

The implementation below drives the Query API directly with requests and azure-identity. Driving the REST surface rather than a high-level SDK gives explicit control over retry, nextLink pagination, and header inspection, and avoids SDK version churn. It uses DefaultAzureCredential for identity resolution across local development and cloud runners, a typed CostRecord dataclass for the normalized model, structured logging, and a __main__ guard.

import os
import time
import json
import logging
from dataclasses import dataclass, field, asdict
from datetime import datetime, timedelta, timezone
from decimal import Decimal
from hashlib import sha256
from typing import Any, Dict, List, Optional

import requests
from azure.identity import DefaultAzureCredential

logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s")
logger = logging.getLogger(__name__)

COST_MGMT_BASE = "https://management.azure.com"
API_VERSION = "2023-11-01"
ARM_SCOPE = "https://management.azure.com/.default"


@dataclass
class CostRecord:
    """Normalized cost record shared across all cloud providers."""
    period_start: str
    resource_group: str
    service_name: str
    amount: Decimal
    currency: str
    scope: str
    tags: Dict[str, str] = field(default_factory=dict)

    def fingerprint(self) -> str:
        """Stable idempotency key for partition-scoped upserts."""
        basis = f"{self.scope}|{self.period_start}|{self.resource_group}|{self.service_name}"
        return sha256(basis.encode("utf-8")).hexdigest()

    def to_dict(self) -> Dict[str, Any]:
        d = asdict(self)
        d["amount"] = str(self.amount)  # never serialize Decimal as float
        d["fingerprint"] = self.fingerprint()
        return d


class AzureCostClient:
    """
    Production Azure Cost Management client using the Query REST API directly.
    Handles token acquisition, exponential backoff on 429/5xx, and nextLink pagination.
    """

    def __init__(self, scope: str, max_retries: int = 5):
        # scope example: "subscriptions/00000000-0000-0000-0000-000000000000"
        self.scope = scope.strip("/")
        self.credential = DefaultAzureCredential()
        self.session = requests.Session()
        self.max_retries = max_retries

    def _headers(self) -> Dict[str, str]:
        token = self.credential.get_token(ARM_SCOPE).token
        return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

    def _request(self, method: str, url: str, payload: Optional[Dict] = None) -> Dict:
        """Single durable request that retries 429, 5xx, and transport errors only."""
        for attempt in range(self.max_retries):
            try:
                response = self.session.request(
                    method, url, headers=self._headers(), json=payload, timeout=60
                )
            except (requests.exceptions.ConnectionError, requests.exceptions.Timeout) as exc:
                # Transport failures are transient: back off, then retry.
                logger.error("Request failed: %s", exc)
                if attempt == self.max_retries - 1:
                    raise
                time.sleep(min(2 ** attempt, 30))
                continue
            if response.status_code == 429:
                # Retry-After is normally delta-seconds; fall back to 60s when it is
                # missing or not an integer (for example an HTTP-date).
                header = response.headers.get("Retry-After", "60").strip()
                retry_after = int(header) if header.isdigit() else 60
                logger.warning(
                    "Throttled (429). Waiting %ds (attempt %d)", retry_after, attempt + 1
                )
                time.sleep(retry_after)
                continue
            if response.status_code >= 500:
                backoff = min(2 ** attempt + (os.urandom(1)[0] / 255), 30)
                logger.warning(
                    "Server error %d. Backing off %.2fs", response.status_code, backoff
                )
                time.sleep(backoff)
                continue
            # Any other 4xx (400/401/403/404) will not heal on retry: fail fast.
            response.raise_for_status()
            return response.json()
        raise RuntimeError(f"Max retries exceeded for {method} {url}")

    def fetch_cost_data(
        self, start: str, end: str, granularity: str = "Daily"
    ) -> List[CostRecord]:
        """
        Fetch cost data for the configured scope as normalized CostRecords.

        Args:
            start: ISO date string (YYYY-MM-DD), inclusive lower bound.
            end:   ISO date string (YYYY-MM-DD), inclusive upper bound.
            granularity: "Daily" or "Monthly".
        """
        query_url = (
            f"{COST_MGMT_BASE}/{self.scope}"
            f"/providers/Microsoft.CostManagement/query?api-version={API_VERSION}"
        )
        payload = {
            "type": "ActualCost",
            "timeframe": "Custom",
            "timePeriod": {"from": start, "to": end},
            "dataset": {
                "granularity": granularity,
                "aggregation": {
                    "totalCost": {"name": "PreTaxCost", "function": "Sum"}
                },
                "grouping": [
                    {"type": "Dimension", "name": "ResourceGroup"},
                    {"type": "Dimension", "name": "ServiceName"},
                ],
            },
        }

        records: List[CostRecord] = []
        # First page is a POST with the query body.
        data = self._request("POST", query_url, payload)

        while True:
            props = data.get("properties", {})
            columns = [c["name"] for c in props.get("columns", [])]
            currency_idx = columns.index("Currency") if "Currency" in columns else None
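            for row in props.get("rows", []):
                mapped = dict(zip(columns, row))
                # Daily queries return UsageDate as an integer yyyymmdd; Monthly queries
                # are assumed to return BillingMonth as an ISO timestamp. Emit ISO dates
                # so every period gets its own fingerprint.
                raw = str(mapped.get("UsageDate") or mapped.get("BillingMonth") or start)
                if len(raw) == 8 and raw.isdigit():
                    raw = f"{raw[:4]}-{raw[4:6]}-{raw[6:]}"
                records.append(
                    CostRecord(
                        period_start=raw[:10],
                        resource_group=(mapped.get("ResourceGroup") or "unallocated").lower(),
                        service_name=mapped.get("ServiceName", "unknown"),
                        amount=Decimal(str(mapped.get("PreTaxCost", "0"))),
                        currency=row[currency_idx] if currency_idx is not None else "USD",
                        scope=self.scope,
                    )
                )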

            next_link = props.get("nextLink")
            if not next_link:
                break
            # nextLink is a full URL fetched with GET (no body).
            data = self._request("GET", next_link)

        logger.info("Ingested %d cost records for scope %s", len(records), self.scope)
        return records


if __name__ == "__main__":
    subscription_id = os.getenv("AZURE_SUBSCRIPTION_ID")
    if not subscription_id:
        raise ValueError("AZURE_SUBSCRIPTION_ID environment variable is required.")

    today = datetime.now(timezone.utc).date()
    start_date = (today - timedelta(days=7)).isoformat()
    end_date = today.isoformat()

    client = AzureCostClient(scope=f"subscriptions/{subscription_id}")
    rows = client.fetch_cost_data(start=start_date, end=end_date)
    print(json.dumps([r.to_dict() for r in rows[:3]], indent=2))

Every retry and nextLink hop is explicit, so a partial run fails loudly instead of handing the normalization stage a truncated page. Persisting with CostRecord.fingerprint() as the upsert key makes re-ingesting the mutable trailing window converge on one answer.
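To illustrate why a fingerprint-keyed upsert converges, here is a minimal in-memory sketch. The `upsert` helper and the dictionary standing in for the persistence layer are hypothetical; the rows are assumed to carry a `fingerprint` key, as the `to_dict()` output above does.

```python
from typing import Any, Dict, Iterable


def upsert(store: Dict[str, Dict[str, Any]], rows: Iterable[Dict[str, Any]]) -> int:
    """Insert or replace rows keyed on fingerprint; return how many keys changed."""
    changed = 0
    for row in rows:
        key = row["fingerprint"]
        if store.get(key) != row:
            store[key] = row  # a restated amount overwrites the provisional one
            changed += 1
    return changed


store: Dict[str, Dict[str, Any]] = {}
first_run = [
    {"fingerprint": "a", "amount": "10.00"},
    {"fingerprint": "b", "amount": "5.00"},
]
restated_run = [
    {"fingerprint": "a", "amount": "12.50"},  # Microsoft restated this day
    {"fingerprint": "b", "amount": "5.00"},
]
upsert(store, first_run)
upsert(store, restated_run)
```

After both runs the store still holds two rows and row `a` carries the restated amount, which is the behaviour an append-only write would get wrong.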

Schema Reference Table

The query response is a columnar properties.columns / properties.rows structure. The export (CSV/Parquet) carries a wider, flatter schema. The mapping below collapses both onto the normalized dimensional model shared across providers.

Azure source field	Normalized field	Type	Notes
`properties.rows[][UsageDate]`	`period_start`	date	Integer `yyyymmdd` in query responses; ISO date in exports
`properties.rows[][ResourceGroup]`	`resource_group`	string	Lower-case to deduplicate casing variants
`properties.rows[][ServiceName]` / `meterCategory`	`service_name`	string	`ServiceName` in query; `MeterCategory` in EA exports
`properties.rows[][PreTaxCost]`	`amount`	decimal	Parse as `Decimal`, never `float`, for ledger math
`properties.rows[][Currency]`	`currency`	string	Billing currency (ISO-4217); normalize multi-currency payers
`type` request field	`metric`	enum	`ActualCost` vs `AmortizedCost` amortization basis
`tags` (export)	`dimensions.tags`	map	Flattened; resolve Resource → RG → Subscription inheritance
`properties.nextLink`	(cursor state)	string	Persist for idempotent restart

ActualCost reflects charges as billed; AmortizedCost spreads upfront reservation and savings-plan purchases across the commitment term per accrual accounting. Pin one basis per report — the commitment-spreading reconciliation lives in Reserved Instance Mapping Logic.
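A worked example makes the difference between the two bases concrete. The sketch below spreads an upfront purchase evenly by month, which is a deliberate simplification (an assumption for illustration); Azure's own amortization follows actual usage at a finer grain. The function name `monthly_costs` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import List


def monthly_costs(upfront: Decimal, term_months: int, basis: str) -> List[Decimal]:
    """Return the per-month cost series for an upfront commitment purchase."""
    if basis == "ActualCost":
        # The whole charge lands in the purchase month.
        return [upfront] + [Decimal("0")] * (term_months - 1)
    if basis == "AmortizedCost":
        share = (upfront / term_months).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        series = [share] * (term_months - 1)
        # The last month absorbs the rounding remainder so the series sums exactly.
        return series + [upfront - sum(series)]
    raise ValueError(f"Unknown basis: {basis!r}")


actual = monthly_costs(Decimal("1200"), 12, "ActualCost")
amortized = monthly_costs(Decimal("1200"), 12, "AmortizedCost")
```

Both series sum to 1200, but `actual` shows 1200 in month one and zero afterwards, while `amortized` shows 100.00 in each of the twelve months. Comparing a month from one series against the same month from the other is exactly the mismatch that pinning one basis per report avoids.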

Operational Considerations

Throttling: the query endpoint returns HTTP 429 with a Retry-After header and x-ms-ratelimit-remaining-microsoft.costmanagement-* counters; honour the header rather than retrying immediately. ARM also enforces a subscription-level ceiling of roughly 12,000 read requests per hour, so serialize pipeline calls and prefer Monthly granularity for back-fills.
Eventual consistency: the trailing 24–72 hours of cost data is provisional and re-stated as Microsoft finalizes billing. Treat recent partitions as mutable, re-ingest them every run, and only freeze partitions older than the finalization lag.
Export latency: scheduled exports typically land within 24–48 hours; add a health check that alerts if no new export blob appears within 72 hours.
Amortization alignment: an ActualCost export overstates the month a reservation is purchased and distorts monthly burn — match the basis (ActualCost vs AmortizedCost) to your reporting cadence and keep it consistent.
Currency normalization: amounts are returned in the scope’s billing currency, not always USD; capture the Currency column and convert on a pinned daily rate before aggregating across enrollments.
Pipeline self-cost: tag the storage account, compute, and orchestration that run ingestion, and exclude them from engineering showback to avoid recursive allocation loops.
Monitoring hooks: emit record counts, page counts, and per-run wall-clock to Azure Monitor; alert when ingested totals deviate from the prior run beyond a tuned threshold.
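Two of the rules above, the mutable trailing window and the export freshness alert, reduce to small time comparisons. The sketch below is a minimal illustration that assumes a 72-hour finalization lag and timezone-aware UTC timestamps; the helper names and constants are hypothetical and should be tuned to the restatement behaviour you observe.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Iterable

FINALIZATION_LAG = timedelta(hours=72)    # assumption: upper end of the 24-72h window
EXPORT_STALE_AFTER = timedelta(hours=72)  # alert threshold from the export-latency rule


def is_partition_mutable(partition_day: date, now: datetime) -> bool:
    """True while a daily partition may still be restated and must be re-ingested."""
    partition_end = datetime.combine(
        partition_day + timedelta(days=1), datetime.min.time(), tzinfo=timezone.utc
    )
    return now - partition_end < FINALIZATION_LAG


def export_is_stale(blob_timestamps: Iterable[datetime], now: datetime) -> bool:
    """True when no export blob exists or the newest one is older than the threshold."""
    latest = max(blob_timestamps, default=None)
    return latest is None or now - latest > EXPORT_STALE_AFTER
```

A scheduler can call `is_partition_mutable` for each candidate day to decide what to re-query, and feed the last-modified times of the export container into `export_is_stale` to drive the health-check alert.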

Troubleshooting

1. Silent truncation — fewer rows than expected. Root cause: the loop re-POSTs the query body to nextLink, or exits without checking it. Detection: ingested row count plateaus at a round page boundary. Remediation: fetch nextLink with GET and no body (as the engine does) and only break when it is falsy.

2. 429 TooManyRequests storms. Root cause: parallel workers or a tight retry loop exceed the ARM read budget. Detection: a spike in 429s and elevated run duration. Remediation: serialize requests, honour Retry-After, add jittered backoff, and switch back-fills to Monthly granularity to collapse call counts.

3. Authentication deadlock on first run. Root cause: the export or query was configured before the managed identity received its role assignment. Detection: 401/403 from ARM despite a “successful” deployment. Remediation: provision the identity and assign Cost Management Reader before the first pipeline run; verify with az role assignment list --assignee <objectId>.

4. Double-counted dollars after a restart. Root cause: a re-run appended the trailing window instead of replacing it. Detection: totals for a re-processed period jump by ~2x. Remediation: write partition-scoped upserts keyed on CostRecord.fingerprint() so re-running a window replaces exactly that window’s rows.

5. Numbers drift after schema evolution. Root cause: Azure added or renamed export columns (e.g. PricingModel, ChargeType, Frequency) and the parser silently dropped them. Detection: a consistent gap against the invoice, or a missing dimension downstream. Remediation: validate exports against a schema registry or Parquet metadata check that alerts on column drops or type mismatches before data lands in production tables.
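The schema check in item 5 can be sketched as a comparison between a pinned column contract and the columns observed in a new export file. This is a minimal illustration: the function name `diff_schema` is hypothetical, the type labels are free-form strings, and reading the observed columns out of Parquet metadata is left to whichever reader you already use.

```python
from typing import Dict, List, Mapping


def diff_schema(
    expected: Mapping[str, str], observed: Mapping[str, str]
) -> Dict[str, List[str]]:
    """Compare column name -> type maps case-insensitively and report the drift."""
    exp = {name.lower(): dtype for name, dtype in expected.items()}
    obs = {name.lower(): dtype for name, dtype in observed.items()}
    shared = set(exp) & set(obs)
    return {
        "missing": sorted(set(exp) - set(obs)),
        "unexpected": sorted(set(obs) - set(exp)),
        "type_changed": sorted(name for name in shared if exp[name] != obs[name]),
    }


def is_safe_to_load(drift: Mapping[str, List[str]]) -> bool:
    """Block the load on dropped columns or type changes; new columns only warn."""
    return not drift["missing"] and not drift["type_changed"]
```

Treating new columns as a warning and dropped or retyped columns as a hard stop keeps an additive schema change from halting ingestion while still catching the cases that silently skew totals.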

Frequently Asked Questions

When should I use the Query API instead of scheduled exports?

Use the Query API for aggregated, near-real-time reconciliation, anomaly detection, and showback where daily or monthly totals are enough. Use scheduled Parquet exports for resource-level forensics and authoritative monthly close, accepting their 24–48 hour delivery latency. Mature pipelines run both and reconcile the cheap query aggregates against the authoritative export.

What is the difference between ActualCost and AmortizedCost?

ActualCost reflects charges exactly as billed, so a reservation purchase lands entirely in its purchase month. AmortizedCost spreads upfront reservation and savings-plan fees across the commitment term to match accrual accounting. Pin one basis per report — mixing them is the most common Azure reconciliation bug.

How do I handle 429 throttling on the Cost Management API?

Honour the Retry-After header on every 429 rather than retrying immediately, serialize pipeline requests, and stay under the subscription-level ARM read budget of roughly 12,000 reads per hour. For historical back-fills, use Monthly granularity to collapse the number of paginated requests.

Why does the last few days of Azure cost keep changing?

The trailing 24–72 hours of cost data is provisional and re-stated as Microsoft finalizes billing. Treat recent partitions as mutable, re-ingest them on every run, and only freeze partitions older than the finalization window.

How are resource tags resolved when the export flattens them?

Apply an inheritance engine that resolves tags Resource → Resource Group → Subscription and routes records with no usable metadata to an Unallocated bucket. The full EA-specific reconciliation is covered in Mapping Azure EA Billing to FinOps Tags.

Related

FinOps Architecture & Billing Fundamentals — the parent reference that defines the acquisition → normalization → allocation → persistence pipeline this page plugs into.
AWS Cost Explorer Architecture — the equivalent acquisition surface on AWS, for aligning schemas and pagination patterns across providers.
GCP Billing Export Configuration — the GCP daily-Parquet export pattern that unifies ingestion logic with Azure’s exports.
Mapping Azure EA Billing to FinOps Tags — resolving the flattened tag hierarchy into accurate, actionable showback.
Reserved Instance Mapping Logic — the commitment-amortization reconciliation behind the AmortizedCost basis.
Cross-Cloud Cost Allocation Strategies — how the normalized Azure records join one dimensional model spanning AWS, GCP, and Azure.
