Master Terraform state isolation using S3 and Azure Storage path keys. Learn why vm/01/terraform.tfstate patterns reduce blast radius, and why Terraform Cloud fails for enterprise resource isolation.
The Monolithic State Problem
Most teams start Terraform the wrong way: everything in one state file.
```hcl
# The classic mistake - one state for everything
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "terraform.tfstate" # One file to rule them all
    region = "ap-southeast-2"
  }
}
```

This works until it doesn't. A typo in your dev VM configuration corrupts the state file, and suddenly your production database is in an unknown state. A junior developer runs `terraform destroy` thinking they're in staging, but the single state file includes production resources.
The blast radius is your entire infrastructure.
State Independence: The Path-Based Pattern
The solution is state independence—each logical component gets its own state file, stored at a unique path in your backend.
S3 Backend with Path Keys
```hcl
# infrastructure/network/main.tf
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "network/prod/terraform.tfstate"
    region         = "ap-southeast-2"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# infrastructure/compute/vm-01/main.tf
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "compute/vm-01/terraform.tfstate"
    region         = "ap-southeast-2"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# infrastructure/compute/vm-02/main.tf
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "compute/vm-02/terraform.tfstate"
    region         = "ap-southeast-2"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# infrastructure/database/prod/main.tf
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "database/prod/terraform.tfstate"
    region         = "ap-southeast-2"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}
```

Azure Storage with Container Keys
```hcl
# infrastructure/network/main.tf
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"
    storage_account_name = "stterraformstate"
    container_name       = "tfstate"
    key                  = "network/prod/terraform.tfstate"
  }
}

# infrastructure/compute/vm-01/main.tf
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"
    storage_account_name = "stterraformstate"
    container_name       = "tfstate"
    key                  = "compute/vm-01/terraform.tfstate"
  }
}
```

The Resulting Structure
```
s3://company-terraform-state/
├── network/
│   ├── prod/terraform.tfstate
│   ├── staging/terraform.tfstate
│   └── dev/terraform.tfstate
├── compute/
│   ├── vm-01/terraform.tfstate
│   ├── vm-02/terraform.tfstate
│   ├── vm-03/terraform.tfstate
│   └── vm-web-cluster/terraform.tfstate
├── database/
│   ├── prod/terraform.tfstate
│   ├── staging/terraform.tfstate
│   └── dev/terraform.tfstate
├── dns/
│   └── terraform.tfstate
└── cdn/
    └── terraform.tfstate
```

Why Terraform Cloud Fails for Complex Deployments
Terraform Cloud (now HCP Terraform) sounds great in demos, but it has fundamental limitations for enterprise deployments.
Resource Limits and Pricing
| Tier | Managed Resources | Cost |
|---|---|---|
| Free (ending March 2026) | 500 resources | $0 |
| Standard | Pay-per-resource | ~$0.00014/hour/resource |
| Plus | Pay-per-resource | Higher |
The problem: A single EKS cluster with networking, IAM, security groups, and add-ons can consume 500 resources easily. Clone that for staging, and you've doubled your count—but pricing can jump 7x rather than 2x due to non-linear scaling.
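To make that concrete, here is back-of-the-envelope arithmetic using the ~$0.00014/resource/hour Standard rate from the table above (the hours-per-month figure is an approximation, not a quoted price):

```shell
# Rough monthly cost for one fully-loaded environment at the listed rate
RATE=0.00014    # assumed $/resource/hour (Standard tier estimate above)
HOURS=730       # approximate hours in a month
RESOURCES=500   # roughly one EKS cluster with networking, IAM, add-ons

awk -v r="$RATE" -v h="$HOURS" -v n="$RESOURCES" \
  'BEGIN { printf "~$%.0f/month\n", r * h * n }'
# prints ~$51/month
```

Double the resource count for staging and the arithmetic doubles, but tier thresholds and rate changes mean the invoice often does not.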
State Isolation Limitations
Key Terraform Cloud Limitations
| Issue | Impact |
|---|---|
| Workspace = State | Can't have multiple state files per workspace |
| Resource counting | Charged per resource, penalizes environment duplication |
| Proprietary backend | Migrating away requires manual effort |
| No OpenTofu support | Locked to Terraform |
| No Terragrunt support | Can't use advanced IaC patterns |
| Concurrency limits | Blocks parallel development |
| Free tier ending | March 2026 discontinuation |
The Vendor Lock-In Problem
Terraform Cloud uses a proprietary backend. Your state files are stored in HashiCorp's infrastructure, and migrating them out requires:
- Manual state pulls from each workspace
- Reconfiguring backends across all configurations
- Re-importing resources if state becomes inconsistent
With S3/Azure Storage, you own your state files. Migration is a bucket copy.
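A minimal sketch of that migration for a single workspace, assuming a hypothetical `compute-vm-01` workspace and the `company-terraform-state` bucket used elsewhere in this post. The `echo`s make it a dry run that only prints the commands; remove them to execute:

```shell
#!/bin/bash
# Sketch: move one HCP Terraform workspace's state to a path-based S3 key.
# Workspace and bucket names are assumptions for illustration.
WORKSPACE="compute-vm-01"
BUCKET="company-terraform-state"

# Map the workspace name to a path-based key: compute-vm-01 -> compute/vm-01
KEY="$(echo "$WORKSPACE" | sed 's|-|/|')/terraform.tfstate"

echo "terraform state pull > ${WORKSPACE}.tfstate"
echo "aws s3 cp ${WORKSPACE}.tfstate s3://${BUCKET}/${KEY}"
```

Repeat per workspace, then point each component's backend block at its new key.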
The Blast Radius Principle
Blast radius = the extent of damage a single mistake can cause.
Isolation Strategies by Risk Level
| Component Type | Isolation Level | State Path Example |
|---|---|---|
| Production databases | Per-instance | database/prod/postgres-01/terraform.tfstate |
| Stateful resources | Per-instance | storage/prod/bucket-logs/terraform.tfstate |
| Compute clusters | Per-cluster | compute/eks-prod/terraform.tfstate |
| Individual VMs | Per-VM | compute/vm-01/terraform.tfstate |
| Networking | Per-environment | network/prod/terraform.tfstate |
| DNS records | Shared (careful) | dns/terraform.tfstate |
| Dev resources | Per-developer | dev/john/terraform.tfstate |
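The naming conventions in the table can be captured in a small helper. This is a hypothetical sketch (the `state_key` function and its argument order are my own, not a standard tool):

```shell
# Hypothetical helper composing state keys per the table's conventions
state_key() {
  local category="$1" env="$2" instance="$3"
  if [ -n "$instance" ]; then
    # Per-instance isolation (databases, stateful resources)
    echo "${category}/${env}/${instance}/terraform.tfstate"
  else
    # Per-environment isolation (networking)
    echo "${category}/${env}/terraform.tfstate"
  fi
}

state_key database prod postgres-01  # database/prod/postgres-01/terraform.tfstate
state_key network prod               # network/prod/terraform.tfstate
```

Encoding the convention in one place keeps teams from inventing divergent key layouts.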
Stateful vs Stateless Separation
Rule: Stateful resources (databases, storage) that persist data should have their own state files. Destroying them by accident means data loss. Stateless resources (VMs, containers) can be grouped—they're recreatable.
Directory Structure for State Independence
Enterprise Pattern
```
infrastructure/
├── _modules/                  # Shared modules (no state)
│   ├── vpc/
│   ├── ec2/
│   └── rds/
├── network/
│   ├── prod/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── backend.tf         # key = "network/prod/terraform.tfstate"
│   ├── staging/
│   │   └── ...                # key = "network/staging/terraform.tfstate"
│   └── dev/
│       └── ...                # key = "network/dev/terraform.tfstate"
├── compute/
│   ├── vm-web-01/
│   │   └── ...                # key = "compute/vm-web-01/terraform.tfstate"
│   ├── vm-web-02/
│   │   └── ...                # key = "compute/vm-web-02/terraform.tfstate"
│   ├── vm-api-01/
│   │   └── ...                # key = "compute/vm-api-01/terraform.tfstate"
│   └── eks-prod/
│       └── ...                # key = "compute/eks-prod/terraform.tfstate"
├── database/
│   ├── prod/
│   │   ├── postgres-primary/
│   │   │   └── ...            # key = "database/prod/postgres-primary/terraform.tfstate"
│   │   └── postgres-replica/
│   │       └── ...            # key = "database/prod/postgres-replica/terraform.tfstate"
│   └── staging/
│       └── postgres/
│           └── ...            # key = "database/staging/postgres/terraform.tfstate"
└── shared/
    ├── dns/
    │   └── ...                # key = "shared/dns/terraform.tfstate"
    └── iam/
        └── ...                # key = "shared/iam/terraform.tfstate"
```

Dynamic Backend Configuration
The key to reusable Terraform modules is partial backend configuration. Define shared settings in backend.tf, then inject the unique key at init time.
```hcl
# backend.tf (in each component)
terraform {
  backend "s3" {
    # Partial configuration - key injected via CLI
    bucket         = "company-terraform-state"
    region         = "ap-southeast-2"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}
```

```bash
# Initialize with dynamic key
terraform init -backend-config="key=compute/vm-01/terraform.tfstate"
```

Using Variables for Dynamic Keys
For CI/CD pipelines, use environment variables or script parameters to make the key dynamic:
```bash
#!/bin/bash
# deploy.sh - Reusable deployment script

COMPONENT="${1:-compute/vm-01}"  # Default or passed as argument
ENVIRONMENT="${2:-prod}"

# Build the state key dynamically
STATE_KEY="${COMPONENT}/${ENVIRONMENT}/terraform.tfstate"

terraform init \
  -backend-config="key=${STATE_KEY}" \
  -reconfigure

terraform apply -auto-approve
```

Usage:
```bash
# Deploy different components with the same script
./deploy.sh compute/vm-01 prod      # key = compute/vm-01/prod/terraform.tfstate
./deploy.sh compute/vm-02 prod      # key = compute/vm-02/prod/terraform.tfstate
./deploy.sh database/postgres dev   # key = database/postgres/dev/terraform.tfstate
```

Environment-Specific Backend Files
For cleaner separation, use .hcl backend config files:
```hcl
# backends/prod.hcl
bucket         = "company-terraform-state-prod"
region         = "ap-southeast-2"
encrypt        = true
dynamodb_table = "terraform-locks-prod"
```

```hcl
# backends/staging.hcl
bucket         = "company-terraform-state-staging"
region         = "ap-southeast-2"
encrypt        = true
dynamodb_table = "terraform-locks-staging"
```

```bash
# Initialize for prod with dynamic key
terraform init \
  -backend-config=backends/prod.hcl \
  -backend-config="key=compute/vm-01/terraform.tfstate"

# Initialize for staging with the same key structure
terraform init \
  -backend-config=backends/staging.hcl \
  -backend-config="key=compute/vm-01/terraform.tfstate"
```

CI/CD Pattern with Dynamic Keys
```yaml
# .github/workflows/terraform.yml
env:
  TF_STATE_BUCKET: company-terraform-state
  TF_STATE_REGION: ap-southeast-2

jobs:
  deploy:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        component: [compute/vm-01, compute/vm-02, database/prod]
    steps:
      - uses: actions/checkout@v4

      - name: Terraform Init with Dynamic Key
        run: |
          terraform init \
            -backend-config="bucket=${{ env.TF_STATE_BUCKET }}" \
            -backend-config="region=${{ env.TF_STATE_REGION }}" \
            -backend-config="key=${{ matrix.component }}/terraform.tfstate" \
            -backend-config="encrypt=true"

      - name: Terraform Apply
        run: terraform apply -auto-approve
```

Azure Storage Dynamic Keys
The same pattern works with Azure Storage:
```hcl
# backend.tf
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"
    storage_account_name = "stterraformstate"
    container_name       = "tfstate"
    # key injected via -backend-config
  }
}
```

```bash
# Dynamic key injection for Azure
terraform init -backend-config="key=compute/vm-01/terraform.tfstate"
```

Cross-State Dependencies with Remote State
When components need to reference each other, use terraform_remote_state:
```hcl
# compute/vm-01/main.tf

# Read VPC outputs from network state
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "company-terraform-state"
    key    = "network/prod/terraform.tfstate"
    region = "ap-southeast-2"
  }
}

# Use network outputs
resource "aws_instance" "vm" {
  ami           = var.ami_id
  instance_type = var.instance_type

  subnet_id              = data.terraform_remote_state.network.outputs.private_subnet_ids[0]
  vpc_security_group_ids = [data.terraform_remote_state.network.outputs.default_sg_id]

  tags = {
    Name = "vm-01"
  }
}
```

Dependency Graph
Design Your Outputs for Consumers
```hcl
# network/prod/outputs.tf

output "vpc_id" {
  description = "VPC ID for compute and database resources"
  value       = aws_vpc.main.id
}

output "private_subnet_ids" {
  description = "Private subnet IDs for internal resources"
  value       = aws_subnet.private[*].id
}

output "public_subnet_ids" {
  description = "Public subnet IDs for load balancers"
  value       = aws_subnet.public[*].id
}

output "default_sg_id" {
  description = "Default security group allowing internal traffic"
  value       = aws_security_group.default.id
}

output "nat_gateway_ips" {
  description = "NAT Gateway public IPs for allowlisting"
  value       = aws_nat_gateway.main[*].public_ip
}
```

Terragrunt: State Independence at Scale
For large deployments, Terragrunt automates state isolation patterns.
Terragrunt Configuration
```hcl
# terragrunt.hcl (root)
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "company-terraform-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "ap-southeast-2"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}
```

```hcl
# infrastructure/compute/vm-01/terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}

# State will be: compute/vm-01/terraform.tfstate
terraform {
  source = "../../../_modules/ec2"
}

dependency "network" {
  config_path = "../../network/prod"
}

inputs = {
  instance_name = "vm-01"
  instance_type = "t3.medium"
  vpc_id        = dependency.network.outputs.vpc_id
  subnet_id     = dependency.network.outputs.private_subnet_ids[0]
}
```

Run-All for Coordinated Deployments
```bash
# Deploy all components in dependency order
terragrunt run-all apply

# Plan across all components
terragrunt run-all plan

# Destroy in reverse dependency order
terragrunt run-all destroy
```

State Locking: Native S3 vs DynamoDB
As of Terraform 1.10, S3 supports native state locking without DynamoDB:
```hcl
terraform {
  backend "s3" {
    bucket       = "company-terraform-state"
    key          = "compute/vm-01/terraform.tfstate"
    region       = "ap-southeast-2"
    encrypt      = true
    use_lockfile = true # Native S3 locking (Terraform 1.10+)
  }
}
```

Comparison
| Feature | DynamoDB Locking | Native S3 Locking |
|---|---|---|
| Extra resource | Yes (DynamoDB table) | No |
| Cost | ~$1/month + read/write | Included in S3 |
| Setup complexity | Medium | Low |
| Terraform version | All versions | 1.10+ |
| OpenTofu version | All versions | 1.8+ |
CI/CD Integration
GitHub Actions with State Isolation
```yaml
# .github/workflows/terraform.yml
name: Terraform Deploy

on:
  push:
    paths:
      - 'infrastructure/**'

jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.changes.outputs.matrix }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2 # HEAD~1 must exist for the diff below
      - id: changes
        run: |
          # Detect which components changed
          changed_dirs=$(git diff --name-only HEAD~1 | grep '^infrastructure/' | cut -d'/' -f1-3 | sort -u)
          matrix=$(echo "$changed_dirs" | jq -R -s -c 'split("\n") | map(select(length > 0))')
          echo "matrix=$matrix" >> $GITHUB_OUTPUT

  terraform:
    needs: detect-changes
    runs-on: ubuntu-latest
    strategy:
      matrix:
        component: ${{ fromJson(needs.detect-changes.outputs.matrix) }}
      fail-fast: false # Don't fail all if one component fails
    steps:
      - uses: actions/checkout@v4

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Terraform Init
        working-directory: ${{ matrix.component }}
        run: terraform init

      - name: Terraform Plan
        working-directory: ${{ matrix.component }}
        run: terraform plan -out=tfplan

      - name: Terraform Apply
        if: github.ref == 'refs/heads/main'
        working-directory: ${{ matrix.component }}
        run: terraform apply -auto-approve tfplan
```

Parallel Deployments
With state isolation, independent components deploy in parallel: each changed component runs as its own matrix job, holding only its own state lock.
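The detect-changes filtering from the workflow above can be exercised offline. This sketch feeds it a hand-written file list instead of `git diff`, and skips the `jq` JSON step to stay dependency-free:

```shell
# Simulated `git diff --name-only` output (hand-written for illustration)
changed_files="infrastructure/compute/vm-01/main.tf
infrastructure/compute/vm-01/variables.tf
infrastructure/database/prod/main.tf
docs/README.md"

# Same filtering as the workflow: keep infrastructure/ paths,
# truncate to component depth, de-duplicate
changed_dirs=$(echo "$changed_files" | grep '^infrastructure/' | cut -d'/' -f1-3 | sort -u)
echo "$changed_dirs"
# prints:
# infrastructure/compute/vm-01
# infrastructure/database/prod
```

Only the two touched components become matrix jobs; everything else is left alone.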
Benefits Summary
| Benefit | Monolithic State | Path-Based Isolation |
|---|---|---|
| Blast radius | Entire infrastructure | Single component |
| Concurrent work | Blocked by locks | Parallel deploys |
| State file size | Large, slow | Small, fast |
| Access control | All or nothing | Per-component IAM |
| Recovery | Complex | Simple per-component |
| Team autonomy | Limited | High |
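The "per-component IAM" row can be implemented with key-prefix-scoped policies. A hedged sketch in Terraform follows; the policy name, bucket ARN, and the minimal action list are assumptions for illustration, not a drop-in policy:

```hcl
# Sketch: let a compute team touch only state under compute/*
resource "aws_iam_policy" "compute_team_state" {
  name = "compute-team-terraform-state"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject", "s3:PutObject"]
      Resource = "arn:aws:s3:::company-terraform-state/compute/*"
    }]
  })
}
```

A monolithic state file cannot be partitioned this way: anyone who can read it can read all of it.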
Best Practices Checklist
- One state per logical component — VMs, databases, networks separate
- Stateful resources isolated — Databases get their own state
- Path structure mirrors org structure — Easy to understand
- Use native S3/Azure locking — Eliminate DynamoDB dependency
- Design outputs for consumers — Clean remote_state interface
- Consider Terragrunt — For 10+ components
- Avoid Terraform Cloud — For complex resource isolation needs
- CI/CD per-component — Parallel, isolated pipelines
When Terraform Cloud Makes Sense
Despite limitations, Terraform Cloud works for:
- Small teams (< 500 resources)
- Simple architectures (few state files)
- Teams valuing managed experience over flexibility
- Organizations already invested in HashiCorp ecosystem
But for enterprise deployments with hundreds of resources, multiple teams, and granular isolation needs—path-based state isolation on S3/Azure Storage is superior.
Brisbane Infrastructure Consulting
At Buun Group, we help organizations implement Terraform state strategies:
- State architecture design — Path patterns for your org structure
- Migration from Terraform Cloud — Move to self-managed backends
- Terragrunt implementation — Automated state isolation at scale
- CI/CD integration — Per-component deployment pipelines
We've managed Terraform at scale. We know the patterns that work.
Need Terraform architecture help?