Terraform can provision your entire cloud infrastructure in minutes — or it can be the source of your worst production incidents. After using it to manage enterprise AWS and Azure environments, here are the practices that made the difference.
1. Remote State is Non-Negotiable
Never use local state in a team environment. The moment two engineers run terraform apply against local state files, you have a split-brain situation.
# backend.tf
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "prod/networking/terraform.tfstate"
    region         = "ap-south-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock" # prevents concurrent applies
  }
}
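The DynamoDB lock table is critical: it prevents two engineers from running apply simultaneously. Without it, two concurrent applies can overwrite each other's state, and that race tends to show up at the worst possible time. Newer Terraform releases (1.10 and later) also support S3-native locking through the backend's use_lockfile argument, so check which mechanism your version recommends.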
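As a rough sketch of what has to exist before that backend block can work, the bucket and lock table might be bootstrapped from a small separate configuration like the one below. The resource labels ("state", "lock") are illustrative and the names simply reuse the values from backend.tf. The one hard requirement is that the lock table has a string partition key named LockID.

```hcl
# bootstrap/main.tf (hypothetical location, applied once with local state)
resource "aws_s3_bucket" "state" {
  bucket = "company-terraform-state"
}

# Versioning lets you roll back to an earlier state file after a bad apply
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id
  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend expects a partition key called "LockID" of type string
resource "aws_dynamodb_table" "lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```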
2. Workspace-per-Environment (with Caveats)
Terraform workspaces let you use the same code for multiple environments:
locals {
  env = terraform.workspace # "staging" or "production"
  config = {
    staging = {
      instance_type = "t3.medium"
      min_capacity  = 1
      max_capacity  = 3
    }
    production = {
      instance_type = "r6i.xlarge"
      min_capacity  = 3
      max_capacity  = 20
    }
  }
  current = local.config[local.env]
}
The caveat: all workspaces share one backend configuration, which means one bucket and one set of credentials. For true environment isolation (separate AWS accounts), use separate state files and Terragrunt. For a single-account setup, workspaces work fine.
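If you go the separate-state-files route without Terragrunt, one possible shape is a partial backend configuration whose values are supplied per environment at init time. This is only a sketch: the file name and the production bucket name below are assumptions made for illustration.

```hcl
# backend.tf: partial configuration, the details are passed in at init time
terraform {
  backend "s3" {}
}

# envs/production.s3.tfbackend (hypothetical file, one per environment)
# bucket         = "company-prod-terraform-state"   # assumed bucket in the production account
# key            = "networking/terraform.tfstate"
# region         = "ap-south-1"
# encrypt        = true
# dynamodb_table = "terraform-state-lock"

# Selected with:
#   terraform init -backend-config=envs/production.s3.tfbackend
```

Each environment then gets its own bucket and credentials, at the cost of having to re-run init when switching between them.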
3. Module Design: Keep Them Small and Focused
The temptation is to build a giant "vpc" module that does everything. Resist it.
modules/
├── networking/
│   ├── vpc/              # just the VPC, subnets, IGW
│   ├── security-groups/
│   └── vpc-peering/
├── compute/
│   ├── ec2/
│   ├── asg/              # Auto Scaling Group
│   └── alb/
├── database/
│   ├── rds/
│   └── elasticache/
└── observability/
    ├── cloudwatch/
    └── sns-alarms/
Small modules are testable, reusable, and composable. A 2000-line "infrastructure" module is a maintenance nightmare.
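To make "composable" concrete, here is a minimal sketch of a root configuration wiring three of those modules together. The input and output names (cidr_block, vpc_id, private_subnet_ids, app_sg_id and so on) are hypothetical, since the module interfaces are not shown in this post.

```hcl
module "vpc" {
  source     = "./modules/networking/vpc"
  cidr_block = "10.0.0.0/16" # hypothetical input
}

module "security_groups" {
  source = "./modules/networking/security-groups"
  vpc_id = module.vpc.vpc_id # hypothetical output of the vpc module
}

module "asg" {
  source             = "./modules/compute/asg"
  subnet_ids         = module.vpc.private_subnet_ids     # hypothetical output
  security_group_ids = [module.security_groups.app_sg_id] # hypothetical output
  instance_type      = "t3.medium"
}
```

Each module only knows about its own inputs, so any one of them can be swapped or tested without touching the others.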
4. Use locals to Reduce Repetition
Every resource tagging strategy ends up with the same 6 tags. Don't repeat them:
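# timestamp() is re-evaluated on every run, so using it directly in a tag makes
# every plan show a change. time_static (hashicorp/time provider) records the
# time on first apply and keeps it stable in state afterwards.
resource "time_static" "created" {}
locals {
  common_tags = {
    Environment = var.environment
    Project     = var.project_name
    ManagedBy   = "terraform"
    Owner       = var.team_email
    CostCenter  = var.cost_center
    CreatedAt   = time_static.created.rfc3339
  }
}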
resource "aws_instance" "app" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = local.current.instance_type
  tags = merge(local.common_tags, {
    Name = "${var.project_name}-${var.environment}-app"
    Role = "application"
  })
}
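A related option, shown here only as a sketch, is the AWS provider's default_tags block, which applies a tag map to every taggable resource the provider creates. It assumes the local.common_tags map from above. Per-resource tags still work and are merged on top.

```hcl
provider "aws" {
  region = "ap-south-1"

  # Applied to every resource created through this provider
  default_tags {
    tags = local.common_tags
  }
}

resource "aws_instance" "app" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = local.current.instance_type

  # Only the resource-specific tags need to be listed here
  tags = {
    Name = "${var.project_name}-${var.environment}-app"
    Role = "application"
  }
}
```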
5. Secrets: Never in .tfvars
I've seen .tfvars files with database passwords committed to Git. Don't be that team.
Pattern 1: AWS Secrets Manager + data source
data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/rds/master-password"
}
resource "aws_db_instance" "main" {
  password = jsondecode(data.aws_secretsmanager_secret_version.db.secret_string)["password"]
}
Pattern 2: Environment variables for provider credentials
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
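Never put AWS credentials in your Terraform files. Also keep in mind that Pattern 1 still writes the resolved password into the state file, so the state bucket must stay encrypted and tightly access-controlled.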
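When a secret has to come from outside AWS (a CI secret store, for example), one common variation is a sensitive input variable fed through a TF_VAR_ environment variable rather than a .tfvars file. The variable name db_password is an assumption for this sketch.

```hcl
variable "db_password" {
  description = "RDS master password, supplied via the TF_VAR_db_password environment variable"
  type        = string
  sensitive   = true # redacts the value in plan and apply output
}

resource "aws_db_instance" "main" {
  # other arguments omitted for brevity
  password = var.db_password
}
```

Marking the variable sensitive only hides it from CLI output. The value is still stored in state, so the same state-protection rules apply.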
6. The Drift Detection Habit
Before every apply in production, run plan and review it carefully. We automated this into CI:
# GitHub Actions
- name: Terraform Plan
  run: |
    terraform plan -out=tfplan -detailed-exitcode
    # Exit code 2 = changes detected
    # Exit code 0 = no changes
    # Exit code 1 = error
    # Note: any non-zero code, including 2, fails this step by default
A PR that only touches documentation shouldn't have Terraform changes. If your plan shows unexpected changes, stop and investigate before applying.
7. terraform_remote_state Over Hard-coded Values
Avoid hardcoding resource IDs when one configuration depends on another. Read them from the other configuration's remote state outputs instead:
# In the compute module, referencing networking outputs
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "company-terraform-state"
    key    = "prod/networking/terraform.tfstate"
    region = "ap-south-1"
  }
}
resource "aws_instance" "app" {
  subnet_id              = data.terraform_remote_state.networking.outputs.private_subnet_ids[0]
  vpc_security_group_ids = [data.terraform_remote_state.networking.outputs.app_sg_id]
}
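For this lookup to work, the networking configuration has to publish those values as root-level outputs, because terraform_remote_state only exposes outputs of the root module. A sketch of the producing side follows; the resource addresses (aws_subnet.private, aws_security_group.app) are assumptions, since the networking code is not shown here.

```hcl
# outputs.tf in the networking configuration (the one stored at
# prod/networking/terraform.tfstate)
output "private_subnet_ids" {
  description = "IDs of the private subnets, consumed by the compute configuration"
  value       = aws_subnet.private[*].id # assumes the subnets are created with count
}

output "app_sg_id" {
  description = "Security group ID for application instances"
  value       = aws_security_group.app.id # assumed resource name
}
```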
The Result: Hours → 15 Minutes
Before Terraform, standing up a new client environment (VPC, subnets, security groups, EC2, RDS, ALB) took 3-4 hours of console clicking and was error-prone. With the module library established, a terraform apply runs in under 15 minutes with zero manual steps.
The investment in module design and remote state setup pays for itself on the third environment provisioned.
Quick Reference Checklist
- ✅ Remote state with S3 + DynamoDB lock
- ✅ Separate state files per environment/component
- ✅ Small, focused modules
- ✅ locals for common tags and computed values
- ✅ Secrets from AWS Secrets Manager, never in tfvars
- ✅ terraform plan reviewed in every CI pipeline
- ✅ terraform_remote_state for cross-module references
- ✅ Version constraints on providers and modules
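The last checklist item is not covered in the sections above, so here is a minimal sketch of what version constraints can look like. The specific version numbers are illustrative rather than a recommendation, and the registry module is just an example of pinning a third-party source.

```hcl
terraform {
  required_version = ">= 1.5.0" # illustrative minimum CLI version

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # allows 5.x, blocks the next major release
    }
  }
}

# Registry modules take a version argument as well
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0" # illustrative constraint
}
```

Committing the generated .terraform.lock.hcl file alongside this keeps provider versions identical across laptops and CI.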
// Written by Lavi Singodiya · February 10, 2026