Reducing East-West Traffic in AWS: Hidden Networking Costs in Kubernetes and Microservice Architectures
Everything looked healthy.
- →CPU utilization was normal
- →Application latency seemed acceptable
- →Infrastructure scaled correctly
- →No major production incidents occurred
But AWS networking costs kept increasing every month.
The issue was not internet traffic.
It was uncontrolled east-west traffic inside the cloud environment.
This problem becomes increasingly common in modern architectures using:
- →Kubernetes
- →EKS
- →microservices
- →service meshes
- →centralized logging
- →distributed monitoring systems
Most teams carefully monitor:
- →Internet ingress traffic
- →Load balancer usage
- →NAT Gateway costs
- →EC2 utilization
Very few teams actively analyze how services communicate internally.
That becomes expensive at scale.
What Is East-West Traffic?
Cloud traffic generally falls into two categories.
North-South Traffic
Traffic entering or leaving the environment:
Internet ↔ Application
Examples:
- →User requests
- →Public APIs
- →External integrations
- →CDN traffic
East-West Traffic
Traffic flowing internally between systems:
Service ↔ Service
Pod ↔ Pod
Node ↔ Node
VPC ↔ VPC
Examples:
- →Microservice API calls
- →Kubernetes pod communication
- →Database requests
- →Logging pipelines
- →Monitoring agents
- →Service discovery traffic
This traffic is often invisible until environments become large.
Why East-West Traffic Becomes a Problem
Modern cloud environments generate massive amounts of internal communication.
Especially when using:
- →distributed microservices
- →EKS clusters
- →multi-AZ architectures
- →centralized observability platforms
Initially everything works fine.
But over time:
- →Cross-AZ traffic costs increase
- →Latency gradually rises
- →API retries become more frequent
- →Kubernetes networking overhead grows
- →Troubleshooting becomes difficult
The dangerous part is that most of this traffic is considered “internal,” so teams assume it is free.
That assumption is incorrect.
The AWS Pricing Reality Many Teams Miss
Many engineers believe internal AWS traffic costs nothing.
That is only partially true.
Generally Free
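- →Traffic between resources in the same Availability Zone over private IP addresses
- →Communication between pods on the same node
Charged
- →Cross-AZ traffic, even inside a single VPC (commonly $0.01 per GB in each direction; check current pricing for your region)
- →Inter-region traffic
- →NAT Gateway data processing
- →Transit Gateway data processing
- →VPC peering traffic that crosses AZs or regions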
In Kubernetes environments, pods are frequently distributed across multiple AZs automatically.
That creates hidden networking costs very quickly.
How Kubernetes Amplifies East-West Traffic
Kubernetes improves scalability and availability.
But it also increases internal network communication significantly.
Typical EKS traffic includes:
- →pod-to-pod communication
- →DNS lookups against CoreDNS (exposed as the kube-dns Service)
- →ingress controller traffic
- →service mesh sidecars
- →Prometheus scraping
- →Fluent Bit log forwarding
- →readiness/liveness probes
Now combine this with multi-AZ scheduling.
Example:
Frontend Pod → AZ-a
Backend Pod → AZ-c
Redis → AZ-b
Every request crosses Availability Zones.
Now multiply that by:
- →thousands of requests per second
- →retries
- →monitoring traffic
- →health checks
- →service mesh overhead
The result becomes expensive surprisingly fast.
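To put a rough number on that multiplication, here is a minimal back-of-the-envelope sketch. The request rate, payload size, and cross-AZ fraction are hypothetical inputs, and the default price of $0.01 per GB in each direction is an assumption that should be checked against current AWS pricing for your region.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, an approximation


def monthly_cross_az_cost(requests_per_second, bytes_per_request,
                          cross_az_fraction,
                          price_per_gb_each_direction=0.01):
    """Estimate monthly cross-AZ transfer cost in USD.

    bytes_per_request should include both the request and the response.
    price_per_gb_each_direction is an assumed list price, not a quoted one.
    """
    total_bytes = requests_per_second * bytes_per_request * SECONDS_PER_MONTH
    cross_az_gb = total_bytes * cross_az_fraction / 1e9
    # Cross-AZ transfer is billed on both sides: out of one AZ, into the other.
    return cross_az_gb * price_per_gb_each_direction * 2


# Hypothetical workload: 5,000 req/s, 20 KB per round trip, 2/3 crossing AZs.
estimate = monthly_cross_az_cost(5000, 20_000, 2 / 3)
print(f"Estimated cross-AZ cost: ${estimate:,.2f} per month")
```

With these made-up inputs the estimate comes out at roughly 3,456 USD per month for a single hop, before retries, health checks, or mesh telemetry are counted.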
Real Production Symptoms
The first signs are usually indirect.
We initially noticed:
- →rising NAT Gateway costs
- →unexpected inter-AZ transfer charges
- →occasional latency spikes
- →elevated Kubernetes networking metrics
Infrastructure scaling appeared healthy.
No obvious production failures existed.
But VPC Flow Logs showed services communicating excessively across Availability Zones.
The architecture looked clean on diagrams.
The actual traffic flow was not.
Common Architectural Problems
1. Excessively Chatty Microservices
Many microservice environments become overloaded with synchronous API calls.
Example:
Frontend → API Gateway → Auth Service → User Service → Billing Service → Notification Service
One user request may trigger:
- →dozens of internal API calls
- →retries
- →duplicate queries
- →monitoring events
At scale, this creates substantial east-west traffic.
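The amplification can be estimated from a call graph. The sketch below is illustrative only: the service names, the calls-per-request counts, and the retry rate are invented for the example, and the graph is assumed to have no cycles.

```python
def internal_calls(graph, service, retry_rate=0.0):
    """Expected downstream calls triggered by one call into `service`.

    graph maps a service to {downstream_service: calls_per_request}.
    retry_rate is the average number of extra attempts per logical call.
    Assumes the graph is acyclic.
    """
    total = 0.0
    for downstream, calls_per_request in graph.get(service, {}).items():
        # Every attempt, including retries, reaches the downstream service
        # and triggers that service's own downstream calls again.
        attempts = calls_per_request * (1 + retry_rate)
        total += attempts * (1 + internal_calls(graph, downstream, retry_rate))
    return total


# Hypothetical call graph loosely based on the chain above.
CALL_GRAPH = {
    "frontend": {"gateway": 1},
    "gateway": {"auth": 1, "user": 2},
    "user": {"billing": 1},
    "billing": {"notification": 1},
}

print(internal_calls(CALL_GRAPH, "frontend"))                  # no retries
print(internal_calls(CALL_GRAPH, "frontend", retry_rate=0.1))  # 10% retries
```

In this toy graph one user request produces 8 internal calls without retries, and a 10% retry rate at every hop pushes that above 10, because retries near the top of the chain replay everything beneath them.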
2. Random Kubernetes Pod Scheduling
By default, the Kubernetes scheduler tries to spread a workload's pods across nodes and AZs, but it does not take into account which services talk to each other.
This improves availability.
But tightly coupled services often communicate constantly.
If pods are distributed randomly:
Frontend → Backend → Cache
may cross multiple Availability Zones repeatedly.
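A simple probability model shows how often this happens. The sketch assumes pods are spread evenly and independently across zones and that callers pick any endpoint at random, which is a simplification of real scheduling and load balancing.

```python
from fractions import Fraction


def cross_zone_probability(zones):
    """Chance that a single call lands in a different zone than the caller."""
    return 1 - Fraction(1, zones)


def expected_cross_zone_hops(hops, zones):
    """Average number of zone-crossing hops in a chain of `hops` calls."""
    return hops * cross_zone_probability(zones)


def probability_fully_local(hops, zones):
    """Chance that every hop in the chain stays inside one zone."""
    return Fraction(1, zones) ** hops


# Frontend -> Backend -> Cache is two hops; assume three zones.
print(cross_zone_probability(3))       # per-hop crossing probability
print(expected_cross_zone_hops(2, 3))  # expected crossings per request
print(probability_fully_local(2, 3))   # chance the request stays in one zone
```

Under these assumptions, with three zones each hop crosses a zone boundary two times out of three, and only one request in nine completes the two-hop chain without leaving its zone.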
3. Centralized Shared Services
Many organizations deploy:
- →centralized logging
- →authentication
- →monitoring
- →shared databases
and allow every environment to access them continuously.
This creates excessive cross-VPC and cross-AZ traffic.
4. Monitoring and Logging Overhead
Observability systems themselves generate large amounts of internal traffic.
Common examples:
- →Prometheus scraping every few seconds
- →Fluent Bit forwarding logs continuously
- →Datadog agents collecting metrics
- →distributed tracing systems
In large EKS environments, monitoring traffic alone can become significant.
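A quick estimate makes the scale concrete. The target count, payload size, and scrape interval below are hypothetical, and real scrape payloads vary widely and are often compressed on the wire.

```python
def scrape_bytes_per_day(targets, bytes_per_scrape, interval_seconds):
    """Daily bytes pulled by a metrics scraper across all targets."""
    scrapes_per_day = 86_400 / interval_seconds
    return targets * bytes_per_scrape * scrapes_per_day


# Hypothetical cluster: 2,000 targets, 50 KB per scrape.
for interval in (15, 30, 60):
    gb = scrape_bytes_per_day(2000, 50_000, interval) / 1e9
    print(f"{interval:>2}s interval: {gb:.0f} GB/day")
```

With these inputs a 15-second interval moves about 576 GB per day, and doubling the interval halves it. Whether that traffic is billed depends on how much of it crosses zones.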
Visualizing the Problem
Inefficient Architecture
Frontend (AZ-a)
↓
Backend API (AZ-c)
↓
Redis Cache (AZ-b)
↓
Database (AZ-c)
Every request crosses multiple Availability Zones.
Optimized Architecture
Frontend + Backend + Cache aligned within same AZ
AZ-a:
Frontend-a → Backend-a → Redis-a
AZ-b:
Frontend-b → Backend-b → Redis-b
This reduces:
- →latency
- →inter-AZ transfer cost
- →unnecessary network hops
while maintaining high availability.
Using Topology-Aware Routing in Kubernetes
Kubernetes supports topology-aware routing.
This helps keep traffic closer to local nodes and Availability Zones.
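Example, set as an annotation on the Service:
service.kubernetes.io/topology-mode: Auto
Clusters older than Kubernetes 1.27 use the service.kubernetes.io/topology-aware-hints: auto annotation instead. The topologyKeys Service field that older guides recommend was deprecated in Kubernetes 1.21 and removed in 1.22, so it has no effect on current clusters.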
This improves:
- →request locality
- →response latency
- →cross-zone traffic optimization
Many teams never enable this feature.
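Conceptually, zone-preferring routing behaves like the sketch below. This is a simplified illustration, not the algorithm Kubernetes implements: the real feature works through hints on EndpointSlices and includes safeguards, such as falling back to cluster-wide routing when a zone has too few endpoints. The addresses and zone names are made up.

```python
import random


def pick_endpoint(endpoints, client_zone, rng=random):
    """Pick an endpoint, preferring those in the caller's zone.

    endpoints is a list of (address, zone) tuples. If no endpoint exists in
    client_zone, fall back to the full list so the call still succeeds.
    """
    local = [ep for ep in endpoints if ep[1] == client_zone]
    candidates = local or endpoints
    return rng.choice(candidates)


# Hypothetical backend endpoints spread over two zones.
ENDPOINTS = [
    ("10.0.1.10", "az-a"),
    ("10.0.1.11", "az-a"),
    ("10.0.2.20", "az-b"),
]

print(pick_endpoint(ENDPOINTS, "az-a"))  # always one of the az-a endpoints
print(pick_endpoint(ENDPOINTS, "az-c"))  # no local endpoint: any of the three
```

The fallback matters: keeping traffic local is a cost optimization, so it should never turn a zone without healthy endpoints into an outage.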
Using Pod Affinity for Chatty Services
For tightly coupled workloads, pod placement matters significantly.
Example:
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - backend
      topologyKey: topology.kubernetes.io/zone
This schedules the pod only into a zone that already runs a backend pod. Because the rule is required, the pod stays Pending if no such zone has capacity.
Useful for:
- →frontend/backend pairs
- →cache-heavy applications
- →internal APIs with high request volume
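The rule shown above is a hard requirement. A softer variant uses preferredDuringSchedulingIgnoredDuringExecution, which lets the scheduler fall back to another zone when the preferred one is full. The sketch below builds that fragment as a Python dictionary and prints it as JSON (which is also valid YAML). The helper name and the default weight are illustrative choices, not part of any Kubernetes client library.

```python
import json


def preferred_zone_affinity(app_label, weight=100):
    """Build a soft pod-affinity fragment that favors the zone of `app_label`.

    Kubernetes accepts weights from 1 to 100 for preferred terms.
    """
    if not 1 <= weight <= 100:
        raise ValueError("weight must be between 1 and 100")
    return {
        "podAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,
                    "podAffinityTerm": {
                        "labelSelector": {
                            "matchExpressions": [
                                {"key": "app", "operator": "In",
                                 "values": [app_label]}
                            ]
                        },
                        "topologyKey": "topology.kubernetes.io/zone",
                    },
                }
            ]
        }
    }


print(json.dumps({"affinity": preferred_zone_affinity("backend")}, indent=2))
```

The trade-off is predictability: a preferred rule keeps pods schedulable during a zone shortage, but some replicas may end up away from the service they call.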
Reducing Synchronous Communication
One of the biggest causes of east-west traffic is long chains of synchronous API calls.
Example:
Service A → Service B → Service C → Service D
Failures cascade quickly.
Latency accumulates across every hop.
Instead, asynchronous communication often works better.
Examples:
- →Amazon SQS
- →Kafka
- →EventBridge
- →SNS
Benefits:
- →lower retry traffic
- →reduced service coupling
- →better scalability
- →improved resilience
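One reason queues reduce traffic is that consumers can process events in batches instead of making one network call per event. The sketch below uses Python's in-process queue.Queue purely as a stand-in for a broker such as SQS or Kafka; the function name and batch size are hypothetical, and a real consumer would use the broker's own client.

```python
import queue


def drain_in_batches(events, batch_size, handle_batch):
    """Drain `events`, calling handle_batch once per batch.

    Returns the number of handle_batch calls, which stands in for the number
    of downstream network calls the consumer would make.
    """
    calls = 0
    batch = []
    while True:
        try:
            batch.append(events.get_nowait())
        except queue.Empty:
            break
        if len(batch) == batch_size:
            handle_batch(batch)
            calls += 1
            batch = []
    if batch:  # flush the final partial batch
        handle_batch(batch)
        calls += 1
    return calls


# The producer enqueues and moves on; it never waits for the consumer.
events = queue.Queue()
for i in range(25):
    events.put({"order_id": i})

received = []
calls = drain_in_batches(events, batch_size=10, handle_batch=received.append)
print(f"{calls} downstream calls for 25 events")
```

Here 25 events result in 3 downstream calls instead of 25, and the producer is unaffected if the consumer is slow or briefly unavailable.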
Using Caching Effectively
Poor caching strategies increase unnecessary internal communication.
Common issues include:
- →repeated database queries
- →duplicate API requests
- →cache misses across zones
Using Redis or application-level caching can cut this traffic substantially, provided the cache itself sits in the same zone as its callers.
Especially for:
- →session data
- →authentication tokens
- →frequently requested metadata
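As an illustration of application-level caching, here is a minimal in-process TTL cache. It is a sketch rather than a production cache: there is no size limit, no eviction beyond expiry, and no thread safety, and the loader function stands in for whatever remote call you want to avoid repeating.

```python
import time


class TTLCache:
    """Cache loader results per key for ttl_seconds."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # key -> (value, expires_at)
        self.loader_calls = 0        # counts trips to the backing service

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]          # fresh hit: no network call needed
        value = self._loader(key)    # miss or expired: call the backend
        self.loader_calls += 1
        self._entries[key] = (value, now + self._ttl)
        return value


# Hypothetical loader standing in for a metadata service call.
def fetch_metadata(key):
    return {"key": key, "plan": "standard"}


cache = TTLCache(fetch_metadata, ttl_seconds=30)
for _ in range(1000):
    cache.get("tenant-42")
print(f"backend calls: {cache.loader_calls}")
```

A thousand lookups for the same key reach the backend once within the TTL window. The right TTL depends on how stale the data is allowed to be, which matters most for authentication tokens.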
Reviewing VPC Flow Logs
One of the best ways to identify hidden east-west traffic is through VPC Flow Logs.
Many teams enable flow logs but rarely analyze them properly.
Useful analysis targets:
- →top communicating services
- →unexpected cross-AZ traffic
- →noisy workloads
- →repeated retries
- →excessive internal bandwidth usage
CloudWatch Logs Insights example:
fields srcAddr, dstAddr, bytes
| stats sum(bytes) as totalBytes by srcAddr, dstAddr
| sort totalBytes desc
| limit 20
This quickly identifies high-volume internal communication paths.
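The query groups by IP pair, so it does not say which flows crossed zones. One way to get there offline is to map addresses to zones by subnet, as in the sketch below. It assumes flow logs in the default (version 2) space-separated format and a hand-maintained subnet-to-zone table; the CIDRs, zone names, and sample records are invented. Note that a flow can be recorded at both the source and destination network interfaces, so totals from a full VPC log may count the same bytes twice.

```python
import ipaddress
from collections import Counter


def zone_of(ip, subnet_zones):
    """Return the zone whose subnet contains `ip`, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for cidr, zone in subnet_zones.items():
        if addr in ipaddress.ip_network(cidr):
            return zone
    return None


def cross_az_bytes(lines, subnet_zones):
    """Sum bytes per (source zone, destination zone) for cross-zone flows."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        # Default format has 14 fields; skip headers and short lines.
        if len(fields) < 14 or fields[0] == "version":
            continue
        src, dst, nbytes = fields[3], fields[4], fields[9]
        if not nbytes.isdigit():  # NODATA and SKIPDATA records carry "-"
            continue
        src_zone = zone_of(src, subnet_zones)
        dst_zone = zone_of(dst, subnet_zones)
        if src_zone and dst_zone and src_zone != dst_zone:
            totals[(src_zone, dst_zone)] += int(nbytes)
    return totals


# Hypothetical subnet layout and sample records.
SUBNET_ZONES = {"10.0.1.0/24": "az-a", "10.0.2.0/24": "az-b"}
SAMPLE = [
    "2 123456789012 eni-0a1 10.0.1.10 10.0.2.20 44321 6379 6 10 8000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1 10.0.1.10 10.0.1.30 44322 8080 6 5 4000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1 - - - - - - - 1700000000 1700000060 - NODATA",
]

for (src_zone, dst_zone), total in cross_az_bytes(SAMPLE, SUBNET_ZONES).items():
    print(f"{src_zone} -> {dst_zone}: {total} bytes")
```

In the sample, only the first record crosses zones; the second stays inside az-a and the third is a NODATA record that is skipped.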
Monitoring Inter-AZ Traffic Separately
Many teams only monitor total bandwidth.
That hides the real problem.
Monitor separately:
- →inter-AZ transfer
- →inter-region transfer
- →NAT Gateway traffic
- →Transit Gateway traffic
- →pod-to-pod communication patterns
Without visibility, optimization becomes impossible.
Service Mesh Overhead
Service meshes improve observability and security.
But they also introduce additional east-west traffic.
Examples:
- →Envoy sidecar communication
- →mTLS encryption overhead
- →telemetry export traffic
- →tracing pipelines
In large environments, service mesh traffic itself becomes significant.
This does not mean service meshes are bad.
It means traffic overhead must be measured carefully.
Operational Improvements We Achieved
After optimizing service placement and reducing unnecessary east-west traffic:
| Metric | Result |
|---|---|
| Inter-AZ transfer costs | Reduced by ~35% |
| API latency | Improved by ~18% |
| NAT Gateway utilization | Reduced noticeably |
| Kubernetes network overhead | Reduced during peak traffic |
| Troubleshooting complexity | Improved with traffic visibility |
The biggest improvement was visibility.
Once internal communication patterns became measurable, optimization opportunities became obvious.
Lessons Learned
The most important realization was:
Internal cloud traffic is not automatically free or efficient.
As architectures become more distributed:
- →networking complexity increases
- →hidden costs accumulate
- →latency compounds gradually
Another important lesson:
Kubernetes scalability can unintentionally amplify inefficient traffic patterns.
Many performance and cost issues were architectural — not infrastructure-related.
Final Recommendations
If you operate Kubernetes or microservice environments on AWS:
Do:
- →analyze east-west traffic regularly
- →monitor inter-AZ transfer separately
- →use topology-aware routing
- →colocate tightly coupled services
- →implement effective caching
- →reduce synchronous service chains
- →review VPC Flow Logs frequently
Avoid:
- →excessive cross-zone communication
- →overly chatty internal APIs
- →uncontrolled observability traffic
- →unnecessary service mesh complexity
- →blindly scaling distributed systems
In cloud environments, moving data internally still has a cost.
At scale, optimizing east-west traffic becomes both a performance improvement and a financial optimization strategy.
Written by Lavi Singodiya · May 14, 2026