Reducing East-West Traffic in AWS: Hidden Networking Costs in Kubernetes and Microservice Architectures

Everything looked healthy.

→CPU utilization was normal
→Application latency seemed acceptable
→Infrastructure scaled correctly
→No major production incidents occurred

But AWS networking costs kept increasing every month.

The issue was not internet traffic.

It was uncontrolled east-west traffic inside the cloud environment.

This problem is increasingly common in modern architectures that use:

→Kubernetes
→EKS
→microservices
→service meshes
→centralized logging
→distributed monitoring systems

Most teams carefully monitor:

→Internet ingress traffic
→Load balancer usage
→NAT Gateway costs
→EC2 utilization

Very few teams actively analyze how services communicate internally.

That becomes expensive at scale.

What Is East-West Traffic?

Cloud traffic generally falls into two categories.

North-South Traffic

Traffic entering or leaving the environment:

Internet ↔ Application

Examples:

→User requests
→Public APIs
→External integrations
→CDN traffic

East-West Traffic

Traffic flowing internally between systems:

Service ↔ Service
Pod ↔ Pod
Node ↔ Node
VPC ↔ VPC

Examples:

→Microservice API calls
→Kubernetes pod communication
→Database requests
→Logging pipelines
→Monitoring agents
→Service discovery traffic

This traffic is often invisible until environments become large.

Why East-West Traffic Becomes a Problem

Modern cloud environments generate massive amounts of internal communication.

Especially when using:

→distributed microservices
→EKS clusters
→multi-AZ architectures
→centralized observability platforms

Initially everything works fine.

But over time:

→Cross-AZ traffic costs increase
→Latency gradually rises
→API retries become more frequent
→Kubernetes networking overhead grows
→Troubleshooting becomes difficult

The dangerous part is that most of this traffic is considered “internal,” so teams assume it is free.

That assumption is incorrect.

The AWS Pricing Reality Many Teams Miss

Many engineers believe internal AWS traffic costs nothing.

That is only partially true.

Generally Free

→Traffic within the same Availability Zone over private IP addresses
→Communication between pods on the same node

Charged

→Cross-AZ traffic
→Inter-region traffic
→NAT Gateway processing
→Transit Gateway traffic
→VPC peering traffic that crosses AZs or regions

In Kubernetes environments, pods are frequently distributed across multiple AZs automatically.

That creates hidden networking costs very quickly.
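To make the pricing concrete, here is a minimal sketch of the arithmetic. It assumes a rate of USD 0.01 per GB charged in each direction for cross-AZ traffic, which is the figure commonly published for many regions; check the current data transfer pricing for your region before relying on it. The function name and the 50 TB workload are illustrative, not taken from our environment.

```python
# Assumed rate in USD per GB, billed on both the sending and the receiving side.
# This is an assumption: verify it against current AWS pricing for your region.
CROSS_AZ_RATE_PER_GB_EACH_WAY = 0.01


def monthly_cross_az_cost(gb_per_month: float,
                          rate_each_way: float = CROSS_AZ_RATE_PER_GB_EACH_WAY) -> float:
    """Estimate the monthly charge for data that crosses an AZ boundary."""
    # Each GB is charged once leaving the source AZ and once entering the destination AZ.
    return gb_per_month * rate_each_way * 2


if __name__ == "__main__":
    # Hypothetical workload: 50 TB (50,000 GB) of cross-AZ traffic per month.
    print(f"Estimated cross-AZ charge: ${monthly_cross_az_cost(50_000):,.2f}")
```

The per-GB rate looks small, which is exactly why the charge tends to go unnoticed until volume grows.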

How Kubernetes Amplifies East-West Traffic

Kubernetes improves scalability and availability.

But it also increases internal network communication significantly.

Typical EKS traffic includes:

→pod-to-pod communication
→cluster DNS lookups (CoreDNS, exposed as the kube-dns service)
→ingress controller traffic
→service mesh sidecars
→Prometheus scraping
→Fluent Bit log forwarding
→readiness/liveness probes

Now combine this with multi-AZ scheduling.

Example:

Frontend Pod → AZ-a
Backend Pod → AZ-c
Redis → AZ-b

Every request crosses Availability Zones twice: once from frontend to backend, and again from backend to Redis.

Now multiply that by:

→thousands of requests per second
→retries
→monitoring traffic
→health checks
→service mesh overhead

The result becomes expensive surprisingly fast.
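A rough way to see how quickly this adds up is to turn the multipliers above into a volume estimate. The sketch below is back-of-the-envelope arithmetic only; the request rate, payload size, and retry rate are hypothetical inputs, and it ignores monitoring and health-check traffic.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, an approximation


def cross_az_gb_per_month(requests_per_second: float,
                          cross_az_hops: int,
                          bytes_per_hop: int,
                          retry_rate: float = 0.0) -> float:
    """Estimate GB per month crossing AZ boundaries for one request path."""
    # Retries resend the same payload, so they scale traffic linearly.
    effective_rps = requests_per_second * (1 + retry_rate)
    total_bytes = effective_rps * cross_az_hops * bytes_per_hop * SECONDS_PER_MONTH
    return total_bytes / 1e9  # decimal gigabytes


if __name__ == "__main__":
    # Hypothetical path: 2,000 req/s, 2 cross-AZ hops (frontend to backend,
    # backend to Redis), 4 KB exchanged per hop, 5% of requests retried.
    volume = cross_az_gb_per_month(2_000, 2, 4_000, retry_rate=0.05)
    print(f"{volume:,.1f} GB per month")
```

With these assumed inputs a single request path moves tens of terabytes per month across zones, before any observability traffic is counted.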

Real Production Symptoms

The first signs are usually indirect.

We initially noticed:

→rising NAT Gateway costs
→unexpected inter-AZ transfer charges
→occasional latency spikes
→elevated Kubernetes networking metrics

Infrastructure scaling appeared healthy.

No obvious production failures existed.

But VPC Flow Logs showed services communicating excessively across Availability Zones.

The architecture looked clean on diagrams.

The actual traffic flow was not.

Common Architectural Problems

1. Excessively Chatty Microservices

Many microservice environments become overloaded with synchronous API calls.

Example:

Frontend → API Gateway → Auth Service → User Service → Billing Service → Notification Service

One user request may trigger:

→dozens of internal API calls
→retries
→duplicate queries
→monitoring events

At scale, this creates substantial east-west traffic.
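The fan-out can be counted directly from a call graph. This is a minimal sketch under two assumptions: the graph is acyclic, and every listed downstream call happens exactly once per request. The service names and edges are hypothetical.

```python
from typing import Dict, List


def internal_calls(graph: Dict[str, List[str]], entry: str) -> int:
    """Count downstream calls triggered by one request arriving at `entry`.

    Assumes an acyclic call graph where each listed call fires once per request.
    """
    # Each callee costs one call plus whatever that callee triggers in turn.
    return sum(1 + internal_calls(graph, callee) for callee in graph.get(entry, []))


# Hypothetical call graph: several services re-validate tokens with auth.
CALL_GRAPH = {
    "frontend": ["api-gateway"],
    "api-gateway": ["auth", "user"],
    "user": ["billing", "auth"],
    "billing": ["notification", "auth"],
}

if __name__ == "__main__":
    print(internal_calls(CALL_GRAPH, "frontend"), "internal calls per user request")
```

Retries and duplicate queries multiply this count further, so the figure is a lower bound on real traffic.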

2. Random Kubernetes Pod Scheduling

By default, the Kubernetes scheduler spreads pods across nodes and AZs without considering which services talk to each other.

This improves availability.

But tightly coupled services often communicate constantly.

If pods are distributed randomly:

Frontend → Backend → Cache

may cross multiple Availability Zones repeatedly.
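How often does that happen? As a simplified model, assume endpoints are spread evenly across zones and the load balancer picks one without regard to zone. Then a hop stays in-zone with probability 1/N for N zones. The sketch below works that out exactly; the even-spread assumption is mine and real clusters will deviate from it.

```python
from fractions import Fraction


def cross_zone_probability(zones: int) -> Fraction:
    """Probability that one hop crosses zones under zone-unaware, evenly spread routing."""
    return 1 - Fraction(1, zones)


def expected_cross_zone_hops(hops: int, zones: int) -> Fraction:
    """Expected number of cross-zone hops for a request chain with `hops` hops."""
    return hops * cross_zone_probability(zones)


if __name__ == "__main__":
    # Frontend -> Backend -> Cache is two hops; three zones is a common layout.
    print("Per-hop cross-zone probability:", cross_zone_probability(3))
    print("Expected cross-zone hops per request:", expected_cross_zone_hops(2, 3))
```

Under this model, adding a third zone raises the share of cross-zone hops from one half to two thirds, even though nothing about the application changed.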

3. Centralized Shared Services

Many organizations deploy:

→centralized logging
→authentication
→monitoring
→shared databases

and allow every environment to access them continuously.

This creates excessive cross-VPC and cross-AZ traffic.

4. Monitoring and Logging Overhead

Observability systems themselves generate large amounts of internal traffic.

Common examples:

→Prometheus scraping every few seconds
→Fluent Bit forwarding logs continuously
→Datadog agents collecting metrics
→distributed tracing systems

In large EKS environments, monitoring traffic alone can become significant.
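Scrape traffic is easy to estimate once you know the number of targets and the size of a typical metrics response. The sketch below is plain arithmetic; the target count, response size, and interval are hypothetical, and it counts only the response bodies, not protocol overhead.

```python
def scrape_gb_per_day(targets: int, response_bytes: int, interval_seconds: float) -> float:
    """Estimate daily metrics traffic pulled by a scraper such as Prometheus."""
    scrapes_per_day = 86_400 / interval_seconds
    return targets * response_bytes * scrapes_per_day / 1e9  # decimal gigabytes


if __name__ == "__main__":
    # Hypothetical cluster: 3,000 scrape targets, 200 KB per response, 15 s interval.
    print(f"{scrape_gb_per_day(3_000, 200_000, 15):,.0f} GB per day")
```

Doubling the scrape interval halves this volume, which is often the cheapest optimization available.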

Visualizing the Problem

Inefficient Architecture

Frontend (AZ-a)
   ↓
Backend API (AZ-c)
   ↓
Redis Cache (AZ-b)
   ↓
Database (AZ-c)

Every request crosses multiple Availability Zones.

Optimized Architecture

Frontend + Backend + Cache aligned within same AZ

AZ-a:
Frontend-a → Backend-a → Redis-a

AZ-b:
Frontend-b → Backend-b → Redis-b

This reduces:

→latency
→inter-AZ transfer cost
→unnecessary network hops

while maintaining high availability.

Using Topology-Aware Routing in Kubernetes

Kubernetes supports topology-aware routing.

This helps keep traffic closer to local nodes and Availability Zones.

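Example (a Service annotation, available since Kubernetes 1.27):

service.kubernetes.io/topology-mode: Auto

On older clusters (Kubernetes 1.23 to 1.26), the equivalent annotation is:

service.kubernetes.io/topology-aware-hints: auto

The older topologyKeys field on the Service spec was removed in Kubernetes 1.22 and should not be used on current clusters.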

This improves:

→request locality
→response latency

and reduces cross-zone traffic.

Many teams never enable this feature.
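As an illustration of where the annotation goes, the sketch below builds a Service manifest as a Python dict and renders it as JSON, which kubectl accepts alongside YAML. The helper names, service name, label, and port are hypothetical. Keep in mind that the annotation is a hint: Kubernetes may fall back to cluster-wide routing when a zone has too few endpoints.

```python
import json


def topology_aware_service(name: str, app_label: str, port: int) -> dict:
    """Build a Service manifest that opts in to topology-aware routing."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            # Annotation name used since Kubernetes 1.27.
            "annotations": {"service.kubernetes.io/topology-mode": "Auto"},
        },
        "spec": {
            "selector": {"app": app_label},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


def to_json(manifest: dict) -> str:
    """Render the manifest as JSON suitable for `kubectl apply -f -`."""
    return json.dumps(manifest, indent=2)


if __name__ == "__main__":
    print(to_json(topology_aware_service("backend", "backend", 8080)))
```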

Using Pod Affinity for Chatty Services

For tightly coupled workloads, pod placement matters significantly.

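Example (under the pod template's spec, for instance on the frontend Deployment):

affinity:
  podAffinity:
    # Hard requirement: schedule only into a zone that already runs a pod labeled app=backend.
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app
              operator: In
              values:
                - backend
        # Treat each Availability Zone as one topology domain.
        topologyKey: topology.kubernetes.io/zone

This places the pod in a zone that already runs a backend pod. Because the rule is required, the pod stays Pending if no such zone has capacity; preferredDuringSchedulingIgnoredDuringExecution is the softer alternative.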

Useful for:

→frontend/backend pairs
→cache-heavy applications
→internal APIs with high request volume
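It is worth verifying that colocation actually took effect. The sketch below takes (app, zone) pairs and reports zones where one workload runs without its partner. How you collect the pairs is up to you, for example by joining pod listings with node zone labels; the function name and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def zones_missing_partner(pods: Iterable[Tuple[str, str]],
                          app: str,
                          partner: str) -> Set[str]:
    """Return zones that run `app` pods but no `partner` pods."""
    zones_by_app: Dict[str, Set[str]] = defaultdict(set)
    for pod_app, zone in pods:
        zones_by_app[pod_app].add(zone)
    # Any zone in the difference forces `app` to call `partner` across zones.
    return zones_by_app[app] - zones_by_app[partner]


if __name__ == "__main__":
    # Hypothetical placement: frontend runs in two zones, backend in only one.
    placement = [
        ("frontend", "us-east-1a"),
        ("frontend", "us-east-1b"),
        ("backend", "us-east-1a"),
    ]
    print(zones_missing_partner(placement, "frontend", "backend"))
```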

Reducing Synchronous Communication

One of the biggest causes of east-west traffic is an excess of synchronous API calls.

Example:

Service A → Service B → Service C → Service D

Failures cascade quickly.

Latency accumulates across every hop.
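Both effects are simple to quantify for a sequential chain: latencies add, and success rates multiply. The sketch below assumes hops are called one after another and fail independently; the latency and success figures are hypothetical.

```python
import math
from typing import Sequence


def chain_latency_ms(hop_latencies_ms: Sequence[float]) -> float:
    """Total latency of a sequential call chain: per-hop latencies add up."""
    return sum(hop_latencies_ms)


def chain_success_rate(hop_success_rates: Sequence[float]) -> float:
    """Success rate of the whole chain, assuming hops fail independently."""
    return math.prod(hop_success_rates)


if __name__ == "__main__":
    # Service A -> B -> C -> D is three hops.
    print("Latency (ms):", chain_latency_ms([12, 8, 15]))
    print("Success rate:", chain_success_rate([0.999, 0.999, 0.999]))
```

Three hops that each succeed 99.9% of the time yield a chain that succeeds less often than any single hop, and every failure tends to produce retry traffic on top.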

Instead, asynchronous communication often works better.

Examples:

→Amazon SQS
→Kafka
→EventBridge
→SNS

Benefits:

→lower retry traffic
→reduced service coupling
→better scalability
→improved resilience
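The pattern can be sketched with an in-process queue standing in for a managed service such as SQS. This is only an illustration of the decoupling and batching idea, not client code for any of the services above; the function names and event shape are hypothetical.

```python
import queue
from typing import Callable, List


def publish(events: "queue.Queue[dict]", event: dict) -> None:
    """Producer side: enqueue the event and return without waiting for a consumer."""
    events.put(event)


def drain(events: "queue.Queue[dict]",
          handle_batch: Callable[[List[dict]], None],
          batch_size: int = 10) -> int:
    """Consumer side: deliver queued events in batches; returns the number of batches."""
    batches = 0
    while True:
        batch: List[dict] = []
        while len(batch) < batch_size:
            try:
                batch.append(events.get_nowait())
            except queue.Empty:
                break
        if not batch:
            return batches
        # One downstream call per batch instead of one per event.
        handle_batch(batch)
        batches += 1


if __name__ == "__main__":
    q: "queue.Queue[dict]" = queue.Queue()
    for i in range(25):
        publish(q, {"order_id": i})
    delivered: List[List[dict]] = []
    print("Downstream calls:", drain(q, delivered.append))
```

The producer never blocks on the consumer, and 25 events reach the downstream service in 3 calls rather than 25.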

Using Caching Effectively

Poor caching strategies increase unnecessary internal communication.

Common issues include:

→repeated database queries
→duplicate API requests
→cache misses across zones

Redis or application-level caching can reduce traffic volume significantly, provided the cache itself is reachable within the same zone.

Especially for:

→session data
→authentication tokens
→frequently requested metadata
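A minimal application-level cache with a time-to-live shows the effect. This is a sketch, not a production cache: it has no size limit or locking, and the `fetch` callable stands in for whatever upstream call you want to avoid repeating. The clock is injectable so expiry can be tested without sleeping.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache upstream lookups for a fixed time-to-live and count upstream calls."""

    def __init__(self, fetch: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self.upstream_calls = 0

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: no network call
        # Missing or expired: call upstream once and remember the result.
        self.upstream_calls += 1
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    # Hypothetical upstream lookup; in practice this would be an API or database call.
    cache = TTLCache(lambda user_id: {"id": user_id}, ttl_seconds=30)
    for _ in range(100):
        cache.get("user-42")
    print("Upstream calls for 100 reads:", cache.upstream_calls)
```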

Reviewing VPC Flow Logs

One of the best ways to identify hidden east-west traffic is through VPC Flow Logs.

Many teams enable flow logs but rarely analyze them properly.

Useful analysis targets:

→top communicating services
→unexpected cross-AZ traffic
→noisy workloads
→repeated retries
→excessive internal bandwidth usage

CloudWatch Logs Insights example:

fields srcAddr, dstAddr, bytes
| stats sum(bytes) as totalBytes by srcAddr, dstAddr
| sort totalBytes desc
| limit 20

This quickly identifies high-volume internal communication paths.
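If the flow logs are delivered to S3 rather than CloudWatch, the same aggregation can be done offline. The sketch below assumes the default (version 2) record format, where fields are space-separated and srcaddr, dstaddr, and bytes sit at positions 4, 5, and 10. Custom formats will need different indexes, and the sample records are made up.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_talkers(lines: Iterable[str],
                limit: int = 10) -> List[Tuple[Tuple[str, str], int]]:
    """Sum bytes per (source, destination) pair from default-format flow log records."""
    totals: Counter = Counter()
    for line in lines:
        fields = line.split()
        # A default record has 14 fields; the last one is log-status.
        if len(fields) < 14 or fields[13] != "OK":
            continue  # skips header lines, NODATA and SKIPDATA records
        src, dst, byte_count = fields[3], fields[4], fields[9]
        totals[(src, dst)] += int(byte_count)
    return totals.most_common(limit)


# Made-up sample records in the default format.
SAMPLE = [
    "2 123456789012 eni-0a1b2c3d 10.0.1.15 10.0.2.40 44321 6379 6 20 4800 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.15 10.0.2.40 44322 6379 6 10 2400 1700000060 1700000120 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.15 10.0.1.99 51000 8080 6 5 900 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d - - - - - - - 1700000000 1700000060 - NODATA",
]

if __name__ == "__main__":
    for (src, dst), total in top_talkers(SAMPLE):
        print(f"{src} -> {dst}: {total} bytes")
```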

Monitoring Inter-AZ Traffic Separately

Many teams only monitor total bandwidth.

That hides the real problem.

Monitor separately:

→inter-AZ transfer
→inter-region transfer
→NAT Gateway traffic
→Transit Gateway traffic
→pod-to-pod communication patterns

Without visibility, optimization becomes impossible.
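One way to get that visibility is to label each source and destination pair by Availability Zone. The sketch below maps private IPs to AZs through a subnet table, which works when pods receive VPC addresses from their node's subnets (the default with the AWS VPC CNI). The subnet ranges and AZ names are hypothetical; in practice you would load them from your VPC's subnet list.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical subnet-to-AZ mapping.
SUBNET_AZ: Dict[str, str] = {
    "10.0.1.0/24": "us-east-1a",
    "10.0.2.0/24": "us-east-1b",
    "10.0.3.0/24": "us-east-1c",
}


def az_for(ip: str, subnet_az: Dict[str, str]) -> Optional[str]:
    """Return the AZ whose subnet contains `ip`, or None if no subnet matches."""
    addr = ipaddress.ip_address(ip)
    for cidr, az in subnet_az.items():
        if addr in ipaddress.ip_network(cidr):
            return az
    return None


def classify(src: str, dst: str, subnet_az: Dict[str, str]) -> str:
    """Label a flow as same-az, cross-az, or external-or-unknown."""
    src_az, dst_az = az_for(src, subnet_az), az_for(dst, subnet_az)
    if src_az is None or dst_az is None:
        return "external-or-unknown"
    return "same-az" if src_az == dst_az else "cross-az"


if __name__ == "__main__":
    print(classify("10.0.1.15", "10.0.2.40", SUBNET_AZ))
    print(classify("10.0.1.15", "10.0.1.99", SUBNET_AZ))
```

Summing bytes per label turns a single bandwidth number into the breakdown that actually drives the bill.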

Service Mesh Overhead

Service meshes improve observability and security.

But they also introduce additional east-west traffic.

Examples:

→Envoy sidecar communication
→mTLS encryption overhead
→telemetry export traffic
→tracing pipelines

In large environments, service mesh traffic itself becomes significant.

This does not mean service meshes are bad.

It means traffic overhead must be measured carefully.

Operational Improvements We Achieved

After optimizing service placement and reducing unnecessary east-west traffic:

Metric	Result
Inter-AZ transfer costs	Reduced by ~35%
API latency	Improved by ~18%
NAT Gateway utilization	Reduced noticeably
Kubernetes network overhead	Reduced during peak traffic
Troubleshooting complexity	Reduced through better traffic visibility

The biggest improvement was visibility.

Once internal communication patterns became measurable, optimization opportunities became obvious.

Lessons Learned

The most important realization was:

Internal cloud traffic is not automatically free or efficient.

As architectures become more distributed:

→networking complexity increases
→hidden costs accumulate
→latency compounds gradually

Another important lesson:

Kubernetes scalability can unintentionally amplify inefficient traffic patterns.

Many performance and cost issues were architectural, not infrastructure-related.

Final Recommendations

If you operate Kubernetes or microservice environments on AWS:

Do:

→analyze east-west traffic regularly
→monitor inter-AZ transfer separately
→use topology-aware routing
→colocate tightly coupled services
→implement effective caching
→reduce synchronous service chains
→review VPC Flow Logs frequently

Avoid:

→excessive cross-zone communication
→overly chatty internal APIs
→uncontrolled observability traffic
→unnecessary service mesh complexity
→blindly scaling distributed systems

In cloud environments, moving data internally still has a cost.

At scale, optimizing east-west traffic becomes both a performance improvement and a financial optimization strategy.

// Written by Lavi Singodiya · May 14, 2026