Kubernetes manages load balancing using a concept called Services.
Let’s say you have 3 replicas of the same microservice running — in Kubernetes, each replica runs inside a Pod. Kubernetes doesn't send traffic to them directly. Instead, you create a Service, which acts as a stable entry point (a virtual IP address and DNS name) for your microservice.
The most common type of service for internal communication is called a ClusterIP service. It automatically load balances traffic across all the pods behind it. Kubernetes keeps track of which pods are running and uses a built-in component called kube-proxy to route the incoming traffic to one of the available pods.
Here’s how it works step-by-step:
- You define a Deployment (or some controller) to run 3 identical pods.
- You create a Service (like `my-service`) that selects these pods using labels.
- When another service or client inside the cluster sends a request to `my-service`, Kubernetes distributes that request to one of the 3 pods (randomly or round-robin, depending on the kube-proxy mode).
Kubernetes automatically monitors the health of pods and distributes incoming traffic across them, eliminating the need for special manual configurations.
For external access, Kubernetes provides options such as LoadBalancer or Ingress, depending on your deployment configuration.
Simple Explanation
Kubernetes uses something called a Service to manage load balancing.
Let’s say you deploy 3 pods running the same microservice. Kubernetes can automatically balance the traffic between them using a Service, so no single pod gets overloaded.
Step-by-Step with YAML
1. Create a Deployment with 3 replicas (pods)
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-microservice
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-microservice
  template:
    metadata:
      labels:
        app: my-microservice
    spec:
      containers:
        - name: my-container
          image: myusername/my-microservice:latest
          ports:
            - containerPort: 8080
```
This will launch 3 pods running the same container image.
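A Service only routes traffic to pods that report themselves as ready, so it helps to give the container a readiness probe. A minimal sketch of what you could add under the container spec above — the `/healthz` path is an assumption, substitute whatever health endpoint your app actually exposes:

```yaml
# Goes under the container entry in the Deployment above.
readinessProbe:
  httpGet:
    path: /healthz          # hypothetical health endpoint
    port: 8080
  initialDelaySeconds: 5    # wait before the first check
  periodSeconds: 10         # check every 10 seconds
```

While a pod fails this probe, Kubernetes removes it from the Service's endpoints and the remaining pods receive the traffic.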
2. Create a Service to load balance between the pods
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-microservice
  ports:
    - protocol: TCP
      port: 80            # Port on the service
      targetPort: 8080    # Port on the pods
  type: ClusterIP         # Internal load balancing within the cluster
```
This Service exposes a stable IP (and a DNS name, `my-service.default.svc.cluster.local`) that other pods can call. Kubernetes routes each connection to one of the 3 pods: randomly in kube-proxy's default iptables mode, or round-robin (among other strategies) if IPVS mode is enabled.
Optional: Expose to the Outside World
If you want to expose it outside the cluster:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-microservice
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: LoadBalancer
```
This will ask your cloud provider (like AWS, GCP, or Azure) to provision an external load balancer that forwards traffic to the Service, which in turn spreads it across the 3 pods inside your cluster. So you don't have to configure anything special manually: Kubernetes keeps track of the healthy pods and distributes the traffic between them.
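The other external option mentioned earlier, Ingress, routes HTTP(S) traffic to Services by hostname or path. A minimal sketch, assuming an ingress controller (e.g. NGINX) is installed in the cluster and the hostname `myapp.example.com` (hypothetical) points at it:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
    - host: myapp.example.com        # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service     # the ClusterIP Service defined above
                port:
                  number: 80
```

An Ingress is often cheaper than one LoadBalancer per microservice, since a single ingress controller can route to many Services.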