Kubernetes has a reputation for complexity, and it's deserved. But for most applications, you only need to understand a handful of concepts to deploy reliably.
This guide covers the fundamentals that matter—the 20% of Kubernetes that solves 80% of deployment problems.
Why Kubernetes?
Before diving in, let's be clear about when Kubernetes makes sense:
Good fit:
- Multiple services that need to scale independently
- Traffic patterns that vary significantly
- Teams that deploy frequently (daily or more)
- Need for zero-downtime deployments
Probably overkill:
- Single application with predictable traffic
- Small team with infrequent deploys
- Budget-constrained projects (managed Kubernetes isn't cheap)
If a simple VPS or managed platform (Vercel, Railway, Render) meets your needs, use that instead. Kubernetes is powerful but has operational overhead.
The essential concepts
1. Pods: The basic unit
A Pod is one or more containers that share storage and network. In practice, most Pods run a single container.
```yaml
# pod.yaml - You rarely create Pods directly
apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
  - name: web
    image: your-registry/web-app:v1.0.0
    ports:
    - containerPort: 3000
```
Pods are ephemeral. Kubernetes can terminate them at any time. Never store state in a Pod—treat them as disposable.
2. Deployments: Managing replicas
Deployments manage multiple Pod replicas and handle updates. This is what you'll actually create.
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3  # Run 3 identical Pods
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: your-registry/web-app:v1.0.0
        ports:
        - containerPort: 3000
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "500m"
```
Key settings to always configure:
- `replicas`: Start with at least 2 for high availability; scale up based on load.
- `resources.requests`: What your container needs to run (100m is 0.1 CPU core). Kubernetes uses this for scheduling decisions.
- `resources.limits`: The maximum allowed. A container that exceeds its memory limit is killed (OOMKilled); CPU usage above the limit is throttled, not killed.
3. Services: Internal networking
Services provide stable network endpoints for Pods. Pod IPs change every time a Pod is rescheduled, so a Service gives you a consistent virtual IP and DNS name (web-app.&lt;namespace&gt;.svc.cluster.local) that always routes to ready Pods.
```yaml
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  selector:
    app: web-app        # Matches Pods with this label
  ports:
  - port: 80            # Service port
    targetPort: 3000    # Container port
  type: ClusterIP       # Internal only
```
Service types:
- ClusterIP: Reachable only inside the cluster (the default)
- NodePort: Exposes the Service on a static port on every node's IP
- LoadBalancer: Provisions an external load balancer (cloud providers only)
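For instance, a LoadBalancer variant of the Service above might look like this (a sketch; the name web-app-public is illustrative, and the cloud provider assigns the external IP):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-app-public
spec:
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 3000
  type: LoadBalancer  # cloud provider allocates an external IP
```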
4. Ingress: External traffic
Ingress routes external traffic to Services, typically handling TLS termination.
```yaml
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - hosts:
    - app.example.com
    secretName: web-app-tls
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-app
            port:
              number: 80
```
You'll need an Ingress Controller (nginx-ingress, Traefik, etc.) installed in your cluster; Kubernetes does not ship one by default.
Health checks: The key to reliability
Health checks are the most important concept for reliable deployments. Kubernetes uses them to:
- Know when to route traffic to a Pod
- Know when to restart a failing Pod
Liveness probe: "Is it alive?"
If the liveness probe fails, Kubernetes restarts the container.
```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 10  # Wait before first check
  periodSeconds: 10        # Check every 10 seconds
  failureThreshold: 3      # Restart after 3 failures
```
Readiness probe: "Can it handle traffic?"
If the readiness probe fails, the Pod is removed from Service endpoints (no traffic).
```yaml
readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 2
```
Implementation tip: keep the liveness endpoint shallow. If /health only confirms the process is responsive, a broken dependency will not trigger pointless restarts (restarting your app does not fix a down database). Verify critical dependencies (database connection, cache) in the readiness endpoint instead, so the Pod simply stops receiving traffic until they recover.

```javascript
// Simple Node.js (Express) health checks

// Liveness: shallow check - the process is up and responding
app.get("/health", (req, res) => {
  res.status(200).json({ status: "alive" });
});

// Readiness: verify critical dependencies before accepting traffic
app.get("/ready", async (req, res) => {
  try {
    await db.query("SELECT 1"); // cheap round-trip to the database
    res.status(200).json({ status: "ready" });
  } catch (error) {
    res.status(503).json({ status: "not ready", error: error.message });
  }
});
```
Zero-downtime deployments
Kubernetes supports several update strategies. The default (RollingUpdate) works well for most cases.
```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Create 1 extra Pod during update
      maxUnavailable: 0  # Never have fewer than desired replicas
```
For this to work:
- New Pods must become ready before old ones terminate
- Readiness probes must accurately reflect ability to handle traffic
- Application must handle graceful shutdown
Graceful shutdown
When Kubernetes terminates a Pod, it sends SIGTERM. Your app should:
- Stop accepting new connections
- Finish processing current requests
- Exit
```javascript
// Node.js graceful shutdown
process.on("SIGTERM", () => {
  console.log("SIGTERM received, shutting down gracefully");
  server.close(() => {
    console.log("HTTP server closed");
    db.end(() => {
      console.log("Database connection closed");
      process.exit(0);
    });
  });

  // Force exit after timeout
  setTimeout(() => process.exit(1), 10000);
});
```
Set terminationGracePeriodSeconds in your Pod spec (default: 30 seconds) so it exceeds your app's shutdown timeout; when it elapses, Kubernetes sends SIGKILL.
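With the 10-second force-exit timeout from the shutdown handler above, the Deployment's Pod template might look like this (a sketch; the value is illustrative):

```yaml
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 15  # comfortably above the app's 10s timeout
      containers:
      - name: web
        image: your-registry/web-app:v1.0.0
```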
Rollbacks
When deployments fail, roll back quickly:
```bash
# View deployment history
kubectl rollout history deployment/web-app

# Roll back to previous version
kubectl rollout undo deployment/web-app

# Roll back to specific revision
kubectl rollout undo deployment/web-app --to-revision=2
```
Kubernetes keeps the last 10 Deployment revisions by default. Set revisionHistoryLimit explicitly if you need more:

```yaml
spec:
  revisionHistoryLimit: 20
```
Configuration management
Never hardcode configuration. Use ConfigMaps and Secrets.
```yaml
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: web-app-config
data:
  LOG_LEVEL: "info"
  API_TIMEOUT: "30s"
```
```yaml
# secret.yaml (values are base64 encoded, not encrypted)
apiVersion: v1
kind: Secret
metadata:
  name: web-app-secrets
type: Opaque
data:
  DATABASE_URL: cG9zdGdyZXM6Ly91c2VyOnBhc3NAaG9zdDo1NDMyL2Ri
```
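To produce a value for the data field, base64-encode the plaintext. The -n matters: a trailing newline would be encoded into the secret. Use --decode to verify (the flag works with both GNU coreutils and macOS base64):

```shell
# Encode a connection string for the Secret's data field
echo -n 'postgres://user:pass@host:5432/db' | base64
# cG9zdGdyZXM6Ly91c2VyOnBhc3NAaG9zdDo1NDMyL2Ri

# Decode to verify
echo -n 'cG9zdGdyZXM6Ly91c2VyOnBhc3NAaG9zdDo1NDMyL2Ri' | base64 --decode
# postgres://user:pass@host:5432/db
```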
Reference both in the Deployment's container spec:

```yaml
containers:
- name: web
  envFrom:
  - configMapRef:
      name: web-app-config
  - secretRef:
      name: web-app-secrets
```
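Inside the container, both sources surface as ordinary environment variables. A minimal sketch of consuming them in Node.js (the fallback values are illustrative assumptions for local runs, not Kubernetes behavior):

```javascript
// Configuration from the ConfigMap, with defaults for local development
const logLevel = process.env.LOG_LEVEL ?? "info";
const apiTimeout = process.env.API_TIMEOUT ?? "30s";

// Secrets have no sensible default - surface the problem immediately
const databaseUrl = process.env.DATABASE_URL;
if (!databaseUrl) {
  console.warn("DATABASE_URL is not set - refusing to serve traffic");
}

console.log(`log level: ${logLevel}, api timeout: ${apiTimeout}`);
```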
The minimal production setup
A typical web application needs:
- Deployment with health checks and resource limits
- Service for internal routing
- Ingress with TLS for external access
- ConfigMap/Secret for configuration
Here's a complete example:
```yaml
# Combine in a single file with '---' separators
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: your-registry/web-app:v1.0.0
        ports:
        - containerPort: 3000
        envFrom:
        - configMapRef:
            name: web-app-config
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /ready
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 3000
```
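One caveat: the Deployment references web-app-config through envFrom, so the ConfigMap (and, for external traffic, the Ingress shown earlier) belongs in the same file:

```yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: web-app-config
data:
  LOG_LEVEL: "info"
  API_TIMEOUT: "30s"
```

Apply the whole stack with kubectl apply -f web-app.yaml (filename illustrative).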
What we've skipped
This guide covers the essentials. Topics for future exploration:
- Horizontal Pod Autoscaling: Automatic scaling based on metrics
- Pod Disruption Budgets: Ensuring availability during maintenance
- Network Policies: Restricting Pod communication
- Persistent Volumes: Stateful workloads
- Helm: Package management for Kubernetes
Start with the basics. Add complexity only when needed.
Need help with Kubernetes deployment or migration? Let's talk about your infrastructure needs.