Your Kubernetes Cluster Is Probably Overkill (Still)

Kubernetes is a marvel—so is a CNC machine. The trouble isn’t that it’s powerful. The trouble is that most teams keep buying power tools when they actually need a bench vise. Three years after I made this argument, the replies are still furious. That tells me the real issue isn’t Kubernetes. It’s what people do with it.

Kubernetes is great for running complex, multi-team systems at scale. But for the majority of teams running it today—especially those under ~30 engineers—it’s usually not the best default. You’re paying an operational complexity tax for capabilities you may never use.

This is not a hit piece. It’s a reality check, with practical alternatives you can adopt without lighting your roadmap on fire.

Why Kubernetes becomes “invisible infrastructure”, and why that’s costly

The biggest lie about Kubernetes is that it’s just “infrastructure as code,” like Terraform. In practice, Kubernetes introduces a new category of work that doesn’t feel like feature development but still consumes engineering energy: continual operational alignment.

You rarely “set up Kubernetes once.” Instead, you maintain an ecosystem:

Identity and access (RBAC roles, service accounts, least privilege)
Networking (ingress controllers, service types, network policies, DNS quirks)
Storage (storage classes, volume lifecycle, reclaim policies)
Security plumbing (certificates, secrets management patterns, image scanning integration)
Deployment mechanics (rollouts, readiness/liveness, autoscaling rules, dashboards/observability hooks)

Even if your team uses sensible defaults, the work doesn’t disappear. It just moves into “tribal knowledge” and recurring operational tickets.

A common pattern looks like this:

You ship features in sprint cadence, but you also spend cycles every week debugging environment drift—why one pod behaves differently under the new ingress, why a migration script fails only in staging, why TLS renewals stalled, why a network policy blocked something you forgot to label.

That time doesn’t always show up as “Kubernetes time” on a timesheet. But it shows up in your delivery speed.

And if you’re not actually using Kubernetes’ advanced capabilities (multi-tenancy boundaries, advanced scheduling, complex stateful workloads, multi-region failover), then most of that overhead is theater.

The “capability mismatch” problem: you’re building the wrong platform

Kubernetes rewards teams that need platform behaviors. The platform behaviors people often cite—self-healing, rolling updates, scaling—are real. But they’re not exclusive to K8s. What matters is whether your application portfolio demands the orchestration features K8s makes easy.

Consider a typical team of ~10–25 engineers building a handful of services:

A web app
One or two background workers
A few domain services (maybe a queue consumer, maybe a scheduled job)
A database and cache
Some integrations and an admin UI

This is a perfect use case for simpler deployment models. You can still get reliability: retries, health checks, rollouts, environment separation. You just don’t need to bring your own cluster-building discipline.

Here’s the litmus test I use:

Do you run multiple deployable services per team, with strict ownership boundaries and heavy automation demands?
Do you need fine-grained traffic routing across many versions (beyond basic blue/green)?
Do you manage stateful workloads with custom storage behavior frequently?
Do you require complex autoscaling based on app-level metrics?
Do you operate across multiple regions or have a real multi-cluster story?

If you’re answering “mostly no,” then Kubernetes becomes a capability mismatch. You’re paying to orchestrate complexity you don’t need, which slows down everything else.
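The litmus test above can be written down as a small checklist. This is a minimal sketch: the field names and the threshold of three "yes" answers are illustrative assumptions, not figures from this article, so tune them to your own context.

```python
from dataclasses import dataclass, fields


@dataclass
class LitmusAnswers:
    """One yes/no answer per litmus question, in the order listed above."""
    many_services_per_team: bool
    fine_grained_traffic_routing: bool
    frequent_custom_stateful_storage: bool
    app_metric_autoscaling: bool
    multi_region_or_multi_cluster: bool


def kubernetes_fit(answers: LitmusAnswers, threshold: int = 3) -> str:
    """Return a rough verdict based on how many answers are 'yes'.

    The threshold is an assumed cut-off for "mostly no", not a hard rule.
    """
    yes_count = sum(getattr(answers, f.name) for f in fields(answers))
    if yes_count >= threshold:
        return "kubernetes is plausibly justified"
    return "capability mismatch: try something simpler first"


# Hypothetical small team: only app-level autoscaling is a real need.
small_team = LitmusAnswers(False, False, False, True, False)
print(kubernetes_fit(small_team))
```

The value is less in the score than in forcing each answer to be an explicit yes or no.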

What those engineers are really doing (and what to stop doing)

Let’s talk about the day-to-day work that tends to consume Kubernetes teams. This isn’t a list of “bad practices.” It’s a list of the recurring categories of tasks that Kubernetes introduces.

1) RBAC and permissions drift

When you add a new job, service, or endpoint, you often have to update RBAC bindings. Even with automation, the process adds friction and review surface area. It’s not hard—it’s just constant.

Practical move: if you must use Kubernetes, invest early in a clean permission model with templates. But if you don’t need Kubernetes, don’t recreate this treadmill elsewhere.
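If you do stay on Kubernetes, one cheap guard against permission drift is a review-time check for wildcard grants. The sketch below assumes the Role has already been parsed into a dictionary with the standard `rules`, `apiGroups`, `resources`, and `verbs` fields. The role name and namespace are made up for illustration.

```python
from typing import Any, Dict, List


def find_wildcard_rules(role: Dict[str, Any]) -> List[int]:
    """Return the indexes of rules that grant '*' on verbs, resources, or apiGroups."""
    flagged = []
    for index, rule in enumerate(role.get("rules", [])):
        for key in ("verbs", "resources", "apiGroups"):
            if "*" in rule.get(key, []):
                flagged.append(index)
                break  # one wildcard is enough to flag this rule
    return flagged


# Hypothetical Role, shaped like a parsed rbac.authorization.k8s.io/v1 manifest.
worker_role = {
    "kind": "Role",
    "metadata": {"name": "queue-worker", "namespace": "staging"},
    "rules": [
        {"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get", "list"]},
        {"apiGroups": ["batch"], "resources": ["*"], "verbs": ["create"]},
    ],
}
print(find_wildcard_rules(worker_role))  # [1]
```

A check like this could run in CI against every Role in the repository, so least privilege does not depend on a reviewer noticing a stray `*`.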

2) Ingress, TLS, and “why is staging different?”

Ingress controllers and certificate management sound solved—until you have five environments, two domains, and one legacy wildcard certificate strategy. Then you get a steady drip of “it works locally / fails in staging” issues.

Practical move: fewer moving parts wins. Managed PaaS and simpler container platforms typically reduce this entire class of work.
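Whatever platform you land on, stalled renewals are easier to catch with an explicit expiry check than with an outage. A minimal sketch using only the standard library follows. The 21-day warning window is an assumption, and the timestamp format is the one Python returns in `getpeercert()["notAfter"]`.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before a certificate's notAfter timestamp.

    `not_after` uses the format found in ssl.SSLSocket.getpeercert()["notAfter"],
    for example "Jan  5 09:34:43 2030 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def needs_renewal(not_after: str, warn_days: int = 21, now: Optional[float] = None) -> bool:
    """True when the certificate expires within the warning window."""
    return days_until_expiry(not_after, now) < warn_days


# Demo with a pinned clock so the result does not depend on today's date.
not_after = "Jan  5 09:34:43 2030 GMT"
pinned_now = ssl.cert_time_to_seconds(not_after) - 10 * 86400
print(needs_renewal(not_after, now=pinned_now))  # True: 10 days left, under the 21-day window
```

In a real check you would read `notAfter` from each environment's live certificate and alert on the result. The pinned clock here only keeps the example deterministic.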

3) Storage class complexity and volume lifecycle surprises

Stateful workloads require careful thinking: provisioning behavior, reclaim policies, migration impact, backups, and incident recovery. Teams that run state inside the cluster without a strong reason end up spending time on storage configuration instead of product.

Practical move: if your services are mostly stateless (common for modern web stacks), keep your state on managed services and sidestep most of Kubernetes’ storage complexity.

4) Observability wiring that never fully stabilizes

Kubernetes monitoring stacks are powerful, but the integration path can be endless: metrics, logs, traces, pod identity, dashboards, alert tuning, SLO definitions. You’ll still need observability with any platform—but Kubernetes makes it more configurable, and configuration invites drift.

Practical move: choose an ecosystem with sensible defaults and fewer integration seams unless you truly need the control.

Faster paths: Docker Compose, Kamal, and managed PaaS

Here are three alternatives I genuinely think most teams should consider first. The point isn’t ideology. The point is shipping faster with fewer operational surprises.

Docker Compose: the “serious local + simple deploy” baseline

Docker Compose is underrated because it’s not glamorous. But if your architecture is straightforward, Compose gives you:

Repeatable local development
Clear service boundaries
A clean way to mirror staging
Straightforward CI builds

You can run Compose in production with the right tooling around it, but even if you don’t, Compose still pays off by shrinking the differences between environments. Less drift means fewer bugs that appear only in staging or production, and fewer emergency rollbacks.

Concrete example: If your team currently builds Docker images and then struggles with dev/staging parity, start by making Compose the canonical way to run the whole stack locally. When you later deploy via another method, keep the service definitions aligned.
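Keeping definitions aligned is easier when something checks it for you. The sketch below compares two service maps: one shaped like the `services` section of a parsed Compose file, the other a hypothetical description of what is actually deployed. How you obtain the deployed map depends on your tooling and is assumed here.

```python
from typing import Any, Dict, List


def parity_gaps(local: Dict[str, Any], deployed: Dict[str, Any]) -> List[str]:
    """List differences in service names and image references between two service maps."""
    gaps = []
    for name in sorted(set(local) | set(deployed)):
        if name not in deployed:
            gaps.append(f"{name}: only in local")
        elif name not in local:
            gaps.append(f"{name}: only in deployed")
        elif local[name].get("image") != deployed[name].get("image"):
            gaps.append(f"{name}: image differs")
    return gaps


# Hypothetical data: service names and image tags are made up for illustration.
local_services = {
    "web": {"image": "app:1.4"},
    "worker": {"image": "app:1.4"},
    "cache": {"image": "redis:7"},
}
deployed_services = {
    "web": {"image": "app:1.4"},
    "worker": {"image": "app:1.3"},
}
print(parity_gaps(local_services, deployed_services))
```

Some gaps are intentional, such as a cache that runs as a container locally but as a managed service in production. The point is to make each difference a visible decision rather than a surprise.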

Kamal (or similar “deploy via SSH + containers” approaches): pragmatic deployment

Tools in the “run containers on servers” family let teams focus on what matters: build artifacts, run them reliably, and roll forward/back without cluster semantics.

Kamal-style workflows shine for teams that don’t want to learn the full operational surface area of Kubernetes:

No RBAC learning curve
Less networking plumbing
Fewer controllers to manage
Predictable runtime behavior

If your deployment targets are a small set of VMs (or a limited fleet), this approach can be shockingly productive.

Concrete example: A two-service Rails + worker setup can deploy via a single command, with database migrations handled in a controlled step. When something fails, you can inspect the running container set directly rather than spelunking through pod states, events, and controller reconciliation.
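The shape of that workflow fits in a few lines. This is a generic sketch of "migrate, start, health-check, roll back", not Kamal's implementation or API. The callables stand in for whatever your tooling does over SSH, and all names are hypothetical.

```python
from typing import Callable, List


def deploy(
    new_version: str,
    previous_version: str,
    run_migrations: Callable[[str], None],
    start: Callable[[str], None],
    healthy: Callable[[str], bool],
    log: List[str],
) -> str:
    """Run migrations, start the new version, and roll back if it is unhealthy.

    Returns the version left running.
    """
    log.append(f"migrate:{new_version}")
    run_migrations(new_version)  # controlled step, before any traffic moves
    log.append(f"start:{new_version}")
    start(new_version)
    if healthy(new_version):
        return new_version
    log.append(f"rollback:{previous_version}")
    start(previous_version)
    return previous_version


# Demo with stand-in callables: the new version fails its health check.
events: List[str] = []
running = deploy(
    "v2",
    "v1",
    run_migrations=lambda version: None,
    start=lambda version: None,
    healthy=lambda version: False,
    log=events,
)
print(running, events)
```

Note that this rollback restarts the old code but does not undo the migration. The flow is therefore only safe when migrations stay backward compatible with the previous release.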

Managed PaaS: let someone else babysit the platform

If you’re not actively developing a platform, using a managed PaaS can be the fastest path to “production without the platform tax.”

Managed offerings typically handle:

Ingress routing
TLS automation
Build pipelines and rollouts
Environment configuration
Some level of scaling (often good enough)

This isn’t about avoiding performance constraints forever. It’s about delaying complexity until you truly need it.

Concrete example: If your team’s bottleneck is feature throughput, the move is to adopt a PaaS that supports your deployment model and database integration well—then use the saved engineering time to improve reliability at the app layer (idempotent jobs, safe migrations, better caching strategies, and robust retry policies).
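As one example of app-layer reliability, here is a minimal retry helper with exponential backoff and jitter. The default attempt count and delays are assumptions to adjust for your workload. It should only wrap idempotent operations, since a retried call may run more than once.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.2,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `attempts` tries have failed.

    Waits between tries with exponential backoff and full jitter.
    The last failure is re-raised unchanged.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


# Demo: a stand-in operation that fails twice, then succeeds.
state = {"calls": 0}


def flaky_fetch() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# A no-op sleep keeps the demo instant; drop it to get real backoff.
print(retry(flaky_fetch, sleep=lambda seconds: None), state["calls"])
```

In production code you would usually catch a narrower exception type than `Exception`, so programming errors fail fast instead of being retried.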

When Kubernetes actually earns its keep

To be fair, Kubernetes earns its keep in specific situations. You shouldn’t feel guilty for using it when you need what it provides.

Kubernetes is a strong choice when you have:

A large portfolio of services with frequent deployments across many teams
Strong multi-tenancy or isolation requirements (not just “we should be careful,” but enforceable boundaries)
Advanced scheduling and placement needs
Complex stateful workloads that genuinely benefit from Kubernetes-native patterns
Multi-cluster or multi-region architectures with orchestrated failover strategies
A platform team that can maintain the ecosystem as a product

The key is organizational. If you have the headcount and mandate to operate Kubernetes well, the complexity tax is an investment, not a leak.

But if you’re a small team building product, Kubernetes can become your unwanted second job.

Conclusion: choose the simplest system that meets your real requirements

Kubernetes isn’t wrong. It’s just often misapplied.

If your team is under 30 engineers and you don’t have multi-region needs, complex autoscaling, canary-heavy rollout strategies, or a platform team dedicated to cluster operations, you’re probably paying a complexity tax you could eliminate. Start with Docker Compose for parity, consider Kamal-style pragmatic deployments for speed, or go managed PaaS to reclaim engineering time.

Ship faster. Learn fewer moving parts. Spend your energy on the product—not on keeping controllers and certificates reconciled.