Platform Engineering: Building Developer Platforms That Stick

Most internal platforms fail because they're built by infrastructure teams for infrastructure teams. Here's how to build one developers actually want to use.

October 25, 2025 · 15 min read

Platform engineering has gone from obscure Gartner mention to every tech company's hiring priority in about two years. But strip away the hype and the real question is simple: how do you make your developers faster without hiring more DevOps engineers? That's what an Internal Developer Platform (IDP) solves — when it's built right.

What Platform Engineering Actually Is (and Isn't)

Platform engineering is building and maintaining a self-service layer on top of your infrastructure so that application developers can deploy, monitor, and operate their services without filing tickets or waiting for another team.

Let's kill some misconceptions:

| It is NOT | It IS |
| --- | --- |
| DevOps rebranded | A product discipline applied to internal tools |
| Forcing everyone to use Kubernetes | Abstracting complexity so devs don't need to care what's underneath |
| Building an internal AWS console clone | Paved roads that handle 80% of use cases with zero friction |
| A ticket system with a UI | Actual self-service: devs click a button, stuff happens in minutes |
| One team building everything | An enabling team that treats developers as customers |

The mindset shift that matters: your platform is a product. Developers are your users. If they're not using it voluntarily, it's failed — no matter how technically elegant it is.

Why Platform Engineering Is Happening Now

Three forces collided:

1. "You build it, you run it" scaled poorly. Giving every team full DevOps responsibility sounded great until you had 15 teams, each reinventing deployment pipelines, monitoring setups, and incident response from scratch. The cognitive load became unsustainable.

2. Cloud complexity exploded. AWS alone has 200+ services. A developer who wants to deploy a simple web app faces choices about VPCs, security groups, IAM roles, load balancers, container orchestration, secrets management, logging, monitoring, alerting, DNS, SSL certificates... The infrastructure menu got too long.

3. Developer productivity became a C-level concern. McKinsey's developer productivity report, the DORA metrics movement, and the realization that developer time is the most expensive resource in most tech companies — all of this pushed productivity tooling from "nice to have" to strategic priority.

Golden Paths: The Core Concept

A golden path is the blessed, paved, supported way to do something. Not the only way — the easy way. Teams can go off-path, but they take on the complexity themselves.

What a Good Golden Path Looks Like

For a "deploy a new microservice" golden path:

  1. Developer runs platform create service --template=api (or clicks a button in a portal)
  2. Platform scaffolds: Git repo with CI/CD pipeline, Dockerfile, Kubernetes manifests, monitoring dashboards, alerting rules, service catalog entry
  3. Developer writes code, pushes to main
  4. Pipeline builds, tests, deploys to staging automatically
  5. Developer promotes to production via a single PR merge or button click
  6. Monitoring, logging, and alerting are already configured — zero setup

Time from "I need a new service" to "it's running in staging": under 30 minutes. That's the target.
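The scaffolding step (step 2) can be sketched as a function that renders a template skeleton for one service. This is a minimal sketch under stated assumptions, not a real platform's implementation: `SKELETON`, `scaffold_service`, and the file contents are hypothetical, and Python's standard `string.Template` is used only to keep the example dependency-free.

```python
from string import Template

# Hypothetical skeleton: path template -> file content template.
# A real golden path would hold many more files (CI pipeline, dashboards, alerts).
SKELETON = {
    "$service/Dockerfile": "FROM python:3.12-slim\nLABEL service=$service\n",
    "$service/k8s/deployment.yaml": "metadata:\n  name: $service\n  labels:\n    team: $team\n",
    "$service/catalog-info.yaml": "metadata:\n  name: $service\nspec:\n  owner: $team\n",
}


def scaffold_service(service: str, team: str) -> dict:
    """Render every file in the skeleton for one service.

    Returns a mapping of rendered path -> rendered content. Writing the
    files to a new Git repo is left out of this sketch.
    """
    values = {"service": service, "team": team}
    return {
        Template(path).substitute(values): Template(body).substitute(values)
        for path, body in SKELETON.items()
    }


files = scaffold_service("orders-api", "payments")
for path in sorted(files):
    print(path)
```

The point of the design is that the developer supplies two values and everything else is a platform default.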

Golden Path Examples by Use Case

| Use Case | Golden Path | What It Abstracts |
| --- | --- | --- |
| New service | Service template with CI/CD, monitoring, K8s manifests | Infrastructure provisioning, pipeline setup, observability config |
| New database | Self-service form: Postgres/MySQL/Redis, size, region → provisioned in 5 min | Terraform, security groups, backup config, credential rotation |
| New environment | Clone staging → new namespace with isolated resources | Namespace creation, resource quotas, network policies, secrets |
| Add monitoring | Standard dashboards auto-generated per service; custom metrics via annotations | Prometheus config, Grafana dashboards, alerting rules |
| Secret management | CLI command or UI to store/rotate secrets; auto-injected at deploy time | Vault/KMS setup, K8s secrets, rotation schedules |
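To make the "new database" row concrete, here is one way the self-service form could be validated and turned into Terraform input. Treat it as a sketch: the `DatabaseRequest` class, the size tiers, and the variable names are assumptions for a hypothetical Terraform module. Only the three engines come from the table above. Terraform does accept variable files in JSON form (`.tfvars.json`), which is what the function emits.

```python
import json
from dataclasses import dataclass

ALLOWED_ENGINES = {"postgres", "mysql", "redis"}  # engines named in the table above
ALLOWED_SIZES = {"small", "medium", "large"}      # hypothetical size tiers


@dataclass(frozen=True)
class DatabaseRequest:
    service: str
    engine: str
    size: str
    region: str

    def validate(self) -> None:
        """Reject anything outside the paved road before Terraform runs."""
        if self.engine not in ALLOWED_ENGINES:
            raise ValueError(f"unsupported engine: {self.engine}")
        if self.size not in ALLOWED_SIZES:
            raise ValueError(f"unsupported size: {self.size}")


def to_tfvars(req: DatabaseRequest) -> str:
    """Validate the request and emit Terraform variables as JSON.

    The variable names are assumptions about a hypothetical module.
    """
    req.validate()
    return json.dumps(
        {
            "db_name": f"{req.service}-{req.engine}",
            "engine": req.engine,
            "size": req.size,
            "region": req.region,
            "backups_enabled": True,  # platform default, not a user choice
        },
        indent=2,
    )


print(to_tfvars(DatabaseRequest("orders-api", "postgres", "small", "ap-south-1")))
```

Security groups, backups, and credential rotation never appear in the form. That is the abstraction: the developer picks from a short menu and the platform owns the rest.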

IDP Components and Architecture

An Internal Developer Platform isn't one tool — it's a stack. Here's how the layers fit together.

Layer 1: Developer Portal (The Front Door)

This is what developers interact with. A web UI and/or CLI that provides the service catalog (what exists, who owns it, how to reach it), self-service workflows (create service, provision database, manage secrets), documentation (auto-generated API docs, runbooks), and team/ownership information.

Layer 2: Orchestration Engine (The Brain)

Takes developer requests and translates them into infrastructure actions. When a developer says "create a new service," this layer triggers the template engine, runs Terraform, configures the GitOps pipeline, and registers the service in the catalog. Most teams build this with a combination of CI/CD pipelines and custom automation.
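The "create a new service" flow described above is, at its core, an ordered list of steps run against one request. The sketch below illustrates that shape only. Every step is a stand-in that just logs what a real integration (template engine, Terraform run, GitOps config, catalog registration) would do, and all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Request:
    """A developer's self-service request, e.g. "create a new service"."""
    service: str
    team: str
    log: List[str] = field(default_factory=list)


# Stand-ins for real integrations; each one only records what it would do.
def render_template(req: Request) -> None:
    req.log.append(f"rendered template for {req.service}")


def provision_infra(req: Request) -> None:
    req.log.append(f"provisioned infrastructure for {req.service}")


def configure_gitops(req: Request) -> None:
    req.log.append(f"configured GitOps pipeline for {req.service}")


def register_in_catalog(req: Request) -> None:
    req.log.append(f"registered {req.service} (owner: {req.team}) in catalog")


CREATE_SERVICE_STEPS: List[Callable[[Request], None]] = [
    render_template,
    provision_infra,
    configure_gitops,
    register_in_catalog,
]


def run_workflow(req: Request, steps: List[Callable[[Request], None]]) -> Request:
    """Run steps in order; stop at the first failure and record it."""
    for step in steps:
        try:
            step(req)
        except Exception as exc:  # a real engine would also roll back or retry
            req.log.append(f"FAILED at {step.__name__}: {exc}")
            break
    return req


result = run_workflow(Request("orders-api", "payments"), CREATE_SERVICE_STEPS)
print("\n".join(result.log))
```

In practice each step would be a CI/CD job or workflow task rather than an in-process function, but the ordering and the "stop and report on failure" behaviour are the parts worth getting right first.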

Layer 3: Infrastructure Layer (The Foundation)

Kubernetes clusters, cloud resources, networking, security policies. This is managed by the platform team directly — developers shouldn't need to touch it. Infrastructure as Code (Terraform, Pulumi) manages the state.

Layer 4: Observability Layer (The Feedback Loop)

Unified logging, metrics, tracing, and alerting across all services. This needs to be consistent regardless of which team built the service. Golden signals (latency, traffic, errors, saturation) pre-configured for every service.
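Pre-configuring golden signals for every service usually means generating the same four alert rules from one set of defaults, with per-service overrides. The sketch below shows that idea. The metric names and thresholds are illustrative assumptions, not recommended values, and the expression strings only imitate a PromQL-style label selector.

```python
from typing import Dict, List, Optional

# Hypothetical defaults for the four golden signals. Metric names and
# thresholds are placeholders; real values depend on your metrics pipeline.
GOLDEN_SIGNAL_DEFAULTS = {
    "latency": {"metric": "p99_latency_ms", "op": ">", "threshold": 500},
    "traffic": {"metric": "requests_per_second", "op": "<", "threshold": 1},
    "errors": {"metric": "error_rate_percent", "op": ">", "threshold": 1},
    "saturation": {"metric": "cpu_utilization_percent", "op": ">", "threshold": 85},
}


def default_alerts(service: str, overrides: Optional[Dict[str, float]] = None) -> List[dict]:
    """Build one alert per golden signal, applying per-service threshold overrides."""
    overrides = overrides or {}
    alerts = []
    for signal, rule in GOLDEN_SIGNAL_DEFAULTS.items():
        threshold = overrides.get(signal, rule["threshold"])
        alerts.append({
            "name": f"{service}-{signal}",
            "expr": f'{rule["metric"]}{{service="{service}"}} {rule["op"]} {threshold}',
        })
    return alerts


# A latency-sensitive service tightens one threshold and inherits the rest.
for alert in default_alerts("orders-api", {"latency": 250}):
    print(alert["name"], "->", alert["expr"])
```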

# Example: Platform service template (what gets generated)
# templates/api-service/skeleton/
#
# .github/workflows/ci.yml    - Build, test, scan, deploy
# Dockerfile                   - Multi-stage, security-hardened
# k8s/
#   deployment.yaml            - Resource limits, health checks
#   service.yaml               - ClusterIP + Ingress
#   hpa.yaml                   - Autoscaling thresholds
# monitoring/
#   dashboard.json             - Grafana dashboard (auto-imported)
#   alerts.yaml                - Critical alerts pre-configured
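# catalog-info.yaml            - Backstage service registration

# Contents of catalog-info.yaml. The {{ ... }} placeholders are filled in
# by the template engine when the service is scaffolded.
apiVersion: backstage.io/v1alpha1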
kind: Component
metadata:
  name: {{ service_name }}
  description: {{ description }}
  annotations:
    github.com/project-slug: pillai-infotech/{{ service_name }}
    grafana/dashboard-selector: "service={{ service_name }}"
    pagerduty.com/service-id: {{ pagerduty_id }}
spec:
  type: service
  lifecycle: production
  owner: {{ team_name }}
  providesApis:
    - {{ service_name }}-api
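To show what "filling in the placeholders" means, here is a small stand-in renderer. It is a sketch only: in practice a template engine does this job (Cookiecutter uses Jinja2, the Backstage scaffolder uses Nunjucks). The `render` function, the shortened template, and the sample values are hypothetical.

```python
import re

# Shortened version of the catalog-info.yaml template above.
CATALOG_TEMPLATE = """\
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: {{ service_name }}
  description: {{ description }}
spec:
  type: service
  owner: {{ team_name }}
"""

# Matches {{ name }} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict) -> str:
    """Replace each {{ name }} placeholder; fail loudly on a missing value."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder '{key}'")
        return values[key]

    return PLACEHOLDER.sub(substitute, template)


rendered = render(
    CATALOG_TEMPLATE,
    {
        "service_name": "orders-api",
        "description": "Order management API",
        "team_name": "payments",
    },
)
print(rendered)
```

Failing on a missing value matters more than it looks: a half-rendered catalog entry with a literal `{{ team_name }}` as owner is exactly the kind of thing that erodes trust in the catalog.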

Tool Landscape: What We've Seen Work

| Layer | Tool | Best For | Our Take |
| --- | --- | --- | --- |
| Portal | Backstage (Spotify) | Service catalog, docs, plugins | The standard, but requires significant investment to customize. Start here if you have 50+ services |
| Portal | Port | Self-service actions, RBAC | Easier to set up than Backstage. Good for teams that want results fast without maintaining React plugins |
| Portal | Cortex | Scorecards, standards enforcement | Best for "are our services meeting our standards?" visibility. Pairs well with Backstage |
| Orchestration | Argo CD + Argo Workflows | GitOps + workflow automation | Our default recommendation. Argo CD for deployments, Workflows for platform actions |
| Orchestration | Crossplane | Cloud resources as K8s custom resources | Powerful but complex. Worth it if your entire platform runs on K8s and you want one control plane |
| IaC | Terraform + Atlantis | Infrastructure provisioning via PRs | Battle-tested. Atlantis adds PR-based workflow which fits developer habits |
| Templates | Cookiecutter / Yeoman | Service scaffolding | Simple, effective. Don't over-engineer this: a good template with a shell script works fine to start |

Build vs. Buy vs. Assemble

This is the question every platform team faces. After helping teams at different stages, here's our framework:

Assemble (Recommended for Most)

Pick open-source building blocks (Backstage, Argo CD, Terraform, Grafana), glue them together with automation, build custom pieces only where off-the-shelf doesn't fit. This is what 80% of platform teams should do.

Pros: Flexible, community-supported, no vendor lock-in, can start small.
Cons: Integration work, maintenance burden, requires solid engineering skills.

Buy (For Fast Results)

Commercial platforms like Humanitec, Massdriver, or Qovery provide pre-built IDPs. Good for teams that need results fast and have budget but limited platform engineering headcount.

Pros: Fast to deploy, maintained by vendor, less engineering needed.
Cons: Opinionated (may not fit your workflow), ongoing cost, potential vendor lock-in.

Build (Only If You Must)

Building a fully custom platform from scratch only makes sense at very large scale (500+ engineers) with very specific requirements that no existing tool addresses. Companies like Uber, Spotify, and Netflix built custom platforms — but they also have 50+ platform engineers.

Measuring Developer Experience

If you don't measure it, you can't prove your platform is working. Here are the metrics that matter.

| Metric | What It Measures | Target | Red Flag |
| --- | --- | --- | --- |
| Time to first deploy | New service → running in staging | < 30 minutes | > 2 days (means golden paths are missing) |
| Deployment frequency | How often teams ship (DORA metric) | Daily or more | Weekly or less (pipeline friction) |
| Lead time for changes | Commit → production (DORA metric) | < 1 hour | > 1 week |
| Platform adoption rate | % of teams using platform vs going rogue | > 80% | < 50% (platform doesn't meet needs) |
| Ticket volume | Requests to platform team that should be self-service | Decreasing monthly | Increasing (missing self-service capabilities) |
| Developer NPS | Would developers recommend the platform? | > 30 | Negative (serious usability issues) |

The metric we care about most: "How long does it take a new engineer to deploy their first change to production?" This single number captures onboarding quality, documentation, golden path completeness, and pipeline reliability. Track it quarterly.
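Three of these metrics are simple enough to compute from data most teams already have. The sketch below uses the standard Net Promoter Score definition (percentage of promoters scoring 9-10 minus percentage of detractors scoring 0-6). The function names and all sample data are made up for illustration.

```python
from datetime import datetime, timedelta
from statistics import median


def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


def adoption_rate(teams_on_platform, total_teams):
    """Share of teams using the platform, as a percentage."""
    return 100 * teams_on_platform / total_teams


def median_lead_time(changes):
    """Median commit -> production time across (committed, deployed) pairs."""
    return median(deployed - committed for committed, deployed in changes)


# Made-up sample data: survey answers and three recent changes.
survey = [10, 9, 9, 8, 7, 7, 6, 3, 10, 9]
start = datetime(2025, 10, 1, 9, 0)
changes = [
    (start, start + timedelta(minutes=20)),
    (start, start + timedelta(minutes=45)),
    (start, start + timedelta(minutes=90)),
]

print("Developer NPS:", nps(survey))
print("Adoption rate (%):", adoption_rate(7, 8))
print("Median lead time:", median_lead_time(changes))
```

We use the median for lead time so that one stuck deployment doesn't distort the number.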

Case Study: 60-Person Engineering Org

A SaaS company in Bangalore — 60 engineers, 8 teams, ~40 microservices on AWS EKS. Their problems were typical:

  • New service setup took 1-2 weeks of tickets and back-and-forth with the DevOps team (3 people)
  • Each team had its own CI/CD pipeline config, each slightly different from the others. Debugging pipeline failures meant understanding 8 different setups
  • The DevOps team spent 70% of their time on toil — fielding requests, fixing pipelines, managing access
  • No service catalog. "Who owns this service?" required Slack archaeology

What We Built (12 Weeks)

Weeks 1-2: Developer interviews (every single team). Identified the top 5 pain points. Number 1 wasn't what management expected — it was "I can't find documentation for internal services."

Weeks 3-6: Backstage deployment with service catalog populated from GitHub repos. Every service got a catalog-info.yaml. Owners, API docs, runbooks — all in one place. This alone made the DevOps team's Slack volume drop 30%.

Weeks 7-10: Two golden paths: "new API service" and "new worker service." Templates included a standardized Dockerfile, GitHub Actions pipeline, K8s manifests with resource limits, Grafana dashboard, and PagerDuty integration. Result: backstage create → running in staging in 20 minutes.

Weeks 11-12: Self-service database provisioning (RDS Postgres, ElastiCache Redis) via Backstage actions backed by Terraform + Atlantis. No more tickets for "I need a database."

Results After 6 Months

  • New service setup: 1-2 weeks → 25 minutes
  • DevOps team toil: 70% → 30% (they now spend time on platform improvements)
  • Deployment frequency: weekly → 3x daily (averaged across teams)
  • Developer NPS for internal tools: -12 → +38
  • 5 new cloud infrastructure services added without hiring additional DevOps engineers

Total cost: ₹15 lakh in consulting + 3 months of a platform engineer's time. ROI was positive by month 4, primarily from reduced DevOps toil and faster feature delivery.

Frequently Asked Questions

How big does our engineering team need to be to justify platform engineering?

We see the tipping point around 30-40 engineers or 15+ services. Below that, a good DevOps engineer with standardized templates gets you 80% of the benefit. Above that, the coordination costs without a platform become the bottleneck.

Should platform engineers be a separate team or embedded?

Separate team, but with a product mindset. They're building a product for internal developers. Embed temporarily during adoption phases (2-4 weeks per team) to understand pain points, then return to the platform team to build solutions.

Is Backstage worth the investment?

For 50+ services, yes — the service catalog alone justifies it. For smaller setups, start with a simple wiki and CLI tools. Backstage requires ongoing maintenance (React plugins, catalog updates) so budget 20-30% of a developer's time for upkeep.

How do we handle teams that refuse to use the platform?

First, understand why — they usually have valid concerns. If the platform doesn't support their use case, fix it. If it's change resistance, make the golden path so much easier than the alternative that adoption happens naturally. Mandates without value just create workarounds.

Pillai Infotech Engineering Team

We've built internal developer platforms for companies ranging from 30-person startups to 500-person enterprises. Our approach: start small, measure everything, and let developer feedback drive what you build next.
