Platform Engineering Guide: Build an IDP

Platform engineering has gone from obscure Gartner mention to every tech company's hiring priority in about two years. But strip away the hype and the real question is simple: how do you make your developers faster without hiring more DevOps engineers? That's what an Internal Developer Platform (IDP) solves — when it's built right.

What We'll Cover

What Platform Engineering Actually Is (and Isn't)
Why Platform Engineering Is Happening Now
Golden Paths: The Core Concept
IDP Components and Architecture
Tool Landscape
Build vs. Buy vs. Assemble
Measuring Developer Experience
Case Study: 60-Person Engineering Org
FAQ

What Platform Engineering Actually Is (and Isn't)

Platform engineering is building and maintaining a self-service layer on top of your infrastructure so that application developers can deploy, monitor, and operate their services without filing tickets or waiting for another team.

Let's kill some misconceptions:

It is NOT	It IS
DevOps rebranded	A product discipline applied to internal tools
Forcing everyone to use Kubernetes	Abstracting complexity so devs don't need to care what's underneath
Building an internal AWS console clone	Paved roads that handle 80% of use cases with zero friction
A ticket system with a UI	Actual self-service — devs click a button, stuff happens in minutes
One team building everything	An enabling team that treats developers as customers

The mindset shift that matters: your platform is a product. Developers are your users. If they're not using it voluntarily, it's failed — no matter how technically elegant it is.

Why Platform Engineering Is Happening Now

Three forces collided:

1. "You build it, you run it" scaled poorly. Giving every team full DevOps responsibility sounded great until you had 15 teams, each reinventing deployment pipelines, monitoring setups, and incident response from scratch. The cognitive load became unsustainable.

2. Cloud complexity exploded. AWS alone has 200+ services. A developer who wants to deploy a simple web app faces choices about VPCs, security groups, IAM roles, load balancers, container orchestration, secrets management, logging, monitoring, alerting, DNS, SSL certificates... The infrastructure menu got too long.

3. Developer productivity became a C-level concern. McKinsey's developer productivity report, the DORA metrics movement, and the realization that developer time is the most expensive resource in most tech companies — all of this pushed productivity tooling from "nice to have" to strategic priority.

Golden Paths: The Core Concept

A golden path is the blessed, paved, supported way to do something. Not the only way — the easy way. Teams can go off-path, but they take on the complexity themselves.

What a Good Golden Path Looks Like

For a "deploy a new microservice" golden path:

1. Developer runs platform create service --template=api (or clicks a button in a portal)
2. Platform scaffolds: Git repo with CI/CD pipeline, Dockerfile, Kubernetes manifests, monitoring dashboards, alerting rules, service catalog entry
3. Developer writes code, pushes to main
4. Pipeline builds, tests, deploys to staging automatically
5. Developer promotes to production via a single PR merge or button click
6. Monitoring, logging, and alerting are already configured, with zero setup

Time from "I need a new service" to "it's running in staging": under 30 minutes. That's the target.
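The scaffolding step is the part most teams automate first. Here is a minimal sketch of it in Python, assuming a hypothetical in-memory skeleton: the file names mirror the scaffold list above, while the file contents, the `scaffold_service` function, and the example service and team names are placeholders we made up for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical skeleton: file path -> content template. The file names mirror
# the scaffold list above; the contents are placeholders, not real configs.
API_TEMPLATE = {
    ".github/workflows/ci.yml": "# CI pipeline for {name}\n",
    "Dockerfile": "# Container build for {name}\n",
    "k8s/deployment.yaml": "# Deployment manifest for {name}\n",
    "monitoring/dashboard.json": '{{"title": "{name}"}}\n',
    "monitoring/alerts.yaml": "# Alert rules for {name}\n",
    "catalog-info.yaml": "# Catalog entry for {name}, owned by {owner}\n",
}


def scaffold_service(name: str, owner: str, template: dict = API_TEMPLATE) -> dict:
    """Render every file in the template and return a path -> content mapping.

    Nothing is written to disk, so the result can be inspected or tested
    before it is committed to a new repository.
    """
    if not name or not name.replace("-", "").isalnum():
        raise ValueError(f"invalid service name: {name!r}")
    return {
        str(PurePosixPath(name) / path): content.format(name=name, owner=owner)
        for path, content in template.items()
    }


files = scaffold_service("orders-api", owner="team-checkout")
for path in sorted(files):
    print(path)
```

Returning a mapping instead of writing files keeps the sketch easy to test. A real scaffolder would then push these files to a new Git repository.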

Golden Path Examples by Use Case

Use Case	Golden Path	What It Abstracts
New service	Service template with CI/CD, monitoring, K8s manifests	Infrastructure provisioning, pipeline setup, observability config
New database	Self-service form: Postgres/MySQL/Redis, size, region → provisioned in 5 min	Terraform, security groups, backup config, credential rotation
New environment	Clone staging → new namespace with isolated resources	Namespace creation, resource quotas, network policies, secrets
Add monitoring	Standard dashboards auto-generated per service; custom metrics via annotations	Prometheus config, Grafana dashboards, alerting rules
Secret management	CLI command or UI to store/rotate secrets; auto-injected at deploy time	Vault/KMS setup, K8s secrets, rotation schedules
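To make the "new database" row concrete, here is a rough sketch of the validation a self-service form could run before any provisioning starts. The engine list comes from the table. The size tiers, regions, and the `DatabaseRequest` type are assumptions for illustration, not part of any particular tool.

```python
from dataclasses import dataclass

# Engines come from the table above; the size tiers and regions are
# hypothetical values a platform team would replace with its own.
ALLOWED_ENGINES = {"postgres", "mysql", "redis"}
ALLOWED_SIZES = {"small", "medium", "large"}
ALLOWED_REGIONS = {"ap-south-1", "us-east-1"}


@dataclass(frozen=True)
class DatabaseRequest:
    service: str
    engine: str
    size: str
    region: str


def validate(request: DatabaseRequest) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if request.engine not in ALLOWED_ENGINES:
        problems.append(f"unsupported engine: {request.engine}")
    if request.size not in ALLOWED_SIZES:
        problems.append(f"unsupported size: {request.size}")
    if request.region not in ALLOWED_REGIONS:
        problems.append(f"unsupported region: {request.region}")
    return problems


ok = DatabaseRequest("orders-api", "postgres", "small", "ap-south-1")
bad = DatabaseRequest("orders-api", "oracle", "huge", "ap-south-1")
print(validate(ok))
print(validate(bad))
```

Rejecting a bad request at the form, with a readable reason, is what keeps the request from turning back into a ticket.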

IDP Components and Architecture

An Internal Developer Platform isn't one tool — it's a stack. Here's how the layers fit together.

Layer 1: Developer Portal (The Front Door)

This is what developers interact with. A web UI and/or CLI that provides the service catalog (what exists, who owns it, how to reach it), self-service workflows (create service, provision database, manage secrets), documentation (auto-generated API docs, runbooks), and team/ownership information.

Layer 2: Orchestration Engine (The Brain)

Takes developer requests and translates them into infrastructure actions. When a developer says "create a new service," this layer triggers the template engine, runs Terraform, configures the GitOps pipeline, and registers the service in the catalog. Most teams build this with a combination of CI/CD pipelines and custom automation.
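The "create a new service" flow can be sketched as an ordered list of steps that stops at the first failure. This is a simplified illustration, not how any specific orchestrator works: the step names follow the paragraph above, and the `record` stubs stand in for real calls to a template engine, Terraform, a GitOps tool, and the catalog.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[str], None]]


def run_workflow(service: str, steps: List[Step]) -> List[str]:
    """Run steps in order; stop at the first failure and say how far we got."""
    completed: List[str] = []
    for name, action in steps:
        try:
            action(service)
        except Exception as exc:
            raise RuntimeError(f"step {name!r} failed after {completed}") from exc
        completed.append(name)
    return completed


# Stub actions that only record what they were asked to do. In a real
# platform each one would call out to the corresponding tool.
log: List[str] = []


def record(action_name: str) -> Callable[[str], None]:
    def action(service: str) -> None:
        log.append(f"{action_name}:{service}")
    return action


CREATE_SERVICE: List[Step] = [
    ("render_template", record("render_template")),
    ("apply_terraform", record("apply_terraform")),
    ("configure_gitops", record("configure_gitops")),
    ("register_in_catalog", record("register_in_catalog")),
]

done = run_workflow("orders-api", CREATE_SERVICE)
print(done)
```

Reporting which steps completed before a failure matters in practice: a half-created service needs either a retry from that point or a cleanup.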

Layer 3: Infrastructure Layer (The Foundation)

Kubernetes clusters, cloud resources, networking, security policies. This is managed by the platform team directly — developers shouldn't need to touch it. Infrastructure as Code (Terraform, Pulumi) manages the state.

Layer 4: Observability Layer (The Feedback Loop)

Unified logging, metrics, tracing, and alerting across all services. This needs to be consistent regardless of which team built the service. The four golden signals (latency, traffic, errors, saturation) should be pre-configured for every service.
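One way to keep golden-signal alerting consistent is to generate the default alerts from the service name instead of writing them by hand. A minimal sketch, assuming hypothetical metric names and thresholds; real values depend on your monitoring stack and each service's SLOs.

```python
from typing import Dict, List, Optional

GOLDEN_SIGNALS = ("latency", "traffic", "errors", "saturation")

# Hypothetical defaults: (metric name, comparison, threshold). These are
# illustrative only and not tied to any particular monitoring tool.
DEFAULT_THRESHOLDS = {
    "latency": ("p99_latency_ms", ">", 500),
    "traffic": ("requests_per_second", "<", 1),
    "errors": ("error_rate_percent", ">", 1),
    "saturation": ("cpu_utilization_percent", ">", 80),
}


def default_alerts(service: str, overrides: Optional[Dict[str, float]] = None) -> List[dict]:
    """Build one alert per golden signal, applying per-service overrides."""
    overrides = overrides or {}
    unknown = set(overrides) - set(GOLDEN_SIGNALS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    alerts = []
    for signal in GOLDEN_SIGNALS:
        metric, comparison, threshold = DEFAULT_THRESHOLDS[signal]
        value = overrides.get(signal, threshold)
        alerts.append({
            "name": f"{service}-{signal}",
            "service": service,
            "condition": f"{metric} {comparison} {value}",
        })
    return alerts


alerts = default_alerts("orders-api", overrides={"latency": 250})
for alert in alerts:
    print(alert["name"], "->", alert["condition"])
```

Teams get sensible defaults for free and override only the thresholds that don't fit their service.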

# Example: Platform service template (what gets generated)
# templates/api-service/skeleton/
#
# .github/workflows/ci.yml    - Build, test, scan, deploy
# Dockerfile                   - Multi-stage, security-hardened
# k8s/
#   deployment.yaml            - Resource limits, health checks
#   service.yaml               - ClusterIP + Ingress
#   hpa.yaml                   - Autoscaling thresholds
# monitoring/
#   dashboard.json             - Grafana dashboard (auto-imported)
#   alerts.yaml                - Critical alerts pre-configured
# catalog-info.yaml            - Backstage service registration

apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: {{ service_name }}
  description: {{ description }}
  annotations:
    github.com/project-slug: pillai-infotech/{{ service_name }}
    grafana/dashboard-selector: "service={{ service_name }}"
    pagerduty.com/service-id: {{ pagerduty_id }}
spec:
  type: service
  lifecycle: production
  owner: {{ team_name }}
  providesApis:
    - {{ service_name }}-api
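To show how the {{ ... }} placeholders in the template above get filled in, here is a plain substitution sketch in Python. It is not a full template engine and not Backstage's own scaffolder. The `render` function, the shortened template, and the example values are assumptions for illustration.

```python
import re

# Matches placeholders such as {{ service_name }} (inner whitespace optional).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

# A shortened copy of the catalog-info.yaml template shown above.
CATALOG_TEMPLATE = """\
metadata:
  name: {{ service_name }}
  description: {{ description }}
spec:
  owner: {{ team_name }}
  providesApis:
    - {{ service_name }}-api
"""


def render(template: str, values: dict) -> str:
    """Replace every placeholder; fail loudly if a value is missing."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder {key!r}")
        return values[key]
    return PLACEHOLDER.sub(substitute, template)


rendered = render(CATALOG_TEMPLATE, {
    "service_name": "orders-api",
    "description": "Order management API",
    "team_name": "team-checkout",
})
print(rendered)
```

Failing on a missing value is deliberate: a catalog entry with an unrendered placeholder is harder to spot later than an error at creation time.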

Tool Landscape: What We've Seen Work

Layer	Tool	Best For	Our Take
Portal	Backstage (Spotify)	Service catalog, docs, plugins	The standard, but requires significant investment to customize. Start here if you have 50+ services
Portal	Port	Self-service actions, RBAC	Easier to set up than Backstage. Good for teams that want results fast without maintaining React plugins
Portal	Cortex	Scorecards, standards enforcement	Best for "are our services meeting our standards?" visibility. Pairs well with Backstage
Orchestration	Argo CD + Argo Workflows	GitOps + workflow automation	Our default recommendation. Argo CD for deployments, Workflows for platform actions
Orchestration	Crossplane	Cloud resources as K8s custom resources	Powerful but complex. Worth it if your entire platform runs on K8s and you want one control plane
IaC	Terraform + Atlantis	Infrastructure provisioning via PRs	Battle-tested. Atlantis adds PR-based workflow which fits developer habits
Templates	Cookiecutter / Yeoman	Service scaffolding	Simple, effective. Don't over-engineer this — a good template with a shell script works fine to start

Build vs. Buy vs. Assemble

This is the question every platform team faces. After helping teams at different stages, here's our framework:

Assemble (Recommended for Most)

Pick open-source building blocks (Backstage, Argo CD, Terraform, Grafana), glue them together with automation, build custom pieces only where off-the-shelf doesn't fit. This is what 80% of platform teams should do.

Pros: Flexible, community-supported, no vendor lock-in, can start small.
Cons: Integration work, maintenance burden, requires solid engineering skills.

Buy (For Fast Results)

Commercial platforms like Humanitec, Massdriver, or Qovery provide pre-built IDPs. Good for teams that need results fast and have budget but limited platform engineering headcount.

Pros: Fast to deploy, maintained by vendor, less engineering needed.
Cons: Opinionated (may not fit your workflow), ongoing cost, potential vendor lock-in.

Build (Only If You Must)

Building a fully custom platform from scratch only makes sense at very large scale (500+ engineers) with very specific requirements that no existing tool addresses. Companies like Uber, Spotify, and Netflix built custom platforms — but they also have 50+ platform engineers.

Measuring Developer Experience

If you don't measure it, you can't prove your platform is working. Here are the metrics that matter.

Metric	What It Measures	Target	Red Flag
Time to first deploy	New service → running in staging	< 30 minutes	> 2 days (means golden paths are missing)
Deployment frequency	How often teams ship (DORA metric)	Daily or more	Weekly or less (pipeline friction)
Lead time for changes	Commit → production (DORA metric)	< 1 hour	> 1 week
Platform adoption rate	% of teams using platform vs going rogue	> 80%	< 50% (platform doesn't meet needs)
Ticket volume	Requests to platform team that should be self-service	Decreasing monthly	Increasing (missing self-service capabilities)
Developer NPS	Would developers recommend the platform?	> 30	Negative (serious usability issues)
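Two of these metrics are simple arithmetic once you have survey and usage data. The sketch below uses the standard NPS definition (percentage of scores 9-10 minus percentage of scores 0-6). The survey scores and team counts are made-up sample data.

```python
def net_promoter_score(scores: list) -> float:
    """NPS = % promoters (9-10) minus % detractors (0-6), on a 0-10 survey."""
    if not scores:
        raise ValueError("need at least one survey response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


def adoption_rate(teams_on_platform: int, total_teams: int) -> float:
    """Percentage of teams deploying through the platform."""
    if total_teams <= 0:
        raise ValueError("total_teams must be positive")
    return 100 * teams_on_platform / total_teams


# Hypothetical sample data, for illustration only.
survey = [10, 9, 9, 8, 7, 6, 3, 10]
print(net_promoter_score(survey))   # 4 promoters, 2 detractors, 8 responses
print(adoption_rate(7, 8))
```

Note that scores of 7 and 8 count toward the total but neither add to nor subtract from the score, so a platform that is merely "fine" lands near zero.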

The metric we care about most: "How long does it take a new engineer to deploy their first change to production?" This single number captures onboarding quality, documentation, golden path completeness, and pipeline reliability. Track it quarterly.
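Tracking this number needs little more than two dates per engineer. A minimal sketch, assuming you can pull each new engineer's start date and first production deploy date from HR and deployment records. The dates below are invented sample data.

```python
from datetime import date
from statistics import median


def days_to_first_deploy(start: date, first_deploy: date) -> int:
    """Calendar days between an engineer's start and first production deploy."""
    return (first_deploy - start).days


def median_days_to_first_deploy(records: list) -> float:
    """Median across engineers; less sensitive to one slow outlier than a mean."""
    if not records:
        raise ValueError("no onboarding records")
    return median(days_to_first_deploy(start, deploy) for start, deploy in records)


# Hypothetical (start date, first production deploy) pairs for one quarter.
onboarding = [
    (date(2024, 1, 8), date(2024, 1, 10)),
    (date(2024, 2, 5), date(2024, 2, 12)),
    (date(2024, 3, 4), date(2024, 3, 7)),
]
print(median_days_to_first_deploy(onboarding))
```

We use the median here so that a single engineer who started during a release freeze doesn't distort the quarter's number.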

Case Study: 60-Person Engineering Org

A SaaS company in Bangalore — 60 engineers, 8 teams, ~40 microservices on AWS EKS. Their problems were typical:

New service setup took 1-2 weeks of tickets and back-and-forth with the DevOps team (3 people)
Each team had its own CI/CD pipeline config, slightly different. Debugging pipeline failures meant understanding 8 different setups
The DevOps team spent 70% of their time on toil — fielding requests, fixing pipelines, managing access
No service catalog. "Who owns this service?" required Slack archaeology

What We Built (12 Weeks)

Weeks 1-2: Developer interviews (every single team). Identified the top 5 pain points. Number 1 wasn't what management expected — it was "I can't find documentation for internal services."

Weeks 3-6: Backstage deployment with service catalog populated from GitHub repos. Every service got a catalog-info.yaml. Owners, API docs, runbooks — all in one place. This alone made the DevOps team's Slack volume drop 30%.

Weeks 7-10: Two golden paths: "new API service" and "new worker service." Templates included standardized Dockerfile, GitHub Actions pipeline, K8s manifests with resource limits, Grafana dashboard, PagerDuty integration. From creating a service through the Backstage template to running in staging: 20 minutes.

Weeks 11-12: Self-service database provisioning (RDS Postgres, ElastiCache Redis) via Backstage actions backed by Terraform + Atlantis. No more tickets for "I need a database."

Results After 6 Months

New service setup: 1-2 weeks → 25 minutes
DevOps team toil: 70% → 30% (they now spend time on platform improvements)
Deployment frequency: weekly → 3x daily (averaged across teams)
Developer NPS for internal tools: -12 → +38
5 new cloud infrastructure services added without hiring additional DevOps engineers

Total cost: ₹15 lakh in consulting + 3 months of a platform engineer's time. ROI was positive by month 4, primarily from reduced DevOps toil and faster feature delivery.
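The toil reduction translates directly into freed capacity. The sketch below works only from figures given in this case study (a 3-person DevOps team, toil going from 70% to 30%). It is not the ROI calculation itself, which would also need salary and delivery data not included here.

```python
def freed_capacity_fte(team_size: int, toil_before: float, toil_after: float) -> float:
    """Full-time equivalents freed when the toil share drops."""
    if not (0 <= toil_after <= toil_before <= 1):
        raise ValueError("toil shares must satisfy 0 <= after <= before <= 1")
    return team_size * (toil_before - toil_after)


# Case-study figures: 3 DevOps engineers, toil 70% -> 30%.
freed = freed_capacity_fte(3, 0.70, 0.30)
print(round(freed, 2))  # 1.2
```

In other words, roughly 1.2 engineers' worth of time moved from fielding requests to platform improvements, without changing headcount.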

Frequently Asked Questions

How big does our engineering team need to be to justify platform engineering?

We see the tipping point around 30-40 engineers or 15+ services. Below that, a good DevOps engineer with standardized templates gets you 80% of the benefit. Above that, the coordination costs without a platform become the bottleneck.

Should platform engineers be a separate team or embedded?

Separate team, but with a product mindset. They're building a product for internal developers. Embed temporarily during adoption phases (2-4 weeks per team) to understand pain points, then return to the platform team to build solutions.

Is Backstage worth the investment?

For 50+ services, yes — the service catalog alone justifies it. For smaller setups, start with a simple wiki and CLI tools. Backstage requires ongoing maintenance (React plugins, catalog updates) so budget 20-30% of a developer's time for upkeep.

How do we handle teams that refuse to use the platform?

First, understand why — they usually have valid concerns. If the platform doesn't support their use case, fix it. If it's change resistance, make the golden path so much easier than the alternative that adoption happens naturally. Mandates without value just create workarounds.

Pillai Infotech Engineering Team

We've built internal developer platforms for companies ranging from 30-person startups to 500-person enterprises. Our approach: start small, measure everything, and let developer feedback drive what you build next.
