Platform Engineering Guide — Build Your Internal Developer Platform

Most software organizations do not have a delivery speed problem. They have a friction problem. Every time a developer needs to provision infrastructure, configure a pipeline, set up monitoring, or figure out "the approved way" to deploy a new service, they lose time to organizational overhead rather than building features. Platform engineering is the discipline of eliminating that friction by building internal developer platforms that make the right thing the easy thing.

According to Gartner, by 2026, 80% of large software engineering organizations will establish platform teams as internal providers of reusable services, components, and tools for application delivery. That projection was published in 2023, and adoption has accelerated since. The question is no longer whether platform engineering matters. It is whether your organization will build platforms deliberately or let them emerge accidentally from a patchwork of scripts, wikis, and tribal knowledge.

I have watched engineering organizations scale from 10 to 200 developers. The inflection point where ad-hoc processes collapse is surprisingly consistent: somewhere around 40 to 60 engineers. Before that threshold, individual heroics and Slack messages keep things moving. After it, the absence of a platform becomes the primary drag on engineering velocity. This guide covers what platform engineering is, how it differs from DevOps, what an Internal Developer Platform includes, and how codebase intelligence serves as a foundational layer.

What Is Platform Engineering

Platform engineering is the practice of designing and building internal developer platforms (IDPs) that give software engineers self-service access to the tools, workflows, and infrastructure they need to build, test, deploy, and operate software.

The core idea is treating infrastructure and developer tooling as a product. A platform team operates like an internal product team whose customers are the organization's developers. They research developer needs, build capabilities, measure adoption, and iterate based on feedback, just like a product team building for external users.

The output is an IDP: a curated set of tools, APIs, documentation, and golden paths that abstract away infrastructure complexity. Instead of every team configuring their own CI pipelines, Kubernetes manifests, monitoring dashboards, and secret management from scratch, the platform provides standardized, pre-configured components that teams consume through self-service interfaces.

A practical example: instead of a developer reading a 40-page wiki about how to deploy a new service (provision a namespace, configure Helm charts, set up secrets, create monitoring dashboards, configure alerting rules, register with the service mesh), the platform provides a single command or form that handles all of it. The developer describes what they want ("a new Python service with a PostgreSQL database and Redis cache"). The platform handles the how.
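As a rough illustration of that split between "what" and "how", the sketch below expands a short declarative request into the ordered provisioning steps listed above. The `ServiceRequest` type, the `plan_steps` function, and the step wording are hypothetical names invented for this example, not the API of any particular platform tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRequest:
    """What the developer asks for (hypothetical shape, for illustration)."""
    name: str
    language: str
    dependencies: tuple[str, ...] = ()


def plan_steps(request: ServiceRequest) -> list[str]:
    """Expand a request into the ordered steps the platform would run."""
    steps = [
        f"provision namespace '{request.name}'",
        f"render Helm chart for a {request.language} service",
    ]
    # One provisioning step per backing resource the developer declared.
    steps += [f"provision {dep} for '{request.name}'" for dep in request.dependencies]
    steps += [
        "set up secrets",
        "create monitoring dashboards",
        "configure alerting rules",
        "register with the service mesh",
    ]
    return steps


request = ServiceRequest("orders", "python", ("postgresql", "redis"))
for number, step in enumerate(plan_steps(request), start=1):
    print(f"{number}. {step}")
```

The developer only ever touches `ServiceRequest`; everything inside `plan_steps` is the platform team's concern and can change without the developer noticing.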

Puppet's 2023 State of DevOps Report found that organizations with dedicated platform teams deploy code 4.3x more frequently and have 27% faster lead time for changes than organizations without platform teams. The efficiency gain comes not from doing things faster, but from eliminating the toil that slows things down.

Platform Engineering vs DevOps

Platform engineering and DevOps share goals (faster, more reliable software delivery) but differ in approach. Understanding the distinction prevents confusion and helps organizations adopt platform engineering effectively.

DevOps is a culture and set of practices. DevOps breaks down silos between development and operations. It emphasizes shared responsibility, automation, continuous integration, and continuous delivery. DevOps says: "Developers and operations should collaborate, and everyone should care about the full lifecycle of software."

Platform engineering is a discipline and organizational model. It takes the DevOps principles and operationalizes them through a dedicated team and product. Platform engineering says: "A specialized team should build the internal tools and workflows that make DevOps practices easy for everyone else to follow."

The relationship is complementary, not competitive. DevOps established the cultural foundation. Platform engineering builds the infrastructure that makes DevOps scalable.

Where the distinction becomes practical: in a DevOps-oriented organization without platform engineering, every team is responsible for their own pipelines, infrastructure, and operational tooling. This works at small scale but creates duplication and inconsistency as the organization grows. Team A's deployment process differs from Team B's. Each team's Kubernetes configuration has different patterns. Debugging an incident that spans multiple teams requires understanding multiple bespoke setups.

Platform engineering consolidates these shared concerns into a single, maintained platform. Individual teams still own their services and deployments, but they operate within a consistent framework that the platform team maintains.

For a deeper look at the DevOps principles that platform engineering builds on, our guide to DevOps provides the foundational context.

According to the CNCF's 2024 Platform Engineering survey, 78% of organizations that adopted platform engineering previously practiced DevOps. Platform engineering is the next evolutionary step, not a replacement.

Core Components of an IDP

An Internal Developer Platform is not a single tool. It is an integrated set of capabilities that together provide a self-service developer experience. The core components include:

Infrastructure provisioning. Developers request compute, storage, databases, and networking through a standardized interface. The platform handles provisioning, configuration, and compliance. Terraform, Crossplane, and Pulumi are common infrastructure-as-code tools that platforms wrap with templates and guardrails.

CI/CD pipelines. Standardized build, test, and deployment pipelines that teams consume rather than build from scratch. The platform team defines the pipeline templates. Application teams customize parameters (build targets, test suites, deployment targets) without managing pipeline infrastructure.

Service catalog. A registry of all services in the organization with metadata: ownership, dependencies, SLOs, documentation links, and health status. Backstage (originally from Spotify) is the most widely adopted open-source service catalog. A service catalog answers "what services exist, who owns them, and how are they connected?" at the organizational level.

Secrets and configuration management. Centralized management of environment variables, API keys, certificates, and feature flags with proper access controls and audit logging. Vault (HashiCorp), AWS Secrets Manager, and Doppler are common choices.

Observability integration. Pre-configured monitoring, logging, and tracing that activates automatically when a new service is deployed through the platform. Instead of each team manually instrumenting and configuring dashboards, the platform provides observability out of the box.

Documentation and runbooks. Centralized, discoverable documentation for platform capabilities, service operation, and incident response. This is often the weakest component because documentation maintenance is tedious and frequently deprioritized.

Developer portal. A unified entry point (typically web-based) where developers access all platform capabilities: create services, view the catalog, check pipeline status, read documentation, and manage secrets. This portal is the platform's "user interface."

The 2024 Humanitec IDP Benchmark found that the average enterprise IDP integrates 15 to 25 tools. The platform team's job is not to build all of these tools from scratch but to integrate, configure, and present them as a coherent experience.
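To make the CI/CD component above more concrete, a pipeline template can be pictured as a fixed set of stages plus a small set of team-adjustable parameters. The following is a minimal sketch under that assumption; `PIPELINE_TEMPLATE`, `ALLOWED_OVERRIDES`, and `render_pipeline` are made-up names, and the layout does not correspond to any specific CI system's configuration format.

```python
from copy import deepcopy

# Owned by the platform team: the stages and their default parameters.
PIPELINE_TEMPLATE = {
    "stages": ["build", "test", "security_scan", "deploy"],
    "params": {
        "build_target": "all",
        "test_suite": "unit",
        "deploy_target": "staging",
    },
}

# Application teams may change parameters, but not the stage list.
ALLOWED_OVERRIDES = frozenset(PIPELINE_TEMPLATE["params"])


def render_pipeline(overrides: dict[str, str]) -> dict:
    """Return a pipeline definition with a team's parameter overrides applied."""
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    # Copy first so one team's overrides never leak into the shared template.
    pipeline = deepcopy(PIPELINE_TEMPLATE)
    pipeline["params"].update(overrides)
    return pipeline


team_pipeline = render_pipeline({"test_suite": "unit+integration"})
print(team_pipeline["params"])
```

Rejecting unknown keys is a design choice in this sketch: it keeps the template's stage list under platform-team control while still letting teams tune the parameters they are meant to own.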

Building Golden Paths

Golden paths (sometimes called "paved roads") are the platform team's most important output. A golden path is the pre-defined, recommended way to accomplish a common developer task. It is the path of least resistance, designed so that following it produces the correct result with minimal effort.

The concept is simple: make the right thing the easy thing. If the approved way to deploy a service requires fewer steps and less effort than the non-approved way, developers will follow it naturally. If the approved way requires more effort (filling out forms, waiting for approvals, reading a 30-page wiki), developers will route around it.

Characteristics of effective golden paths:

They are fast. A developer should be able to go from "I need a new service" to "the service is running in staging" in under 30 minutes. If it takes a day, the golden path is not golden.

They are opinionated. Golden paths make choices for the developer (programming language, framework, database, deployment target) based on organizational standards. This is not restrictive. It is liberating. Developers do not want to make infrastructure decisions for every new project. They want to make product decisions.

They are escapable. A golden path is a default, not a mandate. Teams with legitimate reasons to deviate (performance requirements, compliance constraints, specialized technology needs) should be able to do so. But they should know they are leaving the supported path and accept the maintenance burden.

They are documented. A golden path without documentation is just a convention. Document each golden path with: what it provides, what choices it makes, how to use it step by step, and how to customize or escape it.
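One way to read "opinionated but escapable" in code: the golden path supplies defaults, a team may override them, and every override is recorded so the team knows it has left the supported path. This is a sketch only; the default values and the `resolve_stack` helper are illustrative assumptions rather than recommended standards.

```python
from typing import Optional

# Defaults the golden path chooses on the developer's behalf (example values).
GOLDEN_PATH_DEFAULTS = {
    "language": "python",
    "database": "postgresql",
    "deploy_target": "kubernetes",
}


def resolve_stack(
    overrides: Optional[dict[str, str]] = None,
) -> tuple[dict[str, str], list[str]]:
    """Merge overrides onto the defaults and list every deviation.

    Returns the resolved stack plus the keys where the team left the
    golden path, so the deviation is explicit rather than silent.
    """
    overrides = overrides or {}
    stack = {**GOLDEN_PATH_DEFAULTS, **overrides}
    deviations = sorted(
        key for key, value in overrides.items()
        if GOLDEN_PATH_DEFAULTS.get(key) != value
    )
    return stack, deviations


stack, deviations = resolve_stack({"database": "cassandra"})
if deviations:
    print(f"Off the golden path for: {', '.join(deviations)} (team-maintained)")
```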

Spotify's internal platform team, which pioneered many of these concepts, reported that golden paths reduced new service creation time from 2 weeks to under 4 hours. That is not a marginal improvement. It is a category change in developer velocity.

The most common golden paths are: new service creation, database provisioning, CI/CD pipeline setup, production deployment, and monitoring/alerting configuration. Start with these five. Each one eliminates a category of developer toil.

Developer Self-Service

Self-service is the platform's defining principle. If developers must file tickets and wait for another team to provision infrastructure, configure pipelines, or grant access, you do not have a platform. You have a service desk with better tooling.

True self-service means developers can:

Provision resources on demand. Create a database, spin up a new environment, or add a cache without waiting for another team. The platform enforces guardrails (sizing limits, cost controls, security policies) automatically rather than through approval workflows.

Deploy independently. Push code through the pipeline and deploy to staging or production without requiring operations team involvement. The platform ensures compliance (tests pass, security scans clear, change management policies satisfied) through automated checks, not human gates.

Access information instantly. Understand what services exist, who owns them, how they connect, and what their health looks like, all from the developer portal. Reducing the time from "I have a question about the system" to "I have the answer" is one of the platform's highest-value capabilities.

Debug without escalation. When something goes wrong, developers should be able to access logs, traces, metrics, and runbooks without waiting for an on-call engineer from another team. Self-service debugging reduces mean time to recovery and distributes operational knowledge.

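According to a 2024 Puppet survey, organizations with self-service developer platforms report 43% less time spent on operational tasks per developer per week. How much that recovers depends on the baseline: if each developer previously spent the equivalent of one working week per year on operational tasks, a 100-developer organization recovers approximately 43 engineer-weeks per year for product work, and heavier baselines scale the saving proportionally.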
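The engineer-week figure can be reproduced with one multiplication once the baseline is stated. In the sketch below the baseline time per developer is an explicit input; the values used are assumptions for illustration, not survey data.

```python
def engineer_weeks_recovered(
    developers: int,
    baseline_ops_weeks_per_dev: float,
    reduction: float = 0.43,
) -> float:
    """Engineer-weeks per year freed by cutting operational time by `reduction`."""
    return developers * baseline_ops_weeks_per_dev * reduction


# Assumed baselines: 1 week and 5 weeks of operational work per developer per year.
for baseline in (1.0, 5.0):
    weeks = engineer_weeks_recovered(100, baseline)
    print(f"baseline {baseline:.0f} wk/dev/yr -> {weeks:.0f} engineer-weeks recovered")
```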

The self-service principle has limits. Some operations (production database migrations, major infrastructure changes, security-sensitive configurations) warrant human review. The platform should make these exceptions explicit and provide clear escalation paths. The goal is not to eliminate all human oversight. It is to eliminate unnecessary human bottlenecks.
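The contrast between automatic guardrails and human gates can be sketched as a policy function that approves most requests immediately and routes only the explicit exceptions to review. The limits, the `SENSITIVE_KINDS` set, and the `evaluate_request` function below are hypothetical; a real platform might express such rules in a dedicated policy engine rather than inline code.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"          # provisioned immediately, no ticket
    NEEDS_REVIEW = "needs_review"  # explicit exception, goes to a human
    REJECTED = "rejected"          # violates a hard guardrail


@dataclass(frozen=True)
class ResourceRequest:
    kind: str  # e.g. "database", "cache", "prod_db_migration"
    size_gb: int
    monthly_cost_usd: float


# Example guardrail values; these numbers are assumptions for the sketch.
MAX_SIZE_GB = 500
MAX_MONTHLY_COST_USD = 2000.0
SENSITIVE_KINDS = {"prod_db_migration", "network_change"}


def evaluate_request(request: ResourceRequest) -> Decision:
    """Apply guardrails automatically; escalate only the listed exceptions."""
    if request.size_gb > MAX_SIZE_GB or request.monthly_cost_usd > MAX_MONTHLY_COST_USD:
        return Decision.REJECTED
    if request.kind in SENSITIVE_KINDS:
        return Decision.NEEDS_REVIEW
    return Decision.APPROVED


print(evaluate_request(ResourceRequest("database", 100, 300.0)).value)
```

Keeping the review list short and explicit is the point: anything not named in `SENSITIVE_KINDS` is decided without a human in the loop.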

Platform Engineering Tools

The platform engineering tooling ecosystem has grown rapidly. Understanding the major categories and tools helps teams make informed decisions.

Service catalogs and developer portals: Backstage (CNCF, originally Spotify) is the dominant open-source option. Port and Cortex are commercial alternatives with more opinionated feature sets. These tools provide the "front door" to your platform.

Infrastructure-as-code: Terraform remains the most widely used. Crossplane takes a Kubernetes-native approach, defining infrastructure as Kubernetes custom resources. Pulumi lets you define infrastructure in general-purpose programming languages (TypeScript, Python, Go) rather than domain-specific languages.

Internal developer platforms (integrated): Humanitec, Kratix, and Qovery provide integrated platform frameworks that bundle multiple capabilities. These reduce integration work but may constrain flexibility.

CI/CD: GitHub Actions, GitLab CI, and ArgoCD (for Kubernetes-native GitOps deployments) are the most adopted. The platform team's role is not to build CI/CD but to create standardized pipeline templates that application teams consume.

Score (CNCF): An open-source specification for describing workload requirements in a platform-agnostic way. A developer writes a Score file describing what their service needs (compute, database, cache), and the platform translates it to the appropriate infrastructure primitives.

The common pattern across successful platform teams: they do not build from scratch. They integrate best-of-breed tools and present them through a unified interface. The platform team's value is in the integration, documentation, and golden paths, not in building yet another deployment tool.

For teams tracking DORA metrics, the platform should measure and report deployment frequency, lead time, change failure rate, and mean time to recovery at both the platform level and per-team level. These metrics quantify the platform's impact and guide prioritization of platform improvements.
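As a minimal sketch of what measuring these four metrics could look like, the function below derives them from a list of deployment records. The `Deployment` record shape and the choice of a median for lead time are assumptions made for the example; how a platform actually collects this data will vary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, median
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set once a failed deploy is recovered


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict[str, float]:
    """Compute the four DORA metrics over one reporting period."""
    if not deployments:
        raise ValueError("no deployments in period")
    lead_times_h = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deployments
    ]
    failures = [d for d in deployments if d.failed]
    recoveries_h = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(lead_times_h),
        "change_failure_rate": len(failures) / len(deployments),
        # Recovery time is undefined when nothing failed and was restored.
        "mean_time_to_recovery_hours": mean(recoveries_h) if recoveries_h else float("nan"),
    }


start = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=2)),
    Deployment(start, start + timedelta(hours=6), failed=True,
               restored_at=start + timedelta(hours=7)),
]
print(dora_metrics(sample, period_days=7))
```

Running the same function once over all deployments and once per team's subset gives the platform-level and per-team views.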

Measuring Platform Success

A platform without metrics is a project without accountability. Measuring platform success requires both adoption metrics (are developers using it?) and impact metrics (is it making them faster?).

Adoption metrics:

Percentage of services deployed through the platform. Target: 80%+ within 18 months of platform launch. Services that bypass the platform represent either gaps in platform capabilities or cultural resistance, both of which need attention.

New service creation rate through golden paths versus manual provisioning. If developers are still setting up services manually, the golden paths are not meeting their needs.

Developer portal active users as a percentage of total developers. If 60% of developers never visit the portal, it is not providing enough value to earn a place in their workflow.
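The adoption metrics above reduce to a few ratios. A small sketch, assuming the platform can export a per-service record and portal usage counts; the `ServiceRecord` fields and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRecord:
    name: str
    deployed_via_platform: bool
    created_via_golden_path: bool


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, treating an empty whole as 0%."""
    return 100.0 * part / whole if whole else 0.0


def adoption_report(
    services: list[ServiceRecord],
    portal_active_users: int,
    total_developers: int,
) -> dict[str, float]:
    """Summarize the three adoption metrics as percentages."""
    return {
        "services_on_platform_pct": share(
            sum(s.deployed_via_platform for s in services), len(services)
        ),
        "golden_path_creation_pct": share(
            sum(s.created_via_golden_path for s in services), len(services)
        ),
        "portal_active_users_pct": share(portal_active_users, total_developers),
    }


# Illustrative data only.
services = [
    ServiceRecord("orders", True, True),
    ServiceRecord("billing", True, False),
    ServiceRecord("legacy-batch", False, False),
    ServiceRecord("search", True, True),
]
report = adoption_report(services, portal_active_users=45, total_developers=100)
print(report)
```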

Impact metrics:

Time from code commit to production deployment. This is the lead time component of DORA metrics and the most direct measure of platform efficiency.

Developer time spent on operational tasks. Survey-based, quarterly. A decreasing trend validates the platform's self-service investment.

Mean time to recovery from incidents. Faster recovery indicates that the platform's observability integration and runbook documentation are working.

Onboarding time for new developers. How quickly can a new engineer make their first deployment? The platform should compress this from weeks to days.

A 2024 McKinsey study on developer productivity found that top-performing engineering organizations spend 30% less time on operational overhead compared to bottom performers. Platform engineering is the primary mechanism for achieving that difference.

Codebase Intelligence as a Platform Layer

Traditional platform engineering focuses on infrastructure, pipelines, and tooling. But there is a layer that most platforms miss entirely: codebase intelligence.

Developers do not just need tools to deploy code. They need to understand the code they are deploying: what it does, how it connects to other services, who owns it, and what the implications of changes are. This knowledge layer is typically left to tribal knowledge, ad-hoc documentation, and architecture diagrams that went stale six months ago.

Codebase intelligence fills this gap by providing automated, always-current understanding of the codebase. It answers questions that developers ask dozens of times per week: "What does this service do?" "Which other services depend on it?" "Who was the last person to modify this module?" "What is the blast radius if I change this function?"
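The "blast radius" question is, at its core, a graph traversal: start from the thing being changed and follow reverse dependencies until nothing new turns up. The sketch below does this at service granularity with a breadth-first search; the service names and the `DEPENDS_ON` map are invented for illustration, and this is not a description of how any particular product computes it.

```python
from collections import defaultdict, deque

# service -> services it calls directly (illustrative data)
DEPENDS_ON = {
    "checkout": ["payment-service", "cart"],
    "payment-service": ["ledger"],
    "cart": ["catalog"],
    "reporting": ["ledger"],
    "ledger": [],
    "catalog": [],
}


def blast_radius(changed: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every service that directly or transitively depends on `changed`."""
    # Invert the edges so we can walk from a service to its callers.
    dependents = defaultdict(set)
    for service, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(service)

    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in dependents[current]:
            if caller not in affected:
                affected.add(caller)
                queue.append(caller)
    return affected


print(sorted(blast_radius("ledger", DEPENDS_ON)))
```

The same traversal applies at finer granularity (modules or functions) if the dependency map is built from code analysis instead of a hand-maintained list.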

When integrated into the internal developer platform, codebase intelligence transforms the service catalog from a static registry into a living knowledge base. Instead of a catalog entry that says "payment-service: owned by Team Payments, deployed on Kubernetes," the entry becomes an interactive resource where developers can explore the service's architecture, understand its dependencies, and assess the impact of proposed changes.

Glue provides this codebase intelligence layer. By connecting to your Git repositories and building a semantic understanding of your code, Glue gives platform teams the ability to surface codebase knowledge alongside infrastructure capabilities. A developer using the platform to create a new service can simultaneously understand how that service will interact with existing services, not through manual documentation but through AI-analyzed code reality.

For engineering leaders building platform teams, adding codebase intelligence is a force multiplier. The platform already standardizes how developers deploy and operate code. Codebase intelligence standardizes how developers understand code. Together, they reduce the two biggest sources of developer friction: "how do I ship this?" and "how does this system work?"

The result is a platform that does not just accelerate delivery but accelerates understanding. And understanding is what transforms new hires into productive contributors, what turns post-incident reviews into lasting improvements, and what makes architectural decisions informed rather than guessed.

FAQ

What is platform engineering?

Platform engineering is the discipline of designing and building internal developer platforms that give engineers self-service access to the tools, workflows, and infrastructure they need to build, test, deploy, and operate software. A platform team operates like an internal product team whose customers are the organization's developers. The output is an integrated set of capabilities (infrastructure provisioning, CI/CD, service catalogs, observability, documentation) that abstract away operational complexity and let developers focus on writing code rather than managing infrastructure.

How is platform engineering different from DevOps?

DevOps is a cultural movement and set of practices focused on collaboration between development and operations teams. Platform engineering is an organizational model that operationalizes DevOps principles through a dedicated team and product. DevOps says "everyone should care about the full software lifecycle." Platform engineering says "a specialized team should build the tools that make caring about the full lifecycle easy." Most organizations that adopt platform engineering previously practiced DevOps. Platform engineering is the next step, not a replacement.

What tools do platform engineers use?

Platform engineers commonly use: Backstage or Port for service catalogs and developer portals; Terraform, Crossplane, or Pulumi for infrastructure-as-code; GitHub Actions, GitLab CI, or ArgoCD for CI/CD; HashiCorp Vault or AWS Secrets Manager for secrets management; and Kubernetes as the runtime platform. The platform team's primary value is not in any single tool but in integrating these tools into a coherent, self-service experience with standardized golden paths and documentation.

Do small teams need platform engineering?

Teams with fewer than 30 to 40 developers generally do not need a dedicated platform team. At that scale, ad-hoc processes and direct communication work well enough. The inflection point typically arrives between 40 and 60 engineers, when the duplication, inconsistency, and friction of every team managing its own tooling become a measurable drag on velocity. However, small teams can benefit from platform thinking: standardizing common workflows, documenting golden paths, and choosing consistent tools. You do not need a platform team to start building a platform. You just need someone who cares about developer experience.
