The AI-Native Internal Developer Platform

Internal Developer Platforms have a well-known dirty secret: most developers do not use them. The platform team builds something genuinely useful — curated pipelines, Terraform modules, Helm charts, GitOps patterns, cost guardrails — wraps it in a service catalogue, adds a web portal, writes runbooks, and waits. Adoption trickles in, usually driven by mandate rather than enthusiasm.

This is not a technology problem. It is a distribution problem. The platform team solved the right problem and delivered it to the wrong address.

The Web Portal is Not Where Developers Live

The classic IDP model assumes developers will context-switch into the platform’s world. They go to the portal, find the right template, follow a runbook, submit a request, and wait for the pipeline to run. Each step is friction. Each unfamiliar UI or undiscoverable option is a reason to ask a colleague, open a ticket, or just do it the old way.

The first attempt to reduce that friction was the chatbot: train a bot on Confluence pages, wire it into Slack or Teams, and let developers ask natural-language questions. This helped with documentation discovery but did not solve the deployment or provisioning problem.

Consider what a developer actually wants: “I need a small database for a POC” or “I need something big enough to run a load test against.” They do not think in terms of RDS versus Aurora versus Azure Database for PostgreSQL. They think in terms of size and purpose. But the chatbot — trained on your Confluence runbooks — only knows how to answer “how do I create an RDS instance?” That question already assumes the developer knows your cloud provider, your approved database service, and your team’s naming conventions. Most of them do not, and they should not have to.

There is a deeper problem here too. This opinionated, platform-specific design is exactly the kind of knowledge that is hard to document well. The decisions are often implicit: we use db.t3.medium for dev because it fits the budget, we never use db.t3.micro in production because it caused incidents, we default to eu-west-1 unless the workload is latency-sensitive. Platform teams know these things. They rarely write them down comprehensively. And the chatbot, trained on whatever Confluence pages exist, does not know what it does not know — so it gives confident, partially correct answers that erode trust faster than no answer at all.
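One way to stop these decisions living only in people’s heads is to write them down as data that both humans and agents can read. The sketch below is illustrative rather than a recommended schema: the function names, the "small" size label, and the table layout are all hypothetical. Only the three rules themselves (db.t3.medium for dev, no db.t3.micro in production, eu-west-1 by default) come from the examples above.

```python
from dataclasses import dataclass

DEFAULT_REGION = "eu-west-1"

# (size, environment) -> instance class. Only this one row comes from the
# text; the "small" label is a hypothetical name for the developer's intent.
INSTANCE_CLASSES = {
    ("small", "dev"): "db.t3.medium",  # fits the dev budget
}

# Instance classes that must never be used in a given environment.
FORBIDDEN = {
    "production": {"db.t3.micro"},  # caused incidents
}


@dataclass(frozen=True)
class DatabaseChoice:
    instance_class: str
    region: str


def check_allowed(instance_class, environment):
    """Raise ValueError if the class is banned in this environment."""
    if instance_class in FORBIDDEN.get(environment, set()):
        raise ValueError(f"{instance_class} is not allowed in {environment}")


def resolve_database(size, environment, latency_sensitive_region=None):
    """Turn a size-and-purpose request into a concrete, approved choice."""
    key = (size, environment)
    if key not in INSTANCE_CLASSES:
        raise ValueError(f"no approved instance class for {size!r} in {environment!r}")
    instance_class = INSTANCE_CLASSES[key]
    check_allowed(instance_class, environment)
    # Use the default region unless the workload is latency-sensitive.
    region = latency_sensitive_region or DEFAULT_REGION
    return DatabaseChoice(instance_class, region)
```

A request the table does not cover fails loudly instead of producing a confident guess, which is the failure mode the chatbot could not avoid.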

The adoption problem remained. The runbooks were cleaner, but the journey was just as long.

Developers Already Have an AI Agent Open

Something shifted in the last eighteen months. Developers at every level — senior engineers, juniors, contractors — now spend a significant portion of their day talking to AI coding tools. Claude Code, GitHub Copilot, Codex. These are not search engines or documentation viewers. They are interactive agents that write code, fix bugs, run commands, and increasingly take multi-step actions in the developer’s environment.

This is the distribution channel that the IDP has been missing.

Developers do not need to learn a new portal. They are already in the agent. The question is: can the platform team plug its capabilities into the place where developers already are?

The answer is yes, and the mechanism already exists.

Skills, Context Files, and Platform-Aware Agents

Modern AI coding tools expose extension points that platform teams can use directly. In Claude Code these are called Skills — markdown files that describe workflows, commands, and conventions. Other tools have equivalent mechanisms: custom instructions, system prompts, workspace context files.

A platform team can use these primitives to make the AI agent platform-aware:

Context files (.md files checked into the repository or loaded at agent startup) describe the company’s infrastructure conventions, approved patterns, naming standards, tagging policies, and cost constraints. When a developer asks the agent to “add a new service,” the agent already knows that services go in the platform/services/ directory, need specific labels, must use the approved base image, and must declare resource limits.

Skills encode the workflows that used to live in runbooks. “Create a new EKS namespace” becomes a skill that the agent executes step by step — running the right Terraform module, updating the ArgoCD app-of-apps, creating the service account, setting up IRSA — without the developer needing to know any of those steps exist.

Platform-aware agents take this further. An agent that has read your infrastructure state, knows your deployment topology, understands your cost budget for the quarter, and is aware of the current on-call rotation can make sensible decisions that a generic LLM cannot. It can say “this region is at 80% of its budget for the quarter, do you want to deploy to eu-west-1 instead?” without the developer having thought to ask.
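The budget check in that example can be sketched as a small function the agent calls before proposing a region. Everything here is an assumption made for illustration: the name `suggest_region`, the shape of the spend and budget dictionaries, and the figures in the usage note. A real implementation would read these values from the organisation’s cost tooling.

```python
def suggest_region(requested, spend, budget, fallback="eu-west-1", threshold=0.8):
    """Return (region, message).

    Suggests the fallback region when the requested region has used at
    least `threshold` of its budget; otherwise returns the request as-is.
    """
    used = spend[requested] / budget[requested]
    if used >= threshold and requested != fallback:
        message = (
            f"{requested} is at {used:.0%} of budget, "
            f"do you want to deploy to {fallback} instead?"
        )
        return fallback, message
    return requested, None
```

With hypothetical figures of 8,000 spent against a budget of 10,000 in us-east-1, the function returns eu-west-1 together with the question for the developer, and the agent asks rather than silently switching region.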

The platform team’s job shifts from maintaining a portal to maintaining the context that makes the agent useful.

The IDP’s Customers Are Now Also Agents

This is the part that changes the architecture assumptions most significantly.

Traditional IDP design optimised for human consumers: discoverability, good UI, clear documentation. The assumption was that a person would read, decide, and click.

In an AI-native model, a large fraction of the requests reaching your platform will come from other agents. A developer’s coding agent asks the deployment agent to provision a staging environment. The deployment agent calls the cost-estimation agent before deciding on instance sizes. The security agent reviews the generated Terraform before apply.

This means the IDP needs to expose its capabilities in a way that agents can consume reliably. Human-readable portals and prose runbooks are insufficient. Agents need structured interfaces.

The Determinism Problem and the API Contract

Here is the honest challenge with plugging LLMs into infrastructure workflows: LLMs are stochastic. Even with excellent context files and well-written skills, a model might generate slightly different Terraform, use a different module version, or interpret an ambiguous instruction in a way that diverges from your standards.

For infrastructure, “slightly different” is not acceptable. A misconfigured security group or a missing lifecycle rule is not a style issue.

The solution is to treat the IDP as a contract, not a corpus of documentation. Instead of giving the agent a Terraform module and hoping it uses it correctly, you expose an API — a structured interface that accepts intent (“give me a PostgreSQL instance, t3.medium, in the dev environment”) and returns a concrete, validated artefact (a YAML config, a rendered Terraform plan, a Helm values file).

The agent’s job is to translate the developer’s natural-language request into a well-formed API call. The platform’s job is to turn that API call into infrastructure. The LLM never touches the Terraform directly.
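A minimal sketch of the receiving side of that contract might look like the following. It assumes the agent sends a JSON object with exactly three fields; the field names, the `DatabaseRequest` type, and the allowed values are hypothetical and taken only from the example intent above. The point is that anything the model gets wrong is rejected at the boundary rather than interpreted.

```python
import json
from dataclasses import dataclass

FIELDS = ("engine", "size", "environment")

# Values the platform accepts. Hypothetical: a real list would be longer.
ALLOWED = {
    "engine": {"postgresql"},
    "size": {"t3.medium"},
    "environment": {"dev"},
}


@dataclass(frozen=True)
class DatabaseRequest:
    engine: str
    size: str
    environment: str


def parse_request(payload):
    """Parse an agent's JSON call into a typed request, or raise ValueError.

    Invalid JSON also raises ValueError, because json.JSONDecodeError is a
    subclass of it.
    """
    data = json.loads(payload)
    if not isinstance(data, dict) or set(data) != set(FIELDS):
        raise ValueError(f"expected exactly the fields {list(FIELDS)}")
    for field in FIELDS:
        if not isinstance(data[field], str) or data[field] not in ALLOWED[field]:
            raise ValueError(f"unsupported {field}: {data[field]!r}")
    return DatabaseRequest(**data)
```

Unknown fields are treated as errors on purpose: an agent that invents a parameter should get a clear rejection it can correct, not a silently ignored option.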

This separation gives you:

  • Determinism: the same input always produces the same artefact, and so the same infrastructure change
  • Auditability: every API call is a log entry; you know exactly what was requested and when
  • Guardrails: the API can enforce policies at the boundary — no t3.2xlarge in dev, no public S3 buckets, no missing cost-centre tags
  • Reviewability: generated artefacts go into a pull request, human or automated review happens before apply, and the git history tells the story
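The guardrail and audit properties can be illustrated with a few lines of code. This is a sketch under assumptions: the request is a plain dictionary, and the keys `environment`, `instance_type`, `kind`, `public`, and `tags` are hypothetical names chosen to express the three example policies above. The hash over a canonical serialisation gives each request a stable identity, so two identical requests can be recognised as identical in the log.

```python
import hashlib
import json


def check_policies(request):
    """Return a list of policy violations (an empty list means allowed)."""
    violations = []
    if request.get("environment") == "dev" and request.get("instance_type") == "t3.2xlarge":
        violations.append("t3.2xlarge is not allowed in dev")
    if request.get("kind") == "s3_bucket" and request.get("public", False):
        violations.append("public S3 buckets are not allowed")
    if not (request.get("tags") or {}).get("cost-centre"):
        violations.append("missing cost-centre tag")
    return violations


def audit_record(request, requested_at):
    """Build a log entry: what was requested, when, and whether it passed.

    The timestamp is passed in by the caller so that this function stays
    pure and easy to test.
    """
    # Sorted keys and fixed separators make the serialisation canonical,
    # so key order in the incoming request does not change the hash.
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return {
        "requested_at": requested_at,
        "request": request,
        "request_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "violations": check_policies(request),
    }
```

Because the violations are returned as data rather than prose, the calling agent can relay them to the developer or retry with a corrected request.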

A Practical Path for Existing Infrastructure

Most organisations cannot start over. They have existing pipelines, existing Terraform state, existing GitOps repositories built over years. The instinct when reading about AI-native IDPs is “that sounds great for greenfield, but we can’t rewrite everything.”

You do not need to rewrite anything.

The API contract layer sits in front of your existing systems. Your Terraform modules, your Helm charts, your ArgoCD applications — they do not change. The API generates the input artefacts they already expect: a terraform.tfvars file, a Helm values.yaml, a Kubernetes manifest. Your existing pipelines pick those up and run as they always have, via git.
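As an illustration of how small that generation step can be, here is a sketch that renders a flat dictionary into the text of a terraform.tfvars file. The function name and the variable names in the usage note are hypothetical. It deliberately supports only strings, integers, and booleans, and it rejects strings containing template sequences instead of trying to escape them.

```python
import json


def render_tfvars(values):
    """Render a flat dict as terraform.tfvars text, one assignment per line.

    Keys are sorted so that the same input always yields the same bytes.
    """
    lines = []
    for key in sorted(values):
        value = values[key]
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            rendered = "true" if value else "false"
        elif isinstance(value, int):
            rendered = str(value)
        elif isinstance(value, str):
            if "${" in value or "%{" in value:
                raise ValueError(f"template sequence not allowed in {key!r}")
            rendered = json.dumps(value)  # double-quoted, with escapes
        else:
            raise TypeError(f"unsupported type for {key!r}: {type(value).__name__}")
        lines.append(f"{key} = {rendered}")
    return "\n".join(lines) + "\n"
```

Calling `render_tfvars({"instance_class": "db.t3.medium", "multi_az": False, "allocated_storage": 20})` yields three lines in alphabetical order, ready to be committed to the repository the existing pipeline already watches.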

The migration path is incremental:

  1. Start with context files — add a CLAUDE.md or equivalent to your repositories describing your conventions. Cost: a few hours. Immediate gain: the agent stops suggesting patterns that violate your standards.
  2. Wrap one common workflow as a skill — “create a new microservice scaffold” is a good first candidate. Cost: a day. Gain: a repeatable, reviewable workflow that developers can trigger from their agent.
  3. Build a thin API endpoint for one resource type — a Lambda that takes a JSON payload and returns a validated values.yaml for your standard application Helm chart. Cost: a sprint. Gain: provable determinism for that resource type.
  4. Expand the API surface incrementally, driven by what developers actually request.
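Step 3 can be sketched as a single Lambda handler. The assumptions are worth stating plainly: the event is taken to carry the JSON payload as a string under `body`, as with an API Gateway proxy integration; the payload fields `name`, `image`, and `replicas`, the replica limit, and the chart keys `fullnameOverride`, `image.repository`, and `replicaCount` are hypothetical stand-ins for whatever your standard chart expects. The YAML is emitted by a small helper using only the standard library, which is enough for nested dictionaries of scalars.

```python
import json

REQUIRED = {"name": str, "image": str, "replicas": int}
MAX_REPLICAS = 10  # hypothetical platform limit


def validate(payload):
    """Return a list of error strings; an empty list means valid."""
    if not isinstance(payload, dict):
        return ["body must be a JSON object"]
    errors = [
        f"{field} must be of type {expected.__name__}"
        for field, expected in REQUIRED.items()
        if type(payload.get(field)) is not expected  # exact type: rejects bool for int
    ]
    if not errors and not 1 <= payload["replicas"] <= MAX_REPLICAS:
        errors.append(f"replicas must be between 1 and {MAX_REPLICAS}")
    return errors


def to_yaml(mapping, indent=0):
    """Emit nested dicts of scalars as block-style YAML lines, keys sorted."""
    lines = []
    pad = " " * indent
    for key in sorted(mapping):
        item = mapping[key]
        if isinstance(item, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(item, indent + 2))
        else:
            # JSON scalars are also valid YAML scalars.
            lines.append(f"{pad}{key}: {json.dumps(item)}")
    return lines


def handler(event, context):
    """Lambda entry point: JSON payload in, validated values.yaml text out."""
    try:
        errors = validate(json.loads(event["body"]))
    except (KeyError, TypeError, json.JSONDecodeError):
        errors = ["body must be a JSON object"]
    if errors:
        return {"statusCode": 400, "body": json.dumps({"errors": errors})}
    payload = json.loads(event["body"])
    values = {
        "fullnameOverride": payload["name"],
        "image": {"repository": payload["image"]},
        "replicaCount": payload["replicas"],
    }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/yaml"},
        "body": "\n".join(to_yaml(values)) + "\n",
    }
```

The handler only returns the file content. Committing it to the GitOps repository and opening the pull request would be a separate step, which keeps the endpoint free of side effects and easy to test.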

At each step, the platform team is reducing the gap between what developers ask for and what the platform can safely deliver.

What Actually Changes

The platform team’s deliverable shifts from a portal to a contract. The runbook becomes a skill. The service catalogue becomes a set of API endpoints. The web UI becomes optional — useful for visibility and reporting, but no longer the primary interface.

Developers stop needing to know where the platform lives. They ask their agent. The agent knows.

And when the next wave of autonomous agents arrives — agents that spin up environments, run load tests, and tear down infrastructure on a schedule without a human in the loop — the platform is already ready for them. They call the same API the developer’s agent calls.

That is what an AI-native IDP looks like: not a smarter portal, but a platform that speaks the language of agents.