
Securing AI in Software Engineering: The Platform Layer

This article is a follow-up to our previous article, ‘Securing AI in Software Engineering: A Field Guide’, and focuses on building the supporting platforms for Agentic AI. As use cases scale from power-user orchestration to full multi-agent orchestration, the automation increasingly runs outside the developer's local environment, and it is this shift that the platforms need to support.

10 min | Jul 02, 2026 | By Karim El-Melhaoui

To read the initial article: Securing AI in Software Engineering: A Field Guide

Developers at an early stage of AI adoption normally run their agent under their own identity, on their own machine and in their own IDE. This post focuses primarily on remote, managed runtime environments with managed, non-human identities, which require advanced platform capabilities such as Identity, Runtime and Integrations.

The goal of this article is to explore how you can run Agentic AI in secure, purpose-built environments on hyperscalers or on Anthropic’s platform, what the limitations of agent identity, runtime and integrations are, and which options you can adopt today.

This post does not cover Model Context Protocol (MCP) and Agent-to-Agent (A2A) in any depth. MCP is how an agent reaches external tools and data in a standard way, so it is present throughout even where we don't name it. A2A, which is becoming the standard for agent-to-agent communication, sits a layer above the identity foundation we cover here. It assumes that foundation rather than replacing it, since two agents still each need an identity before they can delegate, so we leave the multi-agent case to a later post.

The Platform

The platform must be built on a set of foundational capabilities, which we summarize below along with their benefits and constraints. We start with why a platform is needed, then reflect on the mandatory capabilities.

Why a platform

The ambition of a platform is to provide a runtime environment that supports authentication, least-privilege authorization and guardrails for your Agentic AI. Our take is that much of this is covered by the basic guardrails we know from traditional infrastructure, such as:

  • Provision runtime with network egress filtering and restrictions based on the agent’s job

  • Consolidate runtime, identity and system logs to understand which actions an agent performed

  • Provision identities with least privilege to reduce blast radius

  • Harden the runtime (e.g. seccomp, gVisor or similar) or use other means of strong tenant isolation
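
As a rough illustration of the first guardrail, the sketch below models a default-deny egress check keyed on the agent's job. The job names, hostnames and the `is_egress_allowed` helper are hypothetical; in practice this decision would be enforced by a proxy or firewall in front of the runtime, not by code the agent can modify.

```python
"""Minimal sketch of per-job egress filtering (all names are hypothetical)."""

# Hypothetical mapping from an agent's job to the hosts it may reach.
EGRESS_ALLOWLIST = {
    "dependency-update": {"api.github.com", "pypi.org"},
    "docs-review": {"api.github.com"},
}


def is_egress_allowed(job: str, host: str) -> bool:
    """Return True only if `host` is explicitly allowed for `job`.

    Unknown jobs get an empty allowlist, so the default is deny.
    """
    allowed_hosts = EGRESS_ALLOWLIST.get(job, set())
    # Normalize case and a trailing dot so "API.GitHub.com." matches.
    return host.lower().rstrip(".") in allowed_hosts
```

Keying the allowlist on the job rather than on the agent keeps the policy reviewable: a new agent inherits the restrictions of the job it is given.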

Identity

NIST has shown interest in defining a standard for Software and AI Agent Identity and Authorization, and its draft is worth reading. The draft seems to align with our current views.

In our experience, Identity has proven to be the most challenging aspect of supporting Agentic AI. This is mostly due to the need to replicate a developer's permissions, and to systems that integrate poorly with modern authentication standards such as OIDC or SPIFFE for segmented application access. The cloud platforms have mostly solved this, because identities are an integrated part of their platform. AWS, Azure and Google Cloud all act as a relying party and accept a federated token from your own identity provider, so the agent never has to hold a long-lived secret. The gap becomes more visible with code repositories. GitHub, for example, issues OIDC tokens outward to the cloud but does not act as a relying party, so you cannot federate an agent into GitHub the way you can into a cloud environment. The closest workaround is a GitHub App, and that still means long-lived secrets for token exchange.

While GitHub.com supports application identities through GitHub Apps, you still have to hold a long-lived private key or client secret to mint tokens. The installation token itself is short-lived and scoped, but there is no way to obtain it through federation, so you end up storing a root secret somewhere. The way to solve agentic access to GitHub is an abstraction layer that vends scoped installation tokens from a GitHub App you control, bootstrapped together with the agent. A single App keeps registration manageable, and you handle per-agent attribution in your own layer by binding each token to the agent and the run it was issued for.
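
To make the vending layer more concrete, here is a minimal sketch of the request it would prepare against GitHub's `POST /app/installations/{installation_id}/access_tokens` endpoint, narrowed to one repository and one permission. The `TokenGrant` record, the function names and the example identifiers are hypothetical. The sketch assumes the App JWT has already been signed elsewhere with the App's private key, and it only builds the request rather than sending it.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenGrant:
    """Attribution record kept by the vending layer (hypothetical shape)."""
    agent_id: str
    run_id: str
    repository: str
    permissions: dict


def build_installation_token_request(app_jwt, installation_id, repository, permissions):
    """Build (but do not send) the request for a scoped installation token."""
    body = json.dumps({"repositories": [repository], "permissions": permissions})
    return urllib.request.Request(
        url=f"https://api.github.com/app/installations/{installation_id}/access_tokens",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            # The App JWT is assumed to be minted by the vending layer,
            # which is the only component that holds the private key.
            "Authorization": f"Bearer {app_jwt}",
            "Accept": "application/vnd.github+json",
        },
    )


def vend_token_request(app_jwt, installation_id, agent_id, run_id, repository):
    """Pair the scoped token request with a record of who it is for."""
    permissions = {"contents": "write"}  # write access to repository contents only
    grant = TokenGrant(agent_id, run_id, repository, permissions)
    request = build_installation_token_request(
        app_jwt, installation_id, repository, permissions
    )
    return grant, request
```

Storing the `TokenGrant` next to the issued token is what allows a later audit to tie a commit back to a specific agent and run, even though GitHub itself only sees the App.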

You want the agent to hold exactly the permissions it needs, through a short-lived token scoped to the operation it is meant to perform, such as writing to a single repository. GitHub App permissions scope to a repository and a permission type, not to a branch, so confining an agent to one branch is something you need to layer on with rulesets and branch protection.

I chose to highlight GitHub in this article based on my own experience. Any platform you want to integrate with that does not act as a relying party for federation, or that lacks an intuitive API for managing agent identities, is likely to pose similar limitations.

Runtime

Runtime is where your agent executes. Ideally that is not your laptop, for several reasons, one being that you want it to keep running while you are not working. When you think about runtime for AI agents it helps to separate two cases. The first is running agents that do your own work. The second is building and running agents you serve to others as functionality, such as a chatbot. The requirements overlap but the platforms differ. This post focuses on the first, though the principles carry over to the second.

For running agents that do your own work, you want an environment that runs around the clock, with strong isolation from other processes and filesystems, and that does not depend on your laptop being on. This is where Cloud Development Environments such as Coder or Gitpod are a good fit, as are short-lived VMs on your cloud of choice.

For building and serving agents, you are more likely to adopt an existing application platform or an Agent Platform. Microsoft Foundry with its Agent Service, Amazon Bedrock Agents, Gemini Enterprise Agent Platform or agents running on Cloud Run or microVMs will mostly fit, and in many cases you can use the application platform you already run, such as Kubernetes.

Whichever option you prefer, none of it works if agent identity is not solved first, and some of these services give you limited visibility into what the agent actually did. The value of a purpose-built agent platform depends on your use case, so be clear about which of the two problems you are solving before you pick one.

Rami McCarthy has written about the correlation between Cloud Development Environments and the kind of AI adoption that lets organizations realize its value: https://ramimac.me/devboxen.

Platforms that offer Identity and Runtime

Below are examples and descriptions of the platforms we’ve recently worked with, covering their Identity and Runtime capabilities. We have not experimented with Coder or Gitpod for running Agentic AI, as our current focus is on hyperscalers, cloud providers and AI platforms (currently Claude).

Anthropic/Claude: 

Anthropic offers a managed runtime called Claude Managed Agents. It runs in one of two modes. In the default mode, tool execution happens inside Anthropic's own sandboxes. In self-hosted mode the orchestration stays on Anthropic's side, but tool execution moves into infrastructure you control, so the agent's code, filesystem and network egress never leave your environment. That gives you granular control over egress, lets you bring your own machine identities, and lets you monitor the runtime, while developers can still call custom tools.

The platform has not fully solved identity yet, but it supports Vaults for secrets, and the interesting part is how they are handled. A credential proxy outside the sandbox injects the secret at request time, so the secret never enters the sandbox and the agent cannot read it. This structurally prevents theft of the credential through prompt injection, even if an attacker fully controls the model's reasoning. For Git, Anthropic clones the repository with its access token during sandbox setup and wires it into the local remote, so push and pull work without the agent ever handling the token. This works around some of the limitations we described in the Identity section.

Claude has also introduced its ‘Identity Access Model’ with the latest Claude Tag offering, which integrates with Slack. It addresses identity through OAuth authorization to specific Slack channels, and can also run in Slack DMs in the context of a user. The foundation is built on ‘boundaries’, where a private Slack channel has an identity bound to it and where, in Slack Enterprise, an admin can determine which actions the agent may be used for. We may cover this in more detail later, depending on how it evolves and on our future research.

[Figure: agent identity access model]

Source: https://claude.com/blog/agent-identity-access-model 

Reference: https://platform.claude.com/docs/en/managed-agents/overview
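
The credential proxy pattern described in this section can be sketched generically as follows. This is not Anthropic's implementation; the `OutboundRequest` type, the `VAULT` contents and the `inject_credentials` function are hypothetical and only illustrate the idea that the secret is attached outside the sandbox, at request time.

```python
from dataclasses import dataclass, field


@dataclass
class OutboundRequest:
    """A request as it leaves the sandbox (hypothetical shape)."""
    host: str
    path: str
    headers: dict = field(default_factory=dict)


# Hypothetical vault contents, held only by the proxy process.
VAULT = {"api.example.com": "s3cr3t-token"}


def inject_credentials(request: OutboundRequest, vault: dict) -> OutboundRequest:
    """Return the request to forward upstream, with credentials attached by the proxy."""
    # Drop any Authorization header that was set inside the sandbox,
    # so the agent cannot choose which credential is presented.
    headers = {
        name: value
        for name, value in request.headers.items()
        if name.lower() != "authorization"
    }
    secret = vault.get(request.host)
    if secret is not None:
        headers["Authorization"] = f"Bearer {secret}"
    # A new object is returned; the sandbox-side request is never mutated,
    # so the secret is not reflected back into the sandbox.
    return OutboundRequest(request.host, request.path, headers)
```

Note that the agent can still use the credential through the proxy for any host the vault covers, which is why this pattern is normally combined with egress restrictions.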

Microsoft Azure:

Any of Azure's runtime options can host remote agents, but the closest fit is the recently launched Azure Container Apps Sandboxes, which is built for this purpose. It offers lifecycle policies, network egress control, managed volumes, native integration with Agent IDs (covered below) and secret management. As it was only introduced in early June, we haven’t had time to test it fully yet.

Azure’s Entra Agent IDs are built on Entra ID and Service Principals. Authentication can be performed through federated identity credentials (OIDC) issued by the blueprint, and those can trust GitHub Actions, Kubernetes or any external OIDC issuer. In this scenario Azure is the Relying Party, and permissions can be granted to Microsoft Graph or Azure Resource Manager.

Reference: https://learn.microsoft.com/en-us/entra/agent-id/agent-identities
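
The relying-party check behind federated identity credentials can be sketched as a comparison of token claims against a configured trust. This is a simplified illustration, not Entra's actual logic: the `FederatedCredential` type and `claims_match` function are hypothetical, and the sketch assumes the token signature has already been verified against the issuer's published keys. The organization and repository in the usage example below are made up.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class FederatedCredential:
    """The trust configured on the relying party (hypothetical shape)."""
    issuer: str
    subject: str
    audience: str


def claims_match(credential, claims, now=None):
    """Check already signature-verified token claims against the configured trust."""
    now = time.time() if now is None else now
    audiences = claims.get("aud", [])
    if isinstance(audiences, str):  # "aud" may be a single string or a list
        audiences = [audiences]
    return (
        claims.get("iss") == credential.issuer
        and claims.get("sub") == credential.subject
        and credential.audience in audiences
        and claims.get("exp", 0) > now  # reject expired tokens
    )
```

For a GitHub Actions workflow, the trust might look like `FederatedCredential("https://token.actions.githubusercontent.com", "repo:example-org/example-repo:ref:refs/heads/main", "api://AzureADTokenExchange")`. The exact-match subject is what stops a token issued to a different repository or branch from being exchanged.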

AWS:

AgentCore Runtime is a serverless execution environment with integrated identity management. Its most relevant feature for SWE use cases is support for persistent interactive shell sessions, where environment variables, working directories and running processes persist for the lifetime of the session. A hard limitation is that a session stops after a maximum of 8 hours, after which you need to resume it manually.

For identity, AgentCore builds on native AWS capabilities to handle both inbound and outbound authentication and authorization, with centrally managed machine identities that bridge external platforms and AWS resources.

Reference: https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-oauth.html 

Google Cloud: 

Google offers two primary paths for hosting agents. Agent Runtime provides a fully managed experience, abstracting away deployment, scaling, and memory persistence while integrating with GCP's native observability stack. However, because its code execution sandbox is session-scoped, it is mostly designed for analytical workflows rather than persistent remote development.

For more control, the GKE Agent Sandbox utilizes gVisor-isolated pods with persistent storage, network, and identities, though it requires you to manage the underlying Kubernetes orchestration. One drawback with GKE Agent Sandbox is that it doesn’t support Google’s Cloud Agent Identity covered below, but rather relies on Workload Identity Federation for now. For developer-owned persistent environments, Google Cloud Workstations is a simpler option.

Google Cloud's Agent Identity is built on SPIFFE. Every agent gets a SPIFFE ID and a short-lived X.509 certificate. The agent can act either on its own authority, where its SPIFFE identity is exchanged for a Google Cloud access token bound to that certificate, or on a user's behalf.

Also worth highlighting from Google’s documentation: Unlike service accounts, agent identities are not shared by multiple workloads by default, can't be impersonated, and don't allow developers to generate long-lived service account keys. Access tokens generated for Google Cloud are cryptographically bound to the agent's unique X.509 certificates to prevent token theft.

Reference: https://docs.cloud.google.com/iam/docs/agent-identity-overview 
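
For readers unfamiliar with SPIFFE, an ID is a URI of the form `spiffe://<trust-domain>/<path>`. The sketch below shows a simplified structural check; it does not cover every rule in the SPIFFE specification, and the `parse_spiffe_id` helper and the example ID are hypothetical rather than the format Google assigns to agents.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id: str) -> tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, path) after basic structural checks."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be 'spiffe'")
    # The authority must be a bare trust domain: no user info and no port.
    if not parts.netloc or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError("trust domain is missing or contains user info or a port")
    if parts.query or parts.fragment:
        raise ValueError("query and fragment components are not allowed")
    return parts.netloc, parts.path


# Hypothetical example ID, for illustration only.
trust_domain, workload_path = parse_spiffe_id("spiffe://example.org/agents/code-reviewer")
```

The trust domain identifies who issued the identity and the path identifies the workload, which is what lets a relying party authorize one specific agent rather than a shared service account.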

Summary:

Across these platforms a couple of patterns repeat. Isolation is handled using sandboxing technology such as gVisor, as Claude and GKE Agent Sandbox do, or with microVMs, as AgentCore and Azure Container Apps Sandboxes do. On identity, all of them try to keep secrets out of the sandbox, injecting them at the network boundary at request time instead.

The platform model we opened with (egress filtering, least-privilege identity and secure credential handling) seems to be the common pattern among the providers.

Integrations

Ideally, the platform should leverage your existing investment in security tooling, such as your CSPM/CNAPP, logging and monitoring, identity platform and developer portals, to give visibility and insight into all assets involved in producing AI. Supplementing development with Agentic AI is also more satisfying if the agent can adhere to existing processes and understand important context, and you will feel safer if the same guardrails apply.

We are also seeing Cloud Native Application Protection Platforms (CNAPPs) add ‘AI Security Posture Management’ (AI-SPM) capabilities. These leverage the insights the CNAPP already has into your cloud environment to determine agent identity access, correlate runtime events, identify misconfigured model APIs and govern platforms such as Claude Enterprise by integrating with its Compliance API. We will cover this in more detail in a follow-up post, including the governance capabilities of OpenAI and how AI-SPM integrates.

Conclusion

We believe non-human identity is the most significant challenge for running Agentic AI on any platform. It is complex, fast-moving and remains an unsolved problem. The biggest challenges are related to authentication, authorization and striking the balance between cost and efficiency when it comes to adoption of platforms and capabilities.

Existing technologies are likely to serve as a foundation. Authentication is possible solely through token exchange and federation. The cloud providers are largely solving this through OIDC relying parties and SPIFFE, and it is the direction more ecosystems should follow. Then comes the challenge of authorization: which actions the agent is allowed to perform. These should ideally be defined per system, and systems must provide APIs for provisioning exactly the access required and for managing agent identities. The uncertain lifetime of an agent adds the complexity of lifecycle management.

Solving both will let us grant an autonomous agent access to exactly what it requires. Neither, however, lets us fully govern Agentic AI, which we will cover in a blog post detailing practical use cases for AI-SPM.