aws-lambda-microvms

AWS Lambda MicroVMs

The AWS MCP server is recommended for sandboxed execution and audit logging.
AWS Lambda MicroVMs are serverless compute environments that combine Firecracker VM isolation with container-like efficiency. Each MicroVM:
  • Runs your application as a container inside a Firecracker microVM — you can reproduce the environment locally.
  • Runs Amazon Linux 2023 as the base OS inside the MicroVM.
  • Boots from a memory + disk snapshot captured at image build time, so application init is skipped on run.
  • Has a dedicated, TLS-terminated HTTPS endpoint reachable with an auth token.
  • Can be suspended and resumed with state preserved; lives up to 8 hours.
Two-resource model:
  • MicrovmImage: a versioned artifact built from {S3 zip with Dockerfile} + baseImageArn. Each version has per-architecture/chipset Builds.
  • Microvm: a running instance created (RunMicrovm) from an image version.
Two roles:
  • buildRoleArn: used during image build (S3 read, CloudWatch logs, optional ECR).
  • executionRoleArn: assumed at runtime by the running MicroVM.

When to use

Choose Lambda MicroVMs when

  • Analytics workloads — isolated compute for data processing, ETL jobs, or query execution with strong tenant separation.
  • AI / agent code execution sandboxes — fresh, isolated environment per session, fast resume between turns.
  • Interactive code playgrounds & notebooks — Jupyter, REPLs, dev environments executing user code.
  • Reinforcement-learning environments — clean per-episode envs with tool access.
  • Multi-tenant CI executors / build runners — strong tenant isolation.
  • Game / simulation servers — sessionful, long-lived (up to 8 hr) workloads.
  • Security scanning — running untrusted analyzers in isolation.
In general, Lambda MicroVMs are suited for long-lived sessions, real port-listening servers (gRPC, WebSocket, custom TCP protocols), state preserved across periods of inactivity (suspend/resume), container-level access (FUSE, eBPF, custom syscalls), or session-affine routing to a specific compute environment.

Choose AWS Lambda (functions) when

  • The workload fits in 15 minutes.
  • Per-invocation isolation is fine; no need for session state held in memory.
  • Fully automatic scaling is preferred (no RunMicrovm to manage).
  • Event-source integrations (S3, SQS, EventBridge, etc.) drive the function.

Choose something else when

  • Continuous compute beyond 8 hr → ECS / EKS / EC2.
  • Lift-and-shift workloads needing kernel modifications or a non-Linux OS → EC2.

Typical workflow

  1. Check regional availability: confirm Lambda MicroVMs is available in your target region (for example, with a ListMicrovmImages call). Your S3 artifact bucket and any network connectors must be in the same region as the image.
  2. Package an app: zip it with a Dockerfile at the root and upload the zip to S3 (same region as the image).
  3. Implement lifecycle hooks (optional but recommended): HTTP endpoints on a port you specify (commonly 9000) for /run, /resume, /suspend, /terminate, /ready, /validate.
  4. CreateMicrovmImage: point it at the S3 artifact, a managed base image, and a build role. Lambda builds the Dockerfile into an OCI image, starts your app, calls /ready, snapshots disk + memory, and optionally validates with /validate. Lambda periodically releases new managed image versions; rebuild against the latest version to keep your images up to date.
  5. RunMicrovm: pick an image version, attach executionRoleArn, set idlePolicy and ingress/egress connectors, and (optionally) pass a runHookPayload. The call returns an endpoint URL and a microvmId.
  6. CreateMicrovmAuthToken: get an auth token (max 60 min) with allowedPorts specifying which ports the token grants access to. Send traffic to the endpoint with the header X-aws-proxy-auth: <token>.
  7. Suspend / Resume / Terminate: call the explicit APIs, or let the idlePolicy drive it (maxIdleDurationSeconds, suspendedDurationSeconds, autoResumeEnabled).
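The idlePolicy passed to RunMicrovm is a small JSON document, and hand-quoting it in a shell is easy to get wrong. The following is a minimal Python sketch that builds it. Only the three field names come from the workflow above; the function name `idle_policy_json` and the non-negative-integer check are assumptions made for illustration, not documented validation rules.

```python
import json


def idle_policy_json(max_idle_seconds: int,
                     suspended_seconds: int,
                     auto_resume: bool = True) -> str:
    """Return a compact JSON string for the idle policy.

    The range check is an assumption for this sketch; the service
    may enforce different limits.
    """
    for name, value in (("max_idle_seconds", max_idle_seconds),
                        ("suspended_seconds", suspended_seconds)):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    policy = {
        "maxIdleDurationSeconds": max_idle_seconds,
        "suspendedDurationSeconds": suspended_seconds,
        "autoResumeEnabled": bool(auto_resume),
    }
    # Compact separators keep the string short enough for a CLI argument.
    return json.dumps(policy, separators=(",", ":"))


if __name__ == "__main__":
    print(idle_policy_json(900, 300))
```

The resulting string can be used as the value of --idle-policy. Passing it through an argument list (for example with subprocess.run) rather than a shell string avoids quoting problems.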

Core CLI commands

bash
# Create an image (zip with Dockerfile at root in S3, plus a managed base image)
aws lambda-microvms create-microvm-image \
  --name my-image \
  --base-image-arn arn:aws:lambda:<region>:aws:microvm-image:al2023-1 \
  --build-role-arn arn:aws:iam::<acct>:role/MicroVMBuildRole \
  --code-artifact '{"uri":"s3://<bucket>/<key>.zip"}'

# Run a MicroVM (returns endpoint + microvmId). --image-identifier takes the
# image ARN (the bare name is rejected); --image-version is the full major.minor string.
aws lambda-microvms run-microvm \
  --image-identifier arn:aws:lambda:<region>:<acct>:microvm-image:my-image \
  --image-version 1.0 \
  --execution-role-arn arn:aws:iam::<acct>:role/MicroVMExecutionRole \
  --idle-policy '{"maxIdleDurationSeconds":900,"suspendedDurationSeconds":300,"autoResumeEnabled":true}'

# Mint an auth token and call the endpoint
TOKEN=$(aws lambda-microvms create-microvm-auth-token \
  --microvm-identifier microvm-... --expiration-in-minutes 30 \
  --allowed-ports '[{"port":8080}]' \
  --query 'authToken."X-aws-proxy-auth"' --output text)
curl "<endpoint>/" -H "X-aws-proxy-auth: $TOKEN"

# Lifecycle
aws lambda-microvms suspend-microvm --microvm-identifier microvm-...
aws lambda-microvms resume-microvm --microvm-identifier microvm-...
aws lambda-microvms terminate-microvm --microvm-identifier microvm-...

See references/getting-started.md for the full walkthrough including --hooks config and lifecycle hooks.

Hook configuration

Hooks are organized into two groups under the --hooks parameter:

microvmImageHooks (build-time)

Recommendation: Implement the image build hooks (/ready and /validate) for best performance. They enable the platform to capture a complete snapshot and prefetch the portions accessed at run time.

| Hook | Purpose | Timeout range |
| --- | --- | --- |
| ready | Called during application boot. When this hook returns a 200 status code, it signals to the platform that the application is ready to be snapshotted. Use this to ensure your application is fully booted before a snapshot is taken. If your application is not yet ready, return a 503 status code until it is ready for snapshotting. | 1–3600s (default 30s) |
| validate | Called after running your application from the microVM snapshot. Use this hook to validate that the application is ready to serve traffic. This hook also lets the platform sample the portions of the snapshot that are used when your application is run, so Lambda can prefetch those portions to reduce latency. For the best performance, run mock payloads through the application during validate. When this hook returns a 200, it signals to Lambda that the MicroVM image is valid. If your application needs more time to run its validate workflow, return a 503 status code. | 1–3600s (default 30s) |

Why implement /ready? It signals the platform that your application has fully booted. Without it, the snapshot may be taken mid-initialization, meaning the cached state is incomplete and every run repeats part of the boot sequence.
Why implement /validate? It lets the platform verify the snapshot is correct, and it also samples which portions of the snapshot are accessed during RunMicrovm. This allows the platform to prefetch those portions on future launches, reducing cold-start times.
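The 200/503 contract for the build hooks can be sketched with Python's standard library. This is an illustrative sketch, not the platform's reference implementation: the names `app_ready`, `run_mock_payloads`, `build_hook_status`, and `serve_hooks` are hypothetical, port 9000 is only the commonly used value mentioned earlier, and because the HTTP method the platform uses is not stated here, the handler accepts both GET and POST.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set by the application once initialization (config load, cache warm-up, ...) is done.
app_ready = threading.Event()


def run_mock_payloads() -> bool:
    """Hypothetical self-test; replace with calls that exercise your hot paths."""
    return sorted([3, 1, 2]) == [1, 2, 3]


def build_hook_status(path: str) -> int:
    """Map a build-hook path to the status code the hook should return."""
    if path == "/ready":
        return 200 if app_ready.is_set() else 503
    if path == "/validate":
        return 200 if app_ready.is_set() and run_mock_payloads() else 503
    return 404


class BuildHookHandler(BaseHTTPRequestHandler):
    def _handle(self) -> None:
        # Drain any request body so the connection closes cleanly.
        self.rfile.read(int(self.headers.get("Content-Length") or 0))
        self.send_response(build_hook_status(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()

    # Assumption: the caller's HTTP method is unknown, so accept both.
    do_GET = _handle
    do_POST = _handle

    def log_message(self, format, *args):
        pass  # keep hook polling out of the application log


def serve_hooks(port: int = 9000) -> ThreadingHTTPServer:
    """Start the hook listener on a background thread and return the server."""
    server = ThreadingHTTPServer(("0.0.0.0", port), BuildHookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

In this sketch the application would call `serve_hooks()` early in startup and `app_ready.set()` once it is fully initialized, so /ready keeps answering 503 until a snapshot would capture a complete state.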

microvmHooks (runtime)

| Hook | Purpose | Timeout range |
| --- | --- | --- |
| run | Fires once after run from snapshot | 1–60s (default 1s) |
| resume | Fires after SUSPENDED → RUNNING | 1–60s (default 1s) |
| suspend | Fires before RUNNING → SUSPENDED | 1–60s (default 1s) |
| terminate | Fires before termination | 1–60s (default 1s) |

See references/getting-started.md for a full example enabling all hooks.

Per-MicroVM size limits

| Resource | Limit |
| --- | --- |
| Maximum vCPUs per MicroVM | 16 |
| Maximum memory per MicroVM | 32 GB |

For all other quotas — concurrent MicroVMs per account, launch rate, image count, max execution duration, auth token TTL, Lambda Network Connector (LNC) limits, per-ENI bandwidth, etc. — check the AWS docs / Service Quotas console. Most are soft quotas, raisable through Service Quotas / Support.

Additional capabilities

By default, the container runs with a restricted set of Linux capabilities. Set --additional-os-capabilities '["ALL"]' at image creation time only when required by your use case:
  • Filesystem mounts — EFS, FUSE-based filesystems.
  • Nested containers — running additional containers with containerd inside the MicroVM.
  • eBPF programs — tracing, profiling, or custom network policies.
bash
aws lambda-microvms create-microvm-image \
  --name my-image \
  --base-image-arn arn:aws:lambda:<region>:aws:microvm-image:al2023-1 \
  --build-role-arn arn:aws:iam::<acct>:role/MicroVMBuildRole \
  --code-artifact '{"uri":"s3://<bucket>/<key>.zip"}' \
  --additional-os-capabilities '["ALL"]'

Shell ingress for agent use cases

For programmatic shell access (agent workflows, remote command execution), use the SHELL_INGRESS network connector:
bash
aws lambda-microvms run-microvm \
  --image-identifier arn:aws:lambda:<region>:<acct>:microvm-image:my-image \
  --image-version 1.0 \
  --execution-role-arn arn:aws:iam::<acct>:role/MicroVMExecutionRole \
  --ingress-network-connectors '["arn:aws:lambda:<region>:aws:network-connector:aws-network-connector:SHELL_INGRESS"]' \
  --idle-policy '{"maxIdleDurationSeconds":900,"suspendedDurationSeconds":300,"autoResumeEnabled":true}'
This provides a WebSocket-based shell channel accessible from any client (terminal or browser), suitable for agent-driven workflows that need to execute commands inside the MicroVM.

Known constraints

  • Image is single-size: you can't ship multiple instance sizes from one image. Plan one image per size.
  • Image versions incur storage cost even when no MicroVMs are running on them. Use delete-microvm-image-version to clean up.
  • Suspend → resume can't switch network connectors. LNC is bound at run time.
  • No self-suspend from inside the MicroVM. Call SuspendMicrovm from outside (via the public API).
  • Auth token max TTL is 60 min. Refresh ahead of expiry for long-running clients.
  • Runtime hooks (/run, /resume, /suspend, /terminate) are fast-notification only (1–60s timeout). Don't use them for slow init.
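For the token constraint above, a long-running client can cache the token and mint a new one shortly before it expires. The following is a minimal sketch under stated assumptions: `TokenCache` and its parameters are hypothetical, the `mint` callable stands in for whatever wraps CreateMicrovmAuthToken in your client, and the five-minute refresh margin is an arbitrary illustrative choice.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Cache an auth token and re-mint it a fixed margin before expiry."""

    def __init__(self,
                 mint: Callable[[int], str],
                 ttl_minutes: int = 30,
                 refresh_margin_seconds: int = 300,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if not 0 < ttl_minutes <= 60:  # 60 min is the documented maximum TTL
            raise ValueError("ttl_minutes must be in 1..60")
        if not 0 <= refresh_margin_seconds < ttl_minutes * 60:
            raise ValueError("refresh margin must be shorter than the TTL")
        self._mint = mint
        self._ttl_minutes = ttl_minutes
        self._margin = refresh_margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._refresh_at = 0.0

    def get(self) -> str:
        """Return a token that is still outside the refresh margin."""
        now = self._clock()
        if self._token is None or now >= self._refresh_at:
            self._token = self._mint(self._ttl_minutes)
            self._refresh_at = now + self._ttl_minutes * 60 - self._margin
        return self._token
```

Injecting the clock keeps the refresh logic testable without sleeping; in production the default monotonic clock avoids surprises from wall-clock adjustments.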

Reference index

Pick the reference that matches your task:
  • references/getting-started.md: prerequisites (S3 bucket, build role trust policy), packaging, end-to-end CLI walkthrough, first run + token + curl.
  • references/lifecycle-model.md: image vs. MicroVM state machines, the six lifecycle hooks (paths, timeouts, what to do in each), idle/suspend/resume semantics, hook payloads.
  • references/snapshots-and-uniqueness.md: what gets snapshotted, the uniqueness pitfall, CSPRNGs by language, env vars vs. run configuration, snapshot size inspection.
  • references/networking.md: ingress vs. egress connectors, port routing, X-aws-proxy-* headers, WebSocket subprotocols, HTTP/2 / gRPC, VPC egress.
  • references/iam-and-security.md: build role vs. execution role, trust policies, auth tokens (regular vs. shell), lambda:PassNetworkConnector.
  • references/troubleshooting.md: image build error codes, run/connect failures, hook timeouts, network connector issues, debugging via shell access.

Conventions used in references

  • The runtime-side default proxy port is 8080. Override per-request with X-aws-proxy-port or per-WebSocket with subprotocol lambda-microvms.port.<n>.
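A client can keep the port-routing convention in one place with two small helpers. This is a sketch, not a documented client API: the function names are hypothetical, and it assumes the X-aws-proxy-port header value is the bare port number, which this document does not spell out.

```python
from typing import Dict, Optional


def _check_port(port: int) -> None:
    if isinstance(port, bool) or not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError("port must be an integer in 1..65535")


def proxy_headers(token: str, port: Optional[int] = None) -> Dict[str, str]:
    """Headers for an HTTP request to a MicroVM endpoint.

    With port=None the request goes to the default proxy port (8080).
    """
    headers = {"X-aws-proxy-auth": token}
    if port is not None:
        _check_port(port)
        # Assumption: the header carries the port number as a decimal string.
        headers["X-aws-proxy-port"] = str(port)
    return headers


def websocket_port_subprotocol(port: int) -> str:
    """Subprotocol string that selects a port for a WebSocket connection."""
    _check_port(port)
    return f"lambda-microvms.port.{port}"


if __name__ == "__main__":
    print(proxy_headers("placeholder-token", 9000))
    print(websocket_port_subprotocol(9000))
```

Whichever port is selected still has to be covered by the allowedPorts of the token being used.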

Security considerations

  • Confused deputy prevention: add aws:SourceAccount (or aws:SourceArn) condition keys to trust policies. See references/iam-and-security.md.
  • Snapshot uniqueness: snapshots share memory state. Reseed CSPRNGs and rotate secrets on resume. See references/snapshots-and-uniqueness.md.
  • Network isolation: use VPC egress connectors to restrict outbound traffic.
  • Least-privilege execution roles: scope IAM policies to specific regions, accounts, and resource prefixes.
  • Logging: enable CloudTrail for MicroVM lifecycle events.
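The snapshot-uniqueness point can be illustrated in Python: a user-space generator whose state was captured in memory produces the same sequence in every copy restored from that state, until it is reseeded. The sketch below simulates that with `random.getstate()` and `random.setstate()` and shows a hypothetical `on_resume` function that a /run or /resume hook might call. The function and class names are assumptions for illustration; which generators need this treatment in a given language is covered by the referenced document, not by this sketch.

```python
import os
import random
import secrets


class SessionState:
    """Per-instance values that must not be shared between restored copies."""

    def __init__(self) -> None:
        self.session_id = ""
        self.rotate()

    def rotate(self) -> None:
        # secrets draws from the operating system's random source on each call.
        self.session_id = secrets.token_hex(16)


state = SessionState()


def on_resume() -> None:
    """Reseed user-space PRNG state and rotate identifiers after a restore."""
    random.seed(os.urandom(32))  # replace the state captured in the snapshot
    state.rotate()


if __name__ == "__main__":
    snapshot = random.getstate()      # stands in for the memory snapshot
    first = random.random()
    random.setstate(snapshot)         # a second copy restored from the snapshot
    assert random.random() == first   # identical output without reseeding
    random.setstate(snapshot)
    on_resume()
    print(random.random() != first)
```

The same idea applies to anything derived from captured state, such as cached nonces or pre-generated identifiers, which is why the bullet above also recommends rotating secrets.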