troubleshooting-s3-files

Troubleshooting S3 Files

Overview

Diagnoses and resolves Amazon S3 Files issues: mount failures, IAM permissions, synchronization, conflict resolution, and performance.
For authoritative guidance, see S3 Files Troubleshooting.

Common Tasks

0. Verify Dependencies

  • You MUST verify the aws CLI is available with s3files subcommand support
  • You MUST confirm valid AWS credentials
  • You MUST ONLY check for tool existence and version — MUST NOT execute destructive or mutating commands during verification
  • You MUST inform the user if any required tools are missing
  • You MUST respect the user's decision to abort if tools are unavailable
  • You SHOULD explain steps before executing and wait for user confirmation on write commands
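The checks above can be sketched as a read-only script. This is a minimal illustration, not an official procedure: the function names are hypothetical, and it assumes that `aws s3files help` exits non-zero when the installed CLI does not know the subcommand. `aws sts get-caller-identity` is a standard read-only call for confirming credentials.

```bash
#!/usr/bin/env bash
# Read-only dependency check: nothing here creates, modifies, or deletes resources.

# Extract the major version from `aws --version` output such as
# "aws-cli/2.15.30 Python/3.11.8 Linux/6.1 exe/x86_64".
aws_major_version() {
  local version_line="$1"
  version_line="${version_line#aws-cli/}"   # drop the "aws-cli/" prefix
  printf '%s\n' "${version_line%%.*}"        # keep everything before the first dot
}

# Hypothetical wrapper; call it manually, it is not run when this file is sourced.
check_dependencies() {
  if ! command -v aws >/dev/null 2>&1; then
    echo "MISSING: aws CLI not found on PATH" >&2
    return 1
  fi

  local major
  major="$(aws_major_version "$(aws --version 2>&1)")"
  case "$major" in
    ''|*[!0-9]*) echo "Could not parse aws --version output" >&2; return 1 ;;
  esac
  if [ "$major" -lt 2 ]; then
    echo "AWS CLI v$major detected; v2 is required" >&2
    return 1
  fi

  # Assumption: an unknown subcommand makes `aws ... help` fail.
  if ! aws s3files help >/dev/null 2>&1; then
    echo "MISSING: this aws CLI has no s3files subcommand" >&2
    return 1
  fi

  if ! aws sts get-caller-identity >/dev/null 2>&1; then
    echo "No valid AWS credentials found" >&2
    return 1
  fi
  echo "Dependencies OK"
}
```

Run `check_dependencies` and report any failure to the user before going further; the script never attempts to install or change anything.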

1. Classify the Issue

| Symptom | Category |
| --- | --- |
| mount.s3files: command not found | A: Client Installation |
| Connection timed out during mount | B: Network/Security Group |
| Mount hangs indefinitely (no timeout) | B: Network/Security Group |
| Access denied during mount | C: IAM Permissions |
| File system stuck in "creating" | C: IAM Permissions |
| Permission denied on file operations | C: IAM Permissions |
| Files not appearing in S3 after write | D: Synchronization |
| Files in .s3files-lost+found directory | E: Conflict Resolution |
| Slow reads or high latency | F: Performance |
| NFS server error | G: Encryption/KMS |
| DNS name resolution fails | H: VPC DNS |
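As an illustration, the symptom-to-category mapping can be expressed as a small lookup. The function name `classify_symptom` and the substring patterns are assumptions made for this sketch; real error messages may be worded differently, so treat the result as a starting point only.

```bash
#!/usr/bin/env bash
# Map a free-text symptom to one of the categories listed in the table.
classify_symptom() {
  local msg
  # Lower-case the message so matching is case-insensitive.
  msg="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$msg" in
    *"command not found"*)                     echo "A: Client Installation" ;;
    *"lost+found"*)                            echo "E: Conflict Resolution" ;;
    *"timed out"*|*"hang"*)                    echo "B: Network/Security Group" ;;
    *"access denied"*|*"permission denied"*|*"creating"*)
                                               echo "C: IAM Permissions" ;;
    *"not appearing"*)                         echo "D: Synchronization" ;;
    *"slow"*|*"latency"*)                      echo "F: Performance" ;;
    *"nfs server error"*)                      echo "G: Encryption/KMS" ;;
    *"name resolution"*|*"dns"*)               echo "H: VPC DNS" ;;
    *)                                         echo "Unclassified" ;;
  esac
}
```

For example, `classify_symptom "Connection timed out during mount"` prints `B: Network/Security Group`.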

2. Category A — Client Installation

mount.s3files: command not found means amazon-efs-utils is missing or older than v3.0.0.
bash
sudo yum -y install amazon-efs-utils  # Amazon Linux
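To tell "missing" apart from "too old", the installed version can be compared against 3.0.0. The sketch below assumes an RPM-based system such as Amazon Linux and GNU `sort -V`; the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Return success when version $1 is greater than or equal to version $2.
version_ge() {
  # sort -V orders version strings; if $2 sorts first (or ties), then $1 >= $2.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical check for RPM-based hosts; not run automatically.
efs_utils_ok() {
  local installed
  if ! installed="$(rpm -q --queryformat '%{VERSION}' amazon-efs-utils 2>/dev/null)"; then
    echo "amazon-efs-utils is not installed" >&2
    return 1
  fi
  if version_ge "$installed" "3.0.0"; then
    echo "amazon-efs-utils $installed is new enough"
  else
    echo "amazon-efs-utils $installed is older than 3.0.0" >&2
    return 1
  fi
}
```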

3. Category B — Network/Security Group

Connection timeout is the most common mount failure, and the cause is almost always security groups.
Verify mount target exists in the instance's AZ:
bash
aws s3files list-mount-targets --file-system-id fs-ID --region REGION
Cross-AZ mounting works but adds latency.
Verify security groups — most common fix:
  • Mount target SG MUST have inbound TCP 2049 from compute SG
  • Compute SG MUST have outbound TCP 2049 to mount target SG
  • Fix:
    aws ec2 authorize-security-group-ingress --group-id sg-MT --protocol tcp --port 2049 --source-group sg-COMPUTE
Test connectivity:
bash
nc -zv az-ID.fs-ID.s3files.REGION.on.aws 2049
Note: These SG troubleshooting steps also apply to EFS; use aws efs describe-mount-targets instead.
Mount hangs in isolated VPC: If the VPC has no internet access, S3 Files requires a CloudWatch Logs VPC endpoint (com.amazonaws.REGION.logs) for the mount to complete.
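The connectivity test can be wrapped so that a silently dropped packet does not hang the shell. This is a sketch: the function names are hypothetical, the host name simply follows the az-ID.fs-ID.s3files.REGION.on.aws pattern shown above, and `-w 5` relies on the timeout option found in common `nc` variants.

```bash
#!/usr/bin/env bash
# Build the mount target host name from its parts, following the pattern above.
mount_target_host() {
  local az_id="$1" fs_id="$2" region="$3"
  printf '%s.%s.s3files.%s.on.aws\n' "$az_id" "$fs_id" "$region"
}

# Probe NFS port 2049 with a 5-second limit; not run automatically.
probe_nfs_port() {
  local host="$1"
  if nc -z -v -w 5 "$host" 2049; then
    echo "TCP 2049 reachable on $host"
  else
    echo "TCP 2049 NOT reachable on $host: check both security groups" >&2
    return 1
  fi
}
```

Usage: `probe_nfs_port "$(mount_target_host az-ID fs-ID REGION)"`, substituting real values for the placeholders.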

4. Category C — IAM Permissions

File system stuck in "creating" status: S3 Files does NOT validate IAM role permissions at creation time. A wrong trust policy or missing permissions leaves the file system stuck in creating, with access denied reported in statusMessage.
Check status:
bash
aws s3files get-file-system --file-system-id fs-ID --region REGION
Check statusMessage. If it reports access denied, fix the IAM role, then delete and recreate the file system.
Mount access denied: Compute role needs s3files:ClientMount. For dev/test only, AmazonS3FilesClientFullAccess is acceptable; avoid it in production.
Write permission denied: Compute role needs s3files:ClientWrite.
Root access denied: Compute role needs s3files:ClientRootAccess. ⚠️ This bypasses POSIX permissions; prefer access points with scoped POSIX users.
Check file system policy:
bash
aws s3files get-file-system-policy --file-system-id fs-ID --region REGION
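The three denial cases above map one-to-one onto client IAM actions, which a small helper can make explicit. The function names are hypothetical; the `--query statusMessage` path assumes the field sits at the top level of the get-file-system response, which the text above does not confirm.

```bash
#!/usr/bin/env bash
# Return the IAM action the compute role needs for a denied operation.
required_action() {
  case "$1" in
    mount) echo "s3files:ClientMount" ;;
    write) echo "s3files:ClientWrite" ;;
    root)  echo "s3files:ClientRootAccess" ;;   # bypasses POSIX permissions
    *)     echo "unknown operation: $1" >&2; return 1 ;;
  esac
}

# Print only the status message of a file system; not run automatically.
# Assumption: statusMessage is a top-level field of the response.
show_status_message() {
  local fs_id="$1" region="$2"
  aws s3files get-file-system --file-system-id "$fs_id" --region "$region" \
    --query 'statusMessage' --output text
}
```

For instance, `required_action write` prints `s3files:ClientWrite`, the action to look for in the compute role's policy.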

5. Category D — Synchronization

Files not appearing in S3: Writes sync within ~60 seconds. Check status:
bash
getfattr -n "user.s3files.status;$(date -u +%s)" filename --only-values
Common ExportError values:
| Error | Fix |
| --- | --- |
| S3AccessDenied | File system IAM role lacks S3 write permissions |
| S3BucketNotFound | Bucket deleted or renamed |
| RoleAssumptionFailed | Trust policy misconfigured |
| EncryptionKeyInaccessible | KMS key disabled or permissions revoked |
| PathTooLong | File path exceeds 1,024 byte S3 key limit |
Monitor: PendingExports CloudWatch metric. A growing value means writes exceed the 800 files/sec export rate.
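PathTooLong can be caught before writing by measuring a path in bytes rather than characters, since multi-byte characters count more than once. In this sketch the function names are hypothetical, and it assumes the S3 key corresponds to the path relative to the mount root.

```bash
#!/usr/bin/env bash
# Count the bytes (not characters) in a string.
key_bytes() {
  printf '%s' "$1" | wc -c | tr -d '[:space:]'
}

# Succeed when the key fits within the 1,024 byte S3 key limit.
key_fits_s3_limit() {
  [ "$(key_bytes "$1")" -le 1024 ]
}

# Query the sync status of one file using the getfattr form shown above;
# not run automatically.
export_status() {
  getfattr -n "user.s3files.status;$(date -u +%s)" "$1" --only-values
}
```

Pass the path relative to the mount point, for example `key_fits_s3_limit "reports/2024/summary.csv"`.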

6. Category E — Conflict Resolution

Files in .s3files-lost+found-{fs-id} indicate a sync conflict (the same file was modified through the file system and S3 simultaneously). S3 wins; the file system version is moved to lost+found.
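To see whether any conflicts have occurred, look for the lost+found directory on the mounted file system. The sketch below assumes the directory sits at the top level of the mount point, which the description above does not state; the function name is hypothetical.

```bash
#!/usr/bin/env bash
# List conflict directories directly under the given mount point.
list_conflict_dirs() {
  local mount_point="$1"
  # "+" is a literal character in a find -name glob, so no escaping is needed.
  find "$mount_point" -maxdepth 1 -type d -name '.s3files-lost+found-*'
}
```

Any file found there is the file system's version of an object that S3 overwrote; compare it with the S3 copy before deleting either.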

7. Category F — Performance

First access latency: Normal — first directory access imports metadata.
Intelligent read routing not working: Compute role needs s3:GetObject on the bucket.
Slow writes: If PendingExports is growing, distribute the load across multiple file systems.

8. Category G — Encryption/KMS

NFS server error with encrypted FS = KMS issue. Verify key is enabled and role has KMS permissions.
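Whether the key is enabled can be read from `aws kms describe-key`, whose response carries the state in `KeyMetadata.KeyState`. The function names below are hypothetical, and the sketch covers only the key state; it does not verify the role's KMS permissions.

```bash
#!/usr/bin/env bash
# Succeed only when the reported key state is "Enabled".
key_state_ok() {
  [ "$1" = "Enabled" ]
}

# Read-only check of a KMS key's state; not run automatically.
check_kms_key() {
  local key_id="$1" region="$2" state
  state="$(aws kms describe-key --key-id "$key_id" --region "$region" \
    --query 'KeyMetadata.KeyState' --output text)" || return 2
  if key_state_ok "$state"; then
    echo "KMS key $key_id is enabled"
  else
    echo "KMS key $key_id is in state: $state" >&2
    return 1
  fi
}
```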

9. Category H — VPC DNS

DNS resolution failure = VPC DNS settings disabled.
bash
aws ec2 describe-vpc-attribute --vpc-id vpc-ID --attribute enableDnsHostnames
aws ec2 describe-vpc-attribute --vpc-id vpc-ID --attribute enableDnsSupport
Both MUST be true. If not:
bash
aws ec2 modify-vpc-attribute --vpc-id vpc-ID --enable-dns-hostnames Value=true
aws ec2 modify-vpc-attribute --vpc-id vpc-ID --enable-dns-support Value=true
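The two read-only checks can be combined into one report before anything is modified. This sketch uses the CLI's `--query` option to pull the `Value` field out of the describe-vpc-attribute response; the function names are hypothetical, and the text output is compared case-insensitively as a precaution.

```bash
#!/usr/bin/env bash
# The response key is the attribute name with its first letter upper-cased,
# e.g. enableDnsSupport -> EnableDnsSupport.
response_key() {
  local first rest
  first="$(printf '%s' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')"
  rest="$(printf '%s' "$1" | cut -c2-)"
  printf '%s%s\n' "$first" "$rest"
}

# Succeed when the value is "true" in any letter case.
is_true() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    true) return 0 ;;
    *)    return 1 ;;
  esac
}

# Report both DNS attributes for a VPC; read-only, not run automatically.
check_vpc_dns() {
  local vpc_id="$1" attr value status=0
  for attr in enableDnsHostnames enableDnsSupport; do
    value="$(aws ec2 describe-vpc-attribute --vpc-id "$vpc_id" --attribute "$attr" \
      --query "$(response_key "$attr").Value" --output text)" || return 2
    if is_true "$value"; then
      echo "$attr: enabled"
    else
      echo "$attr: DISABLED" >&2
      status=1
    fi
  done
  return "$status"
}
```

Only if `check_vpc_dns vpc-ID` reports a disabled attribute is the modify-vpc-attribute step needed, and that write should wait for user confirmation.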

Troubleshooting

AWS CLI endpoint URL cannot be resolved

The CLI is too old for S3 Files. Run aws --version; if it reports v1.x, upgrade to AWS CLI v2 (see Installing the AWS CLI).

ECS task fails with DNS resolution error

The task definition used efsVolumeConfiguration instead of s3filesVolumeConfiguration. Fix: use fileSystemArn in the S3 Files-specific volume config.

S3 Files vs other products confusion

S3 Files is NOT Mountpoint for S3, S3 File Gateway, or File Cache. It uses the aws s3files CLI, s3files: IAM actions, and mount -t s3files.

Enable Debug Logs

Set logging_level = DEBUG in /etc/amazon/efs/s3files-utils.conf. Logs are written to /var/log/amazon/efs/mount.log.
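The setting can be changed with a single in-place edit that keeps a backup. This is a sketch under two assumptions: the file already contains a `logging_level = ...` line, and it uses a plain `key = value` layout. The function name is hypothetical. Because it modifies a config file, explain the step and get confirmation before running it with sudo.

```bash
#!/usr/bin/env bash
# Switch an existing logging_level line to DEBUG, keeping FILE.bak as a backup.
set_debug_logging() {
  local conf="$1"
  # -i.bak edits in place and writes a backup; this form works with GNU and BSD sed.
  sed -i.bak -E 's/^[[:space:]]*logging_level[[:space:]]*=.*/logging_level = DEBUG/' "$conf"
}
```

Restore the original level afterwards by copying the `.bak` file back, since debug logs grow quickly.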

Collect Logs for AWS Support

bash
sudo tar -czf /tmp/s3files-logs.tar.gz /var/log/amazon/efs/ /etc/amazon/efs/s3files-utils.conf

Security Considerations

  • When diagnosing IAM issues, verify least-privilege — avoid FullAccess as a shortcut
  • Without a file system policy, any VPC client can mount
  • Restrict access to /var/log/amazon/efs/; the logs contain S3 key names

Additional Resources
