troubleshooting-efs
Troubleshooting EFS
Overview
Domain expertise for diagnosing and resolving Amazon EFS issues. Covers mount
failures, NFS connectivity, IAM and POSIX permissions, throughput and performance,
and encryption problems.
For authoritative guidance, see EFS Troubleshooting.
Common Tasks
0. Verify Dependencies
- You MUST verify the `aws` CLI is available
- You MUST check if `amazon-efs-utils` or `nfs-utils` is installed on the instance
- You MUST ONLY check for tool existence and version; you MUST NOT execute destructive or mutating commands during verification
- You MUST inform the user if any required tools are missing
- You MUST respect the user's decision to abort if tools are unavailable
- You SHOULD explain what each step does and why before executing it
- You SHOULD display write commands and wait for user confirmation before executing
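The verification rules above can be sketched as a read-only script. This is a minimal sketch, not a definitive check: the function names are illustrative, and it assumes that the presence of the `mount.efs` helper indicates `amazon-efs-utils` and that `mount.nfs4` indicates a plain NFS client package.

```bash
#!/usr/bin/env bash
# Read-only dependency check: looks tools up, never installs or changes anything.

# check_tool NAME: succeeds if NAME is found on PATH (sbin dirs included,
# because mount helpers usually live in /sbin or /usr/sbin).
check_tool() {
  ( PATH="$PATH:/sbin:/usr/sbin"; command -v "$1" >/dev/null 2>&1 )
}

# report_dependencies: prints one line per dependency, returns 1 if any is missing.
report_dependencies() {
  local missing=0
  if check_tool aws; then
    echo "ok: aws CLI ($(aws --version 2>&1))"
  else
    echo "missing: aws CLI"
    missing=1
  fi
  if check_tool mount.efs; then
    echo "ok: amazon-efs-utils (mount.efs helper found)"
  elif check_tool mount.nfs4; then
    echo "ok: NFS client (mount.nfs4 found), amazon-efs-utils not installed"
  else
    echo "missing: amazon-efs-utils or nfs-utils"
    missing=1
  fi
  return "$missing"
}

# Usage: report_dependencies || echo "Tell the user which tools are missing."
```

Keeping the lookup in `check_tool` makes it easy to confirm that nothing beyond `command -v` and `aws --version` runs during verification.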
1. Classify the Issue
| Symptom | Category |
|---|---|
| "wrong fs type" or mount command fails | A: Missing NFS Client |
| Connection timed out (hangs 2+ min) | B: Network/Security Group |
| "access denied by server" | C: IAM/Permissions |
| Slow throughput or high latency | D: Performance |
| NFS server error on encrypted FS | E: Encryption/KMS |
| DNS name resolution fails | F: VPC DNS |
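The table can also be applied mechanically to an error message. The sketch below maps message text to a category letter; the match patterns are assumptions based on the symptom wording in the table, not an exhaustive list of real error strings, and category D has no pattern because it is a symptom rather than an error message.

```bash
#!/usr/bin/env bash
# classify_efs_symptom MESSAGE: prints the category letter from the table above,
# or "unknown" when no pattern matches. Patterns are illustrative assumptions.
classify_efs_symptom() {
  local msg
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"wrong fs type"*)            echo A ;;
    *"timed out"*)                echo B ;;
    *"access denied by server"*)  echo C ;;
    *"nfs server error"*)         echo E ;;
    *"failed to resolve"*|*"name or service not known"*) echo F ;;
    *)                            echo unknown ;;
  esac
}

# Usage: classify_efs_symptom "mount.nfs4: Connection timed out"
```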
2. Category A — Missing NFS Client
```bash
# Amazon Linux / RHEL / CentOS
sudo yum -y install amazon-efs-utils   # preferred (includes mount helper + TLS)
# OR
sudo yum -y install nfs-utils

# Ubuntu / Debian
sudo apt-get install nfs-common
```
3. Category B — Network/Security Group
Connection timeout is the #1 EFS mount failure — almost always security groups.
- Verify a mount target exists in the instance's AZ:
  ```bash
  aws efs describe-mount-targets --file-system-id fs-ID --region REGION
  ```
- Verify security groups; check BOTH directions:
  - Mount target SG: `aws ec2 describe-security-groups --group-ids sg-MT` MUST show inbound TCP 2049 from the compute SG
  - Compute SG: MUST have outbound TCP 2049 to the mount target SG
  - Quick fix: `aws ec2 authorize-security-group-ingress --group-id sg-MT --protocol tcp --port 2049 --source-group sg-COMPUTE`
- Test connectivity:
  ```bash
  nc -zv fs-ID.efs.REGION.amazonaws.com 2049
  ```

Note: These security group troubleshooting steps also apply to S3 Files. The only difference is that S3 Files uses `aws s3files list-mount-targets` instead of `aws efs describe-mount-targets`.
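When the connectivity test fails, it helps to tell a DNS failure (Category F) apart from a blocked port (Category B). The following is a rough sketch under the assumption that `getent` and `nc` are installed on the instance; the function names and the 5-second timeout are illustrative choices.

```bash
#!/usr/bin/env bash
# efs_dns_name FS_ID REGION: builds the file system DNS name used above.
efs_dns_name() {
  printf '%s.efs.%s.amazonaws.com\n' "$1" "$2"
}

# probe_mount_target FS_ID REGION: read-only probe of TCP 2049.
# Returns 2 on DNS failure, 1 if the port is unreachable, 0 if it is open.
probe_mount_target() {
  local host
  host=$(efs_dns_name "$1" "$2")
  if ! getent hosts "$host" >/dev/null 2>&1; then
    echo "dns-failure: $host does not resolve (see Category F)"
    return 2
  fi
  if nc -z -w 5 "$host" 2049 >/dev/null 2>&1; then
    echo "open: $host:2049 is reachable"
  else
    echo "blocked: $host:2049 not reachable within 5s (check both security groups)"
    return 1
  fi
}

# Usage: probe_mount_target fs-ID REGION
```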
4. Category C — IAM/Permissions
"access denied by server" with `-o iam`:
- Check that the identity-based IAM policy has `elasticfilesystem:ClientMount`
- Check the file system resource policy:
  ```bash
  aws efs describe-file-system-policy --file-system-id fs-ID --region REGION
  ```

Note: IAM authorization is only enforced when a file system policy exists that requires it. Without a file system policy, any client in the VPC with port 2049 access can mount, even with `-o iam`. To enforce IAM, you MUST create a file system policy that denies anonymous access.

POSIX permission denied (not IAM):
- Check file/directory ownership: `ls -la /mnt/efs/`
- Use access points to enforce UID/GID for consistent permissions
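One way to enforce IAM is a file system policy that allows only a named principal, so that clients without that identity are no longer admitted. The sketch below prints such a policy; the function name, the `Sid`, the account ID `111122223333`, and the role name are hypothetical placeholders, and the action list is an assumption about what the client needs (mount plus write).

```bash
#!/usr/bin/env bash
# efs_iam_only_policy ROLE_ARN FS_ARN: prints a file system policy that allows
# mounting and writing for one IAM role only. All ARNs are caller-supplied.
efs_iam_only_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowMountForOneRole",
      "Effect": "Allow",
      "Principal": { "AWS": "$1" },
      "Action": [
        "elasticfilesystem:ClientMount",
        "elasticfilesystem:ClientWrite"
      ],
      "Resource": "$2"
    }
  ]
}
EOF
}

# Usage (the second command WRITES a policy; show it and wait for confirmation):
#   efs_iam_only_policy arn:aws:iam::111122223333:role/EXAMPLE-ROLE \
#     arn:aws:elasticfilesystem:REGION:111122223333:file-system/fs-ID > policy.json
#   aws efs put-file-system-policy --file-system-id fs-ID \
#     --policy file://policy.json --region REGION
```

Generating the JSON separately from applying it keeps the mutating step visible, in line with the confirmation rule in step 0.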
5. Category D — Performance
Check throughput mode:
```bash
aws efs describe-file-systems --file-system-id fs-ID --region REGION --query 'FileSystems[0].ThroughputMode'
```
Burst credit exhaustion (Bursting mode only):
```bash
aws cloudwatch get-metric-statistics --namespace AWS/EFS --metric-name BurstCreditBalance \
  --dimensions Name=FileSystemId,Value=fs-ID --period 3600 --statistics Average \
  --start-time $(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S)
```
If credits are near zero, switch to Elastic throughput:
```bash
aws efs update-file-system --file-system-id fs-ID --throughput-mode elastic --region REGION
```
General Purpose vs Max I/O:
- Check the `PercentIOLimit` metric; if it is consistently >80%, consider Max I/O
- Note: performance mode is IMMUTABLE; you must create a new FS and migrate
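The burst credit check can be reduced to a single number and compared against a threshold. This is a sketch only: the function names are illustrative, the threshold is a value you choose (the metric is reported in bytes), and `date -d` assumes GNU date.

```bash
#!/usr/bin/env bash
# min_burst_credits FS_ID REGION: prints the lowest hourly average of
# BurstCreditBalance over the last 24 hours (read-only CloudWatch query).
min_burst_credits() {
  aws cloudwatch get-metric-statistics --namespace AWS/EFS \
    --metric-name BurstCreditBalance \
    --dimensions Name=FileSystemId,Value="$1" --region "$2" \
    --period 3600 --statistics Average \
    --start-time "$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%S)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%S)" \
    --query 'min(Datapoints[].Average)' --output text
}

# credits_low VALUE THRESHOLD: succeeds only if VALUE is numeric and below
# THRESHOLD. Non-numeric input (for example, no datapoints) counts as "not low".
credits_low() {
  awk -v v="$1" -v t="$2" \
    'BEGIN { exit !(v ~ /^[0-9.eE+-]+$/ && v + 0 < t + 0) }'
}

# Usage (threshold of 1000000000 bytes is a hypothetical example):
#   credits_low "$(min_burst_credits fs-ID REGION)" 1000000000 \
#     && echo "Burst credits are low; consider Elastic throughput."
```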
6. Category E — Encryption/KMS
NFS server error on encrypted FS = KMS key issue.
- Verify key is enabled in KMS console
- Verify EFS service-linked role has KMS permissions
- If key deleted: cancel deletion if within grace period
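The key checks above can also be done from the CLI instead of the console. In this sketch the helper name and the advice strings are illustrative and `KEY-ID` is a placeholder; a key whose deletion was cancelled comes back in the `Disabled` state, so it also needs `enable-key`.

```bash
#!/usr/bin/env bash
# Read-only lookups (run these first):
#   aws efs describe-file-systems --file-system-id fs-ID --region REGION \
#     --query 'FileSystems[0].KmsKeyId' --output text
#   aws kms describe-key --key-id KEY-ID --region REGION \
#     --query 'KeyMetadata.KeyState' --output text

# kms_key_advice STATE: maps a KMS KeyState value to a suggested next step.
kms_key_advice() {
  case "$1" in
    Enabled)
      echo "key is enabled; check key policy and grants next" ;;
    Disabled)
      echo "re-enable: aws kms enable-key --key-id KEY-ID" ;;
    PendingDeletion)
      echo "cancel deletion: aws kms cancel-key-deletion --key-id KEY-ID, then enable-key" ;;
    *)
      echo "unexpected state '$1'; inspect the key in the KMS console" ;;
  esac
}

# Usage: kms_key_advice "$(aws kms describe-key --key-id KEY-ID \
#          --query 'KeyMetadata.KeyState' --output text)"
```

The `enable-key` and `cancel-key-deletion` calls change key state, so display them and wait for confirmation before running either one.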
7. Category F — VPC DNS
DNS resolution failure = VPC DNS settings disabled.
```bash
aws ec2 describe-vpc-attribute --vpc-id vpc-ID --attribute enableDnsHostnames
aws ec2 describe-vpc-attribute --vpc-id vpc-ID --attribute enableDnsSupport
```
Both MUST be `true`. If not:
```bash
aws ec2 modify-vpc-attribute --vpc-id vpc-ID --enable-dns-hostnames Value=true
aws ec2 modify-vpc-attribute --vpc-id vpc-ID --enable-dns-support Value=true
```
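Both attribute checks can be wrapped in one read-only helper that reports a single pass/fail. This is a sketch with illustrative function names; it assumes the response key is the attribute name with a capital first letter (`EnableDnsSupport`, `EnableDnsHostnames`) and compares the value case-insensitively.

```bash
#!/usr/bin/env bash
# vpc_dns_attr VPC_ID ATTRIBUTE: prints the boolean value of one DNS attribute.
vpc_dns_attr() {
  local key="E${2#e}"   # enableDnsSupport -> EnableDnsSupport
  aws ec2 describe-vpc-attribute --vpc-id "$1" --attribute "$2" \
    --query "$key.Value" --output text
}

# check_vpc_dns VPC_ID: prints both attributes, returns 1 unless both are true.
check_vpc_dns() {
  local rc=0 attr value
  for attr in enableDnsSupport enableDnsHostnames; do
    value=$(vpc_dns_attr "$1" "$attr" | tr '[:upper:]' '[:lower:]')
    echo "$attr=$value"
    [ "$value" = "true" ] || rc=1
  done
  return "$rc"
}

# Usage: check_vpc_dns vpc-ID || echo "Enable the missing attribute(s)."
```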
Troubleshooting
Mount hangs then times out
Most common cause: security group. Verify TCP 2049 is open between compute and mount target.
Auto-mount fails on reboot
Make sure the `/etc/fstab` entry includes the `_netdev` option, which defers the mount until the network is up.
"nfs not responding" after reconnect
Old kernel bug with TCP port reuse. Update the kernel or add the `noresvport` mount option.
Enable Debug Logs
Set `logging_level = DEBUG` in `/etc/amazon/efs/efs-utils.conf`. Logs are written to `/var/log/amazon/efs/mount.log`.
Collect Logs for AWS Support
```bash
sudo tar -czf /tmp/efs-logs.tar.gz /var/log/amazon/efs/ /etc/amazon/efs/efs-utils.conf
```
Security Considerations
- IAM authorization is only enforced when a file system policy exists — without one, any VPC client with port 2049 access can mount
- When troubleshooting access denied, verify both identity-based and resource-based policies
- Use `-o tls` for encryption in transit; unencrypted NFS traffic is visible on the network
- Restrict access to `/var/log/amazon/efs/`; logs may contain file system IDs and mount target IPs