network-interface-health
Network Interface Health
Use this skill when a network symptom might be caused by a physical link, switch
port, cable, transceiver, duplex setting, or congested interface.
When to Use
- A host or VLAN has packet loss, latency spikes, or intermittent reachability.
- A switch or router interface shows CRCs, runts, giants, drops, resets, or flaps.
- You need to compare both ends of a link before replacing hardware.
- A change window needs before/after interface counter evidence.
- Monitoring reports rising `ifInErrors`, `ifOutErrors`, or `ifOutDiscards`.
How It Works
Interface counters are evidence, but the trend matters more than the absolute
number. Capture a baseline, wait a measurement interval, capture again, then
compare increments.
```text
show interfaces <interface>
show interfaces <interface> status
show logging | include <interface>|changed state|line protocol
```
On Linux hosts:
```text
ip -s link show <interface>
ethtool <interface>
ethtool -S <interface>
```
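The baseline-then-compare step can be sketched in Python. This is a minimal sketch, assuming the counters from two captures have already been collected into plain dictionaries; the function name `counter_deltas` and the counter keys are illustrative and do not come from any vendor tool. A counter that goes backwards is reported as unknown, since it was probably cleared, reset, or wrapped between captures.

```python
from typing import Optional


def counter_deltas(
    before: dict[str, int],
    after: dict[str, int],
    interval_s: float,
) -> dict[str, dict[str, Optional[float]]]:
    """Return the increment and per-second rate for counters present in both snapshots."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    result: dict[str, dict[str, Optional[float]]] = {}
    for name in sorted(before.keys() & after.keys()):
        delta = after[name] - before[name]
        if delta < 0:
            # Counter went backwards: cleared, reset, or wrapped. Do not guess.
            result[name] = {"delta": None, "per_second": None}
        else:
            result[name] = {"delta": delta, "per_second": delta / interval_s}
    return result


# Hypothetical values: CRCs are historical, output drops are still growing.
baseline = {"crc": 1200, "input_errors": 1250, "output_drops": 40}
later = {"crc": 1200, "input_errors": 1250, "output_drops": 340}
for name, change in counter_deltas(baseline, later, interval_s=300).items():
    print(name, change)
```

Only counters with a positive delta deserve attention here; a large absolute value that does not move during the interval is history, not an active fault.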
Counter Reference
| Counter | Meaning | Common cause |
|---|---|---|
| CRC | Received frame checksum failed | Bad cable, dirty fiber, bad optic, duplex mismatch |
| input errors | Aggregate receive-side errors | Check sub-counters before concluding |
| runts | Frames below minimum Ethernet size | Duplex mismatch, collision domain, faulty NIC |
| giants | Frames larger than expected MTU | MTU mismatch or jumbo-frame boundary |
| input drops | Device could not accept inbound packets | Burst, oversubscription, CPU path, queue pressure |
| output drops | Egress queue discarded packets | Congestion, QoS policy, undersized uplink |
| resets | Interface hardware reset | Flapping, keepalive, driver, optic, power |
| collisions | Ethernet collision counter | Half duplex or negotiation mismatch |
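The table above can also be encoded as a small lookup for triage scripts. The sketch below is one possible shape, not a fixed schema: the dictionary keys and the `likely_causes` function name are assumptions made for illustration. The aggregate `input errors` counter is left out on purpose, because the table says to check its sub-counters before concluding anything.

```python
# Common causes copied from the counter reference table.
# Keys are hypothetical snake_case names for the table's counters.
COMMON_CAUSES: dict[str, list[str]] = {
    "crc": ["bad cable", "dirty fiber", "bad optic", "duplex mismatch"],
    "runts": ["duplex mismatch", "collision domain", "faulty NIC"],
    "giants": ["MTU mismatch", "jumbo-frame boundary"],
    "input_drops": ["burst", "oversubscription", "CPU path", "queue pressure"],
    "output_drops": ["congestion", "QoS policy", "undersized uplink"],
    "resets": ["flapping", "keepalive", "driver", "optic", "power"],
    "collisions": ["half duplex", "negotiation mismatch"],
}


def likely_causes(deltas: dict[str, int]) -> dict[str, list[str]]:
    """Map each counter that increased during the interval to its common causes."""
    return {
        name: COMMON_CAUSES[name]
        for name, delta in deltas.items()
        if delta > 0 and name in COMMON_CAUSES
    }


# Hypothetical increments over one measurement interval.
print(likely_causes({"crc": 12, "giants": 0, "output_drops": 300}))
```

The result is a list of places to look first, not a verdict; several counters share causes such as duplex mismatch, so overlapping entries are a useful hint.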
Diagnosis Flow
CRCs Or Input Errors
- Confirm counters are incrementing, not just historical.
- Check both ends of the link. Receive-side errors usually implicate the signal arriving at that port (the cable, the optics, or the far-end transmitter), not necessarily the port that reports the error.
- Replace patch cable or clean/replace fiber and optics.
- Confirm speed/duplex settings match on both sides.
- Check logs for flap events around the same timestamp.
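The both-ends comparison in the steps above can be expressed as a small helper. This is a sketch under stated assumptions: the inputs are CRC increments measured over the same interval on each end of one link, and the `crc_suspects` name and the wording of its suggestions are illustrative only.

```python
def crc_suspects(local_rx_delta: int, remote_rx_delta: int) -> list[str]:
    """Suggest what to inspect first, given receive-side CRC increments on each end."""
    if local_rx_delta < 0 or remote_rx_delta < 0:
        raise ValueError("deltas must be non-negative; re-baseline after a counter clear")
    suspects: list[str] = []
    if local_rx_delta > 0:
        # Errors counted locally describe the signal sent by the remote side.
        suspects.append(
            "remote-to-local direction: remote transmitter or optic, "
            "the cable or fiber strand toward the local port, local receiver"
        )
    if remote_rx_delta > 0:
        suspects.append(
            "local-to-remote direction: local transmitter or optic, "
            "the cable or fiber strand toward the remote port, remote receiver"
        )
    if local_rx_delta > 0 and remote_rx_delta > 0:
        suspects.append("both directions affected: shared cable or a speed/duplex mismatch")
    return suspects


# Hypothetical increments: only the local port is counting new CRCs.
for line in crc_suspects(local_rx_delta=48, remote_rx_delta=0):
    print(line)
```

An empty list means neither end counted new CRCs during the interval, so any existing totals are historical.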
Drops
- Separate input drops from output drops.
- Compare interface rate against capacity.
- Check QoS policy, queue counters, and whether the link is an oversubscribed uplink.
- Treat queue tuning as secondary. First prove whether the link is congested.
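Proving congestion before tuning queues comes down to comparing the measured rate with link capacity. The sketch below assumes octet counters sampled at the start and end of an interval; the function names and the 80 percent threshold are illustrative assumptions, not values taken from any vendor guidance. Utilization averaged over a long interval can hide short bursts, so a low figure alongside rising drops is a reason to sample more often, not proof that the link is idle.

```python
def utilization_percent(
    octets_before: int,
    octets_after: int,
    interval_s: float,
    speed_bps: int,
) -> float:
    """Average utilization over the interval, as a percentage of link speed."""
    if interval_s <= 0 or speed_bps <= 0:
        raise ValueError("interval_s and speed_bps must be positive")
    delta = octets_after - octets_before
    if delta < 0:
        raise ValueError("octet counter went backwards; capture a new baseline")
    return delta * 8 * 100 / (interval_s * speed_bps)


def classify_output_drops(
    drop_delta: int,
    util_pct: float,
    busy_threshold_pct: float = 80.0,  # assumed threshold, adjust per site
) -> str:
    """Give a rough label for output drops observed during the interval."""
    if drop_delta <= 0:
        return "no active drops"
    if util_pct >= busy_threshold_pct:
        return "sustained congestion"
    return "drops without sustained load"


# Hypothetical 1 Gb/s uplink sampled 300 seconds apart.
util = utilization_percent(0, 33_750_000_000, 300, 1_000_000_000)
print(round(util, 1), classify_output_drops(drop_delta=500, util_pct=util))
```

A "drops without sustained load" result points toward bursts, QoS policy, or queue limits, which is where the queue counters from the list above become relevant.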
Duplex And Speed
Prefer auto-negotiation on modern Ethernet links when both sides support it. If
one side must be fixed, configure both sides explicitly and document why. Never
mix fixed speed/duplex on one side with auto on the other.
```text
show interfaces <interface> | include duplex|speed
```
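Once the settings from both ends are recorded, the rule above is easy to check mechanically. The following is a minimal sketch; the `PortConfig` fields and the `negotiation_findings` function are hypothetical names chosen for illustration, and the values would have to be filled in from each device's own output.

```python
from dataclasses import dataclass


@dataclass
class PortConfig:
    autoneg: bool  # True if the port is set to auto-negotiate
    speed: str     # operational speed as reported, e.g. "1000"
    duplex: str    # "full" or "half"


def negotiation_findings(local: PortConfig, remote: PortConfig) -> list[str]:
    """List speed/duplex problems between the two ends of one link."""
    findings: list[str] = []
    if local.autoneg != remote.autoneg:
        findings.append("auto-negotiation on one side, fixed settings on the other")
    if local.duplex.lower() != remote.duplex.lower():
        findings.append("duplex mismatch")
    if local.speed != remote.speed:
        findings.append("speed mismatch")
    return findings


# Hypothetical link: one side fixed at 100/full, the other left on auto.
local = PortConfig(autoneg=False, speed="100", duplex="full")
remote = PortConfig(autoneg=True, speed="100", duplex="half")
for finding in negotiation_findings(local, remote):
    print(finding)
```

The example mirrors the classic failure: the auto side cannot negotiate with a fixed peer and falls back to half duplex, which then shows up as collisions, runts, and CRCs.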
Safe Parser Example
Slice each interface block from one header to the next. Do not use an arbitrary
character window; large interface blocks can cause counters to be missed or
assigned to the wrong port.
```python
import re
from typing import Any

# Header line of each interface block, e.g. "<name> is up, line protocol is up".
HEADER_RE = re.compile(
    r"^(?P<name>\S+) is (?P<status>(?:administratively )?down|up), "
    r"line protocol is (?P<protocol>up|down)",
    re.I | re.M,
)
ERROR_RE = re.compile(r"(?P<input>\d+) input errors, (?P<crc>\d+) CRC", re.I)
# Matches output errors, not output drops.
OUTPUT_ERROR_RE = re.compile(r"(?P<output>\d+) output errors", re.I)
DUPLEX_RE = re.compile(r"(?P<duplex>Full|Half|Auto)-duplex,\s+(?P<speed>[^,]+)", re.I)


def parse_show_interfaces(raw: str) -> list[dict[str, Any]]:
    """Parse `show interfaces` output into one dict per interface block."""
    headers = list(HEADER_RE.finditer(raw))
    interfaces = []
    for index, header in enumerate(headers):
        # A block runs from this header to the next header, or to the end of the text.
        end = headers[index + 1].start() if index + 1 < len(headers) else len(raw)
        block = raw[header.start():end]
        errors = ERROR_RE.search(block)
        output_errors = OUTPUT_ERROR_RE.search(block)
        duplex = DUPLEX_RE.search(block)
        # Counters that are absent from the block are reported as 0.
        interfaces.append({
            "name": header.group("name"),
            "status": header.group("status"),
            "protocol": header.group("protocol"),
            "duplex": duplex.group("duplex") if duplex else "unknown",
            "speed": duplex.group("speed").strip() if duplex else "unknown",
            "input_errors": int(errors.group("input")) if errors else 0,
            "crc_errors": int(errors.group("crc")) if errors else 0,
            "output_errors": int(output_errors.group("output")) if output_errors else 0,
        })
    return interfaces
```
Examples
CRCs On One Switch Port
- Capture counters on the local port.
- Capture counters on the connected remote port.
- Replace the cable or optic before changing routing or firewall rules.
- Clear counters only after recording the baseline.
- Recheck after a fixed interval.
Internet Slow But LAN Is Fine
- Check WAN interface drops/errors.
- Check LAN uplink utilization and output drops.
- Check gateway CPU if the WAN link is clean but throughput is still low.
- Compare wired and wireless tests before blaming upstream service.
Anti-Patterns
- Clearing counters before saving a baseline.
- Looking at only one side of a link.
- Assuming all historical CRCs are active problems without a time window.
- Mixing auto-negotiation on one side with fixed speed/duplex on the other.
- Treating output drops as a cable problem before checking congestion.
See Also
- Agent: `network-troubleshooter`
- Skill: `network-config-validation`
- Skill: `homelab-network-setup`