log-redaction-auditor

When to invoke

You have application logs (text) and want to check whether secrets/PII might be present.
You want a repeatable, automated check in CI/CD before sharing logs externally.

Inputs needed

- `--input`: path to a log file (UTF-8 text).
- Optional: `--config`: path to a JSON config overriding patterns and allowlists.
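
A minimal sketch of how these inputs could be parsed and a config override applied. The config keys `patterns` and `allowlist`, the default values, and treating `--output` as required are assumptions for illustration; the actual schema used by `log_redaction_auditor.py` is not shown here.

```python
import argparse
import json
from pathlib import Path

# Hypothetical built-in defaults; the real tool's rules are not documented here.
DEFAULT_CONFIG = {
    "patterns": {"email": r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"},
    "allowlist": ["example.com"],
}


def build_parser():
    """Declare the flags named in this document."""
    parser = argparse.ArgumentParser(prog="log_redaction_auditor")
    parser.add_argument("--input", required=True, help="path to a UTF-8 log file")
    parser.add_argument("--config", help="optional JSON config overriding patterns and allowlists")
    # Assumption: --output is required; the document only says the report is written there.
    parser.add_argument("--output", required=True, help="path for the JSON report")
    return parser


def load_config(path=None):
    """Return the defaults, with top-level keys replaced by those in the config file."""
    config = {key: value.copy() for key, value in DEFAULT_CONFIG.items()}
    if path is not None:
        overrides = json.loads(Path(path).read_text(encoding="utf-8"))
        config.update(overrides)
    return config
```

In this sketch a key present in the config file replaces the default wholesale rather than merging with it, which keeps the override behaviour easy to predict.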

Workflow

Scan log lines with conservative rules for likely secrets/PII.
Apply allowlists (known test keys/domains) to reduce false positives.
Emit a JSON report with counts, examples, and line numbers.
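
The scan and allowlist steps can be sketched as follows. The two rules, their severities, and the substring-based allowlist check are illustrative assumptions, not the rules shipped with the tool.

```python
import re

# Hypothetical rules: (rule_id, severity, compiled pattern).
RULES = [
    ("aws_access_key_id", "high", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("email", "medium", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
]

# Hypothetical allowlist: a documented example key and a reserved test domain.
ALLOWLIST = {"AKIAIOSFODNN7EXAMPLE", "example.com"}


def is_allowed(match_text, allowlist):
    """Treat a match as a known test value if any allowlist entry occurs in it."""
    return any(entry in match_text for entry in allowlist)


def scan_lines(lines, rules=RULES, allowlist=ALLOWLIST):
    """Return one finding per non-allowlisted match, with 1-based line numbers."""
    findings = []
    for line_number, line in enumerate(lines, start=1):
        for rule_id, severity, pattern in rules:
            for found in pattern.finditer(line):
                if is_allowed(found.group(0), allowlist):
                    continue
                findings.append({
                    "severity": severity,
                    "rule_id": rule_id,
                    "line_number": line_number,
                    "match": found.group(0),
                    "context": line.strip(),
                })
    return findings
```

The function only reads the lines it is given and returns new dictionaries, so the input is never modified.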

Output format

JSON written to `--output` with:
- `summary`: counts by severity and rule.
- `findings`: list of matches with `severity`, `rule_id`, `line_number`, `match`, and `context`.
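
One way such a report might be assembled and written, as a sketch. The `summary` sub-keys `by_severity` and `by_rule` are assumed names; the document only states that counts are grouped by severity and rule.

```python
import json
from collections import Counter


def build_report(findings):
    """Wrap findings with per-severity and per-rule counts."""
    return {
        "summary": {
            # Assumed key names for the two groupings.
            "by_severity": dict(Counter(f["severity"] for f in findings)),
            "by_rule": dict(Counter(f["rule_id"] for f in findings)),
        },
        "findings": findings,
    }


def write_report(report, output_path):
    """Serialize the report as UTF-8 JSON at the --output path."""
    with open(output_path, "w", encoding="utf-8") as handle:
        json.dump(report, handle, indent=2, ensure_ascii=False)
```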

Guardrails

Do not modify the input logs.
Avoid printing raw secrets to stdout; write matched values only to the output file.
Provide an allowlist mechanism to reduce false positives.
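
A sketch of how console output could respect the stdout guardrail: the summary printed to the terminal shows only a masked form of each match, while the raw value stays in the report file. The `mask` and `console_summary` helpers and the choice of four visible characters are hypothetical.

```python
def mask(secret, visible=4):
    """Keep the first few characters for triage and star out the rest."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)


def console_summary(findings):
    """Build stdout-safe text: counts and locations, never the raw match."""
    lines = [f"{len(findings)} finding(s)"]
    for finding in findings:
        lines.append(
            f"line {finding['line_number']}: {finding['rule_id']} "
            f"({finding['severity']}) {mask(finding['match'])}"
        )
    return "\n".join(lines)
```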

Reference code

`log_redaction_auditor.py` implements the scanner using Python stdlib regex + JSON.
