Chrome DevTools Skill

Use Chrome DevTools Protocol for web page data acquisition and analysis.

Workflow (Automatically Executed by AI)

When users use this skill, the AI should automatically complete all the following steps instead of asking users to perform them manually:

1. Automatically Start Chrome Remote Debugging

The AI automatically detects the operating system and starts Chrome:

Windows:

bash

start chrome --remote-debugging-port=9222 --user-data-dir="%TEMP%\chrome-devtools-profile" [URL]

Important Note: Use a fixed `--user-data-dir` (e.g., `chrome-devtools-profile`) instead of a different folder each time. Chrome then keeps login state, cookies, and cache between runs, which makes subsequent starts faster.

macOS:

bash

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --user-data-dir="$TMPDIR/chrome-profile-stable" [URL]

Linux:

bash

google-chrome --remote-debugging-port=9222 --user-data-dir="/tmp/chrome-profile-stable" [URL]
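The per-OS commands above can be chosen programmatically. The following is a minimal sketch, not the skill's actual launcher: the function name `build_chrome_command`, the single shared profile name, and the use of the system temp directory are assumptions made for illustration.

```python
import os
import platform
import tempfile


def build_chrome_command(url, port=9222, system=None,
                         profile_name="chrome-devtools-profile"):
    """Return an argv list that launches Chrome with remote debugging enabled."""
    system = system or platform.system()
    # A fixed profile directory, so cookies and logins survive restarts.
    profile_dir = os.path.join(tempfile.gettempdir(), profile_name)
    flags = [
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",
        url,
    ]
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is its window-title argument.
        return ["cmd", "/c", "start", "", "chrome", *flags]
    if system == "Darwin":
        return ["/Applications/Google Chrome.app/Contents/MacOS/Google Chrome", *flags]
    # Assumption: any other system has `google-chrome` on PATH, as on Linux.
    return ["google-chrome", *flags]


if __name__ == "__main__":
    print(build_chrome_command("https://www.example.com"))
```

The returned list can be passed to `subprocess.Popen` so that Chrome keeps running in the background while the remaining steps execute.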

2. Automatically Start MCP Server

The AI automatically starts the MCP server (runs in the background):

bash

npx -y chrome-devtools-mcp@latest --browser-url=http://127.0.0.1:9222
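Chrome may need a moment before port 9222 accepts connections, so it can help to confirm the port is reachable before starting the MCP server. A minimal sketch follows; the helper name `wait_for_port` and its default timings are assumptions, not part of the skill.

```python
import socket
import time


def wait_for_port(host="127.0.0.1", port=9222, timeout=10.0, interval=0.25):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means Chrome is listening on the debugging port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    print("debugging port ready:", wait_for_port(timeout=2.0))
```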

3. Execute the Operation Requested by the User

Complete user tasks using the following MCP tools:

`mcp__chrome-devtools__navigate_page` - Navigate to a page
`mcp__chrome-devtools__wait_for` - Wait for page loading
`mcp__chrome-devtools__take_snapshot` - Get page snapshot
`mcp__chrome-devtools__evaluate_script` - Execute JavaScript to extract data
`mcp__chrome-devtools__take_screenshot` - Take screenshots
`mcp__chrome-devtools__list_network_requests` - List network requests

Standard Operating Procedure (For AI Use)

When users request to open/analyze/crawl a web page, the AI executes in the following order:

1. Start Chrome (in debugging mode)
2. Start MCP server
3. Navigate to the target URL
4. Wait for page loading
5. Perform the operation requested by the user (screenshot/data extraction/network monitoring, etc.)
6. Return results to the user

Data Extraction Techniques

Extract List Data:

javascript

() => {
  const items = document.querySelectorAll(".item");
  return Array.from(items).map((item) => ({
    title: item.querySelector(".title")?.innerText,
    link: item.querySelector("a")?.href,
  }));
};

Extract Table Data:

javascript

() => {
  const rows = document.querySelectorAll("table tr");
  return Array.from(rows).map((row) =>
    Array.from(row.querySelectorAll("td, th")).map((cell) => cell.innerText)
  );
};
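The table script returns a list of rows, each a list of cell strings, with the header row first when the table has one. On the Python side those rows can be turned into records keyed by column name. This is a small sketch under that assumption; `rows_to_records` is a hypothetical helper name.

```python
def rows_to_records(rows):
    """Convert [[header...], [cell...], ...] into a list of dicts keyed by header."""
    # Trim whitespace and drop rows that contain no cells at all.
    rows = [[cell.strip() for cell in row] for row in rows if row]
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


if __name__ == "__main__":
    sample = [["Name", "Price"], ["Widget", " 9.99 "], []]
    print(rows_to_records(sample))
```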

Common Use Cases

Open web page and take screenshot: Start Chrome → Navigate → Take screenshot → Return results
Product information crawling: Navigate → Scroll to load → Extract list
Form automation: Navigate → fill_form → click → wait_for
API monitoring: Navigate → list_network_requests → get_network_request
Performance analysis: performance_start_trace → Operation → performance_stop_trace

Best Practices

Automatically handle all pre-steps - Do not ask users to manually start Chrome or MCP server
Use a fixed user data directory - Keep `--user-data-dir` consistent to avoid reloading every time
Use `take_snapshot` first to understand the page structure
Use `evaluate_script` to execute JavaScript for complex extraction
Ensure returned data is JSON-serializable
Use `wait_for` when handling dynamic content

Technical Principles

Chrome DevTools Protocol (CDP) Communication Methods

Method	Endpoint	Usage
HTTP	`http://127.0.0.1:9222/json/list`	Get page list and basic information
WebSocket	`ws://127.0.0.1:9222/devtools/page/{pageId}`	Real-time page control and script execution

WebSocket Communication Process

python

import asyncio
import websockets
import json

async def cdp_example():
    # 1. Connect to Chrome DevTools WebSocket
    ws_url = "ws://127.0.0.1:9222/devtools/page/{pageId}"
    async with websockets.connect(ws_url) as ws:

        # 2. Enable Runtime domain
        await ws.send(json.dumps({
            "id": 1,
            "method": "Runtime.enable"
        }))

        # 3. Execute JavaScript
        await ws.send(json.dumps({
            "id": 2,
            "method": "Runtime.evaluate",
            "params": {
                "expression": "document.title",
                "returnByValue": True
            }
        }))

        # 4. Loop to receive responses and match id
        while True:
            msg = await ws.recv()
            data = json.loads(msg)
            if data.get('id') == 2:  # Match request id
                return data['result']['result']['value']

Key Notes

Response Matching: The WebSocket receives various messages (such as `consoleAPICalled`, `executionContextCreated`, etc.), so you must match requests and responses via `id`
Wait for Page Loading: Use `await asyncio.sleep(2)` or wait for specific elements after performing operations
Encoding Issues: Windows terminals require `sys.stdout.reconfigure(encoding='utf-8')` to display Chinese correctly
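Once a response has been matched by `id`, it can still carry a protocol error or a JavaScript exception instead of a value. The sketch below shows one way to unwrap a `Runtime.evaluate` response defensively; `unwrap_evaluate` is a hypothetical helper name, and the sample messages are illustrative rather than captured output.

```python
def unwrap_evaluate(response):
    """Return the evaluated value from a Runtime.evaluate response, or raise."""
    if "error" in response:
        # Protocol-level failure, e.g. an unknown method or bad parameters.
        raise RuntimeError(f"CDP error: {response['error'].get('message')}")
    result = response.get("result", {})
    if "exceptionDetails" in result:
        # The JavaScript expression itself threw.
        raise RuntimeError(f"JavaScript exception: {result['exceptionDetails'].get('text')}")
    # With returnByValue=True the value sits at result.result.value;
    # it is absent when the expression evaluates to undefined.
    return result.get("result", {}).get("value")


if __name__ == "__main__":
    ok = {"id": 2, "result": {"result": {"type": "string", "value": "Example Domain"}}}
    print(unwrap_evaluate(ok))
```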

Practical Experience and Troubleshooting Records

Problem 1: Garbled Chinese Output on Windows

Phenomenon: Crawled Chinese content displays as garbled characters or throws `UnicodeEncodeError: 'gbk' codec can't encode character`

Cause: Windows terminals on Chinese-locale systems default to GBK encoding, so Python fails when it prints characters that GBK cannot represent

Solution: Add encoding configuration at the beginning of the Python script:

python

import sys
sys.stdout.reconfigure(encoding='utf-8')

Important: All sample scripts (`baidu_search_example.py`, `cdp_helper.py`) already include this encoding configuration, so running them directly will not produce garbled characters.

Problem 2: HTML Character Entity Escaping

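Phenomenon: The `=>` arrow in JavaScript code appears in an escaped form (for example, the Unicode escape `=\u003e`) during JSON transmission

Cause: Some JSON serializers escape characters such as `>` as Unicode escape sequences; the JSON standard permits this but does not require it

Solution: This is normal and does not affect actual execution, because a JSON parser restores the original character. If you need to handle an escaped string by hand in Python, you can use: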

python

import codecs
decoded = codecs.decode(string, 'unicode_escape')
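When the escaped text arrives inside a complete JSON message, parsing the message is usually enough and no separate decoding step is needed. The sketch below assumes the escape is a JSON Unicode escape such as `\u003e`; the sample message is made up for illustration.

```python
import json


def decode_cdp_message(raw):
    """Parse one raw CDP message; json.loads resolves \\uXXXX escapes itself."""
    return json.loads(raw)


# The doubled backslash keeps the escape sequence literal inside the Python string.
raw = '{"id": 2, "result": {"result": {"type": "string", "value": "x =\\u003e x + 1 你好"}}}'
value = decode_cdp_message(raw)["result"]["result"]["value"]
print(value)  # x => x + 1 你好
```

Parsing the whole message also leaves non-ASCII text such as Chinese intact.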

Problem 3: Page ID Acquisition

Phenomenon: The WebSocket URL of the current page is unknown

Solution: Get the page list via the HTTP interface:

bash

curl -s http://127.0.0.1:9222/json/list

Then extract the `webSocketDebuggerUrl` field.

Note: Page IDs are dynamically generated and change every time Chrome starts. You should get them dynamically in code instead of hardcoding:

python

import urllib.request
import json

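def get_ws_url():
    # Ask Chrome's HTTP endpoint for all debuggable targets
    req = urllib.request.Request('http://127.0.0.1:9222/json/list')
    with urllib.request.urlopen(req) as response:
        targets = json.loads(response.read().decode())
    # Keep only real tabs; the list can also contain workers and extension targets
    pages = [t for t in targets if t.get('type') == 'page']
    return pages[0]['webSocketDebuggerUrl'] if pages else None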

Problem 4: Cross-Platform Compatibility of Windows Wait Commands

Phenomenon: Using `timeout /t 3 /nobreak` in Git Bash reports `invalid time interval`

Cause: In Git Bash, `timeout` resolves to the GNU coreutils command rather than the Windows CMD one, so the CMD-style arguments are rejected

Solution: Use the wait command that matches the shell:

bash

# Windows CMD
timeout /t 3 /nobreak >nul

# Git Bash / Linux
sleep 3

# Windows fallback when timeout is unavailable (4 pings take about 3 seconds)
ping -n 4 127.0.0.1 > nul

Problem 5: Claude Code File Writing Restrictions

Phenomenon: The prompt "File has not been read yet" appears when directly using the Write tool to create a new file

Cause: Claude Code requires reading the file first before writing (security mechanism)

Solution:

First use the Read tool to read the file (if it does not exist, an error will be reported, so you need to create an empty file with Bash)
Or use Bash's `echo` or `cat` command to write directly
Or use Python scripts to dynamically generate and execute code

Problem 6: Incomplete Dynamic Content Extraction

Phenomenon: Some fields (such as abstracts) in crawled search results are empty

Cause:

The page uses dynamic loading and requires scrolling to display all content
Different websites use different class names
Some content is loaded asynchronously

Solution:

Use multiple selectors for matching:

javascript

const abstract = item.querySelector('.c-abstract, .abstract, [class*="abstract"]')?.innerText;

Scroll the page to load more content:

javascript

window.scrollTo(0, document.body.scrollHeight);

Increase waiting time to ensure asynchronous content is fully loaded
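Instead of a single longer sleep, the wait can be expressed as polling until the content appears or a deadline passes. This is a minimal sketch: `wait_until` is a hypothetical helper, and the `check` coroutine is assumed to wrap something like a `Runtime.evaluate` call that counts matching elements.

```python
import asyncio
import time


async def wait_until(check, timeout=10.0, interval=0.5):
    """Poll the async callable `check` until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        value = await check()
        if value:
            return value
        if time.monotonic() >= deadline:
            return None  # content never appeared within the timeout
        await asyncio.sleep(interval)


async def demo():
    # Stand-in for a real page query: reports 3 items from the third poll onward.
    polls = {"count": 0}

    async def fake_item_count():
        polls["count"] += 1
        return 3 if polls["count"] >= 3 else 0

    return await wait_until(fake_item_count, timeout=2.0, interval=0.01)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Polling returns as soon as the content is present, so fast pages are not held up by a worst-case delay.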

Problem 7: Foreground vs Background Browser Operations

CDP supports two operation modes, choose according to user needs:

Background Operation (Default):

Directly operate the DOM by sending JavaScript commands via WebSocket
The browser interface will not display the operation process (users cannot see input, clicks, etc.)
Suitable for scenarios like data crawling and automated testing that do not require visual feedback
Code example:

python

# Directly set the value of the input box (users cannot see the input process)
await ws.send(json.dumps({
    "id": 1,  # every CDP request needs a unique integer id
    "method": "Runtime.evaluate",
    "params": {
        "expression": "document.querySelector('#kw').value = '美女';"
    }
}))

Foreground Operation (Visualization):

Use CDP's Input domain to simulate real user input
The browser will display the complete operation process (keyboard input, mouse clicks, etc.)
Suitable for demonstration, teaching, and scenarios where users need to observe the operation process
Code example:

python

# 1. First focus on the input box
await ws.send(json.dumps({
    "id": 1,  # every CDP request needs a unique integer id
    "method": "Runtime.evaluate",
    "params": {
        "expression": "document.querySelector('#kw').focus();"
    }
}))

# 2. Simulate keyboard input (users can see character-by-character input)
for i, char in enumerate("美女"):
    await ws.send(json.dumps({
        "id": 10 + i,
        "method": "Input.dispatchKeyEvent",
        "params": {
            "type": "char",
            "text": char
        }
    }))
    await asyncio.sleep(0.1)  # Simulate real typing interval

# 3. Simulate clicking the search button (users can see the click effect)
# A click is a press followed by a release at the same coordinates
for i, event_type in enumerate(("mousePressed", "mouseReleased")):
    await ws.send(json.dumps({
        "id": 100 + i,
        "method": "Input.dispatchMouseEvent",
        "params": {
            "type": event_type,
            "x": 100,  # placeholder coordinates
            "y": 200,
            "button": "left",
            "clickCount": 1
        }
    }))
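Hardcoded click coordinates break when the window size or page layout changes. One alternative is to read the element's bounding box with `Runtime.evaluate` and click its center. The sketch below only builds the messages; `click_messages` and `RECT_EXPRESSION` are hypothetical names, and `#su` is the search-button selector used elsewhere in this document.

```python
# JavaScript to run through Runtime.evaluate with returnByValue=True.
RECT_EXPRESSION = (
    "(() => { const r = document.querySelector('#su').getBoundingClientRect();"
    " return {x: r.x, y: r.y, width: r.width, height: r.height}; })()"
)


def click_messages(rect, first_id=100):
    """Build the press and release messages for a left click at the center of rect."""
    params = {
        "x": rect["x"] + rect["width"] / 2,
        "y": rect["y"] + rect["height"] / 2,
        "button": "left",
        "clickCount": 1,
    }
    return [
        {"id": first_id + offset,
         "method": "Input.dispatchMouseEvent",
         "params": {"type": event_type, **params}}
        for offset, event_type in enumerate(("mousePressed", "mouseReleased"))
    ]


if __name__ == "__main__":
    for message in click_messages({"x": 600, "y": 210, "width": 100, "height": 40}):
        print(message)
```

Each message would then be sent with `ws.send(json.dumps(message))` in order.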

Selection Suggestions:

Scenario	Recommended Mode	Reason
Data crawling	Background	Fast, no visualization required
Automated testing	Background	Stable and reliable, not affected by UI
Search tasks	Background	Directly constructing URLs is simpler and more reliable
Operation demonstration	Foreground	Users can see the complete process
Teaching demonstration	Foreground	Easy to observe and understand
Form filling	Either	Background is faster, foreground is more intuitive

Best Practices for Search Tasks:

Background Method (Recommended): Directly construct the search URL, such as `https://www.baidu.com/s?wd=keyword`
Foreground Method: Only use when users explicitly request "visualization" or "ability to see the operation process"

Problem 8: Best Practices for Search Tasks

Scenario: Users request "search for a certain keyword on Baidu/Google and crawl the results"

Solution Comparison:

Solution	Implementation	Advantages	Disadvantages	Recommendation
A	Directly access the search URL in the background	Simple, fast, stable	Users cannot see the process	⭐⭐⭐⭐⭐
B	Simulate input and click in the foreground	Visual, with feedback	Complex, slow, need to handle coordinates	⭐⭐⭐

Recommended Solution A (Background Method):

python

import urllib.parse

# Directly construct the search URL in one step
# (client is assumed to be a CDP client wrapper with navigate/evaluate methods)
url = f"https://www.baidu.com/s?wd={urllib.parse.quote(keyword)}"
await client.navigate(url)
await asyncio.sleep(3)  # Wait for loading
results = await client.evaluate(extract_script)
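The URL construction step can be isolated and checked without a browser. A minimal sketch, assuming the `wd` query parameter shown in the Baidu URL above; `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_search_url(keyword, base="https://www.baidu.com/s"):
    """Return a search URL with the keyword percent-encoded as UTF-8."""
    query = urlencode({"wd": keyword})  # spaces become '+', non-ASCII becomes %XX
    return f"{base}?{query}"


if __name__ == "__main__":
    print(build_search_url("hello 你好"))
    # https://www.baidu.com/s?wd=hello+%E4%BD%A0%E5%A5%BD
```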

When to Use Solution B (Foreground Method):

Users explicitly request "visual operation" or "let me see the process"
Teaching demonstration scenarios
Need to show the automation operation process

Notes for Foreground Method:

Must `focus()` on the input box first
A typing interval of 0.1-0.15 seconds works well
The coordinates of Baidu's search button are approximately x:650, y:230 (1920x1080 resolution)
The Input domain does not need enabling: CDP defines no `Input.enable` method, so `Input.dispatchKeyEvent` and `Input.dispatchMouseEvent` can be sent directly

Detailed Steps for Foreground Text Input (Taking Baidu search box as an example):

python

# Step 1: No setup call is required. The Input domain has no `enable` method,
# so key events can be dispatched as soon as the WebSocket is connected.

# Step 2: Focus on the input box (key step, otherwise keyboard events may be invalid)
await ws.send(json.dumps({
    "id": 2,
    "method": "Runtime.evaluate",
    "params": {
        "expression": "document.querySelector('#kw').focus();",  # '#kw' is the ID of Baidu's search box
        "returnByValue": True
    }
}))
await asyncio.sleep(0.5)  # Wait for focusing to complete

# Step 3: Clear the input box (optional, ensure it is empty)
await ws.send(json.dumps({
    "id": 3,
    "method": "Runtime.evaluate",
    "params": {
        "expression": "document.querySelector('#kw').value = '';",
        "returnByValue": True
    }
}))

# Step 4: Input text character by character (users can see the typing animation)
text = "hello 你好呀"
for i, char in enumerate(text):
    await ws.send(json.dumps({
        "id": 10 + i,
        "method": "Input.dispatchKeyEvent",
        "params": {
            "type": "char",      # For inputting ordinary characters
            "text": char          # Character to input
        }
    }))
    await asyncio.sleep(0.15)  # Typing interval, allowing users to see the animation

Key Points:

`type: "char"` is used for inputting ordinary characters
Must `focus()` on the input box first, otherwise input may be invalid
The recommended interval is 0.1-0.2 seconds; too short may cause missing characters, too long makes users wait too long
Supports Chinese input without additional encoding processing

Problem 9: Execution Order for Form Automation

Correct Order:

1. Enable Runtime → 2. Fill input boxes → 3. Click button → 4. Wait for page loading → 5. Extract data

Sample Code (Baidu search background method):

python

# 1. Fill in search term
await ws.send(json.dumps({
    "id": 2,
    "method": "Runtime.evaluate",
    "params": {
        "expression": "document.querySelector('#kw').value = '搜索词';",
        "returnByValue": True
    }
}))

# 2. Click search button
await ws.send(json.dumps({
    "id": 3,
    "method": "Runtime.evaluate",
    "params": {
        "expression": "document.querySelector('#su').click();",
        "returnByValue": True
    }
}))

# 3. Wait for loading
await asyncio.sleep(3)

# 4. Extract results
await ws.send(json.dumps({
    "id": 4,
    "method": "Runtime.evaluate",
    "params": {
        "expression": """
            (() => {
                const items = document.querySelectorAll('.result');
                return Array.from(items).map(item => ({
                    title: item.querySelector('h3')?.innerText,
                    link: item.querySelector('a')?.href
                }));
            })()
        """,
        "returnByValue": True
    }
}))
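The sample above sends requests but never reads the replies, so the extracted results are not actually collected. A small wrapper that sends a request and waits for the reply with the same `id` closes that gap. This is a sketch, not the `cdp_helper.py` implementation; the class name `CDPSession` is an assumption, and `ws` is any object with async `send` and `recv` methods, such as a `websockets` connection.

```python
import asyncio
import itertools
import json


class CDPSession:
    """Send CDP requests over a WebSocket-like object and await the matching reply."""

    def __init__(self, ws):
        self._ws = ws
        self._ids = itertools.count(1)  # unique, increasing request ids

    async def call(self, method, params=None):
        request_id = next(self._ids)
        await self._ws.send(json.dumps(
            {"id": request_id, "method": method, "params": params or {}}
        ))
        while True:
            message = json.loads(await self._ws.recv())
            if message.get("id") != request_id:
                continue  # an event, or a reply to some other request
            if "error" in message:
                raise RuntimeError(message["error"].get("message", "CDP error"))
            return message.get("result", {})


class FakeWS:
    """Offline stand-in for a WebSocket: emits one event, then echoes a reply."""

    def __init__(self):
        self.sent, self.queue = [], []

    async def send(self, raw):
        request = json.loads(raw)
        self.sent.append(request)
        self.queue.append(json.dumps({"method": "Runtime.consoleAPICalled", "params": {}}))
        self.queue.append(json.dumps(
            {"id": request["id"], "result": {"result": {"type": "string", "value": "ok"}}}
        ))

    async def recv(self):
        return self.queue.pop(0)


if __name__ == "__main__":
    session = CDPSession(FakeWS())
    reply = asyncio.run(session.call(
        "Runtime.evaluate", {"expression": "document.title", "returnByValue": True}
    ))
    print(reply["result"]["value"])
```

With a real connection, each step of the form workflow becomes `await session.call(...)`, and the return value of the final call carries the extracted list.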

User Guide

How to Use This Skill

Users can trigger this skill in the following ways:

Directly state requirements - Describe what you want to do with the web page
Use the `/chrome-devtools-skill` command - Explicitly call the skill
Mention web page operations - Any keywords involving "open web page, crawl data, take screenshot", etc.

Prompt Examples

Here are some prompt templates users can use:

Basic Operations

Open https://www.example.com and take a screenshot

Crawl the content of https://www.example.com

Analyze this web page: https://github.com/xxx/xxx

Data Extraction

Open https://www.jd.com and extract all product titles and prices on the homepage

Visit https://www.zhihu.com/explore and get the list of popular questions

Crawl all links on https://www.example.com

Monitoring and Analysis

Open https://www.example.com and monitor all API requests loaded by the page

Analyze the page performance of https://www.example.com

Visit https://www.example.com and extract all image URLs on the page

Automation Operations

Open https://www.example.com, enter "keyword" in the search box and search

Visit https://www.example.com, click the "Load More" button, and crawl all list data

Prompt Tips

Specify the URL clearly - Provide the complete web address
State the goal clearly - Describe what data you want to obtain or what operation you want to perform
Specify the format - If you need a specific format, you can state it (e.g., "return in table format")
Break down complex tasks - For multi-step tasks, describe them step by step

Sample Conversations

User: Help me open Baidu and see what's on the hot search list

AI: [Automatically start Chrome → Open Baidu → Extract hot search data → Return results]

User: Crawl the titles and links of the top 10 news items on https://news.ycombinator.com

AI: [Automatically start Chrome → Open Hacker News → Extract news data → Return table]

User: Analyze which interfaces are loaded on the Taobao homepage

AI: [Automatically start Chrome → Open Taobao → Monitor network requests → Return API list]

Reference Resources

Sample Scripts: See `scripts/` directory
- `baidu_search_example.py` - Complete Baidu search example
- `cdp_helper.py` - CDP client wrapper class
- `README.md` - Quick start guide

Chrome DevTools MCP Configuration Information

If users ask how to configure the `chrome-devtools` MCP server, provide the following JSON configuration:

json

{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": [
        "-y",
        "chrome-devtools-mcp@latest",
        "--browser-url=http://127.0.0.1:9222"
      ]
    }
  }
}
