autocli-web-scraping

AutoCLI Web Scraping Skill

Skill by ara.so — Devtools Skills collection.
AutoCLI is a blazing fast, memory-safe command-line tool written in Rust that fetches information from 55+ websites with a single command. It covers Twitter/X, Reddit, YouTube, HackerNews, Bilibili, Zhihu, Xiaohongshu, and more, with support for browser session reuse, AI-powered adapter generation, and multi-format output.
Key Features:
  • 55 sites, 333 commands built-in
  • Browser session reuse (no token management needed)
  • AI-powered adapter generation via autocli.ai
  • Declarative YAML pipeline for custom adapters
  • Single 4.7MB binary, zero runtime dependencies
  • Up to 12x faster, with up to 10x lower memory use, than Node.js alternatives

Installation

One-line Install (macOS / Linux)

bash
curl -fsSL https://raw.githubusercontent.com/nashsu/autocli/main/scripts/install.sh | sh

Manual Installation

Download the appropriate binary from GitHub Releases:
  • macOS (Apple Silicon): autocli-aarch64-apple-darwin.tar.gz
  • macOS (Intel): autocli-x86_64-apple-darwin.tar.gz
  • Linux (x86_64): autocli-x86_64-unknown-linux-musl.tar.gz
  • Windows (x64): autocli-x86_64-pc-windows-msvc.zip
Extract the archive and move the binary into a directory on your PATH:
bash
tar -xzf autocli-*.tar.gz
sudo mv autocli /usr/local/bin/
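If you script the manual install, the archive name can be derived from `uname`. The sketch below is a minimal helper and not part of AutoCLI: the function name `autocli_asset_name` is made up for illustration, and it assumes `uname -m` reports `arm64` on Apple Silicon and `x86_64` on Intel machines. It covers only the macOS and Linux archives listed above.

```bash
# Map `uname -s` and `uname -m` values to a release archive name.
# Prints the archive name, or returns 1 for a platform that is not listed.
# (autocli_asset_name is a hypothetical helper, not an AutoCLI command.)
autocli_asset_name() {
  case "$1/$2" in
    Darwin/arm64)  echo "autocli-aarch64-apple-darwin.tar.gz" ;;
    Darwin/x86_64) echo "autocli-x86_64-apple-darwin.tar.gz" ;;
    Linux/x86_64)  echo "autocli-x86_64-unknown-linux-musl.tar.gz" ;;
    *)
      echo "no prebuilt archive listed for $1/$2" >&2
      return 1
      ;;
  esac
}
```

Calling `autocli_asset_name "$(uname -s)" "$(uname -m)"` prints the archive to download for the current machine, or fails with a message on stderr when no matching archive is listed.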

Chrome Extension (Required for Browser Commands)

  1. Download autocli-chrome-extension.zip from the releases page
  2. Extract to any directory
  3. Open chrome://extensions
  4. Enable "Developer mode"
  5. Click "Load unpacked" and select the extracted folder
Public API commands (hackernews, devto, etc.) work without the extension.

Basic Usage

Discovery Commands

bash
# List all available commands
autocli --help

# List commands for a specific site
autocli twitter --help
autocli bilibili --help

# Run diagnostics
autocli doctor

# List all sites and their command counts
autocli list

Public API Commands (No Browser Required)

bash
# Hacker News top stories
autocli hackernews top --limit 10

# Hacker News search
autocli hackernews search "rust" --limit 5

# Dev.to top articles
autocli devto top --limit 10

# Lobsters hot stories
autocli lobsters hot --limit 10

# Stack Overflow hot questions
autocli stackoverflow hot --limit 10

# arXiv paper search
autocli arxiv search "machine learning" --limit 5

# Wikipedia search
autocli wikipedia search "rust programming"

# Linux.do forum hot topics
autocli linux-do hot --limit 10

Browser-Based Commands (Requires Extension + Login)

bash
# Twitter trending topics
autocli twitter trending

# Twitter search
autocli twitter search "rust lang" --limit 10

# Twitter timeline
autocli twitter timeline --limit 20

# Reddit frontpage
autocli reddit frontpage --limit 10

# Reddit subreddit posts
autocli reddit subreddit rust --limit 15

# Bilibili hot videos
autocli bilibili hot --limit 20

# Bilibili search
autocli bilibili search "rust programming" --limit 10

# Xiaohongshu search
autocli xiaohongshu search "travel" --limit 10

# Zhihu hot topics
autocli zhihu hot --limit 10

# YouTube search
autocli youtube search "rust tutorial" --limit 5

# Weibo hot topics
autocli weibo hot --limit 10

Output Formats

AutoCLI supports multiple output formats for easy integration with other tools:
bash
# Table format (default)
autocli hackernews top --limit 5

# JSON output
autocli hackernews top --limit 5 --format json

# YAML output
autocli hackernews top --limit 5 --format yaml

# CSV output
autocli hackernews top --limit 5 --format csv

# Markdown output
autocli hackernews top --limit 5 --format markdown
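When redirecting output to files, it can help to derive the file extension from the chosen format. The following is a small sketch that assumes only the five `--format` values shown above; `format_ext` is a hypothetical helper, and the `txt` and `md` extensions are conventions chosen here, not something AutoCLI prescribes.

```bash
# Map an AutoCLI --format value to a conventional file extension.
# (format_ext is a hypothetical helper; the extension choices are assumptions.)
format_ext() {
  case "$1" in
    json)     echo "json" ;;
    yaml)     echo "yaml" ;;
    csv)      echo "csv" ;;
    markdown) echo "md" ;;
    table)    echo "txt" ;;
    *)
      echo "unknown format: $1" >&2
      return 1
      ;;
  esac
}
```

A script could then set `fmt=csv` and run `autocli hackernews top --limit 5 --format "$fmt" > "hn_top.$(format_ext "$fmt")"` so the file name always matches its contents.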

JSON Processing with jq

bash
# Extract specific fields
autocli hackernews top --limit 5 --format json | jq '.[].title'

# Filter by score
autocli hackernews top --limit 20 --format json | jq '.[] | select(.score > 100)'

# Count results
autocli reddit frontpage --limit 50 --format json | jq 'length'

AI-Powered Features

AutoCLI integrates with autocli.ai for AI-powered adapter generation and sharing.

Authentication

bash
# Authenticate with autocli.ai
autocli auth

This opens your browser to get an API token and saves it to `~/.autocli/config.json`.

Generate Adapters with AI

Use the Chrome extension to visually select data elements:
  1. Navigate to any website in Chrome
  2. Click the AutoCLI extension icon
  3. Use the selector tool to pick data elements
  4. Click "Generate" to let AI create the adapter
  5. The adapter is saved locally and synced to autocli.ai

Search Community Adapters

bash
# Search by URL
# Search by domain
autocli search producthunt.com

Searches autocli.ai for community-shared adapters. Select one to download and use immediately.

Environment Variables

bash
# Override API server (default: https://www.autocli.ai)
export AUTOCLI_API_BASE=https://custom-server.com

Advanced Usage

Download Media

bash
# Download YouTube video (requires yt-dlp)
autocli youtube download "https://youtube.com/watch?v=..."

# Download Bilibili video
autocli bilibili download "https://www.bilibili.com/video/..."

# Download Xiaohongshu content
autocli xiaohongshu download "https://www.xiaohongshu.com/..."

Social Media Interactions

bash
# Twitter operations
autocli twitter post "Hello from AutoCLI!"
autocli twitter reply TWEET_ID "Great post!"
autocli twitter like TWEET_ID
autocli twitter bookmark TWEET_ID
autocli twitter follow USERNAME
autocli twitter bookmarks --limit 10

# Reddit operations
autocli reddit upvote POST_ID
autocli reddit comment POST_ID "Interesting discussion"
autocli reddit subscribe SUBREDDIT_NAME
autocli reddit saved --limit 10

# Xiaohongshu publishing
autocli xiaohongshu publish --title "My Post" --content "Content here" --images "image1.jpg,image2.jpg"

Job Search (BOSS Zhipin)

bash
# Search jobs
autocli boss search "Rust developer" --city "Beijing"

# Greet recruiter
autocli boss greet JOB_ID "Hello, I'm interested in this position"

# Batch greet
autocli boss batchgreet JOB_ID1,JOB_ID2,JOB_ID3 "Hello, I'm interested"

# View chat list
autocli boss chatlist

# View messages
autocli boss chatmsg BOSS_ID

Shell Completion

Generate shell completions for a better autocomplete experience:
bash
# Bash
autocli completion bash >> ~/.bashrc

# Zsh
autocli completion zsh >> ~/.zshrc

# Fish
autocli completion fish > ~/.config/fish/completions/autocli.fish

# PowerShell
autocli completion powershell >> $PROFILE
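Note that `>>` appends on every run, so repeating the Bash, Zsh, or PowerShell commands adds the completion script to the startup file again. One way around this, sketched below for Bash, is to write the script to its own file and add a single `source` line to the startup file only when it is missing. Both `append_once` and the path `~/.autocli/completion.bash` are assumptions made for this example, not AutoCLI conventions.

```bash
# Append LINE to FILE only if that exact line is not already present.
# (append_once is a hypothetical helper, not an AutoCLI command.)
append_once() {
  line="$1"
  file="$2"
  touch "$file"
  # -x matches whole lines, -F treats the pattern as a fixed string.
  grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Possible usage (the completion file path is an assumption):
#   autocli completion bash > ~/.autocli/completion.bash
#   append_once 'source ~/.autocli/completion.bash' ~/.bashrc
```

Because the completion script lives in its own file, regenerating it after an upgrade overwrites the old copy instead of stacking a second one in `~/.bashrc`.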

Configuration

AutoCLI stores configuration in ~/.autocli/config.json:
json
{
  "api_token": "your-autocli-ai-token",
  "browser_ws_endpoint": "ws://localhost:9222",
  "default_format": "table",
  "adapters_dir": "~/.autocli/adapters"
}

Creating Custom Adapters

AutoCLI uses declarative YAML pipelines for custom adapters. Adapters are stored in ~/.autocli/adapters/.

Example Adapter Structure

yaml
name: example-site
description: Example site adapter
version: 1.0.0
author: Your Name
commands:
  - name: hot
    description: Get hot posts
    mode: public  # or 'browser'
    pipeline:
      - step: fetch
        url: https://api.example.com/hot
        method: GET
      - step: extract
        selector: $.data[*]
        fields:
          - name: title
            path: $.title
          - name: url
            path: $.url
          - name: score
            path: $.score
      - step: output
        format: table

Adapter Modes

  • public: Uses public APIs, no authentication needed
  • browser: Requires Chrome extension and browser session

Pipeline Steps

  1. fetch: HTTP request to target URL
  2. extract: Extract data using JSON path or CSS selectors
  3. transform: Modify/filter extracted data
  4. output: Format and display results

Integration with AI Agents

Register in Agent Configuration

Add to .cursorrules or AGENT.md:
bash
# Discover all available AutoCLI commands
autocli list

# Use specific commands as needed
autocli hackernews top --limit 10 --format json
autocli twitter search "topic" --format json

Register Local CLI Tools

bash
# Register local CLI tools for AI agent access
autocli register gh       # GitHub CLI
autocli register docker   # Docker CLI
autocli register kubectl  # Kubernetes CLI

Common Patterns

Pipeline Data from Multiple Sites

bash
# Get tech news from multiple sources
autocli hackernews top --limit 5 --format json > hn.json
autocli lobsters hot --limit 5 --format json > lobsters.json
autocli devto top --limit 5 --format json > devto.json
jq -s 'add' hn.json lobsters.json devto.json > combined.json
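Merging with `jq -s 'add'` concatenates the arrays but loses track of which site each item came from. A minimal sketch of one way to keep that information follows, assuming each command emits a JSON array of objects (as the `jq` examples in this document imply). The helper name `tag_source` and the field name `source` are made up for illustration.

```bash
# Read a JSON array of objects on stdin and add a "source" field to each one.
# Requires jq. (tag_source and the "source" field are assumptions of this sketch.)
tag_source() {
  jq --arg src "$1" 'map(. + {source: $src})'
}

# Possible usage:
#   autocli hackernews top --limit 5 --format json | tag_source hackernews > hn.json
#   autocli lobsters hot --limit 5 --format json | tag_source lobsters > lobsters.json
#   jq -s 'add' hn.json lobsters.json > combined.json
```

After merging, a filter such as `jq '.[] | select(.source == "lobsters")' combined.json` can pull one site's items back out.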

Monitor Topics Across Platforms

bash
#!/bin/bash
TOPIC="rust"

echo "=== Hacker News ==="
autocli hackernews search "$TOPIC" --limit 3

echo "=== Reddit ==="
autocli reddit search "$TOPIC" --limit 3

echo "=== Twitter ==="
autocli twitter search "$TOPIC" --limit 3

echo "=== Dev.to ==="
autocli devto tag "$TOPIC" --limit 3

Archive Bookmarks

bash
# Export Twitter bookmarks
autocli twitter bookmarks --limit 100 --format json > bookmarks_$(date +%Y%m%d).json

# Export Reddit saved posts
autocli reddit saved --limit 100 --format json > reddit_saved_$(date +%Y%m%d).json
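A plain `>` redirect creates the dated file even when the command fails, for example if the browser session is not available, which can leave an empty or partial archive behind. The sketch below writes to a temporary file first and only moves it into place on success. `save_output` is a hypothetical helper name, and the approach is a general shell pattern rather than an AutoCLI feature.

```bash
# Run a command and write its stdout to DEST only if the command succeeds.
# Usage: save_output DEST COMMAND [ARGS...]
# (save_output is a hypothetical helper, not an AutoCLI command.)
save_output() {
  dest="$1"
  shift
  tmp="$(mktemp "${dest}.XXXXXX")" || return 1
  if "$@" > "$tmp"; then
    mv "$tmp" "$dest"
  else
    status=$?
    rm -f "$tmp"   # discard partial output
    return "$status"
  fi
}

# Possible usage:
#   save_output "bookmarks_$(date +%Y%m%d).json" \
#     autocli twitter bookmarks --limit 100 --format json
```

The temporary file is created next to the destination so that the final `mv` stays on the same filesystem.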

Troubleshooting

Browser Connection Issues

bash
# Run diagnostics
autocli doctor

# Check Chrome extension is loaded and active
# Verify browser is running on ws://localhost:9222

# Restart Chrome with remote debugging:
# macOS:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222

# Linux:
google-chrome --remote-debugging-port=9222

# Windows:
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
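To verify that the browser is listening, note that Chrome started with `--remote-debugging-port` also answers plain HTTP on that port, and `/json/version` returns browser version details. The sketch below checks that endpoint with `curl`. The helper names are made up for illustration, and whether `autocli doctor` performs a similar check is not stated in this document.

```bash
# Build the DevTools HTTP URL for a given port (defaults to 9222).
# (Both helpers here are hypothetical, not AutoCLI commands.)
devtools_version_url() {
  printf 'http://localhost:%s/json/version\n' "${1:-9222}"
}

# Return 0 if something answers on the debugging port within 2 seconds.
devtools_reachable() {
  curl -fsS --max-time 2 "$(devtools_version_url "${1:-9222}")" >/dev/null 2>&1
}

# Possible usage:
#   devtools_reachable 9222 || echo "Chrome is not listening on port 9222"
```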

Authentication Errors

bash
# Re-authenticate
autocli auth

# Verify token in config
jq '.api_token' ~/.autocli/config.json

# Clear and re-auth
rm ~/.autocli/config.json
autocli auth

Command Not Found

bash
# Verify installation
which autocli

# Check available commands
autocli --help

# Update to latest version
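If `which autocli` prints nothing after a manual install, a common cause is that the install directory is not on `PATH`. The sketch below tests for a directory in a colon-separated path list. `path_contains` is a hypothetical helper, and `/usr/local/bin` is simply the directory used in the manual installation step.

```bash
# Return 0 if DIR is an entry in the colon-separated PATH_VALUE.
# Usage: path_contains DIR PATH_VALUE
# (path_contains is a hypothetical helper, not an AutoCLI command.)
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Possible usage:
#   path_contains /usr/local/bin "$PATH" || echo "/usr/local/bin is not on PATH"
```

Wrapping both sides in colons makes the match exact, so `/usr/local` does not count as present just because `/usr/local/bin` is.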

Rate Limiting

Some sites may rate-limit requests. Use --limit to reduce the number of results requested:
bash
# Reduce limit to avoid rate limiting
autocli twitter search "topic" --limit 5

# Add delays between requests (if supported)
autocli twitter timeline --limit 10 --delay 1000
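Where a command does not accept `--delay`, a script can pace requests itself. The sketch below runs one search per topic and sleeps between them. `spaced_search` and the `AUTOCLI_BIN` variable are assumptions introduced for this example (the variable only makes the function easy to dry-run); the search command itself is the one shown above.

```bash
# AUTOCLI_BIN is a hypothetical override used by this sketch only;
# it is not an environment variable that AutoCLI itself reads.
AUTOCLI_BIN="${AUTOCLI_BIN:-autocli}"

# Search each topic in turn, sleeping DELAY seconds between requests.
# Usage: spaced_search DELAY TOPIC [TOPIC...]
spaced_search() {
  delay="$1"
  shift
  first=1
  for topic in "$@"; do
    # Pause before every request except the first.
    [ "$first" -eq 1 ] || sleep "$delay"
    first=0
    "$AUTOCLI_BIN" twitter search "$topic" --limit 5
  done
}

# Possible usage:
#   spaced_search 2 "rust" "webassembly"
```

Setting `AUTOCLI_BIN=echo` before calling the function prints the commands instead of running them, which is a quick way to check a script before sending real requests.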

Extension Not Connecting

  1. Ensure Chrome extension is enabled
  2. Check extension has permissions for target sites
  3. Verify autocli daemon is running
  4. Restart browser and extension

Performance Tips

  • Use --format json for processing with other tools
  • Limit results with --limit for faster responses
  • Public API commands are faster than browser commands
  • Browser commands reuse logged-in sessions (no token refresh)
  • Use the --quiet flag to suppress progress output in scripts

Resources
