Search Results: web-scraping

Found 256 Skills

puppeteer

Browser automation and PDF generation with Puppeteer (Google). Supports headless Chrome control for web scraping, screenshots, PDF generation, and automated testing.

playwright-patterns

Use when writing Playwright automation code, building web scrapers, or creating E2E tests. Provides best practices for selector strategies, waiting patterns, and robust automation that minimizes flakiness.

Data Processing | vibery-studio/templates

udemy-crawler

Extract Udemy course content to markdown. Use when user asks to scrape/crawl Udemy course pages.

Data Processing | leobrival/serum-plugins-o...

web-crawler

High-performance Rust web crawler with stealth mode, LLM-ready Markdown export, multi-format output, sitemap discovery, and robots.txt support. Optimized for content extraction, site mapping, structure analysis, and LLM/RAG pipelines.

Tools & Utilities | vm0-ai/vm0-skills

scrapeninja

High-performance web scraping API with Chrome TLS fingerprint and JS rendering

Data Processing | brightdata/skills

data-feeds

Extract structured data from 40+ websites including Amazon, LinkedIn, Instagram, TikTok, Facebook, YouTube, and more. Uses Bright Data's Web Data APIs with automatic polling. Returns clean JSON with product details, profiles, reviews, posts, and comments.

2 scripts | Attention

Data Processing | jinfanzheng/kode-sdk-csha...

data-base

Data acquisition for web scraping and data collection. Use when the user needs "爬取数据/抓取网页/scrape data" (Chinese for "crawl data" and "scrape web pages"). Outputs structured JSON/CSV for analysis.

1 script | Attention

Automation | yelban/camofox-browser-sk...

camofox-browser

Anti-detection browser automation using Camoufox (Firefox fork with C++ fingerprint spoofing). Use when standard browser tools get blocked by Cloudflare, Akamai, or bot detection. Triggers include "stealth browse", "anti-detection", "bypass bot", "camofox", "blocked by Cloudflare", scraping protected sites (X/Twitter, Amazon, Product Hunt), or when agent-browser/playwright fails with bot detection errors.

4 scripts | Attention

Data Processing | agentbay-ai/agentbay-skil...

web-scraper

Scrape web pages and save as HTML or Markdown (with text and images). Minimal dependencies - only requests and beautifulsoup4. Use when the user provides a URL and wants to download/archive the content locally.

1 script | Checked

Automation | rysweet/amplihack

authenticated-web-scraper

Scrape authenticated websites from WSL2 using Edge CDP. Launches headed Edge for user auth, then headless scraping via Chrome DevTools Protocol. Use when mirroring internal wikis, docs sites, or any site requiring 2FA/SSO login.

2 scripts | Attention

Data Processing | d4vinci/scrapling

scrapling-official

Scrape web pages using Scrapling with anti-bot bypass (such as Cloudflare Turnstile), stealth headless browsing, a spiders framework, adaptive scraping, and JavaScript rendering. Use when asked to scrape, crawl, or extract data from websites; when web_fetch fails; when the site has anti-bot protections; or when asked to write Python scraping/crawling code or spiders.

4 scripts | Checked

Tools & Utilities | ibigqiang/feedgrab

feedgrab

Universal content grabber: fetch any URL and return structured Markdown. Supports X/Twitter, WeChat, Xiaohongshu, YouTube, GitHub, Feishu/Lark, Bilibili, Telegram, RSS, and any web page. Use when the user provides a URL and wants its content extracted.
