Found 172 Skills
Invisible Chrome automation for web scraping via CDP. Use when WebFetch fails or gets blocked (403, 429, Cloudflare, bot protection, JS-rendered pages). Launches your real Chrome install completely hidden, sends commands via Chrome DevTools Protocol. Sites see a normal browser with real extensions - no detectable automation. Learns which domains block and skips straight to stealth on future requests. Also handles form filling, clicking, screenshots, and scraping dynamic content.
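As a sketch of the underlying mechanism: driving Chrome over CDP needs only an HTTP call to list debug targets and a websocket for commands. This assumes Chrome was launched with `--remote-debugging-port=9222`; the method names are standard CDP, but the skill's stealth and domain-learning logic is not shown.

```python
# Minimal CDP client: navigate a tab and read its title.
# Assumes Chrome is running with: chrome --remote-debugging-port=9222
import asyncio
import json

import requests
import websockets

async def main():
    # Each open target exposes a webSocketDebuggerUrl.
    targets = requests.get("http://localhost:9222/json").json()
    ws_url = targets[0]["webSocketDebuggerUrl"]

    async with websockets.connect(ws_url) as ws:
        # Page domain: navigate.
        await ws.send(json.dumps({
            "id": 1, "method": "Page.navigate",
            "params": {"url": "https://example.com"},
        }))
        await asyncio.sleep(2)  # crude wait; real code listens for Page.loadEventFired
        # Runtime domain: evaluate JS in the page.
        await ws.send(json.dumps({
            "id": 2, "method": "Runtime.evaluate",
            "params": {"expression": "document.title", "returnByValue": True},
        }))
        while True:
            reply = json.loads(await ws.recv())
            if reply.get("id") == 2:
                print(reply["result"]["result"]["value"])
                break

asyncio.run(main())
```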
Use this skill when the user asks to research topics with FireCrawl, enrich notes with web sources, search and scrape information, or write scientific/academic papers. It extracts research topics from markdown files, creates research documents with scraped sources, generates BibTeX bibliographies from research results, and provides Pandoc/MyST templates for academic writing with citation management.
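The BibTeX step is plain string templating once sources are scraped. A minimal sketch, assuming source metadata has already been collected into dicts; the field names and `@misc` entry type here are illustrative, not the skill's actual schema.

```python
from datetime import date

# Hypothetical shape for one scraped source; the skill's real schema may differ.
sources = [
    {"key": "smith2024", "author": "Smith, Jane", "title": "Example Study",
     "year": 2024, "url": "https://example.com/paper"},
]

def to_bibtex(src: dict) -> str:
    # @misc is a safe default entry type for web sources in Pandoc/MyST pipelines.
    return (
        f"@misc{{{src['key']},\n"
        f"  author = {{{src['author']}}},\n"
        f"  title  = {{{src['title']}}},\n"
        f"  year   = {{{src['year']}}},\n"
        f"  url    = {{{src['url']}}},\n"
        f"  note   = {{Accessed {date.today().isoformat()}}}\n"
        f"}}"
    )

with open("references.bib", "w", encoding="utf-8") as f:
    f.write("\n\n".join(to_bibtex(s) for s in sources))
```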
Browser automation using Playwright MCP. Navigate websites, fill forms, click elements, take screenshots, and extract data. Use when tasks require web browsing, form submission, web scraping, UI testing, or any browser interaction. NOT when only fetching static content (use curl/wget instead).
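The MCP server performs these actions on the agent's behalf; the same flow in plain Playwright for Python looks roughly like this (URL and selectors are placeholders):

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://example.com/login")  # placeholder URL
    # Fill and submit a form, then capture the result.
    page.fill("#username", "demo")          # placeholder selectors
    page.fill("#password", "secret")
    page.click("button[type=submit]")
    page.wait_for_load_state("networkidle")
    page.screenshot(path="after-login.png")
    browser.close()
```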
Playwright browser automation API, web scraping, and tooling. Covers locator strategies, assertions, API testing, stealth mode, anti-bot bypass, authenticated sessions, screenshots/PDFs, Docker deployment, configuration, debugging, and MCP integration with AI agents. Prevents documented errors including CI timeout hangs, extension testing failures, and navigation issues. Use when automating browsers, scraping protected sites, bypassing bot detection, generating screenshots/PDFs, configuring Playwright Test, troubleshooting Playwright errors, or learning Playwright API patterns. For E2E test architecture, Page Object Models, CI sharding strategies, or test organization patterns, use the e2e-testing skill instead.
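A short sketch of the locator and web-first assertion patterns the skill covers, in Playwright's Python bindings (the page under test is a placeholder):

```python
from playwright.sync_api import expect, sync_playwright

with sync_playwright() as p:
    page = p.chromium.launch(headless=True).new_page()
    page.goto("https://example.com")  # placeholder
    # Role-based locators survive markup changes better than brittle CSS paths.
    heading = page.get_by_role("heading", level=1)
    # Web-first assertions auto-retry until they pass or time out,
    # which avoids the flaky sleep-then-check pattern.
    expect(heading).to_contain_text("Example")
    expect(page.get_by_role("link").first).to_be_visible()
```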
Fetch any URL and convert to markdown using Chrome CDP. Supports two modes: auto-capture on page load, or wait for a user signal (for pages requiring login). Use when the user wants to save a webpage as markdown.
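In auto-capture mode the work reduces to pulling the rendered DOM over CDP and converting it. A self-contained sync sketch; `markdownify` is an assumption standing in for whatever HTML-to-markdown converter the skill actually uses, and Chrome is assumed to be running with `--remote-debugging-port=9222`.

```python
# Capture the current tab as markdown over CDP.
import json

import requests
from markdownify import markdownify  # assumption: any HTML->markdown converter works here
from websocket import create_connection  # pip install websocket-client

targets = requests.get("http://localhost:9222/json").json()
ws = create_connection(targets[0]["webSocketDebuggerUrl"])

# Runtime domain: serialize the whole rendered DOM.
ws.send(json.dumps({
    "id": 1, "method": "Runtime.evaluate",
    "params": {"expression": "document.documentElement.outerHTML",
               "returnByValue": True},
}))
reply = json.loads(ws.recv())
html = reply["result"]["result"]["value"]
ws.close()

with open("page.md", "w", encoding="utf-8") as f:
    f.write(markdownify(html))
```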
Web content extraction via Jina AI Reader API. Three modes: read (URL to markdown), search (web search + full content), ground (fact-checking). Extracts clean content without exposing server IP.
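Read mode is just a URL prefix, which is why no server-side setup is needed. A minimal sketch; the `r.jina.ai` prefix is Jina Reader's documented pattern, and the API key is optional (it raises rate limits).

```python
import os

import requests

target = "https://example.com/article"  # placeholder
headers = {}
if os.environ.get("JINA_API_KEY"):  # optional bearer token
    headers["Authorization"] = f"Bearer {os.environ['JINA_API_KEY']}"

# Jina fetches the page from its own servers, so your IP is never exposed.
resp = requests.get(f"https://r.jina.ai/{target}", headers=headers, timeout=60)
resp.raise_for_status()
print(resp.text)  # clean markdown
```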
Batch-process web pages via headless Playwright browser, extract HTML, convert to markdown using Turndown, and save to timestamped scratchpad file. Use when user asks to "capture these pages as markdown", "save web content", "fetch and convert webpages", or needs clean markdown from HTML. All URLs from one prompt → single file at docs/web-captures/<timestamp>.md.
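A sketch of the batch loop, assuming Playwright for Python. Turndown is a JavaScript library, so `markdownify` substitutes for the conversion step here; the URL list and per-page heading format are illustrative.

```python
from datetime import datetime
from pathlib import Path

from markdownify import markdownify  # stands in for Turndown (a JS library)
from playwright.sync_api import sync_playwright

urls = ["https://example.com/a", "https://example.com/b"]  # all URLs from one prompt

out = Path("docs/web-captures") / f"{datetime.now():%Y%m%d-%H%M%S}.md"
out.parent.mkdir(parents=True, exist_ok=True)

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    chunks = []
    for url in urls:
        page.goto(url, wait_until="networkidle")
        chunks.append(f"# {url}\n\n{markdownify(page.content())}")
    browser.close()

out.write_text("\n\n---\n\n".join(chunks), encoding="utf-8")
```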
Web scraping, search, and data extraction using Firecrawl API. Use when users need to fetch web content, discover URLs on sites, search the web, or extract structured data from pages.
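A minimal sketch against Firecrawl's v1 scrape endpoint, assuming FIRECRAWL_API_KEY is set; the request and response fields follow the v1 docs and are abbreviated here.

```python
import os

import requests

resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
    json={"url": "https://example.com", "formats": ["markdown"]},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["data"]["markdown"])
```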
Unofficial Tavily CLI skill by SYXS for search, extract, crawl, map, and research. USE FOR:
- Web research and source discovery
- URL content extraction
- Website mapping and crawling
- Tavily research + follow-up status workflows
Must be installed and authenticated. See rules/install.md and rules/security.md.
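The CLI's flags aren't documented in this entry, so this sketch goes through Tavily's underlying REST search endpoint instead; the bearer-token auth and request shape follow Tavily's public API docs but are an assumption here, not the skill's own interface.

```python
import os

import requests

resp = requests.post(
    "https://api.tavily.com/search",
    headers={"Authorization": f"Bearer {os.environ['TAVILY_API_KEY']}"},
    json={"query": "site reliability engineering best practices",  # placeholder query
          "max_results": 5},
    timeout=60,
)
resp.raise_for_status()
for result in resp.json().get("results", []):
    print(result["title"], result["url"])
```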
Fetch any URL via Chrome CDP and convert it to Markdown. Supports two modes: auto-capture after page load, or wait for a user signal (for pages requiring login). Use when the user asks to save a webpage as Markdown.
Extract structured data from multiple web pages using Playwright with built-in ethical crawling practices including rate limiting, robots.txt compliance, and error monitoring. Use when asked to "scrape data from", "extract information from pages", "collect data from site", "crawl multiple pages", or when gathering structured data from websites. Supports pagination, multi-page extraction, data aggregation, and export to CSV/JSON/Markdown. Works with browser_navigate, browser_evaluate, browser_wait_for, and browser_snapshot tools.
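The ethics layer is mostly standard library. A sketch of the robots.txt check and politeness delay that would wrap each page visit; the user agent and delay are placeholders, and a real crawler would cache one parser per host instead of re-fetching robots.txt for every URL.

```python
import time
from urllib import robotparser
from urllib.parse import urlparse

USER_AGENT = "example-crawler/1.0"  # placeholder
DELAY_SECONDS = 2.0  # fixed politeness delay between requests

def allowed(url: str) -> bool:
    # Consult the site's robots.txt before fetching.
    parts = urlparse(url)
    rp = robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()
    return rp.can_fetch(USER_AGENT, url)

for url in ["https://example.com/page/1", "https://example.com/page/2"]:
    if not allowed(url):
        print(f"skipping {url}: disallowed by robots.txt")
        continue
    # ... fetch with browser_navigate, extract with browser_evaluate ...
    time.sleep(DELAY_SECONDS)  # rate limit between pages
```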
Comprehensive web scraping, crawling, and data extraction toolkit powered by Firecrawl API. Provides scripts for single-page scraping (scrape.py), web search (search.py), URL discovery (map.py), multi-page crawling (crawl.py), structured data extraction (extract.py), and autonomous data gathering (agent.py). Use when you need to: (1) extract content from web pages, (2) search and scrape the web, (3) discover URLs on websites, (4) crawl multiple pages, (5) extract structured data with JSON schemas, or (6) autonomously gather data from anywhere on the web. Requires FIRECRAWL_API_KEY environment variable.
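For the structured-extraction path, the v1 scrape endpoint accepts a JSON schema describing the fields to pull out. A hedged sketch, since the scripts' own flags aren't shown here and the `extract` format name has varied across Firecrawl API versions.

```python
import os

import requests

schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["title"],
}

resp = requests.post(
    "https://api.firecrawl.dev/v1/scrape",
    headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
    json={
        "url": "https://example.com/product",  # placeholder
        "formats": ["extract"],                # request structured output
        "extract": {"schema": schema},
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["data"]["extract"])  # dict matching the schema
```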