Search Results: crawler

Found 47 Skills

Frontend Development | datocms/agent-skills

datocms-frontend-integrations

Patch, extend, or explain DatoCMS front-end integration code inside an existing web project (Next.js App Router, Nuxt, SvelteKit, Astro, plus React/Vue/Svelte component usage). Use for targeted, per-concern work — adding a draft mode endpoint, wiring Preview Links / Visual Editing flows, fixing Content Link overlays, tuning real-time preview updates/subscriptions, setting up cache-tag invalidation/revalidation flows (Next.js revalidateTag or CDN purge by tags), adding robots/sitemap wiring, or hooking up crawler-safe search integration. Also the go-to skill for framework component/hook wiring with react-datocms, vue-datocms, @datocms/svelte, and @datocms/astro: Image/SRCImage/datocms-image, StructuredText, VideoPlayer (React/Vue/Svelte), SEO/meta helpers (renderMetaTags/toHead/Seo), QuerySubscription/QueryListener realtime patterns, ContentLink components, and Site Search (React/Vue). Prefer this skill whenever the user is modifying a live codebase one concern at a time, asking a framework-specific API question, or mixing several front-end concerns in the same patch.

🇺🇸 English | Translated

Data Processing | mindrally/skills

scrapy-web-scraping

Expert guidance for building web scrapers and crawlers using the Scrapy Python framework with best practices for spider development, data extraction, and pipeline management.

🇺🇸 English | Translated

Data Processing | lancelin111/crawl4ai-skil...

crawl4ai-skill

Web crawling and scraping tool with LLM-optimized output (web crawler, web scraper, spider). Supports DuckDuckGo search, site crawling, dynamic page scraping, and smart search crawling. Free, no API key required.

🇺🇸 English | Translated

Data Processing | zlstas/skills

web-scraping-python

Apply Web Scraping with Python practices (Ryan Mitchell). Covers First Scrapers (Ch 1: urllib, BeautifulSoup), HTML Parsing (Ch 2: find, findAll, CSS selectors, regex, lambda), Crawling (Ch 3-4: single-domain, cross-site, crawl models), Scrapy (Ch 5: spiders, items, pipelines, rules), Storing Data (Ch 6: CSV, MySQL, files, email), Reading Documents (Ch 7: PDF, Word, encoding), Cleaning Data (Ch 8: normalization, OpenRefine), NLP (Ch 9: n-grams, Markov, NLTK), Forms & Logins (Ch 10: POST, sessions, cookies), JavaScript (Ch 11: Selenium, headless, Ajax), APIs (Ch 12: REST, undocumented), Image/OCR (Ch 13: Pillow, Tesseract), Avoiding Traps (Ch 14: headers, honeypots), Testing (Ch 15: unittest, Selenium), Parallel (Ch 16: threads, processes), Remote (Ch 17: Tor, proxies), Legalities (Ch 18: robots.txt, CFAA, ethics). Trigger on "web scraping", "BeautifulSoup", "Scrapy", "crawler", "spider", "scraper", "parse HTML", "Selenium scraping", "data extraction".

🇺🇸 English | Translated

2 scripts | Checked

Marketing & Growth | agricidaniel/claude-blog

blog-geo

AI citation readiness audit ONLY (does not touch Google rankings; use blog-rewrite for combined Google+AI work). Use whenever the user wants their content to rank in ChatGPT, Perplexity, Claude, Gemini, or Google AI Overviews. Scores blog posts for AI citability by evaluating passage-level citability, Q&A formatting, entity clarity, structured data, and AI crawler accessibility. Generates citation capsules and a 0-100 AI Citation Readiness score. Use when the user says "geo", "ai citation", "ai optimization", "citation audit", "aeo", "perplexity optimization", "chatgpt citation".

🇺🇸 English | Translated

Marketing & Growth | zubair-trabzada/geo-seo-c...

geo

GEO-first SEO analysis tool. Optimizes websites for AI-powered search engines (ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews) while maintaining traditional SEO foundations. Performs full GEO audits, citability scoring, AI crawler analysis, llms.txt generation, brand mention scanning, platform-specific optimization, schema markup, technical SEO, content quality (E-E-A-T), and client-ready GEO report generation. Use when user says "geo", "seo", "audit", "AI search", "AI visibility", "optimize", "citability", "llms.txt", "schema", "brand mentions", "GEO report", or any URL for analysis.

🇺🇸 English | Translated

Data Processing | xcrawl-api/xcrawl-skills

xcrawl-crawl

Use this skill for XCrawl crawl tasks, including bulk site crawling, crawler rule design, async status polling, and delivery of crawl output for downstream scrape and search workflows.

🇺🇸 English | Translated

Marketing & Growth | rampstackco/claude-skills

seo-site-health-audit

Triage technical SEO findings from Ahrefs Site Audit (and similar crawlers) by SEO impact, not just severity. Use this skill when reviewing crawl results, prioritizing technical fixes, scoping a technical SEO sprint, or after running any site-wide crawl. Triggers on site audit results, technical fix list, crawl errors, technical SEO triage, prioritize technical issues, what should we fix first, broken links, redirect chains. Also triggers when a long list of crawler issues is creating decision paralysis.

🇺🇸 English | Translated

Frontend Development | kostja94/marketing-skills

rendering-strategies

Use when the user wants to choose or optimize a rendering strategy for SEO. Also use when the user mentions "SSR," "SSG," "CSR," "ISR," "static rendering," "dynamic rendering," "server-side rendering," "client-side rendering," "JavaScript rendering," "pre-rendering," "prerender," "content in initial HTML," or "crawler visibility."

🇺🇸 English | Translated

Tools & Utilities | nanzhipro/questskills

wechat-article-fetcher

Crawl WeChat Official Account articles, supporting full-text extraction, mixed text and image layout, local image download, and summary generation. Call this when you need to access WeChat Official Account links and convert them to Markdown.

🇨🇳 Chinese | Translated

4 scripts | Checked

Data Processing | d4vinci/scrapling

scrapling-official

Scrape web pages using Scrapling with anti-bot bypass (such as Cloudflare Turnstile), stealth headless browsing, a spider framework, adaptive scraping, and JavaScript rendering. Use when asked to scrape, crawl, or extract data from websites; when web_fetch fails; when the site has anti-bot protections; or when asked to write Python scraping/crawling code or spiders.

🇺🇸 English | Translated

4 scripts | Checked

Data Processing | agentic-reserve/blockint-...

blockchain-spider-toolkit

Points to the BlockchainSpider open-source Python/Scrapy toolkit for collecting on-chain data—transfer subgraphs around an address or tx, EVM and Solana block/transaction ingestion, receipts/logs, and optional label plugins. Use when the user wants to build datasets, offline traces, or research pipelines alongside blockchain-analytics-operations and solana-tracing-specialist—not as a substitute for RPC provider ToS, rate limits, or legal review of sensitive crawls.

🇺🇸 English | Translated