Extract structured data from websites. Use when: collecting competitor pricing; scraping product listings; extracting contact information; gathering research data; monitoring website changes
    npx skill4agent add guia-matthieu/clawfu-skills web-scraper

Extract structured data from websites using BeautifulSoup and requests - turn any webpage into usable data.
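As a rough illustration of the fetch-parse-select flow described above, here is a minimal sketch using `requests` and BeautifulSoup. The function names `fetch_html` and `extract`, the record shape, and the sample HTML are assumptions made for illustration; they are not taken from the skill's `scripts/main.py`, which may work differently.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract(html: str, selector: str) -> list[dict]:
    """Return one record per element matching a CSS selector (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {"tag": el.name, "text": el.get_text(strip=True)}
        for el in soup.select(selector)
    ]


# Inline sample page so the sketch runs without network access.
SAMPLE = """
<html><body>
  <h1>Pricing</h1>
  <p class="plan-name">Starter</p><p class="price">$29/mo</p>
  <p class="plan-name">Pro</p><p class="price">$99/mo</p>
</body></html>
"""

rows = extract(SAMPLE, ".plan-name,.price")
for row in rows:
    print(row)
```

For a live page, `extract(fetch_html(url), selector)` would chain the two steps. Parsing is kept separate from fetching so the selector logic can be tried against saved HTML first.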
| Claude Does | You Decide |
|---|---|
| Structures analysis frameworks | Strategic priorities |
| Synthesizes market data | Competitive positioning |
| Identifies opportunities | Resource allocation |
| Creates strategic options | Final strategy selection |
| Suggests implementation approaches | Execution decisions |
    pip install beautifulsoup4 requests pandas click lxml

    python scripts/main.py scrape https://example.com --selector "h1,h2,p"
    python scripts/main.py scrape https://example.com --selector ".product-price"

    python scripts/main.py links https://example.com
    python scripts/main.py links https://example.com --internal-only

    python scripts/main.py emails https://example.com
    python scripts/main.py emails https://example.com --depth 2

    python scripts/main.py structured https://example.com/article --schema article
    python scripts/main.py structured https://example.com/product --schema product

    python scripts/main.py scrape https://competitor.com/pricing --selector ".price,.plan-name"
    # Output:
    # Extracted 6 elements
    # 1. Starter - $29/mo
    # 2. Pro - $99/mo
    # 3. Enterprise - Contact us

    python scripts/main.py structured https://blog.example.com/post --schema article
    # Output: article_data.json
    # {
    # "title": "How to Scale Your Startup",
    # "author": "Jane Doe",
    # "date": "2024-01-15",
    # "content": "...",
    # "word_count": 1523
    # }

| Selector | Description | Example |
|---|---|---|
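| `tag` | Element type | `h1` |
| `.class` | Class name | `.product-price` |
| `#id` | Element ID | `#main` |
| `tag.class` | Tag with class | `div.product` |
| `[attr]` | Has attribute | `[href]` |
| `parent > child` | Direct child | `ul > li` |
| `a, b` | Multiple | `h1,h2,p` |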
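To make the selector kinds in the table concrete, the sketch below runs one standard CSS selector of each kind against a small inline document using BeautifulSoup's `select`. The sample HTML and the specific selector strings are illustrative assumptions and are not drawn from the skill itself.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, chosen so each selector kind has something to match.
HTML = """
<div id="main">
  <h1>Catalog</h1>
  <ul class="items">
    <li class="item"><a href="/a">Alpha</a></li>
    <li class="item featured"><a href="/b">Beta</a></li>
    <li><span>Gamma</span></li>
  </ul>
  <p class="item">Footer note</p>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# One selector per row of the table, keyed by the row's description.
SELECTORS = {
    "element type": "li",
    "class name": ".item",
    "element ID": "#main",
    "tag with class": "li.item",
    "has attribute": "[href]",
    "direct child": "ul > li",
    "multiple": "h1, span",
}

counts = {label: len(soup.select(css)) for label, css in SELECTORS.items()}
for label, css in SELECTORS.items():
    print(f"{label:15} {css:10} -> {counts[label]} match(es)")
```

Note that `.item` also matches the `<p>` element, while `li.item` does not. This is why combining a tag with a class is often the safer choice when scraping pages that reuse class names.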
category: automation
subcategory: data-extraction
dependencies: [beautifulsoup4, requests, pandas]
difficulty: intermediate
time_saved: 5+ hours/week