firecrawl-research


Use this skill when the user asks to research topics using FireCrawl, enrich notes with web sources, search and scrape information, or write scientific or academic papers. It extracts research topics from markdown files, creates research documents from scraped sources, generates BibTeX bibliographies from the research results, and provides Pandoc and MyST templates for academic writing with citation management.


NPX Install

npx skill4agent add glebis/claude-skills firecrawl-research

FireCrawl Research

Overview

Enrich research documents by automatically searching and scraping web sources using the FireCrawl API. Extract research topics from markdown files and generate comprehensive research documents with source material.

When to Use This Skill

Use this skill when the user:
  • Says "Research this topic using FireCrawl"
  • Requests to enrich notes or documents with web sources
  • Wants to gather information about topics listed in a markdown file
  • Needs to search and scrape multiple topics systematically

How It Works

1. Topic Extraction

The script automatically extracts research topics from markdown files using two methods:
Method 1: Headers
markdown
## Spatial Reasoning in AI
### Computer Vision Applications
Both "Spatial Reasoning in AI" and "Computer Vision Applications" become research topics.
Method 2: Research Tags
markdown
- [research] Large Language Models for robotics
- [search] Theory of Mind in autonomous driving
Both tagged items become research topics.
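The two extraction methods above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the regular expressions and the function name `extract_topics` are assumptions, and the sketch assumes that only `##` and `###` headers count as topics (a top-level `#` title is skipped).

```python
import re

# Hypothetical patterns; the actual script may match headers and tags differently.
HEADER_RE = re.compile(r"^#{2,3}\s+(.+?)\s*$")
TAG_RE = re.compile(r"^\s*[-*]\s+\[(?:research|search)\]\s+(.+?)\s*$", re.IGNORECASE)


def extract_topics(markdown_text):
    """Return topics from ##/### headers and [research]/[search] items, in file order."""
    topics = []
    for line in markdown_text.splitlines():
        match = HEADER_RE.match(line) or TAG_RE.match(line)
        if match:
            topics.append(match.group(1))
    return topics


sample = """# AI Research Topics

## Spatial Reasoning in Vision-Language Models

- [research] Embodied AI for robotics
- [search] Theory of Mind in autonomous driving
"""
print(extract_topics(sample))
```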

2. Search and Scrape

For each topic:
  1. Searches FireCrawl with the topic as query
  2. Retrieves up to N results (default: 5)
  3. Automatically scrapes full content from each result
  4. Extracts markdown-formatted content (main content only)

3. Output Generation

Creates new markdown files in the specified output directory:
  • One file per topic
  • Filename: {topic}_{timestamp}.md
  • Contains: title, date, sources count, full scraped content
  • Each source includes: title, URL, markdown content
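A filename in the `{topic}_{timestamp}.md` pattern could be built as in the sketch below. The sanitization rule (whitespace to underscores, other punctuation dropped) and the `%Y%m%d_%H%M%S` timestamp format are assumptions inferred from the example filenames in this document; `topic_filename` is a hypothetical helper name.

```python
import re
from datetime import datetime


def topic_filename(topic, now=None):
    """Build '{topic}_{timestamp}.md' with spaces replaced by underscores."""
    now = now or datetime.now()
    # Assumed sanitization: whitespace -> "_", then drop anything that is not
    # a letter, digit, underscore, or hyphen.
    safe = re.sub(r"\s+", "_", topic.strip())
    safe = re.sub(r"[^A-Za-z0-9_-]", "", safe)
    return f"{safe}_{now.strftime('%Y%m%d_%H%M%S')}.md"


print(topic_filename("Computer Use Agents", datetime(2025, 11, 22, 14, 5, 54)))
# -> Computer_Use_Agents_20251122_140554.md
```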

Usage

Basic Usage

bash
python scripts/firecrawl_research.py research.md
Writes output files to the current directory.

Specify Output Directory

bash
python scripts/firecrawl_research.py research.md ./output
Creates files in the ./output/ folder.

Limit Results Per Topic

bash
python scripts/firecrawl_research.py research.md ./output 3
Retrieves at most 3 results per topic.
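The three invocations above share one positional layout: input file, optional output directory, optional result limit. A minimal sketch of how such arguments might be read is shown here; `parse_args` is a hypothetical helper, and the defaults (current directory, 5 results) are taken from the descriptions in this document rather than from the script's source.

```python
def parse_args(argv):
    """Parse: <input.md> [output_dir] [max_results]."""
    if len(argv) < 2:
        raise SystemExit("usage: firecrawl_research.py <input.md> [output_dir] [max_results]")
    input_file = argv[1]
    output_dir = argv[2] if len(argv) > 2 else "."      # current directory by default
    max_results = int(argv[3]) if len(argv) > 3 else 5  # 5 results per topic by default
    return input_file, output_dir, max_results


print(parse_args(["firecrawl_research.py", "research.md", "./output", "3"]))
# -> ('research.md', './output', 3)
```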

Configuration

API Key Setup

  1. Copy .env.example to .env:
    bash
    cp .env.example .env
  2. Add your FireCrawl API key:
    FIRECRAWL_API_KEY=fc-your-actual-api-key
The script automatically loads the API key from the skill's .env file.
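To illustrate what the key lookup involves, here is a standard-library stand-in. The script itself lists python-dotenv as a dependency, so it most likely does not parse the file by hand; `load_api_key` and its error message are hypothetical, and the sketch assumes an already-set environment variable takes precedence over the file.

```python
import os
from pathlib import Path


def load_api_key(env_path):
    """Return FIRECRAWL_API_KEY from the environment or a KEY=value .env file."""
    key = os.environ.get("FIRECRAWL_API_KEY")
    if key:
        return key
    path = Path(env_path)
    if path.is_file():
        for raw in path.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == "FIRECRAWL_API_KEY":
                return value.strip().strip("'\"")
    raise RuntimeError("API key not found: add FIRECRAWL_API_KEY=... to .env")
```

A call such as `load_api_key(".env")` would return the key string or raise if neither source provides one.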

Rate Limiting

The script includes automatic rate limiting for FireCrawl's free tier:
  • Free tier limit: 5 requests/minute
  • Built-in delay: 12 seconds between topics
  • Prevents API errors and credit exhaustion
When processing multiple topics, expect:
  • 5 topics: ~1 minute
  • 10 topics: ~2 minutes
  • 20 topics: ~4 minutes
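The timing estimates follow from the 12-second delay: 5 requests per minute means one request every 60 / 5 = 12 seconds, so N topics take about N × 12 seconds. A minimal sketch of that arithmetic and of a paced loop follows; the function names and the injectable `sleep` parameter are illustrative assumptions, not the script's actual structure.

```python
import time

DELAY_SECONDS = 12  # built-in delay between topics, as stated above


def estimated_minutes(topic_count, delay=DELAY_SECONDS):
    """Rough run time in minutes, counting one delay per topic."""
    return topic_count * delay / 60


def process_with_delay(topics, handle, delay=DELAY_SECONDS, sleep=time.sleep):
    """Call handle(topic) for each topic, pausing between consecutive topics."""
    results = []
    for index, topic in enumerate(topics):
        if index > 0:
            sleep(delay)  # stay under 5 requests/minute: 60 s / 5 = 12 s
        results.append(handle(topic))
    return results


for n in (5, 10, 20):
    print(f"{n} topics: ~{estimated_minutes(n):.0f} min")
```

Passing a fake `sleep` makes the pacing testable without waiting.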

Workflow Example

User request: "Research these AI topics using FireCrawl"
Input file (ai-research.md):
markdown
# AI Research Topics

## Spatial Reasoning in Vision-Language Models

- [research] Embodied AI for robotics
- [research] Computer Use Agents
Command:
bash
python scripts/firecrawl_research.py ai-research.md ./research_output 5
Output:
research_output/
├── Spatial_Reasoning_in_Vision-Language_Models_20251122_140530.md
├── Embodied_AI_for_robotics_20251122_140542.md
└── Computer_Use_Agents_20251122_140554.md
Each file contains:
  • Topic title
  • Timestamp
  • Source count
  • Full scraped content from up to 5 sources
  • Source URLs

Common Patterns

Pattern 1: Quick Research

Extract topics from existing notes, research them, save to current folder:
bash
python scripts/firecrawl_research.py my-notes.md

Pattern 2: Organized Research

Create dedicated output folder for research results:
bash
python scripts/firecrawl_research.py topics.md ./research_results

Pattern 3: Deep Dive

Increase results per topic for comprehensive coverage:
bash
python scripts/firecrawl_research.py topics.md ./deep_research 10

Pattern 4: Obsidian Vault Integration

Direct output to vault's research folder:
bash
python scripts/firecrawl_research.py topics.md ~/Brains/brain/Research

Error Handling

"API key not found"

Create a .env file in the skill folder containing FIRECRAWL_API_KEY=...

"Rate limit exceeded"

  • Free tier: 5 req/min
  • Script has 12s delay built-in
  • If still hitting limit, reduce topics or wait between runs

"Insufficient credits"

  • Check FireCrawl account credits
  • Upgrade plan or wait for credit reset

"No topics found"

Add topics to markdown using:
  • ## Header format
  • - [research] Topic format
  • - [search] Topic format

Script Details

Location: scripts/firecrawl_research.py
Dependencies:
  • python-dotenv - Environment variable management
  • requests - HTTP requests to FireCrawl API
Install dependencies:
bash
pip install python-dotenv requests
FireCrawl Features Used:
  • /v1/search endpoint - Search with automatic scraping
  • scrapeOptions.formats: ['markdown'] - Markdown output
  • scrapeOptions.onlyMainContent: true - Filter noise
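A request using these features might be assembled as in the sketch below. Only the `/v1/search` path and the two `scrapeOptions` fields come from this document; the base URL, the `query` and `limit` field names, and the bearer-token header are assumptions to confirm against the FireCrawl documentation. Nothing is sent over the network here.

```python
import json

# Assumed base URL; confirm against the FireCrawl documentation.
SEARCH_URL = "https://api.firecrawl.dev/v1/search"


def build_search_request(topic, api_key, limit=5):
    """Return (url, headers, body) for one topic; nothing is sent here."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    body = {
        "query": topic,   # assumed field name
        "limit": limit,   # assumed field name
        "scrapeOptions": {
            "formats": ["markdown"],   # markdown output
            "onlyMainContent": True,   # filter navigation and other noise
        },
    }
    return SEARCH_URL, headers, body


url, headers, body = build_search_request("Embodied AI for robotics", "fc-your-actual-api-key")
print(url)
print(json.dumps(body, indent=2))
# With the requests dependency this could be sent as:
#   requests.post(url, headers=headers, json=body, timeout=60)
```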

Academic Writing Templates

This skill includes templates for writing scientific papers in markdown format.

Available Templates

1. Pandoc Scholarly Paper (assets/templates/pandoc-scholarly-paper.md)
  • Standard academic paper format
  • Compatible with Pandoc converter
  • Supports citations via BibTeX
  • Exports to PDF, DOCX, HTML
2. MyST Scientific Paper (assets/templates/myst-scientific-paper.md)
  • MyST (Markedly Structured Text) format
  • Advanced cross-referencing
  • Professional scientific publishing
  • Multi-format export (PDF, LaTeX, DOCX)

Using Templates

Copy template to your project:
bash
cp assets/templates/pandoc-scholarly-paper.md my-paper.md
# or
cp assets/templates/myst-scientific-paper.md my-paper.md
Edit content:
  • Update YAML frontmatter (title, authors, affiliations)
  • Write your content in sections
  • Add citations using [@AuthorYear] (Pandoc) or {cite}`AuthorYear` (MyST)
Convert to PDF/DOCX:
bash
python scripts/convert_academic.py my-paper.md pdf
python scripts/convert_academic.py my-paper.md docx
python scripts/convert_academic.py my-paper.md pdf --myst  # For MyST

Bibliography Generation

Convert FireCrawl research results into BibTeX bibliography entries:
bash
python scripts/generate_bibliography.py research_output/*.md -o references.bib
What it does:
  • Extracts URLs and titles from FireCrawl markdown files
  • Generates BibTeX @misc entries
  • Creates citation keys automatically
  • Adds access dates
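The shape of one generated entry can be sketched as follows. This is an illustration only: `bibtex_misc` is a hypothetical helper, the citation key scheme (first word of the title plus the access year) is an assumption, and the fields the real generator emits may differ.

```python
import re
from datetime import date


def bibtex_misc(title, url, accessed=None):
    """Format one web source as a BibTeX @misc entry with an access date."""
    accessed = accessed or date.today()
    # Hypothetical key scheme: first word of the title plus the access year.
    words = re.findall(r"[A-Za-z0-9]+", title)
    first_word = words[0].lower() if words else "source"
    key = f"{first_word}{accessed.year}"
    return (
        f"@misc{{{key},\n"
        f"  title = {{{title}}},\n"
        f"  howpublished = {{\\url{{{url}}}}},\n"
        f"  note = {{Accessed: {accessed.isoformat()}}}\n"
        f"}}"
    )


print(bibtex_misc("Embodied AI: A Survey", "https://example.com/embodied-ai", date(2025, 11, 22)))
```

The printed key, `embodied2025` in this example, is what a paper would cite as `[@embodied2025]` in Pandoc syntax.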
Example workflow:
bash
# 1. Research topics
python scripts/firecrawl_research.py topics.md ./research

# 2. Generate bibliography
python scripts/generate_bibliography.py research/*.md -o refs.bib

# 3. Copy template
cp assets/templates/pandoc-scholarly-paper.md paper.md

# 4. Edit paper.md (add content, cite sources)

# 5. Convert to PDF
python scripts/convert_academic.py paper.md pdf

Citation Examples

Pandoc syntax:
markdown
Recent research [@Smith2024] shows...
Multiple studies [@Jones2023; @Brown2024] indicate...
MyST syntax:
markdown
Recent research {cite}`Smith2024` shows...
Multiple studies {cite}`Jones2023,Brown2024` indicate...

Example Bibliography File

An example bibliography is provided in assets/references.bib with common entry types:
  • Journal articles (@article)
  • Conference papers (@inproceedings)
  • Books (@book)
  • PhD theses (@phdthesis)
  • Web resources (@misc)
  • Preprints (@article with arXiv)

Tips

  1. Organize topics hierarchically - Use ## for main topics, ### for subtopics
  2. Use descriptive names - The topic text becomes the filename, so make it clear
  3. Batch processing - Group related topics in one file for efficiency
  4. Output organization - Create separate folders for different research projects
  5. Content review - Scraped content is truncated at 3000 characters per source for readability, so open the source URL when you need the full text
  6. Academic workflow - Use bibliography generator to cite research sources in papers
  7. Template customization - Modify templates for your field's citation style

Limitations

  • No summarization - Returns raw scraped content, not summaries
  • No deduplication - Duplicate sources may appear across topics
  • No quality ranking - All results treated equally
  • New files only - Does not append to existing files
  • Free tier constraints - Rate limiting affects processing speed