LinkedIn Sourcer

Source candidates from LinkedIn, analyze their profiles, and evaluate fit against role requirements using the `linkedin_scraper` library (v3.0+, Playwright-based, async).

Prerequisites

Ensure dependencies are installed before any scraping:

```bash
pip install linkedin-scraper
playwright install chromium
```

An authenticated session file (`session.json`) is required. If one does not exist, create one.

Programmatic login (using credentials):

```bash
python3 scripts/create_session.py --email USER@EXAMPLE.COM --password PASS
```

Or via environment variables:

```bash
export LINKEDIN_EMAIL=user@example.com
export LINKEDIN_PASSWORD=mypassword
python3 scripts/create_session.py
```

Manual login (opens a browser window; use when programmatic login fails due to CAPTCHA/2FA):

```bash
python3 scripts/create_session.py
```

The session file is reusable until LinkedIn expires it. See `references/linkedin_scraper_api.md` for browser configuration options.
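
A wrapper script can check for the session file before it starts scraping. The following is a minimal sketch, assuming the default file name `session.json` used in this section. `session_ready` is a hypothetical helper, not part of `linkedin_scraper`, and it only confirms that a non-empty file exists; it cannot tell whether LinkedIn has already expired the session.

```python
from pathlib import Path


def session_ready(path: str = "session.json") -> bool:
    """Return True if a non-empty session file exists at `path`.

    Hypothetical helper: it does not verify that the session is still valid.
    """
    session_file = Path(path)
    return session_file.is_file() and session_file.stat().st_size > 0


if __name__ == "__main__":
    if session_ready():
        print("Session file found; proceed to scraping.")
    else:
        print("No session file; run scripts/create_session.py first.")
```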

Workflow Decision Tree

Determine the task type:
  1. "Scrape this profile / these profiles" → Profile Scraping
  2. "Find candidates for this role" → Candidate Search
  3. "Evaluate this candidate for this role" → Candidate Evaluation
  4. "Compare these candidates" → Candidate Comparison
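
The decision tree above can be approximated with simple keyword matching. This is only an illustrative sketch: `classify_task` and its keyword choices are assumptions, and real requests will often need more judgment than a substring check.

```python
def classify_task(request: str) -> str:
    """Map a free-text request to one of the four workflows (hypothetical rules)."""
    text = request.lower()
    # Check the more specific verbs first so that a request such as
    # "scrape and compare these profiles" routes to comparison.
    if "compare" in text:
        return "Candidate Comparison"
    if "evaluate" in text:
        return "Candidate Evaluation"
    if "find" in text or "search" in text:
        return "Candidate Search"
    if "scrape" in text or "linkedin.com/in/" in text:
        return "Profile Scraping"
    return "Unknown"


if __name__ == "__main__":
    for example in ("Scrape this profile", "Find candidates for this role"):
        print(example, "->", classify_task(example))
```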

1. Profile Scraping

Run `scripts/scrape_profile.py` to extract structured profile data:

```bash
python3 scripts/scrape_profile.py "https://linkedin.com/in/username" --session session.json
```

For multiple profiles:

```bash
python3 scripts/scrape_profile.py URL1 URL2 URL3 --delay 2 --output profiles.json
```

Output is JSON with: name, headline, location, about, experiences, educations, skills.
For inline scraping within custom code, see `references/linkedin_scraper_api.md` → PersonScraper.
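
Downstream steps depend on the fields listed above, so it can help to check which ones came back empty before evaluating a candidate. The sketch below assumes the script output is either a single JSON object or a list of objects keyed by those field names; the exact output shape is not specified here, and `missing_fields` and `load_profiles` are hypothetical helpers.

```python
import json

# Field names taken from the output description in this section.
EXPECTED_FIELDS = ("name", "headline", "location", "about",
                   "experiences", "educations", "skills")


def missing_fields(profile: dict) -> list[str]:
    """List expected keys that are absent or empty in one scraped profile."""
    return [field for field in EXPECTED_FIELDS if not profile.get(field)]


def load_profiles(raw_json: str) -> list[dict]:
    """Parse script output; accept a single object or a list (assumed shapes)."""
    data = json.loads(raw_json)
    return data if isinstance(data, list) else [data]


if __name__ == "__main__":
    sample = '{"name": "Jane Doe", "headline": "Engineer", "skills": ["Python"]}'
    for profile in load_profiles(sample):
        print(profile["name"], "is missing:", missing_fields(profile))
```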

2. Candidate Search

Generate boolean search queries the user can paste into LinkedIn or Google to find candidates. See `references/sourcing_workflows.md` → Boolean Search String Patterns for templates and examples. Tailor the boolean string to the specific role requirements provided.
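
As an illustration of tailoring a query to a role, the sketch below assembles a boolean string from lists of titles, skills, and excluded terms. It does not reproduce the templates in `references/sourcing_workflows.md`; the function names and the `target` parameter are assumptions. For Google it uses the `site:` operator, implicit AND, and `-term` exclusion, since Google does not treat `NOT` as an operator.

```python
def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they match as a phrase."""
    return f'"{term}"' if " " in term else term


def or_group(terms: list[str]) -> str:
    """Join alternatives with OR inside parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_boolean_query(titles, skills, exclude=(), target="linkedin") -> str:
    """Build a query for LinkedIn search or a Google search limited to profiles."""
    groups = [or_group(titles), or_group(skills)]
    if target == "google":
        negatives = [f"-{quote(t)}" for t in exclude]
        return " ".join(["site:linkedin.com/in", *groups, *negatives])
    negatives = [f"NOT {quote(t)}" for t in exclude]
    return " ".join([" AND ".join(groups), *negatives])


if __name__ == "__main__":
    titles = ["Data Engineer", "Analytics Engineer"]
    skills = ["Python", "Spark"]
    print(build_boolean_query(titles, skills, exclude=["intern"]))
    print(build_boolean_query(titles, skills, exclude=["intern"], target="google"))
```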

3. Candidate Evaluation

After scraping profile(s), evaluate fit against a job description:
  1. Scrape the candidate's profile
  2. Apply the scorecard template from `references/sourcing_workflows.md` → Candidate Scorecard Template
  3. Rate each criterion (1-5) with notes based on the scraped data
  4. Assign an overall fit rating: STRONG_FIT, GOOD_FIT, PARTIAL_FIT, or WEAK_FIT
  5. Identify strengths, concerns, and key questions for outreach
Use the evaluation heuristics in `references/sourcing_workflows.md` → Evaluation Heuristics to guide ratings.
For quick single-candidate output, use the Candidate Summary Template instead.
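
One way to turn per-criterion ratings into an overall fit label is to average them and apply cut-offs. The sketch below is an assumption, not the scorecard from `references/sourcing_workflows.md`: the criterion names and the thresholds (4.5, 3.5, 2.5) are hypothetical and should be replaced by whatever the evaluation heuristics prescribe.

```python
from statistics import mean


def overall_fit(ratings: dict[str, int]) -> str:
    """Map 1-5 criterion ratings to a fit label using hypothetical cut-offs."""
    if not ratings:
        raise ValueError("at least one criterion rating is required")
    for criterion, score in ratings.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{criterion}: rating must be 1-5, got {score}")

    average = mean(ratings.values())
    # Thresholds are illustrative assumptions, not values from the source.
    if average >= 4.5:
        return "STRONG_FIT"
    if average >= 3.5:
        return "GOOD_FIT"
    if average >= 2.5:
        return "PARTIAL_FIT"
    return "WEAK_FIT"


if __name__ == "__main__":
    scorecard = {"skills": 5, "experience": 4, "location": 5}
    print(overall_fit(scorecard))
```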

4. Candidate Comparison

When evaluating multiple candidates for the same role:
  1. Scrape all candidate profiles
  2. Apply the comparison table from `references/sourcing_workflows.md` → Candidate Comparison Table
  3. Rank candidates with rationale
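
Ranking with a rationale can be sketched as sorting candidates by their average rating and noting each candidate's strongest criterion. This is a simplified assumption: the comparison table in `references/sourcing_workflows.md` may weight criteria differently, and the candidate and criterion names here are made up for illustration.

```python
def average(ratings: dict[str, int]) -> float:
    """Unweighted mean of one candidate's 1-5 ratings."""
    return sum(ratings.values()) / len(ratings)


def rank_candidates(scorecards: dict[str, dict[str, int]]) -> list[tuple]:
    """Return (position, name, average, rationale) rows, best candidate first."""
    ranked = sorted(scorecards.items(),
                    key=lambda item: average(item[1]),
                    reverse=True)
    rows = []
    for position, (name, ratings) in enumerate(ranked, start=1):
        strongest = max(ratings, key=ratings.get)  # first highest-rated criterion
        rows.append((position, name, round(average(ratings), 2),
                     f"strongest on {strongest}"))
    return rows


if __name__ == "__main__":
    cards = {
        "Ana": {"skills": 5, "experience": 4},
        "Ben": {"skills": 3, "experience": 3},
        "Chi": {"skills": 5, "experience": 5},
    }
    for row in rank_candidates(cards):
        print(*row)
```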

Error Handling

  • AuthenticationError → Session expired. Re-run `scripts/create_session.py` with credentials or manual login
  • RateLimitError → Wait and retry. Increase `--delay` between requests
  • ProfileNotFoundError → Profile is private or URL is invalid
See `references/linkedin_scraper_api.md` → Error Handling for try/except patterns.
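
The three cases above could be handled along the following lines. To keep the sketch self-contained, it defines stand-in exception classes with the same names rather than importing them from `linkedin_scraper`; in real code the library's own exceptions would be used. The `scrape` callable, the retry count, and the doubling wait are assumptions for illustration, not the patterns from the reference file.

```python
import asyncio


# Stand-ins so the sketch runs without the library installed (assumption).
class AuthenticationError(Exception):
    pass


class RateLimitError(Exception):
    pass


class ProfileNotFoundError(Exception):
    pass


async def scrape_with_retry(scrape, url, retries=3, base_delay=2.0):
    """Call the async `scrape(url)`, waiting longer after each rate-limit error."""
    for attempt in range(retries + 1):
        try:
            return await scrape(url)
        except RateLimitError:
            if attempt == retries:
                raise  # give up after the last allowed retry
            await asyncio.sleep(base_delay * 2 ** attempt)
        except ProfileNotFoundError:
            return None  # private profile or invalid URL: skip it
        # AuthenticationError is deliberately not caught: the session file
        # must be recreated with scripts/create_session.py before retrying.


if __name__ == "__main__":
    async def always_missing(url):
        raise ProfileNotFoundError(url)

    print(asyncio.run(scrape_with_retry(always_missing, "https://linkedin.com/in/username")))
```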

Rate Limiting

Always use delays between requests (default 2s in scripts). For large batches, increase to 3-5s. Never scrape aggressively — respect LinkedIn's rate limits.
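
For custom code that loops over profiles, the same pacing can be sketched as a fixed pause between requests. The 2-second default and the 3-5 second range come from the guidance above; the batch-size threshold of 20 and the helper names are hypothetical, and `scrape` stands in for whatever scraping call is used.

```python
import time


def pick_delay(batch_size: int, large_batch: int = 20) -> float:
    """Return 2 s by default, or 4 s (inside the 3-5 s range) for large batches.

    The threshold of 20 profiles is an illustrative assumption.
    """
    return 4.0 if batch_size >= large_batch else 2.0


def scrape_batch(urls, scrape, sleep=time.sleep):
    """Call `scrape(url)` for each URL, pausing between consecutive requests."""
    delay = pick_delay(len(urls))
    results = []
    for index, url in enumerate(urls):
        if index:  # no pause is needed before the first request
            sleep(delay)
        results.append(scrape(url))
    return results


if __name__ == "__main__":
    # A fake scraper and a no-op sleep keep the demo instant.
    print(scrape_batch(["a", "b"], scrape=str.upper, sleep=lambda seconds: None))
```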