linkedin-sourcer

LinkedIn Sourcer

Source candidates from LinkedIn, analyze their profiles, and evaluate fit against role requirements using the `linkedin_scraper` library (v3.0+, Playwright-based, async).

Prerequisites

Ensure dependencies are installed before any scraping:

bash

pip install linkedin-scraper
playwright install chromium

An authenticated session file (`session.json`) is required. If one does not exist, create one:

Programmatic login (using credentials):

bash

python3 scripts/create_session.py --email USER@EXAMPLE.COM --password PASS

Or via environment variables:

bash

export LINKEDIN_EMAIL=user@example.com
export LINKEDIN_PASSWORD=mypassword
python3 scripts/create_session.py

Manual login (opens a browser window — use when programmatic login fails due to CAPTCHA/2FA):

bash

python3 scripts/create_session.py

The session file is reusable until LinkedIn expires it. See `references/linkedin_scraper_api.md` for browser configuration options.
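
The choice between reusing a session, programmatic login, and manual login can be sketched as a small pre-flight check. This is a minimal sketch, not part of the skill's scripts: `choose_login_mode` is a hypothetical helper, and it assumes that `scripts/create_session.py` falls back to manual login when no credentials are supplied.

```python
import os
from pathlib import Path


def choose_login_mode(session_path="session.json", env=None):
    """Return which prerequisite step applies (hypothetical helper).

    "reuse"        -> a session file already exists
    "programmatic" -> both credential variables are set in the environment
    "manual"       -> no credentials found; a browser-window login is needed
    """
    env = os.environ if env is None else env
    if Path(session_path).is_file():
        return "reuse"
    if env.get("LINKEDIN_EMAIL") and env.get("LINKEDIN_PASSWORD"):
        return "programmatic"
    return "manual"


print(choose_login_mode())
```

An existing file only means a session was created at some point. LinkedIn may still have expired it, in which case the session has to be recreated.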

Workflow Decision Tree

Determine the task type:

"Scrape this profile / these profiles" → Profile Scraping
"Find candidates for this role" → Candidate Search
"Evaluate this candidate for this role" → Candidate Evaluation
"Compare these candidates" → Candidate Comparison

1. Profile Scraping

Run `scripts/scrape_profile.py` to extract structured profile data:

bash

python3 scripts/scrape_profile.py "https://linkedin.com/in/username" --session session.json

For multiple profiles:

bash

python3 scripts/scrape_profile.py URL1 URL2 URL3 --delay 2 --output profiles.json

Output is JSON with: name, headline, location, about, experiences, educations, skills.

For inline scraping within custom code, see `references/linkedin_scraper_api.md` → PersonScraper.
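
Before evaluating a candidate it can help to check which fields the scrape actually returned. The sketch below assumes the JSON keys are spelled exactly as in the field list above and that a multi-profile `--output` file holds a JSON list. Neither detail is confirmed here, and `missing_fields` and `load_profiles` are hypothetical helpers.

```python
import json

EXPECTED_FIELDS = ("name", "headline", "location", "about",
                   "experiences", "educations", "skills")


def missing_fields(profile: dict) -> list:
    """Return the expected fields that are absent or empty in one profile."""
    return [field for field in EXPECTED_FIELDS if not profile.get(field)]


def load_profiles(raw_json: str) -> list:
    """Parse scraper output, accepting a single object or a list of objects."""
    data = json.loads(raw_json)
    return data if isinstance(data, list) else [data]


# Made-up sample data standing in for real scraper output.
sample = '{"name": "Jane Doe", "headline": "Staff Engineer", "skills": ["Python"]}'
for profile in load_profiles(sample):
    print(profile["name"], "is missing:", missing_fields(profile))
```

Sparse profiles are common, so a missing field is better treated as "unknown" than as evidence against the candidate.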

2. Candidate Search

Generate boolean search queries the user can paste into LinkedIn or Google to find candidates. See `references/sourcing_workflows.md` → Boolean Search String Patterns for templates and examples. Tailor the boolean string to the specific role requirements provided.
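
As an illustration of tailoring a boolean string to role requirements, the sketch below assembles one from alternative titles, required skills, and exclusions. It is not taken from `references/sourcing_workflows.md`: `build_boolean_query` and its parameters are hypothetical, and the `xray` option assumes the common `site:linkedin.com/in` pattern for Google searches.

```python
def build_boolean_query(titles, skills, exclude=(), xray=False):
    """Build a boolean search string from role requirements.

    titles  -> alternative job titles (any one may match)
    skills  -> required skills (all must match)
    exclude -> terms to filter out
    xray    -> format for Google, restricted to LinkedIn profile pages
    """
    title_group = "(" + " OR ".join(f'"{title}"' for title in titles) + ")"
    required = [title_group] + [f'"{skill}"' for skill in skills]
    if xray:
        # Google treats a space as AND and a leading minus as exclusion.
        excluded = [f'-"{term}"' for term in exclude]
        return " ".join(["site:linkedin.com/in"] + required + excluded)
    query = " AND ".join(required)
    for term in exclude:
        query += f' NOT "{term}"'
    return query


print(build_boolean_query(["Data Engineer", "Analytics Engineer"],
                          ["Spark", "Python"], exclude=["recruiter"]))
```

Quoting every term keeps multi-word titles together. The operators are written in upper case, as LinkedIn search expects.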

3. Candidate Evaluation

After scraping profile(s), evaluate fit against a job description:

Scrape the candidate's profile
Apply the scorecard template from `references/sourcing_workflows.md` → Candidate Scorecard Template
Rate each criterion (1-5) with notes based on the scraped data
Assign an overall fit rating: STRONG_FIT, GOOD_FIT, PARTIAL_FIT, or WEAK_FIT
Identify strengths, concerns, and key questions for outreach

Use the evaluation heuristics in `references/sourcing_workflows.md` → Evaluation Heuristics to guide ratings.

For quick single-candidate output, use the Candidate Summary Template instead.
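
The rating steps above can be sketched as a small data structure plus a mapping from per-criterion scores to an overall rating. The cut-offs used here (4.5, 3.5, 2.5 on the mean score) are illustrative assumptions, not values from the scorecard template. `CriterionRating` and `overall_fit` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CriterionRating:
    criterion: str
    score: int       # 1 (weak) to 5 (strong)
    notes: str = ""  # evidence taken from the scraped profile


def overall_fit(ratings):
    """Map the mean criterion score to one of the four fit ratings."""
    if not ratings:
        raise ValueError("at least one rated criterion is required")
    for rating in ratings:
        if not 1 <= rating.score <= 5:
            raise ValueError(f"{rating.criterion}: score must be 1-5")
    mean = sum(rating.score for rating in ratings) / len(ratings)
    if mean >= 4.5:
        return "STRONG_FIT"
    if mean >= 3.5:
        return "GOOD_FIT"
    if mean >= 2.5:
        return "PARTIAL_FIT"
    return "WEAK_FIT"


card = [
    CriterionRating("Python", 5, "8 years across two backend roles"),
    CriterionRating("Team leadership", 4, "led a team of 6"),
    CriterionRating("Domain experience", 3, "adjacent industry only"),
]
print(overall_fit(card))
```

A plain mean treats every criterion as equally important. If the role has must-have requirements, a weighted mean or a hard floor on those criteria may be a better choice.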

4. Candidate Comparison

When evaluating multiple candidates for the same role:

Scrape all candidate profiles
Apply the comparison table from `references/sourcing_workflows.md` → Candidate Comparison Table
Rank candidates with rationale
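
One way to produce a ranking with a rationale is to sort candidates by mean criterion score and note each candidate's strongest and weakest criterion. This is a sketch under assumptions: `rank_candidates` is hypothetical, the scores are made-up sample data, and the actual comparison table may use different columns.

```python
def rank_candidates(scorecards):
    """Rank candidates by mean criterion score, highest first.

    scorecards: {candidate name: {criterion: score from 1 to 5}}
    Returns a list of (rank, name, mean score, rationale) tuples.
    """
    rows = []
    for name, scores in scorecards.items():
        mean = sum(scores.values()) / len(scores)
        best = max(scores, key=scores.get)
        worst = min(scores, key=scores.get)
        rationale = (f"strongest on {best} ({scores[best]}), "
                     f"weakest on {worst} ({scores[worst]})")
        rows.append((name, round(mean, 2), rationale))
    # sort() is stable, so tied candidates keep their input order.
    rows.sort(key=lambda row: row[1], reverse=True)
    return [(rank, *row) for rank, row in enumerate(rows, start=1)]


scorecards = {
    "Candidate A": {"Python": 5, "Leadership": 3},
    "Candidate B": {"Python": 4, "Leadership": 5},
}
for row in rank_candidates(scorecards):
    print(row)
```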

Error Handling

AuthenticationError → Session expired. Re-run `scripts/create_session.py` with credentials or manual login.
RateLimitError → Wait and retry. Increase `--delay` between requests.
ProfileNotFoundError → Profile is private or URL is invalid.

See `references/linkedin_scraper_api.md` → Error Handling for try/except patterns.
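
The three rules above can be sketched as a retry wrapper. To stay runnable on its own, the sketch defines stand-in exception classes with the same names. Real code would use the library's own classes, whose import path is not shown in this document. `scrape_with_retry`, its `scrape` argument, and the doubling of the wait time are assumptions for illustration only.

```python
import asyncio


# Stand-ins so the sketch runs without the library installed.
class AuthenticationError(Exception):
    pass


class RateLimitError(Exception):
    pass


class ProfileNotFoundError(Exception):
    pass


async def scrape_with_retry(scrape, url, delay=2.0, max_attempts=3):
    """Call the async placeholder `scrape(url)` and apply the handling rules."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await scrape(url)
        except RateLimitError:
            if attempt == max_attempts:
                raise
            await asyncio.sleep(delay)
            delay *= 2  # wait longer before each further retry
        except ProfileNotFoundError:
            return None  # private profile or invalid URL: skip it
        except AuthenticationError as exc:
            # Retrying cannot help until the session file is recreated.
            raise RuntimeError(
                "Session expired: re-run scripts/create_session.py") from exc


async def fake_scrape(url):
    """Placeholder scraper that always succeeds."""
    return {"url": url}


print(asyncio.run(scrape_with_retry(fake_scrape, "https://linkedin.com/in/username")))
```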

Rate Limiting

Always use delays between requests (the scripts default to 2s). For large batches, increase the delay to 3-5s. Never scrape aggressively; respect LinkedIn's rate limits.
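
For inline scraping code, the delay guidance can be expressed as a small helper. This is a sketch only: `pick_delay` is hypothetical, the threshold of 20 profiles for a "large batch" is an assumption, and drawing a random value from the 3-5s range is a design choice that is not prescribed above.

```python
import random


def pick_delay(batch_size, large_batch=20):
    """Return the number of seconds to wait before the next request.

    Small batches use the 2s script default. Large batches draw a value
    from the 3-5s range so that requests are not evenly spaced.
    """
    if batch_size >= large_batch:
        return random.uniform(3.0, 5.0)
    return 2.0


print(pick_delay(5))
```

When calling the scripts instead, a fixed value in the same range can be passed through `--delay`.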
