research-idea-validator
Research Idea Validator
Purpose
Help the user pressure-test a research idea before they sink weeks into it. The workflow is grounded in the New Researcher Handbook section on Research Idea Generation: ideas come from connections, should be evaluated with the FIVE+C framework, and should be validated through fast feedback rather than private overthinking.
The goal is not to declare an idea "good" or "bad." The goal is to produce a concrete decision: prototype, revise, park, or kill.
When to Use
- User has a new project idea, paper idea, or thesis direction
- User is comparing multiple possible directions
- User wants to prepare an advisor pitch
- User has read papers and sees a potential gap
- User is worried an idea is too small, too broad, too late, or hard to evaluate
Workflow
Stage 1: Capture the Idea in One Sentence
Ask the user to state the idea in one sentence:
I want to [method/action] for [problem/domain] because [gap/failure mode], evaluated by [metric/task].
If the user cannot fill this in, help them rewrite it. Do not move to scoring until the one-sentence version is clear.
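The gate described above (no scoring until every slot in the sentence is filled) can be sketched in a few lines. This is a minimal illustration, not part of the original workflow: the slot names `method`, `problem`, `gap`, and `metric` and both function names are assumptions chosen to mirror the bracketed placeholders.

```python
# Hypothetical helper: slot names mirror the bracketed placeholders in the template.
TEMPLATE = "I want to {method} for {problem} because {gap}, evaluated by {metric}."
SLOTS = ("method", "problem", "gap", "metric")


def missing_slots(fields: dict[str, str]) -> list[str]:
    """Return the slot names that are absent or blank, in template order."""
    return [slot for slot in SLOTS if not fields.get(slot, "").strip()]


def one_sentence(fields: dict[str, str]) -> str:
    """Fill the template, refusing to proceed while any slot is empty."""
    missing = missing_slots(fields)
    if missing:
        raise ValueError(f"Clarify before scoring: {', '.join(missing)}")
    return TEMPLATE.format(**{slot: fields[slot].strip() for slot in SLOTS})
```

Calling `missing_slots` first tells the assistant which part of the sentence to help the user rewrite; `one_sentence` only succeeds once all four parts are present.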
Stage 2: Identify the Source of the Idea
Ask where the idea came from:
- Literature gap: which papers and which assumptions/future-work lines?
- Senior student or advisor signal: who mentioned the problem and why?
- Cross-pollination: which method is being moved to which domain?
- Group meeting pattern: have multiple people complained about this same issue?
- Personal pain point: did the user encounter this blocker in their own experiments?
Source matters because it changes the confidence level. A recurring pain point heard from several people is stronger evidence than a clever-sounding analogy that no one has asked for.
Stage 3: FIVE+C Evaluation
Score each criterion as Strong, Unclear, or Weak. Ask only for the missing information needed to score honestly.
| Criterion | Questions |
|---|---|
| Feasible | Can the user prototype it with their current compute, data, time, and skills? |
| Interesting | Would the target community care if it worked? Who specifically? |
| Novel | How is it different from the closest papers from the last 2 years? |
| Valuable | What field-level capability or understanding would improve? |
| Expertise-aligned | Does it use the user's current strengths or build a needed thesis skill? |
| Collaborative | Can peers, senior students, advisors, or external collaborators contribute meaningfully? |
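One way to keep the scoring honest is to track which criteria still lack a rating, so follow-up questions target only those. The sketch below is illustrative: the `Rating` enum, `follow_ups`, and `to_table` are hypothetical names, and treating an unscored criterion as Unclear is an assumption rather than a rule from the workflow.

```python
from enum import Enum


class Rating(Enum):
    STRONG = "Strong"
    UNCLEAR = "Unclear"
    WEAK = "Weak"


# Criterion names copied from the FIVE+C table.
CRITERIA = (
    "Feasible", "Interesting", "Novel",
    "Valuable", "Expertise-aligned", "Collaborative",
)


def follow_ups(scores: dict[str, Rating]) -> list[str]:
    """Criteria that still need information: unscored or rated Unclear."""
    return [c for c in CRITERIA if scores.get(c, Rating.UNCLEAR) is Rating.UNCLEAR]


def to_table(scores: dict[str, Rating], notes: dict[str, str]) -> str:
    """Render a markdown score table; unscored criteria get an empty rating cell."""
    lines = ["| Criterion | Rating | Notes / missing evidence |", "|---|---|---|"]
    for c in CRITERIA:
        rating = scores[c].value if c in scores else ""
        lines.append(f"| {c} | {rating} | {notes.get(c, '')} |")
    return "\n".join(lines)
```

`follow_ups` returns the criteria to ask about next; `to_table` produces rows in the same three-column layout as the score table in the saved artifact.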
Stage 4: Red-Flag Check
Call out any red flags directly:
- Too incremental: the contribution sounds like a small parameter, architecture, or dataset swap.
- Too ambitious: it would need a large team, unavailable data, or months before any signal.
- Already solved: the user has not checked recent top venues or arXiv.
- No evaluation metric: success cannot be measured cleanly.
- Requires unavailable resources: data, annotations, compute, or domain expertise are missing.
- No obvious audience: it is unclear who would cite or use the result.
For each red flag, propose one narrowing or reframing move.
Stage 5: Two-Week Validation Sprint
If the idea survives, design a two-week sprint:
- Minimal baseline to reproduce or implement
- One decisive experiment or toy setup
- One expected failure mode to check
- Three people to ask for feedback
- Recent-paper search target
- Stop condition: what evidence would make the user park the idea?
Keep the sprint small. If it cannot produce any signal in two weeks, shrink the idea.
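The sprint checklist above can be treated as a small record whose gaps are easy to list before the two weeks start. This is a sketch under assumptions: the `SprintPlan` class and its field names are hypothetical, and requiring exactly the six items named in the list (with at least three feedback targets) is a direct reading of the bullets, not an additional rule.

```python
from dataclasses import dataclass, field


@dataclass
class SprintPlan:
    """Hypothetical record of the six sprint items listed in Stage 5."""
    baseline: str = ""
    decisive_test: str = ""
    failure_mode: str = ""
    feedback_targets: list[str] = field(default_factory=list)
    literature_check: str = ""
    stop_condition: str = ""

    def gaps(self) -> list[str]:
        """Names of items still empty; feedback needs at least three people."""
        text_fields = (
            "baseline", "decisive_test", "failure_mode",
            "literature_check", "stop_condition",
        )
        gaps = [name for name in text_fields if not getattr(self, name).strip()]
        if len(self.feedback_targets) < 3:
            gaps.append("feedback_targets")
        return gaps
```

An empty `gaps()` result means the plan covers every bullet; a missing `stop_condition` is worth flagging first, since it defines when the user parks the idea.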
Stage 6: Produce the Artifact
Save to ~/phd-log/ideas/YYYY-MM-DD-[short-topic].md using the template below.
Research Idea Validation — [Short Topic]
One-sentence idea
[Clear sentence]
Source of the idea
- Origin:
- Evidence that this is a real problem:
- Closest papers / systems:
FIVE+C score
| Criterion | Rating | Notes / missing evidence |
|---|---|---|
| Feasible | | |
| Interesting | | |
| Novel | | |
| Valuable | | |
| Expertise-aligned | | |
| Collaborative | | |
Red flags
- [flag] -> [reframe or mitigation]
Two-week validation sprint
- Baseline:
- Decisive test:
- Feedback targets:
- Literature check:
- Stop condition:
Decision
[Prototype / revise / park / kill]
Next action
- [small concrete next step]
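The save path for this artifact follows the pattern ~/phd-log/ideas/YYYY-MM-DD-[short-topic].md. A minimal sketch of building it is shown here; the function name `idea_path` and the slug rule (lowercase, runs of non-alphanumeric characters collapsed to hyphens) are assumptions, since the workflow does not say how [short-topic] should be normalized.

```python
import re
from datetime import date
from pathlib import Path


def idea_path(topic: str, day: date, root: Path = Path("~/phd-log/ideas")) -> Path:
    """Build <root>/YYYY-MM-DD-<slug>.md; the slug rule is an assumed convention."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    if not slug:
        raise ValueError("topic must contain at least one letter or digit")
    # expanduser() resolves the leading ~ to the current user's home directory.
    return root.expanduser() / f"{day.isoformat()}-{slug}.md"
```

Creating the parent directory before writing, for example with `path.parent.mkdir(parents=True, exist_ok=True)`, avoids a failure the first time an idea is saved.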
Tone
Be rigorous but not dismissive. Most early ideas are under-specified, not worthless. Help the user make them testable.
What Not to Do
- Do not encourage a full implementation before the nearest related work is checked.
- Do not accept "interesting" without naming the audience.
- Do not let the user skip evaluation design.
- Do not overfit to novelty. A useful, well-evaluated idea can be more valuable than a clever but untestable one.