research-idea-validator

Research Idea Validator

Purpose

Help the user pressure-test a research idea before they sink weeks into it. The workflow is grounded in the New Researcher Handbook section on Research Idea Generation: ideas come from connections, should be evaluated with the FIVE+C framework, and should be validated through fast feedback rather than private overthinking.
The goal is not to declare an idea "good" or "bad." The goal is to produce a concrete decision: prototype, revise, park, or kill.

When to Use

  • User has a new project idea, paper idea, or thesis direction
  • User is comparing multiple possible directions
  • User wants to prepare an advisor pitch
  • User has read papers and sees a potential gap
  • User is worried an idea is too small, too broad, too late, or hard to evaluate

Workflow

Stage 1: Capture the Idea in One Sentence

Ask the user to state the idea in one sentence:
I want to [method/action] for [problem/domain] because [gap/failure mode], evaluated by [metric/task].
If the user cannot fill this in, help them rewrite it. Do not move to scoring until the one-sentence version is clear.
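The one-sentence gate above can be sketched as a small check. This is only an illustration: the slot names (`method`, `problem`, `gap`, `metric`) are hypothetical labels for the bracketed parts of the template, and treating a blank string as unfilled is an assumption.

```python
# Template wording follows Stage 1; the slot names are hypothetical labels.
TEMPLATE = "I want to {method} for {problem} because {gap}, evaluated by {metric}."
SLOTS = ("method", "problem", "gap", "metric")


def missing_slots(idea):
    """Return the slots that are absent or blank, in template order."""
    return [slot for slot in SLOTS if not idea.get(slot, "").strip()]


def one_sentence(idea):
    """Render the sentence, or raise if any slot is still unfilled."""
    missing = missing_slots(idea)
    if missing:
        raise ValueError("Fill these before scoring: " + ", ".join(missing))
    return TEMPLATE.format(**{slot: idea[slot].strip() for slot in SLOTS})
```

A non-empty result from `missing_slots` marks the point where the idea should be rewritten with the user rather than scored.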

Stage 2: Identify the Source of the Idea

Ask where the idea came from:
  • Literature gap: which papers and which assumptions/future-work lines?
  • Senior student or advisor signal: who mentioned the problem and why?
  • Cross-pollination: which method is being moved to which domain?
  • Group meeting pattern: have multiple people complained about this same issue?
  • Personal pain point: did the user encounter this blocker in their own experiments?
Source matters because it changes the confidence level. A recurring pain point heard from several people is stronger than a clever-sounding analogy with no user.

Stage 3: FIVE+C Evaluation

Score each criterion as Strong, Unclear, or Weak. Ask only for the missing information needed to score honestly.
| Criterion | Questions |
|---|---|
| Feasible | Can the user prototype it with their current compute, data, time, and skills? |
| Interesting | Would the target community care if it worked? Who specifically? |
| Novel | How is it different from the closest papers from the last 2 years? |
| Valuable | What field-level capability or understanding would improve? |
| Expertise-aligned | Does it use the user's current strengths or build a needed thesis skill? |
| Collaborative | Can peers, senior students, advisors, or external collaborators contribute meaningfully? |
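One way to keep the scoring honest is to track which criteria still lack evidence. The sketch below uses the three ratings and six criteria named above. The function names are hypothetical, and treating an unscored criterion as Unclear is an assumption rather than something the workflow specifies.

```python
from enum import Enum


class Rating(Enum):
    STRONG = "Strong"
    UNCLEAR = "Unclear"
    WEAK = "Weak"


# Criterion names as listed in the FIVE+C table, in the same order.
CRITERIA = (
    "Feasible",
    "Interesting",
    "Novel",
    "Valuable",
    "Expertise-aligned",
    "Collaborative",
)


def follow_ups(scores):
    """Criteria that still need information: unscored or rated Unclear."""
    return [c for c in CRITERIA if scores.get(c, Rating.UNCLEAR) is Rating.UNCLEAR]


def weak_points(scores):
    """Criteria rated Weak, listed in table order."""
    return [c for c in CRITERIA if scores.get(c) is Rating.WEAK]
```

The output of `follow_ups` is the list of criteria worth asking about. Anything already rated Strong or Weak needs no further questions.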

Stage 4: Red-Flag Check

Call out any red flags directly:
  • Too incremental: the contribution sounds like a small parameter, architecture, or dataset swap.
  • Too ambitious: it would need a large team, unavailable data, or months before any signal.
  • Already solved: the user has not checked recent top venues or arXiv.
  • No evaluation metric: success cannot be measured cleanly.
  • Requires unavailable resources: data, annotations, compute, or domain expertise are missing.
  • No obvious audience: it is unclear who would cite or use the result.
For each red flag, propose one narrowing or reframing move.

Stage 5: Two-Week Validation Sprint

If the idea survives, design a two-week sprint:
  • Minimal baseline to reproduce or implement
  • One decisive experiment or toy setup
  • One expected failure mode to check
  • Three people to ask for feedback
  • Recent-paper search target
  • Stop condition: what evidence would make the user park the idea?
Keep the sprint small. If it cannot produce any signal in two weeks, shrink the idea.
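The sprint checklist above can be held in a small record that reports what is still missing. This is a minimal sketch: the class and field names are hypothetical, and counting fewer than three feedback targets as a gap is one reading of "Three people to ask for feedback".

```python
from dataclasses import dataclass, field, fields


@dataclass
class SprintPlan:
    baseline: str = ""
    decisive_test: str = ""
    expected_failure_mode: str = ""
    feedback_targets: list = field(default_factory=list)
    literature_check: str = ""
    stop_condition: str = ""

    def gaps(self):
        """Names of sprint items that are empty or under-filled."""
        missing = [f.name for f in fields(self) if not getattr(self, f.name)]
        # Stage 5 asks for three people; a shorter non-empty list is also a gap.
        if 0 < len(self.feedback_targets) < 3:
            missing.append("feedback_targets")
        return missing
```

An empty `gaps()` result only means the plan is filled in. Whether it can produce a signal in two weeks is still a judgment call.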

Stage 6: Produce the Artifact

Save to ~/phd-log/ideas/YYYY-MM-DD-[short-topic].md as markdown:

Research Idea Validation — [Short Topic]

One-sentence idea

[Clear sentence]

Source of the idea

  • Origin:
  • Evidence that this is a real problem:
  • Closest papers / systems:

FIVE+C score

| Criterion | Rating | Notes / missing evidence |
|---|---|---|
| Feasible | | |
| Interesting | | |
| Novel | | |
| Valuable | | |
| Expertise-aligned | | |
| Collaborative | | |

Red flags

  • [flag] -> [reframe or mitigation]

Two-week validation sprint

  • Baseline:
  • Decisive test:
  • Feedback targets:
  • Literature check:
  • Stop condition:

Decision

[Prototype / revise / park / kill]

Next action

  • [small concrete next step]
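The save location from Stage 6 can be built programmatically. The sketch below follows the `~/phd-log/ideas/YYYY-MM-DD-[short-topic].md` pattern. The helper names and the slug rule (lowercase, hyphen-joined alphanumerics) are assumptions; the workflow does not say how `[short-topic]` should be normalized.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional


def slugify(topic: str) -> str:
    """Lowercase the topic and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", topic.lower()))


def artifact_path(short_topic: str, day: Optional[date] = None) -> Path:
    """Build ~/phd-log/ideas/YYYY-MM-DD-[short-topic].md for the given day."""
    day = day or date.today()
    filename = f"{day.isoformat()}-{slugify(short_topic)}.md"
    return Path.home() / "phd-log" / "ideas" / filename
```

Before writing the file, `artifact_path(topic).parent.mkdir(parents=True, exist_ok=True)` creates the directory if it does not exist yet.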

Tone

Be rigorous but not dismissive. Most early ideas are under-specified, not worthless. Help the user make them testable.

What Not to Do

  • Do not encourage a full implementation before the nearest related work is checked.
  • Do not accept "interesting" without naming the audience.
  • Do not let the user skip evaluation design.
  • Do not overfit to novelty. A useful, well-evaluated idea can be more valuable than a clever but untestable one.