# Shinka Task Setup Skill
Create a setup scaffold consisting of an evaluation script and initial solution for an optimization problem given a user's task description. Both ingredients will be used within ShinkaEvolve, a framework combining LLMs with evolutionary algorithms to drive code optimization.
## When to Use

Invoke this skill when the user:
- Wants to optimize code with LLM-driven code evolution (Shinka/ShinkaEvolve)
- No `evaluate.py` and `initial.<ext>` exist in the working directory
## User Inputs

- Task description + success criteria
- Target language for `initial.<ext>` (if omitted, default to Python)
- What parts of the script to optimize
- Evaluation metric(s) and score direction
- Number of eval runs / seeds
- Required assets or data files
- Dependencies or constraints (runtime, memory)
## Workflow

- Check if all user inputs are provided and ask the user follow-up questions if not inferrable.
- Inspect the working directory. Detect the chosen language + extension. Avoid overwriting an existing `evaluate.py` or `initial.<ext>` without consent.
- Write `initial.<ext>` with a clear evolve region (`EVOLVE-BLOCK` markers or language-equivalent comments) and a stable I/O contract.
- Write `evaluate.py`:
  - Python `initial.py`: call `run_shinka_eval` with `experiment_fn_name`, `get_experiment_kwargs`, `aggregate_metrics_fn`, `num_runs`, and optional `validate_fn`.
  - Non-Python `initial.<ext>`: run the candidate program directly (usually via `subprocess`) and write `metrics.json` + `correct.json`.
- Ensure the candidate output schema matches evaluator expectations (tuple/dict for Python module eval, or file/CLI contract for non-Python).
- Validate the draft `evaluate.py` before handoff:
  - Run a smoke test: `python evaluate.py --program_path initial.<ext> --results_dir /tmp/shinka_eval_smoke`
  - Confirm the evaluator runs without exceptions.
  - Confirm a metrics `dict` is produced (either from `aggregate_fn` or `metrics.json`) with at least: `combined_score` (numeric), `public` (`dict`), `private` (`dict`), `extra_data` (`dict`), and `text_feedback` (string, can be empty).
  - Confirm `correct.json` exists with `correct` (bool) and `error` (string) fields.
- Ask the user if they want to run the evolution themselves or use the `shinka-run` skill:
  - If the user wants to run evolution manually, add `run_evo.py` plus a `shinka.yaml` config with matching language + `init_program_path`.
  - If the user prefers the agent, use the `shinka-run` skill to perform the optimization.
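Several of the smoke-test checks can be automated with a small helper. This is a sketch; the helper name and its strictness are my own, not part of Shinka:

```python
import json
from pathlib import Path

# Expected metrics.json schema (key -> accepted Python types)
REQUIRED_METRIC_KEYS = {
    "combined_score": (int, float),
    "public": dict,
    "private": dict,
    "extra_data": dict,
    "text_feedback": str,
}


def check_smoke_outputs(results_dir: str) -> list[str]:
    """Return a list of problems found in a smoke-test results dir (empty = OK)."""
    problems = []
    metrics_path = Path(results_dir) / "metrics.json"
    correct_path = Path(results_dir) / "correct.json"
    if not metrics_path.exists():
        problems.append("metrics.json missing")
    else:
        metrics = json.loads(metrics_path.read_text(encoding="utf-8"))
        for key, typ in REQUIRED_METRIC_KEYS.items():
            if key not in metrics:
                problems.append(f"metrics.json missing key: {key}")
            elif not isinstance(metrics[key], typ):
                problems.append(f"metrics.json key has wrong type: {key}")
    if not correct_path.exists():
        problems.append("correct.json missing")
    else:
        correctness = json.loads(correct_path.read_text(encoding="utf-8"))
        if not isinstance(correctness.get("correct"), bool):
            problems.append("correct.json: 'correct' must be a bool")
        if not isinstance(correctness.get("error"), str):
            problems.append("correct.json: 'error' must be a string")
    return problems
```

Run it against the smoke test's `--results_dir` after the evaluator finishes.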
## What is ShinkaEvolve?

A framework developed by SakanaAI that combines LLMs with evolutionary algorithms to propose program mutations, which are then evaluated and archived. The goal is to optimize for performance and discover novel scientific insights.

Repo and documentation: https://github.com/SakanaAI/ShinkaEvolve
Paper: https://arxiv.org/abs/2212.04180
## Evolution Flow

- Select parent(s) from the archive/population
- LLM proposes a patch (diff, full rewrite, or crossover)
- Evaluate the candidate → `combined_score`
- If valid, insert into the island archive (higher score = better)
- Periodically migrate top solutions between islands
- Repeat for N generations
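As a concrete illustration, this loop can be sketched in a few lines of Python, with a random parameter jitter standing in for the LLM patch step and islands/migration omitted. All names below are illustrative, not Shinka's API:

```python
import random


def evolve(evaluate, init, generations=50, pop_size=8, seed=0):
    """Minimal evolutionary loop: mutate a parent, archive the fittest."""
    rng = random.Random(seed)
    archive = [(evaluate(init), init)]
    for _ in range(generations):
        # Select a parent (here: uniformly at random from the archive)
        _, parent = archive[rng.randrange(len(archive))]
        # Stand-in for the LLM patch step: jitter the candidate's parameters
        child = [x + rng.gauss(0, 0.1) for x in parent]
        archive.append((evaluate(child), child))
        # Keep only the fittest; higher combined_score = better
        archive.sort(key=lambda entry: entry[0], reverse=True)
        del archive[pop_size:]
    return archive[0]


# Toy fitness: maximize -(x-1)^2 - (y+2)^2, optimum at (1, -2)
best_score, best = evolve(lambda v: -((v[0] - 1) ** 2 + (v[1] + 2) ** 2), [0.0, 0.0])
```

In Shinka the mutation step is an LLM-proposed diff/rewrite/crossover and the archive is split across islands; the selection-mutation-evaluation-archival skeleton is the same.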
## Core Files To Generate

| File | Purpose |
|---|---|
| `initial.<ext>` | Starting solution in the chosen language with an evolve region that LLMs mutate |
| `evaluate.py` | Scores candidates and emits metrics/correctness outputs that guide selection |
| `run_evo.py` | (Optional) Launches the evolution loop |
| `shinka.yaml` | (Optional) Config: generations, islands, LLM models, patch types, etc. |
## Quick Install (if Shinka is not set up yet)

Install once before creating/running tasks:

```bash
# Check if shinka is available in the workspace environment
python -c "import shinka"

# If not, install from PyPI
pip install shinka-evolve

# Or with uv
uv pip install shinka-evolve
```
## Language Support (`initial.<ext>`)

Shinka supports multiple candidate-program languages. Choose one, then keep the extension/config/evaluator aligned.

| Language | Extension |
|---|---|
| | |
| | |
| | |
| | |
| | |
| | |
| | |

Rules:
- `evaluate.py` stays the evaluator entrypoint.
- Python candidates: prefer `run_shinka_eval` + `experiment_fn_name`.
- Non-Python candidates: evaluate via `subprocess` and write `metrics.json` + `correct.json`.
- Always set `evo_config.language` and a matching `evo_config.init_program_path`.
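The last rule (keep `evo_config.language` and `evo_config.init_program_path` in agreement) can be illustrated with a config fragment. The exact schema depends on your Shinka version, so treat this as a sketch, not the definitive format:

```yaml
# Hypothetical shinka.yaml fragment: language and entry program must agree
evo_config:
  language: python              # candidate-program language
  init_program_path: initial.py # extension must match the language above
```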
## Template: `initial.<ext>` (Python example)

```py
import random

# EVOLVE-BLOCK-START
def advanced_algo():
    # Implement the evolving algorithm here.
    return 0.0, ""
# EVOLVE-BLOCK-END


def solve_problem(params):
    return advanced_algo()


def run_experiment(random_seed: int | None = None, **kwargs):
    """Main entrypoint called by evaluator."""
    if random_seed is not None:
        random.seed(random_seed)
    score, text = solve_problem(kwargs)
    return float(score), text
```
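Before handing the pair to Shinka, the module contract can be exercised directly; this mirrors what a module-based evaluator does when it imports the candidate. `load_and_run` is an ad-hoc helper for local checking, not a Shinka API:

```python
import importlib.util


def load_and_run(program_path: str, random_seed: int = 0):
    """Load a candidate module from a file path and call its entrypoint."""
    spec = importlib.util.spec_from_file_location("candidate", program_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    score, text = module.run_experiment(random_seed=random_seed)
    # Enforce the (float, str) return contract expected by the evaluator
    assert isinstance(score, float) and isinstance(text, str)
    return score, text
```

Calling it on the template above should return `(0.0, "")` until the evolve region is filled in.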
For non-Python `initial.<ext>`, keep the same idea: small evolve region + deterministic program interface consumed by `evaluate.py`.

## Template: `evaluate.py` (Python `run_shinka_eval` path)

```py
import argparse

import numpy as np
from shinka.core import run_shinka_eval  # required for results storage


def get_kwargs(run_idx: int) -> dict:
    return {"random_seed": int(np.random.randint(0, 1_000_000_000))}


def aggregate_fn(results: list) -> dict:
    scores = [r[0] for r in results]
    texts = [r[1] for r in results if len(r) > 1]
    combined_score = float(np.mean(scores))
    text = texts[0] if texts else ""
    return {
        "combined_score": combined_score,
        "public": {},
        "private": {},
        "extra_data": {},
        "text_feedback": text,
    }


def validate_fn(result):
    # Return (True, None) or (False, "reason")
    return True, None


def main(program_path: str, results_dir: str):
    metrics, correct, err = run_shinka_eval(
        program_path=program_path,
        results_dir=results_dir,
        experiment_fn_name="run_experiment",
        num_runs=3,
        get_experiment_kwargs=get_kwargs,
        aggregate_metrics_fn=aggregate_fn,
        validate_fn=validate_fn,  # optional
    )
    if not correct:
        raise RuntimeError(err or "Evaluation failed")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--program_path", required=True)
    parser.add_argument("--results_dir", required=True)
    args = parser.parse_args()
    main(program_path=args.program_path, results_dir=args.results_dir)
```
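The aggregation logic is easy to sanity-check in isolation by calling the aggregator on hand-made results shaped like `run_experiment`'s `(score, text)` return value. The version below restates the template's `aggregate_fn` with the stdlib's `statistics.fmean` in place of `np.mean` so it runs without NumPy:

```python
from statistics import fmean


# Stdlib restatement of the template's aggregate_fn (np.mean -> statistics.fmean)
def aggregate_fn(results: list) -> dict:
    scores = [r[0] for r in results]
    texts = [r[1] for r in results if len(r) > 1]
    return {
        "combined_score": float(fmean(scores)),
        "public": {},
        "private": {},
        "extra_data": {},
        "text_feedback": texts[0] if texts else "",
    }


metrics = aggregate_fn([(1.0, "run a"), (3.0, "run b")])
# combined_score is the mean of the run scores; text_feedback comes from the first run
```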
## Template: `evaluate.py` (non-Python `initial.<ext>` path)

```py
import argparse
import json
import os
from pathlib import Path


def main(program_path: str, results_dir: str):
    os.makedirs(results_dir, exist_ok=True)
    # 1) Execute candidate program_path (subprocess / runtime-specific call)
    # 2) Compute task metrics + correctness
    metrics = {
        "combined_score": 0.0,
        "public": {},
        "private": {},
        "extra_data": {},
        "text_feedback": "",
    }
    correct = False
    error = ""
    (Path(results_dir) / "metrics.json").write_text(
        json.dumps(metrics, indent=2), encoding="utf-8"
    )
    (Path(results_dir) / "correct.json").write_text(
        json.dumps({"correct": correct, "error": error}, indent=2), encoding="utf-8"
    )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--program_path", required=True)
    parser.add_argument("--results_dir", required=True)
    args = parser.parse_args()
    main(program_path=args.program_path, results_dir=args.results_dir)
```
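The two placeholder steps can be fleshed out per task; for example, for a candidate run as a standalone script where stdout carries a single numeric score. That stdout contract is an assumption chosen for illustration, not a Shinka requirement:

```python
import subprocess
import sys


def run_candidate(program_path: str, timeout_s: float = 60.0):
    """Run a candidate script and parse a single numeric score from stdout.

    Returns (score, error); score is None when the run failed.
    """
    proc = subprocess.run(
        [sys.executable, program_path],  # swap for the language's runtime/compiler
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    if proc.returncode != 0:
        return None, f"candidate exited with {proc.returncode}: {proc.stderr.strip()}"
    try:
        return float(proc.stdout.strip()), ""
    except ValueError:
        return None, f"unparseable stdout: {proc.stdout!r}"
```

Its `(score, error)` result maps directly onto `combined_score` in `metrics.json` and the `correct`/`error` fields in `correct.json`.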
## (Optional) Template: `run_evo.py` (async)

See `skills/shinka-setup/scripts/run_evo.py` for an example to edit.

## (Optional) Template: `shinka.yaml`

See `skills/shinka-setup/scripts/shinka.yaml` for an example to edit.

## Notes
- Keep evolve markers tight; only code inside the region should evolve.
- Keep the evaluator schema stable (`combined_score`, `public`, `private`, `extra_data`, `text_feedback`).
- Python module path: ensure `experiment_fn_name` matches the function name in `initial.py`.
- Non-Python path: ensure the evaluator/runtime contract matches the `initial.<ext>` CLI/I/O.
- Higher `combined_score` values indicate better performance.