shinka-setup


Shinka Task Setup Skill


Create a setup scaffold consisting of an evaluation script and initial solution for an optimization problem given a user's task description. Both ingredients will be used within ShinkaEvolve, a framework combining LLMs with evolutionary algorithms to drive code optimization.

When to Use


Invoke this skill when the user:
  • Wants to optimize code with LLM-driven code evolution (Shinka/ShinkaEvolve)
  • Has no `evaluate.py` and `initial.<ext>` in the working directory

User Inputs


  • Task description + success criteria
  • Target language for `initial.<ext>` (if omitted, default to Python)
  • What parts of the script to optimize
  • Evaluation metric(s) and score direction
  • Number of eval runs / seeds
  • Required assets or data files
  • Dependencies or constraints (runtime, memory)

Workflow


  1. Check whether all user inputs are provided and ask the user follow-up questions if anything cannot be inferred.
  2. Inspect the working directory and detect the chosen language + extension. Avoid overwriting an existing `evaluate.py` or `initial.<ext>` without consent.
  3. Write `initial.<ext>` with a clear evolve region (`EVOLVE-BLOCK` markers or language-equivalent comments) and a stable I/O contract.
  4. Write `evaluate.py`:
     • Python `initial.py`: call `run_shinka_eval` with `experiment_fn_name`, `get_experiment_kwargs`, `aggregate_metrics_fn`, `num_runs`, and an optional `validate_fn`.
     • Non-Python `initial.<ext>`: run the candidate program directly (usually via `subprocess`) and write `metrics.json` + `correct.json`.
  5. Ensure the candidate output schema matches evaluator expectations (tuple/dict for Python module eval, or a file/CLI contract for non-Python).
  6. Validate the draft `evaluate.py` before handoff:
     • Run a smoke test: `python evaluate.py --program_path initial.<ext> --results_dir /tmp/shinka_eval_smoke`
     • Confirm the evaluator runs without exceptions.
     • Confirm a metrics `dict` is produced (either from `aggregate_fn` or `metrics.json`) with at least:
       • `combined_score` (numeric)
       • `public` (`dict`)
       • `private` (`dict`)
       • `extra_data` (`dict`)
       • `text_feedback` (string, can be empty)
     • Confirm `correct.json` exists with `correct` (bool) and `error` (string) fields.
  7. Ask the user whether they want to run the evolution themselves or use the `shinka-run` skill:
     • If the user wants to run evolution manually, add `run_evo.py` plus a `shinka.yaml` config with matching language + `init_program_path`.
     • Otherwise, ask whether to use the `shinka-run` skill to perform the optimization with the agent.
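The schema checks in step 6 can be scripted. A minimal sketch, assuming the evaluator has already written `metrics.json` and `correct.json` into the results directory (the helper name `check_results` is illustrative, not part of Shinka):

```python
import json
from pathlib import Path


def check_results(results_dir: str) -> list[str]:
    """Return a list of schema problems; an empty list means the smoke test passed."""
    problems = []
    metrics = json.loads((Path(results_dir) / "metrics.json").read_text())
    correct = json.loads((Path(results_dir) / "correct.json").read_text())

    if not isinstance(metrics.get("combined_score"), (int, float)):
        problems.append("combined_score must be numeric")
    for key in ("public", "private", "extra_data"):
        if not isinstance(metrics.get(key), dict):
            problems.append(f"{key} must be a dict")
    if not isinstance(metrics.get("text_feedback"), str):
        problems.append("text_feedback must be a string (may be empty)")
    if not isinstance(correct.get("correct"), bool):
        problems.append("correct.json needs a boolean 'correct' field")
    if not isinstance(correct.get("error"), str):
        problems.append("correct.json needs a string 'error' field")
    return problems
```

Run it against the smoke-test results directory after `evaluate.py` finishes; any returned message points at a field to fix before handoff.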

What is ShinkaEvolve?


A framework developed by SakanaAI that combines LLMs with evolutionary algorithms to propose program mutations that are then evaluated and archived. The goal is to optimize performance and discover novel scientific insights.

Evolution Flow


  1. Select parent(s) from the archive/population
  2. LLM proposes a patch (diff, full rewrite, or crossover)
  3. Evaluate candidate → `combined_score`
  4. If valid, insert into island archive (higher score = better)
  5. Periodically migrate top solutions between islands
  6. Repeat for N generations
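The flow above can be sketched as a toy island-model loop. This is purely illustrative: `evaluate`, `propose_patch`, and `run_evolution` are placeholders standing in for ShinkaEvolve's internals, not its API.

```python
def run_evolution(evaluate, propose_patch, initial_program,
                  num_generations=10, num_islands=2, migrate_every=5):
    """Toy island-model loop mirroring the flow above (higher score = better)."""
    # Each island keeps its own archive of (program, score) entries.
    islands = [[(initial_program, evaluate(initial_program))] for _ in range(num_islands)]
    for gen in range(1, num_generations + 1):
        for archive in islands:
            parent, _ = max(archive, key=lambda e: e[1])  # 1. select parent
            child = propose_patch(parent)                 # 2. LLM proposes patch
            score = evaluate(child)                       # 3. evaluate -> combined_score
            if score is not None:                         # 4. insert valid candidates
                archive.append((child, score))
        if gen % migrate_every == 0:                      # 5. migrate top solution
            best = max((e for a in islands for e in a), key=lambda e: e[1])
            for archive in islands:
                archive.append(best)
    return max((e for a in islands for e in a), key=lambda e: e[1])
```

In the real framework the patch comes from an LLM and the score from `evaluate.py`; the loop structure (select, mutate, evaluate, archive, migrate) is the part this sketch is meant to convey.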

Core Files To Generate


| File | Purpose |
| --- | --- |
| `initial.<ext>` | Starting solution in the chosen language with an evolve region that LLMs mutate |
| `evaluate.py` | Scores candidates and emits metrics/correctness outputs that guide selection |
| `run_evo.py` | (Optional) Launches the evolution loop |
| `shinka.yaml` | (Optional) Config: generations, islands, LLM models, patch types, etc. |

Quick Install (if Shinka is not set up yet)


Install once before creating/running tasks:

```bash
# Check if shinka is available in the workspace environment
python -c "import shinka"

# If not, install from PyPI
pip install shinka-evolve

# Or with uv
uv pip install shinka-evolve
```

Language Support (`initial.<ext>`)

Shinka supports multiple candidate-program languages. Choose one, then keep the extension/config/evaluator aligned.

| `evo_config.language` | `initial.<ext>` |
| --- | --- |
| `python` | `initial.py` |
| `julia` | `initial.jl` |
| `cpp` | `initial.cpp` |
| `cuda` | `initial.cu` |
| `rust` | `initial.rs` |
| `swift` | `initial.swift` |
| `json` / `json5` | `initial.json` |

Rules:
  • `evaluate.py` stays the evaluator entrypoint.
  • Python candidates: prefer `run_shinka_eval` + `experiment_fn_name`.
  • Non-Python candidates: evaluate via `subprocess` and write `metrics.json` + `correct.json`.
  • Always set both `evo_config.language` and a matching `evo_config.init_program_path`.

Template: `initial.<ext>` (Python example)


```py
import random

# EVOLVE-BLOCK-START
def advanced_algo():
    # Implement the evolving algorithm here.
    return 0.0, ""
# EVOLVE-BLOCK-END


def solve_problem(params):
    return advanced_algo()


def run_experiment(random_seed: int | None = None, **kwargs):
    """Main entrypoint called by evaluator."""
    if random_seed is not None:
        random.seed(random_seed)
    score, text = solve_problem(kwargs)
    return float(score), text
```

For non-Python `initial.<ext>`, keep the same idea: a small evolve region + a deterministic program interface consumed by `evaluate.py`.

Template: `evaluate.py` (Python `run_shinka_eval` path)


```py
import argparse
import numpy as np

from shinka.core import run_shinka_eval  # required for results storage


def get_kwargs(run_idx: int) -> dict:
    return {"random_seed": int(np.random.randint(0, 1_000_000_000))}


def aggregate_fn(results: list) -> dict:
    scores = [r[0] for r in results]
    texts = [r[1] for r in results if len(r) > 1]
    combined_score = float(np.mean(scores))
    text = texts[0] if texts else ""
    return {
        "combined_score": combined_score,
        "public": {},
        "private": {},
        "extra_data": {},
        "text_feedback": text,
    }


def validate_fn(result):
    # Return (True, None) or (False, "reason")
    return True, None


def main(program_path: str, results_dir: str):
    metrics, correct, err = run_shinka_eval(
        program_path=program_path,
        results_dir=results_dir,
        experiment_fn_name="run_experiment",
        num_runs=3,
        get_experiment_kwargs=get_kwargs,
        aggregate_metrics_fn=aggregate_fn,
        validate_fn=validate_fn,  # Optional
    )
    if not correct:
        raise RuntimeError(err or "Evaluation failed")


if __name__ == "__main__":
    # argparse program path & dir
    parser = argparse.ArgumentParser()
    parser.add_argument("--program_path", required=True)
    parser.add_argument("--results_dir", required=True)
    args = parser.parse_args()
    main(program_path=args.program_path, results_dir=args.results_dir)
```
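The `aggregate_fn` in this template can be sanity-checked on its own, without Shinka installed, by feeding it fake per-run results (`(score, text)` tuples). The sketch below re-implements it with a plain-Python mean instead of `np.mean` so it has no dependencies; the aggregation logic is otherwise the same.

```python
def aggregate_fn(results: list) -> dict:
    # Same aggregation as the template: mean score + first run's text feedback.
    scores = [r[0] for r in results]
    texts = [r[1] for r in results if len(r) > 1]
    return {
        "combined_score": float(sum(scores) / len(scores)),
        "public": {},
        "private": {},
        "extra_data": {},
        "text_feedback": texts[0] if texts else "",
    }


# Fake results from three runs, as run_shinka_eval would collect them.
metrics = aggregate_fn([(1.0, "run 0 ok"), (2.0, "run 1 ok"), (3.0, "")])
```

Checking the output this way before wiring up `run_shinka_eval` catches schema mistakes (a missing key, a non-numeric score) early.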

Template: `evaluate.py` (non-Python `initial.<ext>` path)


```py
import argparse
import json
import os
from pathlib import Path


def main(program_path: str, results_dir: str):
    os.makedirs(results_dir, exist_ok=True)

    # 1) Execute candidate program_path (subprocess / runtime-specific call)
    # 2) Compute task metrics + correctness
    metrics = {
        "combined_score": 0.0,
        "public": {},
        "private": {},
        "extra_data": {},
        "text_feedback": "",
    }
    correct = False
    error = ""

    (Path(results_dir) / "metrics.json").write_text(
        json.dumps(metrics, indent=2), encoding="utf-8"
    )
    (Path(results_dir) / "correct.json").write_text(
        json.dumps({"correct": correct, "error": error}, indent=2), encoding="utf-8"
    )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--program_path", required=True)
    parser.add_argument("--results_dir", required=True)
    args = parser.parse_args()
    main(program_path=args.program_path, results_dir=args.results_dir)
```
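For step 1 of this template, one possible contract, purely an assumption for illustration, is that the candidate prints a JSON object to stdout. A hedged sketch of running and parsing such a candidate (the helper `run_candidate` is hypothetical; match whatever contract your `evaluate.py` actually defines):

```python
import json
import subprocess


def run_candidate(cmd: list[str], timeout_s: int = 60):
    """Run the candidate and parse a JSON object from its stdout.

    `cmd` is runtime-specific, e.g. ["julia", program_path] or a compiled
    binary path. Returns (parsed_output, error_message); parsed_output is
    None on any failure, so the caller can fill correct.json accordingly.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None, "timed out"
    if proc.returncode != 0:
        return None, proc.stderr.strip() or f"exit code {proc.returncode}"
    try:
        return json.loads(proc.stdout), ""
    except json.JSONDecodeError as exc:
        return None, f"bad JSON on stdout: {exc}"
```

The error string it returns maps directly onto the `error` field of `correct.json`, and a non-None parsed output feeds the `metrics` dict.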

(Optional) Template: `run_evo.py` (async)


See `skills/shinka-setup/scripts/run_evo.py` for an example to edit.

(Optional) Template: `shinka.yaml`


See `skills/shinka-setup/scripts/shinka.yaml` for an example to edit.

Notes


  • Keep evolve markers tight; only code inside the region should evolve.
  • Keep the evaluator schema stable (`combined_score`, `public`, `private`, `extra_data`, `text_feedback`).
  • Python module path: ensure `experiment_fn_name` matches the function name in `initial.py`.
  • Non-Python path: ensure the evaluator/runtime contract matches the `initial.<ext>` CLI/I/O.
  • Higher `combined_score` values indicate better performance.