Referenced identifiers: `references/code-templates.yaml`, `job_payload_builder`, `skills/platform/<platform>/SKILL.md`, `registry_write.<platform>`, `readiness_check`, `job_id`, `network_arch`, `"latest"`, `request.registry_read`, `max_tokens`, `top_p`, `temperature`, `stop.registry_read`, `references/service.yaml`, `references/request.yaml`.

Referenced identifiers (secrets handling): `export HF_TOKEN=...`, `os.environ["VAR_NAME"]`, `references/service.yaml`, `secrets_handling`, `HF_TOKEN`, `WANDB_API_KEY`, `CLEARML_API_ACCESS_KEY`, `CLEARML_API_SECRET_KEY`, `TAO_API_KEY`, `TAO_USER_KEY`, `network_arch`, `model_path`, `num_gpus`, `WANDB_*`, `CLEARML_*_HOST`.

| Input | Role |
|---|---|
| `network_arch` | Chooses the container image and the per-arch inner command shape. |
| `model_path` | The trained model checkpoint. Valid forms: … |
| `platform` | Compute platform: … |
| `num_gpus` | Defaults to 1; minimum 1 for inference. |
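
The secret names referenced before the inputs table (`HF_TOKEN`, `WANDB_API_KEY`, and so on) are exported in the shell (`export HF_TOKEN=...`) and read through `os.environ["VAR_NAME"]`. The following is a minimal sketch of checking that the needed ones are present without ever printing their values. The helper name `missing_secrets` and the choice of which names a given run requires are assumptions for illustration, not something taken from `references/service.yaml`.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Secret names as listed in this document; which ones a run needs is an assumption.
KNOWN_SECRETS = (
    "HF_TOKEN",
    "WANDB_API_KEY",
    "CLEARML_API_ACCESS_KEY",
    "CLEARML_API_SECRET_KEY",
    "TAO_API_KEY",
    "TAO_USER_KEY",
)


def missing_secrets(
    required: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    absent = missing_secrets(["HF_TOKEN"])
    if absent:
        # Report names only; never echo secret values.
        print("Missing: " + ", ".join(absent) + " (set with e.g. export HF_TOKEN=...)")
```

Passing `environ` explicitly keeps the check easy to exercise without touching the real process environment.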

Referenced identifiers: `network_arch`, `{network_arch}.config.json`, `api_params.image`, `COSMOS_RL`, `docker_image_defaults.mapping`, `references/service.yaml`, `IMAGE_<KEY>`, `IMAGE_COSMOS_RL`, `versions.yaml`, `tao_toolkit.cosmos_rl`, `nvcr.io/...`, `images.<group>.<name>`, `references/code-templates.yaml`, `spec_params.inference.model_path`, `folder`.

Referenced identifiers: `env_payload`, `env_json`, `TAO_LOGGING_SERVER_URL`, `TAO_ADMIN_KEY`, `TAO_EXECUTION_BACKEND`.

| Platform | |
|---|---|
| local-docker | |
| brev | |
| lepton | |
| slurm | |
| kubernetes | |

Referenced identifiers: `CLOUD_BASED`, `"False"`, `TAO_LOGGING_SERVER_URL`, `--runtime=nvidia`, `NVIDIA_DRIVER_CAPABILITIES=all`, `NVIDIA_VISIBLE_DEVICES=<ids>`, `device_requests`.

Referenced identifiers: `skills/platform/<name>/SKILL.md`.

Referenced identifiers: `network_arch`, `references/service.yaml`, `container_commands.<network_arch>`, `references/code-templates.yaml`, `job_payload_builder.<network_arch>`, `umask 0 &&`, `job_id`, `uuid.uuid4()`, `image`, `access_key`, `secret_key`, `HF_TOKEN`, `container_commands`, `cosmos-rl`, `--job '<JOB_JSON>' --docker_env_vars '<ENV_JSON>'`, `json.dumps(...)`, `shlex.quote(...)`, `env_payload`, `TAO_EXECUTION_BACKEND`, `TAO_API_JOB_ID`, `CLOUD_BASED=False`, `cosmos-predict2.5`, `cosmos_predict inference_microservice start ... --port 8080`, `setup.`, `tyro.conf.OmitArgPrefixes`, `--job`, `--docker_env_vars`, `model_path`, `--checkpoint-path`, `--model <registered_key>`, `hf_model://`, `CLOUD_BASED`.

Referenced identifiers: `skills/platform/<platform>/SKILL.md`.

| Parameter | Value |
|---|---|
| resolved container image (Section 2) |
| |
| |
| |
| job / container name | |
| host-side port to bind to container port 8080. Default |

| Platform | Additional inputs |
|---|---|
| local-docker | None beyond base |
| brev | |
| lepton | |
| slurm | |
| kubernetes | |
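
The command-construction identifiers in this section (`umask 0 &&`, `--job '<JOB_JSON>'`, `--docker_env_vars '<ENV_JSON>'`, `json.dumps(...)`, `shlex.quote(...)`, `uuid.uuid4()`) suggest that the job and environment payloads are serialized to JSON and shell-quoted before being placed on the command line. A minimal sketch under that reading follows. The helper name `build_inner_command`, the `example-entrypoint` executable, and the `TAO_EXECUTION_BACKEND` value are hypothetical; the real per-arch command shape lives in `references/service.yaml` under `container_commands.<network_arch>` and is not reproduced here.

```python
import json
import shlex
import uuid


def build_inner_command(entrypoint: str, job: dict, env_payload: dict) -> str:
    """Compose the shell command string (hypothetical helper).

    `entrypoint` stands in for the per-arch command defined in
    references/service.yaml, which this sketch does not know.
    """
    # json.dumps produces one string; shlex.quote makes it a single safe shell word.
    job_json = shlex.quote(json.dumps(job))
    env_json = shlex.quote(json.dumps(env_payload))
    return f"umask 0 && {entrypoint} --job {job_json} --docker_env_vars {env_json}"


job_id = str(uuid.uuid4())
env_payload = {
    "TAO_EXECUTION_BACKEND": "local-docker",  # illustrative value, an assumption
    "TAO_API_JOB_ID": job_id,
    "CLOUD_BASED": "False",
}
command = build_inner_command("example-entrypoint", {"job_id": job_id}, env_payload)
```

Quoting the serialized JSON rather than hand-writing single quotes keeps the command intact even when a payload value itself contains a quote character.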

Referenced identifiers: `-p <host_port>:8080`, `job_id`, `/tmp/tao-inf-ms-state.json`, `host_port`, `instance_id`, `host_port = next(p for p in range(8080, 8200) if p not in used_ports)`, `8080`, `host_url`, `bind: address already in use`.
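
The port-selection expression `host_port = next(p for p in range(8080, 8200) if p not in used_ports)` can be sketched as a small function over the registry in `/tmp/tao-inf-ms-state.json`. This is an illustration only: the helper names `load_state` and `pick_host_port`, the treatment of a missing file as an empty registry, and the explicit error on an exhausted range are assumptions.

```python
import json
from pathlib import Path

STATE_PATH = Path("/tmp/tao-inf-ms-state.json")


def load_state(path: Path = STATE_PATH) -> dict:
    """Read the registry; a missing file is treated as empty (an assumption)."""
    if not path.exists():
        return {}
    return json.loads(path.read_text())


def pick_host_port(state: dict, start: int = 8080, stop: int = 8200) -> int:
    """Return the first port in [start, stop) not recorded by a registry entry."""
    used_ports = {
        entry["host_port"]
        for entry in state.values()
        if isinstance(entry, dict) and "host_port" in entry
    }
    port = next((p for p in range(start, stop) if p not in used_ports), None)
    if port is None:
        raise RuntimeError(f"no unrecorded port in range {start}-{stop - 1}")
    return port
```

This only avoids ports already recorded in the registry. A port held by an unrelated process can still produce `bind: address already in use` when the container starts.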

Referenced identifiers: `/tmp/tao-inf-ms-state.json`, `job_id`, `"latest"`, `references/code-templates.yaml`, `registry_write.<platform>`.

| Platform | `host_url` | `instance_id` | Extra step before writing |
|---|---|---|---|
| local-docker | | — | None |
| brev | | — | |
| lepton | Lepton endpoint URL | | Poll |
| slurm | | SLURM scheduler job ID | Wait until Running; SSH port-forward |
| kubernetes | | k8s job name | |
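
The registry write described by the table above can be sketched as follows. The field names (`network_arch`, `host_url`, `instance_id`, `host_port`, `started_at`) all appear in this document, but the helper name `write_registry_entry`, the assumption that `"latest"` stores the newest `job_id`, and the write-then-rename step are illustrative choices, not the contents of `registry_write.<platform>` in `references/code-templates.yaml`.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def write_registry_entry(state_path, job_id, network_arch, host_url,
                         instance_id=None, host_port=None):
    """Add one service entry and update "latest" (hypothetical helper)."""
    try:
        with open(state_path) as fh:
            state = json.load(fh)
    except FileNotFoundError:
        state = {}
    state[job_id] = {
        "network_arch": network_arch,
        "host_url": host_url,
        "instance_id": instance_id,
        "host_port": host_port,
        "started_at": datetime.now(timezone.utc).isoformat(),
    }
    state["latest"] = job_id  # assumption: "latest" holds the newest job_id
    # Write to a temp file in the same directory, then rename over the target,
    # so a concurrent reader never sees a half-written file.
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh, indent=2)
    os.replace(tmp_path, state_path)
    return state
```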

```python
# Assumes `state`, `job_id`, and `network_arch` were set by the registry-write step.
print("Inference service started.")
print(f"  Job ID : {job_id}")
print(f"  Arch   : {network_arch}")
print(f"  URL    : {state[job_id]['host_url']}/v1/chat/completions")
print("Use this Job ID to send requests or stop the service.")
```

Referenced identifiers: `references/code-templates.yaml`, `readiness_check`.

Referenced identifiers: `job_id`, `state["latest"]`, `references/code-templates.yaml`, `stop.registry_read`, `skills/platform/<platform>/SKILL.md`.

| Platform | Identifier to pass | Extra cleanup |
|---|---|---|
| local-docker | | None |
| brev | | None |
| lepton | | None |
| slurm | | |
| kubernetes | | |

`entry = state[job_id_to_stop]`

Referenced identifiers: `references/code-templates.yaml`, `stop.registry_cleanup`.
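
After a service is stopped, its registry entry (`entry = state[job_id_to_stop]`) has to be removed. A minimal sketch of that cleanup on the in-memory registry is shown below. The helper name `remove_registry_entry` and the policy of repointing `"latest"` to the most recently started remaining job are assumptions; the actual behaviour is defined by `stop.registry_cleanup` in `references/code-templates.yaml`.

```python
from typing import Tuple


def remove_registry_entry(state: dict, job_id_to_stop: str) -> Tuple[dict, dict]:
    """Return (removed entry, new registry) without mutating `state` (hypothetical helper)."""
    entry = state[job_id_to_stop]  # a KeyError here means the job_id is unknown
    remaining = {
        key: value for key, value in state.items()
        if key not in (job_id_to_stop, "latest")
    }
    if state.get("latest") == job_id_to_stop:
        # Assumption: fall back to the most recently started remaining job, if any.
        newest = max(
            remaining,
            key=lambda j: remaining[j].get("started_at", ""),
            default=None,
        )
        if newest is not None:
            remaining["latest"] = newest
    elif "latest" in state:
        remaining["latest"] = state["latest"]
    return entry, remaining
```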

Referenced identifiers: `job_id`, `network_arch`, `state`, `candidates = [j for j, e in state.items() if j != "latest" and isinstance(e, dict) and e["network_arch"] == arch]`, `started_at`, `"latest"`, `state["latest"]`, `host_url`, `references/code-templates.yaml`, `request.registry_read`, `user_provided_job_id`, `readiness_check`, `guidance`, `num_steps`, `seed`, `negative_prompt`, `image_url`, `video_url`, `state[job_id]["network_arch"]`.

Referenced identifiers: `AskUserQuestion`, `references/request.yaml`, `chat_completions_request_body`, `max_tokens`, `top_p`, `temperature`, `network_arch_constraints.<network_arch>`, `guidance`, `num_steps`, `seed`, `negative_prompt`, `cosmos-predict2.5`.

Request: `POST {BASE_URL}/v1/chat/completions` with `Content-Type: application/json`. Referenced identifiers: `references/request.yaml`, `chat_completions_request_body`, `code_examples`, `data:`, `network_arch_constraints`.

| HTTP status | Meaning | Action |
|---|---|---|
| 200 | Success — | Read result |
| 202 | Server still initializing or model still loading | Retry after a delay |
| 503 | Initialization failed, model load failed, or model not yet ready | Inspect |
| 400 | Missing or empty JSON body | Fix request |
| 500 | Unhandled exception during inference | Check container logs |
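
The status table above maps directly onto a small lookup. The sketch below also folds in the error body shape `{"error": {"type": ..., "message": ...}}` used by this service. The helper name `classify_response` and the short action labels are hypothetical and chosen only for illustration.

```python
from typing import Optional, Tuple

# Actions mirror the HTTP status table above.
STATUS_ACTIONS = {
    200: ("ok", "read result"),
    202: ("retry", "server initializing or model loading; retry after a delay"),
    400: ("fix_request", "missing or empty JSON body"),
    500: ("check_logs", "unhandled exception during inference"),
    503: ("inspect", "initialization or model load failed, or model not yet ready"),
}


def classify_response(status: int, body: Optional[dict] = None) -> Tuple[str, str]:
    """Map a status code to (action, detail); labels are hypothetical."""
    action, detail = STATUS_ACTIONS.get(
        status, ("unknown", f"unexpected status {status}")
    )
    error = (body or {}).get("error")
    if isinstance(error, dict) and error.get("message"):
        # Prefer the server-provided reason when the error body is present.
        detail = f"{error.get('type', 'error')}: {error['message']}"
    return action, detail
```

A caller would typically loop while the action is `retry` and stop on anything else.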

`{"error": {"type": "<error_type>", "message": "<reason>"}}`

Referenced identifiers: `container_response_shapes`, `references/request.yaml`.
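
The request-side registry lookup referenced earlier (`user_provided_job_id`, the `candidates` list filtered by `network_arch`, `started_at`, and `state["latest"]`) can be sketched as one resolution function. The precedence used here (explicit job ID first, then newest match for an architecture, then `"latest"`) and the helper name `resolve_job_id` are assumptions; `request.registry_read` in `references/code-templates.yaml` is the authoritative version.

```python
from typing import Optional


def resolve_job_id(
    state: dict,
    user_provided_job_id: Optional[str] = None,
    arch: Optional[str] = None,
) -> str:
    """Pick the registry entry a request should target (hypothetical helper)."""
    if user_provided_job_id:
        if user_provided_job_id not in state:
            raise KeyError(f"unknown job_id: {user_provided_job_id}")
        return user_provided_job_id
    if arch:
        candidates = [
            j for j, e in state.items()
            if j != "latest" and isinstance(e, dict) and e.get("network_arch") == arch
        ]
        if not candidates:
            raise LookupError(f"no registered service for network_arch={arch}")
        # Assumption: prefer the most recently started candidate.
        return max(candidates, key=lambda j: state[j].get("started_at", ""))
    latest = state.get("latest")
    if not latest:
        raise LookupError("registry has no 'latest' entry")
    return latest  # assumption: "latest" stores a job_id
```

The resolved ID then gives the target through `state[job_id]["host_url"]` and the per-arch request constraints through `state[job_id]["network_arch"]`.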