speak-with-profile

Profile-aware speech workflow for narrated notes, spoken drafts, audio summaries, accessibility reads, and other text-to-speech tasks. Use when one front-door workflow should resolve voice profiles, enforce disclosure, and apply manifest tracking before delegating to built-in `$speech` or a deterministic local CLI path.

NPX Install

```bash
npx skill4agent add gaelic-ghost/productivity-skills speak-with-profile
```


Speak With Profile

Use this skill as the front door for speech tasks when the workflow needs reusable profiles, consistent disclosure, and reproducible reporting. Treat built-in `$speech` as the primary synthesis engine, and keep this skill as the policy and profile adapter.

Purpose

  • Normalize speech requests with deterministic profile precedence.
  • Enforce disclosure and manifest policy consistently.
  • Delegate generation to `$speech` by default in Codex App/CLI conversations.
  • Keep a script fallback for deterministic local/automation execution.
  • Support productivity workflows such as narrated notes, spoken drafts, audio summaries, and hands-free review without dropping accessibility-friendly profile options.

Required UX pattern

  1. Route speech requests through `$speak-with-profile` first.
  2. Resolve profile/default/override values.
  3. Require disclosure text in end-user output context.
  4. Delegate generation to built-in `$speech` using resolved fields.
  5. Record effective configuration in a run manifest.
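The required UX pattern can be sketched as a small orchestration function. This is a hypothetical illustration, not the skill's real API: `speak_with_profile`, the `settings` keys, and the `synthesize` callback are all stand-ins for the wrapper's actual logic.

```python
# Hypothetical orchestration sketch; names are illustrative stand-ins.
DEFAULT_DISCLOSURE = "This audio was generated by AI text-to-speech."

def speak_with_profile(text, profile=None, overrides=None,
                       synthesize=lambda text, **settings: None):
    """Resolve settings, enforce disclosure, delegate, record a manifest."""
    profile = profile or {}
    # Step 2: resolve profile/default/override values (overrides win).
    settings = {"voice": "cedar", "speed": 1.0, "format": "mp3"}
    settings.update(profile.get("settings", {}))
    settings.update(overrides or {})
    # Step 3: disclosure is mandatory in end-user output.
    disclosure = profile.get("disclosure", DEFAULT_DISCLOSURE)
    # Step 4: delegate generation (stands in for built-in $speech).
    synthesize(text, **settings)
    # Step 5: record the effective configuration in a run manifest.
    return {"settings": settings, "disclosure": disclosure}
```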

Execution modes

  • `delegate` (default for conversational usage): use built-in `$speech` for synthesis.
  • `local-cli` (fallback for scripted/automation runs): invoke a compatible speech CLI via `scripts/speak_with_profile.py`.

Use `delegate` unless there is a concrete reason to run the local script path.

Inputs needed

  • Input source: either the exact text to speak or a path to a text file.
  • Profile choice: a `--profile` ID if the caller wants a named profile, or no profile if baseline defaults are acceptable.
  • Profile data source: a `--profiles-file` path when profile-based resolution is required.
  • Delivery guidance: any explicit voice, instructions, speed, or format overrides the caller wants to force.
  • Output intent: target output path or filename expectations for the generated audio.
  • Execution preference: whether built-in `$speech` is acceptable or deterministic `local-cli` behavior is required.
  • Playback intent: whether the generated file should also be played locally with `open` or `afplay`.
  • Optional skill-local workflow customization from `config/customization.yaml` or `config/customization.template.yaml`.

Hard constraints

  • Do not bypass this skill for speech tasks in this repository workflow.
  • Do not modify built-in `$speech` behavior assumptions; adapt inputs/outputs around it.
  • Never modify a compatible downstream speech CLI from this skill; adapt inputs and outputs around it.
  • Use built-in voices only.
  • Require `OPENAI_API_KEY` for live API calls.

Disclosure policy

Always include a clear disclosure in user-visible output when speech is produced, for example:
  • "This audio was generated by AI text-to-speech."
If a profile provides `disclosure`, use it. Otherwise, use the default disclosure above.
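The disclosure rule reduces to a small fallback, sketched here for clarity; the `disclosure` key follows the profile shape described in this document.

```python
# Minimal sketch of the disclosure fallback rule.
DEFAULT_DISCLOSURE = "This audio was generated by AI text-to-speech."

def resolve_disclosure(profile):
    # A profile-provided disclosure wins; otherwise use the default.
    return (profile or {}).get("disclosure") or DEFAULT_DISCLOSURE
```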

Profile resolution order

  1. Explicit wrapper flags (`--voice`, `--speed`, `--instructions`, `--format`, output path flags).
  2. Selected profile (`--profile`).
  3. Default profile from profiles file (`default_profile`).
  4. Baseline defaults (`voice=cedar`, `speed=1.0`, format `mp3`, model `gpt-4o-mini-tts-2025-12-15`).
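A resolver for this precedence might look like the sketch below. It assumes a selected profile replaces the default profile rather than layering on top of it, and that the profiles file maps IDs to setting dicts under a `profiles` key; check `references/wrapper-contract.md` for the authoritative behavior.

```python
# Illustrative four-level precedence resolver; file shape is an assumption.
BASELINE = {
    "voice": "cedar",
    "speed": 1.0,
    "format": "mp3",
    "model": "gpt-4o-mini-tts-2025-12-15",
}

def resolve_settings(flags, profiles, profile_id=None):
    resolved = dict(BASELINE)                               # 4. baseline defaults
    chosen = profile_id or profiles.get("default_profile")  # 2 wins over 3
    if chosen:
        resolved.update(profiles["profiles"][chosen])
    # 1. Explicit wrapper flags always win; unset flags are None.
    resolved.update({k: v for k, v in flags.items() if v is not None})
    return resolved
```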

Workflow configuration precedence

  1. Explicit user input and wrapper flags.
  2. Skill-local workflow customization in `config/customization.yaml`, when present.
  3. Skill-local defaults in `config/customization.template.yaml`.
  4. Workflow defaults described in this skill and `references/wrapper-contract.md`.

The local wrapper loads supported customization values from `config/customization.yaml` and `config/customization.template.yaml` at runtime. Use skill-local customization to guide wrapper defaults and agent decisions. `preferredExecutionMode` remains documentation-only because the wrapper script implements only `local-cli`, while the skill-level workflow still treats `delegate` as the default conversational path.
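The template-then-customization layering can be sketched as below. The flat `key: value` parser is a deliberate simplification for self-containment; a real implementation would use a YAML library such as PyYAML, and the file layout is inferred from the precedence list above.

```python
# Sketch of customization loading with template fallback (assumed layout).
from pathlib import Path

def parse_flat_yaml(text):
    """Parse only top-level 'key: value' lines; comments are skipped."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and ":" in line:
            key, _, value = line.partition(":")
            out[key.strip()] = value.strip()
    return out

def load_customization(skill_dir):
    merged = {}
    # Lowest priority first: template defaults, then user customization.
    for name in ("customization.template.yaml", "customization.yaml"):
        path = Path(skill_dir) / "config" / name
        if path.exists():
            merged.update(parse_flat_yaml(path.read_text()))
    return merged
```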

Workflow

  1. Resolve input text source (`--text` or `--text-file`).
  2. Resolve effective workflow settings using explicit input, skill-local customization, then workflow defaults.
  3. Resolve profile and defaults using `references/wrapper-contract.md`.
  4. Validate configuration against `references/profile-schema.md`.
  5. Choose execution mode:
     • `delegate`: call built-in `$speech` with resolved fields.
     • `local-cli`: run `scripts/speak_with_profile.py`.
  6. Emit/validate run manifest and include disclosure.
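For the `local-cli` branch, the wrapper invocation might be built like this. Only `--text`, `--profile`, and `--profiles-file` are documented in this skill; `--out` is an assumed output flag, so check the wrapper's actual interface before relying on it.

```python
# Hypothetical local-cli command construction; --out is an assumed flag.
import sys

def build_local_cli_command(text, profiles_file, profile_id, out_path):
    return [
        sys.executable, "scripts/speak_with_profile.py",
        "--text", text,
        "--profiles-file", profiles_file,
        "--profile", profile_id,
        "--out", out_path,
    ]
```

Running the result with `subprocess.run(cmd, check=True)` propagates a downstream CLI failure as an exception instead of masking it, matching the failure-mode policy below.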

Output Contract

  • `delegate`: return a user-visible result that confirms speech generation intent, includes the disclosure text, and surfaces the resolved speech settings that matter for the request.
  • `local-cli`: produce an audio file at the requested or resolved output path and write an adjacent `<audio>.manifest.json` file.
  • `local-cli` manifest/result reporting must include:
    • output audio path
    • manifest path
    • disclosure text
    • execution mode (`local-cli`)
    • playback result when `open` or `afplay` is used
  • Playback failures must remain explicit rather than being treated as silent success.
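A minimal manifest writer satisfying this contract might look like the sketch below. The key names and the exact manifest filename convention are assumptions beyond the required fields; verify against `scripts/validate_manifest.py`.

```python
# Minimal manifest writer for the local-cli contract (key names assumed).
import json
from pathlib import Path

def write_manifest(audio_path, disclosure, playback_result=None):
    manifest_path = Path(audio_path).with_suffix(".manifest.json")
    manifest = {
        "output_audio_path": str(audio_path),
        "manifest_path": str(manifest_path),
        "disclosure": disclosure,
        "execution_mode": "local-cli",
    }
    if playback_result is not None:
        # Record playback outcome explicitly; a failure must stay visible.
        manifest["playback_result"] = playback_result
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest_path
```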

Failure modes

  • Missing `OPENAI_API_KEY`: stop before generation and tell the caller that the local environment must provide the API key.
  • Missing text source or unreadable text file: fail fast rather than guessing input content.
  • Missing profiles file when `--profile` is requested: fail fast and require the caller to provide a valid JSON or YAML profiles file.
  • Invalid profiles file: stop on schema/parse errors and surface the validation failure clearly.
  • Unknown profile ID: fail with the known profile list from the supplied profiles file.
  • Input over 4096 characters: fail fast and require the caller to split or chunk the input before running.
  • Missing compatible speech CLI on the local CLI path: stop and report that `--tts-cli-path` must point to a valid speech CLI implementation.
  • Downstream speech CLI failure: propagate the subprocess failure rather than masking it.
  • Playback failure with `open` or `afplay`: return the deterministic playback failure outcome, record it in the manifest, and respect `--stop-on-error`/`--no-stop-on-error`.
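Several of these failure modes are simple pre-flight checks. The sketch below mirrors them with illustrative exception types; the 4096-character limit matches the documented constraint, while the signature and messages are assumptions.

```python
# Fail-fast pre-flight checks mirroring the documented failure modes.
import os

MAX_INPUT_CHARS = 4096  # documented input limit

def preflight(text, profiles=None, profile_id=None, env=os.environ):
    if "OPENAI_API_KEY" not in env:
        raise RuntimeError("OPENAI_API_KEY must be set for live API calls")
    if not text:
        raise ValueError("no input text provided")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} chars; split or chunk it")
    if profile_id is not None:
        if profiles is None:
            raise ValueError("--profile requires a valid --profiles-file")
        if profile_id not in profiles:
            raise KeyError(f"unknown profile {profile_id!r}; known: {sorted(profiles)}")
```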

References

  • Profile schema and validation rules: `references/profile-schema.md`
  • Starter profile set and examples: `references/starter-profiles.md`
  • Adapter contract and mode behavior: `references/wrapper-contract.md`
  • Skill-local workflow customization: `references/customization.md`

Validation helper

Use `scripts/validate_manifest.py` to verify required manifest keys:

```bash
uv run --group dev python speak-with-profile/scripts/validate_manifest.py path/to/output/file.manifest.json
```