ian-gemini-web

Gemini Web Client

Supports:
  • Text generation
  • Image generation (download + save)
  • Automatic watermark removal (Gemini watermarks are removed by default)
  • Reference image upload (attach images for vision tasks)
  • Multi-turn conversations within the same executor instance (`keepSession`)
  • Experimental video generation (`generateVideo`): Gemini may return an async placeholder; download might require the Gemini web UI

Quick start

```bash
npx -y bun scripts/main.ts "Hello, Gemini"
npx -y bun scripts/main.ts --prompt "Explain quantum computing"
npx -y bun scripts/main.ts --prompt "A cute cat" --image cat.png
npx -y bun scripts/main.ts --promptfiles system.md content.md --image out.png

# Multi-turn conversation (agent generates unique sessionId)
npx -y bun scripts/main.ts "Remember this: 42" --sessionId my-unique-id-123
npx -y bun scripts/main.ts "What number?" --sessionId my-unique-id-123
```

Executor options (programmatic)

This skill is typically consumed via `createGeminiWebExecutor(geminiOptions)` (see `scripts/executor.ts`).
Key options in `GeminiWebOptions`:
  • `referenceImages?: string | string[]`
    Upload local images as references (vision input).
  • `keepSession?: boolean`
    Reuse Gemini `chatMetadata` to continue the same conversation across calls (required if you want reference images to persist across multiple messages).
  • `generateVideo?: string`
    Generate a video and (best-effort) download it to the given path. Gemini may return `video_gen_chip` (async); in that case you must open the Gemini web UI to download the result.
Notes:
  • `generateVideo` cannot be combined with `generateImage` / `editImage`.
  • When `keepSession=true` and `referenceImages` is set, reference images are uploaded once per executor instance.
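
The constraints in the notes above can be checked before an executor is created. The following is only a sketch: `GeminiWebOptionsSketch`, `checkGeminiOptions`, and `normalizeReferenceImages` are hypothetical names, the interface is reconstructed from the option list rather than copied from `scripts/executor.ts`, and the types given to `generateImage` and `editImage` are assumptions.

```typescript
// Assumed shape, reconstructed from the documented options; the real
// GeminiWebOptions in scripts/executor.ts may differ or carry more fields.
interface GeminiWebOptionsSketch {
  referenceImages?: string | string[];
  keepSession?: boolean;
  generateVideo?: string;
  generateImage?: string; // type is an assumption
  editImage?: string; // type is an assumption
}

// Hypothetical helper: accept a single path or a list and always return a list.
function normalizeReferenceImages(value?: string | string[]): string[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Hypothetical helper: collect human-readable problems with an options object.
function checkGeminiOptions(opts: GeminiWebOptionsSketch): string[] {
  const problems: string[] = [];

  // Documented rule: video generation excludes image generation and editing.
  if (
    opts.generateVideo !== undefined &&
    (opts.generateImage !== undefined || opts.editImage !== undefined)
  ) {
    problems.push("generateVideo cannot be combined with generateImage / editImage");
  }

  // Documented rule: reference images only persist when keepSession is enabled.
  if (normalizeReferenceImages(opts.referenceImages).length > 0 && !opts.keepSession) {
    problems.push("referenceImages will not persist across messages unless keepSession is true");
  }

  return problems;
}
```

An empty array means the options satisfy both documented rules; anything else can be logged or thrown before calling `createGeminiWebExecutor`.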

Commands

Text generation

```bash
# Simple prompt (positional)
npx -y bun scripts/main.ts "Your prompt here"

# Explicit prompt flag
npx -y bun scripts/main.ts --prompt "Your prompt here"
npx -y bun scripts/main.ts -p "Your prompt here"

# With model selection
npx -y bun scripts/main.ts -p "Hello" -m gemini-2.5-pro

# Pipe from stdin
echo "Summarize this" | npx -y bun scripts/main.ts
```

Image generation

```bash
# Generate image with default path (./generated.png)
npx -y bun scripts/main.ts --prompt "A sunset over mountains" --image

# Generate image with custom path
npx -y bun scripts/main.ts --prompt "A cute robot" --image robot.png

# Shorthand
npx -y bun scripts/main.ts "A dragon" --image=dragon.png
```

Output formats

```bash
# Plain text (default)
npx -y bun scripts/main.ts "Hello"

# JSON output
npx -y bun scripts/main.ts "Hello" --json
```

Options

| Option | Description |
| --- | --- |
| `--prompt <text>`, `-p` | Prompt text |
| `--promptfiles <files...>` | Read prompt from files (concatenated in order) |
| `--model <id>`, `-m` | Model: `gemini-3-pro` (default), `gemini-2.5-pro`, `gemini-2.5-flash` |
| `--image [path]` | Generate image, save to path (default: `generated.png`) |
| `--sessionId <id>` | Session ID for multi-turn conversation (agent generates unique ID) |
| `--list-sessions` | List saved sessions (max 100, sorted by update time) |
| `--json` | Output as JSON |
| `--login` | Refresh cookies only, then exit |
| `--cookie-path <path>` | Custom cookie file path |
| `--profile-dir <path>` | Chrome profile directory |
| `--help`, `-h` | Show help |

CLI note: `scripts/main.ts` supports text generation, image generation, and multi-turn conversations via `--sessionId`. Reference images and video generation are exposed via the executor API.
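
When the CLI is driven from another script, the documented flags can be assembled programmatically. The sketch below uses only flags listed in the options above; `CliRequest` and `buildCliArgs` are hypothetical names that are not part of the skill.

```typescript
// Hypothetical request shape covering a subset of the documented CLI flags.
interface CliRequest {
  prompt: string;
  model?: string;
  image?: string | true; // true = use the default output path
  sessionId?: string;
  json?: boolean;
}

// Build the argument list for `npx`, mirroring the documented invocations.
function buildCliArgs(req: CliRequest): string[] {
  const args = ["-y", "bun", "scripts/main.ts", "--prompt", req.prompt];
  if (req.model) args.push("--model", req.model);
  if (req.image === true) {
    args.push("--image"); // bare flag: save to the default path
  } else if (req.image) {
    args.push(`--image=${req.image}`); // documented shorthand form
  }
  if (req.sessionId) args.push("--sessionId", req.sessionId);
  if (req.json) args.push("--json");
  return args;
}
```

The resulting array can be handed to a process launcher such as `spawn("npx", args)` from `node:child_process`, which avoids shell-quoting problems with prompts that contain spaces or quotes.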

Models

  • `gemini-3-pro` - Default, latest model
  • `gemini-2.5-pro` - Previous generation pro
  • `gemini-2.5-flash` - Fast, lightweight

Authentication

First run opens Chrome to authenticate with Google. Cookies are cached for subsequent runs.

```bash
# Force cookie refresh
npx -y bun scripts/main.ts --login
```

Environment variables

| Variable | Description |
| --- | --- |
| `GEMINI_WEB_DATA_DIR` | Data directory |
| `GEMINI_WEB_COOKIE_PATH` | Cookie file path |
| `GEMINI_WEB_CHROME_PROFILE_DIR` | Chrome profile directory |
| `GEMINI_WEB_CHROME_PATH` | Chrome executable path |
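
One plausible way to resolve these variables is sketched below. It is an assumption, not the skill's actual logic: the fallback file names `cookies.json` and `chrome-profile`, the rule that specific variables override paths derived from the data directory, and the names `GeminiWebPaths` and `resolveGeminiWebPaths` are all hypothetical.

```typescript
// Hypothetical result shape; field names are illustrative only.
interface GeminiWebPaths {
  dataDir: string;
  cookiePath: string;
  chromeProfileDir: string;
  chromePath?: string;
}

// Resolve paths from an environment map. The caller supplies the default data
// directory because the source does not state how the default is chosen.
function resolveGeminiWebPaths(
  env: Record<string, string | undefined>,
  defaultDataDir: string,
): GeminiWebPaths {
  const dataDir = env.GEMINI_WEB_DATA_DIR || defaultDataDir;
  return {
    dataDir,
    // Assumed fallbacks: files placed inside the data directory.
    cookiePath: env.GEMINI_WEB_COOKIE_PATH || `${dataDir}/cookies.json`,
    chromeProfileDir: env.GEMINI_WEB_CHROME_PROFILE_DIR || `${dataDir}/chrome-profile`,
    // No fallback: leave Chrome discovery to the caller.
    chromePath: env.GEMINI_WEB_CHROME_PATH || undefined,
  };
}
```

Passing the environment as a parameter, for example `resolveGeminiWebPaths(process.env, "/tmp/gemini-web")`, keeps the function easy to test without mutating global state.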

Examples

Generate text response

```bash
npx -y bun scripts/main.ts "What is the capital of France?"
```

Generate image

```bash
npx -y bun scripts/main.ts "A photorealistic image of a golden retriever puppy" --image puppy.png
```

Get JSON output for parsing

```bash
npx -y bun scripts/main.ts "Hello" --json | jq '.text'
```
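
The same extraction can be done in TypeScript once the CLI's stdout has been captured. This sketch relies only on the `text` field that the `jq` example reads; any other fields in the JSON output are unknown here, and `extractText` is a hypothetical helper name.

```typescript
// Parse the CLI's --json output and return its `text` field.
// Throws if the payload is not a JSON object with a string `text`.
function extractText(stdout: string): string {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const text = (parsed as Record<string, unknown>).text;
  if (typeof text !== "string") {
    throw new Error("missing string field 'text'");
  }
  return text;
}
```

Validating the shape before use means a failed or partial response surfaces as an explicit error instead of an `undefined` value further down the pipeline.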

Generate image from prompt files

```bash
# Concatenate system.md + content.md as prompt
npx -y bun scripts/main.ts --promptfiles system.md content.md --image output.png
```

Multi-turn conversation

```bash
# Start a session with unique ID (agent generates this)
npx -y bun scripts/main.ts "You are a helpful math tutor." --sessionId task-abc123

# Continue the conversation (remembers context)
npx -y bun scripts/main.ts "What is 2+2?" --sessionId task-abc123
npx -y bun scripts/main.ts "Now multiply that by 10" --sessionId task-abc123

# List recent sessions (max 100, sorted by update time)
npx -y bun scripts/main.ts --list-sessions
```

Session files are stored in `~/Library/Application Support/ian-skills/gemini-web/sessions/<id>.json` and contain:
- `id`: Session ID
- `metadata`: Gemini chat metadata for continuation
- `messages`: Array of `{role, content, timestamp, error?}`
- `createdAt`, `updatedAt`: Timestamps
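
The session file layout above can be expressed as TypeScript types, together with a sketch of the "max 100, sorted by update time" listing. Several details are assumptions: the field types (the source gives only names), the newest-first ordering, and the names `SessionFile`, `SessionMessage`, `toMillis`, and `listRecentSessions`.

```typescript
// Assumed types: timestamps may be ISO strings or epoch milliseconds.
interface SessionMessage {
  role: string;
  content: string;
  timestamp: string | number;
  error?: string;
}

interface SessionFile {
  id: string;
  metadata: unknown; // Gemini chat metadata; structure not documented
  messages: SessionMessage[];
  createdAt: string | number;
  updatedAt: string | number;
}

// Normalize either timestamp representation to epoch milliseconds.
function toMillis(t: string | number): number {
  return typeof t === "number" ? t : Date.parse(t);
}

// Return at most `limit` sessions, most recently updated first (assumed order).
function listRecentSessions(sessions: SessionFile[], limit = 100): SessionFile[] {
  return [...sessions]
    .sort((a, b) => toMillis(b.updatedAt) - toMillis(a.updatedAt))
    .slice(0, limit);
}
```

Copying the array before sorting leaves the caller's list untouched, which matters if the same array is reused elsewhere.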

Watermark Removal

Generated PNG images automatically have Gemini watermarks removed using the Reverse Alpha Blending algorithm.
This is enabled by default for all PNG images. The algorithm is lossless and mathematically precise.
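
For intuition, the general idea behind reverse alpha blending can be sketched per color channel. This is not the skill's implementation: it assumes the watermark's color and opacity at each pixel are already known, and `blendChannel` and `unblendChannel` are hypothetical names. If a watermark is composited as `blended = alpha * logo + (1 - alpha) * original`, the original follows by solving for it.

```typescript
// Forward compositing of one 8-bit channel: the logo is drawn over the
// original with opacity `alpha` in the range [0, 1].
function blendChannel(original: number, logo: number, alpha: number): number {
  return Math.round(alpha * logo + (1 - alpha) * original);
}

// Reverse step: solve the compositing equation for the original value.
function unblendChannel(blended: number, logo: number, alpha: number): number {
  // A fully opaque logo hides the original completely; nothing to recover.
  if (alpha >= 1) return blended;
  const value = (blended - alpha * logo) / (1 - alpha);
  // Clamp to the valid 8-bit range after rounding.
  return Math.min(255, Math.max(0, Math.round(value)));
}
```

In this sketch the inversion is exact in real arithmetic, but rounding the blended value to 8 bits can shift the recovered value by a small amount that grows as `alpha` approaches 1.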