Generate a branded YouTube thumbnail from a video title. Uses a reference photo of the creator, high-CTR thumbnail principles, and brand colours to produce a ready-to-generate image prompt for Gemini. Use this skill whenever the user says "thumbnail", "youtube thumbnail", "build me a thumbnail", or wants a video cover image before writing the script. The thumbnail-first workflow mirrors the graphic-first approach for LinkedIn: it sells the video before anyone hears a word of the script.
npx skill4agent add charlie947/social-media-skills youtube-thumbnail
thumbnail-config.md
brand-kit.md
about-me.md
Upload or provide the path to the reference photo of yourself that you want used in the thumbnail. Ideally, this is a clear headshot with distinctive lighting and an expression you plan to reuse across videos for brand consistency.
[
  {
    "question": "What is the video title?",
    "header": "Title",
    "multiSelect": false,
    "options": [
      {"label": "I will type the title", "description": "Type the full working title"},
      {"label": "Suggest one", "description": "Given the topic, propose 3 click-worthy titles first"}
    ]
  },
  {
    "question": "Emotional tone?",
    "header": "Tone",
    "multiSelect": false,
    "options": [
      {"label": "Shock / surprise", "description": "Wide eyes, open mouth, bold reaction"},
      {"label": "Curious / thinking", "description": "Slight smirk, raised eyebrow, gaze off-frame"},
      {"label": "Confident / direct", "description": "Eye contact, calm, assertive"},
      {"label": "Frustrated / strong take", "description": "Intense gaze, hand gesture, tension"}
    ]
  }
]
THUMBNAIL BRIEF: [video title]
Composition: [face position, % of frame, direction of gaze]
Text: "[hook phrase, 3-5 words]"
Text placement: [left, right, top, wraps around face]
Colour palette: [primary hex], [accent hex], [background hex]
Supporting element: [logo / prop / arrow / number]
Emotional tone: [tone from Step 1]
Here's the brief. Say "generate" to output the image prompt or tell me what to change.
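The brief above is a fixed template with bracketed slots, so it can also be filled in programmatically. The following is a minimal sketch, not part of the skill itself: the field names (`title`, `hook`, `placement`, and so on), the helper functions, and the example values are all hypothetical and chosen only for illustration. It also checks the one numeric rule the brief states, a hook phrase of 3-5 words.

```python
from string import Template

# Hypothetical field names: the skill only defines the bracketed slots,
# not how they are stored. Adjust to match your own setup.
BRIEF_TEMPLATE = Template(
    "THUMBNAIL BRIEF: $title\n"
    "Composition: $composition\n"
    'Text: "$hook"\n'
    "Text placement: $placement\n"
    "Colour palette: $primary, $accent, $background\n"
    "Supporting element: $supporting\n"
    "Emotional tone: $tone"
)


def hook_word_count_ok(hook: str) -> bool:
    """Return True if the hook phrase has 3-5 words, as the brief asks."""
    return 3 <= len(hook.split()) <= 5


def render_brief(fields: dict) -> str:
    """Fill the brief template; raises KeyError if a slot is missing."""
    if not hook_word_count_ok(fields["hook"]):
        raise ValueError("hook phrase should be 3-5 words")
    return BRIEF_TEMPLATE.substitute(fields)


# Made-up example values, purely illustrative.
example = {
    "title": "I Tested 5 AI Tools for a Week",
    "composition": "face on the right, 40% of frame, gaze toward the text",
    "hook": "5 Tools, 1 Week",
    "placement": "left",
    "primary": "#FFD400",
    "accent": "#FF3B30",
    "background": "#101010",
    "supporting": "large number 5",
    "tone": "Curious / thinking",
}

print(render_brief(example))
```

Using `string.Template` keeps the literal square brackets and quotes in the brief from clashing with the placeholder syntax, which would need escaping with `str.format`.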
Using the attached reference photo of me, generate a YouTube thumbnail at 1280 x 720 pixels (16:9).
Composition:
- Place me [left / right / centre] filling [30-50]% of the frame
- My expression: [tone details — e.g., shocked with wide eyes and open mouth]
- My gaze: [direction — e.g., looking directly at camera / looking off-frame toward the text]
Text:
- Display "[hook phrase]" in large bold sans-serif typography
- Text colour: [hex]
- Text outline: [colour, thickness for readability]
- Text placement: [specific area]
Colour palette:
- Primary: [hex]
- Accent: [hex]
- Background: [hex] — [describe treatment: flat, gradient, blurred scene, etc.]
Supporting element: [specific description of the supporting visual]
Constraints:
- Face must be clear and sharp
- Text must be readable at 320px wide (YouTube mobile size)
- No watermarks, no YouTube UI elements, no bottom-right corner text
- High contrast between face, text, and background
Paste this into a new Gemini chat, attach your reference photo, enable Create Image, and select Nano Banana. Generate at 1280x720.
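The readability constraint in the prompt comes down to simple scaling arithmetic: a 1280 x 720 image shown at 320px wide is shrunk to a quarter of its size, so any text shrinks by the same factor. The sketch below works this out; the function names and the 120px example text height are assumptions for illustration, and it deliberately sets no minimum legible size, since the source does not give one.

```python
FULL_WIDTH, FULL_HEIGHT = 1280, 720  # generation size from the prompt
MOBILE_WIDTH = 320                   # mobile width cited in the constraints


def mobile_size(width=FULL_WIDTH, height=FULL_HEIGHT, target_width=MOBILE_WIDTH):
    """Return the (width, height) of the thumbnail when scaled to target_width."""
    scale = target_width / width
    return target_width, round(height * scale)


def text_height_on_mobile(text_height_px, width=FULL_WIDTH, target_width=MOBILE_WIDTH):
    """Return how tall text of text_height_px appears after downscaling."""
    return text_height_px * target_width / width


print(mobile_size())               # (320, 180)
print(text_height_on_mobile(120))  # 30.0
```

In practice this means lettering that looks large on the full-size canvas is only a quarter as tall on a phone, which is why the brief limits the hook to a few bold words.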
Want me to outline the video next (hook, middle, CTA) based on the thumbnail? Or call the create skill if you have one.