clone-ui
Faithful, multi-source web UI cloning. Optimized for fidelity over speed: the goal is "looks identical to the source" first, "fits the project conventions" second.
Fidelity tiers (read this first)
The output quality of this skill scales with the source material available. Be upfront with the user about what tier you're working in:
| Tier | Inputs available | Achievable result |
|---|
| A — Full source | Live screenshot via browser MCP + rendered DOM + computed styles | "close visual match" → "pixel-perfect" possible |
| B — Static fetch | WebFetch HTML works (no JS hydration) + screenshot user provided | "close visual match" likely |
| C — Provided assets | User-supplied screenshot/HTML, no live access | "close visual match" if assets are good |
| D — Memory only | No fetchable source, no screenshot — only training data | "rough sketch" max — say so explicitly |
If you land in Tier D,
stop and tell the user before writing code. A clone built from training data is almost certainly stale (sites change copy/layout often) and the user is better served by capturing a screenshot first. Offer to walk them through
MCP setup or ask for a manual capture rather than producing low-fidelity output silently.
Optional but strongly recommended: Chrome DevTools MCP
When
chrome-devtools-mcp is installed and active, you have access to:
- — capture the current viewport (use multiple viewports for responsive)
- — DOM + accessibility tree (post-hydration, includes JS-rendered content)
- Network/console inspection for sites that need login flow
These tools elevate every clone from Tier B/C to Tier A. If the user asks you to clone a live URL and these tools are NOT available, mention it once at the start: "I'd recommend installing chrome-devtools-mcp for higher-fidelity clones — see the skill's setup section. I'll proceed with what's available."
If unsure whether the MCP is available, just try calling
once early in Phase 2 — if it fails, you know you're in Tier B or below.
Concurrency: chrome-devtools is single-browser
runs
one shared Chrome instance per Claude Code session. If multiple subagents spawn in parallel and each tries to clone a different URL, they will fight over the same browser tab focus — one agent's
/
/
switches the active tab away from a sibling's work, and the sibling's next
or
then runs against the wrong page.
Two mitigations, in order of preference:
- Use isolated contexts. When opening a new page, pass (or the equivalent flag for whichever -style call your MCP version supports). Each agent gets its own browser context with its own page list — no cross-contamination.
- Serialize runs. If isolated contexts aren't available or reliable, run sw-clone tasks one at a time. The skill's per-task duration (~3-5 minutes for a real Tier A clone) makes this acceptable for most workflows.
Symptom that you're hitting the focus drift bug: a
returns the wrong page, or an
returns DOM from a URL you didn't navigate to. If you see this, recover by calling
(which auto-selects) and re-navigating.
Auth-gated UIs (logged-in views)
launches Chrome with
by default — a fresh user-data-dir with no cookies, no extensions, no logged-in sessions. This is great for repeatability, but it means:
The MCP can only directly observe what a logged-out visitor sees. If the user asks you to clone a logged-in view (the GitHub authenticated app header, Gmail inbox, Linear's project view, any SaaS dashboard),
will return either the marketing/landing version of the page or a sign-in prompt — not the target the user wants.
When this happens, be honest about what tier of source material you actually have:
- Visual tokens (color palette, typography, font sizes, button styles) usually still match between logged-out and logged-in surfaces of the same product → Tier A for tokens.
- Layout, components, copy of the logged-in page itself → unobservable → Tier D (memory only) for those parts.
Report this honestly in your output as "Tier mixed: tokens A, layout D" rather than claiming a uniform tier. The user gets to decide whether mixed-tier output is good enough or whether they want to take a manual screenshot of the logged-in view and provide it as Tier C input.
If the user explicitly opts into observing the logged-in view, two paths forward:
- Ask them to drop the flag from their chrome-devtools entry and restart Claude Code so the MCP attaches to a Chrome with their existing session cookies. Trade-off: their personal browsing state becomes visible to the agent.
- Ask them to take a manual screenshot of the logged-in view and provide its file path. The skill operates on that screenshot as a Tier C input.
Don't silently fall back to memory and pretend you observed a logged-in surface. That produces misleading output and erodes user trust in the skill.
When to use
Trigger this skill whenever the user gives you:
- A screenshot (single or multi-viewport)
- A live URL to clone
- Raw HTML/CSS
- A Figma frame export
- Any combination of the above
…and asks you to build, recreate, match, or implement that UI in their codebase.
When not to use
- The user asks for an original design ("design a hero section for X") — that's a creative task, not a clone task.
- The user asks for code review of an existing implementation — different skill.
- The user wants only data extraction (e.g. scraping content from HTML) without rebuilding the UI.
The six-phase flow
Cloning quality collapses when phases are skipped. Resist the urge to jump to "write the code" — every skipped phase shows up as visible drift in the final result.
- Inventory inputs — figure out what sources of truth you have
- Gather — fetch / read every available source
- Plan — break the target into components, identify breakpoints + interactive states
- Implement — translate sources into code in the user's stack
- Verify — compare side-by-side, list every drift
- Polish — fix drifts in priority order until parity
Phase 1 — Inventory inputs
List what you have. Each input type has different fidelity:
| Input | Fidelity | Limitations |
|---|
| Screenshot (PNG/JPG) | Visual truth — what user actually sees | No exact color values, no font names, no exact px |
| Live URL | Highest — rendered DOM + computed styles | May be auth-gated, may rate-limit, JS-heavy sites need real browser |
| Raw HTML (view-source) | Markup truth, but pre-hydration | Missing JS-rendered content, inlined styles only |
| Rendered HTML (post-hydration) | DOM truth | Still no computed styles unless captured |
| Computed styles dump (JSON / DevTools export) | Style truth — exact px/colors/fonts | Tied to one viewport + state |
| Figma export | Design truth — exact tokens | May not match actual rendered site |
If the user only gave one source, ask if more are available before starting:
"I have the screenshot. Do you also have the live URL or raw HTML? Multi-source clones are dramatically more accurate — even view-source HTML helps."
If only a screenshot is available, that's still workable, but flag the lower fidelity ceiling upfront.
Phase 2 — Gather
For each input the user has, pull it into context:
Screenshot
the file path. The image is your visual truth — refer back to it constantly.
If multiple viewport screenshots exist (e.g.
,
), open the
largest first to understand the desktop layout, then each smaller width to map the responsive transitions.
Live URL
Preferred path (Tier A): Chrome DevTools MCP. If
and
are available, use them —
returns post-hydration DOM (handles SPAs cleanly) and
gives you visual truth. Capture at minimum 1440px (desktop) and 375px (mobile) viewports; add 768px (tablet) if the layout has 3+ breakpoints.
Fallback path (Tier B): WebFetch. Grabs markdown-converted content. WebFetch does NOT render JavaScript — it gives you the static HTML response only.
For JS-heavy SPAs (React, Vue, Next.js apps), WebFetch will return a near-empty
. WebFetch may also be denied by some sites' bot detection (Cloudflare, etc.). In either case, the right move is to either:
- Ask the user to install Chrome DevTools MCP (see Setup section)
- Ask the user to capture the rendered DOM (via DevTools "Copy outerHTML" on ) and provide it as raw HTML
- Ask the user for a screenshot
Do not silently fall back to building from training-data memory — that produces Tier D output even when the user thinks they're getting Tier B. Tell them upfront what tier you're operating in.
Raw HTML
the file or paste. Look for:
- Class names → likely Tailwind, BEM, CSS modules, or custom
- Inline styles → exact values to honor
- → fetch those CSSes too if user has them
- blocks → inline CSS rules
- Font imports → Google Fonts links, declarations
Computed styles / context
If the user provides a context dump (JSON, markdown, CSS variable list),
it. These are
gold — exact values beat eyeballed values every time. Prioritize them as source of truth.
Phase 3 — Plan
Before writing any code, produce these four artifacts (briefly — bullet lists, not essays):
3a. Component breakdown
Break the target into logical components. Match the user's project conventions — if they use functional React with hooks, plan that way; if they use Vue SFCs, plan that way.
HomePage
├── HeroSection
│ ├── Logo
│ ├── NavMenu
│ └── HeadlineGroup
├── FeatureGrid (3 cards)
└── Footer
3b. Breakpoint map
Identify breakpoints from screenshots OR CSS (
queries). Common patterns:
| Width | Behavior |
|---|
| ≥ 1280px | Desktop: 3-col grid, full nav |
| 768–1279px | Tablet: 2-col grid, condensed nav |
| < 768px | Mobile: 1-col, hamburger menu |
3c. Design tokens
Extract or eyedrop:
- Colors: primary, secondary, text, bg, border (hex codes — exact)
- Typography: font family, weights used, base size, line-height, scale ratio
- Spacing: detect the grid (4px? 8px?), list common gaps
- Radius: card radius, button radius
- Shadows: subtle? prominent? elevation levels
If you have computed styles → use those values verbatim. If only screenshot → use a color picker mentally and round to the closest sensible value (e.g.
not
).
3d. Interactive states
Even from a single screenshot, infer states from visible cues:
- Buttons usually have , ,
- Inputs have , ,
- Cards may have lift
- Nav items may have styling
If the user provides a live URL or context dump with
rules, capture those exactly.
Phase 4 — Implement
Stack detection
bash
# Use Read on package.json
| Detected dependency | Stack |
|---|
| Next.js (App or Pages router — check vs ) |
| (no Next) | React + Vite/CRA |
| Vue 3 |
| SvelteKit |
| Astro |
| (none, or no at all) | Plain HTML + CSS + JS — write , , and as siblings the user can open directly in a browser. See "Default plain output" below. |
Default plain output
When there's no project context to match (the user is in an empty folder, in their home directory, or just dropped a prompt without pointing you at a codebase), default to a three-file plain web structure:
index.html
styles.css
app.js
Rationale: most real-world web pages need
some interactivity even if it looks minimal — hamburger menu toggle, smooth-scroll behavior, an intersection observer for nav-on-scroll, a dropdown. Always scaffolding all three files (even if
starts as a couple lines) avoids the awkward "I built it pure CSS but now you want a menu toggle, here's a fourth file" moment.
If the source page is genuinely zero-JS (a static brochure with no interactivity at all), you can omit
— but call it out in your output: "no JS file created since the source has no interactive behavior." Otherwise default to all three.
Do not introduce a build step (no Vite, no Tailwind CDN unless you explicitly verify the user wants that, no
). The whole point of the plain default is the user can double-click
and see the result.
Styling detection
| Found | Use |
|---|
| Tailwind classes |
| / | CSS-in-JS |
| CSS Modules |
| / | Sass |
| (none) | Plain CSS in a file next to the component |
Match the user's existing conventions, don't introduce new ones. If they use Tailwind, don't write inline styles. If they use CSS Modules, don't suggest Tailwind.
Fidelity rules (non-negotiable)
These are the rules that separate a real clone from a "looks-roughly-similar" approximation:
- Exact colors — read from CSS or eyedrop precisely. Never approximate.
- Exact spacing — measure pixel gaps in the screenshot or read from computed styles. vs is a visible difference.
- Exact font — if Google Fonts, import the same family + weights. If system font stack, match it.
- Exact radius — (6px) ≠ (8px). Be precise.
- Exact icons — if the source uses Lucide, use Lucide. If Heroicons, use Heroicons. Don't substitute.
- No invented content — keep the source's text verbatim unless the user explicitly says to replace it. Lorem ipsum is a code smell here.
- Match the layout primitive — if the source uses CSS Grid, don't reimplement with flexbox + nth-child hacks.
Anti-patterns (from prior cloning failures)
- ❌ "Close enough" colors — pick a color picker and copy hex
- ❌ Skipping the responsive viewports — always implement all breakpoints, not just desktop
- ❌ Inferring content from context — copy the actual text, don't summarize
- ❌ Using only the static HTML when JS-rendered DOM is available — the JS version is the truth
- ❌ Refactoring "while you're there" — clone first, refactor in a separate pass
Phase 5 — Verify
After implementing, don't claim done yet. Verify in two passes:
Pass A — Visual diff
Preferred (Tier A): MCP-driven diff loop. If Chrome DevTools MCP is available and the user has a working dev server (
/
), use
against the local URL at each viewport, then compare visually against the source screenshot you captured earlier. This is the closest thing to an automated regression diff.
Fallback (Tier B+): manual side-by-side. Open the cloned page in a browser, place it next to the source screenshot, and look for:
- Color drift (your blue vs their blue)
- Spacing drift (4px difference in padding adds up)
- Font weight mismatch ( vs is visible)
- Layout drift (alignment, gap, wrap)
- Missing states (hover doesn't change, focus ring missing)
For
multi-viewport clones, resize the browser to each breakpoint and verify all of them — not just the one that's currently open. With MCP, this is
at each width; without, it's manual resize.
Pass B — Drift list
Produce a written list:
Drift detected:
- Hero headline: source 56px, mine 48px → bump font-size
- CTA button: source has 12px shadow blur, mine has 4px → adjust shadow
- Card grid: source has 24px gap, mine has 32px → reduce gap
- Mobile nav: source slides in from right, mine fades → change transition
No drift detected on:
- Color palette
- Typography family
- Footer layout
If the list has > 0 items in "drift detected" → loop back to Phase 4 and fix. Do not declare done with known drifts.
Phase 6 — Polish
Last 10% — the bits that distinguish "implemented" from "shipped":
- Hover transitions match the source's easing + duration
- Focus rings are visible and accessible
- Touch targets ≥ 44px on mobile
- respects (if source has motion)
- Dark mode handling (if source supports it and user's stack does)
- Image attributes (don't leave on meaningful imagery)
- and meta tags if cloning a full page
Output expectations
When you finish, hand the user:
- List of files created/modified with paths
- Drift list from Phase 5 showing what was checked and what's clean
- Known limitations (e.g. "couldn't match the parallax effect — needs a JS library the project doesn't have")
- Suggested next steps if any (e.g. "you may want to extract the button styles into a reusable component once you have 2-3 instances")
Don't claim "pixel-perfect" unless you've actually verified pixel-level parity. "Close visual match" is honest; "pixel-perfect" requires receipts.
Quick reference: tooling map
| Need | Preferred (Tier A) | Fallback (Tier B+) |
|---|
| Capture live URL | + (Chrome DevTools MCP) | (text only) |
| Read user-provided screenshot | (image rendering) | — |
| Read raw HTML/CSS | | — |
| Find files in user's project | | — |
| Search for existing components/styles | | — |
| Run dev server for visual verification | (e.g. ) | — |
| Diff cloned output vs source | of local + visual compare | Manual side-by-side in browser |
The Chrome DevTools MCP path is the difference between "looks roughly like the brand" and "matches the live page". When it's not available and the user asks for a live URL clone, surface that limitation early — don't bury it in NOTES.md after the fact.
Setup: installing Chrome DevTools MCP
If the user asks how to enable the higher-fidelity path, share these steps:
-
Edit
(Mac/Linux) or
C:\Users\<name>\.claude\settings.json
(Windows). Add to the
block:
json
{
"mcpServers": {
"chrome-devtools": {
"command": "npx",
"args": ["-y", "chrome-devtools-mcp@latest"]
}
}
}
-
Restart Claude Code so the MCP server loads.
-
Verify with a probe: ask the agent to call
against any URL — if it works, you're set.
Or run the bundled setup script:
~/.claude/skills/sw-clone/scripts/install-chrome-devtools-mcp.ps1
(Windows) /
(Unix). The script appends the config without overwriting existing
entries.