Extract Xiaohongshu post content (text, images, video transcription), organize it into Markdown and save
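npx skill4agent add chenxiachan/xhs-claude-skills xhs
Cookies file: ~/cookies.json
Obsidian save directory: ~/Documents/Obsidian Vault/xhs
Whisper model: mlx-community/whisper-large-v3-turbo
To create ~/cookies.json, run the following in the browser console on xiaohongshu.com; it copies the cookies to the clipboard as JSON:
copy(JSON.stringify(document.cookie.split('; ').map(c => {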
  const [name, ...rest] = c.split('=');
  return { name, value: rest.join('='), domain: '.xiaohongshu.com', path: '/',
    expires: Date.now()/1000 + 86400*30, size: name.length + rest.join('=').length,
    httpOnly: false, secure: false, session: false, priority: 'Medium',
    sameParty: false, sourceScheme: 'Secure', sourcePort: 443 };
})))
Save the copied JSON as ~/cookies.json.
Post data is read from window.__INITIAL_STATE__ in the page HTML:
import json, urllib.request, ssl, re
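# Load the exported cookies and build a Cookie header
with open('<Cookies file>') as f:
    cookies = json.load(f)
cookie_str = '; '.join(f"{c['name']}={c['value']}" for c in cookies)
# Note: certificate verification is disabled for this request
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
# Request the post page with the cookies and a desktop browser User-Agent
req = urllib.request.Request('<post URL>')
req.add_header('Cookie', cookie_str)
req.add_header('User-Agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36')
resp = urllib.request.urlopen(req, timeout=15, context=ctx)
html = resp.read().decode('utf-8', errors='ignore')
# Pull the embedded state object out of the page
m = re.search(r'window\.__INITIAL_STATE__\s*=\s*(\{.+?\})\s*</script>', html, re.DOTALL)
if m is None:
    raise RuntimeError('window.__INITIAL_STATE__ not found in the page')
# The state is a JavaScript object literal: replace `undefined` with JSON null
raw = m.group(1).replace('undefined', 'null')
data = json.loads(raw)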
# Post data is located at: data['note']['noteDetailMap'][<key>]['note']
# Contains: title, desc, type, time, user, imageList, video, interactInfo, ipLocation
For video posts, the streams are under note['video']['media']['stream']. Take the masterUrl of the first stream, in the priority order h264 > h265 > av1.
curl -L -o /tmp/xhs_{post_id}.mp4 -H "Referer: https://www.xiaohongshu.com/" <video URL>
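The stream-selection rule (h264 > h265 > av1, first stream, `masterUrl`) can be sketched as below. This is a minimal sketch, not the skill's own code: it assumes `stream` is a dict keyed by codec name whose values are lists of dicts carrying `masterUrl`, and `pick_video_url` and `CODEC_PRIORITY` are hypothetical names introduced for illustration.

```python
from typing import Optional

# Preferred codecs, most compatible first (order taken from the text above)
CODEC_PRIORITY = ("h264", "h265", "av1")


def pick_video_url(note: dict) -> Optional[str]:
    """Return the masterUrl of the first stream of the best available codec.

    Assumes note['video']['media']['stream'] maps codec name -> list of
    stream dicts; returns None for image posts or when no stream has a URL.
    """
    video = note.get("video") or {}
    media = video.get("media") or {}
    streams = media.get("stream") or {}
    for codec in CODEC_PRIORITY:
        candidates = streams.get(codec) or []
        if candidates:
            url = candidates[0].get("masterUrl")
            if url:
                return url
    return None
```

The returned URL would then be the value substituted for the video URL placeholder in the download command.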
ffmpeg -y -i /tmp/xhs_{post_id}.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/xhs_{post_id}.wav
import mlx_whisper
result = mlx_whisper.transcribe("/tmp/xhs_{post_id}.wav",
    path_or_hf_repo="mlx-community/whisper-large-v3-turbo", language="zh", verbose=False)
rm -f /tmp/xhs_{post_id}.mp4 /tmp/xhs_{post_id}.wav
Note path: <Obsidian Save Directory>/{YYYY-MM-DD} {Short Title}.md
Filename format: {Post Date} {Short Title}.md
Images: <Obsidian Save Directory>/img/
Videos: <Obsidian Save Directory>/video/
# One-sentence Core Insight (Counterintuitive judgment, not descriptive title)
Core argument, 2-3 sentences. State the judgment directly, e.g. "Most people think X, but actually Y".
No filler, no build-up; write like Thiel speaking in a board meeting.
**Relevance to Me:** One sentence. Read the user's memory (user and project-type memory under ~/.claude/projects/*/memory/) to understand the user's background, research direction, and current work, and clarify how this content relates to the user. If memory is unavailable, start from the perspective of general personal development/tools/methodology.
**Worth digging deeper?** Yes/No. One-sentence reason.
> [!tip]- Details
> Structured summary of the post's core content (collapsed by default, visible only when expanded):
> - Extract from desc and video transcription, clean up `#xxx[topic]#` tags
> - Divide into sections by logical structure, retain key data and conclusions
> - Embed images using ``
> - For video posts, place the organized transcribed content here
> [!info]- Note Attributes
> - **Source**: Xiaohongshu · Author Name
> - **Post ID**: xxx
> - **Link**: Original Link
> - **Date**: YYYY-MM-DD
> - **Type**: image/video
> - **Engagement**: N Likes / N Saves / N Comments
> - **Tags**: Tag 1, Tag 2, ...
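
The tag cleanup called for in the Details callout (removing `#xxx[topic]#` markers from `desc`) might look like the sketch below. It assumes each marker has the form `#name[label]#` with no nesting; `strip_topic_tags` and `TAG_PATTERN` are hypothetical names, and the collected names are one possible source for the **Tags** attribute.

```python
import re

# Matches markers of the assumed form "#name[label]#", capturing the name
TAG_PATTERN = re.compile(r"#([^#\[\]\n]+)\[[^\[\]\n]*\]#")


def strip_topic_tags(desc: str) -> tuple[str, list[str]]:
    """Remove topic markers from desc and return (clean_text, tag_names)."""
    tags = [name.strip() for name in TAG_PATTERN.findall(desc)]
    text = TAG_PATTERN.sub("", desc)
    # Collapse the spaces left behind by removed markers; keep line breaks
    text = re.sub(r"[ \t]+", " ", text)
    text = "\n".join(line.strip() for line in text.splitlines()).strip()
    return text, tags
```

For example, `strip_topic_tags("Pour-over basics #coffee[topic]# #brewing[topic]#\nGrind finer.")` yields the text `"Pour-over basics\nGrind finer."` and the tags `["coffee", "brewing"]`.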