Found 1,578 Skills
Optimize Apple App Store metadata in store.config.json for ASO (App Store Optimization). Use when working with store.config.json, App Store keywords, titles, subtitles, descriptions, or localizing app metadata. Helps maximize app visibility and downloads.
Complete SEO setup for Next.js applications. Use when the user wants to implement or improve SEO in a Next.js app, including page metadata, sitemap.xml, llms.txt, robots.txt, and JSON-LD structured data generation, or SEO auditing. Trigger for queries about Next.js SEO optimization, search engine visibility, metadata management, or when the user mentions wanting better SEO for their Next.js application.
Design push notification and messaging strategies including channel selection, timing optimization, personalization, and fatigue management. Use this skill when the user needs to improve notification engagement, reduce opt-out rates, plan multi-channel messaging, or A/B test notification content — even if they say 'our push open rates are low', 'users are unsubscribing', 'when should we send notifications', or 'which channel to use for alerts'.
Solve vehicle routing problems to optimize delivery routes under capacity and time constraints. Use this skill when the user needs to plan delivery routes, minimize transportation costs, or optimize fleet utilization — even if they say 'delivery route optimization', 'fleet routing', or 'minimize driving distance'.
Troubleshoot and optimize the performance of Ascend C operators. Use this skill when users develop, review, or optimize Ascend C kernel operators, or when they mention keywords such as Ascend C performance optimization, operator optimization, tiling, pipeline, data copy, memory optimization, or NPU/Ascend.
Comprehensive SEO, discoverability, and AI crawler optimization for web projects. Use for technical SEO audits, llms.txt/robots.txt setup, schema markup, social launch strategies (Product Hunt, HN, Reddit), and Answer Engine Optimization (AEO). Activate on 'SEO', 'discoverability', 'llms.txt', 'robots.txt', 'Product Hunt', 'launch strategy', 'get traffic', 'be found', 'search ranking'. NOT for paid advertising, PPC campaigns, or social media content creation (use marketing skills).
NCU-driven iterative optimization workflow for CUDA/CUTLASS/Triton/CuTe DSL kernels. MANDATORY: every optimization MUST start with NCU profiling, followed by multi-dimensional analysis, then targeted code modification, then re-profiling to verify. Supports roofline, memory hierarchy, warp stall, instruction mix, occupancy, and divergence analysis. Provides implementation-specific code modifications: Native CUDA (launch config, memory patterns, async copy, Tensor Core), CUTLASS (ThreadblockShape, stages, epilogue, schedule policy, alignment), Triton (autotune params, compiler hints, tl.* API patterns), CuTe DSL (threads_per_cta, elems_per_thread, tiled_copy, copy atom, shared memory, warp/cta reduce). Use when optimizing the performance of any CUDA kernel.
MLA (Multi-head Latent Attention) cost models, regime analysis, and kernel selection guide. Use when: (1) reasoning about which kernel approach to use for a given regime, (2) understanding cost model tradeoffs between FlashMLA, FlashAttention, and MLAvar6+, (3) analyzing roofline behavior across decode/speculative/prefill regimes, (4) setting optimization targets, (5) understanding MLA math and the absorption trick.
Compatibility router for the shared optimization knowledge base and the language-specific optimization catalog skills. Use when: (1) selecting which optimization catalog skill to load, (2) the implementation language is not fixed yet, (3) a workflow still references the legacy optimization-catalog skill name, (4) deciding whether a finding is shared or language-specific, (5) updating the generalized knowledge-base structure.
CuTe Python DSL kernel workflow, CuteKernel runtime wrapper, suitability gate, tiling guidance, and CuTe-specific pitfalls. Use when: (1) planning or implementing a kernel in the CuTe Python DSL, (2) the optimization needs more explicit control than cuTile exposes but should remain in a Python-driven workflow, (3) defining package naming for cute-dsl kernels, (4) documenting CuTe Python DSL design choices, (5) recording language-specific knowledge for CuTe Python DSL.
Fine-tune LLMs using reinforcement learning with TRL: SFT for instruction tuning, DPO for preference alignment, PPO/GRPO for reward optimization, and reward model training. Use when you need RLHF, want to align a model with preferences, or want to train from human feedback. Works with HuggingFace Transformers.
GPU kernel profiling workflow across supported kernel implementation languages. Provides commands for all 4 profiling modes (annotation, event, ncu, nsys), metric interpretation tables, bottleneck identification rules, and the output contract for returning compact results to the orchestrator. Use when: (1) profiling a kernel version, (2) interpreting profiling artifacts/reports, (3) comparing kernel versions, (4) identifying bottlenecks and optimization opportunities, (5) documenting performance in the development log.