Search Results: object-detection

Found 23 Skills

AI & Machine Learning | davila7/claude-code-templ...

senior-computer-vision

World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision transformers. Includes 3D vision, video analysis, real-time processing, and production deployment. Use when building vision AI systems, implementing object detection, training custom vision models, or optimizing inference pipelines.

3 scripts/Checked

AI & Machine Learning | nvidia/skills

tao-train-rtdetr

RT-DETR (Real-Time DEtection TRansformer) for 2D object detection. Designed for real-time inference with competitive accuracy and supports distillation and quantization for deployment optimization. Use when training, evaluating, distilling, quantizing, exporting, or running inference for a TAO RT-DETR model. Trigger phrases include "train RT-DETR", "real-time DETR", "low-latency object detection", "RT-DETR distillation / quantization".

AI & Machine Learning | promptingcompany/nv-skill...

deepstream-import-vision-model

Use this skill to bring any vision model from HuggingFace or NVIDIA NGC into an NVIDIA DeepStream pipeline with end-to-end automation: ONNX download, SafeTensors export, TRT engine build, custom nvinfer bbox parser, multi-stream benchmark, and PDF report. Object detection models only.

20 scripts/Attention

AI & Machine Learning | promptingcompany/nv-skill...

tao-train-deformable-detr

Deformable DETR for 2D object detection. Uses deformable attention for efficient multi-scale feature processing, lighter than DINO with competitive accuracy. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Deformable-DETR model. Trigger phrases include "train deformable-detr", "Deformable DETR object detection", "lightweight DETR detector".

AI & Machine Learning | absolutelyskilled/absolut...

computer-vision

Use this skill when building computer vision applications, implementing image classification, object detection, or segmentation pipelines. Triggers on image classification, object detection, YOLO, semantic segmentation, image preprocessing, data augmentation, transfer learning, CNN architectures, vision transformers, and any task requiring visual recognition or image analysis.

AI & Machine Learning | datadrivenconstruction/dd...

image-to-data

Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings.

AI & Machine Learning | zai-org/glm-skills

glmv-grounding

A skill that uses GLM-V native grounding capabilities for coordinate conversion, bounding-box visualization, and more. GLM-V native grounding can locate any target specified by the prompt in an image and output relative coordinates normalized to 0-1000 based on image size. Coordinate formats include 2D bounding box (default), 2D points, and 3D bounding box. GLM-V also supports spatiotemporal localization and tracking of multiple prompt-specified targets in videos, outputting 2D bounding boxes per second.

6 scripts/Attention
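
The 0-1000 normalization mentioned in the glmv-grounding description can be illustrated with a small coordinate-conversion sketch. This is not taken from the skill itself: the function names `denormalize_box` and `normalize_box` are hypothetical, and the `[x1, y1, x2, y2]` corner ordering is an assumption, since the listing does not state the box layout.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]

def denormalize_box(box: Box, width: int, height: int, scale: int = 1000) -> Box:
    """Map a box normalized to 0..scale onto pixel coordinates.

    Assumes (hypothetically) the order x1, y1, x2, y2; the listing
    does not specify the layout used by the model.
    """
    x1, y1, x2, y2 = box

    def to_px(value: int, size: int) -> int:
        # Scale to the image dimension, then clamp to the image bounds.
        return max(0, min(size, round(value * size / scale)))

    return (to_px(x1, width), to_px(y1, height),
            to_px(x2, width), to_px(y2, height))

def normalize_box(box: Box, width: int, height: int, scale: int = 1000) -> Box:
    """Inverse mapping: pixel coordinates back to the 0..scale range."""
    x1, y1, x2, y2 = box
    return (round(x1 * scale / width), round(y1 * scale / height),
            round(x2 * scale / width), round(y2 * scale / height))

pixel_box = denormalize_box((250, 500, 750, 1000), width=640, height=480)
```

Normalizing against a fixed range keeps the model output independent of the input resolution, so the same prediction can be drawn on a resized copy of the image by changing only `width` and `height`.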

Backend Development | palkan/skills

layered-rails

Design and review Rails applications using layered architecture principles from "Layered Design for Ruby on Rails Applications". Use when analyzing Rails codebases, reviewing PRs for architecture violations, planning feature implementations, or implementing patterns like authorization, view components, or AI integration. Triggers on "layered design", "architecture layers", "abstraction", "specification test", "layer violation", "extract service", "fat controller", "god object".

AI & Machine Learning | nvidia/skills

tao-train-grounding-dino

Grounding DINO for open-set object detection. Combines DINO-style detection with a BERT text encoder for language-guided detection — detects objects described by text prompts without a fixed class vocabulary. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Grounding DINO model. Trigger phrases include "train Grounding DINO", "open-vocabulary detection", "text-prompted detector", "language-guided object detection".

AI & Machine Learning | nvidia/skills

tao-train-bevfusion

BEVFusion for multi-sensor 3D object detection. Fuses LiDAR point clouds and camera images in bird's-eye-view (BEV) space, used in autonomous driving for robust 3D perception. Use when training, evaluating, or running inference for a TAO BEVFusion model. Trigger phrases include "train BEVFusion", "LiDAR + camera fusion", "BEV 3D detection", "multi-sensor 3D perception".

AI & Machine Learning | nvidia/skills

tao-train-dino

DINO (DETR with Improved DeNoising Anchor Boxes) for 2D object detection. Transformer-based detector with denoising training, multi-scale features, and optional distillation support. Use when training, evaluating, exporting, distilling, quantizing, or running inference for a TAO DINO detector. Trigger phrases include "train DINO", "DETR object detection", "TAO 2D detection", "DINO with distillation".
