Found 12 Skills
Trace downstream data lineage and impact analysis. Use when the user asks what depends on this data, what breaks if something changes, downstream dependencies, or needs to assess change risk before modifying a table or DAG.
This skill should be used when working with LaminDB, an open-source data framework for biology that makes data queryable, traceable, reproducible, and FAIR. Use when managing biological datasets (scRNA-seq, spatial, flow cytometry, etc.), tracking computational workflows, curating and validating data with biological ontologies, building data lakehouses, or ensuring data lineage and reproducibility in biological research. Covers data management, annotation, ontologies (genes, cell types, diseases, tissues), schema validation, integrations with workflow managers (Nextflow, Snakemake) and MLOps platforms (W&B, MLflow), and deployment strategies.
Use when implementing data governance frameworks, building data catalogs, establishing data lineage, defining data quality rules, or setting up data stewardship programs - covers metadata management, data quality, and compliance.
DataWorks metadata Skill for Alibaba Cloud — browse Data Map metadata and perform non-destructive writes via Aliyun CLI. READ scope: list/get catalogs, databases, tables, columns, partitions; query data lineage (upstream/downstream impact); list/get datasets & versions; list/get metadata collections (Category/Album) and entities inside them; preview dataset version content. WRITE scope (non-destructive only): update table & column business metadata; register lineage relationships; create/update datasets and versions; create/update metadata collections and add entities to them. This Skill exposes NO delete or remove APIs — every `delete-*` and `remove-*` operation is intentionally out of scope. For deletions, use the DataWorks console. Triggers: "dataworks metadata", "data map", "data lineage", "meta collection", "dataset", "catalog", "table info", "column info", "partition", "impact analysis", "register lineage", "create dataset", "update business metadata".
Annotate Airflow tasks with data lineage using inlets and outlets. Use when the user wants to add lineage metadata to tasks, specify input/output datasets, or enable lineage tracking for operators without built-in OpenLineage extraction.
Design and operate data quality programs for financial data — golden source architecture, validation rules, data lineage, exception management, profiling, and governance. Use when building validation rules for pricing or client data pipelines, designing a data quality monitoring framework, establishing golden source designations across systems, implementing data lineage for BCBS 239 or MiFID II, investigating reconciliation breaks or billing errors traced to bad data, preparing for regulatory exams on data accuracy, building data quality scorecards, or defining data stewardship roles. Trigger on: data quality, golden source, data lineage, data validation, data profiling, exception management, data governance, BCBS 239, data completeness, data accuracy, validation rules, data anomaly, data stewardship, data quality scorecard.
Trace upstream data lineage. Use when the user asks where data comes from, what feeds a table, upstream dependencies, data sources, or needs to understand data origins.
Use to define schemas, topic tags, and lineage metadata for enriched signals.
Use this skill when implementing data validation, data quality monitoring, data lineage tracking, data contracts, or Great Expectations test suites. Triggers on schema validation, data profiling, freshness checks, row-count anomalies, column drift, expectation suites, contract testing between producers and consumers, lineage graphs, data observability, and any task requiring data integrity enforcement across pipelines.
Create custom OpenLineage extractors for Airflow operators. Use when the user needs lineage from unsupported or third-party operators, wants column-level lineage, or needs complex extraction logic beyond what inlets/outlets provide.
Track data lineage and provenance from source to consumption. Use when auditing data flows, debugging data quality issues, ensuring compliance (GDPR, SOX), or understanding data dependencies. Covers lineage tracking, impact analysis, data catalogs, and metadata management.
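A skill built on the ByteHouse MCP Server for generating data asset catalogs and lineage analysis: it retrieves database table schemas, generates data asset catalogs, and analyzes lineage relationships between tables. Use this Skill when the user needs to retrieve ByteHouse database table schemas, generate a data asset catalog, or analyze lineage relationships between tables.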