Data engineering and analytics operations workflows for SQL, dbt, Airflow, warehouses, Postgres, CSV cleanup, schema quality, retrieval indexes, data catalogs, dashboards, and query tuning.
- Live page: https://agentskillexchange.com/industry-skills/#data-platform-analytics-engineering
- Homepage access: Curated Collections on https://agentskillexchange.com/
- Data engineers, analytics engineers, data platform teams, BI operators, and database reliability owners.
- Teams managing dbt projects, SQL quality, warehouse performance, CSV cleanup, pipeline orchestration, schema review, dashboards, and database operations.
- Operate Airflow and dbt context for pipeline orchestration, lineage, and transformation review.
- Lint, format, and validate SQL, dbt, CSV, and schema changes before they break downstream consumers.
- Profile local files and query structured exports before loading them into warehouses or dashboards.
- Tune PostgreSQL and Snowflake performance while preserving backup and restore readiness.
- Build and evaluate retrieval, vector, and catalog context before agents consume data assets.
- Agent-safe data operations: Inspect database or DAG context → Review permissions and query scope → Run controlled analysis → Record evidence for the operator
- Analytics engineering change review: Plan dbt or SQL model change → Lint and validate SQL → Compare model or relation parity → Preview rollout impact
- Warehouse and database performance: Inspect schema and execution plans → Tune query, index, or RLS behavior → Document risky migrations → Hand off a review packet
- Tabular intake and cleanup: Inspect large CSV or messy tables → Profile schema and anomalies → Normalize for downstream transforms → Validate quality before ingestion
| Skill | What it does here | Persona | Install | Stars |
|---|---|---|---|---|
| Query and inspect Postgres databases through MCP with pgEdge Postgres MCP | Gives agents a controlled Postgres inspection path for schema, table, and query context without turning database access into ad hoc shell work. | Postgres operator / data platform engineer | Medium | 165 |
| Query cross-connector business data for agents with Dinobase | Lets data teams query business context across connectors when analytics questions span tools rather than one warehouse table. | Analytics platform engineer | Medium | 252 |
| Review SQL Server execution plans through an MCP-compatible analysis workflow with Performance Studio | Adds SQL Server execution-plan review to the workflow for query tuning and database performance investigations. | Database performance engineer | Medium | 175 |
| Run deterministic SQL and dbt analysis under coding agents with Altimate Code | Keeps SQL and dbt analysis deterministic under coding agents, which matters when agents review warehouse changes. | Analytics engineering lead | Medium | 552 |
| Operate Airflow and warehouse workflows through agent-safe data engineering skills with Astronomer Agents | Connects Airflow and warehouse operations through agent-safe data engineering workflows instead of unbounded DAG manipulation. | Data platform engineer | High | 337 |
| Tune Supabase Postgres queries, indexing, and RLS with Supabase Postgres Best Practices | Covers Supabase/Postgres query, index, and RLS tuning where app data teams need database advice tied to production constraints. | Supabase/Postgres engineer | Medium | — |
| Compare dbt models and warehouse relations before trusting migration parity with dbt-audit-helper | Checks dbt model and warehouse parity before migrations or refactors are trusted. | Data quality lead | Medium | 402 |
| Translate and validate SQL across dialects with SQLGlot | Translates and validates SQL across dialects so warehouse migrations and multi-engine analytics work stay reviewable. | Data engineer / migration reviewer | Low | 9.1k |
| Lint PostgreSQL migrations and SQL changes before irreversible schema mistakes land with Squawk | Catches risky PostgreSQL migration patterns before irreversible schema mistakes land. | Database operator | Low | 1.1k |
| Inspect large CSV files interactively before cleanup, mapping, or downstream transforms with csvlens | Provides interactive inspection of large CSV files before teams map, clean, or ingest messy tabular data. | Data operations analyst | Low | 3.7k |
| Profile and triage messy tabular files from the terminal with VisiData | Helps analysts profile and triage messy tabular files quickly before building downstream transforms. | Data analyst / data wrangler | Low | 9k |
| Plan and preview warehouse SQL model changes before rollout with SQLMesh | Adds plan-and-preview discipline for warehouse SQL model changes before rollout. | Analytics engineer / warehouse reviewer | Medium | 3k |
| Apache Airflow MCP | Connects agents to Airflow DAG context and orchestration workflows for platform-owned pipelines. | Data platform engineer | High | 45k |
| dbt Cloud MCP | Lets agents inspect and operate dbt Cloud context without treating the warehouse as a black box. | Analytics engineer | Medium | 12.6k |
| dbt Data Transform Orchestrator | Frames dbt model runs and transformation orchestration as a repeatable analytics engineering workflow. | Analytics engineer | Medium | 12.6k |
| dbt Model Lineage Extractor | Extracts model lineage so teams can reason about downstream impact before changing data assets. | Analytics platform owner | Medium | 12.6k |
| Gate dbt projects with pre-commit checks from dbt-checkpoint | Adds pre-commit gates for dbt projects before style, tests, or docs drift. | Analytics engineering reviewer | Low | 738 |
| SQLFluff SQL Linter and Auto-Formatter | Lints and formats SQL so warehouse changes are reviewable and consistent. | Data engineer | Low | 9.6k |
| sqruff High-Performance SQL Linter and Formatter | Provides a fast SQL lint/format path for teams that need lightweight query quality checks. | Analytics engineer | Low | 1.3k |
| DuckDB SQL Analytics Agent | Runs local analytics over structured files for fast investigation without provisioning a warehouse. | Data analyst / analytics engineer | Low | 37.1k |
| trdsql SQL Query Engine for CSV JSON and YAML Files | Queries CSV, JSON, and YAML exports with SQL before loading them into a larger system. | Data operations analyst | Low | 2.2k |
| Anyquery Universal SQL Engine with MCP Integration | Turns many APIs and files into SQL-queryable surfaces for agents and data operators. | Data platform engineer | Medium | 1.7k |
| Great Expectations Data Validation Pipeline | Creates explicit data quality expectations and validation artifacts for pipelines. | Data quality engineer | Medium | 11.3k |
| Inventory live database schemas and generate reviewable docs before risky SQL changes with SchemaCrawler | Inventories live schemas and produces reviewable docs before risky SQL or database changes. | Database platform engineer | Medium | 1.8k |
| Build local vector retrieval indexes with Faiss | Adds a local vector-indexing layer for retrieval and RAG experiments before teams adopt heavier managed vector infrastructure. | Data platform engineer / retrieval engineer | Medium | 40.2k |
| Expose data catalog context to AI workflows with DataHub | Grounds agent-assisted data discovery in catalog metadata, ownership, schema, and lineage context. | Data governance lead / analytics platform engineer | High | 12.1k |
| Query ClickHouse analytics safely from MCP clients | Gives agents bounded, schema-aware ClickHouse analytics access without handing them ad hoc database credentials. | Analytics platform engineer / ClickHouse operator | Medium | 801 |
| Serve production agentic RAG through R2R | Provides production hybrid retrieval and agentic RAG APIs with citations and reviewable document operations. | Retrieval engineer / data platform lead | High | 7.9k |
| Build document search layers for AI apps with Morphik | Adds document ingestion and search quality tuning before AI apps or agents consume retrieval context. | Retrieval engineer / AI data platform builder | High | 3.6k |
| Add Postgres-native vector retrieval to agent and RAG workflows with pgvector | Stores embeddings beside application data in Postgres for semantic search, RAG, recommendations, and agent memory retrieval. | Postgres data platform engineer / retrieval engineer | Medium | 21.7k |
| Generate and evaluate retrieval embeddings with FlagEmbedding | Helps teams choose, evaluate, and rerank retrieval embeddings before feeding context into RAG workflows. | Retrieval engineer / ML platform engineer | Medium | 11.8k |
| Scale agent retrieval workloads with Milvus | Adds a scalable vector database path for filtered similarity search across RAG and agent retrieval workloads. | Vector database operator / data platform engineer | High | 44.7k |
| Build enterprise RAG and agent workflows with Bisheng | Assembles and evaluates enterprise RAG workflows across internal data, models, and business users. | Enterprise AI platform lead / retrieval workflow owner | High | 11.4k |
- This collection spans Data Extraction, Code Quality, Runbooks, and dashboarding because data platform work is both engineering and operations.
- Several dbt picks are included, but each covers a different job: orchestration, parity, lineage, pre-commit gates, and formatting.
- Listed SchemaCrawler is included because live schema inventory is a distinct database-operations gap not covered by the security-reviewed data picks.
- Recent RAG and vector-database picks are included when they serve data-platform ownership: retrieval quality, catalog grounding, and production data access.
- Keep this centered on data platform and analytics engineering workflows; do not turn it into a generic Data Extraction category page.
- Finance & Filings
- Product Analytics & Growth Ops
- Infrastructure, SRE & Incident Operations
- Education & Research Workflows