data · advanced

ETL Pipeline Generator

Generates extract-transform-load pipeline code with error handling and monitoring.

Prompt

Design an ETL pipeline for the following data flow:

**Source**: {{source}} (API/database/file/stream)
**Destination**: {{destination}}
**Transform requirements**: {{transforms}}
**Frequency**: {{frequency}} (real-time/hourly/daily)
**Volume**: approximately {{volume}} records per run

Generate a pipeline in {{language}} (Python/SQL/Airflow DAG) that includes:
1. **Extract**: connection handling, pagination, rate limiting, incremental extraction (watermark/CDC)
2. **Transform**: data cleaning, type casting, deduplication, business logic, validation rules
3. **Load**: upsert strategy (insert vs update), batch sizing, transaction handling
4. **Error handling**: retry logic, dead letter queue, partial failure recovery
5. **Monitoring**: row counts at each stage, data quality checks, alerting thresholds
6. **Idempotency**: safe to re-run without duplicating data
7. **Logging**: structured logs for debugging and auditing
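As an illustration of what a generated pipeline might contain for items 3 and 6, here is a minimal sketch of a batched, idempotent load with a watermark for the next incremental extract. It assumes a SQLite destination (version 3.24 or later, for `ON CONFLICT`) and a hypothetical `customers` table keyed on `id`; the table, column names, and the functions `load_batch` and `next_watermark` are invented for this example and are not part of the prompt.

```python
import sqlite3


def load_batch(conn, rows, batch_size=500):
    """Upsert (id, email, updated_at) rows; re-running with the same rows adds nothing."""
    # Hypothetical destination table, created here so the sketch is self-contained.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "id INTEGER PRIMARY KEY, email TEXT NOT NULL, updated_at TEXT NOT NULL)"
    )
    loaded = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One transaction per batch: commits on success, rolls back on error.
        with conn:
            conn.executemany(
                "INSERT INTO customers (id, email, updated_at) VALUES (?, ?, ?) "
                "ON CONFLICT(id) DO UPDATE SET "
                "email = excluded.email, updated_at = excluded.updated_at "
                # Ignore incoming rows that are older than what is already stored.
                "WHERE excluded.updated_at >= customers.updated_at",
                batch,
            )
        loaded += len(batch)
    return loaded


def next_watermark(conn, default="1970-01-01T00:00:00"):
    """Return the newest updated_at seen so far, for the next incremental extract."""
    row = conn.execute("SELECT MAX(updated_at) FROM customers").fetchone()
    return row[0] or default


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    rows = [
        (1, "a@example.com", "2024-01-01T00:00:00"),
        (2, "b@example.com", "2024-01-02T00:00:00"),
    ]
    load_batch(conn, rows)
    load_batch(conn, rows)  # second run is a no-op in effect
    print(conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0])
    print(next_watermark(conn))
```

Keying the upsert on a primary key is what makes the re-run safe; the timestamp guard is an extra assumption that stale records should not overwrite newer ones, which may not suit every source.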

Variables

  • {{source}}
  • {{destination}}
  • {{transforms}}
  • {{frequency}}
  • {{volume}}
  • {{language}}
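One way to fill the placeholders before sending the prompt is a small substitution helper. The sketch below assumes the double-brace `{{name}}` syntax used in the prompt; the function name `render_prompt` and the sample values are hypothetical.

```python
import re

# Matches {{name}} placeholders as they appear in the prompt text.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_prompt(template, values):
    """Replace every {{name}} in template, failing loudly if a value is missing."""
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(values))
    if missing:
        raise KeyError("missing values for: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


if __name__ == "__main__":
    template = (
        "**Source**: {{source}}\n"
        "**Destination**: {{destination}}\n"
        "**Volume**: approximately {{volume}} records per run"
    )
    # Sample values are illustrative only.
    print(render_prompt(template, {
        "source": "orders REST API",
        "destination": "analytics warehouse",
        "volume": 50000,
    }))
```

Raising on a missing variable, rather than leaving the braces in place, avoids sending a half-filled prompt to the model.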

Use Cases

  • Data warehouse loading
  • API data synchronization
  • Log aggregation pipelines

Compatible Models

  • claude-sonnet-4-20250514
  • gpt-4o

Tags

  • etl
  • data-pipeline
  • data-engineering

Details

Author: PromptIndex
Updated: 2026-04-01
Difficulty: advanced
