analysis · intermediate

Data Exploration Guide

Generates a structured exploratory data analysis plan for a dataset.

Prompt

I have a dataset with the following columns and types:

{{schema}}

Sample rows:
{{sampleData}}

Generate a complete exploratory data analysis (EDA) plan in {{language}} (Python/R/SQL):

1. **Shape and types**: row count, column types, memory usage
2. **Missing values**: counts, percentages, patterns (MCAR/MAR/MNAR)
3. **Distributions**: histograms for numeric, value counts for categorical
4. **Outliers**: IQR method, z-scores, domain-specific bounds
5. **Correlations**: numeric correlation matrix, categorical associations (Cramér's V)
6. **Time patterns**: if date columns exist, check trends, seasonality, and gaps
7. **Data quality**: duplicates, inconsistent formats, referential integrity

Output executable code with inline comments, not just descriptions.
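As a rough illustration of the kind of code a response to this prompt might contain (this sketch is not part of the prompt text), steps 2 and 4 could look like the following in dependency-free Python. The sample rows, column names, and the helper names `missing_summary` and `iqr_outliers` are assumptions made up for this example; a pandas-based answer would more likely use `DataFrame.isna()` and `Series.quantile()` instead.

```python
import statistics

# Hypothetical rows standing in for {{sampleData}}; None marks a missing value.
rows = [
    {"age": 34, "city": "Lyon"},
    {"age": None, "city": "Paris"},
    {"age": 29, "city": None},
    {"age": 31, "city": "Lyon"},
    {"age": 120, "city": "Paris"},
    {"age": 27, "city": "Nice"},
    {"age": 30, "city": "Lyon"},
    {"age": 33, "city": "Paris"},
    {"age": 28, "city": "Nice"},
]


def missing_summary(rows):
    """Return {column: (missing_count, missing_percent)} for every column."""
    columns = sorted({key for row in rows for key in row})
    total = len(rows)
    summary = {}
    for col in columns:
        missing = sum(1 for row in rows if row.get(col) is None)
        summary[col] = (missing, 100.0 * missing / total)
    return summary


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring None."""
    clean = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(clean, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in clean if v < lower or v > upper]


# Step 2: missing values per column (count and percentage).
missing = missing_summary(rows)
for col, (count, pct) in missing.items():
    print(f"{col}: {count} missing ({pct:.1f}%)")

# Step 4: IQR-based outliers for the numeric column.
outliers = iqr_outliers([row["age"] for row in rows])
print("age outliers:", outliers)
```

The multiplier `k=1.5` is the conventional default for the IQR rule; domain-specific bounds, as the prompt mentions, may be a better fit for columns with known valid ranges.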

Variables

  • {{schema}}
  • {{sampleData}}
  • {{language}}
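One way to fill in these variables programmatically is sketched below, assuming the placeholders are substituted as plain text. The template string is abbreviated to the opening of the prompt, and the `fill_template` helper and the example schema and rows are hypothetical, not part of the original page.

```python
import re

# Abbreviated copy of the prompt's opening; the full text would go here.
PROMPT_TEMPLATE = """I have a dataset with the following columns and types:

{{schema}}

Sample rows:
{{sampleData}}

Generate a complete exploratory data analysis (EDA) plan in {{language}} (Python/R/SQL):"""


def fill_template(template, values):
    """Replace each {{name}} placeholder; raise KeyError if a value is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name!r}")
        return values[name]

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


# Hypothetical inputs for illustration only.
filled = fill_template(
    PROMPT_TEMPLATE,
    {
        "schema": "order_id: int\norder_date: date\namount: float",
        "sampleData": "1001, 2024-01-05, 19.99\n1002, 2024-01-06, 5.00",
        "language": "Python",
    },
)
print(filled)
```

Raising on a missing value, rather than leaving the placeholder in place, keeps a half-filled prompt from being sent to a model unnoticed.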

Use Cases

  • Starting analysis on a new dataset
  • Data quality assessment
  • Pre-modeling data understanding

Compatible Models

  • claude-sonnet-4-20250514
  • gpt-4o
  • gemini-2.5-pro

Tags

  • eda
  • data-exploration
  • statistics
  • pandas

Details

Author: PromptIndex
Updated: 2026-04-01
Difficulty: intermediate
