{dataset_name} Feature Engineering Analysis Report
Dataset: {dataset_id}
Category: {category}
Region: {region}
Analysis Date: {analysis_date}
Fields Analyzed: {field_count}
Executive Summary
Primary Question Answered by Dataset: What does this dataset fundamentally measure?
Key Insights from Analysis:
- {insight_1}
- {insight_2}
- {insight_3}
Critical Field Relationships Identified:
- {relationship_1}
- {relationship_2}
Most Promising Feature Concepts:
- {top_feature_1} - because {reason_1}
- {top_feature_2} - because {reason_2}
- {top_feature_3} - because {reason_3}
Dataset Deep Understanding
Dataset Description
{dataset_description}
Field Inventory
| Field ID | Description | Data Type | Update Frequency | Coverage |
|---|---|---|---|---|
| {field_1_id} | {field_1_desc} | {type_1} | {freq_1} | {coverage_1}% |
| {field_2_id} | {field_2_desc} | {type_2} | {freq_2} | {coverage_2}% |
| {field_3_id} | {field_3_desc} | {type_3} | {freq_3} | {coverage_3}% |
(Additional fields as needed)
Field Deconstruction Analysis
{field_1_id}: {field_1_name}
- What is being measured?: {measurement_object_1}
- How is it measured?: {measurement_method_1}
- Time dimension: {time_dimension_1}
- Business context: {business_context_1}
- Generation logic: {generation_logic_1}
- Reliability considerations: {reliability_1}
{field_2_id}: {field_2_name}
- What is being measured?: {measurement_object_2}
- How is it measured?: {measurement_method_2}
- Time dimension: {time_dimension_2}
- Business context: {business_context_2}
- Generation logic: {generation_logic_2}
- Reliability considerations: {reliability_2}
(Additional fields as needed)
Field Relationship Mapping
The Story This Data Tells: {story_description}
Key Relationships Identified:
- {relationship_1_desc}
- {relationship_2_desc}
- {relationship_3_desc}
Missing Pieces That Would Complete the Picture:
- {missing_1}
- {missing_2}
Feature Concepts by Question Type
Q1: "What is stable?" (Invariance Features)
Concept: {stability_feature_1_name}
- Fields Used: {fields_used_1}
- Definition: {definition_1}
- Why This Feature: {why_1}
- Logical Meaning: {logical_meaning_1}
- Directionality: {directionality_1}
- Boundary Conditions: {boundaries_1}
- Implementation Example:
{implementation_1}
Concept: {stability_feature_2_name}
- Fields Used: {fields_used_2}
- Definition: {definition_2}
- Why This Feature: {why_2}
- Logical Meaning: {logical_meaning_2}
- Directionality: {directionality_2}
- Boundary Conditions: {boundaries_2}
- Implementation Example:
{implementation_2}
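As a generic illustration of an invariance feature, the sketch below computes a trailing coefficient of variation, where lower values suggest a more stable series. It is not tied to any field of this dataset: the function name `rolling_cv`, the `window` parameter, and the use of population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev


def rolling_cv(values, window):
    """Trailing coefficient of variation: std / |mean| over the last `window` points.

    Returns None where the window is incomplete or the window mean is zero
    (the ratio is undefined there). Hypothetical helper, not a dataset API.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
            continue
        chunk = values[i + 1 - window : i + 1]
        m = mean(chunk)
        out.append(pstdev(chunk) / abs(m) if m != 0 else None)
    return out
```

A boundary condition worth documenting for features of this kind is the behavior near a zero mean, where the ratio becomes very large or undefined.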
Q2: "What is changing?" (Dynamics Features)
Concept: {dynamics_feature_1_name}
- Fields Used: {fields_used_3}
- Definition: {definition_3}
- Why This Feature: {why_3}
- Logical Meaning: {logical_meaning_3}
- Directionality: {directionality_3}
- Boundary Conditions: {boundaries_3}
- Implementation Example:
{implementation_3}
Concept: {dynamics_feature_2_name}
- Fields Used: {fields_used_4}
- Definition: {definition_4}
- Why This Feature: {why_4}
- Logical Meaning: {logical_meaning_4}
- Directionality: {directionality_4}
- Boundary Conditions: {boundaries_4}
- Implementation Example:
{implementation_4}
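A dynamics feature can be as simple as a lagged relative change. The following is a minimal sketch under that assumption; `pct_change` and its `lag` parameter are hypothetical names, and the choice to divide by the absolute value of the prior observation (so the sign reflects the direction of change) is one convention among several.

```python
def pct_change(values, lag=1):
    """Relative change versus the observation `lag` steps earlier.

    Returns None where there is no prior observation or the prior value
    is zero. Hypothetical helper for illustration only.
    """
    out = []
    for i, v in enumerate(values):
        if i < lag or values[i - lag] == 0:
            out.append(None)
        else:
            prev = values[i - lag]
            out.append((v - prev) / abs(prev))
    return out
```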
Q3: "What is anomalous?" (Deviation Features)
Concept: {anomaly_feature_1_name}
- Fields Used: {fields_used_5}
- Definition: {definition_5}
- Why This Feature: {why_5}
- Logical Meaning: {logical_meaning_5}
- Directionality: {directionality_5}
- Boundary Conditions: {boundaries_5}
- Implementation Example:
{implementation_5}
Concept: {anomaly_feature_2_name}
- Fields Used: {fields_used_6}
- Definition: {definition_6}
- Why This Feature: {why_6}
- Logical Meaning: {logical_meaning_6}
- Directionality: {directionality_6}
- Boundary Conditions: {boundaries_6}
- Implementation Example:
{implementation_6}
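One common way to express a deviation feature is a z-score of the current value against its own trailing history. The sketch below assumes that approach; `trailing_zscore` is a hypothetical name, and excluding the current observation from the history window is a design choice made here so that an outlier does not dampen its own score.

```python
from statistics import mean, pstdev


def trailing_zscore(values, window):
    """Z-score of each value against the `window` observations before it.

    Returns None where history is incomplete or has zero spread.
    Hypothetical helper for illustration only.
    """
    out = []
    for i, v in enumerate(values):
        if i < window:
            out.append(None)  # not enough prior observations
            continue
        hist = values[i - window : i]  # excludes the current value
        sd = pstdev(hist)
        out.append((v - mean(hist)) / sd if sd > 0 else None)
    return out
```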
Q4: "What is combined?" (Interaction Features)
Concept: {interaction_feature_1_name}
- Fields Used: {fields_used_7}
- Definition: {definition_7}
- Why This Feature: {why_7}
- Logical Meaning: {logical_meaning_7}
- Directionality: {directionality_7}
- Boundary Conditions: {boundaries_7}
- Implementation Example:
{implementation_7}
Concept: {interaction_feature_2_name}
- Fields Used: {fields_used_8}
- Definition: {definition_8}
- Why This Feature: {why_8}
- Logical Meaning: {logical_meaning_8}
- Directionality: {directionality_8}
- Boundary Conditions: {boundaries_8}
- Implementation Example:
{implementation_8}
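Interaction features often reduce to a ratio of two fields, and the main implementation risk is the denominator. The sketch below is one possible guard, assuming missing values are represented as `None`; `safe_ratio` is a hypothetical name and the inputs stand in for any two aligned fields.

```python
def safe_ratio(numerators, denominators):
    """Element-wise ratio of two aligned fields.

    Returns None where either input is missing or the denominator is zero.
    Hypothetical helper for illustration only.
    """
    if len(numerators) != len(denominators):
        raise ValueError("fields must be aligned to the same length")
    out = []
    for n, d in zip(numerators, denominators):
        if n is None or d is None or d == 0:
            out.append(None)
        else:
            out.append(n / d)
    return out
```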
Q5: "What is structural?" (Composition Features)
Concept: {structure_feature_1_name}
- Fields Used: {fields_used_9}
- Definition: {definition_9}
- Why This Feature: {why_9}
- Logical Meaning: {logical_meaning_9}
- Directionality: {directionality_9}
- Boundary Conditions: {boundaries_9}
- Implementation Example:
{implementation_9}
Concept: {structure_feature_2_name}
- Fields Used: {fields_used_10}
- Definition: {definition_10}
- Why This Feature: {why_10}
- Logical Meaning: {logical_meaning_10}
- Directionality: {directionality_10}
- Boundary Conditions: {boundaries_10}
- Implementation Example:
{implementation_10}
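A composition feature typically expresses each component as a share of a total. The following sketch assumes the components of one record arrive as a name-to-value mapping; `component_shares` and the example keys are hypothetical.

```python
def component_shares(components):
    """Share of each component in the total of all components.

    Returns None for every component when the total is zero, since the
    shares are undefined. Hypothetical helper for illustration only.
    """
    total = sum(components.values())
    if total == 0:
        return {name: None for name in components}
    return {name: value / total for name, value in components.items()}


# Example with made-up component names.
shares = component_shares({"segment_a": 1, "segment_b": 3})
```

Negative components make shares hard to interpret, so that case is worth listing under boundary conditions for any feature of this type.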
Q6: "What is cumulative?" (Accumulation Features)
Concept: {accumulation_feature_1_name}
- Fields Used: {fields_used_11}
- Definition: {definition_11}
- Why This Feature: {why_11}
- Logical Meaning: {logical_meaning_11}
- Directionality: {directionality_11}
- Boundary Conditions: {boundaries_11}
- Implementation Example:
{implementation_11}
Concept: {accumulation_feature_2_name}
- Fields Used: {fields_used_12}
- Definition: {definition_12}
- Why This Feature: {why_12}
- Logical Meaning: {logical_meaning_12}
- Directionality: {directionality_12}
- Boundary Conditions: {boundaries_12}
- Implementation Example:
{implementation_12}
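Accumulation features can usually be built from a running total. The sketch below shows a cumulative sum and a trailing-window sum derived from it; `running_total`, `trailing_sum`, and the `window` parameter are hypothetical names chosen for the example.

```python
from itertools import accumulate


def running_total(values):
    """Cumulative sum from the first observation onward."""
    return list(accumulate(values))


def trailing_sum(values, window):
    """Sum of the last `window` observations, derived from the running total.

    Returns None where the window is incomplete.
    Hypothetical helper for illustration only.
    """
    totals = running_total(values)
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            prior = totals[i - window] if i >= window else 0
            out.append(totals[i] - prior)
    return out
```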
Q7: "What is relative?" (Comparison Features)
Concept: {relative_feature_1_name}
- Fields Used: {fields_used_13}
- Definition: {definition_13}
- Why This Feature: {why_13}
- Logical Meaning: {logical_meaning_13}
- Directionality: {directionality_13}
- Boundary Conditions: {boundaries_13}
- Implementation Example:
{implementation_13}
Concept: {relative_feature_2_name}
- Fields Used: {fields_used_14}
- Definition: {definition_14}
- Why This Feature: {why_14}
- Logical Meaning: {logical_meaning_14}
- Directionality: {directionality_14}
- Boundary Conditions: {boundaries_14}
- Implementation Example:
{implementation_14}
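A comparison feature often places one entity among its peers at a single point in time. The sketch below uses a midpoint percentile rank, which is one of several tie-handling conventions; `percentile_rank` and the peer identifiers are hypothetical.

```python
def percentile_rank(peers):
    """Midpoint percentile rank of each entity among its peers.

    rank = (count below + half of count equal, including itself) / peer count,
    so tied entities receive the same rank.
    Hypothetical helper for illustration only.
    """
    vals = list(peers.values())
    n = len(vals)
    out = {}
    for name, v in peers.items():
        below = sum(1 for x in vals if x < v)
        equal = sum(1 for x in vals if x == v)
        out[name] = (below + 0.5 * equal) / n
    return out


# Example with made-up peer identifiers.
ranks = percentile_rank({"a": 1, "b": 2, "c": 3})
```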
Q8: "What is essential?" (Essence Features)
Concept: {essence_feature_1_name}
- Fields Used: {fields_used_15}
- Definition: {definition_15}
- Why This Feature: {why_15}
- Logical Meaning: {logical_meaning_15}
- Directionality: {directionality_15}
- Boundary Conditions: {boundaries_15}
- Implementation Example:
{implementation_15}
Concept: {essence_feature_2_name}
- Fields Used: {fields_used_16}
- Definition: {definition_16}
- Why This Feature: {why_16}
- Logical Meaning: {logical_meaning_16}
- Directionality: {directionality_16}
- Boundary Conditions: {boundaries_16}
- Implementation Example:
{implementation_16}
Implementation Considerations
Data Quality Notes
- Coverage: {coverage_note}
- Timeliness: {timeliness_note}
- Accuracy: {accuracy_note}
- Potential Biases: {bias_note}
Computational Complexity
- Lightweight features: {simple_features}
- Medium complexity: {medium_features}
- Heavy computation: {complex_features}
Recommended Prioritization
Tier 1 (Immediate Implementation):
- {priority_1_feature} - {priority_1_reason}
- {priority_2_feature} - {priority_2_reason}
- {priority_3_feature} - {priority_3_reason}
Tier 2 (Secondary Priority):
- {priority_4_feature} - {priority_4_reason}
- {priority_5_feature} - {priority_5_reason}
Tier 3 (Requires Further Validation):
- {priority_6_feature} - {priority_6_reason}
Critical Questions for Further Exploration
Unanswered Questions:
- {unanswered_question_1}
- {unanswered_question_2}
- {unanswered_question_3}
Recommended Additional Data:
- {additional_data_1}
- {additional_data_2}
- {additional_data_3}
Assumptions to Challenge:
- {assumption_1}
- {assumption_2}
- {assumption_3}
Methodology Notes
Analysis Approach: This report was generated by:
- Deep field deconstruction to understand data essence
- Question-driven feature generation (8 fundamental questions)
- Logical validation of each feature concept
- Transparent documentation of reasoning
Design Principles:
- Focus on logical meaning over conventional patterns
- Every feature must answer a specific question
- Clear documentation of "why" for each suggestion
- Emphasis on data understanding over prediction
Report generated: {generation_timestamp}
Analysis depth: Comprehensive field deconstruction + 8-question framework
Next steps: Implement Tier 1 features, validate assumptions, gather additional data as needed