# System Prompt

You are executing two skills in sequence: 1) brain-data-feature-engineering 2) brain-feature-implementation

The following SKILL.md documents are authoritative; follow them exactly.

--- SKILL.md (brain-data-feature-engineering) ---

--- brain-data-feature-engineering methodology ---

# BRAIN Data Feature Engineering Workflow

**Purpose**: Automatically transform BRAIN dataset fields into deep, meaningful feature engineering ideas.

## Input Requirements

### Required Parameters:
- **data_category**: Dataset category (e.g., "fundamental", "analyst", "news", "model")
- **delay**: Data delay setting (0 or 1)
- **region**: Market region (e.g., "USA", "EUR", "ASI")

### Optional Parameters:
- **universe**: Trading universe (default: "TOP3000")
- **dataset_id**: Specific dataset ID (if known, skips discovery phase)

## Workflow Overview

### Step 1: Dataset Discovery
- Resolve the target dataset from data_category, delay, region, and universe (skipped when dataset_id is provided)

### Step 2: Field Extraction and Deconstruction
- **Deconstruct each field's meaning**:
  * What is being measured? (the entity/concept)
  * How is it measured? (collection/calculation method)
  * Time dimension? (instantaneous, cumulative, rate of change)
  * Business context? (why does this field exist?)
  * Generation logic? (reliability considerations)
- **Build field profiles**: Structured understanding of each field's essence

### Step 3: Reasoning and Analysis

**Perform deep analysis based on the collected information:**

#### A. Field Relationship Mapping
- Analyze logical connections between fields
- Identify: independent fields, related fields, complementary fields
- Map the "story" the dataset tells
- **Key question**: What relationships are implied by these fields?

#### B. Attention-Driven Mispricing Framework (Internal Process)

The skill asks itself these questions and generates feature concepts (see the sketch after this list):

1. **"What grabs investor attention?"** → Attention triggers
   - Abnormal trading volume spikes
   - Extreme daily return events
   - News coverage intensity surges
2. **"What escapes attention scrutiny?"** → Neglected assets
   - Low media coverage stocks
   - Complex name or industry classifications
   - Non-benchmark index constituents
3. **"Who faces attention constraints?"** → Investor types
   - Retail trading concentration ratios
   - Institutional portfolio complexity levels
   - Analyst coverage scarcity degrees
4. **"What creates buying pressure?"** → Demand imbalance
   - Unidirectional retail order flow
   - Short-sale constraint tightness
   - Option market speculation spikes
5. **"What delays price correction?"** → Arbitrage limits
   - High idiosyncratic volatility levels
   - Securities borrowing fee spikes
   - Market maker inventory capacity
6. **"When does attention fade?"** → Decay patterns
   - Post-event volume normalization speed
   - News cycle half-life duration
   - Earnings announcement proximity
7. **"What is relatively ignored?"** → Cross-sectional gaps
   - Attention ranking differentials
   - Sectoral attention dispersion metrics
   - Market cap coverage ratios
8. **"What price distortion remains?"** → Fundamental deviation
   - Valuation multiple inflation degree
   - Future earnings surprise predictability
   - Long-term reversion magnitude potential
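For illustration, question 1 might pair with a daily news-coverage field to yield an attention-spike concept, expressed here in the implementation skill's template format. This is a sketch only; the field name `news_article_count` is hypothetical, not an actual BRAIN field:

```python
# Attention spike: today's coverage vs. its trailing 20-day distribution,
# ranked cross-sectionally (hypothetical field name).
template = "rank(ts_zscore({news_article_count}, 20))"
```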
#### C. Feature Concept Generation

For each relevant question-field combination:
- Formulate a feature concept that answers the question
- Define the concept clearly
- Identify the logical meaning
- Consider directionality (what high/low values mean)
- Identify boundary conditions
- Note potential issues/limitations

### Step 4: Feature Documentation

**For each generated feature concept, document:**
- **Concept Name**: Clear, descriptive name
- **Definition**: One-sentence definition
- **Logical Meaning**: What phenomenon/concept does it represent?
- **Why It's Meaningful**: Why does this feature make sense?
- **Directionality**: Interpretation of high vs. low values
- **Boundary Conditions**: What extremes indicate
- **Data Requirements**: What fields are used and any constraints
- **Potential Issues**: Known limitations or concerns

### Step 5: Output Generation

**Generate a structured markdown report in the following format:**

# {dataset_name} Feature Engineering Analysis Report

**Dataset**: {dataset_id}
**Category**: {category}
**Region**: {region}
**Analysis Date**: {analysis_date}
**Fields Analyzed**: {field_count}

---

## Executive Summary

**Primary Question Answered by Dataset**: What does this dataset fundamentally measure?

**Key Insights from Analysis**:
- {insight_1}
- {insight_2}
- {insight_3}

**Critical Field Relationships Identified**:
- {relationship_1}
- {relationship_2}

**Most Promising Feature Concepts**:
1. {top_feature_1} - because {reason_1}
2. {top_feature_2} - because {reason_2}
3. {top_feature_3} - because {reason_3}

---

## Dataset Deep Understanding

### Dataset Description

{dataset_description}

### Field Inventory

| Field ID | Description | Data Type | Update Frequency | Coverage |
|----------|-------------|-----------|------------------|----------|
| {field_1_id} | {field_1_desc} | {type_1} | {freq_1} | {coverage_1}% |
| {field_2_id} | {field_2_desc} | {type_2} | {freq_2} | {coverage_2}% |
| {field_3_id} | {field_3_desc} | {type_3} | {freq_3} | {coverage_3}% |

*(Additional fields as needed)*

### Field Deconstruction Analysis

#### {field_1_id}: {field_1_name}
- **What is being measured?**: {measurement_object_1}
- **How is it measured?**: {measurement_method_1}
- **Time dimension**: {time_dimension_1}
- **Business context**: {business_context_1}
- **Generation logic**: {generation_logic_1}
- **Reliability considerations**: {reliability_1}

#### {field_2_id}: {field_2_name}
- **What is being measured?**: {measurement_object_2}
- **How is it measured?**: {measurement_method_2}
- **Time dimension**: {time_dimension_2}
- **Business context**: {business_context_2}
- **Generation logic**: {generation_logic_2}
- **Reliability considerations**: {reliability_2}

*(Additional fields as needed)*

### Field Relationship Mapping

**The Story This Data Tells**: {story_description}

**Key Relationships Identified**:
1. {relationship_1_desc}
2. {relationship_2_desc}
3. {relationship_3_desc}

**Missing Pieces That Would Complete the Picture**:
- {missing_1}
- {missing_2}

---

## Feature Concepts by Question Type

### Q1: "What is stable?" (Invariance Features)

**Concept**: {stability_feature_1_name}
- **Sample Fields Used**: {fields_used_1}
- **Definition**: {definition_1}
- **Why This Feature**: {why_1}
- **Logical Meaning**: {logical_meaning_1}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_1}
- **Boundary Conditions**: {boundaries_1}
- **Implementation Example**: `{implementation_1}`

**Concept**: {stability_feature_2_name}
- **Sample Fields Used**: {fields_used_2}
- **Definition**: {definition_2}
- **Why This Feature**: {why_2}
- **Logical Meaning**: {logical_meaning_2}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_2}
- **Boundary Conditions**: {boundaries_2}
- **Implementation Example**: `{implementation_2}`

---

### Q2: "What is changing?" (Dynamics Features)

**Concept**: {dynamics_feature_1_name}
- **Sample Fields Used**: {fields_used_3}
- **Definition**: {definition_3}
- **Why This Feature**: {why_3}
- **Logical Meaning**: {logical_meaning_3}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_3}
- **Boundary Conditions**: {boundaries_3}
- **Implementation Example**: `{implementation_3}`

**Concept**: {dynamics_feature_2_name}
- **Sample Fields Used**: {fields_used_4}
- **Definition**: {definition_4}
- **Why This Feature**: {why_4}
- **Logical Meaning**: {logical_meaning_4}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_4}
- **Boundary Conditions**: {boundaries_4}
- **Implementation Example**: `{implementation_4}`

---

### Q3: "What is anomalous?" (Deviation Features)

**Concept**: {anomaly_feature_1_name}
- **Sample Fields Used**: {fields_used_5}
- **Definition**: {definition_5}
- **Why This Feature**: {why_5}
- **Logical Meaning**: {logical_meaning_5}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_5}
- **Boundary Conditions**: {boundaries_5}
- **Implementation Example**: `{implementation_5}`

**Concept**: {anomaly_feature_2_name}
- **Sample Fields Used**: {fields_used_6}
- **Definition**: {definition_6}
- **Why This Feature**: {why_6}
- **Logical Meaning**: {logical_meaning_6}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_6}
- **Boundary Conditions**: {boundaries_6}
- **Implementation Example**: `{implementation_6}`

---

### Q4: "What is combined?" (Interaction Features)

**Concept**: {interaction_feature_1_name}
- **Sample Fields Used**: {fields_used_7}
- **Definition**: {definition_7}
- **Why This Feature**: {why_7}
- **Logical Meaning**: {logical_meaning_7}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_7}
- **Boundary Conditions**: {boundaries_7}
- **Implementation Example**: `{implementation_7}`

**Concept**: {interaction_feature_2_name}
- **Sample Fields Used**: {fields_used_8}
- **Definition**: {definition_8}
- **Why This Feature**: {why_8}
- **Logical Meaning**: {logical_meaning_8}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_8}
- **Boundary Conditions**: {boundaries_8}
- **Implementation Example**: `{implementation_8}`

---

### Q5: "What is structural?" (Composition Features)

**Concept**: {structure_feature_1_name}
- **Sample Fields Used**: {fields_used_9}
- **Definition**: {definition_9}
- **Why This Feature**: {why_9}
- **Logical Meaning**: {logical_meaning_9}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_9}
- **Boundary Conditions**: {boundaries_9}
- **Implementation Example**: `{implementation_9}`

**Concept**: {structure_feature_2_name}
- **Sample Fields Used**: {fields_used_10}
- **Definition**: {definition_10}
- **Why This Feature**: {why_10}
- **Logical Meaning**: {logical_meaning_10}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_10}
- **Boundary Conditions**: {boundaries_10}
- **Implementation Example**: `{implementation_10}`

---

### Q6: "What is cumulative?" (Accumulation Features)

**Concept**: {accumulation_feature_1_name}
- **Sample Fields Used**: {fields_used_11}
- **Definition**: {definition_11}
- **Why This Feature**: {why_11}
- **Logical Meaning**: {logical_meaning_11}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_11}
- **Boundary Conditions**: {boundaries_11}
- **Implementation Example**: `{implementation_11}`

**Concept**: {accumulation_feature_2_name}
- **Sample Fields Used**: {fields_used_12}
- **Definition**: {definition_12}
- **Why This Feature**: {why_12}
- **Logical Meaning**: {logical_meaning_12}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_12}
- **Boundary Conditions**: {boundaries_12}
- **Implementation Example**: `{implementation_12}`

---

### Q7: "What is relative?" (Comparison Features)

**Concept**: {relative_feature_1_name}
- **Sample Fields Used**: {fields_used_13}
- **Definition**: {definition_13}
- **Why This Feature**: {why_13}
- **Logical Meaning**: {logical_meaning_13}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_13}
- **Boundary Conditions**: {boundaries_13}
- **Implementation Example**: `{implementation_13}`

**Concept**: {relative_feature_2_name}
- **Sample Fields Used**: {fields_used_14}
- **Definition**: {definition_14}
- **Why This Feature**: {why_14}
- **Logical Meaning**: {logical_meaning_14}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_14}
- **Boundary Conditions**: {boundaries_14}
- **Implementation Example**: `{implementation_14}`

---

### Q8: "What is essential?" (Essence Features)

**Concept**: {essence_feature_1_name}
- **Sample Fields Used**: {fields_used_15}
- **Definition**: {definition_15}
- **Why This Feature**: {why_15}
- **Logical Meaning**: {logical_meaning_15}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_15}
- **Boundary Conditions**: {boundaries_15}
- **Implementation Example**: `{implementation_15}`

**Concept**: {essence_feature_2_name}
- **Sample Fields Used**: {fields_used_16}
- **Definition**: {definition_16}
- **Why This Feature**: {why_16}
- **Logical Meaning**: {logical_meaning_16}
- **Is filling NaN necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values. However, a NaN can itself carry meaning, and filling it blindly may introduce bias, so first decide whether the NaN is meaningful in this specific scenario. If filling is appropriate, use a suitable method in the implementation example.
- **Directionality**: {directionality_16}
- **Boundary Conditions**: {boundaries_16}
- **Implementation Example**: `{implementation_16}`

---

## Implementation Considerations

### Data Quality Notes
- **Coverage**: {coverage_note}
- **Timeliness**: {timeliness_note}
- **Accuracy**: {accuracy_note}
- **Potential Biases**: {bias_note}

### Computational Complexity
- **Lightweight features**: {simple_features}
- **Medium complexity**: {medium_features}
- **Heavy computation**: {complex_features}

### Recommended Prioritization

**Tier 1 (Immediate Implementation)**:
1. {priority_1_feature} - {priority_1_reason}
2. {priority_2_feature} - {priority_2_reason}
3. {priority_3_feature} - {priority_3_reason}

**Tier 2 (Secondary Priority)**:
1. {priority_4_feature} - {priority_4_reason}
2. {priority_5_feature} - {priority_5_reason}

**Tier 3 (Requires Further Validation)**:
1. {priority_6_feature} - {priority_6_reason}

---

## Critical Questions for Further Exploration

### Unanswered Questions:
1. {unanswered_question_1}
2. {unanswered_question_2}
3. {unanswered_question_3}

### Recommended Additional Data:
- {additional_data_1}
- {additional_data_2}
- {additional_data_3}

### Assumptions to Challenge:
- {assumption_1}
- {assumption_2}
- {assumption_3}

---

## Methodology Notes

**Analysis Approach**: This report was generated by:
1. Deep field deconstruction to understand data essence
2. Question-driven feature generation (8 fundamental questions)
3. Logical validation of each feature concept
4. Transparent documentation of reasoning

**Design Principles**:
- Focus on logical meaning over conventional patterns
- Every feature must answer a specific question
- Clear documentation of "why" for each suggestion
- Emphasis on data understanding over prediction

---

*Report generated: {generation_timestamp}*
*Analysis depth: Comprehensive field deconstruction + 8-question framework*
*Next steps: Implement Tier 1 features, validate assumptions, gather additional data as needed*

## Core Analysis Principles

1. **From Data Essence**: Start with what the data truly means, not what it's traditionally used for
2. **Autonomous Reasoning**: The skill performs all thinking; no user input is required
3. **Question-Driven**: An internal question bank guides feature generation
4. **Meaning Over Patterns**: Prioritize logical meaning over conventional combinations
5. **Transparency**: Show the reasoning process in the output

## Example Output Structure

When analyzing dataset 'BEME' (Balance Sheet and Market Data), the output would include:

### Dataset Understanding
**Fields Analyzed**: book_value, market_cap, book_to_market, etc.
**Key Observations**: Dataset compares accounting values with market valuations

### Field Deconstruction
- **book_value**: Accountant's calculation of net asset value (quarterly, audited, historical cost-based)
- **market_cap**: Market participants' valuation (continuous, forward-looking, sentiment-influenced)
- **book_to_market**: Ratio comparing these two valuation perspectives

### Feature Concepts Generated

**From "What is stable?"**
- "Market reevaluation stability": Rolling coefficient of variation of book_to_market
- **Logic**: Measures whether market opinion is stable or volatile
- **Meaning**: Stable values suggest consensus; volatile values suggest disagreement/uncertainty

**From "What is changing?"**
- "Value creation vs. market reevaluation decomposition": Separate book_value growth from market_cap growth
- **Logic**: Distinguish fundamental value creation from market sentiment changes
- **Meaning**: Which component drives changes in book_to_market?

**From "What is combined?"**
- "Intangible value proportion": (market_cap - book_value) / enterprise_value
- **Logic**: Quantify the proportion of value from intangibles (brand, growth, etc.)
- **Meaning**: What percentage of valuation isn't captured on the balance sheet?

**(Additional question-based features would follow; one of these concepts is sketched as a template after the Implementation Notes.)**

## Implementation Notes

### The skill should:
1. **Analyze first, then generate**: Fully understand the dataset before proposing features
2. **Show reasoning**: Explain why each feature concept makes sense
3. **Be specific**: Reference actual field names and their characteristics
4. **Be critical**: Question assumptions and identify limitations
5. **Be creative**: Look beyond traditional financial metrics

### The skill should NOT:
1. **Ask users to think**: All thinking is internal to the skill
2. **Provide generic templates**: Each analysis should be specific to the dataset
3. **Rely on conventional wisdom**: Challenge traditional approaches
4. **Output patterns without meaning**: Every suggestion must have clear logic
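Tying the BEME example above to the implementation format, the "market reevaluation stability" concept might be sketched as a rolling coefficient of variation. A sketch only; `book_to_market` follows the example's field naming and is not an actual field id:

```python
# Rolling coefficient of variation of book_to_market over ~120 trading days:
# dispersion of market opinion relative to its own recent level.
template = "divide(ts_std_dev({book_to_market}, 120), abs(ts_mean({book_to_market}, 120)))"
```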
## Quality Assurance

**Self-Check Process:**
- [ ] All fields analyzed, not just skimmed
- [ ] Field meanings understood beyond descriptions
- [ ] Multiple question types explored
- [ ] Each feature has clear logical meaning
- [ ] Reasoning is explicit, not implicit
- [ ] Limitations are acknowledged
- [ ] Output is dataset-specific, not generic

**Validation Questions:**
- Would this analysis help someone truly understand the data?
- Are feature concepts novel yet meaningful?
- Is the reasoning process transparent?
- Does it avoid conventional thinking traps?

---

*This skill performs deep analysis of BRAIN datasets, generating meaningful feature engineering concepts based on data essence and logical reasoning.*

--- SKILL.md (brain-feature-implementation) ---

---
name: brain-feature-implementation
description: Automate conversion of Brain idea documents into actionable Alpha expressions using local CSV data.
---

# Brain Feature Implementation

## Description

This skill automates the process of converting a WorldQuant Brain idea document (Markdown) into actionable Alpha expressions.

## Instructions

1. **Analyze the Idea Document**
   * Read the provided markdown file.
   * Extract the following metadata:
     * **Dataset ID** (e.g., `analyst15`)
     * **Region** (e.g., `GLB`)
     * **Delay** (e.g., `1` or `0`)
   * *If any metadata is missing, ask the user to clarify.*

2. **Plan Implementation**
   * Scan the markdown file for **Feature Definitions** or **Formulas**.
   * Look for patterns like `Definition: ` or code blocks describing math.
   * Use the `manage_todo_list` tool to create a plan with one entry for each unique idea/formula found.
     * *Title*: The Idea Name or ID (e.g., "3.1.1 Estimate Stability Score").
     * *Description*: The specific template formula (e.g., `template: "{st_dev} / abs({mean})"`).

3. **Execute Implementation**
   * For each item in the Todo List:
     * **Construct the Template**:
       * Use Python format string syntax `{variable}`.
       * The `{variable}` must be the **exact suffix** of the fields in the dataset as listed in the fields input.
       * **CRITICAL**: Do NOT include the dataset prefix (e.g., `anl14_`) or horizon in the template. The script auto-detects these (see the sketch below).
       * **Time Window Handling**: For datasets with multiple time horizons (e.g., `_fy1`, `_fy2`, `_fp1`, `_fp2`), you MUST specify the time window in the variable. Use the full suffix as it appears in the field ID after removing the dataset prefix.
         * *Correct Example*: For field `anl14_mean_roe_fy1`, use template: `{mean_roe_fy1}`.
         * *Incorrect Example*: `{mean_roe}` (missing time window), `{anl14_mean_roe_fy1}` (includes prefix).
       * *Note*: The script ONLY accepts `--template` and `--dataset`. Do not pass any other arguments like `--filters` or `--groupby`.
     * Verify the output (number of expressions generated).
     * Mark the Todo item as completed.
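The exact behavior of implement_idea.py is not documented here; the following is a minimal sketch of the expansion it presumably performs. The function name `expand_template` and the suffix-matching rule are assumptions, not the real script:

```python
# Sketch (assumed behavior): re-attach the auto-detected dataset prefix to
# each suffix-only {variable} in a template.
import re

def expand_template(template: str, field_ids: list[str]) -> str:
    """Replace each {suffix} with the full field id that ends with it."""
    def repl(match: re.Match) -> str:
        suffix = match.group(1)
        hits = [f for f in field_ids if f == suffix or f.endswith("_" + suffix)]
        if len(hits) != 1:
            raise KeyError(f"Suffix '{suffix}' matched {len(hits)} fields")
        return hits[0]
    return re.sub(r"\{(\w+)\}", repl, template)

# Example: a suffix-only template expands to a concrete Alpha expression.
fields = ["anl14_mean_roe_fy1", "anl14_stddev_roe_fy1"]
print(expand_template("divide({stddev_roe_fy1}, abs({mean_roe_fy1}))", fields))
# -> divide(anl14_stddev_roe_fy1, abs(anl14_mean_roe_fy1))
```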
------

"allowed_operators": [
{ "name": "add", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Adds two or more inputs element-wise. Set filter=true to treat NaNs as 0 before summing.", "definition": "add(x, y, filter = false), x + y" },
{ "name": "abs", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Returns the absolute value of a number, removing any negative sign.", "definition": "abs(x)" },
{ "name": "log", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Calculates the natural logarithm of the input value. Commonly used to transform data that has positive values.", "definition": "log(x)" },
{ "name": "subtract", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Subtracts inputs left to right: x - y - ... Supports two or more inputs. Set filter=true to treat NaNs as 0 before subtraction.", "definition": "subtract(x, y, filter=false), x - y" },
{ "name": "signed_power", "category": "Arithmetic", "scope": "['REGULAR']", "description": "x raised to the power of y such that the final result preserves the sign of x", "definition": "signed_power(x, y)" },
{ "name": "sign", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Returns the sign of a number: +1 for positive, -1 for negative, and 0 for zero. If the input is NaN, returns NaN.\r\n\r\nInput: Value of 7 instruments at day t: (2, -3, 5, 6, 3, NaN, -10)\r\nOutput: (1, -1, 1, 1, 1, NaN, -1)", "definition": "sign(x)" },
{ "name": "reverse", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Returns -x (negation)", "definition": "reverse(x)" },
{ "name": "power", "category": "Arithmetic", "scope": "['REGULAR']", "description": "x ^ y", "definition": "power(x, y)" },
{ "name": "multiply", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Multiplies two or more inputs element-wise. Set filter=true to treat NaNs as 0 before multiplication", "definition": "multiply(x, y, ..., filter=false), x * y" },
{ "name": "min", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Minimum value of all inputs. At least 2 inputs are required", "definition": "min(x, y, ..)" },
{ "name": "max", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Maximum value of all inputs. At least 2 inputs are required", "definition": "max(x, y, ..)" },
{ "name": "inverse", "category": "Arithmetic", "scope": "['REGULAR']", "description": "1 / x", "definition": "inverse(x)" },
{ "name": "sqrt", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Returns the non-negative square root of x. Equivalent to power(x, 0.5); for signed roots use signed_power(x, 0.5).", "definition": "sqrt(x)" },
{ "name": "s_log_1p", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Confines the input to a shorter range using a logarithm such that higher input remains higher and negative input remains negative, with -1 or 1 as an asymptotic value", "definition": "s_log_1p(x)" },
{ "name": "densify", "category": "Arithmetic", "scope": "['REGULAR']", "description": "Converts a grouping field of many buckets into a smaller number of only the available buckets, making grouping fields computationally efficient to work with", "definition": "densify(x)" },
{ "name": "divide", "category": "Arithmetic", "scope": "['REGULAR']", "description": "x / y", "definition": "divide(x, y), x / y" },
{ "name": "not", "category": "Logical", "scope": "['REGULAR']", "description": "Returns the logical negation of x. Returns 0 when x is 1 ('true') and 1 when x is 0 ('false').", "definition": "not(x)" },
{ "name": "and", "category": "Logical", "scope": "['REGULAR']", "description": "Returns 1 ('true') if both inputs are 1 ('true'). Otherwise, returns 0 ('false').", "definition": "and(input1, input2)" },
{ "name": "less", "category": "Logical", "scope": "['REGULAR']", "description": "Returns 1 ('true') if input1 is smaller than input2. Otherwise, returns 0 ('false').", "definition": "input1 < input2" },
{ "name": "equal", "category": "Logical", "scope": "['REGULAR']", "description": "Returns 1 ('true') if input1 and input2 are the same. Otherwise, returns 0 ('false').", "definition": "input1 == input2" },
{ "name": "or", "category": "Logical", "scope": "['REGULAR']", "description": "Returns 1 if either input is true (either input1 or input2 has a value of 1), otherwise it returns 0.", "definition": "or(input1, input2)" },
{ "name": "not_equal", "category": "Logical", "scope": "['REGULAR']", "description": "Returns 1 ('true') if input1 and input2 are different numbers. Otherwise, returns 0 ('false').", "definition": "input1 != input2" },
{ "name": "greater", "category": "Logical", "scope": "['REGULAR']", "description": "Returns 1 ('true') if input1 is larger than input2. Otherwise, returns 0 ('false').", "definition": "input1 > input2" },
{ "name": "greater_equal", "category": "Logical", "scope": "['REGULAR']", "description": "Returns 1 ('true') if input1 is larger than or the same as input2. Otherwise, returns 0 ('false').", "definition": "input1 >= input2" },
{ "name": "less_equal", "category": "Logical", "scope": "['REGULAR']", "description": "Returns 1 ('true') if input1 is smaller than or the same as input2. Otherwise, returns 0 ('false').", "definition": "input1 <= input2" },
{ "name": "is_nan", "category": "Logical", "scope": "['REGULAR']", "description": "If (input == NaN) return 1 else return 0", "definition": "is_nan(input)" },
{ "name": "if_else", "category": "Logical", "scope": "['REGULAR']", "description": "The if_else operator returns one of two values based on a condition. If the condition is true, it returns the first value; if false, it returns the second value.", "definition": "if_else(input1, input2, input3)" },
{ "name": "ts_sum", "category": "Time Series", "scope": "['REGULAR']", "description": "Sum values of x for the past d days.", "definition": "ts_sum(x, d)" },
{ "name": "ts_scale", "category": "Time Series", "scope": "['REGULAR']", "description": "Scales a time series to a 0–1 range based on its minimum and maximum values over a specified period, with an optional constant shift.", "definition": "ts_scale(x, d, constant = 0)" },
{ "name": "ts_mean", "category": "Time Series", "scope": "['REGULAR']", "description": "Calculates the simple average (mean) value of a variable x over the past d days.", "definition": "ts_mean(x, d)" },
{ "name": "ts_zscore", "category": "Time Series", "scope": "['REGULAR']", "description": "Calculates the Z-score of a time series, showing how far today's value is from the recent average, measured in standard deviations. Useful for standardizing and comparing values over time.", "definition": "ts_zscore(x, d)" },
{ "name": "ts_std_dev", "category": "Time Series", "scope": "['REGULAR']", "description": "Calculates the standard deviation of a data series x over the past d days, measuring how much the values deviate from their mean during that period.", "definition": "ts_std_dev(x, d)" },
{ "name": "kth_element", "category": "Time Series", "scope": "['REGULAR']", "description": "Returns the K-th value from a time series by looking back over a specified number ('d') of days, with the option to ignore certain values. Commonly used for backfilling missing data.", "definition": "kth_element(x, d, k, ignore=\"NaN\")" },
{ "name": "inst_tvr", "category": "Time Series", "scope": "['REGULAR']", "description": "Total trading value / Total holding value in the past d days\r\n\r\nInput: Value of 1 instrument in past 5 days where first element is the latest: (105, 102, 99, 101, 100)\r\nOutput: 0.022 from (1+2+3+3)/(105+102+99+101)", "definition": "inst_tvr(x, d)" },
{ "name": "ts_corr", "category": "Time Series", "scope": "['REGULAR']", "description": "Calculates the Pearson correlation between two variables, x and y, over the past d days, showing how closely they move together.", "definition": "ts_corr(x, y, d)" },
{ "name": "ts_count_nans", "category": "Time Series", "scope": "['REGULAR']", "description": "Counts the number of missing (NaN) values in a data series over a specified number of days.", "definition": "ts_count_nans(x, d)" },
{ "name": "ts_target_tvr_decay", "category": "Time Series", "scope": "['REGULAR']", "description": "Tune \"ts_decay\" to have a turnover equal to a certain target, with optimization weight range between lambda_min and lambda_max", "definition": "ts_target_tvr_decay(x, lambda_min=0, lambda_max=1, target_tvr=0.1)" },
{ "name": "ts_median", "category": "Time Series", "scope": "['REGULAR']", "description": "Returns the median value of x for the past d days", "definition": "ts_median(x, d)" },
{ "name": "ts_covariance", "category": "Time Series", "scope": "['REGULAR']", "description": "Calculates the covariance between two time-series variables, y and x, over the past d days. Useful for measuring how two variables move together within a specified historical window.", "definition": "ts_covariance(y, x, d)" },
{ "name": "ts_decay_linear", "category": "Time Series", "scope": "['REGULAR']", "description": "Applies a linear decay to time-series data over a set number of days, smoothing the data by averaging recent values and reducing the impact of older or missing data.", "definition": "ts_decay_linear(x, d, dense = false)" },
{ "name": "ts_product", "category": "Time Series", "scope": "['REGULAR']", "description": "Returns the product of the values of x over the past d days. Useful for calculating geometric means and compounding returns or growth rates.", "definition": "ts_product(x, d)" },
{ "name": "ts_regression", "category": "Time Series", "scope": "['REGULAR']", "description": "Returns various parameters related to the regression function", "definition": "ts_regression(y, x, d, lag = 0, rettype = 0)" },
{ "name": "ts_delta_limit", "category": "Time Series", "scope": "['REGULAR']", "description": "Limits the change in the Alpha position x between dates to a specified fraction of y. The \"limit_volume\" can be in the range of 0 to 1. Also, please be aware of the scaling for x and y. Besides setting y as adv20 or volume-related data, you can also set y as a constant.", "definition": "ts_delta_limit(x, y, limit_volume=0.1)" },
{ "name": "ts_step", "category": "Time Series", "scope": "['REGULAR']", "description": "Returns a counter of days, incrementing by one each day.", "definition": "ts_step(1)" },
{ "name": "ts_decay_exp_window", "category": "Time Series", "scope": "['REGULAR']", "description": "Returns exponential decay of x with a smoothing factor for the past d days", "definition": "ts_decay_exp_window(x, d, factor = f)" },
{ "name": "ts_quantile", "category": "Time Series", "scope": "['REGULAR']", "description": "Calculates the ts_rank of the input and transforms it using the inverse cumulative distribution function (quantile function) of a specified probability distribution (default: Gaussian/normal). This helps to normalize or reshape the distribution of your data over a rolling window.", "definition": "ts_quantile(x, d, driver=\"gaussian\")" },
{ "name": "days_from_last_change", "category": "Time Series", "scope": "['REGULAR']", "description": "Calculates the number of days since the last change in the value of a given variable.", "definition": "days_from_last_change(x)" },
{ "name": "hump", "category": "Time Series", "scope": "['REGULAR']", "description": "Limits the amount and magnitude of changes in the input (thus reducing turnover)", "definition": "hump(x, hump = 0.01)" },
{ "name": "last_diff_value", "category": "Time Series", "scope": "['REGULAR']", "description": "Returns the most recent value of x from the past d days that is different from the current value of x.", "definition": "last_diff_value(x, d)" },
{ "name": "ts_arg_max", "category": "Time Series", "scope": "['REGULAR']", "description": "Returns the number of days since the maximum value occurred in the last d days of a time series. If today's value is the maximum, returns 0; if it was yesterday, returns 1, and so on.", "definition": "ts_arg_max(x, d)" },
{ "name": "ts_arg_min", "category": "Time Series", "scope": "['REGULAR']", "description": "Returns the number of days since the minimum value occurred in a time series over the past d days. If today's value is the minimum, returns 0; if it was yesterday, returns 1, and so on.", "definition": "ts_arg_min(x, d)" },
{ "name": "ts_av_diff", "category": "Time Series", "scope": "['REGULAR']", "description": "Calculates the difference between a value and its mean over a specified period, ignoring NaN values in the mean calculation. In short, it returns x - ts_mean(x, d) with NaNs ignored.", "definition": "ts_av_diff(x, d)" },
{ "name": "ts_backfill", "category": "Time Series", "scope": "['REGULAR']", "description": "Replaces missing (NaN) values in a time series with the most recent valid value from a specified lookback window, improving data coverage and reducing risk from missing data.", "definition": "ts_backfill(x, lookback = d, k=1)" },
{ "name": "ts_rank", "category": "Time Series", "scope": "['REGULAR']", "description": "Ranks the value of a variable for each instrument over a specified number of past days, returning the rank of the current value (optionally adjusted by a constant). Useful for normalizing time-series data and highlighting relative performance over time.", "definition": "ts_rank(x, d, constant = 0)" },
{ "name": "ts_delay", "category": "Time Series", "scope": "['REGULAR']", "description": "Returns the value of a variable x from d days ago. Use this operator to access historical data points by specifying the desired time lag in days.", "definition": "ts_delay(x, d)" },
{ "name": "ts_delta", "category": "Time Series", "scope": "['REGULAR']", "description": "Calculates the difference between a value and its delayed version over a specified period. Useful for measuring changes or momentum in time-series data.", "definition": "ts_delta(x, d)" },
{ "name": "winsorize", "category": "Cross Sectional", "scope": "['REGULAR']", "description": "Winsorizes x to make sure that all values in x are between the lower and upper limits, which are specified as a multiple of std.\r\n\r\nInput: Value of 7 instruments at day t: (2, 4, 5, 6, 3, 8, 10), std: 1\r\nOutput: (2.81, 4, 5, 6, 3, 8, 8.03) from std = 2.61, mean = 5.42", "definition": "winsorize(x, std=4)" },
{ "name": "truncate", "category": "Cross Sectional", "scope": "['REGULAR']", "description": "Truncates all values of x to maxPercent. Here, maxPercent is in decimal notation", "definition": "truncate(x, maxPercent=0.01)" },
{ "name": "regression_neut", "category": "Cross Sectional", "scope": "['REGULAR']", "description": "Conducts a cross-sectional regression on the stocks with Y as the target and X as the independent variable", "definition": "regression_neut(y, x)" },
{ "name": "scale", "category": "Cross Sectional", "scope": "['REGULAR']", "description": "Scales input to booksize. We can also scale the long positions and short positions to separate scales by mentioning additional parameters to the operator", "definition": "scale(x, scale=1, longscale=1, shortscale=1)" },
{ "name": "rank", "category": "Cross Sectional", "scope": "['REGULAR']", "description": "Ranks the input among all the instruments and returns an equally distributed number between 0.0 and 1.0. For a precise sort, use rate=0", "definition": "rank(x, rate=2)" },
{ "name": "quantile", "category": "Cross Sectional", "scope": "['REGULAR']", "description": "Ranks the raw vector, shifts the ranked Alpha vector, and applies a distribution (gaussian, cauchy, uniform). If driver is uniform, it simply subtracts the mean of all Alpha values in the Alpha vector from each Alpha value", "definition": "quantile(x, driver = gaussian, sigma = 1.0)" },
{ "name": "normalize", "category": "Cross Sectional", "scope": "['REGULAR']", "description": "Calculates the mean value of all valid alpha values for a certain date, then subtracts that mean from each element", "definition": "normalize(x, useStd = false, limit = 0.0)" },
{ "name": "zscore", "category": "Cross Sectional", "scope": "['REGULAR']", "description": "Z-score is a numerical measurement that describes a value's relationship to the mean of a group of values. Z-score is measured in terms of standard deviations from the mean", "definition": "zscore(x)" },
{ "name": "vec_min", "category": "Vector", "scope": "['REGULAR']", "description": "Minimum value from vector field x", "definition": "vec_min(x)" },
{ "name": "vec_count", "category": "Vector", "scope": "['REGULAR']", "description": "Number of elements in vector field x", "definition": "vec_count(x)" },
{ "name": "vec_stddev", "category": "Vector", "scope": "['REGULAR']", "description": "Standard deviation of vector field x", "definition": "vec_stddev(x)" },
{ "name": "vec_range", "category": "Vector", "scope": "['REGULAR']", "description": "Difference between maximum and minimum element in vector field x", "definition": "vec_range(x)" },
{ "name": "vec_avg", "category": "Vector", "scope": "['REGULAR']", "description": "Takes the mean of the vector field x\r\n\r\nInput: Vector of value of 1 instrument in a day: (2, 3, 5, 6, 3, 8, 10)\r\nOutput: 37 / 7 = 5.29", "definition": "vec_avg(x)" },
{ "name": "vec_sum", "category": "Vector", "scope": "['REGULAR']", "description": "Sum of vector field x\r\n\r\nInput: Vector of value of 1 instrument in a day: (2, 3, 5, 6, 3, 8, 10)\r\nOutput: 2 + 3 + 5 + 6 + 3 + 8 + 10 = 37", "definition": "vec_sum(x)" },
{ "name": "vec_max", "category": "Vector", "scope": "['REGULAR']", "description": "Maximum value from vector field x", "definition": "vec_max(x)" },
{ "name": "left_tail", "category": "Transformational", "scope": "['REGULAR']", "description": "NaNs everything greater than maximum; maximum should be a constant", "definition": "left_tail(x, maximum = 0)" },
{ "name": "trade_when", "category": "Transformational", "scope": "['REGULAR']", "description": "Used in order to change Alpha values only under a specified condition and to hold Alpha values in other cases. It also allows closing Alpha positions (assigning NaN values) under a specified condition", "definition": "trade_when(x, y, z)" },
{ "name": "right_tail", "category": "Transformational", "scope": "['REGULAR']", "description": "NaNs everything less than minimum; minimum should be a constant", "definition": "right_tail(x, minimum = 0)" },
{ "name": "bucket", "category": "Transformational", "scope": "['REGULAR']", "description": "Converts float values into indexes for user-specified buckets. Bucket is useful for creating group values, which can be passed to GROUP as input", "definition": "bucket(rank(x), range=\"0, 1, 0.1\" or buckets = \"2,5,6,7,10\")" },
{ "name": "group_rank", "category": "Group", "scope": "['REGULAR']", "description": "Each element in a group is assigned its corresponding rank within the group", "definition": "group_rank(x, group)" },
{ "name": "group_cartesian_product", "category": "Group", "scope": "['REGULAR']", "description": "Merges two groups into one group. If originally there are len_1 and len_2 group indices in g1 and g2, there will be len_1 * len_2 indices in the new group.", "definition": "group_cartesian_product(g1, g2)" },
{ "name": "group_backfill", "category": "Group", "scope": "['REGULAR']", "description": "If a certain value for a certain date and instrument is NaN, calculates the winsorized mean of all non-NaN values over the last d days from the set of same-group instruments", "definition": "group_backfill(x, group, d, std = 4.0)" },
{ "name": "group_mean", "category": "Group", "scope": "['REGULAR']", "description": "Sets all elements in a group to the (weighted) group mean", "definition": "group_mean(x, weight, group)" },
{ "name": "group_neutralize", "category": "Group", "scope": "['REGULAR']", "description": "Neutralizes Alpha against groups. These groups can be subindustry, industry, sector, country or a constant", "definition": "group_neutralize(x, group)" },
{ "name": "group_normalize", "category": "Group", "scope": "['REGULAR']", "description": "Normalizes input such that each group's absolute sum is 1", "definition": "group_normalize(x, group, constantCheck=False, tolerance=0.01, scale=1)" },
{ "name": "group_median", "category": "Group", "scope": "['REGULAR']", "description": "Sets all elements in a group to the group median", "definition": "group_median(x, group)" },
{ "name": "group_scale", "category": "Group", "scope": "['REGULAR']", "description": "Normalizes the values in a group to be between 0 and 1: (x - groupmin) / (groupmax - groupmin)", "definition": "group_scale(x, group)" },
{ "name": "group_zscore", "category": "Group", "scope": "['REGULAR']", "description": "Calculates the group Z-score, a numerical measurement that describes a value's relationship to the mean of a group of values, measured in terms of standard deviations from the mean. zscore = (data - mean) / stddev of x for each instrument within its group.\r\n\r\nInput: Value of 5 instruments of Group A: (100, 0, 50, 60, 25)\r\nOutput: (1.57, -1.39, 0.09, 0.39, -0.65)", "definition": "group_zscore(x, group)" }
]

CRITICAL OUTPUT RULES (to ensure implement_idea.py can generate expressions):
- Every Implementation Example MUST be a Python format template using `{variable}`.
- Every `{variable}` MUST be constructed from the actual field suffixes provided in the fields list. Do NOT invent variable names.
- The suffix must match exactly how it appears in the field ID after removing the dataset prefix (e.g., for `anl14_mean_roe_fy1`, use `{mean_roe_fy1}`, not `{mean_roe}` or `{roe}`; see the illustrative pairing below).
- When you implement ideas, ONLY use operators from the allowed_operators provided.
- Do NOT include dataset codes/prefixes/horizons in `{variable}` beyond the suffix itself.
- If you show raw field ids in tables, use backticks `` `like_this` ``, NOT `{braces}`.
- Include these metadata lines verbatim somewhere near the top:
  **Dataset**:
  **Region**:
  **Delay**:
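To make the suffix contract concrete, here is an illustrative compliant/non-compliant pairing based on the `anl14_mean_roe_fy1` example above. The companion field `anl14_stddev_roe_fy1` is assumed for illustration only:

```python
# Compliant: suffix-only variables, allowed operators, full time-window suffix kept.
template = "divide({stddev_roe_fy1}, abs({mean_roe_fy1}))"

# Non-compliant: first variable includes the dataset prefix,
# second variable drops the time window.
bad_template = "divide({anl14_stddev_roe_fy1}, abs({mean_roe}))"
```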
---

## EVENT FIELD IDENTIFICATION (CRITICAL FOR ts_* OPERATORS)

**Event fields are NOT continuous daily data. They only have values on specific dates (earnings announcements, analyst revisions, etc.) and are NaN on other days.**

### Quick Field Type Classification Method

**Step 1: Check Dataset Prefix**
- `anl*` (analyst data), `fnd*` (fundamental data), `ern*` (earnings data) → **Likely EVENT fields**
- `mdl*` (model data), `pv*` (provider data), `nws*` (news data) → **Likely EVENT fields**
- `oth*` (other data) → **Requires further analysis (see Step 2)**

**Step 2: Analyze Field Description Keywords**
- **CONTINUOUS field indicators**: "predicted", "confidence", "score", "daily", "continuous", "return", "probability", "label"
- **EVENT field indicators**: "estimate", "guidance", "revision", "announcement", "quarterly", "fiscal", "surprise", "consensus", "actual"

**Step 3: Check Time Window Suffixes**
- `_fy1`, `_fy2`, `_fp1`, `_fp2`, `_qtr`, `_ttm` → **EVENT fields** (fiscal year/period markers)
- `_d`, `_ret`, `_prob`, `_label`, `_score` → **CONTINUOUS fields** (daily values)

(These three steps are summarized in the code sketch at the end of this section.)

### Detailed Event Field Identification Rules

**How to identify event fields from the fields list:**
1. Field description contains words like: "surprise", "announcement", "revision", "event", "post", "pre", "consensus", "actual", "fiscal", "quarterly" (when it's a point-in-time value)
2. Field name contains patterns like: `_surprise`, `_event`, `_revision`, `_consensus`, `_actual`, `_pre`, `_post`, `_announcement`, `_date`, `_flag`
3. Fields representing: earnings surprises, analyst revisions, consensus estimates before/after events, recommendation changes, special items, one-time adjustments

**Examples from typical datasets:**
- `presurprise`, `actsurprise` → event data (surprise only on earnings date)
- `aftercons_mean`, `beforecons_mean` → event data (snapshots around earnings)
- `estsup`, `estsdown` → event data (revision counts, not daily values)
- `xoptq`, `pncq`, `spceq` → event data (quarterly updates, not daily)
- **NEW**: `oth566_return`, `oth566_prob_*`, `oth566_label_*` → **CONTINUOUS data** (ML predictions with daily values)

### CRITICAL RULES

**Rule 1: ts_* Operators Restriction**
- `ts_*` operators (ts_mean, ts_std_dev, ts_zscore, ts_delta, ts_sum, ts_rank, etc.) can ONLY be used with continuous daily fields
- **DO NOT** use `ts_*` operators on event fields
- **Exception**: `ts_delay(event_field, days)` is allowed to access historical event values

**Rule 2: Arithmetic Operators on Event Fields**
- `add`, `subtract`, `multiply`, `divide` → **SAFE** for event fields in cross-sectional calculations (same-day operations)
- Example: `divide(anl16_meanest, anl16_eststddev)` is valid (z-score calculation on same day)

**Rule 3: Pattern-Based Classification**
Based on historical error analysis, fields matching these patterns are EVENT fields:
```
Dataset prefixes: anl*, fnd*, ern*, mdl*, pv*, nws*
Field patterns: *_estimate_*, *_guidance_*, *_revision_*, *_event*
Time markers: *_qtr, *_fy1, *_fy2, *_fp1, *_fp2, *_ttm
```

**Safe alternatives for event fields:**
- Use event fields directly in ratios or cross-sectional comparisons
- Use `ts_delay(event_field, days)` to capture the last known event value
- Use event fields with `group_*` operators for cross-sectional analysis
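A minimal sketch of the three-step classification above, using the keyword lists from Steps 1-3. The cue ordering (suffixes and keywords checked before prefixes) is an assumption, and borderline fields should still be verified manually:

```python
# Heuristic field-type classifier mirroring Steps 1-3 (sketch, not authoritative).
EVENT_PREFIXES = ("anl", "fnd", "ern", "mdl", "pv", "nws")
EVENT_SUFFIXES = ("_fy1", "_fy2", "_fp1", "_fp2", "_qtr", "_ttm")
CONTINUOUS_SUFFIXES = ("_d", "_ret", "_prob", "_label", "_score")
EVENT_KEYWORDS = ("estimate", "guidance", "revision", "announcement",
                  "quarterly", "fiscal", "surprise", "consensus", "actual")
CONTINUOUS_KEYWORDS = ("predicted", "confidence", "daily", "continuous",
                       "return", "probability", "label", "score")

def classify_field(field_id: str, description: str) -> str:
    """Return 'event' or 'continuous' from suffix, keyword, and prefix cues."""
    desc = description.lower()
    if field_id.endswith(CONTINUOUS_SUFFIXES):      # Step 3: daily-value suffixes
        return "continuous"
    if field_id.endswith(EVENT_SUFFIXES):           # Step 3: fiscal-period suffixes
        return "event"
    if any(k in desc for k in EVENT_KEYWORDS):      # Step 2: event keywords
        return "event"
    if any(k in desc for k in CONTINUOUS_KEYWORDS): # Step 2: continuous keywords
        return "continuous"
    if field_id.startswith(EVENT_PREFIXES):         # Step 1: dataset prefix
        return "event"
    return "continuous"  # no cue fired; verify manually

print(classify_field("anl14_mean_roe_fy1", "Mean ROE estimate"))        # event
print(classify_field("oth566_prob_up", "Predicted daily probability"))  # continuous
```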
---

# User Prompt

{ "instructions": { "output_format": "Fill OUTPUT_TEMPLATE.md with concrete content.", "implementation_examples": "Each Implementation Example must be a template with {variable} placeholders. Use only suffixes derived from the provided fields list. Always include time window suffixes (e.g., _fy1, _fp1) when present in the fields.", "no_code_fences": true, "do_not_invent_placeholders": true }, "dataset_context": { "dataset_id": "biasfree_analyst", "dataset_name": null, "dataset_description": null, "category": "Analyst", "region": "USA", "delay": 1, "universe": "TOP200", "field_count": 54 }, "fields": [ { "id": "third_biasfree_price_target_analogue", "description": "The third bias-free analogue value for a price target forecast from an analyst." }, { "id": "stddev_third_biasfree_price_target_estimate", "description": "The standard deviation of the third bias-free price target estimate for the period." }, { "id": "stddev_second_biasfree_quarterly_fundamental", "description": "The standard deviation of the second bias-free quarterly fundamental estimate for the period." }, { "id": "stddev_second_biasfree_price_target_estimate", "description": "The standard deviation of the second bias-free price target estimate for the period." }, { "id": "stddev_second_biasfree_fundamental_estimate", "description": "The standard deviation of the second bias-free fundamental estimate for the period." }, { "id": "stddev_first_biasfree_quarterly_fundamental", "description": "The standard deviation of the first bias-free quarterly fundamental estimate for the period." }, { "id": "stddev_first_biasfree_price_target_estimate", "description": "The standard deviation of the first bias-free price target estimate for the period." }, { "id": "stddev_first_biasfree_fundamental_estimate", "description": "The standard deviation of the first bias-free fundamental estimate for the period." }, { "id": "stddev_biasfree_quarterly_fundamental_estimate", "description": "The standard deviation of bias-adjusted quarterly fundamental estimates for the period." }, { "id": "stddev_bias_adjusted_price_target", "description": "The standard deviation of bias-adjusted price target estimates for the period." }, { "id": "stddev_bias_adjusted_fundamental_estimate", "description": "The standard deviation of bias-adjusted fundamental estimates for the period." }, { "id": "second_biasfree_price_target_analogue", "description": "The second bias-free analogue value for a price target forecast from an analyst." }, { "id": "second_biasfree_fundamental_analogue", "description": "The second bias-free analogue value for a fundamental forecast from an analyst." }, { "id": "num_upward_biasfree_quarterly_fundamental_revisions", "description": "The number of times analysts have raised their bias-adjusted quarterly fundamental estimates." }, { "id": "num_upward_biasfree_price_target_revisions", "description": "The number of times analysts have raised their bias-adjusted price target estimates." }, { "id": "num_upward_biasfree_fundamental_revisions", "description": "The number of times analysts have raised their bias-adjusted fundamental estimates." }, { "id": "num_downward_biasfree_quarterly_fundamental_revisions", "description": "The number of times analysts have lowered their bias-adjusted quarterly fundamental estimates." }, { "id": "num_downward_biasfree_price_target_revisions", "description": "The number of times analysts have lowered their bias-adjusted price target estimates." }, { "id": "num_downward_biasfree_fundamental_revisions", "description": "The number of times analysts have lowered their bias-adjusted fundamental estimates." }, { "id": "min_biasfree_quarterly_fundamental_estimate", "description": "The lowest value among bias-adjusted quarterly fundamental estimates for the period."
}, { "id": "min_bias_adjusted_price_target", "description": "The lowest value among bias-adjusted price target estimates for the period." }, { "id": "min_bias_adjusted_fundamental_estimate", "description": "The lowest value among bias-adjusted fundamental estimates for the period." }, { "id": "median_third_biasfree_price_target_estimate", "description": "The median value of the third bias-free price target estimate for the period." }, { "id": "median_second_biasfree_quarterly_fundamental", "description": "The median value of the second bias-free quarterly fundamental estimate for the period." }, { "id": "median_second_biasfree_price_target_estimate", "description": "The median value of the second bias-free price target estimate for the period." }, { "id": "median_second_biasfree_fundamental_estimate", "description": "The median value of the second bias-free fundamental estimate for the period." }, { "id": "median_first_biasfree_quarterly_fundamental", "description": "The median value of the first bias-free quarterly fundamental estimate for the period." }, { "id": "median_first_biasfree_price_target_estimate", "description": "The median value of the first bias-free price target estimate for the period." }, { "id": "median_first_biasfree_fundamental_estimate", "description": "The median value of the first bias-free fundamental estimate for the period." }, { "id": "median_biasfree_quarterly_fundamental_estimate", "description": "The median of bias-adjusted quarterly fundamental estimates for the period." }, { "id": "median_bias_adjusted_price_target", "description": "The median of bias-adjusted price target estimates for the period." }, { "id": "median_bias_adjusted_fundamental_estimate", "description": "The median of bias-adjusted fundamental estimates for the period." }, { "id": "mean_biasfree_quarterly_fundamental_estimate", "description": "The mean of bias-adjusted quarterly fundamental estimates for the period." }, { "id": "mean_bias_adjusted_price_target", "description": "The mean of bias-adjusted price target estimates for the period." }, { "id": "mean_bias_adjusted_fundamental_estimate", "description": "The mean of bias-adjusted fundamental estimates for the period." }, { "id": "max_biasfree_quarterly_fundamental_estimate", "description": "The highest value among bias-adjusted quarterly fundamental estimates for the period." }, { "id": "max_bias_adjusted_price_target", "description": "The highest value among bias-adjusted price target estimates for the period." }, { "id": "max_bias_adjusted_fundamental_estimate", "description": "The highest value among bias-adjusted fundamental estimates for the period." }, { "id": "forecast_horizon_months", "description": "The time horizon in months for which the price target estimate is made." }, { "id": "first_biasfree_price_target_analogue", "description": "The first bias-free analogue value for a price target forecast from an analyst." }, { "id": "first_biasfree_fundamental_analogue", "description": "The first bias-free analogue value for a fundamental forecast from an analyst." }, { "id": "estimate_currency_code_9", "description": "The currency in which the current fundamental estimate is recorded." }, { "id": "count_biasfree_quarterly_fundamental_estimates", "description": "The number of available bias-adjusted quarterly fundamental estimates for the period." }, { "id": "count_bias_adjusted_price_target_estimates", "description": "The number of available bias-adjusted price target estimates for the period." 
}, { "id": "count_bias_adjusted_fundamental_estimates", "description": "The number of available bias-adjusted fundamental estimates for the period." }, { "id": "biasfree_analyst_price_target", "description": "A single analyst's bias-adjusted price target estimate for a security." }, { "id": "biasfree_analyst_fundamental_estimate", "description": "A single analyst's bias-adjusted fundamental estimate for a security." }, { "id": "avg_third_biasfree_price_target_estimate", "description": "The average value of the third bias-free price target estimate for the period." }, { "id": "avg_second_biasfree_quarterly_fundamental", "description": "The average value of the second bias-free quarterly fundamental estimate for the period." }, { "id": "avg_second_biasfree_price_target_estimate", "description": "The average value of the second bias-free price target estimate for the period." }, { "id": "avg_second_biasfree_fundamental_estimate", "description": "The average value of the second bias-free fundamental estimate for the period." }, { "id": "avg_first_biasfree_quarterly_fundamental", "description": "The average value of the first bias-free quarterly fundamental estimate for the period." }, { "id": "avg_first_biasfree_price_target_estimate", "description": "The average value of the first bias-free price target estimate for the period." }, { "id": "avg_first_biasfree_fundamental_estimate", "description": "The average value of the first bias-free fundamental estimate for the period." } ] }