Skip to content

Latest commit

 

History

History
106 lines (86 loc) · 4.43 KB

File metadata and controls

106 lines (86 loc) · 4.43 KB

IGT Risk-Sensitive RL Video Script

1. Introduction (30 seconds)

"In this project, we implemented a risk-sensitive reinforcement learning model for the Iowa Gambling Task, a widely-used paradigm for studying human decision-making under uncertainty. Our goal was to create an AI model that not only performs well but also exhibits human-like risk sensitivity patterns."

2. Iowa Gambling Task Overview (45 seconds)

"The Iowa Gambling Task involves four decks of cards:

  • Decks A and B are high-risk decks offering $100 rewards but severe penalties
  • Decks C and D are low-risk decks with $50 rewards and smaller penalties
  • Participants must learn through experience which decks are advantageous
  • The key challenge is balancing immediate rewards against long-term outcomes"

3. Model Architecture (1 minute)

"We developed two models for comparison:

  1. A baseline reinforcement learning model using standard DQN
  2. A risk-sensitive model incorporating:
    • Prospect Theory value function for asymmetric risk perception
    • Conditional Value at Risk (CVaR) for loss aversion
    • Dynamic reference point updating

The risk-sensitive model uses parameters from human behavioral studies:

  • Loss aversion coefficient: 2.25
  • Risk sensitivity parameter: 0.88
  • CVaR confidence level: 5%"

4. Results Analysis (2-3 minutes)

Learning Curves [Show Learning Curves Plot]

"Looking at the learning curves, we can observe three key patterns:

  1. The baseline model (green) learns faster initially but plateaus
  2. The risk-sensitive model (red) shows slower, more cautious learning
  3. The human-like behavior (blue) shows gradual improvement with more variance

Notice the phase transition at episode 100, where we see:

  • Exploration phase: Higher variance in choices
  • Exploitation phase: More stable preferences"

Deck Preferences [Show Deck Selection Plots]

"The deck selection patterns reveal fascinating insights:

Risk-Sensitive Model vs Human Data:

  • Deck A: 13.7% vs 15.4% (closely matched)
  • Deck B: 13.0% vs 20.2% (slightly underselected)
  • Deck C: 37.2% vs 34.2% (well matched)
  • Deck D: 36.1% vs 30.1% (slightly overselected)

The baseline model shows less human-like behavior:

  • Much stronger preference for Deck D (55.6%)
  • Underselection of risky decks (11.45% combined)"

Risk Analysis [Show Risk Analysis Plot]

"The risk analysis reveals:

  1. Initial exploration phase (episodes 1-100):

    • Higher risk-taking (~50% high-risk deck selection)
    • More erratic behavior
    • Similar to human exploration patterns
  2. Exploitation phase (episodes 101-200):

    • Declining risk-taking
    • More stable preferences
    • Convergence to ~20% risk deck selection

The risk vs reward scatter plot shows:

  • Negative correlation between risk and reward
  • Risk-sensitive model clusters closer to human data points
  • Baseline model shows more extreme risk-avoidance"

5. Key Findings (1 minute)

"Our risk-sensitive model achieved several important goals:

  1. More human-like learning progression
  2. Better matching of deck preferences
  3. Natural exploration-exploitation transition
  4. Appropriate risk sensitivity

Statistical analysis shows:

  • Significant phase difference (t=3.842, p<0.001)
  • Strong deck preference alignment (χ²=2.147, p=0.542)
  • Similar risk-taking patterns to human data"

6. Implications & Future Work (30 seconds)

"This work demonstrates that incorporating human-inspired risk sensitivity mechanisms can create more naturalistic AI decision-making. Future directions include:

  1. Testing on other decision-making tasks
  2. Incorporating emotional factors
  3. Real-time adaptation to individual differences
  4. Applications in human-AI interaction"

7. Conclusion (15 seconds)

"By combining reinforcement learning with psychological models of risk perception, we've created an AI system that not only performs well but also exhibits human-like decision-making patterns. This approach opens new possibilities for creating more intuitive and relatable AI systems."

[Total estimated time: 6-7 minutes]

Visual Sequence:

  1. Title slide
  2. IGT deck visualization
  3. Model architecture diagram
  4. Learning curves plot (with phase transition)
  5. Deck preference comparison pie charts
  6. Risk analysis plots
  7. Statistical results
  8. Future work & conclusions

Key Visualization Highlights:

  • Point out the phase transition at episode 100
  • Highlight the convergence patterns in deck selection
  • Show the risk-reward trade-off scatter plot
  • Demonstrate the statistical significance of results