rajat · Singapore
Bomber
AI learns via Thompson Sampling
Current context
Calm
Game 1
Wind: calm
Turn 0
You 0 · 0 AI
Reward signal
Bomb within 60px:
✓ alpha + 1 (+ neighbors)
✗ beta + 1
Direct hit: alpha + 3
3 HP per bomber
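The reward rules above can be sketched as a Beta posterior update per arm. This is a minimal sketch, not the page's actual code. It assumes each arm is one discrete power level, that every arm starts at Beta(1, 1), that "neighbors" means the two adjacent arms, and that neighbors receive a hypothetical bonus of 0.5. The page states none of these. It also assumes a direct hit does not spread to neighbors.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    # Beta(1, 1) starting prior is an assumption; the page does not state it.
    alpha: float = 1.0
    beta: float = 1.0


NEAR_RADIUS_PX = 60.0   # "Bomb within 60px" from the panel
NEAR_REWARD = 1.0       # "alpha + 1"
DIRECT_REWARD = 3.0     # "Direct hit: alpha + 3"
MISS_PENALTY = 1.0      # "beta + 1"
NEIGHBOR_BONUS = 0.5    # hypothetical: the page only says "(+ neighbors)"


def update(arms, fired, miss_px, direct_hit=False):
    """Update the posterior of the arm that fired, given the miss distance in pixels."""
    if direct_hit:
        arms[fired].alpha += DIRECT_REWARD
    elif miss_px <= NEAR_RADIUS_PX:
        arms[fired].alpha += NEAR_REWARD
        # Share part of the credit with adjacent power levels (assumed rule).
        for j in (fired - 1, fired + 1):
            if 0 <= j < len(arms):
                arms[j].alpha += NEIGHBOR_BONUS
    else:
        arms[fired].beta += MISS_PENALTY
```

Spreading credit to neighbors lets one near miss inform nearby power levels, which matters when feedback is this sparse.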
Why Thompson Sampling works

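Thompson Sampling keeps learning for as long as it plays: every shot updates a posterior, and every choice is a fresh random draw from those posteriors. When conditions shift and a trusted arm starts missing, its posterior drops and other arms win the draw more often, so it re-explores automatically.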

In this game, wind changes, you move, and craters reshape terrain. Each is a contextual parameter the AI adapts to.

No rules. No retraining. Just Bayesian updating from sparse signals.
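The selection step can be sketched in a few lines. This is a generic Thompson Sampling draw over Beta posteriors, not the page's own code. The name `choose_arm` and the list-of-pairs layout are assumptions for illustration.

```python
import random


def choose_arm(posteriors, rng):
    """Pick an arm by Thompson Sampling.

    posteriors: list of (alpha, beta) pairs, one per arm.
    rng: a random.Random instance, passed in so runs can be seeded.
    """
    # Draw one plausible success rate per arm from its Beta posterior.
    samples = [rng.betavariate(a, b) for a, b in posteriors]
    # Play the arm whose sampled rate is highest.
    return max(range(len(samples)), key=samples.__getitem__)
```

With posteriors such as `[(1, 1), (50, 2), (1, 1)]`, the middle arm wins most draws, but the two uncertain arms still win occasionally. That occasional win is the exploration: no schedule or threshold is involved.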

Attempt log
AI hasn't fired yet. Move and fire first.
AI posterior beliefs
context: wind calm
[Bar chart: one bar per power level, from 5% to 100%]
Wind shifting
Fixed
Wind stays fixed. The AI only needs one context. Turn this on to see contextual learning in the Knowledge Surface below.
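One simple way to get contextual learning is to keep a separate table of posteriors per wind context. The sketch below assumes that approach; the page does not say how contexts are stored. The arm count of 20 and the context label "headwind" are hypothetical. Only "calm" appears on the page.

```python
from collections import defaultdict

N_ARMS = 20  # hypothetical number of discrete power levels


def new_table():
    # One [alpha, beta] pair per arm, starting from an assumed Beta(1, 1) prior.
    return [[1.0, 1.0] for _ in range(N_ARMS)]


# Context label -> per-arm posteriors. A table is created on first use.
beliefs = defaultdict(new_table)


def record(context, arm, hit):
    """Add one success or failure to the given arm, in the given context only."""
    beliefs[context][arm][0 if hit else 1] += 1.0
```

With wind fixed, only the "calm" table is ever touched. With wind shifting, a hit under one wind leaves the tables for other winds unchanged, so each context is learned separately.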
Play fair
OFF
The AI only sees success/failure signals. It never sees terrain, positions, or trajectories. Toggle this to play with the same blind constraint. You'll only see hit/miss distance after each shot. This mirrors real-world decision-making: sparse feedback, no full picture.
Penalty strength
Miss: +1.0β
Off-screen: +2.5β
A higher penalty makes the AI learn faster but raises the risk of over-correction: it may abandon a good arm after a single unlucky miss.
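A small worked example shows how much a single miss moves an arm's estimated hit rate at the two penalty values shown. It assumes the penalties are added directly to beta and that an arm's estimate is the Beta posterior mean, alpha / (alpha + beta). The function names are illustrative.

```python
MISS_PENALTY = 1.0       # "Miss: +1.0β"
OFFSCREEN_PENALTY = 2.5  # "Off-screen: +2.5β"


def posterior_mean(alpha, beta):
    """Expected hit rate under a Beta(alpha, beta) posterior."""
    return alpha / (alpha + beta)


def penalize(alpha, beta, off_screen):
    """Return the (alpha, beta) pair after one failed shot."""
    penalty = OFFSCREEN_PENALTY if off_screen else MISS_PENALTY
    return alpha, beta + penalty


# An arm with a decent record: alpha = 5, beta = 2, mean 5/7.
before = posterior_mean(5.0, 2.0)
after_miss = posterior_mean(*penalize(5.0, 2.0, off_screen=False))      # 5 / 8
after_offscreen = posterior_mean(*penalize(5.0, 2.0, off_screen=True))  # 5 / 9.5
```

Here one ordinary miss takes the estimate from about 0.71 to 0.625, while one off-screen shot takes it to about 0.53. Raising the penalties widens that gap further, which is where the over-correction risk comes from.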