Tagged #XAI

# Projects

29 February 2024 | 15 min read | tags: Chess LLM XAI Attention Training

Training GPT-2 on Stockfish Games

I trained a GPT-2 model on Stockfish self-play games in the most naive way, with no search, and it plays decently. The model is trained to output the next move given only the FEN string of the board (a single state). While I present some gotchas and caveats, the results are quite acceptable for the amount of work and compute invested. I also present a basic attention visualiser that maps the attention over the text tokens back onto the board.
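
For readers curious what FEN-to-move inference could look like in code, here is a minimal sketch; the checkpoint name, prompt template, and UCI move format are illustrative assumptions, not the post's exact configuration.

```python
# Minimal sketch: next-move inference with a GPT-2 checkpoint that maps a FEN
# string to a move. The prompt format, separator, and checkpoint name are
# illustrative assumptions rather than the setup described in the post.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # replace with the fine-tuned checkpoint

fen = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
prompt = f"FEN: {fen}\nMove:"  # assumed prompt template

inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=5, do_sample=False,
                     pad_token_id=tokenizer.eos_token_id)
move = tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(move.strip())  # e.g. "e2e4" if the model was trained to emit UCI moves
```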

# Articles

16 January 2024 | 18 min read | tags: XAI AlphaZero

Layer-Wise Relevance Propagation

Layer-Wise Relevance Propagation (LRP) is a propagation method that produces relevance scores for a given input with respect to a target output. Technically, the computation happens in a single back-propagation pass, similar to deconvolution. I illustrate this method on an AlphaZero network trained to play Othello.
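
As a concrete illustration, here is the standard LRP epsilon-rule applied to a single fully connected layer. This is a toy sketch with names of my own choosing; the article applies the corresponding rules layer by layer through the AlphaZero network.

```python
# Sketch of the LRP epsilon-rule for one fully connected layer: relevance on
# the layer's outputs is redistributed to its inputs in proportion to their
# contributions a_j * w_jk. Chaining this over all layers amounts to a single
# backward pass over the network.
import numpy as np

def lrp_epsilon(a, W, b, R_out, eps=1e-6):
    """a: (J,) input activations, W: (J, K) weights, b: (K,) biases,
    R_out: (K,) relevance of the layer outputs. Returns (J,) input relevances."""
    z = a @ W + b                  # forward pre-activations, shape (K,)
    z = z + eps * np.sign(z)       # stabiliser to avoid division by zero
    s = R_out / z                  # normalised relevance per output neuron
    c = W @ s                      # redistribute back to the inputs, shape (J,)
    return a * c                   # relevance is conserved up to the epsilon term

# Toy usage: push the relevance of 3 output neurons back onto 4 inputs.
rng = np.random.default_rng(0)
a = rng.random(4); W = rng.standard_normal((4, 3)); b = np.zeros(3)
R_in = lrp_epsilon(a, W, b, R_out=np.array([0.2, 0.5, 0.3]))
print(R_in, R_in.sum())  # the sum stays close to the total output relevance
```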

# Publications

5 June 2024 | 25 min read | tags: Chess XAI

Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents

We propose contrastive sparse autoencoders (CSAE), a novel framework for studying pairs of game trajectories. Using CSAE, we are able to extract and interpret concepts that are meaningful to the chess agent's plans. We primarily focus on a qualitative analysis of the CSAE features before proposing an automated feature taxonomy. Furthermore, to evaluate the quality of our trained CSAE, we devise sanity checks to rule out spurious correlations in our results.
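
To give a rough intuition for how a contrastive term can be combined with a sparse autoencoder over paired trajectories, here is an illustrative sketch; the architecture, loss weights, and the exact contrastive objective are assumptions for exposition, not the paper's formulation.

```python
# Rough sketch: a sparse autoencoder on agent activations, with an extra
# contrastive term that pushes apart the feature codes of the two trajectories
# in a pair, so the differing features can be read as plan-level concepts.
# All hyperparameters and the contrastive term itself are illustrative.
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model, d_features):
        super().__init__()
        self.enc = nn.Linear(d_model, d_features)
        self.dec = nn.Linear(d_features, d_model)

    def forward(self, x):
        f = F.relu(self.enc(x))          # non-negative sparse code
        return self.dec(f), f

def csae_loss(model, x_a, x_b, l1=1e-3, contrast=1e-2):
    """x_a, x_b: activations taken from the two trajectories of a pair."""
    rec_a, f_a = model(x_a)
    rec_b, f_b = model(x_b)
    reconstruction = F.mse_loss(rec_a, x_a) + F.mse_loss(rec_b, x_b)
    sparsity = f_a.abs().mean() + f_b.abs().mean()
    # Illustrative contrastive term: reward codes that differ between the
    # paired trajectories, surfacing the concepts that separate the two plans.
    contrastive = -(f_a - f_b).abs().mean()
    return reconstruction + l1 * sparsity + contrast * contrastive
```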
