Articles

5 October 2024 | 14 min read | tags: AIS XAI FHE Eval

FHE for Open Model Audits

Thanks to recent developments, fully homomorphic encryption (FHE) can now be applied to deep neural networks easily and at scale. Like many others, I see these advances as a real opportunity to improve AI safety. I therefore outline possible applications of FHE to model evaluation and interpretability, which are in my opinion the most mature safety tools available today.

16 January 2024 | 16 min read | tags: XAI AlphaZero

Layer-Wise Relevance Propagation

Layer-Wise Relevance Propagation (LRP) is a propagation method that assigns a relevance score to each component of a given input with respect to a target output. Technically, the scores are computed in a single back-propagation pass, similarly to deconvolution. I illustrate the method on an AlphaZero network trained to play Othello.
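
To give a flavour of the single backward pass, here is a minimal sketch of one common LRP variant, the epsilon rule, on a tiny two-layer ReLU network in pure Python. It is not the article's implementation and not an AlphaZero network: the weights, the input, and the helper names (`dense`, `relu`, `lrp_epsilon`) are all made up for illustration.

```python
def dense(x, W, b):
    """Affine layer: z_j = sum_i x_i * W[i][j] + b_j."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]


def relu(z):
    return [max(0.0, v) for v in z]


def lrp_epsilon(x, W, b, relevance_out, eps=1e-6):
    """Redistribute the relevance of a dense layer's outputs onto its inputs.

    Epsilon rule: R_i = x_i * sum_j W[i][j] * R_j / (z_j + eps * sign(z_j)).
    """
    z = dense(x, W, b)
    # Small stabiliser on the denominator, with the same sign as z_j.
    s = [relevance_out[j] / (z[j] + (eps if z[j] >= 0 else -eps))
         for j in range(len(z))]
    return [x[i] * sum(W[i][j] * s[j] for j in range(len(z)))
            for i in range(len(x))]


# Hypothetical toy network: 3 inputs -> 2 hidden units (ReLU) -> 2 outputs.
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 1.0]]
b1 = [0.0, 0.0]
W2 = [[1.0, -0.5], [2.0, 1.0]]
b2 = [0.0, 0.0]

# Forward pass, keeping the activations needed for the backward pass.
x = [1.0, 2.0, 0.5]
h = relu(dense(x, W1, b1))
out = dense(h, W2, b2)

# Backward pass: start from the target output and propagate layer by layer.
target = 0
r_out = [out[j] if j == target else 0.0 for j in range(len(out))]
r_hidden = lrp_epsilon(h, W2, b2, r_out)
r_input = lrp_epsilon(x, W1, b1, r_hidden)

print("output score:", out[target])
print("input relevances:", r_input)
```

With zero biases, as in this toy setup, the input relevances sum (up to the epsilon stabiliser) to the target output score, which is the conservation property LRP is built around.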