Faithfulness of explainable text classifiers

January 17th, 2023. 13:15 - 15:00

Speaker: Gabor Recski

Faithfulness of an explainable ML model refers to the degree to which its explanations actually reflect the model's inner workings. How faithfulness should be evaluated quantitatively is the subject of ongoing controversy (see e.g. Jacovi and Goldberg 2020). We shall review some approaches to evaluating faithfulness, starting with that of the ERASER benchmark (DeYoung et al. 2020), and then proceed to discuss how these methods could be applied to the evaluation of rule-based classifiers.
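As background for the discussion, ERASER (DeYoung et al. 2020) scores a rationale by its comprehensiveness (how much the prediction drops when the rationale tokens are removed) and sufficiency (how close the prediction stays when only the rationale tokens are kept). The sketch below illustrates these two metrics with a toy bag-of-words classifier; the model and its keyword list are invented for illustration, not part of the benchmark.

```python
# Sketch of ERASER-style faithfulness metrics (DeYoung et al. 2020):
#   comprehensiveness = p(class | full input) - p(class | input minus rationale)
#   sufficiency       = p(class | full input) - p(class | rationale only)
# The toy classifier below is purely illustrative.

def toy_prob(tokens):
    """Toy classifier: probability of the positive class grows with
    the share of positive keywords in the input."""
    positive = {"great", "excellent", "good"}
    if not tokens:
        return 0.5  # no evidence: fall back to an uninformative prior
    hits = sum(1 for t in tokens if t in positive)
    return 0.5 + 0.5 * hits / len(tokens)

def comprehensiveness(tokens, rationale):
    """Remove the rationale tokens; a faithful rationale should cause
    a large drop in the predicted probability (high value = good)."""
    rest = [t for t in tokens if t not in rationale]
    return toy_prob(tokens) - toy_prob(rest)

def sufficiency(tokens, rationale):
    """Keep only the rationale tokens; a faithful rationale should keep
    the prediction close to the original (value near zero = good)."""
    kept = [t for t in tokens if t in rationale]
    return toy_prob(tokens) - toy_prob(kept)

tokens = "the movie was great and excellent".split()
rationale = {"great", "excellent"}
print(round(comprehensiveness(tokens, rationale), 3))  # prediction drops without the rationale
print(round(sufficiency(tokens, rationale), 3))        # rationale alone over-predicts slightly
```

Both metrics compare the model's output on perturbed inputs to its output on the full input, which is what makes them applicable in principle to any classifier, including rule-based ones.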

References:

Jacovi, Alon and Yoav Goldberg (2020). Towards Faithfully Interpretable NLP Systems: How Should We Define and Evaluate Faithfulness? In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

DeYoung, Jay, Sarthak Jain, Nazneen Fatema Rajani, Eric Lehman, Caiming Xiong, Richard Socher, and Byron C. Wallace (2020). ERASER: A Benchmark to Evaluate Rationalized NLP Models. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

Location: Zoom