Submission Type:
Article in Special Issue (note in cover letter)
Abstract:
With the widespread adoption of deep learning techniques, the need for explainability and trustworthiness has become increasingly
critical, especially in safety-sensitive applications and for improved debugging, given the black-box nature of these
models. The Explainable AI (XAI) literature offers various helpful techniques; however, many approaches either use a secondary deep-learning-based
model to explain the primary model's decisions or require domain expertise to interpret the explanations. A relatively
new line of work explains models using high-level, human-understandable concepts. While these concept-based methods have
proven effective, an intriguing direction is to use a white-box technique as the probing model.
We present a novel, model-agnostic, post-hoc Explainable AI method that provides meaningful interpretations for hidden neuron
activations. Our approach leverages a Wikipedia-derived concept hierarchy of approximately 2 million classes
as background knowledge and uses Concept Induction, based on deductive reasoning, to generate explanations. Our method demonstrates
competitive performance across various evaluation metrics, including statistical evaluation, concept activation analysis,
and benchmarking against contemporary methods. Additionally, a dedicated study with Large Language Models (LLMs) shows
that LLMs can serve as explainers in a manner similar to our method, achieving comparable performance with some
trade-offs. Furthermore, we have developed a tool called ConceptLens that lets users test custom images and obtain explanations
for model decisions. Finally, we introduce a fully reproducible, end-to-end pipeline that makes it straightforward to
replicate our system and results.
Cover Letter:
Submission to the NeSy 2024 special issue.
Please note that one of the authors has a conflict of interest due to his position as Editor-in-Chief (EiC) of the journal.
The paper will need to be handled in a way that preserves the anonymity of the reviewers if they so choose, i.e.:
* Contact reviewers *by email* (outside the system). You can still point them to the paper's page on the journal website, of course. Ask them whether they want to remain anonymous.
* If they do *not* want to remain anonymous, you can invite them as reviewers through the website interface.
* If they want to stay anonymous, they need to send their reviews to you by email, and you will have to upload each review for them (technically under your name, but state at the beginning of the review that it is an anonymous review).
Thanks!
Pascal.