By Anonymous User
Review Details
Reviewer has chosen to be Anonymous
Overall Impression: Good
Content:
Technical Quality of the paper: Good
Originality of the paper: Yes
Adequacy of the bibliography: Yes, but see detailed comments
Presentation:
Adequacy of the abstract: Yes
Introduction: background and motivation: Good
Organization of the paper: Satisfactory
Level of English: Satisfactory
Overall presentation: Good
Detailed Comments:
The proposed KEA Explain is a neurosymbolic method for LLM hallucination detection and explanation. Its main strengths include the adoption of graph-based heuristics, specifically the Weisfeiler-Lehman subtree kernel, for robust structural comparison of knowledge graphs. Furthermore, the generation of contrastive explanations is a defining feature, directly addressing the critical "lack of explainability" limitation of existing methods by detailing discrepancies between the claim and the ground truth facts.
Strengths:
- Adopting graph-based heuristics to evaluate hallucinations is a novel and compelling approach. Using a subtree kernel for structural comparison seems a powerful way to move beyond simple triple-matching methods and incorporate the wider context of the knowledge graph structure (a small illustrative sketch follows this list).
- The framework's ability to generate contrastive explanations is its defining strength, directly addressing the critical "lack of explainability" limitation of existing methods. The explanations detail not only why a statement is hallucinatory but also what change would correct it.
- The method is relevant for the NeSy community, as it combines the strengths of symbolic components (KGs, graph kernels) with neural techniques (SBERT embeddings for semantic clustering). This allows the symbolic comparison to account for the semantic similarity of labels, making the comparison robust.
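To make the point about structural comparison concrete for other readers, the following is a minimal, self-contained sketch of a Weisfeiler-Lehman subtree kernel over two toy labelled graphs. It is my own illustration, not the authors' implementation; the node-labelling scheme and the toy graphs are invented for the example.

```python
# Minimal WL subtree kernel sketch (reviewer's illustration, not the paper's code).
from collections import Counter

def wl_histogram(adj, labels, iterations=2):
    """Accumulate the multiset of WL labels over several refinement rounds.
    adj: {node: [neighbours]}, labels: {node: initial label string}."""
    hist = Counter(labels.values())
    for _ in range(iterations):
        # WL relabelling: compress (own label, sorted neighbour labels) into a new label.
        labels = {
            node: labels[node] + "|" + ",".join(sorted(labels[n] for n in neighbours))
            for node, neighbours in adj.items()
        }
        hist.update(labels.values())
    return hist

def wl_kernel(adj_a, lab_a, adj_b, lab_b, iterations=2):
    """Kernel value = dot product of the two graphs' WL label histograms."""
    h_a = wl_histogram(adj_a, lab_a, iterations)
    h_b = wl_histogram(adj_b, lab_b, iterations)
    return sum(h_a[label] * h_b[label] for label in h_a.keys() & h_b.keys())

# Two toy graphs with different entities but identical local structure and type labels:
# the kernel rates them as similar, which exact triple matching would miss.
claim_adj = {"Paris": ["France"], "France": ["Paris"]}
claim_lab = {"Paris": "city", "France": "country"}
truth_adj = {"Rome": ["Italy"], "Italy": ["Rome"]}
truth_lab = {"Rome": "city", "Italy": "country"}
print(wl_kernel(claim_adj, claim_lab, truth_adj, truth_lab))
```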
Weaknesses:
- I think the performance of the approach relies heavily on the quality of the constructed ground-truth KG, which in turn depends on several possible failure points: the SentenceBERT embeddings, the spaCy entity linker, and the empirically chosen similarity thresholds. The comparison with baselines suggests that the model's value may ultimately lie more in its structured explanations than in detection performance alone.
- Evaluating open-domain hallucination detection only on the WikiBio dataset is somewhat limiting, given its modest size. Furthermore, while competitive, the detection performance appears suboptimal compared to certain baselines.
- The Related Work section needs a deeper discussion of the interpretability limitations of existing models and a more robust justification for preferring the proposed approach over them.
Additional comments:
- Regarding Algorithm 1, the condition "attributes differ" needs clarification. I assumed it refers to a discrepancy in the relation label (the r in an (h, r, t) triple) even when the head and tail entities are identical; however, in the discussion the authors refer to a conflict at the entity level (Paris vs. Rome, i.e. differing tail entities).
- The paper clearly states that an LLM generates the natural language explanation, yet it specifies neither which LLM is used for this task nor an example of the prompt used for explanation generation.
- The ground-truth KG is filtered by selecting, for each claim KG triple, the ground-truth triple that maximizes the cosine similarity of their SBERT embeddings (arg max). From my understanding, this reliance on arg max is a concern because it can select irrelevant context triples as long as they are the closest available in the embedding space, potentially retaining triples that are only loosely related (similar domain) but not directly tied to the entities being evaluated. A strict similarity threshold applied alongside the arg max would make this filtering step more robust (a small sketch follows at the end of these comments).
- The reliance on empirically chosen graph-kernel similarity thresholds, which vary significantly by task, adds complexity. This sensitivity to domain and task means that real-world deployment would likely require re-optimizing the threshold before use.
- The framework incurs a significant practical computational burden. I think that programmatically constructing ground truth KGs and, for open-domain tasks, retrieving relevant facts from Wikidata on the fly via SPARQL queries can be time-consuming, which could be a limitation for real-time applications. Is this limitation worth discussing?
- The observation that explanation quality monotonically declines as hallucination severity decreases is important. It suggests the current method struggles with subtle inconsistencies, as it is too reliant on finding concrete conflicting triples, which are sparser in nuanced hallucinations.
- For the triple notation, e.g., (h,r,t), using italics to distinguish variable names from prose is recommended for clarity.
- This is more of a suggestion: the structured nature of the contrastive explanation (based on graph edit operations) is ideally suited to generating follow-up prompts (e.g. to the same LLM that generated the hallucinated sentence) to actively correct the hallucination. This could be a valuable direction for future exploration.
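To make the filtering suggestion above concrete, here is a minimal sketch of arg-max retrieval guarded by a strict similarity threshold. It is my own illustration, not the authors' implementation; the SBERT model choice, function names, example triples, and the threshold value are all hypothetical.

```python
# Reviewer's sketch: arg-max triple retrieval plus a strict similarity threshold.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # any SBERT model would do

def filter_ground_truth(claim_triples, gt_triples, threshold=0.6):
    """For each claim triple, keep its best-matching ground-truth triple,
    but only when the cosine similarity clears the threshold."""
    claim_emb = model.encode([" ".join(t) for t in claim_triples], convert_to_tensor=True)
    gt_emb = model.encode([" ".join(t) for t in gt_triples], convert_to_tensor=True)
    sims = util.cos_sim(claim_emb, gt_emb)      # shape: (n_claim, n_gt)
    kept = []
    for i in range(len(claim_triples)):
        j = int(sims[i].argmax())               # arg max, as in the paper
        if float(sims[i][j]) >= threshold:      # extra guard suggested above
            kept.append(gt_triples[j])
    return kept

# Hypothetical example
claim = [("Paris", "capital_of", "Italy")]
ground_truth = [("Rome", "capital_of", "Italy"), ("Paris", "capital_of", "France")]
print(filter_ground_truth(claim, ground_truth))
```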