By Lia Morra
Review Details
Reviewer has chosen not to be Anonymous
Overall Impression: Average
Content:
Technical Quality of the paper: Good
Originality of the paper: Yes, but limited
Adequacy of the bibliography: Yes, but see detailed comments
Presentation:
Adequacy of the abstract: Yes
Introduction: background and motivation: Good
Organization of the paper: Satisfactory
Level of English: Satisfactory
Overall presentation: Average
Detailed Comments:
Summary and contribution
The manuscript proposes an XAI technique for GNN embeddings that aims to produce a global explanation of the graph embeddings. It is based on Functional Semantic Activation Mapping (FSAM), recently proposed in a NeSy 2024 paper, of which the present manuscript is an extension. FSAM constructs a semantic graph that relates hidden nodes to output classes based on correlations between activations. This graph can be analyzed using, e.g., community detection, to characterize the network’s behavior. Experiments conducted on several benchmarks show that FSAM can analyze GNN behavior on a layer-by-layer basis. FSAM is used to explore GNNs of increasing depth, showing that increased correlation between activations is associated with a decrease in performance due to over-smoothing.
Strengths
- The manuscript focuses on an important yet underexplored topic (GNN explainability at the model rather than instance level)
- Extensive experimental validation is provided to show that FSAM quality correlates with GNN accuracy, and that increasing the number of layers reduces neuron specialization
- Community analysis provides a way to group classes that are considered similar by the network, thus providing insights into its inner workings, especially when correlated with misclassification patterns
Weaknesses
- The technical novelty is limited with respect to the conference paper: the primary contribution consists of additional experiments
- While the paper claims that “FSAM quality” correlates with GNN accuracy, the concept of quality is only vaguely defined. It is not clear whether FSAM is interpretable per se or whether it is validated against a reference standard
- The manuscript could be improved in terms of clarity and flow
Detailed remarks
1. One of the claims of the paper is that FSAM quality decreases due to over-smoothing (page 2, L20-26). A brief introduction to the concept of over-smoothing, and a summary of related work on the topic, would strengthen and clarify the contribution of this paper. Over-smoothing has long been studied in the machine learning literature [1], and it can be quantified directly from the learned representations (see the generic sketch after reference [1] below). What are the unique advantages of FSAM, if any, in detecting over-smoothing? Or, conversely, is the fact that the FSAM output indicates over-smoothing evidence of its validity, given that over-smoothing is a well-studied phenomenon?
2. In Table 1, the content of each column should be clarified, ideally in the caption. While columns such as type or black-box are self-explanatory, task, target, flow, and design are less so. Abbreviations used in the table should be defined in the caption or in the text.
3. I like the idea of having Section 4 with the detailed contribution, but a lot of content is repeated from the introduction. The introduction could be shortened to avoid repetition
4. It is not clear to me what cross-domain validation means in the context of FSAM validation (page 6, line 9). I think validation on multiple domains would be clearer, as cross-domain is often used to imply that the system is trained/configured on one domain and tested/used on another
5. The description of FSAM is much more concise than in the NeSy paper. I understand that this choice leaves room to expand on the experimental validation and limits repetition. However, in the interest of a more self-contained manuscript, I believe it would be useful to at least briefly define all elements of the methodology. In particular, the following concepts are mentioned but not defined or explained: the notion of ego-graph (page 5, line 10); how each activated neuron is mapped to the final predicted class (page 7, lines 34-38); and how communities are extracted (Section 5.4)
6. In Fig. 1 there are two nodes that are separated from the rest of the graph. Is this an artifact of the visualization, or are they nodes with distinct characteristics?
7. It is not entirely clear to me why the graphs in Figs. 1-4 are “semantic graphs”, since most of the nodes are layers and are thus labelled with strings that do not, by themselves, carry any semantic meaning. If I understand the paper correctly, the semantics are given by the connections with the predicted labels, but these are hard to interpret visually. It is also not evident which neurons correspond to each layer, and thus how a layer-by-layer comparison can be obtained. In Figs. 2-4, all labels appear to refer to the first convolutional layer (Conv1*), so it is not self-evident how the structure changes across layers or with networks of different depths.
8. In Table 2, how is the layer-wise accuracy calculated? Or, if the table compares the accuracy of separate networks of different depths, then the caption should be revised (by “layer-wise accuracy”, I would understand a single four-layer network whose accuracy is computed at the end of the first, second, third, and fourth layers).
9. Table 3 includes the absolute mistake count; also including the percentage figures would improve clarity.
10. Sections 5.4 and 5.5 are quite long and would benefit from a revision to better summarize the substantial number of experiments described in the paper. Section 5.5 also makes several references to key figures (page 15, lines 1-24) that do not correspond to the actual content of the paper. Some references are unclear: for instance, there is a reference to Section 8, which does not exist in the paper, and to visualizations of community structures that are not present in the manuscript. Page 15, line 25 refers to Table 2, but the sentence seems more consistent with Table 3. I think that revising these two final sections to improve readability, clarify conclusions, and better connect them with the experimental results would strengthen the paper.
[1] Li, Qimai, Zhichao Han, and Xiao-Ming Wu. "Deeper insights into graph convolutional networks for semi-supervised learning." Proceedings of the AAAI conference on artificial intelligence. Vol. 32. No. 1. 2018.
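As a side note on point 1: over-smoothing can already be quantified directly from the learned node representations, independently of any explanation method, for example as the mean pairwise cosine similarity of node embeddings after each layer. The sketch below is a generic illustration of this common baseline, not the manuscript’s FSAM; the variable names and the list of per-layer embeddings are hypothetical.

```python
# Generic over-smoothing proxy (not the manuscript's method): mean pairwise
# cosine similarity of node embeddings after a given GNN layer. Values close
# to 1 indicate that node representations have become nearly indistinguishable.
import torch
import torch.nn.functional as F

def layer_smoothness(h: torch.Tensor) -> float:
    """h: [num_nodes, dim] node embeddings produced by one layer."""
    h = F.normalize(h, dim=1)                # unit-norm rows
    sim = h @ h.t()                          # [N, N] cosine similarity matrix
    n = h.size(0)
    off_diag = sim.sum() - sim.diag().sum()  # exclude self-similarities
    return (off_diag / (n * (n - 1))).item()

# Hypothetical usage: per_layer_embeddings is a list of [N, d] tensors collected
# from a trained GNN; values rising towards 1 across layers suggest over-smoothing.
# for k, h_k in enumerate(per_layer_embeddings):
#     print(f"layer {k}: mean cosine similarity = {layer_smoothness(h_k):.3f}")
```

Relating such a direct measure to the FSAM-based observations would help clarify what FSAM adds beyond established diagnostics.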
Typos:
Page 6, line 8 (and 16/24/31): in the subsection 5.2 We are extending -> in Subsection 5.2, we are extending
Page 6, line 9: what is 5.1? A subsection, I imagine
I would avoid mid-sentence capitalization (e.g., “We” on page 6, line 8)
Page 7, line 44: However, As shown -> However, as shown
Page 11, lines 31-48: the same paragraph is repeated