By Anonymous User
Review Details
Reviewer has chosen to be Anonymous
Overall Impression: Weak
Content:
Technical Quality of the paper: Weak
Originality of the paper: Yes, but limited
Adequacy of the bibliography: Yes, but see detailed comments
Presentation:
Adequacy of the abstract: Yes
Introduction: background and motivation: Limited
Organization of the paper: Poor
Level of English: Satisfactory
Overall presentation: Weak
Detailed Comments:
This manuscript considers the question "what can knowledge graph alignment (KGA) approaches gain with Neuro-Symbolic approaches?". This is indeed an interesting question, and a paper answering it could be a very important contribution, since there is, as the authors say, a discrepancy between logical methods and learning methods based on lexical, structural, and semantic data.
However, I find the contribution of this paper unclear. I would expect the paper to specify what exactly the state-of-the-art (SotA) methods in KGA are, what exactly their problems are, and how these could be bridged by a NeSy integration. Instead, the paper is composed of:
- a KGA section that focuses on defining the problem (this section does include a subsection on SotA methods, but their presentation covers only the features they use, without an analysis of their pros and cons);
- a NeSy section that discusses NeSy broadly, not in the context of KGA;
- a "challenges and opportunities" section that does not seem to follow from the setup of the previous sections, and points out challenges that are intuitive but come out of the blue.
With this organization, the reader gets lost in what exactly the paper is trying to do.
Section 1 reads relatively well. Here, I was puzzled by the phrase "Recent subsymbolic approaches, including linguistic and structural models like Attention Networks and Graph Neural Networks..." - what exactly is the linguistic approach here, attention networks? Also, later in the paper deep neural networks are considered NeSy methods - why, then, is their structured variant considered "subsymbolic" in §1?
Section 2:
- The notion of a KG seems narrowed to graphs that rely on the RDF and OWL formalisms. Note that the most widely used KGs of today, like Wikidata or ConceptNet, do not align with this formalism: while Wikidata can be translated to RDF, there is no OWL in it, and its Qualifier model is generally considered a different class of models. Property graphs do not fit either. Does this mean that KGA is not relevant for those KGs?
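To make the modeling gap concrete, here is a schematic illustration (my own, not the paper's; the IDs are real Wikidata identifiers, the Python structures are purely for exposition):

```python
# Plain RDF-style triple: a single (subject, predicate, object) fact.
rdf_triple = ("Q937", "P69", "Q11942")  # Einstein educated-at ETH Zurich

# Wikidata statement: the same fact carries qualifiers, references, and a
# rank, i.e., the statement itself is an annotated object, not a bare triple.
wikidata_statement = {
    "subject": "Q937",
    "property": "P69",
    "value": "Q11942",
    "qualifiers": {"P582": "1900"},  # end time qualifier
    "rank": "normal",
}
# Translating this to plain RDF requires reifying the statement node, and
# nothing in the Wikidata model corresponds to OWL axioms such as
# subclass or disjointness constraints.
```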
- "The consistency principle [16] stipulated that every named concept ... should be satisfiable" - please describe why the consistency is applied to named concepts and not to attributes, relations, etc.
- As §2.1 has very few citations, it is unclear whether this section is intended as a novel contribution (i.e., a formalization of the task) or whether it is adapted from prior work. Parts such as the softconsist formula trigger this question; it would be good to know where this formula comes from. Similarly, §2.1.2 makes various claims (e.g., "the majority of existing ... are unsupervised" and "the candidate generation ... uses ...") that would also benefit from support. And §2.1.3 again has no pointers to prior work.
- What is K^M? It is used but never defined.
- §2.2 makes an effort to review the state of the art, but this review has several issues:
1) the review classification comes from work on ontology matching (OM). The paper never discusses the relationship between the envisioned KGA and OM. Given the definition in §2.1 and the focus on OWL schemas, the two seem to overlap substantially; in that case, it is unclear what the novelty of §2.2 is. If there is a difference, then the authors should motivate why reusing a classification from another task is appropriate.
2) the review focuses on what the systems use as knowledge sources, but at no point is there a review of how the systems operate (i.e., what the method is). Without knowing what the methods are, it becomes hard to appreciate the methodological challenges presented in §4. Instead, the section contains many vague phrasings that seem to deliberately avoid providing more detail: "is put forth by", "capitalizing on", "harnessing", "directing attention", "several works have been published in this area", ...
3) the classification seems rather inconsistent to me: the first class has only two coarse-grained subclasses, while the second has six that are split very finely (e.g., string-based vs. language-based; graph-based vs. instance-based).
4) at the end of §2.2, the authors single out two methods as "go-to choices". There is, again, no description of how these methods work, how they differ from other methods, or why they are favored over them.
5) There is a mention of the "Bio-ML track", which is "of particular significance" because "it is the first to emphasize machine learning... for ontology alignment". Again, the relation between ontology alignment and KGA is not specified.
Section 3:
- I found this section to be unaligned with the topic of the paper. The section does not talk about KGA at all; instead, it provides general motivations for NeSy.
- Also, there is again a variety of arguments that are left semi-addressed:
1) "symbolic approaches face challenges" (I count only one challenge in the given paragraph, that of reliance on human input);
2) a mention of the "current limitations" of deep learning without contextualizing these in the context of KGA;
3) "Due to these limitations, ... CYC ... have largely fallen out of favor" - it is unclear that CYC has "fallen out of favor" only because it relied on people (e.g., it was largely proprietary), plus CYC did gradually shift towards including automated modules as well.
- §3.1 talks about integration, summarizing prior work by Hitzler et al. This section is also not focused on KGA, so its relevance to the paper is unclear, as is its novelty.
Section 4:
- Section 4 is arguably where the main novelty of the paper lies. The section points out issues of KGA methods and discusses how they are, or could be, addressed. My main problem with this section is that it is not well aligned with the rest of the paper. The mapping and repair categories, which are its organizing factors, are mentioned earlier only in passing. The five challenges relate to other challenges raised in the paper, but their solutions (the cells in the table) do not clearly align with the prior description of the methods. A nice contribution would have been to evaluate the adequacy of the prior method categories (or individual methods) with respect to each challenge, and to indicate which of them are most promising for solving it.
- I was confused by the sudden proposal of an approach in §4.1, followed by the statement that a "similar architecture was already introduced". It might be better to start from that architecture and discuss what could be improved.
- §4.2 argues that LLMs rely on large corpora and that KG inference must be induced from thousands of examples. However, this argument misses the fact that such LLMs are already available, and recent paradigms like in-context learning allow us to perform inference with them without fine-tuning and without a large number of examples (see the sketch below). Interestingly, a similar argument appears two paragraphs later, so the two passages should perhaps be consolidated.
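To illustrate the in-context learning point, a minimal sketch (the `complete` function is a hypothetical stand-in for any LLM completion API; the demonstrations are mine, not the paper's):

```python
def align_with_llm(complete, source_entity, candidate_entity):
    """Few-shot, in-context entity alignment: no fine-tuning, and only a
    handful of demonstrations rather than thousands of training pairs."""
    prompt = (
        "Decide whether the two entities refer to the same real-world thing.\n"
        "Entity 1: heart attack (disease) | Entity 2: myocardial infarction -> yes\n"
        "Entity 1: aspirin (drug) | Entity 2: ibuprofen -> no\n"
        f"Entity 1: {source_entity} | Entity 2: {candidate_entity} -> "
    )
    # The model answers directly from the prompt; no gradient updates occur.
    return complete(prompt).strip().lower().startswith("yes")
```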
- §4.3 (achieving correctness) as a separate challenge seems confusing: isn't producing correct alignments the basic objective of the task? According to §2, it is. Discussing accuracy can certainly be beneficial, as long as the paper indicates which aspects of the model or the task are currently challenging and how these can be improved.
- §4.5 claims that all methods are non-transparent. While this statement is hard to judge, since the workings of the methods are not given in the paper, §2.2 seems to allude that some methods provide an intrinsic way to derive explanations. §4.5 would thus also be improved by a more granular distinction between the method families that address this challenge and those that do not.
- In §4.5, it sounds as if post-hoc explanations were an invention of the semantic web community. This is certainly not the case: popular post-hoc explainable AI methods like LIME and SHAP come mainly from machine learning research.
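For reference, such methods are model-agnostic and would apply directly to a learned matcher; a minimal LIME sketch on a toy match classifier (the feature names and data are illustrative, not from the paper):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Toy matcher: features stand in for similarity signals between two entities.
feature_names = ["label_sim", "neighbor_overlap", "embedding_cosine"]
X = np.random.rand(200, 3)
y = (X.mean(axis=1) > 0.5).astype(int)  # synthetic match / no-match labels
model = RandomForestClassifier().fit(X, y)

# Post-hoc explanation of a single alignment decision.
explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                 class_names=["no match", "match"],
                                 mode="classification")
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=3)
print(exp.as_list())  # per-feature contributions to this one prediction
```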
Section 5:
- the conclusion seems very generic to me. With phrases like "we advocate for research endeavours that transcend singular methodologies" and an "aspiration ... that collectively surpasses prevailing state-of-the-art algorithms", it reads as a truism rather than a summary of a vision for bringing NeSy to KGA.
In summary, I do see a lot of value in the premise of this paper, and I can certainly see informative content in it, especially the challenges. However, to make it a paper that can serve as a review of state-of-the-art NeSy systems for KGA, I suggest that the presentation, contributions, and argumentation be thoroughly revised and re-focused.