By Anonymous User
Review Details
Reviewer has chosen to be Anonymous
Overall Impression: Average
Content:
Technical Quality of the paper: Average
Originality of the paper: Yes
Adequacy of the bibliography: Yes, but see detailed comments
Presentation:
Adequacy of the abstract: Yes
Introduction: background and motivation: Good
Organization of the paper: Needs improvement
Level of English: Satisfactory
Overall presentation: Average
Detailed Comments:
The paper proposes an embedding method for knowledge graphs that combines graph neural networks with box embeddings. This is demonstrated on prediction tasks and on the evaluation of KG revisions.
The topic is in the scope of the NAI journal and presents an interesting extension for modeling ontological information geometrically. It seems to be a sufficient extension of the conference article.
The overall aim of the paper is not fully clear: whereas the title proposes a general framework for an embedding method, the paper is highly focused on one specific use case. It would therefore be necessary to make the distinction between the general approach and its application to the use case clearer throughout the paper. E.g., Section 3 could focus on the general approach, whereas Section 4 would describe the specific KG. This would also improve the understandability of the approach, as it is partly unclear which design decisions are use-case-specific and which are general decisions for the approach.
The approach of combining a box embedding and a GNN seems promising; however, it needs a more detailed explanation and evaluation. Many preliminaries are not introduced (e.g., Description Logics, GNNs, GraphSAGE), making parts of the paper hard to follow.
Three different use cases of the approach are given, exemplifying its usefulness. However, the approach could have been evaluated in more detail, especially by comparing it to other approaches and using different datasets.
As the idea and the results are promising, I think that the paper could be acceptable after revisions, as stated in detail below.
Detailed comments:
# Abstract
- "interpretability techniques to identify co-occurring edges...": This sentence is unclear: what are interpretability techniques, and what is an edge? Is a relation meant?
- The abstract should briefly state the results of the experimental section
# Introduction and Related Work
- As this is a journal paper, there is enough space to make an explicit related work section and to discuss the related work in a lot more detail.
- The distinction between KGs as sets of (subject, predicate, object) triples and KGs enriched with ontological or hierarchical information should be made clear at the beginning, so that it becomes clear which approaches can model concept-level information and which only instance-level information.
- A more detailed introduction to the principle of KGE and especially on the idea of representing concepts as boxes (or some other geometric object) is needed, as this is not a straightforward viewpoint.
- Kulmanov et al. are not only able to model hierarchies; EL++ is more expressive. It is necessary to discuss here in detail why such an approach, which also allows modeling concept conjunction, role inclusions, etc., is not used, and only a hierarchy is modeled.
- "Instead of representing relations as translations of classes" -> The approaches not using GNNs are also not restricted to translations; there are many other ideas, especially when concept information is not considered. A more detailed discussion is necessary here.
- "Box embeddings have been combined with GNNs..." -> This is unclear: what is the difference? For which purpose have the box embeddings been considered, if not for ensuring correctness? What does "semantically correct" actually mean? Being in line with the hierarchy?
- The general introduction of the topic is too short; a general overview of the proposed approach, its goals, etc. should be given in more detail before the use case is discussed.
- How expressive are the ontologies mentioned? Not all of them are solely hierarchies, how much information is lost if the non-hierarchical part of the ontology is ignored?
- "KG embeddings have, for example, been used by Gualdi et al. (2024) to predict genes associated with diseases from a protein interaction KG." -> What is the difference to the proposed approach?
- In general, the related work should be discussed in more detail: there are many approaches able to model hierarchies, and also approaches using box embeddings in a similar domain (for hierarchies see, e.g., Zhanqiu Zhang, Jianyu Cai, Yongdong Zhang, Jie Wang: "Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction", and others; for the similar domain see, e.g., Adel Memariani, Martin Glauer, Simon Flügel, Fabian Neuhaus, Janna Hastings, Till Mossakowski: "Box embeddings for extending ontologies: a data-driven and interpretable approach").
- Also, box-embedding approaches such as TransBox, ELBE, etc. need to be discussed in more detail: what are their limitations, and why should the proposed approach overcome these limitations?
- The introduction needs an overview of the paper ("In Section 2, we present...") and a detailed discussion on the exact tasks performed and the outcomes of the experiments.
# Preliminaries
- What is a "mindeltaBoxTensor constructor"? Why is it used?
- An introduction to description logic is needed, as DL-terminology is used throughout the paper.
- Box embeddings are introduced in Section 3; however, when defining the semantic loss, these embeddings are assumed to be known. There should be a short introduction to box embeddings in general in the preliminaries. First, hierarchies should be defined. Hierarchies do not necessarily involve disjointness axioms; the ones considered here seem to involve them. Some hierarchies could involve axioms of the type $A\sqsubseteq \exists R.C$, which seems not to be the case here. After defining the problem statement and how an embedding should look, the loss functions can be defined.
- Why is the $L_{distance}^{-}$ loss defined like this? Two boxes are disjoint if they are disjoint in at least one dimension. Does enforcing non-intersection in all dimensions improve the experimental outcomes?
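To illustrate the point above: a minimal sketch (with hypothetical box representations as lower/upper corner sequences, not the paper's actual data structures) showing that separation in a single dimension already makes two axis-aligned boxes disjoint:

```python
def boxes_disjoint(lo1, hi1, lo2, hi2):
    # Intervals [lo1_i, hi1_i] and [lo2_i, hi2_i] fail to intersect in
    # dimension i iff min(hi1_i, hi2_i) < max(lo1_i, lo2_i).
    # The boxes are disjoint iff this happens in at least one dimension.
    return any(min(h1, h2) < max(l1, l2)
               for l1, h1, l2, h2 in zip(lo1, hi1, lo2, hi2))

# Overlap in dimension 0, but separated in dimension 1 -> disjoint overall:
print(boxes_disjoint([0, 0], [2, 1], [1, 5], [3, 6]))  # True
# Overlap in every dimension -> intersecting:
print(boxes_disjoint([0, 0], [2, 2], [1, 1], [3, 3]))  # False
```

This suggests a per-dimension minimum (rather than a sum or maximum over all dimensions) would already suffice to certify disjointness.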
- What is the influence of using Gumbel random variables? How is this done in detail?
# Material and Methods
- Introductory sentence: In this section, we will first ..., then ...
- Before explaining box embeddings and the like, there should be a detailed problem statement: What should be learned, what is given, what is the input to the GNN, etc.
- What is the outcome of the GNN training and why should the output of each layer be treated as a box? What are the expected advantages of using boxes? Isn't it possible to model the hierarchy in some other way? This seems to be the main idea of the paper and therefore needs a lot more detail. Also, a small artificial example would be helpful.
- "Heterogeneous KGs, with different domains made up of classes we do not want to embed together, can have separate embeddings for each domain, which are trained using separate class hierarchies." -> this should be formalized.
- How is the negative sampling done? Based on instances or concepts? Is it actually necessary? Wouldn't it be possible to set for each $A\sqsubseteq B$ in the hierarchy the constraint that $B\not\sqsubseteq A$? Isn't there any disjointness information given in the ontology?
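The alternative suggested above can be sketched as follows (purely illustrative; the function name and the encoding of $A\sqsubseteq B$ as an (A, B) pair are my own, not the paper's):

```python
def reversed_negatives(subsumptions):
    """Given subsumption axioms A <= B as (A, B) pairs, use the reversed
    pair (B, A) as a negative sample, unless B <= A is also asserted
    (i.e. the two classes are stated to be equivalent)."""
    positives = set(subsumptions)
    return [(b, a) for (a, b) in subsumptions if (b, a) not in positives]

axioms = [("Dog", "Mammal"), ("Mammal", "Animal")]
print(reversed_negatives(axioms))  # [('Mammal', 'Dog'), ('Animal', 'Mammal')]
```

Such hierarchy-derived negatives would be deterministic and guaranteed to be false (for a strict hierarchy), in contrast to random sampling, which may produce false negatives.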
- Handling nominals should not be discussed in 3.2 but earlier, when talking about box embeddings in general.
- Handling nominals as boxes could lead to problems: how is it interpreted if the boxes representing {a} and {b} intersect but are not equal? What happens if the box is not fully contained in a concept but only partially? In the experimental section, this happens several times.
- How could $\{a\}\sqsubseteq \exists r.\{b\}$ be represented? In my understanding, only the hierarchy is represented in the box embedding. Or is this only used for the GNN part?
## 3.3
- It would be helpful to have a strict separation between the general method and its application, perhaps in two different sections: one section describing the embedding method, the GNN, etc. in full detail, and one section describing the application to this specific use case. Currently, 3.3, 3.4, and 3.5 mix general discussion of how an approach for prediction or graph revision could look with specific discussion of one dataset. Especially 3.4 and 3.5 are not only usable with this specific data, and there should therefore be a strict separation between method and use case.
- The use case in 3.3 needs a formal definition of the task to solve and the given data, especially which ontology exactly is used and how it is structured (e.g., how deep is the hierarchy?)
- Is the domain separation done manually? Could this be automated to use the approach not only for this specific dataset? I understand that the domain separation allows for varying the size of the embedding based on the number of classes in the domain, however, why should it decrease the overall dimensionality needed? If the classes are disjoint, then they could be represented as non-overlapping boxes next to each other in the same dimension. Thus, only the maximum dimensionality of the domains would be needed.
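The packing argument above can be made concrete with a small sketch (hypothetical representation: each domain's bounding box as (lower, upper) corner lists; not the paper's implementation). Disjoint domains can share one space by shifting them apart along a single dimension:

```python
def pack_domains(domain_boxes, gap=1.0):
    """Shift each domain's bounding box along dimension 0 so that mutually
    disjoint domains occupy non-overlapping slots of one shared space."""
    packed, offset = {}, 0.0
    for name, (lo, hi) in domain_boxes.items():
        shift = offset - lo[0]
        packed[name] = ([lo[0] + shift] + list(lo[1:]),
                        [hi[0] + shift] + list(hi[1:]))
        offset += (hi[0] - lo[0]) + gap  # gap enforces disjointness in dim 0
    return packed

# Two hypothetical domains, each originally embedded at the origin:
boxes = {"genes": ([0, 0], [2, 3]), "diseases": ([0, 0], [4, 1])}
print(pack_domains(boxes))
```

Under this construction, only the maximum dimensionality over all domains is needed, which is the point of the question above.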
- How many infrequent edges have been removed? Less than 1000 occurrences seems to be a quite high threshold. Have you observed a difference in result quality when setting a lower threshold?
- What is meant by "removing overlapping edges"? What are "nodes" in this context, instances or classes?
- What does $\sqsubseteq^*$ mean? Is this not only the set of direct parents but the set of all ancestors? This needs more detail: does it mean that $c\sqsubseteq\overline{p}$ is used as a negative sample?
- GraphSAGE should be at least briefly introduced.
- Figure 3 seems to be a quite informative figure but needs to be explained in more detail.
- Why is the dimension of the most common target domains higher? Should this not depend on the complexity of the hierarchy? Or is such a high dimension necessary for the GNN and not dependent on the hierarchy to be modeled?
## 3.4
- Why do we need a box embedding without a prediction task? What is the goal of this embedding?
- Why is such a simple ontology used?
- Why are the disjointness axioms randomly selected? Why aren't all available disjointness axioms used?
- This ontology does not incorporate relations. There are many approaches capable of learning an embedding of more complex ontologies (e.g., with conjunction or disjunction) geometrically, some of them with boxes. What is the advantage of this approach compared to the others? I understand the advantage when modeling relations; however, in this task, there seem to be no relations.
## 3.5
- This section needs a lot more detail: What is the exact aim, what is related work?
- If an edge represents a role assertion between two classes, is it then a directed edge representing $A\sqsubseteq \exists R.C$? But how can box embeddings then be trained? In my understanding, it is not possible to train axioms of the type $A\sqsubseteq \exists R.C$, only of the type $A\sqsubseteq B$.
# Results
- A comparison with standard box-embedding approaches, e.g., ELBE or BoxEL, is missing; a comparison to standard KGE methods (like a TransE-based method) not relying on hierarchical information would also have been interesting.
- Some of the standard datasets for knowledge base embeddings could have been used to reduce the dependence of the evaluation on one dataset.
- How well is the hierarchy actually modeled? In Figure 7, it seems that the axioms are not fulfilled, even though the ontology is overly simple and could easily be represented in two dimensions. Not all instance representations are fully included in the concept representations, and the disjointness between Woman and Country is not satisfied. Is this a general problem or only in this toy example?
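Containment of an instance box in a concept box can be verified directly from the learned corners; a minimal check of this kind (hypothetical helper, assuming boxes as lower/upper corner sequences) could be reported quantitatively alongside Figure 7:

```python
def box_contained(inner_lo, inner_hi, outer_lo, outer_hi):
    # The inner box lies inside the outer box iff, in every dimension,
    # the inner interval is a subinterval of the outer interval.
    return all(ol <= il and ih <= oh
               for il, ih, ol, oh in zip(inner_lo, inner_hi,
                                         outer_lo, outer_hi))

# A box [1,2] x [1,2] inside [0,3] x [0,3]:
print(box_contained([1, 1], [2, 2], [0, 0], [3, 3]))  # True
# A box protruding in dimension 0 is not contained:
print(box_contained([1, 1], [4, 2], [0, 0], [3, 3]))  # False
```

Reporting the fraction of axioms satisfied by such a check would make the claim about hierarchy fidelity verifiable rather than visual.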
## 4.1
- Why is it compared to LightGBM and not to some other approaches?
- Have you tested a GNN with box embeddings but without ontology information? This would show whether the box structure in general is helpful, or whether it is really the hierarchy information.
- Why does the parity plot in 4(b) and (d) "show promise in generalizing to a new task"?
## 4.2
- The goal of this section could be clarified.
- For me, as someone without any knowledge of biology, the section is hard to follow. Is it possible to explain the experiment and its goal more generally? I think that for a computer science journal it would be sufficient to state that the hypotheses can be justified; the detailed biological discussion could be moved to the appendix.
- Is this hypothesis only stated when using box embeddings or is it also an outcome of the other approaches? Are there other hypotheses that can be stated and that sound plausible? As this is only one example of a positive outcome, it does not have a high significance to show the overall validity of the approach.
## 4.3
- In Figure 7, is this box embedding dependent on the choice of the seed or is it a reproducible outcome?
## 4.4
- This evaluation needs a lot more detail (see the comments to 3.5)
Minor comments:
- footnote 1: paper title in quotation marks
- p.2: phenomena
- p.3: SGD needs a source
- p.4: "to find the distance from the subclass being completely contained within the superclass" -> strange wording
- (4) for 1 space missing