By Jacopo de Berardinis
Review Details
Reviewer has chosen not to be Anonymous
Overall Impression: Good
Content:
Technical Quality of the paper: Average
Originality of the paper: Yes, but limited
Adequacy of the bibliography: Yes, but see detailed comments
Presentation:
Adequacy of the abstract: Yes
Introduction: background and motivation: Bad
Organization of the paper: Needs improvement
Level of English: Satisfactory
Overall presentation: Average
Detailed Comments:
This article contributes new design patterns for LLM-based Neuro-Symbolic systems by extending the Boxology approach. After conceptualising the basic patterns for current Transformer architectures and extending them within the context of language models (prompting, instructions, etc.), the authors taxonomise LLM-based NeSy models into three categories: KG-enhanced LLMs, LLM-augmented KGs, and synergized LLMs and KGs; and propose a referential pattern for each of them, making a distinction between the training and the inference stages. The article concludes by presenting seven applications of the proposed patterns to specific NeSy (and non-NeSy) systems. Overall, this work contributes an original approach to document, analyse, and compare NeSy models and systems, and appears to provide a flexible paradigm for conceptualising current (and potentially future) methods, as demonstrated in the use cases section. Nevertheless, while the core contribution of this work is sound, I believe that this manuscript requires substantial work to improve the introduction and the motivations and, most importantly, to explain the authors' work at a reasonable level of detail and demonstrate practical applications beyond showing its expressiveness in representing current LLM-based NeSy models and systems.
Strengths
---------
- The approach is sound and intuitive, with diagrams supporting the understanding of the various patterns. The proposition is well-scoped and contextualised (especially 3.2.1, which describes prompts and instructions), and the authors demonstrate the expressiveness of their paradigm in representing various models at different stages and for different processes.
- The continuity with the ongoing Boxology research efforts adds value to the proposal, and shows that there is an overall vision behind this contribution.
- The article is well aligned with current GenAI + NeSy approaches, and the contributions mentioned are all relevant. However, as described below, a more general introduction of Symbolic + NeSy models and systems to highlight their nature and their strengths (even before the inception of LLMs) would have helped to better contextualise this work.
Weaknesses
----------
- Introduction. The first paragraph of the introduction sounds a bit high-level. Overall, it needs to be significantly expanded and improved. Expressions like "OpenAI’s ChatGPT system has changed world of text generation forever" are also a bit informal. Also, "neuro-symbolic approaches" have been around for a while, and the reader may get the impression that they were introduced as a response to LLMs' weaknesses. Moreover, the intended meaning of trustworthiness should be made clearer, as some definitions of trustworthiness already include explainability as a requirement (see, for example, https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trust...). Also, I would have appreciated a brief intro on NeSy models (how they are designed, examples of how inference is performed, their key differences from purely statistical approaches, etc.) and systems before starting to introduce Boxology. Finally, when outlining the main contribution of the article, I could not grasp how the proposed approach actually contributes to addressing the challenge of making models/systems more explainable, etc.
- Related work. This section jumps directly to Transformers, as the key technology most current LLMs use. I would recommend elaborating on this from a more foundational perspective. At their core, current LLMs are autoregressive sequence models, which can be implemented with a self-attention network (SAN), where Transformers are a specific type of SAN. Are these patterns applicable only to Transformer-based models? Also, given the focus on LLMs, it would be interesting to expand the current formulation with a Mixture of Experts (MoE) approach.
- Symbolic AI is not the only way to address the "side effects" of LLMs, and there has been substantial research in NLP on making these models explainable, safer, and more factual. I would recommend mentioning some of these efforts, while making clear distinctions with NeSy approaches, as mentioned before.
- Section 2.2 (Processes). I struggled to understand the difference between generation, transformation, and inference, as I guess they could all be seen as instances of transformations? For example, a generation can be seen as a transformation of an input vector (a seeding/priming token, input noise, etc.), so there is often some form of data as input to a generation process. The same applies to inference.
- Section 3. I was expecting more of a step-by-step explanation of each pattern, going through each component (at least for the first patterns introduced), so that the reader can appreciate the overall formulation/definition and easily understand the logic and intuition behind each pattern. Instead, the description of these patterns appeared very fragmented and dispersive to me, often assuming technical knowledge of previous works.
Also, a lot of relevant models from the literature are briefly presented (for example, in Section 3.4.3), but the little information given makes it difficult to understand how they specifically relate to the patterns. In sum, I would recommend focusing this part on explaining how each pattern generalises and applies to each of the models/systems presented. For example, "BERT-MK has a similar dual encoder, but adds additional information from the neighbouring entities ..."; how is this captured by the pattern you are presenting?
- Figure 3B. Why is the classification head on top of the encoder producing a "symbol" as output of the "infer:deduce" process? From my understanding, if the model is trained for classification, what the model is learning is a probability distribution; so, at inference time, you would still have vector "data" (e.g. the logits) from which a "symbol" can be sampled. I believe more information is needed to clarify this.
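The distinction raised above can be sketched as follows (an illustrative example, not taken from the paper; the labels and logit values are made up): the classification head emits vector "data", and the "symbol" only appears once a label is argmax-ed or sampled from the induced distribution.

```python
# Sketch of the data/symbol distinction: a classification head yields
# logits (vector "data"); the "symbol" is a decision over the
# distribution they induce. Labels and values are illustrative only.
import math
import random

labels = ["entailment", "neutral", "contradiction"]
logits = [2.0, 0.5, -1.0]  # vector output of the classification head

# Softmax turns the logits into a probability distribution.
exps = [math.exp(z) for z in logits]
probs = [e / sum(exps) for e in exps]

# The symbolic output is a decision over that distribution:
argmax_label = labels[max(range(len(probs)), key=probs.__getitem__)]
sampled_label = random.choices(labels, weights=probs, k=1)[0]
```

On this reading, the "infer:deduce" box would output `probs` (data), and a separate selection step would produce the symbol, which is the clarification requested above.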
- Section 4. To fully appreciate the expressiveness of the patterns in the use cases, more information on the NeSy systems is needed. This is the case for RAG. Instead, when more information about the system is given, as for "KnowBERT", the explanation of how the architecture relates to the pattern is a bit succinct ("The Boxology pattern for KnowBERT is depicted in Figure 13"). Ideally, each subsection should have both: a reasonable primer on the system (as done for KnowBERT), and a step-by-step explanation of the corresponding pattern and of how it is expected to generalise to other/similar systems within the same group. Section 4.6 is a good example of this.
- Section 4. ChatGPT. While there is published material and resources on the GPT family of models (as the authors rightly mention in the article), to the best of my knowledge, we do not have a fully transparent and comprehensive understanding of the (overall) ChatGPT system (which is indeed more than a model per se). Therefore, I would have chosen an open LLM-based system for this section, such as Llama 3. Also, the opening of Section 4 mentions that the focus of this part is on LLM-based NeSy systems, and finding ChatGPT as the first use case was a bit counterintuitive to me.
- The transition from section 4 to the conclusions felt a bit sharp, and I would recommend adding a discussion section to cover some questions that may arise at that stage. For example, what can be said about the limitations of these patterns in terms of what they are capable of capturing (or not)? What is an intended use case in which documenting a NeSy system brings value for transparency, etc., as originally stated in the abstract and in the introduction (the motivations behind the paper)? What level of granularity does this approach offer, and in which cases may it not be suitable? Most importantly, after reading the article, I was still wondering what the application of these patterns actually enables; clarifying or restating this is important to link back to the motivations outlined in the introduction.
- I got the impression that the current conclusions are a bit rushed: they do not mention the original motivations behind this work, the connection with Boxology and the extension to represent LLM-based NeSy systems, or how the description of such models and systems enables new avenues and opportunities to address the challenges presented by the authors in the introduction. Overall, the current conclusions diminish the potential of this work, and I strongly encourage the authors to expand them accordingly, shedding more light on their contribution.
Questions
---------
- How would this effort relate to existing ML documentation efforts, such as Machine Learning Model Cards? (see, for example, https://modelcards.withgoogle.com/about)
- The current positioning of the article is on LLM-based systems, but I got the impression that the proposed approach could easily scale and be adapted (with little effort) to other generative systems. Given its potential, and the expressiveness of the approach, I would suggest elaborating on this in a discussion (or conclusions) section.
- Page 3, Line 46. What is the systematic study of ~500 papers for? I understand the authors are trying to make an argument on the flexibility and the maturity of the Boxology approach; however, it would be interesting to have more details here. Also, after Page 3, Line 47, the narrative diverges a bit, and it becomes hard to see which points the authors are trying to make in relation to the work presented in this article; without more context, the reader may get the impression that all this work is not relevant. Other remarks, instead, sound a bit speculative at this stage (such as "This framework and implementation could be used in the implementation of the design patterns").
- Page 3, Line 48. What is EASY-AI?
- Page. Can Boxology be seen as a formal language to describe ML models or is it mainly a visual paradigm?
- Section 3.1. Could these patterns be used to describe Multimodal LLMs?
- Page 5, Line 7. I would recommend referring to classification models/heads rather than "classification systems", which may suggest a more complex process defined on top of a predictive model.
- Sections 3.4.1. and 3.4.2. What is the intended meaning of "KG model"? I had the impression that it is sometimes used to refer to a KG (3.4.1, Figure 4) but also to a model for KG embeddings (3.4.2.). I would recommend introducing this beforehand in order to avoid confusion!
- Page 9, Line 51. "The selected papers are chosen ... but also to act as a fluent language interface or a formal language interface". This sentence is not clear to me, and I would appreciate additional context to explain this part.
Minor comments
--------------
- Page 3, line 3. Missing citation?
- Page 3, Line 4. Which authors?
- Page 3, Line 30. Broken citation.
- Page 3. Machine Learning is used earlier, but the acronym is only introduced on page 4. Also, ML is never used in the paper. Similarly, the "Hets" acronym is introduced in line 44, page 4, but never used.
- Page 5, Line 19 "classical machine learning systems". Do you mean support vector machines, decision trees / random forests?
- Page 5, Line 25. Vision transformers: please provide some references and explain how this would translate to such architectures.
- While readable, figures' resolution could be improved, and their captions, especially for Figures 1-3, could be expanded to provide more information on how to read them (even if this may sound a bit redundant with the sections).
- The use of sub-labelling for images is not consistent. Sometimes, a sub-label is capitalised (Figure 3A), other times it is not (Figure 1a).
- Page 6 Line 44. RAG is mentioned, but not introduced to the reader.
- Page 8 Line 28. "LLM-based NeSy Design Patterns in Application". Do the authors refer to inference?
- Page 8 Line 48, can *be* used