By Connor Pryor
Review Details
Reviewer has chosen not to be Anonymous
Overall Impression: Average
Content:
Technical Quality of the paper: Average
Originality of the paper: Yes, but limited
Adequacy of the bibliography: Yes, but see detailed comments
Presentation:
Adequacy of the abstract: No
Introduction: background and motivation: Limited
Organization of the paper: Satisfactory
Level of English: Satisfactory
Overall presentation: Good
Detailed Comments:
Summary:
The authors present Neural Markov Prolog (NMP), a novel language combining Markov logic with Prolog syntax/semantics to define neural network architectures. This work extends the ideas in [1], which demonstrated how infinite tree-structured PGMs can correspond directly to neural networks. Specifically, the authors leverage the equivalence between the derivatives of the marginal log-likelihood and the derivatives of the cross-entropy loss of a neural network with sigmoid outputs. While [1] primarily focused on this theoretical equivalence, this paper introduces a concrete language (NMP) for defining such neural network architectures and provides examples of its application.
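As I understand it, the underlying identity is the standard one for a sigmoid output p = \sigma(z) with a binary target y (my paraphrase, not necessarily the paper's notation):

    \frac{\partial}{\partial z} \log p(y \mid z) \;=\; y - \sigma(z) \;=\; -\frac{\partial}{\partial z}\,\mathrm{CE}\big(y, \sigma(z)\big),
    \qquad \mathrm{CE}(y, p) = -\big[\, y \log p + (1 - y)\log(1 - p) \,\big],

so gradients of the marginal log-likelihood coincide with the negated gradients of the cross-entropy loss.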
Comments and Questions:
1. Abstract Clarity:
============================================
1a. The abstract needs a clearer summary of the contributions. A significant portion of the paper (around one-third) is dedicated to formulating popular neural architectures within the NMP framework, which is not sufficiently emphasized in the abstract.
2. Related Work:
============================================
I believe the smaller bibliography is appropriate given the focus on constructing neural networks from programmatic symbolic structures, which is a less-explored area of NeSy AI. That said, the introduction briefly mentions other NeSy approaches without sufficiently comparing this work to related systems.
2a. How does NMP compare to some of the more popular NeSy approaches like Neural Machines, Delta ILP, LTNs, DeepProbLog, Semantic Loss, TensorLog, NeuPSL, etc.? Are there overlaps in semantics?
2b. Creating a separate related work section could provide more context to the reader and make it easier to relate the work to the broader NeSy field.
3. Clarifying Contributions:
============================================
The paper would benefit from a clearer outline of its contributions in the introduction.
3a. As in the abstract, is defining popular neural architectures a primary contribution, or is it secondary to introducing NMP's syntax and semantics? If it is not a contribution, why describe so many architectures?
3b. Additionally, clarify whether NMP introduces any novel inference or learning algorithms or if it focuses purely on creating a neural network with symbolic syntax.
3c. Is there a codebase for creating these neural models, i.e., for translating a Prolog program into a neural model or a neural network into logic? The introduction states: "Potential benefits include automatic translation of logic or text into neural networks, and of neural networks into logic or text." If no such codebase exists, this seems like a good direction for future work.
4. Example in Section 4:
============================================
4a. The example in Section 4 requires more detail or clarification:
- Is the grounded structure shown in Figure 1 the corresponding neural network?
- I may have missed this, but how do infinite weights propagate information in this setup? A brief explanation would help, and if this was described in [1], a bit of background would be nice.
- It’s unclear how Figure 1 supports inference for friend(anna, bob).
5. The Grounded Neural Model:
============================================
5a. I may have missed this, but I am somewhat confused about what Section 5 contributes to this paper. Is it showing how to translate a neural network into a Markov network structure? Or is this the grounded neural structure? Is this a contribution of this paper, or background from [1]?
6. Figure 3 Typo?:
============================================
6a. Should some of the grey nodes read input(2, ...) instead of all nodes reading input(1, ...)? Please double-check for accuracy.
7. Scalability of the Framework:
============================================
Section 7 does not discuss the scalability of expressing large neural architectures. While the fully connected network example (two layers, two units each) demonstrates the framework's basic functionality, it leaves questions about how the framework handles larger architectures.
For instance, if a 100-layer fully connected network were required, would each layer need to be explicitly defined, or does the framework provide a shorthand mechanism for specifying such structures? Addressing this point could reinforce the abstract's claim that NMP is a “flexible framework to elegantly express neural architectures.” If such scalability mechanisms exist, they should be explicitly acknowledged (e.g., “This is a simple example; more complex architectures can be defined using [specific method or feature]”). Alternatively, if scalability is a current limitation, the discussion should clearly state this.
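To illustrate what I mean by a shorthand, here is a hypothetical sketch in plain Prolog; this is not the paper's actual NMP syntax, and the predicates edge/2, unit_index/1, and input_index/1 are my own invention. One recursive rule stands in for the connectivity of every layer:

    % Hypothetical sketch only (plain Prolog, not necessarily valid NMP syntax).
    % A single recursive rule replaces 100 explicit per-layer definitions
    % for a fully connected stack.
    edge(unit(1, J), input(K))   :- unit_index(J), input_index(K).
    edge(unit(L, J), unit(P, K)) :- L > 1, P is L - 1, unit_index(J), unit_index(K).

    unit_index(1).  unit_index(2).      % two units per layer, as in the paper's example
    input_index(1). input_index(2).     % two input features

A query such as edge(unit(100, 1), Src) would then enumerate the incoming connections of a unit in layer 100 without any per-layer rules. If NMP supports something along these lines, a sentence saying so in Section 7 would strengthen the scalability claim.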
8. Overall:
============================================
8a. How does this language handle cyclic dependencies? For instance, if there are two rules, smokes(X) :- cancer(X) and cancer(X) :- smokes(X), how would the corresponding neural structure be generated? This is just an illustrative example and is not meant to be a meaningful program.
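To make the question concrete, here is the cycle written out in plain Prolog (I am not assuming this is valid NMP syntax):

    % Illustrative cycle only; not a meaningful program.
    smokes(X) :- cancer(X).
    cancer(X) :- smokes(X).

Grounding this for even a single constant would seem to yield an unbounded dependency chain, smokes(anna) <- cancer(anna) <- smokes(anna) <- ..., and it is not obvious from the paper how the corresponding network structure would be constructed or truncated in such a case.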
[1] On Neural Networks as Infinite Tree-Structured Probabilistic Graphical Models.