Machine Learning with Requirements: A Manifesto

Tracking #: 665-1645

Flag : Review Received

Authors:

Eleonora Giunchiglia

Fergus Imrie

Mihaela van der Schaar

Thomas Lukasiewicz

Responsible editor:

Annette ten Teije

Submission Type:

Other (note in cover letter)

Full PDF Version:

nai-paper-665.pdf

Cover Letter:

Position paper for inaugural issue

Approve Decision:

Approved

Revised Version:

Machine Learning with Requirements: A Manifesto

Tags:

Reviewed

Decision:
Minor Revision

Solicited Reviews:

Review #1 submitted on 15/Dec/2023

By Anonymous User
Review Details

Reviewer has chosen to be Anonymous

Overall Impression: Good

Content:
Technical Quality of the paper: Good
Originality of the paper: Yes
Adequacy of the bibliography: Yes

Presentation:
Adequacy of the abstract: Yes
Introduction: background and motivation: Good
Organization of the paper: Satisfactory
Level of English: Satisfactory
Overall presentation: Excellent

Detailed Comments:

This paper argues for the need for requirements for machine learning (ML) systems (based on two concrete examples from the domains of autonomous driving and healthcare) and subsequently proposes a pyramid model of how such requirements can be included in standard machine learning pipelines. The paper is visionary in proposing a paradigm shift from "performance-driven" to "requirements-driven" ML. As such, the paper discusses a very timely topic given the large-scale uptake of ML systems in virtually all domains. The paper is well-argued, well-written and thoroughly documented.

While I enjoyed reading this paper and found it very interesting, I found it difficult to understand how it is relevant to the neurosymbolic AI field. I assume the relevance lies in the fact that the requirements would be represented in some symbolic approach; however, this is just my interpretation, which might be incorrect. Therefore, the paper should be revised to explicitly highlight the relation and relevance of the discussed topic to the neurosymbolic AI field. This should be done both in the abstract/introduction and in the main body of the paper (e.g., by giving some examples of the "logical constraints" mentioned). Making this relation to neurosymbolic AI more prominent at the beginning of the paper is especially important given that this paper will appear in an inaugural issue of the NAI journal.

Some smaller comments are:
* p4, l18-19: reduce repetition of "crushed"/"even" in the same sentence;
* p6, l35: "possible" => I think this should be "impossible";
* p7, l35: "fasten" => "increase its speed"?

Review #2 submitted on 08/Jan/2024

By Anonymous User
Review Details

Reviewer has chosen to be Anonymous

Overall Impression: Good

Content:
Technical Quality of the paper: Good
Originality of the paper: Yes
Adequacy of the bibliography: No

Detailed Comments:

Pros: The paper is well motivated and touches upon an important topic: that of regulating machine learning models. The paper is also very nicely structured.

Cons: I feel that the authors do not give enough information on the attempts that have already been made to integrate background knowledge into neural models at testing or training time. The paper would benefit a lot from such a discussion, especially if the authors stress the limitations of the current neurosymbolic literature w.r.t. the model development pyramid proposed in their manifesto. Below, I provide a few such references:

• Aaron M. Ferber, Bryan Wilder, Bistra Dilkina, and Milind Tambe. MIPaaL: Mixed integer program as a layer. CoRR, abs/1907.05912, 2019.
• Marin Vlastelica, Anselm Paulus, Vít Musil, Georg Martius, and Michal Rolínek. Differentiation of blackbox combinatorial solvers. CoRR, abs/1912.02175, 2019.
• Anselm Paulus, Michal Rolínek, Vít Musil, Brandon Amos, and Georg Martius. CombOptNet: Fit the right NP-hard problem by learning integer programming constraints, 2021.
• Jonathan Feldstein, Modestas Jurcius, and Efthymia Tsamoura. "Parallel Neurosymbolic Integration with Concordia". In: Proceedings of the International Conference on Machine Learning (ICML), Honolulu, Hawaii, USA, 23-29 July, pp. 9870–9885, 2023.

Another line of research that should be mentioned is the one that studies the learnability of neurosymbolic frameworks:

• Kaifu Wang, Efthymia Tsamoura, and Dan Roth. "On Learning Latent Models with Multi-Instance Weak Supervision". In: Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS), 2023.
• Emanuele Marconato, Stefano Teso, Antonio Vergari, and Andrea Passerini. "Not All Neuro-Symbolic Concepts Are Created Equal: Analysis and Mitigation of Reasoning Shortcuts". In: Proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS), 2023.

This line of research is relevant to your work: if we cannot formally ensure that the models we obtain using soft or hard constraints (e.g., the ones specified by the domain experts) are good enough, then why should we deal with constraints at all? This is an open problem in neurosymbolic AI that should be stressed in your paper, perhaps as an afterthought.

Overall, I believe that the work is nice. It needs, however, a better positioning w.r.t. the current literature, especially a discussion of whether the current literature is good enough to meet your development model (Fig. 4) and, if not, what future research should focus on.

Review #3 submitted on 04/Oct/2023

By Floris van der Hengst
Review Details

Reviewer has chosen not to be Anonymous

Overall Impression: Weak

Content:
Technical Quality of the paper: Weak
Originality of the paper: No
Adequacy of the bibliography: Yes, but see detailed comments

Presentation:
Adequacy of the abstract: Yes
Introduction: background and motivation: Limited
Organization of the paper: Satisfactory
Level of English: Unsatisfactory
Overall presentation: Weak

Detailed Comments:

# Summary of paper in a few sentences.
This paper argues for the use of requirements throughout the machine learning model life cycle based on an analysis of two important and safety-critical application domains with requirements. A cyclical development model for machine learning systems is proposed in which requirements can be updated and used in each of its five stages. Within this model, neuro-symbolic approaches are identified as a key solution within the model creation and model training stages.

# Reasons to accept
The inclusion of (safety) requirements is an important challenge for machine learning practitioners and researchers. It makes sense to consider neuro-symbolic approaches for this challenge.

# Reasons to reject
The paper looks like a position paper and these may not be in scope for this journal.
The main contribution of the paper seems to be the pyramid model in Figure 4, but this model is very general, its connection to neuro-symbolic AI is poorly argued and it lacks novelty (see other remarks).

# Criteria
## Significance
This paper argues that requirements need to be incorporated into the entire machine learning development pipeline. Because some requirements can be expressed with knowledge bases / symbolic AI, the significance to NAIJ is reasonable.

## Background
This work cites an appropriate number of works in which some neuro-symbolic approach is used to include requirements in the model creation and model training stages (using terminology from the paper). However, the main contribution of this paper is the pyramid model for the machine learning (system) development process. This calls for an embedding of this novel model into existing models of this process. Many reference models are cyclical rather than linear, contrary to what this paper suggests in, e.g., Figure 1, and many of them explicitly mention the role of requirements throughout the process.
- Martínez-Plumed, Fernando, et al. "CRISP-DM twenty years later: From data mining processes to data science trajectories." IEEE Transactions on Knowledge and Data Engineering 33.8 (2019): 3048-3061.
- Ashmore, Rob, Radu Calinescu, and Colin Paterson. "Assuring the machine learning lifecycle: Desiderata, methods, and challenges." ACM Computing Surveys (CSUR) 54.5 (2021): 1-39.
- Haakman, Mark, et al. "AI lifecycle models need to be revised: An exploratory study in Fintech." Empirical Software Engineering 26 (2021): 1-29.

The paper would also benefit from a discussion of *how* requirements can effectively be modeled (the leftmost arrow in the pyramid model in Figure 4) and an analysis of how this impacts the right-hand side of the model. This may also strengthen the link with neuro-symbolic AI.

## Novelty
This paper brings few novel insights or viewpoints. The paper mentions the cyclical nature of the pyramid model and the role of requirements in it as innovations, but both are established (see earlier remarks). What is left is (i) the analysis of two domains and (ii) the link between requirements in ML and neuro-symbolic AI: (i) the two analyses are fully based on existing work and (ii) the link between requirements in ML and neuro-symbolic AI is an established one, as evidenced by the examples cited in the paper.

## Technical quality
The paper is neither a survey nor a research paper to me, making technical quality hard to assess. However, there is room for improvement in the argumentation in this work. To name some issues:
- in the introduction it is claimed that requirements in any application domain can be obtained from an existing body of knowledge in this application domain and that there will be a continued push for adoption of systems even when these may have unintended consequences (ln15-18). Claims like these need to be supported.
- there are many mentions of 'facts' and 'obvious' things which are not supported by evidence.
- the presented pyramid model is contrasted with a "traditional 'performance-driven' [..] pipeline". Where is this traditional pipeline suggested (see note on background)? Furthermore, the idea that predictive performance is the only metric to optimize for "traditionally" is immediately contradicted in the work itself, i.e. by mentioning fairness, robustness, explainability, etc. So the current model is already in use; what novelty remains?
- p4 ln41: "threshold is often picked equal to 0.5": this threshold is usually set based on specifics of the problem (misclassification costs, class imbalance, etc.).
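
To illustrate the last point, here is a minimal sketch of one standard way a decision threshold follows from misclassification costs. It is not taken from the paper under review. It assumes a binary classifier with calibrated probabilities and zero cost for correct decisions; the function names and cost values are hypothetical.

```python
def cost_sensitive_threshold(cost_fp: float, cost_fn: float) -> float:
    """Decision threshold minimizing expected cost (assumes calibrated probabilities).

    Predicting positive costs (1 - p) * cost_fp in expectation, and predicting
    negative costs p * cost_fn, so positive is preferred when
    p >= cost_fp / (cost_fp + cost_fn).
    """
    if cost_fp <= 0 or cost_fn <= 0:
        raise ValueError("costs must be strictly positive")
    return cost_fp / (cost_fp + cost_fn)


def classify(prob_positive: float, threshold: float) -> bool:
    """Return True (positive class) when the probability reaches the threshold."""
    return prob_positive >= threshold


# Hypothetical costs: equal costs recover the 0.5 default, whereas a false
# negative that is nine times as costly as a false positive lowers the threshold.
equal_costs = cost_sensitive_threshold(cost_fp=1.0, cost_fn=1.0)
costly_misses = cost_sensitive_threshold(cost_fp=1.0, cost_fn=9.0)
```

Under these assumptions, 0.5 is the right threshold only when both error types cost the same. Class imbalance or uncalibrated scores would call for further adjustment.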

## Presentation
The organisation of the text is good, the figures are clear and support the story.

The text could benefit from some additional language editing. Some sentences are too long and vague.
Sentences that require some attention:
- p2 ln 3: "the reasons" what reasons/reasons for what?
- p2 ln 11: "positive performance" high performance
- p2 ln 13: "spelled requirements" requirements
- p2 ln 14: "Though" Although
- p2 ln 33: "pros of [..] requirement" benefits of [...] requirements
- p6 ln 34: lower control
- p7: "requirement over" requirement on

## Length
Some repetitions could be eliminated to shorten the paper.

## Data availability
N/A
