By Luca Andolfi
Review Details
Reviewer has chosen not to be Anonymous
Overall Impression: Average
Content:
Technical Quality of the paper: Average
Originality of the paper: Yes, but limited
Adequacy of the bibliography: Yes
Presentation:
Adequacy of the abstract: Yes
Introduction: background and motivation: Good
Organization of the paper: Satisfactory
Level of English: Satisfactory
Overall presentation: Good
Detailed Comments:
The article proposes a semi-automated pipeline for extracting driving rules from handbooks.
The goal is to identify and encode in a formal specification the common sense rules (not solely regarding safety) that human beings effectively apply when driving as a result of their abstract reasoning ability. The resulting formalization can then be used by autonomous driving systems to enhance their reliability and overall robustness to unexplained anomalies.
I have organized my review into sections a) to e): Merits, Issues, Questions, Suggestions and Conclusion.
a) MERITS.
I believe the article has the following merits:
1. It considers common sense rules in autonomous driving as a fundamental element
to integrate into the development of autonomous vehicles (AVs): leveraging these rules
has several benefits, from adapting AVs to operate in different regions with distinct
driving patterns, to enhancing the ability to detect system anomalies and increasing
the overall trustworthiness of AVs;
2. The proposed extraction framework expresses driving rules formally
and allows for inference, portability, and adaptability.
b) ISSUES.
On the other hand, these are in my opinion the most evident issues:
1. The accuracy of the overall approach is not satisfactory. Indeed, there are
at least two sources of error to consider: the parser and the rule construction.
As far as the parser is concerned, its limitations are evident from the examples
shown in Sections 7 and 8.1. Regarding the rule construction, errors in this
phase are even more delicate to spot and correct, as shown for the example
"if you are in an intersection when you see an emergency vehicle, continue through
the intersection". Here the authors had to look back at the original sentence to
recognize the error (a missing conjunction), because the rule is semantically correct
per se.
2. The parsing method is handbook-specific: in particular, Section 8.2 mentions
that the California driving manual is particularly well-suited for the proposed
extraction method, whereas several issues arise with other handbooks.
3. A lot of manual work is required: Table 3 mentions that 539 of the 708 rules
were manually refined. Even though, as stated at the end of Section 7.1, the
authors neither read the manuals nor created the rules **completely** manually,
the number of rules requiring refinement is still high. Moreover, if I understand
correctly, the classification of the rules is also manual (this is mentioned at
the end of Section 8.3).
4. Section 6.3.1 states that triples have the shape (subject, verb, object). However,
in the following examples we see triples such as (continue, through, intersection)
and (dangerous condition, at, rail), where "through" and "at" are prepositions and
not verbs. This is a conceptual inconsistency that needs to be addressed,
otherwise the semantics of the rules is undermined.
I understand that the goal of the article is to give a proof of concept regarding the extraction of the rules, but issues 1-4 still seem to significantly limit its applicability. I would like to hear from the authors about them.
c) QUESTIONS.
At the end of Section 8.4, I see that there are 416 rules in the so-called "Easy" category.
How many of these rules needed refinement before being used?
d) SUGGESTIONS.
Here are some suggestions with respect to my comments in b).
1. Parsing: I believe the approach may benefit from the (moderate) introduction of
machine learning solutions; moderate means that their use should stay within the
bounds of explainability.
Rule extraction: I would like to suggest a controlled use of LLMs.
For example, I think it would be possible to:
- compute the rule R1 as in the article,
- instruct the LLM via a system prompt to use only words from the
parsed sentence in its answer and have it compute a rule R2,
- compare R1 and R2 against the extracted sentence as the ground truth (for
instance by counting how many entities of the ground truth each of them matches).
This could help identify errors such as the one in "if you are in an
intersection when you see an emergency vehicle..." where "emergency" was not
mentioned in R1; a minimal sketch of this comparison is given after this list.
2. See 1.
3. For the classification of rules, one could integrate some heuristics
(for example, rules sharing the same words are likely related) to reduce the
supervision effort; a sketch of such a heuristic is also given after this list.
4. I was expecting to see something like this (a possible normalization is sketched after this list):
(continue, through, intersection) -> (self, continue through, intersection)
(dangerous condition, at, rail) -> (dangerous condition, detected at, rail).
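
To make the comparison proposed in suggestion 1 concrete, here is a minimal sketch; the entity extraction, the stop-word list, and the term sets for R1 and R2 are my own illustrative assumptions, not the authors' pipeline or any specific LLM output.

```python
# Minimal sketch of the R1/R2 comparison suggested in point 1.
# extract_entities, the stop-word list, and the R1/R2 term sets are
# illustrative assumptions, not the authors' actual rule representation.

def extract_entities(text):
    """Rough entity extraction: content words of the sentence."""
    stopwords = {"if", "you", "are", "in", "an", "a", "the", "when", "see"}
    words = (w.strip(".,") for w in text.lower().split())
    return {w for w in words if w and w not in stopwords}

def coverage(rule_terms, ground_truth):
    """Fraction of ground-truth entities mentioned by the rule."""
    return len(rule_terms & ground_truth) / len(ground_truth) if ground_truth else 0.0

sentence = ("if you are in an intersection when you see an emergency vehicle, "
            "continue through the intersection")
truth = extract_entities(sentence)

r1_terms = {"intersection", "vehicle", "continue"}               # rule from the pipeline
r2_terms = {"emergency", "vehicle", "intersection", "continue"}  # LLM-constrained rule

# If the word-constrained LLM rule covers more ground-truth entities than R1,
# flag the pipeline rule for manual inspection.
if coverage(r2_terms, truth) > coverage(r1_terms, truth):
    print("R2 covers more of the sentence than R1: flag the rule for review")
```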
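Regarding suggestion 3, this is a minimal sketch of a word-overlap heuristic for pre-grouping rules before manual classification; the example rules, the tokenization, and the 0.2 threshold are assumptions of mine, not values from the paper.

```python
# Sketch of a word-overlap (Jaccard) heuristic for proposing groups of
# related rules; example rules and the threshold are illustrative only.

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

rules = [
    "stop before the railroad crossing when a train is approaching",
    "a dangerous condition is detected at the railroad crossing",
    "continue through the intersection when an emergency vehicle approaches",
]
tokens = [set(r.lower().split()) for r in rules]

# Only pairs above the threshold are proposed to the human annotator,
# reducing the number of pairs that must be inspected manually.
for i in range(len(rules)):
    for j in range(i + 1, len(rules)):
        if jaccard(tokens[i], tokens[j]) > 0.2:
            print(f"rules {i} and {j} are likely related")
```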
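Regarding suggestion 4, one possible way to normalize preposition-headed triples into the expected (subject, verb phrase, object) form is sketched below; the preposition list, the default subject/verb, and the looks_like_verb placeholder are hypothetical and would in practice rely on the parser's part-of-speech tags.

```python
# Sketch of normalizing triples whose "verb" slot holds a preposition.
# The defaults and the POS check are placeholders, not the authors' method.

PREPOSITIONS = {"through", "at", "in", "on", "to", "before"}

def looks_like_verb(word):
    # Placeholder: in practice, query the parser's part-of-speech tags.
    return word in {"continue", "stop", "yield", "slow", "proceed"}

def normalize(triple, default_subject="self", default_verb="detected"):
    subj, verb, obj = triple
    if verb in PREPOSITIONS:
        if looks_like_verb(subj):
            # The subject slot actually holds the verb, e.g. "continue".
            return (default_subject, f"{subj} {verb}", obj)
        # No verb at all: insert a default one, e.g. "detected".
        return (subj, f"{default_verb} {verb}", obj)
    return triple

print(normalize(("continue", "through", "intersection")))
# ('self', 'continue through', 'intersection')
print(normalize(("dangerous condition", "at", "rail")))
# ('dangerous condition', 'detected at', 'rail')
```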
e) CONCLUSION.
I think the article is suitable for acceptance, because the merits highlighted in a) remain valuable
and the approach (with its limitations) succeeds in showing this.
However, it would strengthen the paper to include some improvements with respect to the concerns in b),
at least through additional proofs of concept.