By Anonymous User
Review Details
Reviewer has chosen to be Anonymous
Overall Impression: Good
Content:
Technical Quality of the paper: Good
Originality of the paper: Yes
Adequacy of the bibliography: Yes
Presentation:
Adequacy of the abstract: Yes
Introduction: background and motivation: Good
Organization of the paper: Satisfactory
Level of English: Satisfactory
Overall presentation: Good
Detailed Comments:
In the revised manuscript, the authors addressed most of my major concerns. In particular, they split the presentation of experimental results into two independent threads, and dropped some fragments of text that felt repetitive or irrelevant. The revised version reads much better. However, I still noticed a number of minor mistakes and omissions that need to be addressed in order to meet scientific standards. I’m listing them below – there are quite a few of them, but they should be easy to address.
> prognostic health monitoring with vehicle failure prediction
in vehicle failure prediction?
> The proposed method is validated on the PETS2006 and the AVS2007 datasets
Citations should be provided for these datasets – they seem to be missing from the entire section 5.1.1
> The dataset comprises of 161 images,
comprises 161 images
> The current section
This section
> as depicted in Fig. 6:
Fig. 6 should be closer to this first reference to it.
> 1. The available set of labeled images is fed to
Numbered lists are usually reserved for relatively short statements. This enumeration spans three pages, so it would be more natural to either split the text into subsubsections, or mark the items otherwise (e.g. by starting each of them with bold text ‘Stage 1’, ‘Stage 2’, etc.)
> uses them to derive meaningful attributes (or instances)
I guess this ‘or’ alternative alludes to the other cited paper, which used the term ‘instance’. Either way, it would be better to stick to a single term throughout the entire paper.
Fig. 5: The preferred placement of floats is at the top of the page.
> Next, the decision rules exploited by the rule-based classifier are defined.
It seems worth starting a new stage (Stage 4) here.
> A ”none of known” class is also created for ambiguous images which do not satisfy either decision rule.
Repetition: this has been stated just a few lines of text earlier.
Table 2: keep the same precision for all numbers, e.g. 50.0
The dataset names in Fig. 7 are unnaturally bulky – worth shrinking them.
Table 3: Table formatting (gridlines) is completely different from Table 7; this should be made uniform throughout the entire paper.
Baseline comparison: So it seems that this rule-based baseline model has been built manually by the authors. That’s kind of OK, though why not use an existing decision rule/tree inducer (like C4.5 or C5.0) to induce such a rule, for the sake of greater objectivity? Also, it would be good to show that rule (judging from the context, there’s only one rule here).
p. 17
> the amount of time they encompass
the timespan they cover
Tables 4 and 5: Table captions should be placed at the top of the table.
> In the data, each vehicle is described by 8 specifications.
Feels like a term like ‘feature’ or ‘attribute’ would be more fitting here than ‘specification’.
> Thus, as in the previous use case, the symbolic component deals with two classes: healthy (negative class, as in ”vehicles not presenting failures”) and failing vehicles (positive class).
But Tables 4 and 5 show two alternative labellings for this dataset, respectively with two and five decision classes. So which one is ultimately used in the paper? If only one of them, then the other one should be completely discarded from the text. If both of them, then they should be clearly named (e.g. ‘binary classification’ and ‘fine-grained classification’) and referred to using these names.
> A few examples of these attributes can be seen in the first column of 6.
… of Table 6
> Subsequently (second block in Fig 3,
Subsequently (the second block in Fig 3),
> The neural component uses an LSTM neural network
Citation needed
> Within the neural component, an LSTM-autoencoder is trained to reconstruct sequences of time series data for healthy vehicles.
Many details are missing here: how was the history of trucks presented to the network, i.e. tokenized? What was the dimensionality of those tokens? What was the working dimensionality of the LSTM network? How was the model trained? I’m not expecting all the details necessary to reproduce these experiments, but at least the main details should be given, to improve the credibility of the authors’ argument.
The curves in Fig. 8 are hard to distinguish, consider enlarging these plots.
> The missing data in the training set are handled by performing forward filling,
Please briefly explain what you mean by that – I assume it is repetition of the last known value, but that should be stated explicitly.
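For reference, if “forward filling” indeed means carrying the last observed value forward, a minimal sketch of what I have in mind is below (the function name is illustrative, with `None` marking missing entries):

```python
def forward_fill(values, placeholder=None):
    """Replace each missing entry (None) with the last known value.

    Leading missing entries, which have no preceding observation,
    are set to `placeholder`.
    """
    filled, last = [], placeholder
    for v in values:
        if v is None:
            filled.append(last)   # carry the last observation forward
        else:
            filled.append(v)
            last = v              # remember the new last known value
    return filled

print(forward_fill([1.0, None, None, 2.5, None]))  # → [1.0, 1.0, 1.0, 2.5, 2.5]
```

If this matches the authors’ procedure, a single sentence to that effect in the manuscript would suffice; note that the handling of leading missing values should also be specified.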
> All evaluated models exhibit a very high accuracy (over 95is highly imbalanced, a high accuracy is not very indicative.
There’s something wrong with this sentence.
> The time windows corresponding to the class labels defined in 5 are also shown.
… in Table 5