Mitigating Model Collapse in Recursive Neurosymbolic Agents: The SONAR Benchmark for Semantic Plasticity

Tracking #: 919-1938

Flag: Review Assignment Stage

Authors: 

Andrew Greene

Responsible editor: 

Filip Ilievski

Submission Type: 

Article in Special Issue (note in cover letter)

Full PDF Version: 

Supplementary Files: 

Cover Letter: 

Dear Special Call Editors,

I am pleased to submit the manuscript “Mitigating Model Collapse in Recursive Neurosymbolic Agents: The SONAR Benchmark for Semantic Plasticity” for consideration under the Special Call for Neurosymbolic Benchmark Papers. This submission presents SONAR, a publicly released neurosymbolic benchmark designed to evaluate semantic plasticity and metric reliability in recursive agentic systems. Unlike performance-oriented benchmarks, SONAR is intentionally constructed as diagnostic infrastructure, targeting a failure mode that remains under-evaluated in NeSy systems: runtime semantic collapse under recursive self-interaction.

The benchmark contributes the following:

  • A formally specified recursive task designed to stress long-horizon grounding, belief revision, and relational consistency.
  • A hybrid evaluation framework combining embedding-based divergence with symbolic stasis indicators, exposing a documented “Metrology Gap” between surface-level semantic change and grounded reasoning.
  • A fully open dataset and reference implementation, archived on Zenodo with versioning to enable replication, re-annotation, and alternative metric substitution.
  • Illustrative baseline evaluations, including an ablated control and a regulated variant, demonstrating how common metrics fail to distinguish Artifactual Divergence from Structural Divergence, and identifying a distinct “Sawtooth” stability signature in regulated loops.

Importantly, the benchmark is domain-agnostic and can be instantiated across political, legal, scientific, or planning tasks without structural modification. The included political strategy task is used solely as an open-world stressor to prevent closed-world shortcutting. All materials follow open-access benchmark guidelines, including stable archival, documented creation methodology, licensing, and reproducibility notes. The manuscript is explicitly submitted as a Benchmark paper in accordance with the submission instructions.
I confirm that this work is original and not under consideration elsewhere. I welcome open review and community engagement around the benchmark’s future evolution. Thank you for your consideration.

Warm regards,

Andrew Greene
Founder & Director | Ontological Engineering Pty Ltd
Structured Semantic Infrastructure & Protocol Design
Perth, Western Australia
andrew.greene@ontologicalengineering.com.au
0408 622 558

Tags: 

  • Under Review