Formal Languages and Automata Theory | Cool Papers

#1 Active learning of upward-closed sets of words

We give a new proof of a result from well quasi-order theory on the computability of bases for upward-closed sets of words. This new proof is based on Angluin's $L^*$ algorithm, which learns an automaton from a minimally adequate teacher. In particular, this relates two results from the 1980s: Angluin's $L^*$ algorithm, and a result from Valk and Jantzen on the computability of bases for upward-closed sets of tuples of integers. Along the way, we describe an algorithm for learning quasi-ordered automata from a minimally adequate teacher, and extend a generalization of Valk and Jantzen's result, encompassing both words and integers, to finitely generated monoids.
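
To make the notion of a basis concrete, here is a minimal Python sketch. It assumes the scattered-subword ordering on words and the componentwise ordering on integer tuples; the abstract does not spell out the orderings, so both are assumptions, as are all function names. The sketch only shows what a finite basis is and how membership in an upward-closed set reduces to comparisons against it. It does not attempt the $L^*$-based computation the abstract describes.

```python
def is_subword(u, w):
    """True if u is a scattered subword of w (u is obtained by deleting letters of w)."""
    it = iter(w)
    # 'ch in it' advances the iterator, so letters are matched left to right.
    return all(ch in it for ch in u)


def leq_tuple(x, y):
    """Componentwise ordering on integer tuples of equal length."""
    return len(x) == len(y) and all(a <= b for a, b in zip(x, y))


def in_upward_closure(x, basis, leq):
    """x lies in the upward closure of basis iff some basis element is below x."""
    return any(leq(b, x) for b in basis)


def minimal_basis(generators, leq):
    """Keep only the generators with no other generator below them.

    Assumes leq is antisymmetric on the given elements, which holds for
    both orderings defined above.
    """
    gens = set(generators)
    return {g for g in gens if not any(h != g and leq(h, g) for h in gens)}
```

For instance, `minimal_basis(["ab", "ba", "aab"], is_subword)` drops `"aab"` because `"ab"` already lies below it, and `in_upward_closure("cacb", {"ab"}, is_subword)` holds because `"ab"` embeds into `"cacb"`.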

Subject: Formal Languages and Automata Theory

Publish: 2025-04-30 08:37:41 UTC

#2 Statistical process discovery

Authors: Pierre Cry, Paolo Ballarini, András Horváth, Pascale Le Gall

Stochastic process discovery is concerned with deriving a model capable of reproducing the stochastic character of observed executions of a given process, stored in a log. This leads to an optimisation problem in which the model's parameter space is searched, driven by the resemblance between the log's and the model's stochastic languages. The bottleneck of such an optimisation problem lies in determining the model's stochastic language, which existing approaches handle through exact computation that scales poorly. In this paper we introduce a novel framework in which we combine a simulation-based Bayesian parameter inference scheme, used to search for the "optimal" instance of a stochastic model, with an expressive statistical model checking engine, used (during inference) to approximate the language of the considered model's instance. Because the approach is simulation-based, the runtime for discovering the optimal instance of a model can easily be traded for accuracy, which makes it possible to treat large models whose runtime would be prohibitive with non-simulation-based alternatives. We validate our approach on several popular event logs concerning real-life systems.
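
The general idea of simulation-based parameter inference can be sketched in Python as follows. Everything here is a toy stand-in under loud assumptions: the one-parameter model, the total variation distance, and the rejection-style approximate Bayesian computation loop are illustrative choices, not the framework or the model checking engine the abstract refers to. All names are hypothetical.

```python
import random
from collections import Counter


def empirical_language(traces):
    """Estimate a stochastic language as relative trace frequencies."""
    counts = Counter(traces)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two finite-support distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def simulate_trace(p_loop, rng, max_len=20):
    """Toy model (assumption): emit 'a', repeat 'b' with probability p_loop, emit 'c'."""
    trace = ["a"]
    while rng.random() < p_loop and len(trace) < max_len:
        trace.append("b")
    trace.append("c")
    return "".join(trace)


def abc_rejection(log, n_candidates=200, n_sims=500, eps=0.1, seed=0):
    """Keep the parameter draws whose simulated language is within eps of the log's."""
    rng = random.Random(seed)
    target = empirical_language(log)
    accepted = []
    for _ in range(n_candidates):
        theta = rng.random()  # uniform prior on [0, 1)
        sims = [simulate_trace(theta, rng) for _ in range(n_sims)]
        if total_variation(target, empirical_language(sims)) <= eps:
            accepted.append(theta)
    return accepted
```

In this sketch, `n_sims` plays the role of the runtime-versus-accuracy knob: fewer simulated traces per candidate give a faster but noisier estimate of the model's language.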

Subject: Formal Languages and Automata Theory

Publish: 2025-04-30 07:44:17 UTC

#3 Neuro-Symbolic Generation of Explanations for Robot Policies with Weighted Signal Temporal Logic

Authors: Mikihisa Yuasa, Ramavarapu S. Sreenivas, Huy T. Tran

Neural network-based policies have demonstrated success in many robotic applications, but often lack human explainability, which poses challenges in safety-critical deployments. To address this, we propose a neuro-symbolic explanation framework that generates a weighted signal temporal logic (wSTL) specification to describe a robot policy in an interpretable form. Existing methods typically produce explanations that are verbose and inconsistent, which hinders explainability, and loose, which means they give little meaningful insight into the underlying policy. We address these issues by introducing a simplification process consisting of predicate filtering, regularization, and iterative pruning. We also introduce three novel explainability evaluation metrics -- conciseness, consistency, and strictness -- to assess explanation quality beyond conventional classification metrics. Our method is validated in three simulated robotic environments, where it outperforms baselines in generating concise, consistent, and strict wSTL explanations without sacrificing classification accuracy. This work bridges policy learning with formal methods, contributing to safer and more transparent decision-making in robotics.

Subjects: Robotics, Formal Languages and Automata Theory

Publish: 2025-04-30 17:51:20 UTC

#4 Efficiently Finding All Minimal and Shortest Absent Subsequences in a String

Authors: Florin Manea, Tina Ringleb, Stefan Siemer, Maximilian Winkler

Given a string $w$, another string $v$ is said to be a subsequence of $w$ if $v$ can be obtained from $w$ by removing some of its letters; on the other hand, $v$ is called an absent subsequence of $w$ if $v$ is not a subsequence of $w$. The existing literature on absent subsequences has focused on understanding, for a string $w$, the set of its shortest absent subsequences (i.e., the shortest strings which are absent subsequences of $w$) and that of its minimal absent subsequences (i.e., those strings which are absent subsequences of $w$ but whose every proper subsequence occurs in $w$). Our contributions to this area of research are the following. Firstly, we present optimal algorithms (with linear time preprocessing and output-linear delay) for the enumeration of the shortest and, respectively, minimal absent subsequences. Secondly, we present optimal algorithms for the incremental enumeration of these strings with linear time preprocessing and constant delay; in this setting, we only output short edit-scripts showing how the currently enumerated string differs from the previous one. Finally, we provide an efficient algorithm for identifying a longest minimal absent subsequence of a string. All our algorithms improve the state-of-the-art results for the aforementioned problems.
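
The definitions above can be checked directly with a small Python sketch. This is a brute-force illustration of the definitions only, with hypothetical function names and an explicitly supplied alphabet as an assumption; it is exponential in the worst case and is not the optimal enumeration algorithm the abstract describes.

```python
from itertools import product


def is_subsequence(v, w):
    """True if v can be obtained from w by removing some of its letters."""
    it = iter(w)
    # 'ch in it' advances the iterator, so letters are matched left to right.
    return all(ch in it for ch in v)


def is_minimal_absent(v, w):
    """v is absent from w, but every proper subsequence of v occurs in w.

    Checking the strings obtained by deleting a single letter suffices,
    because every proper subsequence of v is a subsequence of one of them.
    """
    if is_subsequence(v, w):
        return False
    return all(is_subsequence(v[:i] + v[i + 1:], w) for i in range(len(v)))


def shortest_absent_subsequences(w, alphabet):
    """Brute force: try lengths 1, 2, ... and return the first non-empty batch."""
    if not alphabet:
        raise ValueError("alphabet must be non-empty")
    length = 1
    while True:  # terminates: every string of length len(w) + 1 is absent
        found = ["".join(t) for t in product(alphabet, repeat=length)
                 if not is_subsequence("".join(t), w)]
        if found:
            return found
        length += 1
```

For $w = \texttt{aab}$ over the alphabet $\{a, b\}$, the shortest absent subsequences are `ba` and `bb`, while `aaa` is a minimal absent subsequence that is not a shortest one.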

Subjects: Data Structures and Algorithms, Formal Languages and Automata Theory

Publish: 2025-04-30 09:48:47 UTC