Formal Languages and Automata Theory

2025-01-20 | | Total: 2

#1 FC-Datalog as a Framework for Efficient String Querying [PDF] [Copy] [Kimi] [REL]

Authors: Owen M. Bell, Joel D. Day, Dominik D. Freydenberger

Core spanners are a class of document spanners that capture the core functionality of IBM's AQL. FC is a logic on strings built around word equations that when extended with constraints for regular languages can be seen as a logic for core spanners. The recently introduced FC-Datalog extends FC with recursion, which allows us to define recursive relations for core spanners. Additionally, as FC-Datalog captures P, it is also a tractable version of Datalog on strings. This presents an opportunity for optimization. We propose a series of FC-Datalog fragments with desirable properties in terms of complexity of model checking, expressive power, and efficiency of checking membership in the fragment. This leads to a range of fragments that all capture LOGSPACE, which we further restrict to obtain linear combined complexity. This gives us a framework to tailor fragments for particular applications. To showcase this, we simulate deterministic regex in a tailored fragment of FC-Datalog.

Subjects: Logic in Computer Science , Databases , Formal Languages and Automata Theory

Publish: 2025-01-17 18:34:48 UTC


#2 The structure of polynomial growth for tree automata/transducers and MSO set queries [PDF] [Copy] [Kimi] [REL]

Authors: Paul Gallot, Nathan Lhote, Lê Thành Dũng Nguyên

The ambiguity of a tree automaton is the number of distinct accepting runs over a given input. We show that, in polynomial time, one can decide whether the ambiguity grows polynomially or exponentially in the input size, and, in the former case, one can compute the exact polynomial degree. Equivalently, this amounts to studying the growth of the number of results of set queries in Monadic Second-Order logic (MSO) over ranked trees; we also prove a reparameterization theorem for such queries in the case of polynomial growth. This property of MSO set queries leads directly to a generalization of Bojańczyk's dimension minimization theorem for string-to-string polyregular functions. Our generalization applies to MSO set interpretations from trees, which subsume (as we show) tree-walking transducers and invisible pebble tree-to-string transducers. Finally, with a bit more work we obtain the following: * a new, short and conceptual proof that macro tree transducers of linear growth compute only tree-to-tree MSO transductions (a seminal theorem of Engelfriet and Maneth); * a procedure to decide polynomial size-to-height increase for macro tree transducers and compute the polynomial degree, which extends the recent decidability result of Gallot, Maneth, Nakano and Peyrat concerning linear size-to-height increase.

Subjects: Formal Languages and Automata Theory , Logic in Computer Science

Publish: 2025-01-17 15:57:21 UTC