Harmonic Token Projection (HTP): A Vocabulary-Free, Training-Free, Deterministic, and Reversible Embedding Methodology

#1 Harmonic Token Projection (HTP): A Vocabulary-Free, Training-Free, Deterministic, and Reversible Embedding Methodology [PDF³] [Copy] [Kimi³] [REL]

This paper introduces the Harmonic Token Projection (HTP), a reversible and deterministic framework for generating text embeddings without training, vocabularies, or stochastic parameters. Unlike neural embeddings that rely on statistical co-occurrence or optimization, HTP encodes each token analytically as a harmonic trajectory derived from its Unicode integer representation, establishing a bijective and interpretable mapping between discrete symbols and continuous vector space. The harmonic formulation provides phase-coherent projections that preserve both structure and reversibility, enabling semantic similarity estimation from purely geometric alignment. Experimental evaluation on the Semantic Textual Similarity Benchmark (STS-B) and its multilingual extension shows that HTP achieves a Spearman correlation of \r{ho} = 0.68 in English, maintaining stable performance across ten languages with negligible computational cost and sub-millisecond latency per sentence pair. This demonstrates that meaningful semantic relations can emerge from deterministic geometry, offering a transparent and efficient alternative to data-driven embeddings. Keywords: Harmonic Token Projection, reversible embedding, deterministic encoding, semantic similarity, multilingual representation.

Subjects: Computation and Language , Machine Learning

Publish: 2025-11-10 20:51:18 UTC

2511.20665

#1 Harmonic Token Projection (HTP): A Vocabulary-Free, Training-Free, Deterministic, and Reversible Embedding Methodology [PDF3] [Copy] [Kimi3] [REL]

#1 Harmonic Token Projection (HTP): A Vocabulary-Free, Training-Free, Deterministic, and Reversible Embedding Methodology [PDF³] [Copy] [Kimi³] [REL]