2024-10-29 | | Total: 7
The Travelling Salesman Problem - TSP is one of the most explored problems in the scientific literature to solve real problems regarding the economy, transportation, and logistics, to cite a few cases. Adapting TSP to solve different problems has originated several variants of the optimization problem with more complex objectives and different restrictions. Metaheuristics have been used to solve the problem in polynomial time. Several studies have tried hybridising metaheuristics with specialised heuristics to improve the quality of the solutions. However, we have found no study to evaluate whether the searching mechanism of a particular metaheuristic is more adequate for exploring hybridization. This paper focuses on the solution of the classical TSP using high-level hybridisations, experimenting with eight metaheuristics and heuristics derived from k-OPT, SISR, and segment intersection search, resulting in twenty-four combinations. Some combinations allow more than one set of searching parameters. Problems with 50 to 280 cities are solved. Parameter tuning of the metaheuristics is not carried out, exploiting the different searching patterns of the eight metaheuristics instead. The solutions' quality is compared to those presented in the literature.
Designing optimization approaches, whether heuristic or meta-heuristic, usually demands extensive manual intervention and has difficulty generalizing across diverse problem domains. The combination of Large Language Models (LLMs) and Evolutionary Algorithms (EAs) offers a promising new approach to overcome these limitations and make optimization more automated. In this setup, LLMs act as dynamic agents that can generate, refine, and interpret optimization strategies, while EAs efficiently explore complex solution spaces through evolutionary operators. Since this synergy enables a more efficient and creative search process, we first conduct an extensive review of recent research on the application of LLMs in optimization. We focus on LLMs' dual functionality as solution generators and algorithm designers. Then, we summarize the common and valuable designs in existing work and propose a novel LLM-EA paradigm for automated optimization. Furthermore, centered on this paradigm, we conduct an in-depth analysis of innovative methods for three key components: individual representation, variation operators, and fitness evaluation. We address challenges related to heuristic generation and solution exploration, especially from the LLM prompts' perspective. Our systematic review and thorough analysis of the paradigm can assist researchers in better understanding the current research and promoting the development of combining LLMs with EAs for automated optimization.
An old idea in optimization theory says that since the gradient is a dual vector it may not be subtracted from the weights without first being mapped to the primal space where the weights reside. We take this idea seriously in this paper and construct such a duality map for general neural networks. Our map, which we call modular dualization, forms a unifying theoretical basis for training algorithms that are a) fast and b) scalable. Modular dualization involves first assigning operator norms to layers based on the semantics of each layer, and then using these layerwise norms to recursively induce a duality map on the weight space of the full neural architecture. We conclude by deriving GPU-friendly algorithms for dualizing Embed, Linear and Conv2D layers -- the latter two methods are based on a new rectangular Newton-Schulz iteration that we propose. Our iteration was recently used to set new speed records for training NanoGPT. Overall, we hope that our theory of modular duality will yield a next generation of fast and scalable optimizers for general neural architectures.
Watermarking is a technique that involves embedding nearly unnoticeable statistical signals within generated content to help trace its source. This work focuses on a scenario where an untrusted third-party user sends prompts to a trusted language model (LLM) provider, who then generates a text from their LLM with a watermark. This setup makes it possible for a detector to later identify the source of the text if the user publishes it. The user can modify the generated text by substitutions, insertions, or deletions. Our objective is to develop a statistical method to detect if a published text is LLM-generated from the perspective of a detector. We further propose a methodology to segment the published text into watermarked and non-watermarked sub-strings. The proposed approach is built upon randomization tests and change point detection techniques. We demonstrate that our method ensures Type I and Type II error control and can accurately identify watermarked sub-strings by finding the corresponding change point locations. To validate our technique, we apply it to texts generated by several language models with prompts extracted from Google's C4 dataset and obtain encouraging numerical results. We release all code publicly at https://github.com/doccstat/llm-watermark-cpd.
In recent years, the widespread adoption of Large Language Models (LLMs) has sparked interest in their potential for application within the military domain. However, the current generation of LLMs demonstrate sub-optimal performance on Army use cases, due to the prevalence of domain-specific vocabulary and jargon. In order to fully leverage LLMs in-domain, many organizations have turned to fine-tuning to circumvent the prohibitive costs involved in training new LLMs from scratch. In light of this trend, we explore the viability of adapting open-source LLMs for usage in the Army domain in order to address their existing lack of domain-specificity. Our investigations have resulted in the creation of three distinct generations of TRACLM, a family of LLMs fine-tuned by The Research and Analysis Center (TRAC), Army Futures Command (AFC). Through continuous refinement of our training pipeline, each successive iteration of TRACLM displayed improved capabilities when applied to Army tasks and use cases. Furthermore, throughout our fine-tuning experiments, we recognized the need for an evaluation framework that objectively quantifies the Army domain-specific knowledge of LLMs. To address this, we developed MilBench, an extensible software framework that efficiently evaluates the Army knowledge of a given LLM using tasks derived from doctrine and assessments. We share preliminary results, models, methods, and recommendations on the creation of TRACLM and MilBench. Our work significantly informs the development of LLM technology across the DoD and augments senior leader decisions with respect to artificial intelligence integration.
In the current quantum computing paradigm, significant focus is placed on the reduction or mitigation of quantum decoherence. When designing new quantum processing units, the general objective is to reduce the amount of noise qubits are subject to, and in algorithm design, a large effort is underway to provide scalable error correction or mitigation techniques. Yet some previous work has indicated that certain classes of quantum algorithms, such as quantum machine learning, may, in fact, be intrinsically robust to or even benefit from the presence of a small amount of noise. Here, we demonstrate that noise levels in quantum hardware can be effectively tuned to enhance the ability of quantum neural networks to generalize data, acting akin to regularisation in classical neural networks. As an example, we consider a medical regression task, where, by tuning the noise level in the circuit, we improved the mean squared error loss by 8%.
Reinforcement learning (RL) has been successfully applied to solve the problem of finding obstacle-free paths for autonomous agents operating in stochastic and uncertain environments. However, when the underlying stochastic dynamics of the environment experiences drastic distribution shifts, the optimal policy obtained in the trained environment may be sub-optimal or may entirely fail in helping find goal-reaching paths for the agent. Approaches like domain randomization and robust RL can provide robust policies, but typically assume minor (bounded) distribution shifts. For substantial distribution shifts, retraining (either with a warm-start policy or from scratch) is an alternative approach. In this paper, we develop a novel approach called {\em Evolutionary Robust Policy Optimization} (ERPO), an adaptive re-training algorithm inspired by evolutionary game theory (EGT). ERPO learns an optimal policy for the shifted environment iteratively using a temperature parameter that controls the trade off between exploration and adherence to the old optimal policy. The policy update itself is an instantiation of the replicator dynamics used in EGT. We show that under fairly common sparsity assumptions on rewards in such environments, ERPO converges to the optimal policy in the shifted environment. We empirically demonstrate that for path finding tasks in a number of environments, ERPO outperforms several popular RL and deep RL algorithms (PPO, A3C, DQN) in many scenarios and popular environments. This includes scenarios where the RL algorithms are allowed to train from scratch in the new environment, when they are retrained on the new environment, or when they are used in conjunction with domain randomization. ERPO shows faster policy adaptation, higher average rewards, and reduced computational costs in policy adaptation.