Neural and Evolutionary Computing

Date: Thu, 9 May 2024 | Total: 5

#1 Folded context condensation in Path Integral formalism for infinite context transformers [PDF5] [Copy] [Kimi6]

Authors: Won-Gi Paeng ; Daesuk Kwon

This short note is written for rapid communication of long context training and to share the idea of how to train it with low memory usage. In the note, we generalize the attention algorithm and neural network of Generative Pre-Trained Transformers and reinterpret it in Path integral formalism. First, the role of the transformer is understood as the time evolution of the token state and second, it is suggested that the all key-token states in the same time as the query-token can attend to the attention with the query token states. As a result of the repetitive time evolution, it is discussed that the token states in the past sequence meats the token states in the present sequence so that the attention between separated sequences becomes possible for maintaining infinite contextual information just by using low memory for limited size of sequence. For the experiment, the $12$ input token window size was taken and one GPU with $24$GB memory was used for the pre-training. It was confirmed that more than $150$ length context is preserved. The sampling result of the training, the code and the other details will be included in the revised version of this note later.

#2 Learning Object Semantic Similarity with Self-Supervision [PDF7] [Copy] [Kimi1]

Authors: Arthur Aubret ; Timothy Schaumlöffel ; Gemma Roig ; Jochen Triesch

Humans judge the similarity of two objects not just based on their visual appearance but also based on their semantic relatedness. However, it remains unclear how humans learn about semantic relationships between objects and categories. One important source of semantic knowledge is that semantically related objects frequently co-occur in the same context. For instance, forks and plates are perceived as similar, at least in part, because they are often experienced together in a ``kitchen" or ``eating'' context. Here, we investigate whether a bio-inspired learning principle exploiting such co-occurrence statistics suffices to learn a semantically structured object representation {\em de novo} from raw visual or combined visual and linguistic input. To this end, we simulate temporal sequences of visual experience by binding together short video clips of real-world scenes showing objects in different contexts. A bio-inspired neural network model aligns close-in-time visual representations while also aligning visual and category label representations to simulate visuo-language alignment. Our results show that our model clusters object representations based on their context, e.g. kitchen or bedroom, in particular in high-level layers of the network, akin to humans. In contrast, lower-level layers tend to better reflect object identity or category. To achieve this, the model exploits two distinct strategies: the visuo-language alignment ensures that different objects of the same category are represented similarly, whereas the temporal alignment leverages that objects from the same context are frequently seen in succession to make their representations more similar. Overall, our work suggests temporal and visuo-language alignment as plausible computational principles for explaining the origins of certain forms of semantic knowledge in humans.

#3 Physics-based deep learning reveals rising heating demand heightens air pollution in Norwegian cities [PDF] [Copy] [Kimi]

Authors: Cong Cao ; Ramit Debnath ; R. Michael Alvarez

Policymakers frequently analyze air quality and climate change in isolation, disregarding their interactions. This study explores the influence of specific climate factors on air quality by contrasting a regression model with K-Means Clustering, Hierarchical Clustering, and Random Forest techniques. We employ Physics-based Deep Learning (PBDL) and Long Short-Term Memory (LSTM) to examine the air pollution predictions. Our analysis utilizes ten years (2009-2018) of daily traffic, weather, and air pollution data from three major cities in Norway. Findings from feature selection reveal a correlation between rising heating degree days and heightened air pollution levels, suggesting increased heating activities in Norway are a contributing factor to worsening air quality. PBDL demonstrates superior accuracy in air pollution predictions compared to LSTM. This paper contributes to the growing literature on PBDL methods for more accurate air pollution predictions using environmental variables, aiding policymakers in formulating effective data-driven climate policies.

#4 Analyzing design principles for competitive evolution strategies in constrained search spaces [PDF] [Copy] [Kimi]

Authors: Michael Hellwig ; Hans-Georg Beyer

In the context of the 2018 IEEE Congress of Evolutionary Computation, the Matrix Adaptation Evolution Strategy for constrained optimization turned out to be notably successful in the competition on constrained single objective real-parameter optimization. Across all considered instances the so-called $\epsilon$MAg-ES achieved the second rank. However, it can be considered to be the most successful participant in high dimensions. Unfortunately, the competition result does not provide any information about the modus operandi of a successful algorithm or its suitability for problems of a particular shape. To this end, the present paper is concerned with an extensive empirical analysis of the $\epsilon$MAg-ES working principles that is expected to provide insights about the performance contribution of specific algorithmic components. To avoid rankings with respect to insignificant differences within the algorithm realizations, the paper additionally introduces significance testing into the ranking process.

#5 Learning-to-learn enables rapid learning with phase-change memory-based in-memory computing [PDF] [Copy] [Kimi]

Authors: Thomas Ortner ; Horst Petschenig ; Athanasios Vasilopoulos ; Roland Renner ; Špela Brglez ; Thomas Limbacher ; Enrique Piñero ; Alejandro Linares Barranco ; Angeliki Pantazi ; Robert Legenstein

There is a growing demand for low-power, autonomously learning artificial intelligence (AI) systems that can be applied at the edge and rapidly adapt to the specific situation at deployment site. However, current AI models struggle in such scenarios, often requiring extensive fine-tuning, computational resources, and data. In contrast, humans can effortlessly adjust to new tasks by transferring knowledge from related ones. The concept of learning-to-learn (L2L) mimics this process and enables AI models to rapidly adapt with only little computational effort and data. In-memory computing neuromorphic hardware (NMHW) is inspired by the brain's operating principles and mimics its physical co-location of memory and compute. In this work, we pair L2L with in-memory computing NMHW based on phase-change memory devices to build efficient AI models that can rapidly adapt to new tasks. We demonstrate the versatility of our approach in two scenarios: a convolutional neural network performing image classification and a biologically-inspired spiking neural network generating motor commands for a real robotic arm. Both models rapidly learn with few parameter updates. Deployed on the NMHW, they perform on-par with their software equivalents. Moreover, meta-training of these models can be performed in software with high-precision, alleviating the need for accurate hardware models.