Neural and Evolutionary Computing

2024-10-22 | Total: 14

#1 Spiking Neural Networks as a Controller for Emergent Swarm Agents

Authors: Kevin Zhu ; Connor Mattson ; Shay Snyder ; Ricardo Vega ; Daniel S. Brown ; Maryam Parsa ; Cameron Nowzari

Drones that can swarm and loiter in a certain area cost hundreds of dollars, but mosquitos can do the same and are essentially worthless. To control swarms of low-cost robots, researchers may end up spending countless hours brainstorming robot configurations and policies to "organically" create behaviors which do not need expensive sensors and perception. Existing research explores the possible emergent behaviors in swarms of robots with only a binary sensor and a simple but hand-picked controller structure. Even agents in this highly limited sensing, actuation, and computational capability class can exhibit relatively complex global behaviors such as aggregation, milling, and dispersal, but finding the local interaction rules that enable more collective behaviors remains a significant challenge. This paper investigates the feasibility of training spiking neural networks to find those local interaction rules that result in particular emergent behaviors. In this paper, we focus on simulating a specific milling behavior already known to be producible using very simple binary sensing and acting agents. To do this, we use evolutionary algorithms to evolve not only the parameters (the weights, biases, and delays) of a spiking neural network, but also its structure. To create a baseline, we also show an evolutionary search strategy over the parameters for the incumbent hand-picked binary controller structure. Our simulations show that spiking neural networks can be evolved in binary sensing agents to form a mill.

Subjects: Neural and Evolutionary Computing ; Multiagent Systems ; Systems and Control

Publish: 2024-10-21 16:41:35 UTC

#2 TEXEL: A neuromorphic processor with on-chip learning for beyond-CMOS device integration

Authors: Hugh Greatorex ; Ole Richter ; Michele Mastella ; Madison Cotteret ; Philipp Klein ; Maxime Fabre ; Arianna Rubino ; Willian Soares Girão ; Junren Chen ; Martin Ziegler ; Laura Bégon-Lours ; Giacomo Indiveri ; Elisabetta Chicca

Recent advances in memory technologies, devices and materials have shown great potential for integration into neuromorphic electronic systems. However, a significant gap remains between the development of these materials and the realization of large-scale, fully functional systems. One key challenge is determining which devices and materials are best suited for specific functions and how they can be paired with CMOS circuitry. To address this, we introduce TEXEL, a mixed-signal neuromorphic architecture designed to explore the integration of on-chip learning circuits and novel two- and three-terminal devices. TEXEL serves as an accessible platform to bridge the gap between CMOS-based neuromorphic computation and the latest advancements in emerging devices. In this paper, we demonstrate the readiness of TEXEL for device integration through comprehensive chip measurements and simulations. TEXEL provides a practical system for testing bio-inspired learning algorithms alongside emerging devices, establishing a tangible link between brain-inspired computation and cutting-edge device research.

Subjects: Neural and Evolutionary Computing ; Hardware Architecture ; Emerging Technologies ; Machine Learning

Publish: 2024-10-21 10:30:24 UTC

#3 SNAP: Stopping Catastrophic Forgetting in Hebbian Learning with Sigmoidal Neuronal Adaptive Plasticity

Authors: Tianyi Xu ; Patrick Zheng ; Shiyan Liu ; Sicheng Lyu ; Isabeau Prémont-Schwarz

Artificial Neural Networks (ANNs) suffer from catastrophic forgetting, where learning new tasks causes old tasks to be forgotten. Existing Machine Learning (ML) algorithms, including those using Stochastic Gradient Descent (SGD) and Hebbian Learning, typically update their weights linearly with experience, i.e., independently of their current strength. This contrasts with biological neurons, which at intermediate strengths are very plastic, but consolidate with Long-Term Potentiation (LTP) once they reach a certain strength. We hypothesize this mechanism might help mitigate catastrophic forgetting. We introduce Sigmoidal Neuronal Adaptive Plasticity (SNAP), an artificial approximation to Long-Term Potentiation for ANNs in which the weights follow a sigmoidal growth behaviour, allowing them to consolidate and stabilize when they reach sufficiently large or small values. We then compare SNAP to linear weight growth and exponential weight growth and see that SNAP completely prevents the forgetting of previous tasks for Hebbian Learning but not for SGD-based learning.

Subjects: Neural and Evolutionary Computing ; Artificial Intelligence ; Machine Learning

Publish: 2024-10-20 07:20:33 UTC
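
As a rough illustration of the idea in the SNAP abstract above, the sketch below contrasts a linear Hebbian update with one whose step is scaled by w * (1 - w), so that a weight following it traces a sigmoid-like curve and barely moves once it is close to 0 or 1. The scaling factor, the [0, 1] weight range, the clipping in the linear rule, and all function names are assumptions made for illustration; the abstract does not give the authors' exact formulation.

```python
def clip01(w):
    """Keep a weight inside the assumed [0, 1] range."""
    return max(0.0, min(1.0, w))


def linear_update(w, pre, post, lr=0.1):
    """Plain Hebbian step: the change does not depend on the current weight."""
    return clip01(w + lr * pre * post)


def sigmoidal_update(w, pre, post, lr=0.1):
    """Hebbian step scaled by w * (1 - w) (assumed form, not from the paper).

    The factor is largest at w = 0.5 and vanishes at w = 0 and w = 1,
    so weights near either extreme are effectively consolidated.
    """
    return w + lr * pre * post * w * (1.0 - w)


def run(update, w0, post, steps):
    """Apply `update` repeatedly with a fixed presynaptic activity of 1.0."""
    w = w0
    for _ in range(steps):
        w = update(w, 1.0, post)
    return w


def two_task_demo(update):
    """Train on 'task A' (correlated activity), then briefly on 'task B' (anti-correlated)."""
    after_a = run(update, 0.5, +1.0, 200)
    after_b = run(update, after_a, -1.0, 20)
    return after_a, after_b


if __name__ == "__main__":
    for name, update in [("linear", linear_update), ("sigmoidal", sigmoidal_update)]:
        after_a, after_b = two_task_demo(update)
        print(f"{name}: after A = {after_a:.6f}, after B = {after_b:.6f}")
```

In this toy setting the linear rule loses what it learned on task A after a short exposure to task B, while the sigmoidally scaled rule keeps the consolidated weight almost unchanged.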

#4 Fractional-order spike-timing-dependent gradient descent for multi-layer spiking neural networks

Authors: Yi Yang ; Richard M. Voyles ; Haiyan H. Zhang ; Robert A. Nawrocki

Accumulated detailed knowledge about the neuronal activities in human brains has brought more attention to bio-inspired spiking neural networks (SNNs). In contrast to non-spiking deep neural networks (DNNs), SNNs can encode and transmit spatiotemporal information more efficiently by exploiting biologically realistic and low-power event-driven neuromorphic architectures. However, the supervised learning of SNNs remains a challenge because the spike-timing-dependent plasticity (STDP) of connected spiking neurons is difficult to implement and interpret in existing backpropagation learning schemes. This paper proposes a fractional-order spike-timing-dependent gradient descent (FO-STDGD) learning model by considering a derived nonlinear activation function that describes the relationship between the quasi-instantaneous firing rate and the temporal membrane potentials of nonleaky integrate-and-fire neurons. The training strategy can be generalized to any fractional order between 0 and 2 since the FO-STDGD incorporates the fractional gradient descent method into the calculation of spike-timing-dependent loss gradients. The proposed FO-STDGD model is tested on the MNIST and DVS128 Gesture datasets, and its accuracy under different network structures and fractional orders is analyzed. The classification accuracy increases as the fractional order increases; specifically, the case of fractional order 1.9 improves by 155% relative to the case of fractional order 1 (traditional gradient descent). In addition, our scheme demonstrates state-of-the-art computational efficacy for the same SNN structure and training epochs.

Subjects: Neural and Evolutionary Computing ; Artificial Intelligence ; Machine Learning

Publish: 2024-10-20 05:31:34 UTC

#5 Green vehicle routing problem that jointly optimizes delivery speed and routing based on the characteristics of electric vehicles

Author: YY. Feng

The abundance of materials and the development of the economy have led to the flourishing of the logistics industry, but have also caused pollution. Research on the GVRP (green vehicle routing problem), which plans vehicle routes during transportation to reduce pollution, is also developing rapidly. Further exploration is needed on how to integrate these research findings with real vehicles. This paper establishes an energy consumption model using real electric vehicles, fully considering the physical characteristics of each component of the vehicle, so that distortion in the energy consumption model does not affect the results of route planning. The energy consumption model also incorporates the effects of vehicle start/stop, speed, distance, and load on energy consumption. In addition, a load-first speed optimization algorithm is proposed, which selects the most suitable speed between every two delivery points while planning the route, in order to further reduce energy consumption while meeting the time window. Finally, an improved Adaptive Genetic Algorithm is used to solve for the most energy-efficient route. The experiments show that routes planned with this speed optimization algorithm are generally more energy-efficient than those planned without it. The average energy consumption of constant-speed delivery at different speeds is 17.16% higher than that after speed optimization. The paper thus provides a method that is closer to reality and easier for logistics companies to use, and it also enriches the GVRP model.

Subjects: Neural and Evolutionary Computing ; Artificial Intelligence ; Computational Engineering, Finance, and Science

Publish: 2024-10-04 08:08:15 UTC

#6 BrainTransformers: SNN-LLM

Author: Zhengzheng Tang

This study introduces BrainTransformers, an innovative Large Language Model (LLM) implemented using Spiking Neural Networks (SNN). Our key contributions include: (1) designing SNN-compatible Transformer components such as SNNMatmul, SNNSoftmax, and SNNSiLU; (2) implementing an SNN approximation of the SiLU activation function; and (3) developing a Synapsis module to simulate synaptic plasticity. Our 3-billion parameter model, BrainTransformers-3B-Chat, demonstrates competitive performance across various benchmarks, including MMLU (63.2), BBH (54.1), ARC-C (54.3), and GSM8K (76.3), while potentially offering improved energy efficiency and biological plausibility. The model employs a three-stage training approach, including SNN-specific neuronal synaptic plasticity training. This research opens new avenues for brain-like AI systems in natural language processing and neuromorphic computing. Future work will focus on hardware optimization, developing specialized SNN fine-tuning tools, and exploring practical applications in energy-efficient computing environments.

Subjects: Neural and Evolutionary Computing ; Computation and Language ; Machine Learning

Publish: 2024-10-03 14:17:43 UTC

#7 Metric as Transform: Exploring beyond Affine Transform for Interpretable Neural Network

Author: Suman Sapkota

Artificial Neural Networks of varying architectures are generally paired with an affine transformation at the core. However, we find dot product neurons, which have global influence, less interpretable than neurons based on Euclidean distance, which have local influence (as used in Radial Basis Function Networks). In this work, we explore the generalization of dot product neurons to $l^p$-norm, metrics, and beyond. We find that metrics as transform perform similarly to the affine transform when used in a MultiLayer Perceptron or Convolutional Neural Network. Moreover, we explore various properties of metrics, compare them with the affine transform, and present multiple cases where metrics seem to provide better interpretability. We develop an interpretable local-dictionary-based Neural Network and use it to understand and reject adversarial examples.

Subjects: Machine Learning ; Computer Vision and Pattern Recognition ; Neural and Evolutionary Computing

Publish: 2024-10-21 16:22:19 UTC
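
To make the contrast in the abstract above concrete, the following minimal sketch places a dot product (affine) neuron next to a neuron that responds with the $l^p$ distance between its input and a stored center. The function names and the choice to return the raw distance are assumptions for illustration; the abstract does not specify how the paper parameterizes its metric layers.

```python
def affine_neuron(x, w, b):
    """Standard neuron pre-activation: dot product of input and weights plus a bias."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b


def lp_distance_neuron(x, center, p=2.0):
    """Metric-style neuron (illustrative): l^p distance from the input to a center.

    With p = 2 this is the Euclidean distance used in Radial Basis Function networks.
    """
    return sum(abs(xi - ci) ** p for xi, ci in zip(x, center)) ** (1.0 / p)


if __name__ == "__main__":
    w = [1.0, 1.0]
    for scale in (1.0, 10.0, 100.0):
        x = [scale, scale]
        # The affine response keeps growing as x moves away along w (global influence),
        # while the distance response is smallest only near the stored center (local influence).
        print(scale, affine_neuron(x, w, 0.0), lp_distance_neuron(x, w))
```

The affine neuron's output changes everywhere along the direction of its weight vector, whereas the distance neuron is tied to one region of input space, which is the locality the abstract associates with better interpretability.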

#8 Small Contributions, Small Networks: Efficient Neural Network Pruning Based on Relative Importance

Authors: Mostafa Hussien ; Mahmoud Afifi ; Kim Khoa Nguyen ; Mohamed Cheriet

Recent advancements have scaled neural networks to unprecedented sizes, achieving remarkable performance across a wide range of tasks. However, deploying these large-scale models on resource-constrained devices poses significant challenges due to substantial storage and computational requirements. Neural network pruning has emerged as an effective technique to mitigate these limitations by reducing model size and complexity. In this paper, we introduce an intuitive and interpretable pruning method based on activation statistics, rooted in information theory and statistical analysis. Our approach leverages the statistical properties of neuron activations to identify and remove weights with minimal contributions to neuron outputs. Specifically, we build a distribution of weight contributions across the dataset and utilize its parameters to guide the pruning process. Furthermore, we propose a Pruning-aware Training strategy that incorporates an additional regularization term to enhance the effectiveness of our pruning method. Extensive experiments on multiple datasets and network architectures demonstrate that our method consistently outperforms several baseline and state-of-the-art pruning techniques.

Subjects: Machine Learning ; Artificial Intelligence ; Neural and Evolutionary Computing

Publish: 2024-10-21 16:18:31 UTC
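
The abstract above describes pruning weights whose contributions to a neuron's output are small across the dataset. The sketch below shows one simple way such a criterion could look for a single neuron: each weight's contribution on a sample is taken as |w_j * x_j| divided by the sum of all such terms, averaged over samples, and weights below a threshold are zeroed. This particular statistic, the threshold, and the function names are assumptions for illustration and are not taken from the paper.

```python
def mean_relative_contributions(weights, inputs):
    """Average share of each weight in one neuron's pre-activation magnitude.

    weights: list of floats for a single neuron.
    inputs:  list of input vectors (each the same length as `weights`).
    """
    totals = [0.0] * len(weights)
    counted = 0
    for x in inputs:
        contribs = [abs(w * xi) for w, xi in zip(weights, x)]
        denom = sum(contribs)
        if denom == 0.0:
            continue  # sample carries no information about relative shares
        counted += 1
        for j, c in enumerate(contribs):
            totals[j] += c / denom
    if counted == 0:
        return totals
    return [t / counted for t in totals]


def prune(weights, inputs, threshold=0.05):
    """Zero out weights whose mean relative contribution is below `threshold`."""
    scores = mean_relative_contributions(weights, inputs)
    return [w if s >= threshold else 0.0 for w, s in zip(weights, scores)]


if __name__ == "__main__":
    weights = [1.0, 0.01, -2.0]
    inputs = [[1.0, 1.0, 1.0], [2.0, 1.0, 0.5]]
    print(mean_relative_contributions(weights, inputs))
    print(prune(weights, inputs))
```

Because the score depends on activations as well as weight magnitude, a weight attached to a rarely active input can be pruned even if the weight itself is large.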

#9 Karush-Kuhn-Tucker Condition-Trained Neural Networks (KKT Nets)

Authors: Shreya Arvind ; Rishabh Pomaje ; Rajshekhar V Bhat

This paper presents a novel approach to solving convex optimization problems by leveraging the fact that, under certain regularity conditions, satisfying the Karush-Kuhn-Tucker (KKT) conditions is necessary and sufficient for a set of primal and dual variables to be optimal. Similar to Theory-Trained Neural Networks (TTNNs), the parameters of the convex optimization problem are input to the neural network, and the expected outputs are the optimal primal and dual variables. One choice of loss function in this case, which we refer to as the KKT Loss, measures how well the network's outputs satisfy the KKT conditions. We demonstrate the effectiveness of this approach using a linear program as an example. For this problem, we observe that minimizing the KKT Loss alone outperforms training the network with a weighted sum of the KKT Loss and a Data Loss (the mean-squared error between the ground truth optimal solutions and the network's output). Moreover, minimizing only the Data Loss yields inferior results compared to those obtained by minimizing the KKT Loss. While the approach is promising, the obtained primal and dual solutions are not sufficiently close to the ground truth optimal solutions. In the future, we aim to develop improved models to obtain solutions closer to the ground truth and extend the approach to other problem classes.

Subjects: Machine Learning ; Artificial Intelligence ; Neural and Evolutionary Computing ; Optimization and Control

Publish: 2024-10-21 12:59:58 UTC
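
The KKT Loss mentioned in the abstract above can be sketched for a linear program of the form: minimize c^T x subject to A x <= b. The KKT conditions are stationarity (c + A^T lam = 0), primal feasibility (A x - b <= 0), dual feasibility (lam >= 0), and complementary slackness (lam_i * (A x - b)_i = 0). The code below sums the squared violations of these four conditions. The LP form, the equal weighting of the terms, and the function name are assumptions for illustration; the abstract does not state the paper's exact loss.

```python
def kkt_loss(c, A, b, x, lam):
    """Sum of squared KKT violations for: minimize c^T x subject to A x <= b.

    c, x:   lists of length n.
    b, lam: lists of length m.
    A:      list of m rows, each of length n.
    """
    m, n = len(A), len(c)
    # Constraint values g_i = (A x - b)_i; feasible when g_i <= 0.
    g = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
    # Stationarity residual: gradient of the Lagrangian with respect to x.
    stationarity = [c[j] + sum(A[i][j] * lam[i] for i in range(m)) for j in range(n)]

    loss = sum(s ** 2 for s in stationarity)               # stationarity
    loss += sum(max(0.0, gi) ** 2 for gi in g)             # primal feasibility
    loss += sum(max(0.0, -li) ** 2 for li in lam)          # dual feasibility
    loss += sum((li * gi) ** 2 for li, gi in zip(lam, g))  # complementary slackness
    return loss


if __name__ == "__main__":
    # Toy LP: minimize -x1 - x2 subject to x1 <= 1 and x2 <= 1.
    c = [-1.0, -1.0]
    A = [[1.0, 0.0], [0.0, 1.0]]
    b = [1.0, 1.0]
    print(kkt_loss(c, A, b, x=[1.0, 1.0], lam=[1.0, 1.0]))  # optimal primal-dual pair
    print(kkt_loss(c, A, b, x=[0.0, 0.0], lam=[0.0, 0.0]))  # feasible but not optimal
```

A loss of this kind needs no ground-truth solutions: it is zero exactly when the candidate primal and dual variables satisfy all four conditions, which is why it can be minimized without a Data Loss term.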

#10 Enhancing SNN-based Spatio-Temporal Learning: A Benchmark Dataset and Cross-Modality Attention Model

Authors: Shibo Zhou ; Bo Yang ; Mengwen Yuan ; Runhao Jiang ; Rui Yan ; Gang Pan ; Huajin Tang

Spiking Neural Networks (SNNs), renowned for their low power consumption, brain-inspired architecture, and spatio-temporal representation capabilities, have garnered considerable attention in recent years. Similar to Artificial Neural Networks (ANNs), high-quality benchmark datasets are of great importance to the advances of SNNs. However, our analysis indicates that many prevalent neuromorphic datasets lack strong temporal correlation, preventing SNNs from fully exploiting their spatio-temporal representation capabilities. Meanwhile, the integration of event and frame modalities offers more comprehensive visual spatio-temporal information. Yet, the SNN-based cross-modality fusion remains underexplored. In this work, we present a neuromorphic dataset called DVS-SLR that can better exploit the inherent spatio-temporal properties of SNNs. Compared to existing datasets, it offers advantages in terms of higher temporal correlation, larger scale, and more varied scenarios. In addition, our neuromorphic dataset contains corresponding frame data, which can be used for developing SNN-based fusion methods. By virtue of the dual-modal feature of the dataset, we propose a Cross-Modality Attention (CMA) based fusion method. The CMA model efficiently utilizes the unique advantages of each modality, allowing for SNNs to learn both temporal and spatial attention scores from the spatio-temporal features of event and frame modalities, subsequently allocating these scores across modalities to enhance their synergy. Experimental results demonstrate that our method not only improves recognition accuracy but also ensures robustness across diverse scenarios.

Subjects: Computer Vision and Pattern Recognition ; Machine Learning ; Neural and Evolutionary Computing

Publish: 2024-10-21 06:59:04 UTC

#11 A Remedy to Compute-in-Memory with Dynamic Random Access Memory: 1FeFET-1C Technology for Neuro-Symbolic AI

Authors: Xunzhao Yin ; Hamza Errahmouni Barkam ; Franz Müller ; Yuxiao Jiang ; Mohsen Imani ; Sukhrob Abdulazhanov ; Alptekin Vardar ; Nellie Laleni ; Zijian Zhao ; Jiahui Duan ; Zhiguo Shi ; Siddharth Joshi ; Michael Niemier ; Xiaobo Sharon Hu ; Cheng Zhuo ; Thomas Kämpfe ; Kai Ni

Neuro-symbolic artificial intelligence (AI) excels at learning from noisy and generalized patterns, conducting logical inferences, and providing interpretable reasoning. Comprising a 'neuro' component for feature extraction and a 'symbolic' component for decision-making, neuro-symbolic AI has yet to fully benefit from efficient hardware accelerators. Additionally, current hardware struggles to accommodate applications requiring dynamic resource allocation between these two components. To address these challenges, and to mitigate the typical data-transfer bottleneck of classical Von Neumann architectures, we propose a ferroelectric charge-domain compute-in-memory (CiM) array as the foundational processing element for neuro-symbolic AI. This array seamlessly handles both the critical multiply-accumulate (MAC) operations of the 'neuro' workload and the parallel associative search operations of the 'symbolic' workload. To enable this approach, we introduce an innovative 1FeFET-1C cell, combining a ferroelectric field-effect transistor (FeFET) with a capacitor. This design overcomes the destructive sensing limitations of DRAM in CiM applications while capitalizing on decades of DRAM expertise through a cell structure similar to that of DRAM. It achieves high immunity against FeFET variation, which is crucial for neuro-symbolic AI, and demonstrates superior energy efficiency. The functionalities of our design have been successfully validated through SPICE simulations and prototype fabrication and testing. Our hardware platform has been benchmarked in executing typical neuro-symbolic AI reasoning tasks, showing over 2x improvement in latency and 1000x improvement in energy efficiency compared to GPU-based implementations.

Subjects: Emerging Technologies ; Neural and Evolutionary Computing ; Symbolic Computation

Publish: 2024-10-20 05:52:03 UTC

#12 Universal approximation results for neural networks with non-polynomial activation function over non-compact domains

Authors: Ariel Neufeld ; Philipp Schmocker

In this paper, we generalize the universal approximation property of single-hidden-layer feed-forward neural networks beyond the classical formulation over compact domains. More precisely, by assuming that the activation function is non-polynomial, we derive universal approximation results for neural networks within function spaces over non-compact subsets of a Euclidean space, e.g., weighted spaces, $L^p$-spaces, and (weighted) Sobolev spaces over unbounded domains, where the latter includes the approximation of the (weak) derivatives. Furthermore, we provide some dimension-independent rates for approximating a function with sufficiently regular and integrable Fourier transform by neural networks with non-polynomial activation function.

Subjects: Machine Learning ; Machine Learning ; Neural and Evolutionary Computing ; Classical Analysis and ODEs

Publish: 2024-10-18 09:53:20 UTC

#13 Agent Skill Acquisition for Large Language Models via CycleQD

Authors: So Kuroki ; Taishi Nakamura ; Takuya Akiba ; Yujin Tang

Training large language models to acquire specific skills remains a challenging endeavor. Conventional training approaches often struggle with data distribution imbalances and inadequacies in objective functions that do not align well with task-specific performance. To address these challenges, we introduce CycleQD, a novel approach that leverages the Quality Diversity framework through a cyclic adaptation of the algorithm, along with a model-merging-based crossover and an SVD-based mutation. In CycleQD, each task's performance metric is alternated as the quality measure while the others serve as the behavioral characteristics. This cyclic focus on individual tasks allows for concentrated effort on one task at a time, eliminating the need for data ratio tuning and simplifying the design of the objective function. Empirical results from AgentBench indicate that applying CycleQD to LLAMA3-8B-INSTRUCT based models not only enables them to surpass traditional fine-tuning methods in coding, operating systems, and database tasks, but also achieves performance on par with GPT-3.5-TURBO, which potentially contains many more parameters, across these domains. Crucially, this enhanced performance is achieved while retaining robust language capabilities, as evidenced by its performance on widely adopted language benchmark tasks. We highlight the key design choices in CycleQD, detailing how these contribute to its effectiveness. Furthermore, our method is general and can be applied to image segmentation models, highlighting its applicability across different domains.

Subjects: Computation and Language ; Artificial Intelligence ; Neural and Evolutionary Computing

Publish: 2024-10-16 20:27:15 UTC
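
The cyclic role assignment described in the CycleQD abstract above, where one task's metric acts as the quality measure while the remaining tasks' metrics act as behavioral characteristics, can be sketched as a simple rotation over generations. The function name, the round-robin schedule, and the task labels are assumptions for illustration; the abstract does not describe the schedule in more detail.

```python
def cycle_roles(tasks, generation):
    """Pick which task metric is the quality measure in a given generation.

    Returns (quality_task, behavior_tasks): the selected task's metric is
    optimized as quality, and the other tasks' metrics index the archive
    as behavioral characteristics.
    """
    k = generation % len(tasks)
    quality_task = tasks[k]
    behavior_tasks = [t for i, t in enumerate(tasks) if i != k]
    return quality_task, behavior_tasks


if __name__ == "__main__":
    tasks = ["coding", "operating systems", "database"]
    for generation in range(6):
        quality, behavior = cycle_roles(tasks, generation)
        print(f"generation {generation}: quality = {quality}, behavior = {behavior}")
```

Rotating the roles this way means every task is the optimization target equally often, which is how the abstract motivates avoiding manual data ratio tuning.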

#14 Leveraging Event Streams with Deep Reinforcement Learning for End-to-End UAV Tracking

Authors: Ala Souissi ; Hajer Fradi ; Panagiotis Papadakis

In this paper, we present an approach for active tracking to increase the autonomy of Unmanned Aerial Vehicles (UAVs) using event cameras, low-energy imaging sensors that offer significant advantages in speed and dynamic range. The proposed tracking controller is designed to respond to visual feedback from the mounted event sensor, adjusting the drone movements to follow the target. To leverage the full motion capabilities of a quadrotor and the unique properties of event sensors, we propose an end-to-end deep reinforcement learning (DRL) framework that maps raw sensor data from event streams directly to control actions for the UAV. To learn an optimal policy under highly variable and challenging conditions, we opt for a simulation environment with domain randomization for effective transfer to real-world environments. We demonstrate the effectiveness of our approach through experiments in challenging scenarios, including fast-moving targets and changing lighting conditions, in which it shows improved generalization capabilities.

Subjects: Robotics ; Artificial Intelligence ; Neural and Evolutionary Computing

Publish: 2024-10-03 07:56:40 UTC