Neural and Evolutionary Computing

New submissions

[ total of 9 entries: 1-9 ]

New submissions for Fri, 24 Mar 23

[1]  arXiv:2303.12797 [pdf, other]
Title: An algorithmic framework for the optimization of deep neural networks architectures and hyperparameters
Authors: Julie Keisler (EDF R&D OSIRIS, EDF R&D, CRIStAL), El-Ghazali Talbi (CRIStAL), Sandra Claudel (EDF R&D OSIRIS, EDF R&D), Gilles Cabriel (EDF R&D OSIRIS, EDF R&D)
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

In this paper, we propose an algorithmic framework to automatically generate efficient deep neural networks and optimize their associated hyperparameters. The framework is based on evolving directed acyclic graphs (DAGs), defining a more flexible search space than the existing ones in the literature. It allows mixtures of different classical operations (convolutions, recurrences and dense layers) as well as newer operations such as self-attention. Based on this search space we propose neighbourhood and evolution search operators to optimize both the architecture and the hyperparameters of our networks. These search operators can be used with any metaheuristic capable of handling mixed search spaces. We tested our algorithmic framework with an evolutionary algorithm on a time series prediction benchmark. The results demonstrate that our framework was able to find models outperforming the established baseline on numerous datasets.
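
As a rough illustration of what a DAG-based search space with a neighbourhood operator can look like, here is a minimal Python sketch. It is not the authors' encoding: the operation names in `OPS`, the chain "backbone", the edge probability, and the functions `random_dag`, `neighbour` and `respects_topological_order` are all hypothetical. The sketch keeps the graph acyclic by only allowing edges from a lower-indexed node to a higher-indexed one.

```python
import random

# Hypothetical operation vocabulary, loosely following the abstract.
OPS = ["conv", "recurrent", "dense", "self_attention"]


def random_dag(n_nodes, rng):
    """Sample a toy architecture: one operation per node plus a set of edges.

    Edges always go from a lower index to a higher index, so the graph
    cannot contain a cycle. A chain i -> i+1 keeps every node connected.
    """
    ops = [rng.choice(OPS) for _ in range(n_nodes)]
    edges = {(i, i + 1) for i in range(n_nodes - 1)}
    for i in range(n_nodes):
        for j in range(i + 2, n_nodes):
            if rng.random() < 0.3:  # assumed skip-connection probability
                edges.add((i, j))
    return ops, edges


def neighbour(ops, edges, rng):
    """Return a neighbouring architecture that differs by exactly one move:
    either one node changes its operation or one skip edge is toggled."""
    ops, edges = list(ops), set(edges)
    n = len(ops)
    if n < 3 or rng.random() < 0.5:
        k = rng.randrange(n)
        ops[k] = rng.choice([o for o in OPS if o != ops[k]])
    else:
        i = rng.randrange(n - 2)
        j = rng.randrange(i + 2, n)
        edges ^= {(i, j)}  # add the skip edge if absent, remove it if present
    return ops, edges


def respects_topological_order(edges):
    """Invariant of this encoding: every edge points forward."""
    return all(i < j for i, j in edges)
```

A local search or evolutionary loop could call `neighbour` repeatedly and keep the candidates that score best; the scoring step (training and validating a network) is deliberately left out here.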

[2]  arXiv:2303.12803 [pdf, other]
Title: Evolving Populations of Diverse RL Agents with MAP-Elites
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)

Quality Diversity (QD) has emerged as a powerful alternative optimization paradigm that aims at generating large and diverse collections of solutions, notably with its flagship algorithm MAP-ELITES (ME) which evolves solutions through mutations and crossovers. While very effective for some unstructured problems, early ME implementations relied exclusively on random search to evolve the population of solutions, rendering them notoriously sample-inefficient for high-dimensional problems, such as when evolving neural networks. Follow-up works considered exploiting gradient information to guide the search in order to address these shortcomings through techniques borrowed from either Black-Box Optimization (BBO) or Reinforcement Learning (RL). While mixing RL techniques with ME unlocked state-of-the-art performance for robotics control problems that require a good amount of exploration, it also plagued these ME variants with limitations common among RL algorithms that ME was free of, such as hyperparameter sensitivity, high stochasticity, and training instability, including when the population size increases, since some components are shared across the population in recent approaches. Furthermore, existing approaches mixing ME with RL tend to be tied to a specific RL algorithm, which effectively prevents their use on problems where the corresponding RL algorithm fails. To address these shortcomings, we introduce a flexible framework that allows the use of any RL algorithm and alleviates the aforementioned limitations by evolving populations of agents (whose definition includes hyperparameters and all learnable parameters) instead of just policies. We demonstrate the benefits brought about by our framework through extensive numerical experiments on a number of robotics control problems, some of which have deceptive rewards, taken from the QD-RL literature.
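
For readers unfamiliar with MAP-Elites, the following Python sketch shows the basic archive loop the abstract refers to, in its early mutation-only form. It is not the framework proposed in the paper. The toy `fitness`, the `descriptor` (first two coordinates of the solution), the grid resolution, and all parameter values are assumptions chosen for illustration; crossover and any RL component are omitted.

```python
import random


def fitness(x):
    """Toy objective (assumed): higher is better, maximum at the origin."""
    return -sum(v * v for v in x)


def descriptor(x):
    """Toy behaviour descriptor (assumed): first two coordinates, in [0, 1]."""
    return (x[0], x[1])


def cell_of(desc, bins):
    """Map a descriptor in [0, 1]^2 to a cell of a bins x bins grid."""
    return tuple(min(int(d * bins), bins - 1) for d in desc)


def map_elites(iterations=2000, dim=4, bins=5, sigma=0.1, n_init=100, seed=0):
    """Basic MAP-Elites: keep the best solution found in each descriptor cell."""
    rng = random.Random(seed)
    archive = {}  # cell -> (solution, fitness)
    for it in range(iterations):
        if archive and it >= n_init:
            # Select a random elite and perturb it with Gaussian noise.
            parent, _ = archive[rng.choice(list(archive))]
            child = [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in parent]
        else:
            # Bootstrap the archive with uniformly random solutions.
            child = [rng.random() for _ in range(dim)]
        f = fitness(child)
        cell = cell_of(descriptor(child), bins)
        # Insert if the cell is empty or the child beats the current elite.
        if cell not in archive or f > archive[cell][1]:
            archive[cell] = (child, f)
    return archive
```

The returned archive holds at most `bins * bins` elites, one per cell, which is what makes the result a diverse collection rather than a single optimum.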

[3]  arXiv:2303.13080 [pdf, other]
Title: MSAT: Biologically Inspired Multi-Stage Adaptive Threshold for Conversion of Spiking Neural Networks
Subjects: Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG)

Spiking Neural Networks (SNNs) can perform inference with low power consumption due to their spike sparsity. ANN-SNN conversion is an efficient way to obtain deep SNNs by converting well-trained Artificial Neural Networks (ANNs). However, existing methods commonly use a constant threshold for conversion, which prevents neurons from rapidly delivering spikes to deeper layers and causes high time delay. In addition, the same response for different inputs may result in information loss during transmission. Inspired by biological mechanisms, we propose a multi-stage adaptive threshold (MSAT). Specifically, for each neuron, the dynamic threshold varies with firing history and input properties; it is positively correlated with the average membrane potential and negatively correlated with the rate of depolarization. The self-adaptation to membrane potential and input allows a timely adjustment of the threshold so that neurons fire spikes faster and transmit more information. Moreover, we analyze the Spikes of Inactivated Neurons error, which is pervasive in early time steps, and accordingly propose spike confidence as a measure of confidence that a neuron delivers spikes correctly. We use spike confidence in early time steps to decide whether to elicit a spike, which alleviates this error. Combined with the proposed method, we examine the performance on the non-trivial datasets CIFAR-10, CIFAR-100, and ImageNet. We also conduct sentiment classification and speech recognition experiments on the IMDB and Google speech commands datasets, respectively. Experiments show near-lossless and lower-latency ANN-SNN conversion. To the best of our knowledge, this is the first biologically inspired multi-stage adaptive threshold for converted SNNs, with performance comparable to state-of-the-art methods while improving energy efficiency.
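
To make the constant-threshold baseline concrete, here is a minimal Python sketch of rate coding with an integrate-and-fire neuron, the standard idea behind ANN-SNN conversion. It does not implement MSAT. The function name `if_neuron_rate`, the reset-by-subtraction rule, and the constant input current are assumptions made for illustration.

```python
def if_neuron_rate(input_current, threshold=1.0, steps=100):
    """Firing rate of an integrate-and-fire neuron with a constant threshold.

    The membrane potential accumulates a constant input each time step.
    When it reaches the threshold the neuron emits one spike and the
    threshold is subtracted from the potential (soft reset, assumed here).
    """
    v = 0.0
    spikes = 0
    for _ in range(steps):
        v += input_current
        if v >= threshold:
            spikes += 1
            v -= threshold
    return spikes / steps
```

With a unit threshold, an input of 0.25 yields one spike every four steps, and a negative input never fires, so the rate behaves like a ReLU of the input clipped at one spike per step. The first spike only appears once the potential has climbed to the fixed threshold, which is the kind of delay the abstract attributes to constant-threshold conversion.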

[4]  arXiv:2303.13262 [pdf, ps, other]
Title: Noise impact on recurrent neural network with linear activation function
Comments: 12 pages, 6 figures, 23 references
Subjects: Neural and Evolutionary Computing (cs.NE)

In recent years, a growing number of researchers in the field of neural networks have become interested in hardware implementations in which the neurons and the connections between them are realized physically. The physical implementation of an ANN fundamentally changes how noise influences the network. In the case of hardware ANNs, there are many internal sources of noise with different properties. The purpose of this paper is to study the peculiarities of internal noise propagation in recurrent ANNs using the example of an echo state network (ESN), to reveal ways to suppress such noise, and to justify the stability of networks to some types of noise.
In this paper we analyse an ESN in the presence of uncorrelated additive and multiplicative white Gaussian noise. We consider the case where the artificial neurons have a linear activation function with different slope coefficients. Starting from a single noisy neuron, we then make the problem more complex by considering how the input signal and the memory property affect the accumulation of noise in the ESN. In addition, we consider the influence of the main types of coupling matrices on the accumulation of noise. As such matrices, we take a uniform matrix and diagonal-like matrices with different values of a coefficient called the "blurring" coefficient.
We have found that the general form of the variance and signal-to-noise ratio of the ESN output signal is similar to that of a single neuron. Less noise accumulates in an ESN with a diagonal reservoir connection matrix and a large "blurring" coefficient. This holds especially for uncorrelated multiplicative noise.
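
The single-neuron starting point can be sketched in Python as follows. The noise model is an assumption, not taken from the paper: the clean output `slope * x` is multiplied by `(1 + xi_mul)` and then shifted by `xi_add`, with both terms independent zero-mean Gaussians. Under that assumption the output variance is `(slope * x * sigma_mul)**2 + sigma_add**2`, which the sketch compares against a Monte Carlo estimate. All function and parameter names are hypothetical.

```python
import random
import statistics


def noisy_linear_neuron(x, slope, sigma_add, sigma_mul, rng):
    """One noisy evaluation of a linear neuron (assumed noise model)."""
    clean = slope * x
    return clean * (1.0 + rng.gauss(0.0, sigma_mul)) + rng.gauss(0.0, sigma_add)


def predicted_variance(x, slope, sigma_add, sigma_mul):
    """Closed-form output variance for independent additive and multiplicative noise."""
    return (slope * x * sigma_mul) ** 2 + sigma_add ** 2


def empirical_variance(x, slope, sigma_add, sigma_mul, samples=50000, seed=0):
    """Monte Carlo estimate of the output variance for a fixed input x."""
    rng = random.Random(seed)
    ys = [noisy_linear_neuron(x, slope, sigma_add, sigma_mul, rng)
          for _ in range(samples)]
    return statistics.pvariance(ys)
```

In this model the multiplicative contribution scales with the squared clean signal while the additive contribution does not, so the signal-to-noise ratio `(slope * x)**2 / variance` behaves differently for the two noise types as the input grows.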

Cross-lists for Fri, 24 Mar 23

[5]  arXiv:2303.12807 (cross-list from cs.LG) [pdf, other]
Title: Granular-ball Optimization Algorithm
Comments: 10 pages, 22 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC)

Existing intelligent optimization algorithms are designed based on the finest granularity, i.e., a point. This leads to weak global search ability and inefficiency. To address this problem, we propose a novel multi-granularity optimization algorithm, the granular-ball optimization algorithm (GBO), by introducing granular-ball computing. GBO uses many granular-balls to cover the solution space. Many small, fine-grained granular-balls are used to depict the important parts, and a small number of large, coarse-grained granular-balls are used to depict the inessential parts. This fine multi-granularity description of the data results in a higher global search capability and faster convergence. Experiments on twenty benchmark functions demonstrate better performance than the most popular and state-of-the-art algorithms. The faster speed, closer approximation of the optimal solution, absence of hyperparameters, and simpler design of GBO make it an all-around replacement for most of the existing popular intelligent optimization algorithms.

[6]  arXiv:2303.13037 (cross-list from physics.optics) [pdf]
Title: Universal Linear Intensity Transformations Using Spatially-Incoherent Diffractive Processors
Comments: 29 Pages, 10 Figures
Subjects: Optics (physics.optics); Neural and Evolutionary Computing (cs.NE)

Under spatially-coherent light, a diffractive optical network composed of structured surfaces can be designed to perform any arbitrary complex-valued linear transformation between its input and output fields-of-view (FOVs) if the total number (N) of optimizable phase-only diffractive features is greater than or equal to ~2 Ni x No, where Ni and No refer to the number of useful pixels at the input and the output FOVs, respectively. Here we report the design of a spatially-incoherent diffractive optical processor that can approximate any arbitrary linear transformation in time-averaged intensity between its input and output FOVs. Under spatially-incoherent monochromatic light, the spatially-varying intensity point spread function (H) of a diffractive network, corresponding to a given, arbitrarily-selected linear intensity transformation, can be written as H(m,n;m',n')=|h(m,n;m',n')|^2, where h is the spatially-coherent point spread function of the same diffractive network, and (m,n) and (m',n') define the coordinates of the output and input FOVs, respectively. Using deep learning, supervised through examples of input-output profiles, we numerically demonstrate that a spatially-incoherent diffractive network can be trained to all-optically perform any arbitrary linear intensity transformation between its input and output if N is greater than or equal to ~2 Ni x No. These results constitute the first demonstration of universal linear intensity transformations performed on an input FOV under spatially-incoherent illumination and will be useful for designing all-optical visual processors that can work with incoherent, natural light.
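
The relation H = |h|^2 can be checked numerically with a short Python sketch. Everything concrete here is an assumption for illustration: the 2D coordinates (m,n) and (m',n') are flattened into single output and input indices, spatial incoherence is modelled as an independent uniformly random phase on each input pixel, and the time average is replaced by an average over random phase draws. The function names are hypothetical and no diffractive network is designed or trained.

```python
import cmath
import math
import random


def intensity_psf(h):
    """Intensity point spread function H = |h|^2, element by element."""
    return [[abs(v) ** 2 for v in row] for row in h]


def incoherent_output(h, intensity_in):
    """Time-averaged output intensity predicted by the linear relation H @ I."""
    return [sum(H_ij * I_j for H_ij, I_j in zip(row, intensity_in))
            for row in intensity_psf(h)]


def simulated_output(h, intensity_in, trials=20000, seed=0):
    """Average |h @ field|^2 over random input phases (assumed incoherence model)."""
    rng = random.Random(seed)
    acc = [0.0] * len(h)
    for _ in range(trials):
        # Each input pixel gets amplitude sqrt(I) and an independent random phase.
        field = [math.sqrt(I) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
                 for I in intensity_in]
        for i, row in enumerate(h):
            out = sum(h_ij * f_j for h_ij, f_j in zip(row, field))
            acc[i] += abs(out) ** 2
    return [a / trials for a in acc]


# Arbitrary example values: 2 output pixels, 3 input pixels.
H_COHERENT = [[1 + 1j, 0.5j, -0.3],
              [0.2, 1 - 0.5j, 0.7j]]
I_INPUT = [1.0, 2.0, 0.5]
```

Averaging over the random phases cancels the cross terms between different input pixels, which is why the phase-averaged result of `simulated_output` approaches the linear prediction of `incoherent_output` as the number of trials grows.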

[7]  arXiv:2303.13117 (cross-list from math.OC) [pdf, other]
Title: RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research
Comments: 21 pages
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Reinforcement learning has been applied in operation research and has shown promise in solving large combinatorial optimization problems. However, existing works focus on developing neural network architectures for certain problems. These works lack the flexibility to incorporate recent advances in reinforcement learning, as well as the flexibility to customize model architectures for operation research problems. In this work, we analyze the end-to-end autoregressive models for vehicle routing problems and show that these models can benefit from the recent advances in reinforcement learning with a careful re-implementation of the model architecture. In particular, we re-implemented the Attention Model and trained it with Proximal Policy Optimization (PPO) in CleanRL, showing at least an 8-fold speedup in training time. We hereby introduce RLOR, a flexible framework for Deep Reinforcement Learning for Operation Research. We believe that a flexible framework is key to developing deep reinforcement learning models for operation research problems. The code of our work is publicly available at https://github.com/cpwan/RLOR.

Replacements for Fri, 24 Mar 23

[8]  arXiv:2301.11777 (replaced) [pdf, other]
Title: Interpreting learning in biological neural networks as zero-order optimization method
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Statistics Theory (math.ST)
[9]  arXiv:2303.04238 (replaced) [pdf, other]
Title: Patch of Invisibility: Naturalistic Black-Box Adversarial Attacks on Object Detectors
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
