Neural and Evolutionary Computing
New submissions
New submissions for Fri, 24 Mar 23
 [1] arXiv:2303.12797 [pdf, other]

Title: An algorithmic framework for the optimization of deep neural networks architectures and hyperparameters
Authors: Julie Keisler (EDF R&D OSIRIS, EDF R&D, CRIStAL), El-Ghazali Talbi (CRIStAL), Sandra Claudel (EDF R&D OSIRIS, EDF R&D), Gilles Cabriel (EDF R&D OSIRIS, EDF R&D)
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
In this paper, we propose an algorithmic framework to automatically generate efficient deep neural networks and optimize their associated hyperparameters. The framework is based on evolving directed acyclic graphs (DAGs), defining a more flexible search space than the existing ones in the literature. It allows mixtures of different classical operations: convolutions, recurrences and dense layers, but also more newfangled operations such as self-attention. Based on this search space, we propose neighbourhood and evolution search operators to optimize both the architecture and hyperparameters of our networks. These search operators can be used with any metaheuristic capable of handling mixed search spaces. We tested our algorithmic framework with an evolutionary algorithm on a time series prediction benchmark. The results demonstrate that our framework was able to find models outperforming the established baseline on numerous datasets.
 [2] arXiv:2303.12803 [pdf, other]

Title: Evolving Populations of Diverse RL Agents with MAP-Elites
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI)
Quality Diversity (QD) has emerged as a powerful alternative optimization paradigm that aims at generating large and diverse collections of solutions, notably with its flagship algorithm MAP-ELITES (ME), which evolves solutions through mutations and crossovers. While very effective for some unstructured problems, early ME implementations relied exclusively on random search to evolve the population of solutions, rendering them notoriously sample-inefficient for high-dimensional problems, such as when evolving neural networks. Follow-up works considered exploiting gradient information to guide the search in order to address these shortcomings through techniques borrowed from either Black-Box Optimization (BBO) or Reinforcement Learning (RL). While mixing RL techniques with ME unlocked state-of-the-art performance for robotics control problems that require a good amount of exploration, it also plagued these ME variants with limitations common among RL algorithms that ME was free of, such as hyperparameter sensitivity, high stochasticity, and training instability, including when the population size increases, as some components are shared across the population in recent approaches. Furthermore, existing approaches mixing ME with RL tend to be tied to a specific RL algorithm, which effectively prevents their use on problems where the corresponding RL algorithm fails. To address these shortcomings, we introduce a flexible framework that allows the use of any RL algorithm and alleviates the aforementioned limitations by evolving populations of agents (whose definition includes hyperparameters and all learnable parameters) instead of just policies. We demonstrate the benefits brought about by our framework through extensive numerical experiments on a number of robotics control problems, some with deceptive rewards, taken from the QD-RL literature.
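For readers unfamiliar with MAP-Elites, the basic loop described in the abstract (mutation-only variation, one elite kept per behaviour cell) can be sketched as follows. This is a generic textbook-style illustration, not the framework proposed in the paper: the function names, the one-dimensional behaviour descriptor, the toy fitness, and all parameter values are assumptions made for this example.

```python
import random

def map_elites(fitness, descriptor, dim, bins, iterations, sigma=0.1, seed=0):
    """Minimal MAP-Elites: keep the best solution found in each descriptor cell."""
    rng = random.Random(seed)
    archive = {}  # cell index -> (fitness, solution)

    def try_insert(x):
        # The descriptor is assumed to return a value in [0, 1].
        cell = min(int(descriptor(x) * bins), bins - 1)
        f = fitness(x)
        # Elitism: replace the occupant only if the newcomer is strictly better.
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, x)

    # Seed the archive with random solutions.
    for _ in range(bins):
        try_insert([rng.uniform(-1.0, 1.0) for _ in range(dim)])

    # Main loop: pick a random elite, mutate it, try to insert the child.
    for _ in range(iterations):
        _, parent = rng.choice(list(archive.values()))
        child = [min(1.0, max(-1.0, g + rng.gauss(0.0, sigma))) for g in parent]
        try_insert(child)
    return archive

def toy_fitness(x):
    """Hypothetical objective: higher is better, maximum at the origin."""
    return -sum(g * g for g in x)

def toy_descriptor(x):
    """Hypothetical behaviour descriptor: first gene rescaled from [-1, 1] to [0, 1]."""
    return (x[0] + 1.0) / 2.0

archive = map_elites(toy_fitness, toy_descriptor, dim=4, bins=10, iterations=2000)
for cell in sorted(archive):
    print(cell, round(archive[cell][0], 4))
```

The archive maps each occupied cell to its current elite, so the result is a collection of diverse solutions rather than a single optimum. The random parent selection and Gaussian mutation are the "random search" component that the abstract identifies as sample-inefficient in high dimensions.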
 [3] arXiv:2303.13080 [pdf, other]

Title: MSAT: Biologically Inspired Multi-Stage Adaptive Threshold for Conversion of Spiking Neural Networks
Subjects: Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG)
Spiking Neural Networks (SNNs) can do inference with low power consumption due to their spike sparsity. ANN-SNN conversion is an efficient way to achieve deep SNNs by converting well-trained Artificial Neural Networks (ANNs). However, the existing methods commonly use a constant threshold for conversion, which prevents neurons from rapidly delivering spikes to deeper layers and causes high time delay. In addition, the same response for different inputs may result in information loss during the information transmission. Inspired by the biological model mechanism, we propose a multi-stage adaptive threshold (MSAT). Specifically, for each neuron, the dynamic threshold varies with firing history and input properties and is positively correlated with the average membrane potential and negatively correlated with the rate of depolarization. The self-adaptation to membrane potential and input allows a timely adjustment of the threshold to fire spikes faster and transmit more information. Moreover, we analyze the Spikes of Inactivated Neurons error, which is pervasive in early time steps, and propose spike confidence accordingly as a measurement of confidence about the neurons that correctly deliver spikes. We use such spike confidence in early time steps to determine whether to elicit a spike, in order to alleviate this error. Combined with the proposed method, we examine the performance on the non-trivial datasets CIFAR-10, CIFAR-100, and ImageNet. We also conduct sentiment classification and speech recognition experiments on the IDBM and Google speech commands datasets, respectively. Experiments show near-lossless, lower-latency ANN-SNN conversion. To the best of our knowledge, this is the first work to build a biologically inspired multi-stage adaptive threshold for converted SNNs, with comparable performance to state-of-the-art methods while improving energy efficiency.
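To make the constant-threshold baseline concrete, the sketch below simulates a single integrate-and-fire neuron with a fixed threshold and reset-by-subtraction, a common choice in ANN-SNN conversion. It is not the MSAT rule, whose exact form is not given in the abstract; the function name, the reset scheme, and the parameter values are assumptions for illustration only.

```python
def if_neuron_rate(x, threshold=1.0, steps=100):
    """Simulate an integrate-and-fire neuron driven by a constant input x.

    Returns (firing rate, time step of the first spike or None).
    Uses a constant threshold and reset-by-subtraction (assumed here).
    """
    v = 0.0            # membrane potential
    spikes = 0
    first_spike = None
    for t in range(1, steps + 1):
        v += x                      # integrate the input
        if v >= threshold:          # constant firing threshold
            v -= threshold          # soft reset keeps the residual charge
            spikes += 1
            if first_spike is None:
                first_spike = t
    return spikes / steps, first_spike

# The firing rate approximates ReLU(x) / threshold, capped at one spike per step.
print(if_neuron_rate(0.25))                  # weak input, threshold 1.0
print(if_neuron_rate(0.25, threshold=0.5))   # same input, lower threshold
print(if_neuron_rate(-0.3))                  # negative input never fires
```

With a fixed threshold, a weak input needs several time steps before the first spike is emitted, which is the latency problem the abstract attributes to constant-threshold conversion. Lowering the threshold shortens that delay but changes the rate scaling, which hints at why an input-dependent threshold might help.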
 [4] arXiv:2303.13262 [pdf, ps, other]

Title: Noise impact on recurrent neural network with linear activation function
Comments: 12 pages, 6 figures, 23 references
Subjects: Neural and Evolutionary Computing (cs.NE)
In recent years, more and more researchers in the field of neural networks have become interested in creating hardware implementations where neurons and the connections between them are realized physically. The physical implementation of an ANN fundamentally changes the features of noise influence. In the case of hardware ANNs, there are many internal sources of noise with different properties. The purpose of this paper is to study the peculiarities of internal noise propagation in recurrent ANNs, using the echo state network (ESN) as an example, to reveal ways to suppress such noise, and to justify the stability of networks to some types of noise.
In this paper we analyse an ESN in the presence of uncorrelated additive and multiplicative white Gaussian noise. Here we consider the case when artificial neurons have a linear activation function with different slope coefficients. Starting from a single noisy neuron, we complicate the problem by considering how the input signal and the memory property affect the accumulation of noise in the ESN. In addition, we consider the influence of the main types of coupling matrices on the accumulation of noise: a uniform matrix and diagonal-like matrices with different values of a so-called "blurring" coefficient.
We have found that the general form of the variance and signal-to-noise ratio of the ESN output signal is similar to that of a single neuron. Noise accumulates less in an ESN with a diagonal reservoir connection matrix with a large "blurring" coefficient, especially in the case of uncorrelated multiplicative noise.
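The single-neuron starting point can be illustrated with a toy model. The sketch below assumes a linear neuron whose clean output `slope * x` is scaled by multiplicative Gaussian noise and then shifted by additive Gaussian noise, and a scalar recurrence with additive noise only. Where the noise enters, the function names, and all parameter values are assumptions for this example; the paper's ESN model may differ.

```python
import random

def noisy_linear_neuron(x, slope, sigma_add, sigma_mul, rng):
    """Assumed model: y = slope * x * (1 + xi_mul) + xi_add, with Gaussian noises."""
    clean = slope * x
    return clean * (1.0 + rng.gauss(0.0, sigma_mul)) + rng.gauss(0.0, sigma_add)

def predicted_variance(x, slope, sigma_add, sigma_mul):
    """Output variance of the model above for independent noise sources."""
    return (slope * x) ** 2 * sigma_mul ** 2 + sigma_add ** 2

def empirical_variance(samples):
    """Population variance of a list of samples."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)

def recurrent_noise_variance(slope, feedback, sigma_add, steps):
    """Noise variance after `steps` updates of x <- slope * feedback * x + xi_add.

    Each step scales the accumulated variance by (slope * feedback)^2 and adds
    fresh additive noise, starting from a noise-free state.
    """
    gain_sq = (slope * feedback) ** 2
    var = 0.0
    for _ in range(steps):
        var = gain_sq * var + sigma_add ** 2
    return var

rng = random.Random(0)
samples = [noisy_linear_neuron(2.0, 0.5, 0.1, 0.2, rng) for _ in range(50_000)]
print(empirical_variance(samples), predicted_variance(2.0, 0.5, 0.1, 0.2))
# For |slope * feedback| < 1 the recurrence settles at sigma^2 / (1 - gain^2).
print(recurrent_noise_variance(1.0, 0.5, 0.1, 50), 0.1 ** 2 / (1 - 0.5 ** 2))
```

In this toy model the multiplicative term grows with the signal amplitude while the additive term does not, and the memory of the recurrence amplifies the accumulated additive noise by a factor of 1 / (1 - gain^2).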
Cross-lists for Fri, 24 Mar 23
 [5] arXiv:2303.12807 (cross-list from cs.LG) [pdf, other]

Title: Granular-ball Optimization Algorithm
Comments: 10 pages, 22 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Optimization and Control (math.OC)
The existing intelligent optimization algorithms are designed based on the finest granularity, i.e., a point. This leads to weak global search ability and inefficiency. To address this problem, we propose a novel multi-granularity optimization algorithm, namely the granular-ball optimization algorithm (GBO), by introducing granular-ball computing. GBO uses many granular-balls to cover the solution space. Many small, fine-grained granular-balls are used to depict the important parts, and a small number of large, coarse-grained granular-balls are used to depict the inessential parts. This fine multi-granularity data description ability results in a higher global search capability and faster convergence speed. In comparison with the most popular and state-of-the-art algorithms, the experiments on twenty benchmark functions demonstrate its better performance. Its faster speed, closer approximation of the optimal solution, lack of hyperparameters, and simpler design make GBO an all-around replacement for most of the existing popular intelligent optimization algorithms.
 [6] arXiv:2303.13037 (cross-list from physics.optics) [pdf]

Title: Universal Linear Intensity Transformations Using Spatially-Incoherent Diffractive Processors
Comments: 29 Pages, 10 Figures
Subjects: Optics (physics.optics); Neural and Evolutionary Computing (cs.NE)
Under spatially-coherent light, a diffractive optical network composed of structured surfaces can be designed to perform any arbitrary complex-valued linear transformation between its input and output fields-of-view (FOVs) if the total number (N) of optimizable phase-only diffractive features is greater than or equal to ~2 Ni x No, where Ni and No refer to the number of useful pixels at the input and the output FOVs, respectively. Here we report the design of a spatially-incoherent diffractive optical processor that can approximate any arbitrary linear transformation in time-averaged intensity between its input and output FOVs. Under spatially-incoherent monochromatic light, the spatially-varying intensity point spread function (H) of a diffractive network, corresponding to a given, arbitrarily-selected linear intensity transformation, can be written as H(m,n;m',n') = |h(m,n;m',n')|^2, where h is the spatially-coherent point spread function of the same diffractive network, and (m,n) and (m',n') define the coordinates of the output and input FOVs, respectively. Using deep learning, supervised through examples of input-output profiles, we numerically demonstrate that a spatially-incoherent diffractive network can be trained to all-optically perform any arbitrary linear intensity transformation between its input and output if N is greater than or equal to ~2 Ni x No. These results constitute the first demonstration of universal linear intensity transformations performed on an input FOV under spatially-incoherent illumination and will be useful for designing all-optical visual processors that can work with incoherent, natural light.
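The relation between the coherent point spread function h and the intensity point spread function H can be checked numerically on a small example. The sketch below flattens the pixel coordinates (m,n) and (m',n') into single row and column indices, uses a made-up complex matrix for h, and emulates spatial incoherence by averaging the coherent output intensity over independent input phases on a uniform grid. The matrix values, function names, and phase-grid averaging are assumptions for this illustration, not the paper's procedure.

```python
import cmath
import itertools
import math

def intensity_psf(h):
    """H = |h|^2 element-wise; rows are output pixels, columns are input pixels."""
    return [[abs(c) ** 2 for c in row] for row in h]

def incoherent_output(h, input_intensity):
    """Linear intensity transformation: I_out = H @ I_in."""
    H = intensity_psf(h)
    return [sum(Hmn * I for Hmn, I in zip(row, input_intensity)) for row in H]

def phase_averaged_output(h, input_intensity, n_phases=4):
    """Average coherent output intensity over independent input phases.

    Each input pixel gets every phase on a uniform grid, so the interference
    cross terms between different input pixels average to zero.
    """
    amps = [math.sqrt(I) for I in input_intensity]
    phases = [2 * math.pi * k / n_phases for k in range(n_phases)]
    totals = [0.0] * len(h)
    count = 0
    for combo in itertools.product(phases, repeat=len(amps)):
        field_in = [a * cmath.exp(1j * p) for a, p in zip(amps, combo)]
        for m, row in enumerate(h):
            field_out = sum(c * f for c, f in zip(row, field_in))
            totals[m] += abs(field_out) ** 2
        count += 1
    return [t / count for t in totals]

# Hypothetical coherent PSF with 2 output pixels and 3 input pixels.
h_example = [[1 + 1j, 0.5j, 0.2],
             [0.3, -1j, 0.5 - 0.5j]]
intensity_in = [1.0, 4.0, 0.25]

print(incoherent_output(h_example, intensity_in))
print(phase_averaged_output(h_example, intensity_in))
```

Both functions return the same output intensities up to floating-point error, which is the sense in which the time-averaged response under incoherent light is a linear transformation of the input intensity with kernel |h|^2.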
 [7] arXiv:2303.13117 (cross-list from math.OC) [pdf, other]

Title: RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research
Comments: 21 pages
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Reinforcement learning has been applied in operation research and has shown promise in solving large combinatorial optimization problems. However, existing works focus on developing neural network architectures for certain problems. These works lack the flexibility to incorporate recent advances in reinforcement learning, as well as the flexibility to customize model architectures for operation research problems. In this work, we analyze the end-to-end autoregressive models for vehicle routing problems and show that these models can benefit from the recent advances in reinforcement learning with a careful re-implementation of the model architecture. In particular, we re-implemented the Attention Model and trained it with Proximal Policy Optimization (PPO) in CleanRL, showing at least an 8-fold speed-up in training time. We hereby introduce RLOR, a flexible framework for Deep Reinforcement Learning for Operation Research. We believe that a flexible framework is key to developing deep reinforcement learning models for operation research problems. The code of our work is publicly available at https://github.com/cpwan/RLOR.
Replacements for Fri, 24 Mar 23
 [8] arXiv:2301.11777 (replaced) [pdf, other]

Title: Interpreting learning in biological neural networks as zero-order optimization method
Authors: Johannes Schmidt-Hieber
Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Statistics Theory (math.ST)
 [9] arXiv:2303.04238 (replaced) [pdf, other]

Title: Patch of Invisibility: Naturalistic Black-Box Adversarial Attacks on Object Detectors
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)