Electrical Engineering and Systems Science

New submissions


New submissions for Thu, 26 Nov 20

[1]  arXiv:2011.12354 [pdf, other]
Title: PowerNet: Multi-agent Deep Reinforcement Learning for Scalable Powergrid Control
Comments: 8 pages
Subjects: Systems and Control (eess.SY); Multiagent Systems (cs.MA)

This paper develops an efficient multi-agent deep reinforcement learning algorithm for cooperative control in power grids. Specifically, we consider the decentralized inverter-based secondary voltage control problem in distributed generators (DGs), which is first formulated as a cooperative multi-agent reinforcement learning (MARL) problem. We then propose a novel on-policy MARL algorithm, PowerNet, in which each agent (DG) learns a control policy based on a (sub-)global reward but local states from its neighboring agents. Motivated by the fact that a local control from one agent has limited impact on agents distant from it, we exploit a novel spatial discount factor to reduce the effect from remote agents, to expedite the training process and improve scalability. Furthermore, a differentiable, learning-based communication protocol is employed to foster collaboration among neighboring agents. In addition, to mitigate the effects of system uncertainty and random noise introduced during on-policy learning, we utilize an action smoothing factor to stabilize the policy execution. To facilitate training and evaluation, we develop PGSim, an efficient, high-fidelity power grid simulation platform. Experimental results in two microgrid setups show that the developed PowerNet outperforms a conventional model-based control, as well as several state-of-the-art MARL algorithms. The decentralized learning scheme and high sample efficiency also make it viable for large-scale power grids.
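
As a rough illustration of the spatial discount idea in this abstract, the sketch below scales each agent's reward by a factor alpha raised to its hop distance from the learning agent, so remote agents contribute less. The weighting form, the function names, and the four-node line topology are assumptions made for illustration; they are not taken from the paper.

```python
from collections import deque

def hop_distances(adjacency, source):
    """Breadth-first search: hop count from `source` to every reachable agent."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def spatially_discounted_reward(adjacency, rewards, agent, alpha):
    """Sum of all reachable agents' rewards, each scaled by alpha ** hop distance.

    Assumed form for illustration only; alpha is expected to lie in [0, 1].
    """
    dist = hop_distances(adjacency, agent)
    return sum((alpha ** d) * rewards[j] for j, d in dist.items())

# Hypothetical line topology of four DGs: 0 - 1 - 2 - 3.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
rewards = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}
print(spatially_discounted_reward(adjacency, rewards, agent=0, alpha=0.5))
```

With alpha = 1 every agent counts equally (a fully global reward); with alpha = 0 only the agent's own reward remains.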

[2]  arXiv:2011.12361 [pdf, ps, other]
Title: A New Approach of Data Pre-processing for Data Compression in Smart Grids
Subjects: Signal Processing (eess.SP)

The conventional approach to pre-processing data for compression is to apply transforms such as the Fourier, Karhunen-Loève, or wavelet transforms. One drawback of this approach is that it is independent of how the compressed data will be used, which may induce significant optimality losses when measured in terms of final utility (instead of distortion). We therefore revisit this paradigm by tailoring the data pre-processing operation to the utility function of the decision-making entity that uses the compressed (and therefore noisy) data. More specifically, the utility function consists of an Lp-norm, which is very relevant in the area of smart grids. Both linear and non-linear use-oriented transforms are designed and compared with conventional data pre-processing techniques, showing that the impact of compression noise can be significantly reduced.

[3]  arXiv:2011.12365 [pdf, other]
Title: Online Detection of Low-Quality Synchrophasor Data Considering Frequency Similarity
Comments: 3 pages, 6 figures
Subjects: Systems and Control (eess.SY)

This letter proposes a new approach for online detection of low-quality synchrophasor data under both normal and event conditions. The proposed approach utilizes the features of synchrophasor data in the time and frequency domains to distinguish multiple regional PMU signals and detect low-quality synchrophasor data. The proposed approach does not require any offline study and is more effective at detecting low-quality data with apparently indistinguishable profiles. Case studies from recorded synchrophasor measurements verify the effectiveness of the proposed approach.

[4]  arXiv:2011.12398 [pdf, other]
Title: Distribution Conditional Denoising: A Flexible Discriminative Image Denoiser
Authors: Anthony Kelly
Comments: 10 pages, 8 figures, 4 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

A flexible discriminative image denoiser is introduced in which multi-task learning methods are applied to a denoising FCN based on U-Net. The activations of the U-Net model are modified by affine transforms that are a learned function of conditioning inputs. The learning procedure for multiple noise types and levels involves applying a distribution of noise parameters during training to the conditioning inputs, with the same noise parameters applied to a noise generating layer at the input (similar to the approach taken in a denoising autoencoder). It is shown that this flexible denoising model achieves state-of-the-art performance on images corrupted with Gaussian and Poisson noise. It has also been shown that this conditional training method can generalise a fixed noise level U-Net denoiser to a variety of noise levels.

[5]  arXiv:2011.12401 [pdf, other]
Title: Data acquisition and image processing for solar irradiance forecast
Subjects: Image and Video Processing (eess.IV)

The energy available in a Micro Grid (MG) powered by solar energy is closely tied to the weather conditions at the moment of generation. Very short-term forecasting of solar irradiance gives the MG the capability to automatically control the dispatch of energy. To achieve this, we propose a method for statistical quantification of cloud features extracted from long-wave infrared (IR) images to forecast the Clear Sky Index (CSI). The images are obtained using a data acquisition system (DAQ) mounted on a solar tracker. We explain how to remove cyclostationary bias in the data caused by the devices in the DAQ itself. We investigate a method to obtain the CSI after detrending the Global Horizontal Irradiance (GHI) measurements. We propose a method to fuse multiple exposures of circumsolar visible (VI) light images. We implement a method for extracting physical features using radiometric measurements of the IR camera. We introduce a model to remove from IR images both the effect of atmospheric scattered radiation and the effect of direct radiation from the Sun. We explain how to model the diffuse radiation of the IR camera window, which is produced by water spots and dust particles stuck to the germanium lens of the DAQ enclosure. The frames used to model the camera window are selected using an atmospheric condition model. This model classifies the sky into four categories: clear, cumulus, stratus, and nimbus. We introduce a geometric transformation from the size of the pixels to their actual dimensions in a plane of the atmosphere at a given height. This transformation is performed according to the elevation angle of the Sun and the field of view (FOV) of the camera. We compare the error between the transformation and an approximation of the transformation.

[6]  arXiv:2011.12429 [pdf]
Title: Fully Automated Mitral Inflow Doppler Analysis Using Deep Learning
Journal-ref: IEEE BIBE 2020 Proceedings
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Echocardiography (echo) is an indispensable tool in a cardiologist's diagnostic armamentarium. To date, almost all echocardiographic parameters require time-consuming manual labeling and measurements by an experienced echocardiographer and exhibit significant variability, owing to the noisy and artifact-laden nature of echo images. For example, mitral inflow (MI) Doppler is used to assess left ventricular (LV) diastolic function, which is of paramount clinical importance to distinguish between different cardiac diseases. In the current work we present a fully automated workflow which leverages deep learning to a) label MI Doppler images acquired in an echo study, b) detect the envelope of the MI Doppler signal, and c) extract early and late filling (E and A wave) flow velocities and E-wave deceleration time from the envelope. We trained a variety of convolutional neural network (CNN) models on 5544 images of 140 patients for predicting 24 image classes including MI Doppler images and obtained an overall accuracy of 0.97 on 1737 images of 40 patients. Automated E and A wave velocity showed excellent correlation (Pearson R 0.99 and 0.98 respectively) and Bland-Altman agreement (mean difference 0.06 and 0.05 m/s respectively and SD 0.03 for both) with the operator measurements. Deceleration time also showed good but lower correlation (Pearson R 0.82) and Bland-Altman agreement (mean difference: 34.1 ms, SD: 30.9 ms). These results demonstrate the feasibility of Doppler echocardiography measurement automation and the promise of a fully automated echocardiography measurement package.
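
For readers unfamiliar with the agreement statistics quoted in this abstract, the sketch below shows how a Pearson correlation and the Bland-Altman mean difference and standard deviation are typically computed from paired measurements. The velocity values are made-up placeholders, not data from the paper, and the function names are illustrative.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def bland_altman(x, y):
    """Mean and sample standard deviation of the paired differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    return statistics.fmean(diffs), statistics.stdev(diffs)

# Placeholder E-wave velocities in m/s (hypothetical, for illustration only).
automated = [0.80, 0.95, 1.10, 0.70]
operator = [0.75, 0.90, 1.00, 0.68]
print("Pearson R:", pearson_r(automated, operator))
print("Bland-Altman (mean difference, SD):", bland_altman(automated, operator))
```

A high correlation alone does not rule out a systematic offset between the two methods, which is why the mean difference is reported alongside it.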

[7]  arXiv:2011.12436 [pdf]
Title: Characterisation of CMOS Image Sensor Performance in Low Light Automotive Applications
Journal-ref: Irish Machine Vision and Image Processing Conference Proceedings 2017
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

The applications of automotive cameras in Advanced Driver-Assistance Systems (ADAS) are growing rapidly as automotive manufacturers strive to provide 360 degree protection for their customers. Vision systems must capture high quality images in both daytime and night-time scenarios in order to produce the large informational content required for software analysis in applications such as lane departure, pedestrian detection and collision detection. The challenge in producing high quality images in low light scenarios is that the signal to noise ratio is greatly reduced. This can result in noise becoming the dominant factor in a captured image thereby making these safety systems less effective at night. This paper outlines a systematic method for characterisation of state of the art image sensor performance in response to noise, so as to improve the design and performance of automotive cameras in low light scenarios. The experiment outlined in this paper demonstrates how this method can be used to characterise the performance of CMOS image sensors in response to electrical noise on the power supply lines.

[8]  arXiv:2011.12485 [pdf, other]
Title: Single-Image Lens Flare Removal
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Lens flare is a common artifact in photographs occurring when the camera is pointed at a strong light source. It is caused by either multiple reflections within the lens or scattering due to scratches or dust on the lens, and may appear in a wide variety of patterns: halos, streaks, color bleeding, haze, etc. The diversity in its appearance makes flare removal extremely challenging. Existing software methods make strong assumptions about the artifacts' geometry or brightness, and thus only handle a small subset of flares. We take a principled approach to explicitly model the optical causes of flare, which leads to a novel semi-synthetic pipeline for generating flare-corrupted images from both empirical and wave-optics-simulated lens flares. Using the semi-synthetic data generated by this pipeline, we build a neural network to remove lens flare. Experiments show that our model generalizes well to real lens flares captured by different devices, and outperforms state-of-the-art methods by 3 dB in PSNR.
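
To put the quoted 3 dB PSNR gain in perspective, the sketch below computes PSNR from the mean squared error between a reference and an estimate. The flat pixel lists and the peak value of 255 are illustrative assumptions; the paper's evaluation protocol is not described in this listing.

```python
import math

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Toy example: the second estimate has half the mean squared error of the first.
reference = [0.0, 0.0, 0.0, 0.0]
estimate_a = [2.0, 2.0, 2.0, 2.0]  # MSE = 4
estimate_b = [2.0, 2.0, 0.0, 0.0]  # MSE = 2
print(psnr(reference, estimate_b) - psnr(reference, estimate_a))
```

Because PSNR is logarithmic in the error, halving the mean squared error raises it by 10*log10(2), which is about 3 dB.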

[9]  arXiv:2011.12515 [pdf, other]
Title: MetaSensing: Intelligent Metasurface Assisted RF 3D Sensing by Deep Reinforcement Learning
Comments: 36 pages, 13 figures
Subjects: Signal Processing (eess.SP); Systems and Control (eess.SY)

Using RF signals for wireless sensing has gained increasing attention. However, due to the unwanted multi-path fading in uncontrollable radio environments, the accuracy of RF sensing is limited. Instead of passively adapting to the environment, in this paper, we consider the scenario where an intelligent metasurface is deployed for sensing the existence and locations of 3D objects. By programming its beamformer patterns, the metasurface can provide desirable propagation properties. However, achieving a high sensing accuracy is challenging, since it requires the joint optimization of the beamformer patterns and mapping of the received signals to the sensed outcome. To tackle this challenge, we formulate an optimization problem for minimizing the cross-entropy loss of the sensing outcome, and propose a deep reinforcement learning algorithm to jointly compute the optimal beamformer patterns and the mapping of the received signals. Simulation results verify the effectiveness of the proposed algorithm and show how the sizes of the metasurface and the target space influence the sensing accuracy.

[10]  arXiv:2011.12525 [pdf, other]
Title: Noise2Context: Context-assisted Learning 3D Thin-layer Low Dose CT Without Clean Data
Subjects: Image and Video Processing (eess.IV); Medical Physics (physics.med-ph)

Computed tomography (CT) plays a vital role in medical diagnosis, assessment, and therapy planning. In clinical practice, concerns about increased X-ray radiation exposure are attracting more and more attention. To lower the X-ray dose, low-dose CT (LDCT) is often used in certain scenarios, but it degrades CT image quality. In this paper, we propose a training method that trains denoising neural networks without any paired clean data. We train the denoising neural network to map one noisy LDCT image to its two adjacent LDCT images simultaneously in a single 3D thin-layer low-dose CT scan. In other words, with some latent assumptions, we propose an unsupervised loss function that integrates the similarity between adjacent CT slices in 3D thin-layer low-dose CT to train the denoising neural network in an unsupervised manner. For 3D thin-slice CT scanning, the proposed virtual supervised loss function is equivalent to a supervised loss function with paired noisy and clean samples when the noise in the different slices from a single scan is uncorrelated and zero-mean. Further experiments on the Mayo LDCT dataset and a realistic pig head were carried out and demonstrated superior performance over existing unsupervised methods.

[11]  arXiv:2011.12558 [pdf]
Title: Signal Sets on Time Scales with Application to Hybrid Systems
Authors: Ti-Chung Lee (Senior Member IEEE), Ying Tan (Senior Member IEEE), Iven Mareels (Fellow, IEEE)
Comments: 8 pages, Just submitted to IEEE TAC
Subjects: Systems and Control (eess.SY)

Recently, time scales calculus has been developed to unify continuous and discrete analysis. By extending the definition of time scales properly, this paper introduces the concept of a signal set as well as its stability properties in terms of the so-called pseudo distance measure. This leads to more general Lyapunov-like conditions to check stability properties of systems of a hybrid nature. By way of examples, the proposed framework is used to model hybrid systems with simplicity and flexibility to characterize trajectories in the behavior of hybrid systems.

[12]  arXiv:2011.12564 [pdf]
Title: Soft-Median Choice: An Automatic Feature Smoothing Method for Sound Event Detection
Comments: 5 pages, 6 figures, 1 table
Subjects: Audio and Speech Processing (eess.AS)

In existing Sound Event Detection (SED) algorithms, the roughness of the extracted features causes a decline in precision and recall. To solve this problem, a novel automatic feature smoothing algorithm based on Soft-Median Choice is proposed. Firstly, in a Convolutional Recurrent Neural Network (CRNN), one-dimensional (1-D) convolutional layers are added to extract more temporal information. Secondly, a novel module, Median Choice, with median filters and a Linear Choice is applied in the CRNN to automatically learn features with different smoothing levels. Thirdly, a Soft-Median function is designed to replace the median function so as to dredge the training path and smooth the training process. In the classifier, Linear Softmax is utilized to avoid the shortcomings of attention. Experiments reveal that the proposed method achieves higher precision and recall than the algorithms it is compared against.

[13]  arXiv:2011.12570 [pdf]
Title: Characterization of Multi-Core Fiber Group Delay with Correlation OTDR and Modulation Phase Shift Methods
Comments: This work has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 762055 (BlueSpace Project)
Journal-ref: Optical Fiber Conference (OFC) 2020
Subjects: Signal Processing (eess.SP)

Using a Correlation-OTDR and a modulation phase shift method we characterized four multi-core fibers. The results show that the differential delay depends on the position of the core in the fiber and varies with temperature.

[14]  arXiv:2011.12571 [pdf]
Title: Group Delay Measurements of Multicore Fibers with Correlation Optical Time Domain Reflectometry
Comments: This work was partially funded by the blueSPACE project with funding from the European Union's Horizon 2020 research and innovation programme under grant agreement number 762055 and by the German Federal Ministry of Education and Research (BMBF) under the project OptiCON with grant agreement number 16KIS0989K
Journal-ref: International Conference on Transparent Optical Networks (ICTON) 2020
Subjects: Signal Processing (eess.SP)

Several multi-core fibers (MCF) were characterized using Correlation Optical Time Domain Reflectometry (C-OTDR) in terms of propagation delay and polarization mode dispersion (PMD). The results show that the propagation delay in the cores depends on the position of the core in the fiber and that the differential delay between the cores varies with temperature.

[15]  arXiv:2011.12576 [pdf]
Title: Latency Measurement of 100 km Fiber Using Correlation-OTDR
Comments: This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 762055 (blueSpace Project)
Journal-ref: 20th ITG-Symposium Photonic Networks 2019
Subjects: Signal Processing (eess.SP)

By means of C-OTDR (Correlation Optical Time Domain Reflectometry), we measured the latency of a 100 km fiber with an accuracy of a few picoseconds. Based on iterating 49 measurements, we calculated a standard deviation of 12 ps between the round-trip latency values. To verify the reflection measurements, we used a single-pass setup without a reflector, which showed a maximum difference of only 11 ps.
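
To give a sense of scale for these numbers, the sketch below estimates the propagation delay of a 100 km fiber as group index times length divided by the vacuum speed of light. The group index of 1.468 is an assumed typical value for standard single-mode fiber near 1550 nm; it is not stated in the abstract, and the actual fiber may differ.

```python
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum (exact SI value)

def one_way_latency_s(length_m, group_index):
    """Propagation delay of light through a fiber of the given length."""
    return group_index * length_m / C_VACUUM_M_PER_S

# Assumed group index for standard single-mode fiber (not from the source).
ASSUMED_GROUP_INDEX = 1.468

one_way = one_way_latency_s(100e3, ASSUMED_GROUP_INDEX)
round_trip = 2.0 * one_way
print(f"one-way latency:    {one_way * 1e6:.1f} us")
print(f"round-trip latency: {round_trip * 1e6:.1f} us")
# Ratio of the reported 12 ps standard deviation to the estimated round trip.
print(f"12 ps relative to round trip: {12e-12 / round_trip:.1e}")
```

Under this assumption the delay is on the order of hundreds of microseconds, so a spread of 12 ps corresponds to a relative repeatability of roughly one part in 10^8.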

[16]  arXiv:2011.12610 [pdf, other]
Title: Rank-One Network: An Effective Framework for Image Restoration
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

The principal rank-one (RO) components of an image represent the self-similarity of the image, which is an important property for image restoration. However, the RO components of a corrupted image could be decimated by the procedure of image denoising. We suggest that the RO property should be utilized and the decimation should be avoided in image restoration. To achieve this, we propose a new framework comprised of two modules, i.e., the RO decomposition and RO reconstruction. The RO decomposition is developed to decompose a corrupted image into the RO components and residual. This is achieved by successively applying RO projections to the image or its residuals to extract the RO components. The RO projections, based on neural networks, extract the closest RO component of an image. The RO reconstruction is aimed to reconstruct the important information, respectively from the RO components and residual, as well as to restore the image from this reconstructed information. Experimental results on four tasks, i.e., noise-free image super-resolution (SR), realistic image SR, gray-scale image denoising, and color image denoising, show that the method is effective and efficient for image restoration, and it delivers superior performance for realistic image SR and color image denoising.

[17]  arXiv:2011.12639 [pdf, other]
Title: Computation of Stabilizing and Relatively Optimal Feedback Control Laws Based on Demonstrations
Comments: 34 pages, 21 figures
Subjects: Systems and Control (eess.SY)

Assume a demonstrator that for any given state of a control system produces control input that steers the system toward the equilibrium. In this paper, we present an algorithm that uses such a demonstrator to compute a feedback control law that steers the system toward the equilibrium from any given state, and that, in addition, inherits optimality guarantees from the demonstrator. The resulting feedback control law is based on switched LQR tracking, and hence the resulting controller is much simpler and allows for a much more efficient implementation than a control law based on the direct usage of a typical demonstrator. Our algorithm is inspired by techniques from robot motion planning such as simulation-based LQR trees, but also produces a Lyapunov-like function that provides a certificate for the stability of the resulting controller. Moreover, we provide rigorous convergence and optimality results for the algorithm itself.

[18]  arXiv:2011.12643 [pdf, other]
Title: The Unreasonable Effectiveness of Encoder-Decoder Networks for Retinal Vessel Segmentation
Journal-ref: In: Fu H., Garvin M.K., MacGillivray T., Xu Y., Zheng Y. (eds) Ophthalmic Medical Image Analysis. OMIA 2020. Lecture Notes in Computer Science, vol 12069. Springer, Cham
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

We propose an encoder-decoder framework for the segmentation of blood vessels in retinal images that relies on the extraction of large-scale patches at multiple image-scales during training. Experiments on three fundus image datasets demonstrate that this approach achieves state-of-the-art results and can be implemented using a simple and efficient fully-convolutional network with a parameter count of less than 0.8M. Furthermore, we show that this framework - called VLight - avoids overfitting to specific training images and generalizes well across different datasets, which makes it highly suitable for real-world applications where robustness, accuracy as well as low inference time on high-resolution fundus images is required.

[19]  arXiv:2011.12652 [pdf, other]
Title: Evaluation of quality measures for color quantization
Authors: Giuliana Ramella
Comments: Preprint
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Visual quality evaluation is one of the challenging basic problems in image processing. It also plays a central role in the shaping, implementation, optimization, and testing of many methods. Existing image quality assessment methods have focused on images corrupted by common degradation types, while little attention has been paid to color quantization. This is despite the wide range of applications that require color quantization assessment, in which quantization is used as a preprocessing step because color-based tasks are more efficiently accomplished on a reduced number of colors. In this paper, we propose and carry out a quantitative performance evaluation of nine well-known and commonly used full-reference image quality assessment measures. The evaluation is done by using two publicly available and subjectively rated image quality databases for color quantization degradation and by considering suitable combinations or subparts of them. The results indicate the quality measures that have closer performances in terms of their correlation to the subjective human rating and show that the evaluation of the statistical performance of the quality measures for color quantization is significantly impacted by the selected image quality database while maintaining a similar trend on each database. The detected strong similarity both on individual databases and on databases obtained by integration provides the ability to validate the integration process and to consider the quantitative performance evaluation on each database as an indicator for performance on the other databases. The experimental results are useful to address the choice of suitable quality measures for color quantization and to improve their future employment.

[20]  arXiv:2011.12657 [pdf, other]
Title: Zero-Shot Audio Classification with Factored Linear and Nonlinear Acoustic-Semantic Projections
Comments: Submitted to ICASSP 2021
Subjects: Audio and Speech Processing (eess.AS)

In this paper, we study zero-shot learning in audio classification through factored linear and nonlinear acoustic-semantic projections between audio instances and sound classes. Zero-shot learning in audio classification refers to classification problems that aim at recognizing audio instances of sound classes, which have no available training data but only semantic side information. In this paper, we address zero-shot learning by employing factored linear and nonlinear acoustic-semantic projections. We develop factored linear projections by applying rank decomposition to a bilinear model, and use nonlinear activation functions, such as tanh, to model the non-linearity between acoustic embeddings and semantic embeddings. Compared with the prior bilinear model, experimental results show that the proposed projection methods are effective for improving classification performance of zero-shot learning in audio classification.
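
One way to picture the factored projection described in this abstract: a bilinear compatibility score x^T W y between an acoustic embedding x and a semantic class embedding y is replaced by a low-rank form W = U V^T, so the score becomes the inner product of the two projected vectors. The sketch below follows that reading and optionally applies tanh to each projection. The exact placement of the nonlinearity, the function names, and the toy embeddings are assumptions, not details confirmed by the paper.

```python
import math

def project(matrix, vector):
    """Return matrix^T @ vector, where `matrix` is a list of rows of length r."""
    rank = len(matrix[0])
    return [sum(matrix[i][k] * vector[i] for i in range(len(vector)))
            for k in range(rank)]

def factored_score(U, V, x, y, activation=None):
    """Compatibility score (U^T x) . (V^T y), optionally with an activation."""
    px, py = project(U, x), project(V, y)
    if activation is not None:
        px = [activation(v) for v in px]
        py = [activation(v) for v in py]
    return sum(a * b for a, b in zip(px, py))

def classify(U, V, x, class_embeddings, activation=None):
    """Pick the class whose semantic embedding scores highest against x."""
    return max(class_embeddings,
               key=lambda name: factored_score(U, V, x, class_embeddings[name], activation))

# Hypothetical 2-D embeddings and identity factors, for illustration only.
U = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
class_embeddings = {"dog": [1.0, 0.0], "rain": [0.0, 1.0]}
print(classify(U, V, [0.9, 0.1], class_embeddings))
print(classify(U, V, [0.9, 0.1], class_embeddings, activation=math.tanh))
```

Because classes are represented only by their semantic embeddings, a class unseen during training can be scored at test time by adding its embedding to `class_embeddings`.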

[21]  arXiv:2011.12696 [pdf, other]
Title: Bootstrap an end-to-end ASR system by multilingual training, transfer learning, text-to-text mapping and synthetic audio
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG)

Bootstrapping speech recognition on limited data resources has long been an area of active research. The recent transition to all-neural models and end-to-end (E2E) training brought along particular challenges as these models are known to be data hungry, but also came with opportunities around language-agnostic representations derived from multilingual data as well as shared word-piece output representations across languages that share script and roots. Here, we investigate the effectiveness of different strategies to bootstrap an RNN Transducer (RNN-T) based automatic speech recognition (ASR) system in the low-resource regime, while exploiting the abundant resources available in other languages as well as the synthetic audio from a text-to-speech (TTS) engine. Experiments show that the combination of a multilingual RNN-T word-piece model, post-ASR text-to-text mapping, and synthetic audio can effectively bootstrap an ASR system for a new language in a scalable fashion with little target language data.

[22]  arXiv:2011.12706 [pdf, other]
Title: Quantized Neural Networks for Radar Interference Mitigation
Comments: ITEM Workshop at ECML-PKDD 2020
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

Radar sensors are crucial for environment perception of driver assistance systems as well as autonomous vehicles. Key performance factors are weather resistance and the possibility to directly measure velocity. With a rising number of radar sensors and the so far unregulated automotive radar frequency band, mutual interference is inevitable and must be dealt with. Algorithms and models operating on radar data in early processing stages are required to run directly on specialized hardware, i.e. the radar sensor. This specialized hardware typically has strict resource-constraints, i.e. a low memory capacity and low computational power. Convolutional Neural Network (CNN)-based approaches for denoising and interference mitigation yield promising results for radar processing in terms of performance. However, these models typically contain millions of parameters, stored in hundreds of megabytes of memory, and require additional memory during execution. In this paper we investigate quantization techniques for CNN-based denoising and interference mitigation of radar signals. We analyze the quantization potential of different CNN-based model architectures and sizes by considering (i) quantized weights and (ii) piecewise constant activation functions, which results in reduced memory requirements for model storage and during the inference step respectively.
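
The two quantization targets named in this abstract can be sketched generically as follows: weights are rounded onto a symmetric uniform grid, and a piecewise constant activation maps its input onto a fixed number of steps. This is a plain uniform scheme chosen for illustration; the specific quantizers and bit widths studied in the paper are not given in this listing, and the function names are hypothetical.

```python
def quantize_weights(weights, num_bits):
    """Round weights to a symmetric uniform grid with 2**(num_bits-1) - 1 positive levels."""
    if num_bits < 2:
        raise ValueError("this sketch needs at least 2 bits (one is the sign)")
    levels = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)                  # all-zero weights need no scaling
    scale = max_abs / levels                  # grid spacing
    return [round(w / scale) * scale for w in weights]

def step_activation(x, num_bits):
    """Piecewise constant activation: clip to [0, 1], then snap to 2**num_bits - 1 steps."""
    steps = 2 ** num_bits - 1
    clipped = min(max(x, 0.0), 1.0)
    return round(clipped * steps) / steps

weights = [0.73, -0.41, 0.05, -0.98]
print(quantize_weights(weights, num_bits=8))
print([step_activation(v, num_bits=2) for v in (-0.3, 0.2, 0.6, 1.7)])
```

Storing the integer grid indices instead of floating-point values is what reduces model memory; the step activation likewise bounds the memory needed for intermediate results during inference.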

[23]  arXiv:2011.12724 [pdf, ps, other]
Title: Handling Initial Conditions in Vector Fitting for Real Time Modeling of Power System Dynamics
Subjects: Systems and Control (eess.SY)

This paper develops a predictive modeling algorithm, denoted as Real-Time Vector Fitting (RTVF), which is capable of approximating the real-time linearized dynamics of multi-input multi-output (MIMO) dynamical systems via rational transfer function matrices. Based on a generalization of the well-known Time-Domain Vector Fitting (TDVF) algorithm, RTVF is suitable for online modeling of dynamical systems which experience both initial-state decay contributions in the measured output signals and concurrently active input signals. These adaptations were specifically contrived to meet the needs currently present in the electrical power systems community, where real-time modeling of low frequency power system dynamics is becoming an increasingly coveted tool by power system operators. After introducing and validating the RTVF scheme on synthetic test cases, this paper presents a series of numerical tests on high-order closed-loop generator systems in the IEEE 39-bus test system.

[24]  arXiv:2011.12772 [pdf, ps, other]
Title: Event-triggered Feedback Control for Signal Temporal Logic Tasks
Comments: Conference on Decision and Control (2018), 6 pages
Subjects: Systems and Control (eess.SY)

A framework for the event-triggered control synthesis under signal temporal logic (STL) tasks is proposed. In our previous work, a continuous-time feedback control law was designed, using the prescribed performance control technique, to satisfy STL tasks. We replace this continuous-time feedback control law by an event-triggered controller. The event-triggering mechanism is based on a maximum triggering interval and on a norm bound on the difference between the value of the current state and the value of the state at the last triggering instance. Simulations of a multi-agent system quantitatively show the efficacy of using an event-triggered controller to reduce communication and computation efforts.
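
The triggering rule described in this abstract combines a time condition with a state-deviation condition. A minimal sketch of such a rule is given below, assuming a Euclidean norm and non-strict inequalities; those choices, and all names and numbers, are illustrative rather than taken from the paper.

```python
import math

def should_trigger(t, x, t_last, x_last, max_interval, norm_bound):
    """Return True when the controller should be updated.

    Triggers if the maximum triggering interval has elapsed, or if the state
    has drifted from its value at the last triggering instant by at least
    `norm_bound` (Euclidean norm assumed).
    """
    elapsed = t - t_last
    deviation = math.dist(x, x_last)
    return elapsed >= max_interval or deviation >= norm_bound

# Hypothetical values: small drift shortly after the last trigger.
state_at_last_trigger = [0.0, 0.0]
print(should_trigger(0.1, [0.05, 0.0], 0.0, state_at_last_trigger,
                     max_interval=0.5, norm_bound=1.0))
```

Between triggering instants the control input is held or computed from the last sampled state, which is where the savings in communication and computation come from.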

[25]  arXiv:2011.12775 [pdf, ps, other]
Title: Decentralized Control Barrier Functions for Coupled Multi-Agent Systems under Signal Temporal Logic Tasks
Comments: European Control Conference (2019), 6 pages
Subjects: Systems and Control (eess.SY)

We study the problem of controlling multi-agent systems under a set of signal temporal logic tasks. Signal temporal logic is a formalism that is used to express time and space constraints for dynamical systems. Recent methods to solve the control synthesis problem for single-agent systems under signal temporal logic tasks are, however, subject to a high computational complexity. Methods for multi-agent systems scale at least linearly with the number of agents and induce even higher computational burdens. We propose a computationally-efficient control strategy to solve the multi-agent control synthesis problem that results in a robust satisfaction of a set of signal temporal logic tasks. In particular, a decentralized feedback control law is proposed that is based on time-varying control barrier functions. The obtained control law is discontinuous and formal guarantees are provided by nonsmooth analysis. Simulations show the efficacy of the presented method.

[26]  arXiv:2011.12811 [pdf, ps, other]
Title: Logarithmic Quantization based Symbolic Abstractions for Nonlinear Control Systems
Comments: 6 pages, 3 figures, conference paper
Subjects: Systems and Control (eess.SY)

This paper studies symbolic abstractions for nonlinear control systems using logarithmic quantization. With a logarithmic quantizer, we approximate the state and input sets, and then construct a novel discrete abstraction for nonlinear control systems. A feedback refinement relation between the constructed discrete abstraction and the original system is established. Using the constructed discrete abstraction, the safety controller synthesis problem is studied. With the discrete abstraction and the abstract specification, the existence of a safety controller is investigated, and the algorithm is proposed to compute the abstract controller. Finally, a numerical example is given to illustrate the obtained results.
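
For intuition about logarithmic quantization, the sketch below maps a scalar to the nearest level in a geometric grid of the form plus or minus base_level * rho^i, so that the grid is dense near zero and coarse far from it. This is a simplified textbook-style quantizer with rounding on a logarithmic scale; the quantizer and parameters used in the paper may be defined differently, and the names `base_level` and `rho` are assumptions.

```python
import math

def log_quantize(value, base_level=1.0, rho=0.5):
    """Map `value` to the nearest level among 0 and +/- base_level * rho**i.

    "Nearest" is measured on a logarithmic scale. 0 < rho < 1 sets the level
    density: values of rho closer to 1 give a finer quantizer.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    if value == 0.0:
        return 0.0
    exponent = round(math.log(abs(value) / base_level) / math.log(rho))
    return math.copysign(base_level * rho ** exponent, value)

for v in (0.03, -0.26, 0.7, 3.0):
    print(v, "->", log_quantize(v))
```

A useful property of this construction is that the ratio between the quantized and the original value stays within [sqrt(rho), 1/sqrt(rho)], so the relative error is bounded regardless of magnitude.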

[27]  arXiv:2011.12816 [pdf, ps, other]
Title: Dynamic Quantization based Symbolic Abstractions for Nonlinear Control Systems
Comments: 6 pages, 1 figure, conference paper
Subjects: Systems and Control (eess.SY)

This paper studies the construction of dynamic symbolic abstractions for nonlinear control systems via dynamic quantization. Since computational complexity is a fundamental problem in the use of discrete abstractions, a dynamic quantizer with a time-varying quantization parameter is first applied to deal with this problem. Due to the dynamic quantizer, a dynamic approximation approach is proposed for the state and input sets. Based on the dynamic approximation, dynamic symbolic abstractions are constructed for nonlinear control systems, and an approximate bisimulation relation is guaranteed for the original system and the constructed dynamic symbolic abstraction. Finally, the obtained results are illustrated through a numerical example from path planning of mobile robots.

[28]  arXiv:2011.12824 [pdf, ps, other]
Title: Symbolic Abstractions for Nonlinear Control Systems via Feedback Refinement Relation
Comments: 9 pages, 8 figures
Journal-ref: Automatica, 2020
Subjects: Systems and Control (eess.SY)

This paper studies the construction of symbolic abstractions for nonlinear control systems via feedback refinement relation. Both the delay-free and time-delay cases are addressed. For the delay-free case, to reduce the computational complexity, we propose a new approximation approach for the state and input sets based on a static quantizer, and then a novel symbolic model is constructed such that the original system and the symbolic model satisfy the feedback refinement relation. For the time-delay case, both static and dynamic quantizers are combined to approximate the state and input sets. This leads to a novel dynamic symbolic model for time-delay control systems, and a feedback refinement relation is established between the original system and the symbolic model. Finally, a numerical example is presented to illustrate the obtained results.

[29]  arXiv:2011.12835 [pdf, other]
Title: Privacy Preserving for Medical Image Analysis via Non-Linear Deformation Proxy
Comments: Submitted to CVPR2021
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

We propose a client-server system which allows for the analysis of multi-centric medical images while preserving patient identity. In our approach, the client protects the patient identity by applying a pseudo-random non-linear deformation to the input image. This results in a proxy image which is sent to the server for processing. The server then returns the deformed processed image, which the client reverts to a canonical form. Our system has three components: 1) a flow-field generator which produces a pseudo-random deformation function, 2) a Siamese discriminator that learns the patient identity from the processed image, 3) a medical image processing network that analyzes the content of the proxy images. The system is trained end-to-end in an adversarial manner. By fooling the discriminator, the flow-field generator learns to produce a bi-directional non-linear deformation that makes it possible to remove and recover the identity of the subject from both the input image and the output result. After end-to-end training, the flow-field generator is deployed on the client side and the segmentation network is deployed on the server side. The proposed method is validated on the task of MRI brain segmentation using images from two different datasets. Results show that the segmentation accuracy of our method is similar to that of a system trained on non-encoded images, while considerably reducing the ability to recover subject identity.

[30]  arXiv:2011.12844 [pdf, other]
Title: Physics-informed neural networks for myocardial perfusion MRI quantification
Comments: Submitted to Medical Image Analysis
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)

Tracer-kinetic models allow for the quantification of kinetic parameters such as blood flow from dynamic contrast-enhanced magnetic resonance (MR) images. Fitting the observed data with multi-compartment exchange models is desirable, as they are physiologically plausible and resolve directly for blood flow and microvascular function. However, the reliability of model fitting is limited by the low signal-to-noise ratio, temporal resolution, and acquisition length. This may result in inaccurate parameter estimates.
This study introduces physics-informed neural networks (PINNs) as a means to perform myocardial perfusion MR quantification, which provides a versatile scheme for the inference of kinetic parameters. These neural networks can be trained to fit the observed perfusion MR data while respecting the underlying physical conservation laws described by a multi-compartment exchange model. Here, we provide a framework for the implementation of PINNs in myocardial perfusion MR.
The approach is validated both in silico and in vivo. In the in silico study, an overall reduction in mean-squared error with the ground-truth parameters was observed compared to a standard non-linear least squares fitting approach. The in vivo study demonstrates that the method produces parameter values comparable to those previously found in literature, as well as providing parameter maps which match the clinical diagnosis of patients.

[31]  arXiv:2011.12853 [pdf, other]
Title: A demodulation procedure for multicarrier signals with slowly-varying carriers
Subjects: Signal Processing (eess.SP)

We propose a causal and implementable procedure to demodulate signals encoded by a multicarrier modulator, with slowly-varying carrier shapes. The intended application is the "sensorless" control of AC motors at low velocity by decoding the PWM-induced current ripple.

[32]  arXiv:2011.12857 [pdf, other]
Title: Convolutional Neural Networks for cytoarchitectonic brain mapping at large scale
Comments: Preprint submitted to NeuroImage
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Human brain atlases provide spatial reference systems for data characterizing brain organization at different levels, coming from different brains. Cytoarchitecture is a basic principle of the microstructural organization of the brain, as regional differences in the arrangement and composition of neuronal cells are indicators of changes in connectivity and function. Automated scanning procedures and observer-independent methods are prerequisites to reliably identify cytoarchitectonic areas, and to achieve reproducible models of brain segregation. Time becomes a key factor when moving from the analysis of single regions of interest towards high-throughput scanning of large series of whole-brain sections. Here we present a new workflow for mapping cytoarchitectonic areas in large series of cell-body stained histological sections of human postmortem brains. It is based on a Deep Convolutional Neural Network (CNN), which is trained on a pair of section images with annotations, with a large number of un-annotated sections in between. The model learns to create all missing annotations in between with high accuracy, and faster than our previous workflow based on observer-independent mapping. The new workflow does not require preceding 3D-reconstruction of sections, and is robust against histological artefacts. It processes large data sets with sizes in the order of multiple Terabytes efficiently. The workflow was integrated into a web interface, to allow access without expertise in deep learning and batch computing. Applying deep neural networks for cytoarchitectonic mapping opens new perspectives to enable high-resolution models of brain areas, introducing CNNs to identify borders of brain areas.

[33]  arXiv:2011.12864 [pdf, other]
Title: A Closed-form Localization Method Utilizing Pseudorange Measurements from Two Non-synchronized Positioning Systems
Subjects: Signal Processing (eess.SP)

In a time-of-arrival (TOA) or pseudorange based positioning system, user location is obtained by observing multiple anchor nodes (ANs) at known positions. Utilizing more than one positioning system, e.g., combining the Global Positioning System (GPS) and the BeiDou Navigation Satellite System (BDS), brings better positioning accuracy. However, ANs from two systems are usually synchronized to two different clock sources. Unlike in single-system localization, an extra user-to-system clock offset needs to be handled. Existing dual-system methods either have high computational complexity or sub-optimal positioning accuracy. In this paper, we propose a new closed-form dual-system localization (CDL) approach that has low complexity and optimal localization accuracy. We first convert the nonlinear problem into a linear one by squaring the distance equations and employing intermediate variables. Then, a weighted least squares (WLS) method is used to optimize the positioning accuracy. We prove that the positioning error of the new method reaches the Cramér-Rao lower bound (CRLB) in far-field conditions with small measurement noise. Simulations on 2D and 3D positioning scenes are conducted. Results show that, compared with the iterative approach, which has high complexity and requires a good initialization, the new CDL method does not require initialization and has lower computational complexity with comparable positioning accuracy. Numerical results verify the theoretical analysis on positioning accuracy, and show that the new CDL method has superior performance over the state-of-the-art closed-form method. Experiments using real GPS and BDS data verify the applicability of the new CDL method and the superiority of its performance in the real world.
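
The linearization step described above (squaring the distance equations and introducing intermediate variables) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's CDL method: it uses plain, unweighted least squares rather than WLS, omits any refinement stage, and the function name, argument names, and choice of intermediate variables are all hypothetical. It assumes pseudoranges of the form rho_i = ||p - a_i|| + b_s, where b_s is the clock offset of system s expressed in distance units.

```python
import numpy as np


def dual_system_linear_fix(anchors1, rho1, anchors2, rho2):
    """Estimate position and two clock offsets from squared pseudorange equations.

    Squaring (rho_i - b_s)^2 = ||p - a_i||^2 gives, for system s,
        rho_i^2 - ||a_i||^2 = -2 a_i.p + 2 rho_i b_s + lam_s,
    with the intermediate variable lam_s = ||p||^2 - b_s^2. The unknown vector
    [p, b_1, b_2, lam_1, lam_2] then enters linearly.
    """
    a1 = np.asarray(anchors1, dtype=float)
    a2 = np.asarray(anchors2, dtype=float)
    r1 = np.asarray(rho1, dtype=float)
    r2 = np.asarray(rho2, dtype=float)
    m1, d = a1.shape
    m2 = a2.shape[0]

    A = np.zeros((m1 + m2, d + 4))
    A[:m1, :d] = -2.0 * a1       # position terms, system 1
    A[:m1, d] = 2.0 * r1         # clock offset b_1
    A[:m1, d + 2] = 1.0          # intermediate variable lam_1
    A[m1:, :d] = -2.0 * a2       # position terms, system 2
    A[m1:, d + 1] = 2.0 * r2     # clock offset b_2
    A[m1:, d + 3] = 1.0          # intermediate variable lam_2

    y = np.concatenate([
        r1 ** 2 - np.sum(a1 ** 2, axis=1),
        r2 ** 2 - np.sum(a2 ** 2, axis=1),
    ])

    # Unweighted least squares; a WLS variant would weight rows by noise statistics.
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta[:d], theta[d], theta[d + 1]
```

In 3D this formulation has seven unknowns, so it needs at least seven pseudoranges in total with anchors in general position. The sketch treats the intermediate variables as independent unknowns and ignores their dependence on the position and offsets.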

[34]  arXiv:2011.12865 [pdf, other]
Title: Contrastive Representation Learning for Whole Brain Cytoarchitectonic Mapping in Histological Human Brain Sections
Comments: Preprint submitted to ISBI 2021
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)

Cytoarchitectonic maps provide microstructural reference parcellations of the brain, describing its organization in terms of the spatial arrangement of neuronal cell bodies as measured from histological tissue sections. Recent work provided the first automatic segmentations of cytoarchitectonic areas in the visual system using Convolutional Neural Networks. We aim to extend this approach to become applicable to a wider range of brain areas, envisioning a solution for mapping the complete human brain. Inspired by recent success in image classification, we propose a contrastive learning objective for encoding microscopic image patches into robust microstructural features, which are efficient for cytoarchitectonic area classification. We show that a model pre-trained using this learning task outperforms a model trained from scratch, as well as a model pre-trained on a recently proposed auxiliary task. We perform cluster analysis in the feature space to show that the learned representations form anatomically meaningful groups.

[35]  arXiv:2011.12877 [pdf, other]
Title: Error estimate in Second-order Continuous-Time Sigma-Delta modulators
Subjects: Signal Processing (eess.SP)

Continuous-time Sigma-Delta (CT-$\Sigma\Delta$) modulators are oversampling analog-to-digital converters that may provide higher sampling rates and lower power consumption than their discrete-time counterparts. While approximation errors are established for high-order discrete-time $\Sigma\Delta$ modulators, theoretical analysis of the error between the filtered output and the input remains scarce. This paper presents a general framework to study this error: under regularity assumptions on the input and the filtering kernel, we prove for a second-order CT-$\Sigma\Delta$ that the error estimate may be in $o(1/N^2)$, where $N$ is the oversampling ratio. The whole theory is ultimately validated through numerical experiments.

[36]  arXiv:2011.12892 [pdf]
Title: Single point positioning using full and fractional pseudorange measurements from GPS and BDS
Subjects: Signal Processing (eess.SP)

In conventional global navigation satellite system (GNSS) receivers, full pseudorange measurements are usually required to complete a single point position fix. However, obtaining full pseudorange measurements takes longer than obtaining fractional ones. To shorten the time to first fix and improve the position accuracy during cold or warm start of a dual-constellation GNSS receiver, we propose a positioning algorithm using full and fractional pseudorange measurements from the two navigational constellations. This method uses four full pseudorange measurements from one constellation along with fractional ones from either or both constellations to obtain a potentially rapid position result with the same accuracy as the conventional positioning method using full measurements. Tests with simulated and real Global Positioning System (GPS) and BeiDou Navigation Satellite System (BDS) data demonstrate that the proposed method can generate correct single point position solutions and that the position error is identical to that of the conventional approach using full pseudorange measurements.

[37]  arXiv:2011.12941 [pdf, other]
Title: Small Footprint Convolutional Recurrent Networks for Streaming Wakeword Detection
Comments: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Subjects: Audio and Speech Processing (eess.AS)

In this work, we propose small footprint Convolutional Recurrent Neural Network models applied to the problem of wakeword detection and augment them with scaled dot product attention. We find that false accepts compared to Convolutional Neural Network models in a 250k parameter budget can be reduced by 25% with a 10% reduction in parameter size by using CRNNs, and we can get up to 32% improvement at a 50k parameter budget with 75% reduction in parameter size compared to word-level Dense Neural Network models. We discuss solutions to the challenging problem of performing inference on streaming audio with CRNNs, as well as differences in start-end index errors and latency in comparison to CNN, DNN, and DNN-HMM models.
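
For readers unfamiliar with the attention mechanism mentioned above, here is a minimal NumPy sketch of generic scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V. It is not the paper's model: how attention is wired into the CRNN, and which tensors act as queries, keys, and values, are not stated in the abstract, so the shapes and the function name here are assumptions.

```python
import numpy as np


def scaled_dot_product_attention(q, k, v):
    """Return (output, weights) for queries q (n, d_k), keys k (m, d_k), values v (m, d_v)."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                        # (n, m) similarity scores
    scores = scores - scores.max(axis=-1, keepdims=True)   # stabilize the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

Each output row is a convex combination of the value rows, with weights determined by how well the corresponding query matches each key.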

Cross-lists for Thu, 26 Nov 20

[38]  arXiv:1802.01224 (cross-list from math.OC) [pdf, ps, other]
Title: Optimal Control of Left-Invariant Multi-Agent Systems with Asymmetric Formation Constraints
Comments: This work was supported by the Swedish Research Council (VR), Knut och Alice Wallenberg foundation (KAW), the H2020 Project Co4Robots and the H2020 ERC Starting Grant BUCOPHSYS. arXiv admin note: text overlap with arXiv:1808.04612
Journal-ref: 2018 European Control Conference (ECC), 1728-1733
Subjects: Optimization and Control (math.OC); Multiagent Systems (cs.MA); Systems and Control (eess.SY); Dynamical Systems (math.DS)

In this work, we study an optimal control problem for a multi-agent system modeled by an undirected formation graph with nodes describing the kinematics of each agent, given by a left-invariant control system on a Lie group. The agents should avoid colliding with one another in the workspace. This is achieved by introducing potential functions into the cost function of the optimal control problem. These potentials correspond to fictitious forces induced by the formation constraint among agents; they break the symmetry of the individual agents and the cost functions, and render the optimal control problem partially invariant under a Lie group of symmetries. Reduced necessary conditions for the existence of normal extremals are obtained using techniques of variational calculus on manifolds. As an application, we study an optimal control problem for multiple unicycles.

[39]  arXiv:2011.12353 (cross-list from cs.LG) [pdf, other]
Title: FireSRnet: Geoscience-Driven Super-Resolution of Future Fire Risk from Climate Change
Comments: 9 pages, 7 figures, 2 tables. To be published in Tackling Climate Change with Machine Learning workshop at NeurIPS 2020
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV)

With fires becoming increasingly frequent and severe across the globe in recent years, understanding climate change's role in fire behavior is critical for quantifying current and future fire risk. However, global climate models typically simulate fire behavior at spatial scales too coarse for local risk assessments. Therefore, we propose a novel approach towards super-resolution (SR) enhancement of fire risk exposure maps that incorporates not only 2000 to 2020 monthly satellite observations of active fires but also local information on land cover and temperature. Inspired by SR architectures, we propose an efficient deep learning model trained for SR on fire risk exposure maps. We evaluate this model on resolution enhancement and find it outperforms standard image interpolation techniques at both 4x and 8x enhancement while having comparable performance at 2x enhancement. We then demonstrate the generalizability of this SR model over northern California and New South Wales, Australia. We conclude with a discussion and application of our proposed model to climate model simulations of fire risk in 2040 and 2100, illustrating the potential for SR enhancement of fire risk maps from the latest state-of-the-art climate models.

[40]  arXiv:2011.12360 (cross-list from cs.RO) [pdf, other]
Title: A reinforcement learning control approach for underwater manipulation under position and torque constraints
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

In marine operations, underwater manipulators play an essential role. However, due to uncertainties in the dynamic model and disturbances caused by the environment, low-level control methods require great capabilities to adapt to change. Furthermore, under position and torque constraints the requirements for the control system are greatly increased. Reinforcement learning is a data-driven control technique that can learn complex control policies without the need for a model. The learning capabilities of this type of agent allow for great adaptability to changes in the operating conditions. In this article we present a novel reinforcement learning low-level controller for the position control of an underwater manipulator under torque and position constraints. The reinforcement learning agent is based on an actor-critic architecture using sensor readings as state information. Simulation results using the Reach Alpha 5 underwater manipulator show the advantages of the proposed control strategy.

[41]  arXiv:2011.12362 (cross-list from math.OC) [pdf, ps, other]
Title: A Fixed-Time Stable Adaptation Law for Safety-Critical Control under Parametric Uncertainty
Comments: 8 pages, 4 figures, 4 tables, submitted to 2021 European Control Conference, under review
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

We present a novel technique for solving the problem of safe control for a general class of nonlinear, control-affine systems subject to parametric model uncertainty. Invoking Lyapunov analysis and the notion of fixed-time stability (FxTS), we introduce a parameter adaptation law which guarantees convergence of the estimates of unknown parameters in the system dynamics to their true values within a fixed time, independent of the initial parameter estimation error. We then combine the adaptation law with a robust, adaptive control barrier function (RaCBF) based quadratic program to compute safe control inputs despite the considered model uncertainty. To corroborate our results, we undertake a comparative case study on the efficacy of this result versus other recent approaches in the literature to safe control under uncertainty, and close by highlighting the value of our method in the context of an automobile overtake scenario.
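
To give a sense of what a control-barrier-function quadratic program computes, the sketch below solves the simplest possible case in closed form: a single integrator x' = u with fully known dynamics and one barrier constraint. It is a plain CBF safety filter, not the robust adaptive (RaCBF) formulation of the paper, and it includes no parameter adaptation; the function name, the linear class-K gain `alpha`, and the single-integrator model are assumptions made for illustration.

```python
import numpy as np


def cbf_safety_filter(u_nom, grad_h, h, alpha=1.0):
    """Closed-form solution of
        min_u ||u - u_nom||^2  subject to  grad_h . u + alpha * h >= 0
    for the single integrator x' = u, where h(x) >= 0 defines the safe set.
    """
    u_nom = np.asarray(u_nom, dtype=float)
    a = np.asarray(grad_h, dtype=float)
    slack = a @ u_nom + alpha * h
    norm_sq = a @ a
    # Nominal input already satisfies the constraint, or the gradient vanishes
    # (in which case the input cannot influence the constraint at all).
    if slack >= 0.0 or norm_sq == 0.0:
        return u_nom
    # Otherwise project u_nom onto the constraint boundary along grad_h.
    return u_nom - slack * a / norm_sq
```

With a single affine constraint the QP reduces to a projection onto a half-space, which is why no solver is needed here. Several constraints or input bounds would generally require a numerical QP solver.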

[42]  arXiv:2011.12475 (cross-list from cs.IT) [pdf, other]
Title: Dynamic Hybrid Precoding Relying on Twin-Resolution Phase Shifters in Millimeter-Wave Communication Systems
Comments: 16 pages, 18 figures. Accepted in IEEE transactions on wireless communications
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

Hybrid analog/digital precoding in millimeter-wave (mmWave) multi-input multi-output (MIMO) systems is capable of achieving the near-optimal full-digital performance at reduced hardware cost and power consumption compared to its full-RF digital counterpart. However, having numerous phase shifters is still costly, especially when the phase shifters are of high resolution. In this paper, we propose a novel twin-resolution phase-shifter network for mmWave MIMO systems, which reduces the power consumption of an entirely high-resolution network, whilst mitigating the severe array gain reduction of an entirely low-resolution network. The connections between the twin phase shifters having different resolutions and the antennas are either fixed or dynamically configured. In the latter, we jointly design the phase-shifter network and the hybrid precoding matrix, where the phase of each entry in the analog precoding matrix can be dynamically designed according to the required resolution. This method is slightly modified for the fixed network's hybrid precoding matrix. Furthermore, we extend the proposed method to multi-user MIMO systems and provide its performance analysis. Our simulation results show that the proposed dynamic hybrid precoding method strikes an attractive performance vs. power consumption trade-off.

[43]  arXiv:2011.12536 (cross-list from cs.SD) [pdf, ps, other]
Title: Vocal Tract Length Perturbation for Text-Dependent Speaker Verification with Autoregressive Prediction Coding
Authors: Achintya kr. Sarkar, Zheng-Hua Tan (Senior Member, IEEE)
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)

In this letter, we propose a vocal tract length (VTL) perturbation method for text-dependent speaker verification (TD-SV), in which a set of TD-SV systems are trained, one for each VTL factor, and score-level fusion is applied to make a final decision. Next, we explore the bottleneck (BN) feature extracted by training deep neural networks with a self-supervised objective, autoregressive predictive coding (APC), for TD-SV and compare it with the well-studied speaker-discriminant BN feature. The proposed VTL method is then applied to APC and speaker-discriminant BN features. In the end, we combine the VTL perturbation systems trained on MFCC and the two BN features in the score domain. Experiments are performed on the RedDots challenge 2016 database of TD-SV using short utterances with Gaussian mixture model-universal background model and i-vector techniques. Results show the proposed methods significantly outperform the baselines.

[44]  arXiv:2011.12539 (cross-list from cs.LG) [pdf, other]
Title: Leveraging Predictions in Smoothed Online Convex Optimization via Gradient-based Algorithms
Authors: Yingying Li, Na Li
Comments: NeurIPS 2020
Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY); Optimization and Control (math.OC)

We consider online convex optimization with time-varying stage costs and additional switching costs. Since the switching costs introduce coupling across all stages, multi-step-ahead (long-term) predictions are incorporated to improve the online performance. However, longer-term predictions tend to suffer from lower quality. Thus, a critical question is: how to reduce the impact of long-term prediction errors on the online performance? To address this question, we introduce a gradient-based online algorithm, Receding Horizon Inexact Gradient (RHIG), and analyze its performance by dynamic regrets in terms of the temporal variation of the environment and the prediction errors. RHIG only considers at most $W$-step-ahead predictions to avoid being misled by worse predictions in the longer term. The optimal choice of $W$ suggested by our regret bounds depends on the tradeoff between the variation of the environment and the prediction accuracy. Additionally, we apply RHIG to a well-established stochastic prediction error model and provide expected regret and concentration bounds under correlated prediction errors. Lastly, we numerically test the performance of RHIG on quadrotor tracking problems.

[45]  arXiv:2011.12540 (cross-list from math.OC) [pdf, other]
Title: Community Energy Storage-based Energy Trading Management for Cost Benefits and Network Support
Comments: Accepted to appear in the proceedings of the International Conference on Smart Grids and Energy Systems (SGES), 2020
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

In this paper, the extent to which the integration of rooftop photovoltaic (PV) power with a community energy storage (CES) system can reduce energy cost and distribution network (DN) loss is explored. To this end, three energy trading systems (ETSs) are compared; first, an ETS where PV users exchange energy with the CES system in addition to the grid, second, an ETS where PV users merely exchange energy with the CES system, and third, an ETS where PV users only exchange energy with the grid. A multi-objective optimization framework, combined with a linear distribution network power flow model, is developed to study the trade-off between the energy cost and network power loss reductions while satisfying the DN voltage and current flow limits. Simulations, with real energy demand and PV power data, highlight that enabling the energy exchange between the users and the CES system can give a better trade-off between the DN power loss and energy cost reductions. Further, simulations demonstrate that all three ETSs deliver nearly 85% DN energy loss reduction with significantly increased revenues compared to an ETS without a CES system.

[46]  arXiv:2011.12569 (cross-list from cs.RO) [pdf, other]
Title: Learning Certified Control using Contraction Metric
Comments: Accepted to Conference on Robot Learning (CoRL) 2020
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

In this paper, we solve the problem of finding a certified control policy that drives a robot from any given initial state and under any bounded disturbance to the desired reference trajectory, with guarantees on the convergence or bounds on the tracking error. Such a controller is crucial in safe motion planning. We leverage the advanced theory in Control Contraction Metric and design a learning framework based on neural networks to co-synthesize the contraction metric and the controller for control-affine systems. We further provide methods to validate the convergence and bounded error guarantees. We demonstrate the performance of our method using a suite of challenging robotic models, including models with learned dynamics as neural networks. We compare our approach with leading methods using sum-of-squares programming, reinforcement learning, and model predictive control. Results show that our methods indeed can handle a broader class of systems with less tracking error and faster execution speed. Code is available at https://github.com/sundw2014/C3M.

[47]  arXiv:2011.12596 (cross-list from cs.SD) [pdf, other]
Title: MTCRNN: A multi-scale RNN for directed audio texture synthesis
Authors: M. Huzaifah, L. Wyse
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)

Audio textures are a subset of environmental sounds, often defined as sounds that have stable statistical characteristics within an adequately large window of time but may be unstructured locally. They include common everyday sounds such as those of rain, wind, and engines. Given that these complex sounds contain patterns on multiple timescales, they are a challenge to model with traditional methods. We introduce a novel modelling approach for textures, combining recurrent neural networks trained at different levels of abstraction with a conditioning strategy that allows for user-directed synthesis. We demonstrate the model's performance on a variety of datasets, examine its performance on various metrics, and discuss some potential applications.

[48]  arXiv:2011.12649 (cross-list from cs.CL) [pdf, other]
Title: Neural Representations for Modeling Variation in English Speech
Comments: Submitted to Journal of Phonetics
Subjects: Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)

Variation in speech is often represented and investigated using phonetic transcriptions, but transcribing speech is time-consuming and error prone. To create reliable representations of speech independent from phonetic transcriptions, we investigate the extraction of acoustic embeddings from several self-supervised neural models. We use these representations to compute word-based pronunciation differences between non-native and native speakers of English, and evaluate these differences by comparing them with human native-likeness judgments. We show that Transformer-based speech representations lead to significant performance gains over the use of phonetic transcriptions, and find that feature-based use of Transformer models is most effective with one or more middle layers instead of the final layer. We also demonstrate that these neural speech representations not only capture segmental differences, but also intonational and durational differences that cannot be represented by a set of discrete symbols used in phonetic transcriptions.

[49]  arXiv:2011.12688 (cross-list from cs.CV) [pdf, other]
Title: Reduced Reference Perceptual Quality Model and Application to Rate Control for 3D Point Cloud Compression
Comments: 14 figures and 7 tables, submitted to IEEE T IP
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

In rate-distortion optimization, the encoder settings are determined by maximizing a reconstruction quality measure subject to a constraint on the bit rate. One of the main challenges of this approach is to define a quality measure that can be computed with low computational cost and which correlates well with perceptual quality. While several quality measures that fulfil these two criteria have been developed for images and video, no such measure exists for 3D point clouds. We address this limitation for the video-based point cloud compression (V-PCC) standard by proposing a linear perceptual quality model whose variables are the V-PCC geometry and color quantization parameters and whose coefficients can easily be computed from two features extracted from the original 3D point cloud. Subjective quality tests with 400 compressed 3D point clouds show that the proposed model correlates well with the mean opinion score, outperforming state-of-the-art full reference objective measures in terms of the Spearman rank-order and Pearson linear correlation coefficients. Moreover, we show that for the same target bit rate, rate-distortion optimization based on the proposed model offers higher perceptual quality than rate-distortion optimization based on exhaustive search with a point-to-point objective quality metric.

[50]  arXiv:2011.12690 (cross-list from cs.LG) [pdf, other]
Title: DeepKoCo: Efficient latent planning with an invariant Koopman representation
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Systems and Control (eess.SY)

This paper presents DeepKoCo, a novel model-based agent that learns a latent Koopman representation from images. This representation allows DeepKoCo to plan efficiently using linear control methods, such as linear model predictive control. Compared to traditional agents, DeepKoCo is invariant to task-irrelevant dynamics, thanks to the use of a tailored lossy autoencoder network that allows DeepKoCo to learn latent dynamics that reconstruct and predict only observed costs, rather than all observed dynamics. As our results show, DeepKoCo achieves final performance similar to that of traditional model-free methods on complex control tasks, while being considerably more robust to distractor dynamics, making the proposed agent more amenable to real-life applications.

[51]  arXiv:2011.12707 (cross-list from cs.LG) [pdf, other]
Title: Prediction of neonatal mortality in Sub-Saharan African countries using data-level linkage of multiple surveys
Comments: 3 pages
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Databases (cs.DB); Signal Processing (eess.SP)

Existing datasets available to address crucial problems, such as child mortality and family planning discontinuation in developing countries, are not ample for data-driven approaches. This is partly due to disjoint data collection efforts employed across locations, times, and variations of modalities. On the other hand, state-of-the-art methods for small-data problems are confined to image modalities. In this work, we propose a data-level linkage of disjoint surveys across Sub-Saharan African countries to improve prediction performance of neonatal death and provide cross-domain explainability.

[52]  arXiv:2011.12713 (cross-list from cs.CR) [pdf]
Title: A Secure Deep Probabilistic Dynamic Thermal Line Rating Prediction
Comments: The work is accepted for publication in Journal of Modern Power Systems and Clean Energy
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Signal Processing (eess.SP)

Accurate short-term prediction of overhead line (OHL) transmission ampacity can directly affect the efficiency of power system operation and planning. Any overestimation of the dynamic thermal line rating (DTLR) can lead to lifetime degradation and failure of OHLs, safety hazards, etc. This paper presents a secure yet sharp probabilistic prediction model for the hour-ahead forecasting of the DTLR. The security of the proposed DTLR limits the frequency of DTLR prediction exceeding the actual DTLR. The model is based on an augmented deep learning architecture that makes use of a wide range of predictors, including historical climatology data and latent variables obtained during DTLR calculation. Furthermore, by introducing a customized cost function, the deep neural network is trained to consider the DTLR security based on the required probability of exceedance while minimizing deviations of the predicted DTLRs from the actual values. The proposed probabilistic DTLR is developed and verified using recorded experimental data. The simulation results validate the superiority of the proposed DTLR compared to state-of-the-art prediction models using well-known evaluation metrics.
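
The abstract does not specify its customized cost function. As a loosely related illustration only, the sketch below uses the standard quantile (pinball) loss, which is a swapped-in technique and not the paper's method. With a low quantile level `q`, predictions above the actual value are penalized much more heavily than predictions below it, which is one common way to limit how often a forecast exceeds the true rating. The function name and the numbers are hypothetical.

```python
def pinball_loss(actual, predicted, q):
    """Mean quantile (pinball) loss at quantile level q in (0, 1).

    Under-prediction (predicted <= actual) costs q per unit of error;
    over-prediction (predicted > actual) costs (1 - q) per unit of error.
    """
    total = 0.0
    for a, p in zip(actual, predicted):
        diff = a - p
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(actual)

# Hypothetical ratings in amperes; q = 0.05 targets roughly a 5% exceedance rate.
over = pinball_loss([100.0], [110.0], q=0.05)   # forecast above the actual rating
under = pinball_loss([100.0], [90.0], q=0.05)   # forecast below the actual rating
print(over, under)
```

Minimizing this loss drives the forecast toward the `q`-th conditional quantile of the actual value, so a smaller `q` yields a more conservative forecast at the price of larger average deviation.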

[53]  arXiv:2011.12735 (cross-list from cs.CV) [pdf, other]
Title: Simple statistical methods for unsupervised brain anomaly detection on MRI are competitive to deep learning methods
Comments: 20 pages, 7 figures, to be submitted to Medical Image Analysis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Statistical analysis of magnetic resonance imaging (MRI) can help radiologists to detect pathologies that are otherwise likely to be missed. Deep learning (DL) has shown promise in modeling complex spatial data for brain anomaly detection. However, DL models have major deficiencies: they need large amounts of high-quality training data, are difficult to design and train, and are sensitive to subtle changes in scanning protocols and hardware. Here, we show that simple statistical methods such as voxel-wise (baseline and covariance) models and a linear projection method using spatial patterns can also achieve DL-equivalent (3D convolutional autoencoder) performance in unsupervised pathology detection. All methods were trained (N=395) and compared (N=44) on a novel, expert-curated multiparametric (8 sequences) head MRI dataset of healthy and pathological cases, respectively. We show that these simple methods can be more accurate in detecting small lesions and are considerably easier to train and comprehend. The methods were quantitatively compared using AUC and average precision and evaluated qualitatively on clinical use cases comprising brain atrophy, tumors (small metastases) and movement artefacts. Our results demonstrate that while DL methods may be useful, they should show a sufficiently large performance improvement over simpler methods to justify their usage. Thus, simple statistical methods should provide the baseline for benchmarks. Source code and trained models are available on GitHub (https://github.com/vsaase/simpleBAD).

[54]  arXiv:2011.12738 (cross-list from quant-ph) [pdf, other]
Title: Cosine series quantum sampling method with applications in signal and image processing
Subjects: Quantum Physics (quant-ph); Signal Processing (eess.SP)

A novel family of Cosine series Quantum Sampling (QCoSamp) operators appropriate for quantum computing is described. We develop quantum algorithms, analogous to classical algorithms, and apply them to the harmonic analysis of signals. We show that quantum sampling through measurements of a quantum system, after operators of the family are applied, allows the input signal to be mapped to a Fourier series representation. The technical methodologies that facilitate the implementation of each QCoSamp algorithm on a quantum computer, and its application to the field of signal and image processing, are also described.
Keywords: quantum computing, quantum information theory, quantum operator, quantum sampling, Fourier sine-cosine series, signal processing, image processing

[55]  arXiv:2011.12754 (cross-list from cs.SD) [pdf, other]
Title: Feature Selection based on Principal Component Analysis for Underwater Source Localization by Deep Learning
Subjects: Sound (cs.SD); Signal Processing (eess.SP); Atmospheric and Oceanic Physics (physics.ao-ph)

In this paper, we propose an interpretable feature selection method based on principal component analysis (PCA) and principal component regression (PCR), which can extract important features for underwater source localization using only the source location, without other prior information. This feature selection method is combined with a two-step framework for underwater source localization based on the semi-supervised learning scheme. In the framework, the first step utilizes a convolutional autoencoder to extract the latent features from the whole available dataset. The second step performs source localization via an encoder multi-layer perceptron (MLP) trained on a limited labeled portion of the dataset. The proposed approach has been validated on the public dataset SWellEx-96 Event S5. The results show that the framework has appealing accuracy and robustness on unseen data, especially as the amount of training data decreases. After feature selection, not only is the training stage accelerated by 95%, but the framework also becomes more robust with respect to depth and more accurate when the amount of labeled training data is extremely limited.

[56]  arXiv:2011.12815 (cross-list from cs.CV) [pdf, other]
Title: Interpreting U-Nets via Task-Driven Multiscale Dictionary Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

U-Nets have been tremendously successful in many imaging inverse problems. In an effort to understand the source of this success, we show that one can reduce a U-Net to a tractable, well-understood sparsity-driven dictionary model while retaining its strong empirical performance. We achieve this by extracting a certain multiscale convolutional dictionary from the standard U-Net. This dictionary imitates the structure of the U-Net in its convolution, scale-separation, and skip connection aspects, while doing away with the nonlinear parts. We show that this model can be trained in a task-driven dictionary learning framework and yield comparable results to standard U-Nets on a number of relevant tasks, including CT and MRI reconstruction. These results suggest that the success of the U-Net may be explained mainly by its multiscale architecture and the induced sparse representation.

[57]  arXiv:2011.12832 (cross-list from q-bio.NC) [pdf]
Title: External Electromagnetic Wave Excitation of a PreSynaptic Neuron Based on LIF model
Comments: 5 pages, 4 figures, etech2020
Subjects: Neurons and Cognition (q-bio.NC); Systems and Control (eess.SY)

Interaction of electromagnetic (EM) waves with human tissue has been a longstanding research topic for electrical and biomedical engineers. However, few publications discuss the impacts of external EM-waves on neural stimulation and communication through the nervous system. In fact, complex biological neural channels are a main barrier for intact and comprehensive analyses in this area. One of the ever-present challenges in neural communication responses is the dependency of vesicle release probability on the input spiking pattern. In this regard, this study sheds light on the consequences of changing the frequency of external EM-wave excitation on the postsynaptic neuron's spiking rate. It is assumed that the penetration depth of the wave in the brain does not cover the postsynaptic neuron. Consequently, we model neurotransmission of a bipartite chemical synapse. In addition, the way that external stimulation affects neurotransmission is examined. Unlike multiple-frequency-component EM-waves, the monochromatic incident wave does not face frequency shift and distortion in dispersive media. In this manner, a single-frequency signal is added as an external current in the modified leaky integrate-and-fire (LIF) model. The results demonstrate the existence of a node equilibrium point in the first-order dynamical system of the LIF model. A fold bifurcation (for presupposed LIF model values) occurs when the external excitation frequency is near 200 Hz. The outcomes provided in this paper enable us to select proper frequency excitation for neural signaling. Correspondingly, the dependence of the cut-off frequency on the element values in the LIF circuit is found.
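
As a rough illustration of the kind of setup this abstract describes, the sketch below integrates a textbook leaky integrate-and-fire neuron driven by a constant bias current plus a single-frequency external current. The function name `simulate_lif`, all parameter values, and the forward-Euler scheme are assumptions made for illustration; none are taken from the paper, whose modified LIF model may differ.

```python
import math

def simulate_lif(freq_hz, amp=0.5e-9, i_bias=1.6e-9, t_end=0.5, dt=1e-5,
                 r_m=1e7, c_m=1e-9, v_rest=-0.065, v_th=-0.050, v_reset=-0.065):
    """Return the mean firing rate (spikes per second) of a basic LIF neuron.

    Membrane equation (forward Euler):
        C dV/dt = -(V - V_rest) / R + I_bias + A * sin(2 * pi * f * t)
    When V reaches V_th a spike is counted and V is set to V_reset.
    All default values are hypothetical.
    """
    v = v_rest
    spikes = 0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        i_ext = i_bias + amp * math.sin(2.0 * math.pi * freq_hz * t)
        dv_dt = (-(v - v_rest) / r_m + i_ext) / c_m
        v += dt * dv_dt
        if v >= v_th:
            spikes += 1
            v = v_reset
    return spikes / t_end

# Sweep a few excitation frequencies and report the resulting firing rate.
for f in (50.0, 200.0, 800.0):
    print(f, simulate_lif(f))
```

Sweeping `freq_hz` in this way is one simple means of exploring how the spiking rate depends on the excitation frequency under the assumed parameters.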

[58]  arXiv:2011.12839 (cross-list from cs.AR) [pdf]
Title: Low Latency CMOS Hardware Acceleration for Fully Connected Layers in Deep Neural Networks
Subjects: Hardware Architecture (cs.AR); Neural and Evolutionary Computing (cs.NE); Signal Processing (eess.SP)

We present a novel low latency CMOS hardware accelerator for fully connected (FC) layers in deep neural networks (DNNs). The FC accelerator, FC-ACCL, is based on 128 8x8 or 16x16 processing elements (PEs) for matrix-vector multiplication, and 128 multiply-accumulate (MAC) units integrated with 128 High Bandwidth Memory (HBM) units for storing the pretrained weights. Micro-architectural details for CMOS ASIC implementations are presented and simulated performance is compared to recent hardware accelerators for DNNs for AlexNet and VGG16. When comparing simulated processing latency for a 4096-1000 FC8 layer, our FC-ACCL is able to achieve 48.4 GOPS (with a 100 MHz clock) which improves on a recent FC8 layer accelerator quoted at 28.8 GOPS with a 150 MHz clock. We have achieved this considerable improvement by fully utilizing the HBM units for storing and reading out column-specific FC-layer weights in 1 cycle with a novel column-row-column schedule, and implementing a maximally parallel datapath for processing these weights with the corresponding MAC and PE units. When up-scaled to 128 16x16 PEs, for 16x16 tiles of weights, the design can reduce latency for the large FC6 layer by 60% in AlexNet and by 3% in VGG16 when compared to an alternative EIE solution which uses compression.
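
Because the two throughput figures quoted in this abstract are reported at different clock rates, a clock-normalized comparison can be helpful. The short sketch below divides each quoted throughput by its clock frequency to obtain operations per cycle. The function name is hypothetical, and the calculation uses only the numbers stated in the abstract; it makes no claim about how the paper counts operations.

```python
def ops_per_cycle(gops, clock_mhz):
    """Operations completed per clock cycle for a given throughput and clock."""
    return (gops * 1e9) / (clock_mhz * 1e6)

fc_accl = ops_per_cycle(48.4, 100.0)    # FC-ACCL figures quoted in the abstract
reference = ops_per_cycle(28.8, 150.0)  # prior FC8 accelerator quoted in the abstract
print(fc_accl, reference, fc_accl / reference)
```

On this normalized basis the gap between the two designs is larger than the raw GOPS figures alone suggest, since the faster design also runs at the lower clock.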

[59]  arXiv:2011.12906 (cross-list from cs.CV) [pdf, other]
Title: Open-World Learning Without Labels
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Signal Processing (eess.SP)

Open-world learning is a problem where an autonomous agent detects things that it does not know and learns them over time from a non-stationary and never-ending stream of data; in an open-world environment, the training data and objective criteria are never available at once. The agent should grasp new knowledge from learning without forgetting acquired prior knowledge. Researchers have proposed a few open-world learning agents for image classification tasks that operate in complex scenarios. However, all prior work on open-world learning relies on labeled data to learn the new classes from the stream of images. In scenarios where autonomous agents should respond in near real-time or work in areas with limited communication infrastructure, human labeling of data is not possible. Therefore, supervised open-world learning agents are not scalable solutions for such applications. Herein, we propose a new framework that enables agents to learn new classes from a stream of unlabeled data in an unsupervised manner. Also, we study the robustness and learning speed of such agents with supervised and unsupervised feature representation. We also introduce a new metric for open-world learning without labels. We anticipate our theories and method to be a starting point for developing autonomous true open-world never-ending learning agents.

[60]  arXiv:2011.12946 (cross-list from math.OC) [pdf, ps, other]
Title: Exploratory LQG Mean Field Games with Entropy Regularization
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY); Probability (math.PR); Machine Learning (stat.ML)

We study a general class of entropy-regularized multi-variate LQG mean field games (MFGs) in continuous time with $K$ distinct sub-populations of agents. We extend the notion of actions to action distributions (exploratory actions), and explicitly derive the optimal action distributions for individual agents in the limiting MFG. We demonstrate that the optimal set of action distributions yields an $\epsilon$-Nash equilibrium for the finite-population entropy-regularized MFG. Furthermore, we compare the resulting solutions with those of classical LQG MFGs and establish the equivalence of their existence.

Replacements for Thu, 26 Nov 20

[61]  arXiv:1810.01248 (replaced) [pdf, other]
Title: A Lightweight Music Texture Transfer System
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[62]  arXiv:1810.04059 (replaced) [pdf, other]
Title: Dynamic Optimization with Convergence Guarantees
Comments: Revision of previous submission. Main changes: (i) Extended literature review; (ii) clarified assumptions; (iii) added a section on setting up and solving related NLP, with discussion on sparsity structure; (iv) added numerical examples. Some added material also appears in an IEEE CDC 2020 conference paper preprint (arXiv:2009.06217)
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)
[63]  arXiv:1905.00267 (replaced) [pdf, other]
Title: New Infinite Families of Perfect Quaternion Sequences and Williamson Sequences
Comments: Version accepted for publication
Journal-ref: IEEE Transactions on Information Theory, volume 66, issue 12 (2020) pages 7739-7751
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP); Combinatorics (math.CO)
[64]  arXiv:1908.11416 (replaced) [pdf, other]
Title: Localization of MEG and EEG Brain Signals by Alternating Projection
Subjects: Signal Processing (eess.SP)
[65]  arXiv:1909.01419 (replaced) [pdf, other]
Title: Learning Koopman Eigenfunctions and Invariant Subspaces from Data: Symmetric Subspace Decomposition
Comments: 17 pages
Subjects: Systems and Control (eess.SY); Dynamical Systems (math.DS)
[66]  arXiv:1912.03890 (replaced) [pdf, other]
Title: Distributed Feedback Control of Multi-Channel Linear Systems
Comments: arXiv admin note: text overlap with arXiv:1909.11823
Subjects: Systems and Control (eess.SY)
[67]  arXiv:1912.07182 (replaced) [pdf, ps, other]
Title: Effect of Pixelation on the Parameter Estimation of Single Molecule Trajectories
Journal-ref: IEEE Transactions on Computational Imaging, 2020
Subjects: Signal Processing (eess.SP)
[68]  arXiv:2001.11202 (replaced) [pdf, other]
Title: Image Embedded Segmentation: Uniting Supervised and Unsupervised Objectives for Segmenting Histopathological Images
Comments: This work has been submitted for possible publication
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[69]  arXiv:2003.02089 (replaced) [pdf, ps, other]
Title: Gradient Statistics Aware Power Control for Over-the-Air Federated Learning
Comments: 30 pages, 8 figures
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Machine Learning (cs.LG)
[70]  arXiv:2003.06812 (replaced) [pdf, other]
Title: Iterative training of neural networks for intra prediction
Comments: 15 pages, 16 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[71]  arXiv:2004.01707 (replaced) [pdf, other]
Title: Data-driven regularization parameter selection in dynamic MRI
Comments: 22 pages, 7 figures
Subjects: Medical Physics (physics.med-ph); Image and Video Processing (eess.IV)
[72]  arXiv:2004.03821 (replaced) [pdf, other]
Title: Energy-efficient Resource Allocation for Mobile Edge Computing Aided by Multiple Relays
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
[73]  arXiv:2004.07774 (replaced) [pdf, ps, other]
Title: Computing all identifiable functions for ODE models
Subjects: Systems and Control (eess.SY); Symbolic Computation (cs.SC); Logic (math.LO); Quantitative Methods (q-bio.QM)
[74]  arXiv:2006.01441 (replaced) [pdf, other]
Title: CT-based COVID-19 Triage: Deep Multitask Learning Improves Joint Identification and Severity Quantification
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[75]  arXiv:2006.06426 (replaced) [pdf, other]
Title: Deep generative models for musical audio synthesis
Authors: M. Huzaifah, L. Wyse
Comments: This is the authors' own pre-submission version of a chapter for Handbook of Artificial Intelligence for Music: Foundations, Advanced Approaches, and Developments for Creativity, edited by Eduardo R. Miranda, for Springer
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
[76]  arXiv:2007.03535 (replaced) [pdf, other]
Title: Light Field Image Super-Resolution Using Deformable Convolution
Comments: Accepted by IEEE Transactions on Image Processing
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[77]  arXiv:2007.11095 (replaced) [pdf, other]
Title: A Lite Distributed Semantic Communication System for Internet of Things
Comments: Accepted by JSAC
Subjects: Signal Processing (eess.SP)
[78]  arXiv:2008.10271 (replaced) [pdf, other]
Title: Semantic Labeling of Large-Area Geographic Regions Using Multi-View and Multi-Date Satellite Images and Noisy OSM Training Labels
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[79]  arXiv:2008.10601 (replaced) [pdf, ps, other]
Title: String Stable Integral Control of Vehicle Platoons with Actuator Dynamics and Disturbances
Comments: 6 pages, 8 figures, accepted in the 59th IEEE Conference on Decision and Control
Subjects: Systems and Control (eess.SY)
[80]  arXiv:2009.11039 (replaced) [pdf, other]
Title: Zero-inertia Offshore Grids: N-1 Security and Active Power Sharing
Comments: Submitted to "IEEE Transactions on Power Delivery" on October 22, 2020
Subjects: Systems and Control (eess.SY)
[81]  arXiv:2009.11961 (replaced) [pdf, ps, other]
Title: N-BEATS neural network for mid-term electricity load forecasting
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP)
[82]  arXiv:2010.03360 (replaced) [pdf, other]
Title: Interpreting Imagined Speech Waves with Machine Learning techniques
Subjects: Signal Processing (eess.SP); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[83]  arXiv:2010.08123 (replaced) [pdf, other]
Title: Melody Classifier with Stacked-LSTM
Authors: You Li, Zhuowen Lin
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[84]  arXiv:2011.03689 (replaced) [pdf, other]
Title: Detection and Evaluation of human and machine generated speech in spoofing attacks on automatic speaker verification systems
Comments: 6 pages excluding references. Paper accepted by IEEE Spoken Language Technology (SLT) 2021
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
[85]  arXiv:2011.05755 (replaced) [src]
Title: Cryo-RALib -- a modular library for accelerating alignment in cryo-EM
Comments: We did not clearly describe which part of the library is already implemented in the original EMAN2/gpu isac code. Figures 1 and 2 use the architecture from the original code and are thus more appropriate to put into Section II. Figure 3 and Algorithm 1 are an extension that we need to describe in more detail to highlight the differences. Therefore, the draft needs to be reorganized
Subjects: Quantitative Methods (q-bio.QM); Distributed, Parallel, and Cluster Computing (cs.DC); Image and Video Processing (eess.IV)
[86]  arXiv:2011.06934 (replaced) [pdf, ps, other]
Title: Neural network for estimation of optical characteristics of optically active and turbid scattering media
Authors: Ali Alavi
Comments: 12 pages, presubmission
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG)
[87]  arXiv:2011.08972 (replaced) [pdf, ps, other]
Title: Reducing the Mutual Outage Probability of Cooperative Non-Orthogonal Multiple Access
Subjects: Signal Processing (eess.SP)
[88]  arXiv:2011.11715 (replaced) [pdf, other]
Title: Multi-task Language Modeling for Improving Speech Recognition of Rare Words
Comments: Submitted to ICASSP 2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[89]  arXiv:2011.12130 (replaced) [pdf]
Title: A Generalizable Model for Fault Detection in Offshore Wind Turbines Based on Deep Learning
Subjects: Systems and Control (eess.SY); Machine Learning (cs.LG); Signal Processing (eess.SP)