Electrical Engineering and Systems Science

New submissions

New submissions for Mon, 17 Jan 22

[1]  arXiv:2201.05213 [pdf, other]
Title: Parallel Neural Local Lossless Compression
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG); Machine Learning (stat.ML)

The recently proposed Neural Local Lossless Compression (NeLLoC), which is based on a local autoregressive model, has achieved state-of-the-art (SOTA) out-of-distribution (OOD) generalization performance in the image compression task. In addition to the encouragement of OOD generalization, the local model also allows parallel inference in the decoding stage. In this paper, we propose a parallelization scheme for local autoregressive models. We discuss the practicalities of implementing this scheme, and provide experimental evidence of significant gains in compression runtime compared to the previous, non-parallel implementation.

[2]  arXiv:2201.05267 [pdf, other]
Title: Bi-level Volt/VAR Optimization in Distribution Networks with Smart PV Inverters
Subjects: Systems and Control (eess.SY)

Optimal Volt/VAR control (VVC) in distribution networks relies on an effective coordination between the conventional utility-owned mechanical devices and the smart residential photovoltaic (PV) inverters. Typically, a central controller carries out a periodic optimization and sends setpoints to the local controller of each device. However, instead of tracking centrally dispatched setpoints, smart PV inverters can cooperate on a much faster timescale to reach optimality within a PV inverter group. To accommodate such PV inverter groups in the VVC architecture, this paper proposes a bi-level optimization framework. The upper-level determines the setpoints of the mechanical devices to minimize the network active power losses, while the lower-level represents the coordinated actions that the inverters take for their own objectives. The interactions between these two levels are captured in the bi-level optimization, which is solved using the Karush-Kuhn-Tucker (KKT) conditions. This framework fully exploits the capabilities of the different types of voltage regulation devices and enables them to cooperatively optimize their goals. Case studies on typical distribution networks with field-recorded data demonstrate the effectiveness and advantages of the proposed approach.

[3]  arXiv:2201.05271 [pdf, ps, other]
Title: Trajectory and Transmit Power Optimization for IRS-Assisted UAV Communication under Malicious Jamming
Comments: IRS-Assisted UAV Communication under Malicious Jamming
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)

In this letter, we investigate an unmanned aerial vehicle (UAV) communication system, where an intelligent reflecting surface (IRS) is deployed to assist in the transmission from a ground node (GN) to the UAV in the presence of a jammer. We aim to maximize the average rate of the UAV communication by jointly optimizing the GN's transmit power, the IRS's passive beamforming and the UAV's trajectory. However, the formulated problem is difficult to solve due to the non-convex objective function and the coupled optimization variables. To tackle it, we propose an alternating optimization (AO) based algorithm that exploits the successive convex approximation (SCA) and semidefinite relaxation (SDR) techniques. Simulation results show that the proposed algorithm can significantly improve the average rate compared with the benchmark algorithms. Moreover, they also show that when the jamming power is large and the number of IRS elements is relatively small, deploying the IRS near the jammer outperforms deploying it near the GN, and vice versa.

[4]  arXiv:2201.05331 [pdf, ps, other]
Title: Semi-automated Virtual Unfolded View Generation Method of Stomach from CT Volumes
Comments: Accepted paper as a poster presentation at MICCAI 2013 (International Conference on Medical Image Computing and Computer-Assisted Intervention), Nagoya, Japan
Journal-ref: Published in Proceedings of MICCAI 2013, LNCS 8149, pp.332-339, 2013
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

CT image-based diagnosis of the stomach has been developed as a new diagnostic method. A virtual unfolded (VU) view is suitable for displaying the stomach wall. In this paper, we propose a semi-automated method for generating VU views of the stomach. Our method requires minimal manual operation. The determination of the unfolding forces and the termination of the unfolding process are automated. The unfolded shape of the stomach is estimated based on its radius. The unfolding forces are determined so that the stomach wall is deformed to the expected shape. The iterative deformation process is terminated when the difference between the deformed shape and the expected shape is small. Our experiments using 67 CT volumes showed that our proposed method can generate good VU views for 76.1% of cases.

[5]  arXiv:2201.05344 [pdf, other]
Title: AWSnet: An Auto-weighted Supervision Attention Network for Myocardial Scar and Edema Segmentation in Multi-sequence Cardiac Magnetic Resonance Images
Comments: 19 pages, 10 figures, accepted by Medical Image Analysis
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Multi-sequence cardiac magnetic resonance (CMR) provides essential pathology information (scar and edema) to diagnose myocardial infarction. However, automatic pathology segmentation can be challenging due to the difficulty of effectively exploring the underlying information from the multi-sequence CMR data. This paper aims to tackle the scar and edema segmentation from multi-sequence CMR with a novel auto-weighted supervision framework, where the interactions among different supervised layers are explored under a task-specific objective using reinforcement learning. Furthermore, we design a coarse-to-fine framework to boost the small myocardial pathology region segmentation with shape prior knowledge. The coarse segmentation model identifies the left ventricle myocardial structure as a shape prior, while the fine segmentation model integrates a pixel-wise attention strategy with an auto-weighted supervision model to learn and extract salient pathological structures from the multi-sequence CMR data. Extensive experimental results on a publicly available dataset from Myocardial pathology segmentation combining multi-sequence CMR (MyoPS 2020) demonstrate our method can achieve promising performance compared with other state-of-the-art methods. Our method is promising in advancing the myocardial pathology assessment on multi-sequence CMR data. To motivate the community, we have made our code publicly available via https://github.com/soleilssss/AWSnet/tree/master.

[6]  arXiv:2201.05373 [pdf]
Title: A New Deep Hybrid Boosted and Ensemble Learning-based Brain Tumor Analysis using MRI
Comments: 26 pages, 9 figures, 8 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Brain tumor analysis is important for timely diagnosis and effective treatment of patients. Tumor analysis is challenging because of tumor morphology, such as size, location, texture, and heteromorphic appearance in medical images. In this regard, a novel two-phase deep learning-based framework is proposed to detect and categorize brain tumors in magnetic resonance images (MRIs). In the first phase, a novel deep boosted features and ensemble classifiers (DBF-EC) scheme is proposed to effectively distinguish tumor MRI images from those of healthy individuals. The deep boosted feature space is obtained through customized and well-performing deep convolutional neural networks (CNNs) and is then fed into an ensemble of machine learning (ML) classifiers. In the second phase, a new hybrid feature fusion-based brain tumor classification approach is proposed, comprising dynamic-static features and an ML classifier to categorize different tumor types. The dynamic features are extracted from the proposed BRAIN-RENet CNN, which carefully learns the heteromorphic and inconsistent behavior of various tumors, while the static features are extracted using HOG. The effectiveness of the proposed two-phase brain tumor analysis framework is validated on two standard benchmark datasets, collected from Kaggle and Figshare, containing different types of tumor, including glioma, meningioma, pituitary, and normal images. Experimental results show that the proposed DBF-EC detection scheme achieves accuracy (99.56%), precision (0.9991), recall (0.9899), F1-Score (0.9945), MCC (0.9892), and AUC-PR (0.9990). In the classification scheme, the joint use of the fused deep features of the proposed BRAIN-RENet and HOG features improves performance significantly in terms of recall (0.9913), precision (0.9906), F1-Score (0.9909), and accuracy (99.20%) on diverse datasets.

[7]  arXiv:2201.05420 [pdf, other]
Title: A Study of Transducer based End-to-End ASR with ESPnet: Architecture, Auxiliary Loss and Decoding Strategies
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)

In this study, we present recent developments of models trained with the RNN-T loss in ESPnet. It involves the use of various architectures such as the recently proposed Conformer, multi-task learning with different auxiliary criteria and multiple decoding strategies, including our own proposition. Through experiments and benchmarks, we show that our proposed systems can be competitive against other state-of-the-art systems on well-known datasets such as LibriSpeech and AISHELL-1. Additionally, we demonstrate that these models are promising against other already implemented systems in ESPnet with regard to both performance and decoding speed, enabling the possibility to have powerful systems for a streaming task. With these additions, we hope to expand the usefulness of the ESPnet toolkit for the research community and also give tools for the ASR industry to deploy our systems in realistic and production environments.

[8]  arXiv:2201.05483 [pdf, other]
Title: Adaptive Deep PnP Algorithm for Video Snapshot Compressive Imaging
Comments: The code to reproduce the results is at this https URL
Subjects: Image and Video Processing (eess.IV)

Video snapshot compressive imaging (SCI) is a promising technique to capture high-speed videos, which shifts the imaging speed from the detector to mask modulation and only needs a single measurement to capture multiple frames. The algorithm to reconstruct high-speed frames from the measurement plays a vital role in SCI. In this paper, we consider the promising reconstruction algorithm framework, namely plug-and-play (PnP), which is flexible to the encoding process compared with other deep learning networks. One drawback of existing PnP algorithms is that they use a pre-trained denoising network as a plugged prior while the training data of the network might be different from the task in real applications. To this end, in this work, we propose the online PnP algorithm which can adaptively update the network's parameters within the PnP iteration; this makes the denoising network more applicable to the desired data in the SCI reconstruction. Furthermore, for color video imaging, RGB frames need to be recovered from the Bayer pattern, a step known as demosaicing in the camera pipeline. To address this challenge, we design a two-stage reconstruction framework to optimize these two coupled ill-posed problems and introduce a deep demosaicing prior specifically for video demosaicing. Extensive results on both simulation and real datasets verify the superiority of our adaptive deep PnP algorithm.

[9]  arXiv:2201.05490 [pdf, other]
Title: An Almost Globally Stable Adaptive Phase-Locked Loop for Synchronization of a Grid-Connected Voltage Source Converter
Comments: 16 pages, 6 figures
Subjects: Systems and Control (eess.SY)

In this paper we are interested in the problem of adaptive synchronization of a voltage source converter with a possibly weak grid with unknown angle and frequency, but knowledge of its parameters. To guarantee a suitable synchronization with the angle of the three-phase grid voltage we design an adaptive observer for such a signal requiring measurements only at the point of common coupling. Then we propose two alternative certainty-equivalent, adaptive phase-locked loops that ensure the angle estimation error goes to zero for almost all initial conditions. Although well-known, for the sake of completeness, we also present a PI controller with feedforward action that ensures the converter currents converge to an arbitrary desired value. Relevance of the theoretical results and their robustness to variation of the grid parameters are thoroughly discussed and validated in the challenging scenario of a converter connected to a grid with low short-circuit-ratio.

[10]  arXiv:2201.05501 [pdf, ps, other]
Title: Study of Frequency domain exponential functional link network filters
Comments: 32 pages, 17 figures
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

The exponential functional link network (EFLN) filter has attracted tremendous interest due to its enhanced nonlinear modeling capability. However, the computational complexity increases dramatically as the dimension of the EFLN-based filter grows. To improve the computational efficiency, we propose a novel frequency domain exponential functional link network (FDEFLN) filter in this paper. The idea is to organize the samples in blocks of expanded input data, transform them from time domain to frequency domain, and thus execute the filtering and adaptation procedures in frequency domain with the overlap-save method. An FDEFLN-based nonlinear active noise control (NANC) system has also been developed to form the frequency domain exponential filtered-s least mean-square (FDEFsLMS) algorithm. Moreover, the stability, steady-state performance and computational complexity of the algorithms are analyzed. Finally, several numerical experiments corroborate the proposed FDEFLN-based algorithms in nonlinear system identification, acoustic echo cancellation and NANC implementations, which demonstrate much better computational efficiency.
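
The overlap-save step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the FDEFLN filter itself: it only shows block-wise frequency-domain filtering of a signal with a fixed FIR filter, and the function name `overlap_save`, the block length and the test signal are assumptions chosen for illustration.

```python
import numpy as np

def overlap_save(x, h, block_len=8):
    """Filter x with the FIR taps h using overlap-save FFT blocks.

    Illustrative sketch only: a fixed linear filter, not an adaptive
    or functional-link filter.
    """
    m = len(h)
    n_fft = block_len + m - 1            # each FFT block overlaps the previous one by m - 1 samples
    H = np.fft.fft(h, n_fft)             # filter spectrum, zero-padded to the block size
    n_blocks = -(-len(x) // block_len)   # ceiling division
    # Prepend m - 1 zeros (initial overlap) and pad the tail to a whole number of blocks.
    padded = np.concatenate([np.zeros(m - 1), x,
                             np.zeros(n_blocks * block_len - len(x))])
    y = np.empty(n_blocks * block_len)
    for b in range(n_blocks):
        seg = padded[b * block_len : b * block_len + n_fft]
        out = np.fft.ifft(np.fft.fft(seg) * H).real
        # The first m - 1 outputs are corrupted by circular wrap-around; keep the rest.
        y[b * block_len : (b + 1) * block_len] = out[m - 1:]
    return y[: len(x)]

x = np.arange(20, dtype=float)
h = np.array([1.0, -0.5, 0.25])
print(np.allclose(overlap_save(x, h), np.convolve(x, h)[: len(x)]))
```

The block-wise result matches direct time-domain convolution, while each block costs two FFTs and one element-wise product instead of a sliding sum over all taps.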

[11]  arXiv:2201.05502 [pdf]
Title: Fast and accurate waveform modeling of long-haul multi-channel optical fiber transmission using a hybrid model-data driven scheme
Comments: 8 pages, 5 figures, 1 table, 30 conference
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

The modeling of optical wave propagation in optical fiber requires solving the nonlinear Schrödinger equation (NLSE) quickly and accurately, and can enable the research progress and system design of optical fiber communications, which are the infrastructure of modern communication systems. Traditional modeling of fiber channels using the split-step Fourier method (SSFM) has long been regarded as challenging in long-haul wavelength division multiplexing (WDM) optical fiber communication systems because it is extremely time-consuming. Here we propose a linear-nonlinear feature decoupling distributed (FDD) waveform modeling scheme to model the long-haul WDM fiber channel, where the channel linear effects are modelled by the NLSE-derived model-driven methods and the nonlinear effects are modelled by the data-driven deep learning methods. Meanwhile, the proposed scheme only focuses on one-span fiber distance fitting, and then recursively transmits the model to achieve the required transmission distance. The proposed modeling scheme is demonstrated to have high accuracy, high computing speeds, and robust generalization abilities for different optical launch powers, modulation formats, channel numbers and transmission distances. The total running time of the FDD waveform modeling scheme for 41-channel 1040-km fiber transmission is only 3 minutes versus more than 2 hours using SSFM for each input condition, which achieves a 98% reduction in computing time. Considering the multi-round optimization by adjusting system parameters, the complexity reduction is significant. The results represent a remarkable improvement in nonlinear fiber modeling and open up novel perspectives for the solution of NLSE-like partial differential equations and optical fiber physics problems.

[12]  arXiv:2201.05520 [pdf, other]
Title: Value of Fleet Vehicle Grid in Providing Transmission System Operator Services
Journal-ref: 2020 Fifteenth International Conference on Ecological Vehicles and Renewable Energies (EVER)
Subjects: Systems and Control (eess.SY)

In this paper a new aggregated model for electric vehicle (EV) fleets is presented that considers their daily and weekly usage patterns. A frequency-constrained stochastic unit commitment model is employed to optimally schedule EV charging and discharging as well as the provision of frequency response (FR) in an electricity system, while respecting the vehicles' energy requirements and driving schedules. Through case studies we demonstrate that an EV with vehicle to grid (V2G) capability can reduce system costs in a future GB electricity grid by up to £12,000 per year, and reduce CO2 emissions by 60 tonnes per year, mainly due to reduced curtailment of wind power. The paper also quantifies the changes in the benefits of fleet V2G resulting from variations in FR delivery time, the penetration of wind or uptake of alternative flexibility providers. Finally, a battery degradation model dependent on an EV's state of charge is proposed and implemented in the stochastic scheduling problem. It enables significant degradation cost reductions of 16% with only a 0.4% reduction of an EV's system value.

[13]  arXiv:2201.05524 [pdf, other]
Title: Waveform Learning for Reduced Out-of-Band Emissions Under a Nonlinear Power Amplifier
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

Machine learning (ML) has shown great promise in optimizing various aspects of the physical layer processing in wireless communication systems. In this paper, we use ML to learn jointly the transmit waveform and the frequency-domain receiver. In particular, we consider a scenario where the transmitter power amplifier is operating in a nonlinear manner, and ML is used to optimize the waveform to minimize the out-of-band emissions. The system also learns a constellation shape that facilitates pilotless detection by the simultaneously learned receiver. The simulation results show that such an end-to-end optimized system can communicate data more accurately and with less out-of-band emissions than conventional systems, thereby demonstrating the potential of ML in optimizing the air interface. To the best of our knowledge, there are no prior works considering the power amplifier induced emissions in an end-to-end learned system. These findings pave the way towards an ML-native air interface, which could be one of the building blocks of 6G.

[14]  arXiv:2201.05548 [pdf, other]
Title: Utilizing geospatial data for assessing energy security: Mapping small solar home systems using unmanned aerial vehicles and deep learning
Subjects: Image and Video Processing (eess.IV)

Solar home systems (SHS), a cost-effective solution for rural communities far from the grid in developing countries, are small solar panels and associated equipment that provide power to a single household. A crucial resource for targeting further investment of public and private resources, as well as tracking the progress of universal electrification goals, is shared access to high-quality data on individual SHS installations including information such as location and power capacity. Though recent studies utilizing satellite imagery and machine learning to detect solar panels have emerged, they struggle to accurately locate many SHS due to limited image resolution (some small solar panels only occupy several pixels in satellite imagery). In this work, we explore the viability and cost-performance tradeoff of using automatic SHS detection on unmanned aerial vehicle (UAV) imagery as an alternative to satellite imagery. More specifically, we explore three questions: (i) what is the detection performance of SHS using drone imagery; (ii) how expensive is the drone data collection, compared to satellite imagery; and (iii) how well does drone-based SHS detection perform in real-world scenarios. We collect and publicly release a dataset of high-resolution drone imagery encompassing SHS imaged under real-world conditions and use this dataset and a dataset from Rwanda to evaluate the capabilities of deep learning models to recognize SHS, including those that are too small to be reliably recognized in satellite imagery. The results suggest that UAV imagery may be a viable alternative to identify very small SHS from perspectives of both detection accuracy and financial costs of data collection. UAV-based data collection may be a practical option for supporting electricity access planning strategies for achieving sustainable development goals and for monitoring the progress towards those goals.

[15]  arXiv:2201.05577 [pdf, other]
Title: Unsupervised Sparse Unmixing of Atmospheric Trace Gases from Hyperspectral Satellite Data
Comments: 5 pages, 4 figures
Subjects: Signal Processing (eess.SP)

In this letter, a new approach for the retrieval of the vertical column concentrations of trace gases from hyperspectral satellite observations is proposed. The main idea is to perform a linear spectral unmixing by estimating the abundances of trace gas spectral signatures in each mixed pixel collected by an imaging spectrometer in the ultraviolet region. To this aim, the sparse nature of the measurements is brought to light and the compressive sensing paradigm is applied to estimate the concentrations of the gases' endmembers given by an a priori wide spectral library, including reference cross sections measured at different temperatures and pressures at the same time. The proposed approach has been experimentally assessed using both simulated and real hyperspectral datasets. Specifically, the experimental analysis relies on the retrieval of sulfur dioxide during volcanic emissions using data collected by the TROPOspheric Monitoring Instrument. To validate the procedure, we also compare the obtained results with the sulfur dioxide total column product based on the differential optical absorption spectroscopy technique and the retrieved concentrations estimated using the blind source separation.

Cross-lists for Mon, 17 Jan 22

[16]  arXiv:2201.05184 (cross-list from cs.NI) [pdf, ps, other]
Title: Achieving AI-enabled Robust End-to-End Quality of Experience over Radio Access Networks
Subjects: Networking and Internet Architecture (cs.NI); Systems and Control (eess.SY)

Emerging applications such as Augmented Reality, the Internet of Vehicles and Remote Surgery require both computing and networking functions working in harmony. The End-to-end (E2E) quality of experience (QoE) for these applications depends on the synchronous allocation of networking and computing resources. However, the relationship between the resources and the E2E QoE outcomes is typically stochastic and non-linear. In order to make efficient resource allocation decisions, it is essential to model these relationships. This article presents a novel machine-learning based approach to learn these relationships and concurrently orchestrate both resources for this purpose. The machine learning models further help make robust allocation decisions regarding stochastic variations and simplify robust optimization to a conventional constrained optimization. When resources are insufficient to accommodate all application requirements, our framework supports executing some of the applications with minimal degradation (graceful degradation) of E2E QoE. We also show how we can implement the learning and optimization methods in a distributed fashion by the Software-Defined Network (SDN) and Kubernetes technologies. Our results show that deep learning-based modelling achieves E2E QoE with approximately 99.8% accuracy, and our robust joint-optimization technique allocates resources efficiently when compared to existing differential services alternatives.

[17]  arXiv:2201.05208 (cross-list from math.NA) [pdf]
Title: The Padé matrix pencil method with spurious pole information assimilation
Comments: 22 pages
Subjects: Numerical Analysis (math.NA); Systems and Control (eess.SY)

We present a novel method for calculating Padé approximants that is capable of eliminating spurious poles placed at the point of development and of identifying and eliminating spurious poles created by precision limitations and/or noisy coefficients. Information contained in the eliminated poles is assimilated, producing a reduced order Padé approximant (PA). While the [m+k/m] conformation produced by the algorithm is flexible, the m value of the rational approximant produced by the algorithm reported here is determined by the number of spurious poles eliminated. Spurious poles due to coefficient noise/precision limitations are identified using an evidence-based filter parameter applied to the singular values of a matrix comprised of the series coefficients. The rational function poles are found directly by solving a generalized eigenvalue problem defined by a matrix pencil. Spurious poles placed at the point of development, responsible in some algorithms for degeneracy, are identified by their magnitudes. Residues are found by solving an overdetermined linear matrix equation. The method is compared with the so-called Robust Padé Approximation (RPA) method and shown to be competitive on the problems studied. By eliminating spurious poles, particularly in functions with branch points, such as those encountered solving the power-flow problem, solution of these complex-valued problems is made more reliable.

[18]  arXiv:2201.05212 (cross-list from math.OC) [pdf, other]
Title: Probabilistic design of optimal sequential decision-making algorithms in learning and control
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

This survey is focused on certain sequential decision-making problems that involve optimizing over probability functions. We discuss the relevance of these problems for learning and control. The survey is organized around a framework that combines a problem formulation and a set of resolution methods. The formulation consists of an infinite-dimensional optimization problem. The methods come from approaches to search for optimal solutions in the space of probability functions. Through the lens of this overarching framework we revisit popular learning and control algorithms, showing that these naturally arise from suitable variations on the formulation mixed with different resolution methods. A running example, for which we make the code available, complements the survey. Finally, a number of challenges arising from the survey are also outlined.

[19]  arXiv:2201.05225 (cross-list from cs.IT) [pdf, other]
Title: Learning-Based MIMO Channel Estimation under Spectrum Efficient Pilot Allocation and Feedback
Comments: Pre-print
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

Wireless links using massive MIMO transceivers are vital for next generation wireless communications networks. Precoding in massive MIMO transmission requires accurate downlink channel state information (CSI). Many recent works have effectively applied deep learning (DL) to jointly train UE-side compression networks for delay domain CSI and a BS-side decoding scheme. Vitally, these works assume that the full delay domain CSI is available at the UE, but in reality, the UE must estimate the delay domain based on a limited number of frequency domain pilots. In this work, we propose a linear pilot-to-delay (P2D) estimator that transforms sparse frequency pilots to the truncated delay CSI. We show that the P2D estimator is accurate under frequency downsampling, and we demonstrate that the P2D estimate can be effectively utilized with existing autoencoder-based CSI estimation networks. In addition to accounting for pilot-based estimates of downlink CSI, we apply unrolled optimization networks to emulate iterative solutions to compressed sensing (CS), and we demonstrate better estimation performance than prior autoencoder-based DL networks. Finally, we investigate the efficacy of trainable CS networks in a differential encoding network for time-varying CSI estimation, and we propose a new network, MarkovNet-ISTA-ENet, comprised of both a CS network for initial CSI estimation and multiple autoencoders to estimate the error terms. We demonstrate that this heterogeneous network has better asymptotic performance than networks comprised of only one type of network.

[20]  arXiv:2201.05240 (cross-list from cs.IT) [pdf, ps, other]
Title: Integrated Sensing and Communication with Millimeter Wave Full Duplex Hybrid Beamforming
Comments: 6 pages, 4 figures, Submitted for publication in the Proceedings of IEEE ICC 2022, Seoul, South Korea
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

Integrated Sensing and Communication (ISAC) has attracted substantial attention in recent years for spectral efficiency improvement, enabling hardware and spectrum sharing for simultaneous sensing and signaling operations. In-band Full Duplex (FD) is being considered as a key enabling technology for ISAC applications due to its simultaneous transmission and reception capability. In this paper, we present an FD-based ISAC system operating at millimeter Wave (mmWave) frequencies, where a massive Multiple-Input Multiple-Output (MIMO) Base Station (BS) node employing hybrid Analog and Digital (A/D) beamforming is communicating with a DownLink (DL) multi-antenna user and the same waveform is utilized at the BS receiver for sensing the radar targets in its coverage environment. We develop a sensing algorithm that is capable of estimating Direction of Arrival (DoA), range, and relative velocity of the radar targets. A joint optimization framework for designing the A/D transmit and receive beamformers as well as the Self-Interference (SI) cancellation is presented with the objective to maximize the achievable DL rate and the accuracy of the radar target sensing performance. Our simulation results, considering fifth Generation (5G) Orthogonal Frequency Division Multiplexing (OFDM) waveforms, verify our approach's high precision in estimating DoA, range, and velocity of multiple radar targets, while maximizing the DL communication rate.

[21]  arXiv:2201.05244 (cross-list from cs.SD) [pdf, ps, other]
Title: Beyond chord vocabularies: Exploiting pitch-relationships in a chord estimation metric
Authors: Johanna Devaney
Comments: Extended abstract, 3 pages, 2 tables
Journal-ref: Late-Breaking Demo Session of the 22nd International Society for Music Information Retrieval Conference (2021)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

Chord estimation metrics treat chord labels as independent of one another. This fails to represent the pitch relationships between the chords in a meaningful way, resulting in evaluations that must make compromises with complex chord vocabularies and that often require time-consuming qualitative analyses to determine details about how a chord estimation algorithm performs. This paper presents an accuracy metric for chord estimation that compares the pitch content of the estimated chords against the ground truth that captures both the correct notes that are estimated and additional notes that are inserted into the estimate. This is not a stand-alone evaluation protocol but rather a metric that can be integrated as a weighting into existing evaluation approaches.
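
To make the idea of comparing pitch content concrete, here is a small sketch. It is not the metric defined in the paper: the function name `pitch_content_scores`, the representation of chords as sets of pitch classes (0 to 11), and the two ratios are assumptions chosen only to show how correct and inserted notes can be counted separately.

```python
def pitch_content_scores(estimated, reference):
    """Compare two chords given as iterables of pitch classes (0-11).

    Returns (correct, inserted), both hypothetical illustrative ratios:
      correct  - share of reference notes that appear in the estimate
      inserted - share of estimated notes that are not in the reference
    """
    est, ref = set(estimated), set(reference)
    correct = len(est & ref) / len(ref) if ref else 0.0
    inserted = len(est - ref) / len(est) if est else 0.0
    return correct, inserted

# Reference C major (C, E, G) versus an estimated C7 (C, E, G, B-flat).
print(pitch_content_scores({0, 4, 7, 10}, {0, 4, 7}))  # (1.0, 0.25)
```

A label-based comparison would score C7 against C major as simply wrong, whereas the note-level view shows that every reference note was found and one extra note was added.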

[22]  arXiv:2201.05247 (cross-list from cs.RO) [pdf, other]
Title: Multi-agent Motion Planning from Signal Temporal Logic Specifications
Comments: Accepted to IEEE Robotics and Automation Letters (RA-L)
Subjects: Robotics (cs.RO); Multiagent Systems (cs.MA); Systems and Control (eess.SY)

We tackle the challenging problem of multi-agent cooperative motion planning for complex tasks described using signal temporal logic (STL), where robots can have nonlinear and nonholonomic dynamics. Existing methods in multi-agent motion planning, especially those based on discrete abstractions and model predictive control (MPC), suffer from limited scalability with respect to the complexity of the task, the size of the workspace, and the planning horizon. We present a method based on timed waypoints to address this issue. We show that timed waypoints can help abstract nonlinear behaviors of the system as safety envelopes around the reference path defined by those waypoints. Then the search for waypoints satisfying the STL specifications can be inductively encoded as a mixed-integer linear program. The agents following the synthesized timed waypoints have their tasks automatically allocated, and are guaranteed to satisfy the STL specifications while avoiding collisions. We evaluate the algorithm on a wide variety of benchmarks. Results show that it supports multi-agent planning from complex specifications over long planning horizons, and significantly outperforms state-of-the-art abstraction-based and MPC-based motion planning methods. The implementation is available at https://github.com/sundw2014/STLPlanning.

[23]  arXiv:2201.05342 (cross-list from math.OC) [pdf, other]
Title: Distributed Q-Learning for Stochastic LQ Control with Unknown Uncertainty
Authors: Zhaorong Zhang (1), Juanjuan Xu (1), Xun Li (2) ((1) Shandong University (2) the Hong Kong Polytechnic University)
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

This paper studies a discrete-time stochastic control problem with linear quadratic criteria over an infinite-time horizon. We focus on a class of control systems whose system matrices are associated with random parameters involving unknown statistical properties. In particular, we design a distributed Q-learning algorithm to tackle the Riccati equation and derive the optimal controller stabilizing the system. The key technique is that we convert the problem of solving the Riccati equation into deriving the zero point of a matrix equation and devise a distributed stochastic approximation method to compute the estimates of the zero point. The convergence analysis proves that the distributed Q-learning algorithm eventually converges to the correct value. A numerical example illustrates that the distributed Q-learning algorithm converges asymptotically.

[24]  arXiv:2201.05442 (cross-list from physics.app-ph) [pdf]
Title: Highly sensitive fire alarm system based on cellulose paper with low temperature response and wireless signal conversion
Subjects: Applied Physics (physics.app-ph); Signal Processing (eess.SP)

Highly sensitive smart sensors for early fire detection with remote warning capabilities are urgently required to improve the fire safety of combustible materials in diverse applications. A highly sensitive fire alarm can detect a fire quickly when a fire disaster is about to occur, which is conducive to achieve fire tuned. Herein, a novel fire alarm is designed by using flame-retardant cellulose paper loaded with graphene oxide (GO) and two-dimensional titanium carbide (Ti3C2, MXene). Owing to the excellent temperature-dependent electrical resistance switching effect of GO, it acts as an electrical insulator at room temperature and becomes electrically conductive at high temperature. During a fire incident, the partial oxygen-containing groups on GO will undergo complete removal, which results in the conductivity transformation. Besides the use of this GO feature, this work also introduces conductive MXene to enhance fire detection speed and warning at low temperature, especially below 300 °C. The designed flame-retardant fire alarm is sensitive enough to detect a fire incident, showing a response time of 2 s at 250 °C, which is calculated by a novel and quantifiable technique. More importantly, the designed fire alarm sensor is coupled to a wireless communication interface to conveniently transmit the fire signal remotely. Therefore, when an abnormal temperature is detected, the signal is wirelessly transmitted to a liquid crystal display (LCD) screen that displays a message such as "FIRE DANGER". The designed smart fire alarm paper is promising for use as a smart wallpaper for interior house decoration and other applications requiring early fire detection and warning.

[25]  arXiv:2201.05452 (cross-list from cs.SD) [pdf, other]
Title: Multiphonic modeling using Impulse Pattern Formulation (IPF)
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS); Adaptation and Self-Organizing Systems (nlin.AO); Applied Physics (physics.app-ph)

Multiphonics, the presence of multiple pitches within the sound, can be produced in several ways. In wind instruments, they can appear at low blowing pressure when complex fingerings are used. Such multiphonics can be modeled by the Impulse Pattern Formulation (IPF). This top-down method regards musical instruments as systems working with impulses that originate from a generating entity, travel through the instrument, are reflected at various positions, and are exponentially damped. Eventually, impulses return to the generating entity and retrigger or interact with subsequent impulses. Due to this straightforward approach, the IPF can explain fundamental principles of complex dynamic systems. While modeling wind instruments played with blowing pressures at the threshold of tone onset, the IPF captures transitions between regular periodicity at nominal pitch, bifurcations, and noise. This corresponds to behavior found in wind instruments where multiphonics appear at the transition between noise and regular musical note regimes. Using the IPF, complex fingerings correspond to multiple reflection points at open finger holes with different reflection strengths. Multiphonics can be modeled if reflection points farther away show higher reflection strength and thus disrupt periodic motion. The IPF can also synthesize multiphonic sounds by concatenating typical wind instrument waveforms at adjacent impulse time points.

[26]  arXiv:2201.05510 (cross-list from cs.SD) [pdf, ps, other]
Title: Anomalous Sound Detection using Spectral-Temporal Information Fusion
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)

Unsupervised anomalous sound detection aims to detect unknown abnormal sounds of machines from normal sounds. However, the state-of-the-art approaches are not always stable and perform dramatically differently even for machines of the same type, making them impractical for general applications. This paper proposes a spectral-temporal fusion based self-supervised method to model the feature of the normal sound, which improves the stability and performance consistency in detection of anomalous sounds from individual machines, even of the same type. Experiments on the DCASE 2020 Challenge Task 2 dataset show that the proposed method achieved 81.39%, 83.48%, 98.22% and 98.83% in terms of the minimum AUC (worst-case detection performance amongst individuals) in four types of real machines (fan, pump, slider and valve), respectively, giving 31.79%, 17.78%, 10.42% and 21.13% improvement compared to the state-of-the-art method, i.e., Glow_Aff. Moreover, the proposed method has improved AUC (average performance of individuals) for all the types of machines in the dataset.
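
The distinction between minimum AUC (worst individual machine) and average AUC can be made concrete with a small sketch. The code below computes a pairwise-comparison AUC per machine and then takes the minimum and the mean. The machine IDs, scores, and labels are invented toy values, and the helper names `auc` and `min_and_mean_auc` are hypothetical; this is not the evaluation code used in the paper or by the DCASE challenge.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as half).

    labels: 1 = anomalous, 0 = normal; higher score = more anomalous.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def min_and_mean_auc(per_machine):
    """per_machine maps a machine ID to (scores, labels) for that individual."""
    aucs = [auc(scores, labels) for scores, labels in per_machine.values()]
    return min(aucs), sum(aucs) / len(aucs)


# Toy data for two individual machines of the same type (made-up values).
toy = {
    "id_00": ([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]),  # perfectly separated
    "id_01": ([0.1, 0.7, 0.4, 0.9], [0, 0, 1, 1]),  # one normal clip scores high
}
worst, average = min_and_mean_auc(toy)
print(worst, average)
```

A method can have a high average AUC while one individual machine is detected poorly, which is why the worst-case figure is reported separately.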

[27]  arXiv:2201.05554 (cross-list from cs.SD) [pdf, other]
Title: Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition
Comments: Proceedings of INTERSPEECH 2021
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)

Automatic recognition of disordered speech remains a highly challenging task to date. Sources of variability commonly found in normal speech including accent, age or gender, when further compounded with the underlying causes of speech impairment and varying severity levels, create large diversity among speakers. To this end, speaker adaptation techniques play a vital role in current speech recognition systems. Motivated by the spectro-temporal level differences between disordered and normal speech that systematically manifest in articulatory imprecision, decreased volume and clarity, slower speaking rates and increased dysfluencies, novel spectro-temporal subspace basis embedding deep features derived by SVD decomposition of speech spectrum are proposed to facilitate both accurate speech intelligibility assessment and auxiliary feature based speaker adaptation of state-of-the-art hybrid DNN and end-to-end disordered speech recognition systems. Experiments conducted on the UASpeech corpus suggest the proposed spectro-temporal deep feature adapted systems consistently outperformed baseline i-Vector adaptation by up to 2.63% absolute (8.6% relative) reduction in word error rate (WER) with or without data augmentation. Learning hidden unit contribution (LHUC) based speaker adaptation was further applied. The final speaker adapted system using the proposed spectral basis embedding features gave an overall WER of 25.6% on the UASpeech test set of 16 dysarthric speakers.

[28]  arXiv:2201.05562 (cross-list from cs.SD) [pdf, other]
Title: Investigation of Data Augmentation Techniques for Disordered Speech Recognition
Comments: Proceedings of INTERSPEECH 2020
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)

Disordered speech recognition is a highly challenging task. The underlying neuro-motor conditions of people with speech disorders, often compounded with co-occurring physical disabilities, lead to the difficulty in collecting large quantities of speech required for system development. This paper investigates a set of data augmentation techniques for disordered speech recognition, including vocal tract length perturbation (VTLP), tempo perturbation and speed perturbation. Both normal and disordered speech were exploited in the augmentation process. Variability among impaired speakers in both the original and augmented data was modeled using learning hidden unit contributions (LHUC) based speaker adaptive training. The final speaker adapted system constructed using the UASpeech corpus and the best augmentation approach based on speed perturbation produced up to 2.92% absolute (9.3% relative) word error rate (WER) reduction over the baseline system without data augmentation, and gave an overall WER of 26.37% on the test set containing 16 dysarthric speakers.
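
For readers unfamiliar with speed perturbation, a minimal sketch follows. It resamples a waveform by linear interpolation so that, played back at the original sampling rate, the audio is `factor` times faster, which changes duration and pitch together; tempo perturbation, by contrast, changes duration while leaving pitch unchanged. The function name `speed_perturb` and the use of plain linear interpolation are assumptions for illustration; the abstract does not say how the authors implemented the augmentation.

```python
def speed_perturb(samples, factor):
    """Resample `samples` by linear interpolation.

    factor > 1 shortens the signal (faster speech), factor < 1 lengthens it.
    Hypothetical helper for illustration only.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(len(samples) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor                       # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)     # clamp at the last sample
        frac = pos - lo
        out.append((1 - frac) * samples[lo] + frac * samples[hi])
    return out


ramp = [float(i) for i in range(10)]
faster = speed_perturb(ramp, 2.0)   # half as many samples
slower = speed_perturb(ramp, 0.5)   # twice as many samples
print(len(faster), len(slower))
```

In practice a band-limited resampler would normally replace the linear interpolation used here, which is kept only to make the index arithmetic visible.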

[29]  arXiv:2201.05599 (cross-list from cs.RO) [pdf]
Title: Smart Magnetic Microrobots Learn to Swim with Deep Reinforcement Learning
Comments: 23 pages, 5 figures
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Systems and Control (eess.SY)

Swimming microrobots are increasingly developed with complex materials and dynamic shapes and are expected to operate in complex environments in which the system dynamics are difficult to model and positional control of the microrobot is not straightforward to achieve. Deep reinforcement learning is a promising method of autonomously developing robust controllers for creating smart microrobots, which can adapt their behavior to operate in uncharacterized environments without the need to model the system dynamics. Here, we report the development of a smart helical magnetic hydrogel microrobot that used the soft actor critic reinforcement learning algorithm to autonomously derive a control policy which allowed the microrobot to swim through an uncharacterized biomimetic fluidic environment under control of a time varying magnetic field generated from a three-axis array of electromagnets. The reinforcement learning agent learned successful control policies with fewer than 100,000 training steps, demonstrating sample efficiency for fast learning. We also demonstrate that we can fine tune the control policies learned by the reinforcement learning agent by fitting mathematical functions to the learned policy's action distribution via regression. Deep reinforcement learning applied to microrobot control is likely to significantly expand the capabilities of the next generation of microrobots.

Replacements for Mon, 17 Jan 22

[30]  arXiv:2002.07378 (replaced) [pdf, other]
Title: Distributed Adaptive Newton Methods with Global Superlinear Convergence
Comments: Accepted to Automatica as regular paper. 13 pages
Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA); Signal Processing (eess.SP); Systems and Control (eess.SY)
[31]  arXiv:2105.14071 (replaced) [pdf, other]
Title: Classification of Brain Tumours in MR Images using Deep Spatiospatial Models
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[32]  arXiv:2106.05490 (replaced) [pdf]
Title: SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network
Comments: Submitted to IEEE Transactions on Signal Processing
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)
[33]  arXiv:2106.09296 (replaced) [pdf, other]
Title: Voice2Series: Reprogramming Acoustic Models for Time Series Classification
Comments: Updated version with a correction. The full draft was submitted in Jan 2021. The Voice2Series project initially was launched in Sep 2020. Accepted to ICML 2021, 16 Pages
Journal-ref: Proceedings of the 38th International Conference on Machine Learning 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[34]  arXiv:2107.12068 (replaced) [pdf, other]
Title: QoE Evaluation for Adaptive Video Streaming: Enhanced MDT with Deep Learning
Comments: 14 pages, 14 figures, "This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible."
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
[35]  arXiv:2107.12632 (replaced) [pdf, other]
Title: INTPIX4NA -- new integration-type silicon-on-insulator pixel detector for imaging application
Comments: Accepted for publication at JINST (2022/01/14 Typo correction ver.)
Journal-ref: 2021 JINST 16 P08054
Subjects: Instrumentation and Detectors (physics.ins-det); Image and Video Processing (eess.IV); High Energy Physics - Experiment (hep-ex)
[36]  arXiv:2108.00829 (replaced) [pdf]
Title: Objective crystallographic symmetry classifications of noisy and noise-free 2D periodic patterns with strong Fedorov type pseudosymmetries
Authors: Peter Moeck
Comments: 35 pages, 8 figures, 9 data tables, 2 tables of results, 126 references, 8 appendices with 8 additional figures, accepted in shortened form for publication in the Advances Section of Acta Cryst. A on October 22, 2021, to be published in 2022
Subjects: Image and Video Processing (eess.IV); Applied Physics (physics.app-ph)
[37]  arXiv:2108.07368 (replaced) [pdf]
Title: CaraNet: Context Axial Reverse Attention Network for Segmentation of Small Medical Objects
Comments: Accepted by SPIE Medical Imaging: Image Processing (oral presentation)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[38]  arXiv:2108.13754 (replaced) [pdf]
Title: MRI lung lobe segmentation in pediatric cystic fibrosis patients using a recurrent neural network trained with publicly accessible CT datasets
Authors: Orso Pusterla (1,2,3), Rahel Heule (4,5), Francesco Santini (1,3,6), Thomas Weikert (6), Corin Willers (2), Simon Andermatt (3), Robin Sandkühler (3), Sylvia Nyilas (7), Philipp Latzin (2), Oliver Bieri (1,3), Grzegorz Bauman (1,3) ((1) Department of Radiology, Division of Radiological Physics, University Hospital Basel, University of Basel, Basel, Switzerland, (2) Division of Pediatric Respiratory Medicine and Allergology, Department of Pediatrics, Inselspital, Bern University Hospital, University of Bern, Switzerland, (3) Department of Biomedical Engineering, University of Basel, Basel, Switzerland, (4) High Field Magnetic Resonance, Max Planck Institute for Biological Cybernetics, Tübingen, Germany, (5) Department of Biomedical Magnetic Resonance, University of Tübingen, Tübingen, Germany, (6) Department of Radiology, University Hospital Basel, University of Basel, Basel, Switzerland, (7) Department of Diagnostic, Interventional and Pediatric Radiology, Inselspital, Bern University Hospital, University of Bern, Switzerland)
Comments: 9 Figures, 1 Table. Supplementary Material: 1 Appendix, 1 Table, 3 Supporting Figures
Subjects: Medical Physics (physics.med-ph); Image and Video Processing (eess.IV)
[39]  arXiv:2109.11899 (replaced) [pdf, ps, other]
Title: OTFS Without CP in Massive MIMO: Breaking Doppler Limitations with TR-MRC and Windowing
Comments: Accepted by WCNC 2022
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
[40]  arXiv:2110.01292 (replaced) [pdf, other]
Title: A Survey on Channel Estimation and Practical Passive Beamforming Design for Intelligent Reflecting Surface Aided Wireless Communications
Comments: 76 pages, 17 figures, and 10 tables. In this paper, we provide a comprehensive survey on the up-to-date research in IRS-aided wireless communications, with an emphasis on the promising solutions to tackle practical design issues
Subjects: Information Theory (cs.IT); Emerging Technologies (cs.ET); Signal Processing (eess.SP)
[41]  arXiv:2110.05815 (replaced) [pdf, other]
Title: Covariance-Based Joint Device Activity and Delay Detection in Asynchronous mMTC
Comments: Accepted by IEEE SPL
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
[42]  arXiv:2110.10332 (replaced) [pdf]
Title: AI-Based Detection, Classification and Prediction/Prognosis in Medical Imaging: Towards Radiophenomics
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[43]  arXiv:2111.03777 (replaced) [pdf, other]
Title: Privacy attacks for automatic speech recognition acoustic models in a federated learning framework
Comments: Submitted to ICASSP 2022
Subjects: Computation and Language (cs.CL); Cryptography and Security (cs.CR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[44]  arXiv:2111.13849 (replaced) [pdf, ps, other]
Title: Robust Adaptive Safety-Critical Control for Unknown Systems with Finite-Time Element-Wise Parameter Estimation
Subjects: Systems and Control (eess.SY)
[45]  arXiv:2112.04364 (replaced) [pdf, other]
Title: Generalization Error Bounds for Iterative Recovery Algorithms Unfolded as Neural Networks
Comments: 29 pages, 6 figures
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP); Machine Learning (stat.ML)
[46]  arXiv:2112.07983 (replaced) [pdf, other]
Title: Data-Driven Models for Control Engineering Applications Using the Koopman Operator
Comments: accepted for: 2022 3rd International Conference on Artificial Intelligence, Robotics and Control (AIRC 2022)
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
[47]  arXiv:2112.15484 (replaced) [pdf, other]
Title: A Research Agenda for AI Planning in the Field of Flexible Production Systems
Subjects: Artificial Intelligence (cs.AI); Systems and Control (eess.SY)
[48]  arXiv:2201.02574 (replaced) [pdf, other]
Title: An Incremental Learning Approach to Automatically Recognize Pulmonary Diseases from the Multi-vendor Chest Radiographs
Comments: Computers in Biology and Medicine
Journal-ref: Computers in Biology and Medicine, 2021
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[49]  arXiv:2201.04476 (replaced) [pdf, other]
Title: First Arrival Position in Molecular Communication Via Generator of Diffusion Semigroup
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
[50]  arXiv:2201.05013 (replaced) [pdf, other]
Title: Fish sounds: towards the evaluation of marine acoustic biodiversity through data-driven audio source separation
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[51]  arXiv:2201.05024 (replaced) [pdf, other]
Title: Real-Time GPU-Accelerated Machine Learning Based Multiuser Detection for 5G and Beyond
Comments: submitted to IEEEAccess
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Machine Learning (cs.LG); Machine Learning (stat.ML)