Image and Video Processing

New submissions

New submissions for Fri, 26 Feb 21

[1]  arXiv:2102.12525 [pdf, other]
Title: Prior Image-Constrained Reconstruction using Style-Based Generative Models
Comments: 10 + 12 pages
Subjects: Image and Video Processing (eess.IV)

Obtaining an accurate and reliable estimate of an object from highly incomplete imaging measurements remains a holy grail of imaging science. Deep learning methods have shown promise in learning object priors or constraints to improve the conditioning of an ill-posed imaging inverse problem. In this study, a framework is proposed for estimating an object of interest that is semantically related to a known prior image. An optimization problem is formulated in the disentangled latent space of a style-based generative model, and semantically meaningful constraints are imposed using the disentangled latent representation of the prior image. Stable recovery from incomplete measurements with the help of a prior image is theoretically analyzed. Numerical experiments demonstrating the superior performance of our approach as compared to related methods are presented.

[2]  arXiv:2102.12755 [pdf, other]
Title: Coarse-to-fine Airway Segmentation Using Multi information Fusion Network and CNN-based Region Growing
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Automatic airway segmentation from chest computed tomography (CT) scans plays an important role in pulmonary disease diagnosis and computer-assisted therapy. However, low contrast at peripheral branches and complex tree-like structures remain the two main challenges for airway segmentation. Recent research has illustrated that deep learning methods perform well in segmentation tasks. Motivated by these works, a coarse-to-fine segmentation framework is proposed to obtain a complete airway tree. Our framework segments the overall airway and small branches via the multi-information fusion convolutional neural network (Mif-CNN) and the CNN-based region growing, respectively. In Mif-CNN, atrous spatial pyramid pooling (ASPP) is integrated into a u-shaped network, where it expands the receptive field and captures multi-scale information. Meanwhile, boundary and location information are incorporated into semantic information. This information is fused to help Mif-CNN utilize additional context knowledge and useful features. To improve the performance of the segmentation result, the CNN-based region growing method is designed to focus on obtaining small branches. A voxel classification network (VCN), which can fully capture the rich information around each voxel, is applied to classify the voxels into airway and non-airway. In addition, a shape reconstruction method is used to refine the airway tree.
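
The region-growing idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `region_grow`, the 6-connected neighbourhood, and the `is_airway` callable (a stand-in for the voxel classification network) are all assumptions made for the example.

```python
from collections import deque


def region_grow(seeds, is_airway, shape):
    """Grow a region from seed voxels using 6-connectivity.

    `is_airway` is a hypothetical stand-in for a voxel classifier:
    any callable that maps a (z, y, x) index to True or False.
    """
    depth, height, width = shape
    region = set(seeds)
    queue = deque(seeds)
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in neighbours:
            nz, ny, nx = z + dz, y + dy, x + dx
            inside = 0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
            # Accept a neighbour only if it is new and the classifier says "airway".
            if inside and (nz, ny, nx) not in region and is_airway((nz, ny, nx)):
                region.add((nz, ny, nx))
                queue.append((nz, ny, nx))
    return region


# Toy volume of shape (1, 1, 4): 1 marks airway voxels.
# A simple lookup plays the role of the classifier here.
volume = [[[1, 1, 0, 1]]]
grown = region_grow([(0, 0, 0)],
                    lambda v: volume[v[0]][v[1]][v[2]] == 1,
                    (1, 1, 4))
```

In the toy volume the last voxel is labelled airway but is not connected to the seed, so the grown region stops at the gap.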

[3]  arXiv:2102.12759 [pdf, other]
Title: Binary segmentation of medical images using implicit spline representations and deep learning
Comments: 17 pages, 5 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

We propose a novel approach to image segmentation based on combining implicit spline representations with deep convolutional neural networks. This is done by predicting the control points of a bivariate spline function whose zero-set represents the segmentation boundary. We adapt several existing neural network architectures and design novel loss functions that are tailored towards providing implicit spline curve approximations. The method is evaluated on a congenital heart disease computed tomography medical imaging dataset. Experiments are carried out by measuring performance in various standard metrics for different networks and loss functions. We determine that splines of bidegree $(1,1)$ with $128\times128$ coefficient resolution performed optimally for $512\times 512$ resolution CT images. For our best network, we achieve an average volumetric test Dice score of almost 92%, which reaches the state of the art for this congenital heart disease dataset.
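
For readers unfamiliar with the Dice score quoted above, a minimal sketch of the standard definition on binary masks follows. The function name `dice_score`, the flat-list mask layout, and the convention for two empty masks are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
def dice_score(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred = [1, 1, 0, 0]
target = [1, 0, 0, 0]
score = dice_score(pred, target)  # one shared voxel, three foreground voxels in total
```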

[4]  arXiv:2102.12764 [pdf, other]
Title: Reducing Labelled Data Requirement for Pneumonia Segmentation using Image Augmentations
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Deep learning semantic segmentation algorithms can localise abnormalities or opacities from chest radiographs. However, the task of collecting and annotating training data is expensive and requires expertise, which remains a bottleneck for algorithm performance. We investigate the effect of image augmentations on reducing the requirement of labelled data in the semantic segmentation of chest X-rays for pneumonia detection. We train fully convolutional network models on subsets of different sizes from the total training data. We apply a different image augmentation while training each model and compare it to the baseline trained on the entire dataset without augmentations. Among rotate, mixup, translate, gamma and horizontal flip, we find that rotate and mixup are the best augmentations: they reduce the labelled data requirement by 70% while performing comparably to the baseline in terms of AUC and mean IoU in our experiments.
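
As a rough illustration of the mixup augmentation named above, the sketch below blends two images and their masks with a single Beta-distributed weight. The function name `mixup`, the flat-list image layout, the default `alpha=0.4`, and the choice to blend masks with the same weight are assumptions; the abstract does not state how the authors applied mixup to segmentation labels.

```python
import random


def mixup(image_a, mask_a, image_b, mask_b, alpha=0.4, rng=random):
    """Blend two flat images and their masks with one Beta(alpha, alpha) weight."""
    lam = rng.betavariate(alpha, alpha)  # lam lies in [0, 1]
    image = [lam * a + (1.0 - lam) * b for a, b in zip(image_a, image_b)]
    mask = [lam * a + (1.0 - lam) * b for a, b in zip(mask_a, mask_b)]
    return image, mask, lam


# A seeded generator keeps the example reproducible.
mixed_image, mixed_mask, lam = mixup([0.0, 1.0], [0, 1],
                                     [1.0, 1.0], [1, 1],
                                     rng=random.Random(0))
```

The blended mask is soft (values between 0 and 1), so a loss that accepts fractional targets is needed when training on it.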

[5]  arXiv:2102.12898 [pdf, other]
Title: ShuffleUNet: Super resolution of diffusion-weighted MRIs using deep learning
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Diffusion-weighted magnetic resonance imaging (DW-MRI) can be used to characterise the microstructure of the nervous tissue, e.g. to delineate brain white matter connections in a non-invasive manner via fibre tracking. Magnetic Resonance Imaging (MRI) in high spatial resolution would play an important role in visualising such fibre tracts in a superior manner. However, obtaining an image of such resolution comes at the expense of longer scan time. Longer scan time can be associated with an increase in motion artefacts, due to the patient's psychological and physical conditions. This study focuses on Single Image Super-Resolution (SISR), a technique that aims to obtain high-resolution (HR) details from a single low-resolution (LR) input image, achieved here with deep learning. Compared to interpolation techniques or sparse-coding algorithms, deep learning extracts prior knowledge from big datasets and produces superior MR images from the low-resolution counterparts. In this research, a deep learning based super-resolution technique is proposed and applied to DW-MRI. Images from the IXI dataset have been used as the ground-truth and were artificially downsampled to simulate the low-resolution images. The proposed method has shown statistically significant improvement over the baselines and achieved an SSIM of $0.913\pm0.045$.

[6]  arXiv:2102.12960 [pdf]
Title: Deep learning based electrical noise removal enables high spectral optoacoustic contrast in deep tissue
Comments: 19 pages, 6 figures, 1 table
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG)

Image contrast in multispectral optoacoustic tomography (MSOT) can be severely reduced by electrical noise and interference in the acquired optoacoustic signals. Signal processing techniques have proven insufficient to remove the effects of electrical noise because they typically rely on simplified models and fail to capture complex characteristics of signal and noise. Moreover, they often involve time-consuming processing steps that are unsuited for real-time imaging applications. In this work, we develop and demonstrate a discriminative deep learning (DL) approach to separate electrical noise from optoacoustic signals prior to image reconstruction. The proposed DL algorithm is based on two key features. First, it learns spatiotemporal correlations in both noise and signal by using the entire optoacoustic sinogram as input. Second, it employs training based on a large dataset of experimentally acquired pure noise and synthetic optoacoustic signals. We validated the ability of the trained model to accurately remove electrical noise on synthetic data and on optoacoustic images of a phantom and the human breast. We demonstrate significant enhancements of morphological and spectral optoacoustic images reaching 19% higher blood vessel contrast and localized spectral contrast at depths of more than 2 cm for images acquired in vivo. We discuss how the proposed denoising framework is applicable to clinical multispectral optoacoustic tomography and suitable for real-time operation.

[7]  arXiv:2102.13066 [pdf]
Title: On Instabilities of Conventional Multi-Coil MRI Reconstruction to Small Adverserial Perturbations
Comments: To appear in Proceedings of the 29th Annual Meeting of ISMRM, 2021
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Signal Processing (eess.SP); Medical Physics (physics.med-ph)

Although deep learning (DL) has received much attention in accelerated MRI, recent studies suggest small perturbations may lead to instabilities in DL-based reconstructions, leading to concern for their clinical application. However, these works focus on single-coil acquisitions, which are not practical. We investigate instabilities caused by small adversarial attacks for multi-coil acquisitions. Our results suggest that parallel imaging and multi-coil CS exhibit considerable instabilities against small adversarial perturbations.

Cross-lists for Fri, 26 Feb 21

[8]  arXiv:2102.12582 (cross-list from cs.LG) [pdf]
Title: Disentangling brain heterogeneity via semi-supervised deep-learning and MRI: dimensional representations of Alzheimer's Disease
Comments: 37 pages, 11 figures
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV); Neurons and Cognition (q-bio.NC); Quantitative Methods (q-bio.QM)

Heterogeneity of brain diseases is a challenge for precision diagnosis/prognosis. We describe and validate Smile-GAN (SeMI-supervised cLustEring-Generative Adversarial Network), a novel semi-supervised deep-clustering method, which dissects neuroanatomical heterogeneity, enabling identification of disease subtypes via their imaging signatures relative to controls. When applied to MRIs (2 studies; 2,832 participants; 8,146 scans) including cognitively normal individuals and those with cognitive impairment and dementia, Smile-GAN identified 4 neurodegenerative patterns/axes: P1, normal anatomy and highest cognitive performance; P2, mild/diffuse atrophy and more prominent executive dysfunction; P3, focal medial temporal atrophy and relatively greater memory impairment; P4, advanced neurodegeneration. Further application to longitudinal data revealed two distinct progression pathways: P1$\rightarrow$P2$\rightarrow$P4 and P1$\rightarrow$P3$\rightarrow$P4. Baseline expression of these patterns predicted the pathway and rate of future neurodegeneration. Pattern expression offered better yet complementary performance in predicting clinical progression, compared to amyloid/tau. These deep-learning derived biomarkers offer promise for precision diagnostics and targeted clinical trial recruitment.

[9]  arXiv:2102.12670 (cross-list from cs.RO) [pdf, other]
Title: Real-Time Ellipse Detection for Robotics Applications
Comments: Submitted to RA-L with IROS 2021 option. Currently under review
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

We propose a new algorithm for real-time detection and tracking of elliptic patterns suitable for real-world robotics applications. The method fits ellipses to each contour in the image frame and rejects ellipses that do not yield a good fit. It can detect complete, partial, and imperfect ellipses in extreme weather and lighting conditions and is lightweight enough to be used on robots' resource-limited onboard computers. The method is used on an example application of autonomous UAV landing on a fast-moving vehicle to show its performance indoors, outdoors, and in simulation on a real-world robotics task. A comparison with other well-known ellipse detection methods shows that our proposed algorithm outperforms them, with an F1 score of 0.981 on a dataset with over 1500 frames. The videos of experiments, the source code, and the collected dataset are provided with the paper.

[10]  arXiv:2102.12839 (cross-list from cs.CV) [pdf, other]
Title: A deep perceptual metric for 3D point clouds
Comments: Presented at IS&T Electronic Imaging: Image Quality and System Performance, January 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Signal Processing (eess.SP)

Point clouds are essential for storage and transmission of 3D content. As they can entail significant volumes of data, point cloud compression is crucial for practical usage. Recently, point cloud geometry compression approaches based on deep neural networks have been explored. In this paper, we evaluate the ability to predict perceptual quality of typical voxel-based loss functions employed to train these networks. We find that the commonly used focal loss and weighted binary cross entropy are poorly correlated with human perception. We thus propose a perceptual loss function for 3D point clouds which outperforms existing loss functions on the ICIP2020 subjective dataset. In addition, we propose a novel truncated distance field voxel grid representation and find that it leads to sparser latent spaces and loss functions that are more correlated with perceived visual quality compared to a binary representation. The source code is available at https://github.com/mauriceqch/2021_pc_perceptual_loss.
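
The two voxel-based losses named above can be written per voxel as in the sketch below. It uses the commonly cited forms of weighted binary cross entropy and focal loss; the function names and the default values of `pos_weight`, `gamma` and `alpha` are assumptions for illustration and are not taken from the paper or its repository.

```python
import math


def weighted_bce(p, y, pos_weight=2.0, eps=1e-7):
    """Binary cross entropy with extra weight on occupied voxels (y == 1)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss: down-weights voxels that are already classified well."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Predicted occupancy probabilities for two occupied voxels.
easy = focal_loss(0.9, 1)   # confident and correct: small loss
hard = focal_loss(0.6, 1)   # less confident: larger loss
```

Both functions score each voxel independently, which is one reason such losses may track perceived quality poorly, as the abstract reports.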

Replacements for Fri, 26 Feb 21

[11]  arXiv:2012.05767 (replaced) [pdf, other]
Title: Learning Tubule-Sensitive CNNs for Pulmonary Airway and Artery-Vein Segmentation in CT
Comments: 15 pages, IEEE TMI
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[12]  arXiv:2003.05438 (replaced) [pdf, other]
Title: Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning
Comments: 12 pages. Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[13]  arXiv:2007.10521 (replaced) [pdf, other]
Title: DeepCorn: A Semi-Supervised Deep Learning Method for High-Throughput Image-Based Corn Kernel Counting and Yield Estimation
Comments: 27 pages, 7 figures
Journal-ref: Knowledge-Based Systems (2021): 106874
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[14]  arXiv:2010.15718 (replaced) [pdf, other]
Title: On the limits to learning input data from gradients
Subjects: Cryptography and Security (cs.CR); Distributed, Parallel, and Cluster Computing (cs.DC); Image and Video Processing (eess.IV)
[15]  arXiv:2011.05755 (replaced) [pdf, other]
Title: Cryo-RALib -- a modular library for accelerating alignment in cryo-EM
Subjects: Quantitative Methods (q-bio.QM); Distributed, Parallel, and Cluster Computing (cs.DC); Image and Video Processing (eess.IV)
[16]  arXiv:2102.09199 (replaced) [pdf, other]
Title: Minimizing false negative rate in melanoma detection and providing insight into the causes of classification
Comments: supplementary materials included
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)