Image and Video Processing

New submissions

[ total of 24 entries: 1-24 ]

New submissions for Thu, 9 Jul 20

[1]  arXiv:2007.03817 [pdf, other]
Title: Self-supervised Skull Reconstruction in Brain CT Images with Decompressive Craniectomy
Comments: Accepted for publication in MICCAI 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Decompressive craniectomy (DC) is a common surgical procedure consisting of the removal of a portion of the skull that is performed after incidents such as stroke, traumatic brain injury (TBI) or other events that could result in acute subdural hemorrhage and/or increasing intracranial pressure. In these cases, CT scans are obtained to diagnose and assess injuries, or guide a certain therapy and intervention.
We propose a deep learning based method to reconstruct the skull defect removed during DC performed after TBI from post-operative CT images. This reconstruction is useful in multiple scenarios, e.g. to support the creation of cranioplasty plates, accurate measurements of bone flap volume and total intracranial volume, important for studies that aim to relate later atrophy to patient outcome. We propose and compare alternative self-supervised methods where an encoder-decoder convolutional neural network (CNN) estimates the missing bone flap on post-operative CTs. The self-supervised learning strategy only requires images with complete skulls and avoids the need for annotated DC images. For evaluation, we employ real and simulated images with DC, comparing the results with other state-of-the-art approaches. The experiments show that the proposed model outperforms current manual methods, enabling reconstruction even in highly challenging cases where big skull defects have been removed during surgery.
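The self-supervised idea above (train only on complete skulls by removing bone virtually) can be illustrated with a minimal sketch. The function name `simulate_craniectomy`, the spherical defect shape, and the toy shell-shaped "skull" are assumptions made for illustration; the abstract does not specify how defects are simulated.

```python
import numpy as np

def simulate_craniectomy(skull, center, radius):
    """Remove a spherical region from a binary skull mask (hypothetical helper).

    Returns (defective_skull, bone_flap). The defective skull would be the
    network input and the bone flap the reconstruction target.
    """
    zz, yy, xx = np.indices(skull.shape)
    cz, cy, cx = center
    inside = (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    flap = skull & inside          # bone that the virtual surgery removes
    defective = skull & ~inside    # what remains of the skull
    return defective, flap

# Toy "skull": a hollow spherical shell in a 32^3 volume (assumption, not CT data).
shape = (32, 32, 32)
zz, yy, xx = np.indices(shape)
r2 = (zz - 16) ** 2 + (yy - 16) ** 2 + (xx - 16) ** 2
skull = (r2 <= 14 ** 2) & (r2 >= 12 ** 2)

# One self-supervised training pair: no manual annotation is needed.
defective, flap = simulate_craniectomy(skull, center=(16, 16, 29), radius=6)
```

Because the pair is generated from a complete skull, the target is known exactly, which is what removes the need for annotated post-operative images.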

[2]  arXiv:2007.03882 [pdf, other]
Title: Low-dimensional Manifold Constrained Disentanglement Network for Metal Artifact Reduction
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Deep neural network based methods have achieved promising results for CT metal artifact reduction (MAR), most of which use many synthesized paired images for training. As synthesized metal artifacts in CT images may not accurately reflect the clinical counterparts, an artifact disentanglement network (ADN) was proposed with unpaired clinical images directly, producing promising results on clinical datasets. However, without sufficient supervision, it is difficult for ADN to recover structural details of artifact-affected CT images based on adversarial losses only. To overcome these problems, here we propose a low-dimensional manifold (LDM) constrained disentanglement network (DN), leveraging the image characteristics that the patch manifold is generally low-dimensional. Specifically, we design an LDM-DN learning algorithm to empower the disentanglement network through optimizing the synergistic network loss functions while constraining the recovered images to be on a low-dimensional patch manifold. Moreover, learning from both paired and unpaired data, an efficient hybrid optimization scheme is proposed to further improve the MAR performance on clinical datasets. Extensive experiments demonstrate that the proposed LDM-DN approach can consistently improve the MAR performance in paired and/or unpaired learning settings, outperforming competing methods on synthesized and clinical datasets.

[3]  arXiv:2007.03951 [pdf, other]
Title: Designing and Training of A Dual CNN for Image Denoising
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Deep convolutional neural networks (CNNs) for image denoising have recently attracted increasing research interest. However, plain networks cannot recover fine details for a complex task, such as real noisy images. In this paper, we propose a Dual denoising Network (DudeNet) to recover a clean image. Specifically, DudeNet consists of four modules: a feature extraction block, an enhancement block, a compression block, and a reconstruction block. The feature extraction block with a sparse mechanism extracts global and local features via two sub-networks. The enhancement block gathers and fuses the global and local features to provide complementary information for the latter network. The compression block refines the extracted information and compresses the network. Finally, the reconstruction block is utilized to reconstruct a denoised image. The DudeNet has the following advantages: (1) The dual networks with a sparse mechanism can extract complementary features to enhance the generalization ability of the denoiser. (2) Fusing global and local features can extract salient features to recover fine details for complex noisy images. (3) A small-size filter is used to reduce the complexity of the denoiser. Extensive experiments demonstrate the superiority of DudeNet over existing state-of-the-art denoising methods.

[4]  arXiv:2007.04226 [pdf, other]
Title: Labelling imaging datasets on the basis of neuroradiology reports: a validation study
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Natural language processing (NLP) shows promise as a means to automate the labelling of hospital-scale neuroradiology magnetic resonance imaging (MRI) datasets for computer vision applications. To date, however, there has been no thorough investigation into the validity of this approach, including determining the accuracy of report labels compared to image labels as well as examining the performance of non-specialist labellers. In this work, we draw on the experience of a team of neuroradiologists who labelled over 5000 MRI neuroradiology reports as part of a project to build a dedicated deep learning-based neuroradiology report classifier. We show that, in our experience, assigning binary labels (i.e. normal vs abnormal) to images from reports alone is highly accurate. In contrast to the binary labels, however, the accuracy of more granular labelling is dependent on the category, and we highlight reasons for this discrepancy. We also show that downstream model performance is reduced when labelling of training reports is performed by a non-specialist. To allow other researchers to accelerate their research, we make our refined abnormality definitions and labelling rules available, as well as our easy-to-use radiology report labelling app which helps streamline this process.

[5]  arXiv:2007.04258 [pdf, other]
Title: Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment
Comments: Under review at Medical Image Analysis
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

The interpretation of medical images is a challenging task, often complicated by the presence of artifacts, occlusions, limited contrast and more. Most notable is the case of chest radiography, where there is a high inter-rater variability in the detection and classification of abnormalities. This is largely due to inconclusive evidence in the data or subjective definitions of disease appearance. An additional example is the classification of anatomical views based on 2D Ultrasound images. Often, the anatomical context captured in a frame is not sufficient to recognize the underlying anatomy. Current machine learning solutions for these problems are typically limited to providing probabilistic predictions, relying on the capacity of underlying models to adapt to limited information and the high degree of label noise. In practice, however, this leads to overconfident systems with poor generalization on unseen data. To account for this, we propose a system that learns not only the probabilistic estimate for classification, but also an explicit uncertainty measure which captures the confidence of the system in the predicted output. We argue that this approach is essential to account for the inherent ambiguity characteristic of medical images from different radiologic exams including computed radiography, ultrasonography and magnetic resonance imaging. In our experiments we demonstrate that sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC for various tasks, e.g., by 8% to 0.91 with an expected rejection rate of under 25% for the classification of different abnormalities in chest radiographs. In addition, we show that using uncertainty-driven bootstrapping to filter the training data, one can achieve a significant increase in robustness and accuracy.
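The sample-rejection evaluation described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' system: the per-sample `uncertainty` values, the noise model, and the helper names `roc_auc` and `reject_uncertain` are all assumptions.

```python
import numpy as np

def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney) statistic; assumes no tied scores."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def reject_uncertain(uncertainty, reject_frac):
    """Boolean mask that keeps the (1 - reject_frac) least uncertain samples."""
    threshold = np.quantile(uncertainty, 1.0 - reject_frac)
    return uncertainty <= threshold

# Synthetic stand-in for model outputs: samples with higher uncertainty
# receive noisier scores (an assumption made purely for this sketch).
rng = np.random.default_rng(0)
n = 5000
labels = rng.integers(0, 2, size=n)
uncertainty = rng.random(n)
scores = labels + rng.normal(0.0, 0.2 + 2.0 * uncertainty)

keep = reject_uncertain(uncertainty, reject_frac=0.25)
auc_all = roc_auc(labels, scores)
auc_kept = roc_auc(labels[keep], scores[keep])
```

Comparing `auc_all` with `auc_kept` mirrors the kind of comparison reported in the abstract: the metric is recomputed on the samples that survive rejection, and the rejected fraction is reported alongside it.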

Cross-lists for Thu, 9 Jul 20

[6]  arXiv:2007.03800 (cross-list from cs.LG) [pdf, ps, other]
Title: Efficient and Parallel Separable Dictionary Learning
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV); Numerical Analysis (math.NA); Machine Learning (stat.ML)

Separable, or Kronecker product, dictionaries provide natural decompositions for 2D signals, such as images. In this paper, we describe an algorithm to learn such dictionaries which is highly parallelizable and which reaches sparse representations competitive with the previous state of the art dictionary learning algorithms from the literature. We highlight the performance of the proposed method to sparsely represent image data and for image denoising applications.
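The equivalence between a separable dictionary and a Kronecker product dictionary can be checked numerically. In this sketch the names `A`, `B`, `X` and all sizes are assumptions chosen for illustration; it demonstrates the standard identity vec(A X Bᵀ) = (B ⊗ A) vec(X) with column-major vectorization, not the learning algorithm from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 12))   # left dictionary, acts on image columns
B = rng.standard_normal((8, 12))   # right dictionary, acts on image rows

# Sparse coefficient matrix with at most 5 nonzero entries.
X = np.zeros((12, 12))
X[rng.integers(0, 12, 5), rng.integers(0, 12, 5)] = rng.standard_normal(5)

# Separable synthesis of an 8x8 patch: two small matrix products.
Y_separable = A @ X @ B.T

# Equivalent synthesis with the full Kronecker dictionary (64 x 144).
D = np.kron(B, A)
y_full = D @ X.flatten(order="F")  # column-major vec(X)

# Storage: A.size + B.size = 192 entries versus D.size = 9216 entries.
```

The separable form never materializes `D`, which is the practical appeal of such dictionaries for 2D signals.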

[7]  arXiv:2007.03851 (cross-list from cs.CV) [pdf, other]
Title: SiENet: Siamese Expansion Network for Image Extrapolation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Unlike image inpainting, image outpainting has relatively less context in the image center to capture and more content at the image border to predict. Therefore, the classical encoder-decoder pipeline of existing methods may not predict the outstretched unknown content perfectly. In this paper, a novel two-stage siamese adversarial model for image extrapolation, named Siamese Expansion Network (SiENet), is proposed. In the two stages, a novel border-sensitive convolution named adaptive filling convolution is designed to allow the encoder to predict the unknown content, alleviating the burden on the decoder. In addition, to introduce prior knowledge to the network and reinforce the inferring ability of the encoder, a siamese adversarial mechanism is designed to enable our network to model the distribution of covered long-range features as that of uncovered image features. The results on four datasets demonstrate that our method outperforms existing state-of-the-art methods and can produce realistic results.

[8]  arXiv:2007.03893 (cross-list from eess.SP) [pdf, other]
Title: Multi-Resolution Beta-Divergence NMF for Blind Spectral Unmixing
Comments: 13 pages
Subjects: Signal Processing (eess.SP); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)

Blind spectral unmixing is the problem of decomposing the spectrum of a mixed signal or image into a collection of source spectra and their corresponding activations indicating the proportion of each source present in the mixed spectrum. To perform this task, nonnegative matrix factorization (NMF) based on the $\beta$-divergence, referred to as $\beta$-NMF, is a standard and state-of-the-art technique. Many NMF-based methods factorize a data matrix that is the result of a resolution trade-off between two adversarial dimensions. Two instrumental examples are (1) audio spectral unmixing for which the frequency-by-time data matrix is computed with the short-time Fourier transform and is the result of a trade-off between the frequency resolution and the temporal resolution, and (2) blind hyperspectral unmixing for which the wavelength-by-location data matrix is a trade-off between the number of wavelengths measured and the spatial resolution. In this paper, we propose a new NMF-based method, dubbed multi-resolution $\beta$-NMF (MR-$\beta$-NMF), to address this issue by fusing the information coming from multiple data with different resolutions in order to produce a factorization with high resolutions for all the dimensions. MR-$\beta$-NMF performs a form of nonnegative joint factorization based on the $\beta$-divergence. In order to solve this problem, we propose multiplicative updates based on a majorization-minimization algorithm. We show in numerical experiments that MR-$\beta$-NMF is able to obtain high resolutions in both dimensions for two applications: the joint-factorization of two audio spectrograms, and the hyperspectral and multispectral data fusion problem.
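For orientation, the following is a minimal sketch of the standard single-resolution $\beta$-NMF with multiplicative updates, the baseline technique the abstract builds on. It is not the MR-$\beta$-NMF algorithm; the function names, matrix sizes, and the small constant `eps` are assumptions for illustration.

```python
import numpy as np

def beta_divergence(V, V_hat, beta):
    """Sum of element-wise beta-divergences d_beta(V | V_hat)."""
    if beta == 1:   # Kullback-Leibler
        return np.sum(V * np.log(V / V_hat) - V + V_hat)
    if beta == 0:   # Itakura-Saito
        return np.sum(V / V_hat - np.log(V / V_hat) - 1)
    return np.sum((V ** beta + (beta - 1) * V_hat ** beta
                   - beta * V * V_hat ** (beta - 1)) / (beta * (beta - 1)))

def beta_nmf(V, rank, beta=1.0, n_iter=100, seed=0, eps=1e-12):
    """Factorize V ~= W @ H with the standard multiplicative updates."""
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, rank)) + eps   # source spectra
    H = rng.random((rank, N)) + eps   # activations
    losses = []
    for _ in range(n_iter):
        V_hat = W @ H
        H *= (W.T @ (V * V_hat ** (beta - 2))) / (W.T @ V_hat ** (beta - 1) + eps)
        V_hat = W @ H
        W *= ((V * V_hat ** (beta - 2)) @ H.T) / (V_hat ** (beta - 1) @ H.T + eps)
        losses.append(beta_divergence(V, W @ H, beta))
    return W, H, losses

# Toy strictly positive data matrix (assumption: 20 frequency bins, 30 frames).
V = np.random.default_rng(1).random((20, 30)) + 0.1
W, H, losses = beta_nmf(V, rank=4, beta=1.0)
```

The updates keep `W` and `H` nonnegative by construction, since each factor is only ever multiplied by a ratio of nonnegative quantities.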

[9]  arXiv:2007.03956 (cross-list from physics.optics) [pdf]
Title: Guidestar-free image-guided wavefront-shaping
Subjects: Optics (physics.optics); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Optical imaging through scattering media is a fundamental challenge in many applications. Recently, substantial breakthroughs such as imaging through biological tissues and looking around corners have been obtained by the use of wavefront-shaping approaches. However, these require an implanted guide-star for determining the wavefront correction, controlled coherent illumination, and most often raster scanning of the shaped focus. Alternative novel computational approaches that exploit speckle correlations avoid guide-stars and wavefront control, but are limited to small two-dimensional objects contained within the memory-effect correlations range. Here, we present a new concept, image-guided wavefront-shaping, allowing non-invasive, guidestar-free, widefield, incoherent imaging through highly scattering layers, without illumination control. Most importantly, the wavefront-correction is found even for objects that are larger than the memory-effect range, by blindly optimizing image-quality metrics. We demonstrate imaging of extended objects through highly-scattering layers and multi-core fibers, paving the way for non-invasive imaging in various applications, from microscopy to endoscopy.

[10]  arXiv:2007.04018 (cross-list from physics.med-ph) [pdf, other]
Title: Simultaneous Estimation of X-ray Back-Scatter and Forward-Scatter using Multi-Task Learning
Comments: 10 pages, 3 figures, 1 table, accepted at MICCAI 2020
Subjects: Medical Physics (physics.med-ph); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Scattered radiation is a major concern impacting X-ray image-guided procedures in two ways. First, back-scatter significantly contributes to patient (skin) dose during complicated interventions. Second, forward-scattered radiation reduces contrast in projection images and introduces artifacts in 3-D reconstructions. While conventionally employed anti-scatter grids improve image quality by blocking X-rays, the additional attenuation due to the anti-scatter grid at the detector needs to be compensated for by a higher patient entrance dose. This also increases the room dose affecting the staff caring for the patient. For skin dose quantification, back-scatter is usually accounted for by applying pre-determined scalar back-scatter factors or linear point spread functions to a primary kerma forward projection onto a patient surface point. However, as patients come in different shapes, the generalization of conventional methods is limited. Here, we propose a novel approach combining conventional techniques with learning-based methods to simultaneously estimate the forward-scatter reaching the detector as well as the back-scatter affecting the patient skin dose. Knowing the forward-scatter, we can correct X-ray projections, while a good estimate of the back-scatter component facilitates an improved skin dose assessment. To simultaneously estimate forward-scatter as well as back-scatter, we propose a multi-task approach for joint back- and forward-scatter estimation by combining X-ray physics with neural networks. We show that, in theory, highly accurate scatter estimation in both cases is possible. In addition, we identify research directions for our multi-task framework and learning-based scatter estimation in general.

Replacements for Thu, 9 Jul 20

[11]  arXiv:1908.09414 (replaced) [pdf, other]
Title: CycleGAN with a Blur Kernel for Deconvolution Microscopy: Optimal Transport Geometry
Comments: This paper is accepted for IEEE Trans. Computational Imaging
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[12]  arXiv:1912.04981 (replaced) [pdf, other]
Title: Phase Retrieval Using Conditional Generative Adversarial Networks
Comments: Accepted at the 25th International Conference on Pattern Recognition 2020 (ICPR)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[13]  arXiv:1912.07197 (replaced) [pdf, other]
Title: Dense Recurrent Neural Networks for Accelerated MRI: History-Cognizant Unrolling of Optimization Algorithms
Journal-ref: IEEE Journal of Selected Topics in Signal Processing, 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Signal Processing (eess.SP); Medical Physics (physics.med-ph)
[14]  arXiv:2005.09112 (replaced) [pdf]
Title: Measles Rash Image Detection Using Deep Convolutional Neural Network
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
[15]  arXiv:2005.10626 (replaced) [pdf, other]
Title: Efficient and Phase-aware Video Super-resolution for Cardiac MRI
Comments: MICCAI 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[16]  arXiv:2006.04725 (replaced) [pdf, other]
Title: Biomechanics-informed Neural Networks for Myocardial Motion Tracking in MRI
Comments: The paper is early accepted by MICCAI 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[17]  arXiv:2007.02606 (replaced) [pdf, other]
Title: A Convolutional Approach to Vertebrae Detection and Labelling in Whole Spine MRI
Comments: Accepted full paper to Medical Image Computing and Computer Assisted Intervention 2020. 11 pages plus appendix
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[18]  arXiv:2007.03107 (replaced) [pdf, other]
Title: Multi-image Super Resolution of Remotely Sensed Images using Residual Feature Attention Deep Neural Networks
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[19]  arXiv:1906.01123 (replaced) [pdf, other]
Title: Depth-Aware Arbitrary Style Transfer Using Instance Normalization
Comments: Replacement of the previous version due to the following improvements: depth estimation methods comparison added, better depth estimation network used, transformation to proximity map added with offset and contrast parameters. Dependency on these parameters shown, comparison of AdaIN and proposed method added, user evaluation study completely remade for improved version of the proposed method
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[20]  arXiv:1911.12641 (replaced) [pdf, other]
Title: PhIT-Net: Photo-consistent Image Transform for Robust Illumination Invariant Matching
Comments: Modified title. Added figures in section 3 for better understanding of the general concept. Added table summarizing graphs. New paper format (two columns)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[21]  arXiv:2004.09691 (replaced) [pdf, other]
Title: A Data and Compute Efficient Design for Limited-Resources Deep Learning
Comments: Accepted for poster presentation at the Practical Machine Learning for Developing Countries (PML4DC) workshop, ICLR 2020
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV); Machine Learning (stat.ML)
[22]  arXiv:2005.01530 (replaced) [pdf, other]
Title: Overcoming information reduced data and experimentally uncertain parameters in ptychography with regularized optimization
Comments: 18 pages, 7 figures, 3 tables
Subjects: Signal Processing (eess.SP); Image and Video Processing (eess.IV)
[23]  arXiv:2006.03761 (replaced) [pdf, other]
Title: GRNet: Gridding Residual Network for Dense Point Cloud Completion
Comments: ECCV 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[24]  arXiv:2007.01947 (replaced) [pdf, other]
Title: Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation
Comments: Full version of ECCV2020 Oral, CVPR2020 LID workshop Best Paper and LID challenge Track1 winner
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)