Medical Imaging & Data Analysis
Learning to Localize Objects Using Limited Annotation, With Applications to Thoracic Diseases
Motivation: The localization of objects in images is a longstanding objective within the field of image processing. Most current techniques are based on machine learning approaches, which typically require careful annotation of training samples in the form of expensive bounding box labels. The need for such large-scale annotation has only been exacerbated by the widespread adoption of deep learning techniques within the image processing community: deep learning is notoriously data-hungry. Method: In this work, we attack this problem directly by providing a new method for learning to localize objects with limited annotation: most training images can simply be annotated with their whole image labels (and no bounding box), with only a small fraction marked with bounding boxes. The training is driven by a novel loss function, which is a continuous relaxation of a well-defined discrete formulation of weakly supervised learning. Care is taken to ensure that the loss is numerically well-posed. Additionally, we propose a neural network architecture which accounts for both patch dependence, through the use of Conditional Random Field layers, and shift-invariance, through the inclusion of anti-aliasing filters. Results: We demonstrate our method on the task of localizing thoracic diseases in chest X-ray images, achieving state-of-the-art performance on the ChestX-ray14 dataset. We further show that with a modicum of additional effort our technique can be extended from object localization to object detection, attaining high quality results on the Kaggle RSNA Pneumonia Detection Challenge. Conclusion: The technique presented in this paper has the potential to enable high accuracy localization in regimes in which annotated data is either scarce or expensive to acquire. Future work will focus on applying the ideas presented in this paper to the realm of semantic segmentation.
PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI
Magnetic Resonance Imaging (MRI) has long been considered to be among “the gold standards” of diagnostic medical imaging. The long acquisition times, however, render MRI prone to motion artifacts and contribute to the relatively high cost of MRI examinations. Over the last few decades, multiple studies have focused on the development of both physical and post-processing methods for accelerated acquisition of MRI scans. These two approaches, however, have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of the concurrent learning-based design of data acquisition and image reconstruction schemes. Such schemes have already demonstrated substantial effectiveness, leading to considerably shorter acquisition times and improved quality of image reconstruction. Inspired by this initial success, in this work, we propose a novel approach to the learning of optimal schemes for conjoint acquisition and reconstruction of MRI scans, with the optimization carried out simultaneously with respect to the time-efficiency of data acquisition and the quality of the resulting reconstructions. To be of practical value, the schemes are encoded in the form of general k-space trajectories, whose associated magnetic gradients are constrained to obey a set of predefined hardware requirements (as defined in terms of, e.g., peak currents and maximum slew rates of magnetic gradients). With this proviso in mind, we propose a novel algorithm for the end-to-end training of a combined acquisition-reconstruction pipeline using a deep neural network with differentiable forward- and backpropagation operators. We also demonstrate the effectiveness of the proposed solution in application to both image reconstruction and image segmentation, reporting substantial improvements in terms of acceleration factors as well as the quality of these end tasks.
Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis
Despite their great promise, artificial intelligence (AI) systems have yet to become ubiquitous in the daily practice of medicine largely due to several crucial unmet needs of healthcare practitioners. These include lack of explanations in clinically meaningful terms, handling the presence of unknown medical conditions, and transparency regarding the system’s limitations, both in terms of statistical performance as well as recognizing situations for which the system’s predictions are irrelevant. We articulate these unmet clinical needs as machine-learning (ML) problems and systematically address them with cutting-edge ML techniques. We focus on electrocardiogram (ECG) analysis as an example domain in which AI has great potential and tackle two challenging tasks: the detection of a heterogeneous mix of known and unknown arrhythmias from ECG and the identification of underlying cardio-pathology from segments annotated as normal sinus rhythm recorded in patients with an intermittent arrhythmia. We validate our methods by simulating a screening for arrhythmias in a large-scale population while adhering to statistical significance requirements. Specifically, our system 1) visualizes the relative importance of each part of an ECG segment for the final model decision; 2) upholds specified statistical constraints on its out-of-sample performance and provides uncertainty estimation for its predictions; 3) handles inputs containing unknown rhythm types; and 4) handles data from unseen patients while also flagging cases in which the model’s outputs are not usable for a specific patient. This work represents a significant step toward overcoming the limitations currently impeding the integration of AI into clinical practice in cardiology and medicine in general.
3D FLAT: Feasible Learned Acquisition Trajectories for Accelerated MRI
Magnetic Resonance Imaging (MRI) has long been considered to be among the gold standards of today’s diagnostic imaging. The most significant drawback of MRI is long acquisition times, prohibiting its use in standard practice for some applications. Compressed sensing (CS) proposes to subsample the k-space (the Fourier domain dual to the physical space of spatial coordinates) leading to significantly accelerated acquisition. However, the benefit of compressed sensing has not been fully exploited; most of the sampling densities obtained through CS do not produce a trajectory that obeys the stringent constraints of the MRI machine imposed in practice. Inspired by recent success of deep learning-based approaches for image reconstruction and ideas from computational imaging on learning-based design of imaging systems, we introduce 3D FLAT, a novel protocol for data-driven design of 3D non-Cartesian accelerated trajectories in MRI. Our proposal leverages the entire 3D k-space to simultaneously learn a physically feasible acquisition trajectory with a reconstruction method. Experimental results, performed as a proof-of-concept, suggest that 3D FLAT achieves higher image quality for a given readout time compared to standard trajectories such as radial, stack-of-stars, or 2D learned trajectories (trajectories that evolve only in the 2D plane while fully sampling along the third dimension). Furthermore, we demonstrate evidence supporting the significant benefit of performing MRI acquisitions using non-Cartesian 3D trajectories over 2D non-Cartesian trajectories acquired slice-wise.
Towards learned optimal q-space sampling in diffusion MRI
Fiber tractography is an important tool of computational neuroscience that enables reconstructing the spatial connectivity and organization of white matter of the brain. Fiber tractography takes advantage of diffusion Magnetic Resonance Imaging (dMRI), which allows measuring the apparent diffusivity of cerebral water along different spatial directions. Unfortunately, collecting such data comes at the price of reduced spatial resolution and substantially elevated acquisition times, which limits the clinical applicability of dMRI. This problem has thus far been addressed using two principal strategies. Most of the efforts have been directed towards improving the quality of signal estimation for an arbitrary, yet fixed, sampling scheme (defined through the choice of diffusion encoding gradients). On the other hand, optimization over the sampling scheme has also proven to be effective. Inspired by the previous results, the present work consolidates the above strategies into a unified estimation framework, in which the optimization is carried out with respect to both estimation model and sampling design concurrently. The proposed solution offers substantial improvements in the quality of signal estimation as well as the accuracy of ensuing analysis by means of fiber tractography. While proving the optimality of the learned estimation models would probably need more extensive evaluation, we nevertheless claim that the learned sampling schemes can be of immediate use, offering a way to improve the dMRI analysis without the necessity of deploying the neural network used for their estimation. We present a comprehensive comparative analysis based on the Human Connectome Project data.
Data-driven prediction of embryo implantation probability using IVF time-lapse imaging
The process of fertilizing a human egg outside the body in order to help those suffering from infertility to conceive is known as in vitro fertilization (IVF). Despite being the most effective method of assisted reproductive technology (ART), the average success rate of IVF is a mere 20-40%. One step that is critical to the success of the procedure is selecting which embryo to transfer to the patient, a process typically conducted manually and without any universally accepted and standardized criteria. In this paper, we describe a novel data-driven system trained to directly predict embryo implantation probability from embryogenesis time-lapse imaging videos. Using retrospectively collected videos from 272 embryos, we demonstrate that, when compared to an external panel of embryologists, our algorithm results in a 12% increase of positive predictive value and a 29% increase of negative predictive value.
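The two metrics reported above have standard definitions: positive predictive value is the fraction of positive predictions that are correct, and negative predictive value is the fraction of negative predictions that are correct. A minimal sketch of that arithmetic follows; the function name and the confusion-matrix counts are hypothetical and chosen only for illustration, and they are not data from the study.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): share of predicted implantations that occurred.
    NPV = TN / (TN + FN): share of predicted failures that did fail.
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one positive and one negative prediction")
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical counts for illustration only (not taken from the paper).
ppv, npv = predictive_values(tp=30, fp=20, tn=40, fn=10)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # prints "PPV = 0.60, NPV = 0.80"
```

Both values depend on the prevalence of successful implantation in the evaluated cohort, so they are only comparable between raters scored on the same set of embryos.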
Over-parameterized models for vector fields
Vector fields arise in a variety of quantity measure and visualization techniques such as fluid flow imaging, motion estimation, deformation measures, and color imaging, leading to a better understanding of physical phenomena. Recent progress in vector field imaging technologies has emphasized the need for efficient noise removal and reconstruction algorithms. A key ingredient in the success of extracting signals from noisy measurements is prior information, which can often be represented as a parameterized model. In this work, we extend the over-parameterization variational framework in order to perform model-based reconstruction of vector fields. The over-parameterization methodology combines local modeling of the data with global model parameter regularization. By considering the vector field as a linear combination of basis vector fields and appropriate scale and rotation coefficients, the denoising problem reduces to a simpler form of coefficient recovery. We introduce two versions of the over-parameterization framework: total variation-based method and sparsity-based method, relying on the co-sparse analysis model. We demonstrate the efficiency of the proposed frameworks for two- and three-dimensional vector fields with linear and quadratic over-parameterization models.
Localization with limited annotation for chest X-rays
Localization of an object within an image is a common task in medical imaging. Learning to localize or detect objects typically requires the collection of data which has been labelled with bounding boxes or similar annotations, which can be very time consuming and expensive. A technique which could perform such learning with much less annotation would, therefore, be quite valuable. We present such a technique for localization with limited annotation, in which the number of images with bounding boxes can be a small fraction of the total dataset (e.g. less than 1%); all other images only possess a whole image label and no bounding box. We propose a novel loss function for tackling this problem; the loss is a continuous relaxation of a well-defined discrete formulation of weakly supervised learning and is numerically well-posed. Furthermore, we propose a new architecture which accounts for both patch dependence and shift-invariance, through the inclusion of CRF layers and anti-aliasing filters, respectively. We apply our technique to the localization of thoracic diseases in chest X-ray images and demonstrate state-of-the-art localization performance on the ChestX-ray14 dataset.
Learning beamforming in ultrasound imaging
Joint learning of Cartesian undersampling and reconstruction for accelerated MRI
Magnetic Resonance Imaging (MRI) is considered today the gold-standard modality for soft tissues. The long acquisition times, however, make it more prone to motion artifacts and contribute to the relatively high cost of the examination. Over the years, multiple studies have concentrated on designing reduced measurement schemes and image reconstruction schemes for MRI; however, these problems have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of the simultaneous learning-based design of the acquisition and reconstruction schemes, manifesting significant improvement in the reconstruction quality under a constrained time budget. Inspired by these successes, in this work, we propose to learn accelerated MR acquisition schemes (in the form of Cartesian trajectories) jointly with the image reconstruction operator. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using the learned Cartesian trajectories at different speed-up rates.
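To make the notion of a Cartesian undersampling scheme concrete, the sketch below keeps only a subset of phase-encode lines and reconstructs by zero-filling. It is a simplified 1D illustration along the phase-encode direction, not the method of the paper: the mask here is hand-picked rather than learned, the reconstruction is a plain inverse transform rather than a trained operator, and the names `cartesian_mask`, `zero_filled_recon`, `keep_every` and `center_halfwidth` are hypothetical.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(values)
    sign = 1 if inverse else -1
    scale = 1 / n if inverse else 1
    return [
        scale * sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t, v in enumerate(values))
        for k in range(n)
    ]


def cartesian_mask(n_lines, keep_every, center_halfwidth):
    """Binary mask over phase-encode lines.

    Keeps a fully sampled low-frequency band around k = 0 (with wrap-around)
    plus every keep_every-th line elsewhere.
    """
    return [
        1 if (min(k, n_lines - k) <= center_halfwidth or k % keep_every == 0) else 0
        for k in range(n_lines)
    ]


def zero_filled_recon(profile, mask):
    """Sample k-space on the mask, set unsampled lines to zero, invert, take magnitude."""
    kspace = dft(profile)
    undersampled = [m * v for m, v in zip(mask, kspace)]
    return [abs(v) for v in dft(undersampled, inverse=True)]


# Hypothetical 1D intensity profile standing in for an image column.
profile = [0, 0, 1, 2, 3, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0]
mask = cartesian_mask(len(profile), keep_every=4, center_halfwidth=1)
recon = zero_filled_recon(profile, mask)

acceleration = len(mask) / sum(mask)
error = max(abs(a - b) for a, b in zip(profile, recon))
print(f"kept {sum(mask)} of {len(mask)} lines, acceleration {acceleration:.2f}")
print(f"max abs error of zero-filled reconstruction: {error:.3f}")
```

In a learned scheme, the choice of which lines to keep would be treated as a trainable parameter and optimized together with the reconstruction; the fixed mask above only shows what is being chosen.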
Self-supervised learning of inverse problem solvers in medical imaging
In the past few years, deep learning-based methods have demonstrated enormous success for solving inverse problems in medical imaging. In this work, we address the following question: Given a set of measurements obtained from real imaging experiments, what is the best way to use a learnable model and the physics of the modality to solve the inverse problem and reconstruct the latent image? Standard supervised learning-based methods approach this problem by collecting data sets of known latent images and their corresponding measurements. However, these methods are often impractical due to the lack of availability of appropriately sized training sets, and, more generally, due to the inherent difficulty in measuring the “groundtruth” latent image. In light of this, we propose a self-supervised approach to training inverse models in medical imaging in the absence of aligned data. Our method only requires access to the measurements and the forward model at training. We showcase its effectiveness on inverse problems arising in accelerated magnetic resonance imaging (MRI).
High frame-rate cardiac ultrasound imaging with deep learning
Cardiac ultrasound imaging requires a high frame rate in order to capture rapid motion. This can be achieved by multi-line acquisition (MLA), where several narrow-focused received lines are obtained from each wide-focused transmitted line. This shortens the acquisition time at the expense of introducing block artifacts. In this paper, we propose a data-driven learning-based approach to improve the MLA image quality. We train an end-to-end convolutional neural network on pairs of real ultrasound cardiac data, acquired through MLA and the corresponding single-line acquisition (SLA). The network achieves a significant improvement in image quality for both 5- and 7-line MLA resulting in a decorrelation measure similar to that of SLA while having the frame rate of MLA.
High quality ultrasonic multi-line transmission through deep learning
Frame rate is a crucial consideration in cardiac ultrasound imaging and 3D sonography. Several methods have been proposed in the medical ultrasound literature aiming at accelerating the image acquisition. In this paper, we consider one such method called multi-line transmission (MLT), in which several evenly separated focused beams are transmitted simultaneously. While MLT reduces the acquisition time, it comes at the expense of a heavy loss of contrast due to the interactions between the beams (cross-talk artifact). In this paper, we introduce a data-driven method to reduce the artifacts arising in MLT. To this end, we propose to train an end-to-end convolutional neural network consisting of correction layers followed by a constant apodization layer. The network is trained on pairs of raw data obtained through MLT and the corresponding single-line transmission (SLT) data. Experimental evaluation demonstrates significant improvement both in the visual image quality and in objective measures such as contrast ratio and contrast-to-noise ratio, while preserving resolution unlike traditional apodization-based methods. We show that the proposed method is able to generalize well across different patients and anatomies on real and phantom data.
Passive electric impedance tomography
We introduce an electric impedance tomography modality without any active current injection. By loading the probe electrodes with a time-varying network of impedances, the proposed technique exploits electrical fields existing in the medium due to biological activity or EM interference from the environment or an implantable device. A phantom validation of the technique is presented.
Printable anisotropic phantom for EEG with distributed current sources
Presented is the phantom mimicking the electromagnetic properties of the human head. The fabrication is based on the additive manufacturing (3d-printing) technology combined with the electrically conductive gel. The novel key features of the phantom are the controllable anisotropic electrical conductivity of the skull and the densely packed actively multiplexed monopolar current sources permitting interpolation of the measured gain function to any dipolar current source position and orientation within the head. The phantom was tested in realistic environment successfully simulating the possible signals from neural activations situated at any depth within the brain as well as EMI and motion artifacts. The proposed design can be readily repeated in any lab having an access to a standard 100 micron precision 3d-printer. The meshes of the phantom are available from the corresponding author.
VibroEEG: Improved EEG source reconstruction by combined acoustic-electric imaging
Electroencephalography (EEG) is a modality for recording electrical neural activity with high temporal but low spatial resolution. Here we propose a novel technique, which we call vibroEEG, that significantly improves the source localization accuracy of EEG. Our method combines electric potential acquisition with acoustic excitation of the vibrational modes of the electrically active cerebral cortex, which periodically displace the sources of the low-frequency neural electrical activity. The sources residing on the maxima of the induced modes will be maximally weighted in the corresponding spectral components of the broadband signals measured on the noninvasive electrodes. In vibroEEG, for the first time, the rich internal geometry of the cerebral cortex can be utilized to separate sources of neural activity lying close in the sense of the Euclidean metric. When the modes are excited locally using phased arrays, the neural activity can essentially be probed at any cortical location. When a single transducer is used to induce the excitations, the EEG gain matrix is still enriched with numerous independent gain vectors, increasing its rank. We show theoretically and in numerical simulation that in both cases the source localization accuracy improves substantially.
Towards CT-quality ultrasound imaging using deep learning
The cost-effectiveness and practical harmlessness of ultrasound imaging have made it one of the most widespread tools for medical diagnosis. Unfortunately, the beam-forming based image formation produces granular speckle noise, blurring, shading and other artifacts. To overcome these effects, the ultimate goal would be to reconstruct the tissue acoustic properties by solving a full wave propagation inverse problem. In this work, we make a step towards this goal, using Multi-Resolution Convolutional Neural Networks (CNN). As a result, we are able to reconstruct CT-quality images from the reflected ultrasound radio-frequency (RF) data obtained by simulation from real CT scans of a human body. We also show that CNN is able to imitate existing computationally heavy despeckling methods, thereby saving orders of magnitude in computations and making them amenable to real-time applications.
White matter fiber representation using continuous dictionary learning
With increasingly sophisticated Diffusion Weighted MRI acquisition methods and modelling techniques, very large sets of streamlines (fibers) are presently generated per imaged brain. These reconstructions of white matter architecture, which are important for human brain research and pre-surgical planning, require a large amount of storage and are often unwieldy and difficult to manipulate and analyze. This work proposes a novel continuous parsimonious framework in which signals are sparsely represented in a dictionary with continuous atoms. The significant innovation in our new methodology is the ability to train such continuous dictionaries, unlike previous approaches that either used pre-fixed continuous transforms or training with finite atoms. This leads to an innovative fiber representation method, which uses Continuous Dictionary Learning to sparsely code each fiber with high accuracy. This method is tested on numerous tractograms produced from the Human Connectome Project data and achieves state-of-the-art performances in compression ratio and reconstruction error.
Boosted metric learning for 3D multi-modal deformable registration
Defining a suitable metric is one of the biggest challenges in deformable image fusion from different modalities. In this paper, we propose a novel approach for multi-modal metric learning in the deformable registration framework that consists of embedding data from both modalities into a common metric space whose metric is used to parametrize the similarity. Specifically, we use image representation in the Fourier/Gabor space which introduces invariance to the local pose parameters, and the Hamming metric as the target embedding space, which allows constructing the embedding using boosted learning algorithms. The resulting metric is incorporated into a discrete optimization framework. Very promising results demonstrate the potential of the proposed method.
Data fusion through cross-modality metric learning using similarity-sensitive hashing
Visual understanding is often based on measuring similarity between observations. Learning similarities specific to a certain perception task from a set of examples has been shown advantageous in various computer vision and pattern recognition problems. In many important applications, the data that one needs to compare come from different representations or modalities, and the similarity between such data operates on objects that may have different and often incommensurable structure and dimensionality. In this paper, we propose a framework for supervised similarity learning based on embedding the input data from two arbitrary spaces into the Hamming space. The mapping is expressed as a binary classification problem with positive and negative examples, and can be efficiently learned using boosting algorithms. The utility and efficiency of such a generic approach is demonstrated on several challenging applications including cross-representation shape retrieval and alignment of multi-modal medical images.
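The core idea of comparing data from two different spaces through a shared Hamming space can be sketched as follows. Each modality gets its own map to a binary code, and similarity is then the Hamming distance between codes. This is an illustration under stated assumptions: the thresholded linear projections and all numeric values below are hypothetical and hand-picked, whereas the paper learns the mappings with boosting.

```python
def binary_code(x, projections, thresholds):
    """Embed a real vector as a bit tuple: bit i is 1 if <p_i, x> > t_i."""
    return tuple(
        1 if sum(p_j * x_j for p_j, x_j in zip(p, x)) > t else 0
        for p, t in zip(projections, thresholds)
    )


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(u != v for u, v in zip(a, b))


# Hypothetical per-modality embeddings: modality A lives in R^2, modality B in R^3,
# yet both are mapped to 3-bit codes that can be compared directly.
proj_a, thr_a = [(1, 0), (0, 1), (1, 1)], [0, 0, 1]
proj_b, thr_b = [(1, 0, 0), (0, 1, 0), (0, 0, 1)], [0, 0, 0.5]

code_a = binary_code((0.5, -0.2), proj_a, thr_a)
code_b_near = binary_code((2.0, -1.0, 0.1), proj_b, thr_b)
code_b_far = binary_code((-1.0, 3.0, 0.9), proj_b, thr_b)

print(code_a, code_b_near, hamming(code_a, code_b_near))
print(code_a, code_b_far, hamming(code_a, code_b_far))
```

Because the two inputs never need to share a dimensionality, only the code length has to match, which is what makes the comparison of incommensurable representations possible.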
Unmixing tissues: sparse component analysis in multi-contrast MRI
We pose the problem of tissue classification in MRI as a blind source separation (BSS) problem and solve it by means of sparse component analysis (SCA). Assuming that most MR images can be sparsely represented, we consider their optimal sparse representation. Sparse components define a physically-meaningful feature space for classification. We demonstrate our approach on simulated and real multi-contrast MRI data. The proposed framework is general in that it is applicable to other modalities of medical imaging as well, whenever the linear mixing model is applicable.
Optimal nonlinear line-of-flight estimation in positron emission tomography
We consider detection of high-energy photons in PET using thick scintillation crystals. Parallax effect and multiple Compton interactions in such crystals significantly reduce the accuracy of conventional detection methods. In order to estimate the photon line of flight based on photomultiplier responses, we use asymptotically optimal nonlinear techniques, implemented by feedforward and radial basis function (RBF) neural networks. Incorporation of information about angles of incidence of photons significantly improves the accuracy of estimation. The proposed estimators are fast enough to perform detection using conventional computers. Monte-Carlo simulation results show that our approach significantly outperforms the conventional Anger algorithm.
Reconstruction in ultrasound diffraction tomography using non-uniform FFT
We show an iterative reconstruction framework for diffraction ultrasound tomography. The use of broadband illumination allows a significant reduction of the number of projections compared to straight ray tomography. The proposed algorithm makes use of the forward nonuniform fast Fourier transform (NUFFT) for iterative Fourier inversion. Incorporation of total variation regularization allows the reduction of noise and Gibbs phenomena while preserving the edges. The complexity of the NUFFT-based reconstruction is comparable to the frequency domain interpolation (gridding) algorithm, whereas the reconstruction accuracy (in the sense of the L2 and the L∞ norms) is better.
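The edge-preserving behaviour attributed to total variation can be seen from the penalty itself. The sketch below computes the anisotropic discrete total variation (sum of absolute differences between horizontal and vertical neighbours) of two small images. It is an illustration only: the paper does not state which discretization it uses, and the function name and test images are hypothetical.

```python
def total_variation(img):
    """Anisotropic discrete TV: sum of |differences| between adjacent pixels."""
    rows, cols = len(img), len(img[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # horizontal neighbour
                tv += abs(img[r][c + 1] - img[r][c])
            if r + 1 < rows:  # vertical neighbour
                tv += abs(img[r + 1][c] - img[r][c])
    return tv


# One clean vertical edge: piecewise constant, so the penalty is small.
edge = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1]]

# Same value range, but oscillating at every pixel, as noise or ringing would.
oscillating = [[0, 1, 0, 1],
               [1, 0, 1, 0],
               [0, 1, 0, 1]]

print(total_variation(edge))         # 3.0
print(total_variation(oscillating))  # 17.0
```

A sharp edge is charged only once per row for its jump, while oscillations are charged at every pixel, so minimizing this term suppresses noise and Gibbs-type ringing without forcing edges to blur.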
Iterative reconstruction in diffraction tomography using non-uniform fast Fourier transform
We show an iterative reconstruction framework for diffraction ultrasound tomography. The use of broadband illumination allows the number of projections to be reduced significantly compared to straight ray tomography. The proposed algorithm makes use of fast forward non-uniform Fourier transform (NUFFT) for iterative Fourier inversion. Incorporation of total variation regularization allows noise and Gibbs phenomena to be reduced whilst preserving the edges.
Optimal nonlinear estimation of photon coordinates in PET
We consider detection of high-energy photons in PET using thick scintillation crystals. Parallax effect and multiple Compton interactions in this type of crystal significantly reduce the accuracy of conventional detection methods. In order to estimate the scintillation point coordinates based on photomultiplier responses, we use asymptotically optimal nonlinear techniques, implemented by feed-forward neural networks, radial basis function (RBF) networks, and neuro-fuzzy systems. Incorporation of information about angles of incidence of photons significantly improves the accuracy of estimation. The proposed estimators are fast enough to perform detection using conventional computers.