Relevant publications
Sanketh Vedula
Seeing Double: Molecular dynamics simulations reveal the stability of certain alternate protein conformations in crystal structures
Proteins jiggle around, adopting ensembles of interchanging conformations. Here we show, through a large-scale analysis of the Protein Data Bank and molecular dynamics simulations, that segments of protein chains can also commonly adopt dual, transiently stable conformations that are not explained by direct interactions. Our analysis highlights how alternate conformations can be maintained as non-interchanging, separated states intrinsic to the protein chain, namely through steric barriers or the adoption of transient secondary structure elements. We further demonstrate that, despite the commonality of the phenomenon, current structural ensemble prediction methods fail to capture these bimodal conformational distributions.
Scalable unsupervised alignment of general metric and non-metric structures
Aligning data from different domains is a fundamental problem in machine learning with broad applications across very different areas, most notably aligning experimental readouts in single-cell multiomics. Mathematically, this problem can be formulated as the minimization of the disagreement between pairwise quantities such as distances, and is related to the Gromov-Hausdorff and Gromov-Wasserstein distances. Computationally, it is a quadratic assignment problem (QAP) that is known to be NP-hard. Prior works attempted to solve the QAP directly with entropic or low-rank regularization on the permutation, which is computationally tractable only for modestly-sized inputs and encodes only limited inductive bias related to the domains being aligned. We consider the alignment of metric structures formulated as a discrete Gromov-Wasserstein problem and, instead of solving the QAP directly, propose to learn a related, highly scalable linear assignment problem (LAP) whose solution is also a minimizer of the QAP. We also show a flexible extension of the proposed framework to general non-metric dissimilarities through differentiable ranks. We extensively evaluate our approach on synthetic and real datasets from single-cell multiomics and neural latent spaces, achieving state-of-the-art performance while being conceptually and computationally simple.
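To make the problem setting concrete, below is a minimal numpy sketch (illustrative, not the paper's implementation) of the pairwise-disagreement objective and of solving an alignment as a linear assignment problem; the hand-crafted LAP cost here is a stand-in for the cost the method learns.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def gw_disagreement(Dx, Dy, perm):
    """Sum of squared differences between pairwise distances under a permutation."""
    return ((Dx - Dy[np.ix_(perm, perm)]) ** 2).sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
perm_true = rng.permutation(10)
Y = X[perm_true]                                    # same metric structure, shuffled
Dx = np.linalg.norm(X[:, None] - X[None], axis=-1)  # pairwise distances, domain 1
Dy = np.linalg.norm(Y[:, None] - Y[None], axis=-1)  # pairwise distances, domain 2

# A hand-crafted LAP cost (stand-in for a learned one): compare each point's
# sorted distance profile, an isometry-invariant signature.
cost = ((np.sort(Dx, axis=1)[:, None, :] - np.sort(Dy, axis=1)[None, :, :]) ** 2).sum(-1)
_, perm = linear_sum_assignment(cost)               # LAP instead of the NP-hard QAP
print(gw_disagreement(Dx, Dy, perm))                # ~0: the alignment is recovered
```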
Vector quantile regression on manifolds
Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multi-variate distributions, QR for multi-variate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate and geological phenomena), tori (dihedral angles in proteins), or Lie groups (attitude in navigation). By leveraging optimal transport theory and c-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets and likelihoods. We demonstrate the approach's efficacy and provide insights regarding the meaning of non-Euclidean quantiles through synthetic and real data experiments.
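For context, the construction rests on a classical optimal-transport fact; in sketch form, in our own notation (an assumption about the paper's exact formulation): by McCann's theorem, for the cost c(u, y) = d²(u, y)/2 on a Riemannian manifold, the optimal transport map from a reference measure is the exponential of the negative gradient of a c-concave potential, and conditioning the potential on features x yields a manifold-valued CVQF.

```latex
\[
  Q_{Y \mid X}(u; x) \;=\; \exp_u\!\big(-\nabla_u\, \varphi(u, x)\big),
  \qquad \varphi(\cdot\,, x)\ \text{$c$-concave},\quad u \sim \mu \ \text{on the manifold } M .
\]
```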
Demystifying graph sparsification algorithms in graph property preservation
Graph sparsification is a technique that approximates a given graph by a sparser graph with a subset of vertices and/or edges. The goal of an effective sparsification algorithm is to maintain specific graph properties relevant to the downstream task while minimizing the graph's size. Graph algorithms often suffer from long execution times due to their irregularity and the large size of real-world graphs. Graph sparsification can greatly reduce the run time of graph algorithms by substituting the full graph with a much smaller sparsified graph, without significantly degrading output quality. However, the interaction between the many available sparsifiers and graph properties has not been widely explored, and the potential of graph sparsification is not fully understood.
In this work, we cover 16 widely-used graph metrics, 12 representative graph sparsification algorithms, and 14 real-world input graphs spanning various categories and exhibiting diverse characteristics, sizes, and densities. We developed a framework to extensively assess the performance of these sparsification algorithms against the graph metrics, and provide insights into the results. Our study shows that no single sparsifier performs best at preserving all graph properties; e.g., sparsifiers that preserve distance-related graph properties (such as eccentricity) struggle to perform well on Graph Neural Networks (GNNs). This paper presents a comprehensive experimental study evaluating the performance of sparsification algorithms in preserving essential graph metrics. The insights inform future research on matching graph sparsifiers to graph algorithms to maximize benefits while minimizing quality degradation. Furthermore, we provide a framework to facilitate the future evaluation of evolving sparsification algorithms, graph metrics, and ever-growing graph data.
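As a flavor of what such an evaluation looks like, here is a toy networkx sketch (not the paper's framework): sparsify a graph with the simplest sparsifier, random edge sampling, and measure the relative error of a few scalar metrics.

```python
import random
import networkx as nx

def random_edge_sparsifier(G, keep_ratio=0.5, seed=0):
    """Keep a uniformly random subset of edges (one of the simplest sparsifiers)."""
    rng = random.Random(seed)
    kept = [e for e in G.edges if rng.random() < keep_ratio]
    H = nx.Graph()
    H.add_nodes_from(G.nodes)
    H.add_edges_from(kept)
    return H

def metric_preservation(G, H, metric):
    """Relative error of a scalar graph metric after sparsification."""
    g, h = metric(G), metric(H)
    return abs(g - h) / max(abs(g), 1e-12)

G = nx.erdos_renyi_graph(200, 0.05, seed=1)
H = random_edge_sparsifier(G, keep_ratio=0.5)
for name, metric in [
    ("average degree", lambda g: 2 * g.number_of_edges() / g.number_of_nodes()),
    ("average clustering", nx.average_clustering),
]:
    print(name, metric_preservation(G, H, metric))
```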
GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference
The high memory bandwidth demand of sparse embedding layers continues to be a critical challenge in scaling the performance of recommendation models. While prior works have exploited heterogeneous memory system designs and partial embedding sum memoization techniques, they offer limited benefits. This is because prior designs either target a very small subset of embeddings to simplify their analysis or incur a high processing cost to account for all embeddings, which does not scale with the large sizes of modern embedding tables. This paper proposes GRACE—a lightweight and scalable graph-based algorithm-system co-design framework to significantly improve the embedding layer performance of recommendation models. GRACE proposes a novel Item Co-occurrence Graph (ICG) that scalably records item co-occurrences. GRACE then presents a new system-aware ICG clustering algorithm to find frequently accessed item combinations of arbitrary lengths and to compute and memoize their partial sums. High-frequency partial sums are stored in a software-managed cache space to reduce memory traffic and improve the throughput of computing sparse features. We further present a cache data layout and low-cost address computation logic to efficiently look up item embeddings and their partial sums. Our evaluation shows that GRACE significantly outperforms the state-of-the-art techniques SPACE and MERCI by 1.5× and 1.4×, respectively.
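The core trick can be illustrated in a few lines of Python (a toy sketch; the ICG construction and system-aware clustering that decide what to memoize are the paper's actual contribution):

```python
import numpy as np

EMB = np.random.default_rng(0).normal(size=(1000, 16))  # embedding table
partial_sum_cache = {}  # frozenset of item ids -> precomputed partial sum

def memoize(combo):
    """Precompute the sum of a frequently co-occurring item combination."""
    partial_sum_cache[frozenset(combo)] = EMB[list(combo)].sum(axis=0)

def embed(items):
    """Sum-pool item embeddings, substituting a cached partial sum when possible."""
    items = set(items)
    for combo, s in partial_sum_cache.items():
        if combo <= items:  # cached combination is contained in this lookup
            return s + EMB[list(items - combo)].sum(axis=0)
    return EMB[list(items)].sum(axis=0)

memoize([3, 7, 42])           # e.g., found by frequency analysis of co-occurrences
out = embed([3, 7, 42, 99])   # one cache hit + one residual embedding lookup
```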
Continuous vector quantile regression
Vector quantile regression (VQR) estimates the conditional vector quantile function (CVQF), a fundamental quantity which fully represents the conditional distribution of Y|X. VQR is formulated as an optimal transport (OT) problem between a uniform U~μ and the target (X,Y)~ν, the solution of which is a unique transport map, co-monotonic with U. Recently, NL-VQR was proposed to support estimation of non-linear CVQFs, together with fast solvers that enabled the use of this tool in practical applications. Despite its utility, the scalability and estimation quality of NL-VQR are limited due to the discretization of the OT problem onto a grid of quantile levels. We propose a novel continuous formulation and parametrization of VQR using partial input-convex neural networks (PICNNs). Our approach allows for accurate, scalable, differentiable and invertible estimation of non-linear CVQFs. We further demonstrate, theoretically and experimentally, how continuous CVQFs can be used for general statistical inference tasks: estimation of likelihoods, CDFs, confidence sets, coverage, sampling, and more. This work is an important step towards unlocking the full potential of VQR.
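The parametrization builds on the known OT characterization of vector quantiles (Carlier, Chernozhukov and Galichon): the CVQF is the u-gradient of a potential convex in u, which is precisely what a PICNN can represent. In sketch form, in our notation:

```latex
\[
  Q_{Y \mid X}(u; x) \;=\; \nabla_u\, \psi_\theta(u, x),
  \qquad \psi_\theta(\cdot\,, x)\ \text{convex in } u,\quad u \sim \mathrm{Unif}\big([0,1]^d\big),
\]
% and convexity of the potential in u guarantees co-monotonicity:
\[
  \big\langle Q_{Y \mid X}(u_1; x) - Q_{Y \mid X}(u_2; x),\; u_1 - u_2 \big\rangle \;\ge\; 0 .
\]
```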
Fast nonlinear vector quantile regression
Quantile regression (QR) is a powerful tool for estimating one or more conditional quantiles of a target variable Y given explanatory features X. A limitation of QR is that it is only defined for scalar target variables, due to the formulation of its objective function, and since the notion of quantiles has no standard definition for multivariate distributions. Recently, vector quantile regression (VQR) was proposed as an extension of QR for high-dimensional target variables, thanks to a meaningful generalization of the notion of quantiles to multivariate distributions. Despite its elegance, VQR is arguably not applicable in practice due to several limitations: (i) it assumes a linear model for the quantiles of the target Y given the features X; (ii) its exact formulation is intractable even for modestly-sized problems in terms of target dimensions, the number of regressed quantile levels, or the number of features, and its relaxed dual formulation may violate the monotonicity of the estimated quantiles; (iii) no fast or scalable solvers for VQR currently exist. In this work we fully address these limitations, namely: (i) We extend VQR to the non-linear case, showing substantial improvement over linear VQR; (ii) We propose vector monotone rearrangement, a method which ensures the estimates obtained by VQR relaxations are monotone functions; (iii) We provide fast, GPU-accelerated solvers for linear and nonlinear VQR which maintain a fixed memory footprint with the number of samples and quantile levels, and demonstrate that they scale to millions of samples and thousands of quantile levels; (iv) We release an optimized Python package of our solvers to promote widespread use of VQR in real-world applications.
Mint: An Accelerator For Mining Temporal Motifs
A variety of complex systems, including social and communication networks, financial markets, biology, and neuroscience, are modeled using temporal graphs that contain a set of nodes and directed timestamped edges. Temporal motifs in temporal graphs generalize subgraph patterns in static graphs in that they also account for edge ordering and time duration, in addition to the graph structure. Mining temporal motifs is a fundamental problem used in several application domains. However, existing software frameworks offer suboptimal performance due to the high algorithmic complexity and irregular memory accesses of temporal motif mining. This paper presents Mint—a novel accelerator architecture and a programming model for mining temporal motifs efficiently. We first divide this workload into three fundamental tasks: search, book-keeping, and backtracking. Based on this, we propose a task-centric programming model that enables decoupled, asynchronous execution. This model unlocks massive opportunities for parallelism and allows storing task context information on-chip. To best utilize the proposed programming model, we design a domain-specific hardware accelerator whose data path and memory subsystem are tailored to the unique workload characteristics of temporal motif mining. To further improve performance, we propose a novel optimization called search index memoization that significantly reduces memory traffic. We comprehensively compare the performance of Mint with state-of-the-art temporal motif mining software frameworks (both approximate and exact) running on both CPU and GPU, and show a 9×–2576× performance benefit.
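To make the search/book-keeping/backtracking decomposition concrete, here is a toy brute-force miner for one temporal motif, a time-ordered triangle (illustrative only; Mint accelerates far more general, and far larger, instances of this search):

```python
from itertools import permutations

# Directed timestamped edges: (src, dst, t)
edges = [(0, 1, 1), (1, 2, 2), (2, 0, 3), (0, 2, 5), (1, 0, 4)]

def temporal_triangles(edges, delta=10):
    """Count triangles a->b, b->c, c->a with strictly increasing timestamps
    and total duration at most delta."""
    count = 0
    for e1, e2, e3 in permutations(edges, 3):          # search
        (a, b, t1), (b2, c, t2), (c2, a2, t3) = e1, e2, e3
        if not (t1 < t2 < t3 and t3 - t1 <= delta):    # ordering + duration
            continue                                   # backtrack
        if b == b2 and c == c2 and a == a2 and len({a, b, c}) == 3:  # structure
            count += 1
    return count

print(temporal_triangles(edges))  # 1: edges (0,1,1), (1,2,2), (2,0,3)
```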
Joint optimization of system design and reconstruction in MIMO radar imaging
Multiple-input multiple-output (MIMO) radar is one of the leading depth sensing modalities. However, the use of multiple receive channels leads to relatively high costs and prevents the penetration of MIMO radars into many areas such as the automotive industry. In recent years, a few studies have concentrated on designing reduced measurement schemes and image reconstruction schemes for MIMO radars; however, these problems have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated the growing success of simultaneous learning-based design of the acquisition and reconstruction schemes, manifesting significant improvement in reconstruction quality. Inspired by these successes, in this work we propose to learn MIMO acquisition parameters, in the form of receive (Rx) antenna element locations, jointly with a neural-network-based image reconstruction. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using our learned acquisition parameters with and without the neural-network reconstruction.
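A hedged PyTorch sketch of the joint-design idea (all names and the toy far-field forward model are our illustration, not the paper's): the Rx element locations are trainable parameters through which reconstruction gradients flow.

```python
import torch
import torch.nn as nn

class JointMIMO(nn.Module):
    def __init__(self, n_rx=8, n_pix=64):
        super().__init__()
        self.rx_pos = nn.Parameter(torch.rand(n_rx))          # learned Rx locations
        self.recon = nn.Sequential(nn.Linear(2 * n_rx, 256),  # toy reconstruction net
                                   nn.ReLU(), nn.Linear(256, n_pix))

    def forward(self, scene, angles):
        # Toy far-field measurement: per-angle phase shifts at each Rx element.
        phase = 2 * torch.pi * self.rx_pos[None, :] * torch.sin(angles)[:, None]
        meas = (scene[:, :, None] * torch.exp(1j * phase)[None]).sum(dim=1)
        feats = torch.cat([meas.real, meas.imag], dim=-1)
        return self.recon(feats)

model = JointMIMO()
scene = torch.rand(4, 32)                     # batch of toy reflectivity profiles
angles = torch.linspace(-1.0, 1.0, 32)
target = torch.rand(4, 64)
loss = nn.functional.mse_loss(model(scene, angles), target)
loss.backward()                               # gradients flow into rx_pos too
```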
PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI
Magnetic Resonance Imaging (MRI) has long been considered to be among the gold standards of diagnostic medical imaging. The long acquisition times, however, render MRI prone to motion artifacts, not to mention their contribution to the relatively high cost of MRI examinations. Over the last few decades, multiple studies have focused on the development of both physical and post-processing methods for accelerated acquisition of MRI scans. These two approaches, however, have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated the growing success of concurrent learning-based design of data acquisition and image reconstruction schemes. Such schemes have already demonstrated substantial effectiveness, leading to considerably shorter acquisition times and improved quality of image reconstruction. Inspired by this initial success, in this work we propose a novel approach to learning optimal schemes for conjoint acquisition and reconstruction of MRI scans, with the optimization carried out simultaneously with respect to the time-efficiency of data acquisition and the quality of the resulting reconstructions. To be of practical value, the schemes are encoded in the form of general k-space trajectories, whose associated magnetic gradients are constrained to obey a set of predefined hardware requirements (defined in terms of, e.g., peak currents and maximum slew rates of magnetic gradients). With this proviso in mind, we propose a novel algorithm for the end-to-end training of a combined acquisition-reconstruction pipeline using a deep neural network with differentiable forward- and back-propagation operators. We also demonstrate the effectiveness of the proposed solution in application to both image reconstruction and image segmentation, reporting substantial improvements in terms of acceleration factors as well as the quality of these end tasks.
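One way hardware constraints can enter trajectory learning is as soft penalties on finite-difference estimates of gradient amplitude and slew rate; a minimal sketch with illustrative limits (not the paper's exact mechanism):

```python
import torch

k = torch.nn.Parameter(torch.randn(512, 2) * 0.1)   # learned 2D k-space trajectory
dt, g_max, s_max = 1e-5, 4e-2, 150.0                # illustrative hardware limits

def feasibility_penalty(k):
    g = torch.diff(k, dim=0) / dt                   # gradient ~ first derivative
    s = torch.diff(g, dim=0) / dt                   # slew rate ~ second derivative
    return (torch.relu(g.abs() - g_max).sum()
            + torch.relu(s.abs() - s_max).sum())

# Would be added to the reconstruction loss during end-to-end training:
penalty = feasibility_penalty(k)
penalty.backward()
```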
Spectral geometric matrix completion
Deep Matrix Factorization (DMF) is an emerging approach to the problem of reconstructing a matrix from a subset of its entries. Recent works have established that gradient descent applied to a DMF model induces an implicit regularization on the rank of the recovered matrix. Despite these promising theoretical results, empirical evaluation of vanilla DMF on real benchmarks exhibits poor reconstructions which we attribute to the extremely low number of samples available. We propose an explicit spectral regularization scheme that is able to make DMF models competitive on real benchmarks, while still maintaining the implicit regularization induced by gradient descent, thus enjoying the best of both worlds.
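For background, vanilla deep matrix factorization fits the observed entries with a product of factors and relies on gradient descent's implicit low-rank bias; a minimal sketch (the paper's contribution, the explicit spectral regularizer, is omitted here):

```python
import torch

n, m, hidden = 100, 80, 100
W = [torch.nn.Parameter(torch.randn(a, b) * 1e-2)
     for a, b in [(n, hidden), (hidden, hidden), (hidden, m)]]

M = torch.randn(n, 5) @ torch.randn(5, m)           # ground-truth low-rank matrix
mask = torch.rand(n, m) < 0.3                       # observed entries

opt = torch.optim.Adam(W, lr=1e-3)
for _ in range(2000):
    X = W[0] @ W[1] @ W[2]                          # deep (3-layer) factorization
    loss = ((X - M)[mask] ** 2).mean()              # fit observed entries only
    opt.zero_grad(); loss.backward(); opt.step()
```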
3D FLAT: Feasible Learned Acquisition Trajectories for Accelerated MRI
Magnetic Resonance Imaging (MRI) has long been considered to be among the gold standards of today's diagnostic imaging. The most significant drawback of MRI is its long acquisition times, prohibiting its use in standard practice for some applications. Compressed sensing (CS) proposes to subsample the k-space (the Fourier domain dual to the physical space of spatial coordinates), leading to significantly accelerated acquisition. However, the benefit of compressed sensing has not been fully exploited; most of the sampling densities obtained through CS do not produce a trajectory that obeys the stringent constraints imposed in practice by the MRI machine. Inspired by the recent success of deep learning-based approaches for image reconstruction and by ideas from computational imaging on learning-based design of imaging systems, we introduce 3D FLAT, a novel protocol for data-driven design of 3D non-Cartesian accelerated trajectories in MRI. Our proposal leverages the entire 3D k-space to simultaneously learn a physically feasible acquisition trajectory together with a reconstruction method. Experimental results, performed as a proof-of-concept, suggest that 3D FLAT achieves higher image quality for a given readout time compared to standard trajectories such as radial, stack-of-stars, or 2D learned trajectories (trajectories that evolve only in a 2D plane while fully sampling along the third dimension). Furthermore, we demonstrate evidence supporting the significant benefit of performing MRI acquisitions using non-Cartesian 3D trajectories over 2D non-Cartesian trajectories acquired slice-wise.
Towards learned optimal q-space sampling in diffusion MRI
Fiber tractography is an important tool of computational neuroscience that enables reconstructing the spatial connectivity and organization of the white matter of the brain. Fiber tractography takes advantage of diffusion Magnetic Resonance Imaging (dMRI), which allows measuring the apparent diffusivity of cerebral water along different spatial directions. Unfortunately, collecting such data comes at the price of reduced spatial resolution and substantially elevated acquisition times, which limits the clinical applicability of dMRI. This problem has thus far been addressed using two principal strategies. Most of the efforts have been devoted to improving the quality of signal estimation for a fixed sampling scheme (defined through the choice of diffusion encoding gradients). On the other hand, optimization over the sampling scheme has also proven to be effective. Inspired by these results, the present work consolidates the above strategies into a unified estimation framework in which the optimization is carried out with respect to both the estimation model and the sampling design concurrently. The proposed solution offers substantial improvements in the quality of signal estimation as well as in the accuracy of the ensuing analysis by means of fiber tractography. While proving the optimality of the learned estimation models would likely require more extensive evaluation, we nevertheless claim that the learned sampling schemes can be of immediate use, offering a way to improve dMRI analysis without the necessity of deploying the neural network used for their estimation. We present a comprehensive comparative analysis based on Human Connectome Project data.
Joint learning of Cartesian undersampling and reconstruction for accelerated MRI
Magnetic Resonance Imaging (MRI) is considered today the gold-standard modality for soft tissues. The long acquisition times, however, make it more prone to motion artifacts and contribute to the relatively high cost of this examination. Over the years, multiple studies have concentrated on designing reduced measurement schemes and image reconstruction schemes for MRI; however, these problems have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated the growing success of simultaneous learning-based design of the acquisition and reconstruction schemes, manifesting significant improvement in reconstruction quality within a constrained time budget. Inspired by these successes, in this work we propose to learn accelerated MR acquisition schemes (in the form of Cartesian trajectories) jointly with the image reconstruction operator. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using the learned Cartesian trajectories at different acceleration rates.
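A minimal sketch of one way to make a Cartesian scheme learnable (our illustration under stated assumptions, not necessarily the paper's exact relaxation): keep a trainable score per phase-encoding line and relax the binary keep/drop decision so gradients reach the acquisition parameters.

```python
import torch

scores = torch.nn.Parameter(torch.zeros(256))       # one score per k-space line

def soft_mask(scores, temperature=0.1):
    return torch.sigmoid(scores / temperature)      # ~binary for small temperature

kspace = torch.randn(256, 256, dtype=torch.complex64)
masked = kspace * soft_mask(scores)[:, None]        # undersampled measurement
recon = torch.fft.ifft2(masked)                     # zero-filled reconstruction
loss = recon.abs().mean()                           # placeholder for a real loss
loss.backward()                                     # gradients reach `scores`
```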
Deep matrix factorization with spectral geometric regularization
We address the problem of reconstructing a matrix from a subset of its entries. Current methods, branded as geometric matrix completion, augment classical rank regularization techniques by incorporating geometric information into the solution. This information is usually provided as graphs encoding relations between rows/columns. In this work, we propose a simple spectral approach for solving the matrix completion problem, via the framework of functional maps. We introduce the zoomout loss, a multiresolution spectral geometric loss inspired by recent advances in shape correspondence, whose minimization leads to state-of-the-art results on various recommender systems datasets. Surprisingly, for some datasets, we were able to achieve comparable results even without incorporating geometric information. This puts into question both the quality of such information and current methods’ ability to use it in a meaningful and efficient way.
Code is available as a Google Colab notebook or via https://github.com/amitboy/SGMC
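The spectral parametrization at the heart of such approaches can be sketched in a few lines (illustrative; the multiresolution zoomout loss itself is omitted): the recovered matrix is expanded in the leading Laplacian eigenvectors of the row/column graphs, and only the small coefficient matrix is optimized.

```python
import numpy as np
import networkx as nx

def laplacian_basis(G, k):
    L = nx.laplacian_matrix(G).toarray().astype(float)
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, :k]                              # k smoothest eigenvectors

Gr, Gc = nx.path_graph(40), nx.path_graph(30)       # toy row/column graphs
Phi, Psi = laplacian_basis(Gr, 8), laplacian_basis(Gc, 8)
C = np.random.randn(8, 8) * 0.01                    # the only free variable
X = Phi @ C @ Psi.T                                 # matrix smooth on both graphs
```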
Learning beamforming in ultrasound imaging
Self-supervised learning of inverse problem solvers in medical imaging
In the past few years, deep learning-based methods have demonstrated enormous success in solving inverse problems in medical imaging. In this work, we address the following question: given a set of measurements obtained from real imaging experiments, what is the best way to use a learnable model and the physics of the modality to solve the inverse problem and reconstruct the latent image? Standard supervised learning approaches tackle this problem by collecting datasets of known latent images and their corresponding measurements. However, these methods are often impractical due to the lack of availability of appropriately sized training sets and, more generally, due to the inherent difficulty of measuring the "ground-truth" latent image. In light of this, we propose a self-supervised approach to training inverse models in medical imaging in the absence of aligned data. Our method requires access only to the measurements and the forward model at training time. We showcase its effectiveness on inverse problems arising in accelerated magnetic resonance imaging (MRI).
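In sketch form, the principle is a measurement-consistency loss: with only measurements y and a known forward operator A, train f_θ so that A f_θ(y) reproduces y. A toy PyTorch version for masked-Fourier (MRI-like) measurements, with all shapes and the tiny network being our illustration:

```python
import torch

mask = (torch.rand(128, 128) < 0.3)                 # known undersampling pattern
A = lambda img: torch.fft.fft2(img) * mask          # forward model: FFT + mask

f = torch.nn.Sequential(torch.nn.Conv2d(2, 16, 3, padding=1), torch.nn.ReLU(),
                        torch.nn.Conv2d(16, 1, 3, padding=1))

y = torch.fft.fft2(torch.randn(1, 1, 128, 128)) * mask      # measurements only
zf = torch.fft.ifft2(y)                                     # zero-filled input image
x = f(torch.cat([zf.real, zf.imag], dim=1))                 # reconstructed image
loss = (A(x.to(torch.complex64)) - y).abs().pow(2).mean()   # consistency in k-space
loss.backward()
```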
High frame-rate cardiac ultrasound imaging with deep learning
Cardiac ultrasound imaging requires a high frame rate in order to capture rapid motion. This can be achieved by multi-line acquisition (MLA), where several narrow-focused received lines are obtained from each wide-focused transmitted line. This shortens the acquisition time at the expense of introducing block artifacts. In this paper, we propose a data-driven learning-based approach to improve the MLA image quality. We train an end-to-end convolutional neural network on pairs of real ultrasound cardiac data, acquired through MLA and the corresponding single-line acquisition (SLA). The network achieves a significant improvement in image quality for both 5- and 7-line MLA resulting in a decorrelation measure similar to that of SLA while having the frame rate of MLA.
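The training setup reduces to standard paired image-to-image regression; a toy sketch (architecture and tensors are placeholders, not the paper's network):

```python
import torch

net = torch.nn.Sequential(torch.nn.Conv2d(1, 32, 3, padding=1), torch.nn.ReLU(),
                          torch.nn.Conv2d(32, 32, 3, padding=1), torch.nn.ReLU(),
                          torch.nn.Conv2d(32, 1, 3, padding=1))

mla = torch.randn(8, 1, 128, 128)   # stand-in for MLA acquisitions
sla = torch.randn(8, 1, 128, 128)   # stand-in for paired SLA ground truth
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
loss = torch.nn.functional.l1_loss(net(mla), sla)
opt.zero_grad(); loss.backward(); opt.step()
```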
High quality ultrasonic multi-line transmission through deep learning
Frame rate is a crucial consideration in cardiac ultrasound imaging and 3D sonography. Several methods have been proposed in the medical ultrasound literature aiming at accelerating image acquisition. In this paper, we consider one such method called multi-line transmission (MLT), in which several evenly separated focused beams are transmitted simultaneously. While MLT reduces the acquisition time, it comes at the expense of a heavy loss of contrast due to the interactions between the beams (cross-talk artifact). In this paper, we introduce a data-driven method to reduce the artifacts arising in MLT. To this end, we propose to train an end-to-end convolutional neural network consisting of correction layers followed by a constant apodization layer. The network is trained on pairs of raw data obtained through MLT and the corresponding single-line transmission (SLT) data. Experimental evaluation demonstrates significant improvement both in visual image quality and in objective measures such as contrast ratio and contrast-to-noise ratio, while preserving resolution, unlike traditional apodization-based methods. We show that the proposed method generalizes well across different patients and anatomies on real and phantom data.
Towards CT-quality ultrasound imaging using deep learning
The cost-effectiveness and practical harmlessness of ultrasound imaging have made it one of the most widespread tools for medical diagnosis. Unfortunately, the beam-forming based image formation produces granular speckle noise, blurring, shading and other artifacts. To overcome these effects, the ultimate goal would be to reconstruct the tissue acoustic properties by solving a full wave propagation inverse problem. In this work, we make a step towards this goal, using Multi-Resolution Convolutional Neural Networks (CNN). As a result, we are able to reconstruct CT-quality images from the reflected ultrasound radio-frequency (RF) data obtained by simulation from real CT scans of a human body. We also show that the CNN is able to imitate existing computationally heavy despeckling methods, thereby saving orders of magnitude in computations and making them amenable to real-time applications.