Publications

    Y. Elul, E. Rozenberg, A. Boyarski, Y. Yaniv, A. Schuster, A. M. Bronstein, Data-driven modeling of interrelated dynamical systems, Nature Communications Physics (7), 144, 2024

    Non-linear dynamical systems describe numerous real-world phenomena, ranging from the weather to financial markets and disease progression. Individual systems may share substantial common information, for example patients’ anatomy. Lately, deep learning has emerged as a leading method for data-driven modeling of non-linear dynamical systems. Yet, despite recent breakthroughs, prior works largely ignored the existence of shared information between different systems. However, such cases are quite common, for example, in medicine: we may wish to have a patient-specific model for some disease, but the data collected from a single patient is usually too limited to train a deep-learning model. Hence, we must properly utilize data gathered from other patients. Here, we explicitly consider such cases by jointly modeling multiple systems. We show that the current single-system models consistently fail when trying to learn simultaneously from multiple systems. We suggest a framework for jointly approximating the Koopman operators of multiple systems, while intrinsically exploiting common information. We demonstrate how we can adapt to a new system using an order of magnitude less new data and show the superiority of our model over competing methods, in terms of both forecasting ability and statistical fidelity, across chaotic, cardiac, and climate systems.

    O. Wengrowicz, A. M. Bronstein, O. Cohen, Unsupervised physics-informed deep learning-based reconstruction for time-resolved imaging by multiplexed ptychography, Optics Express 32(6), pp. 8791-8803, 2024

    We explore numerically an unsupervised, physics-informed, deep learning-based reconstruction technique for time-resolved imaging by multiplexed ptychography. In our method, the untrained deep learning model replaces the iterative algorithm’s update step, yielding superior reconstructions of multiple dynamic object frames compared to conventional methodologies. More precisely, we demonstrate improvements in image quality and resolution, while reducing sensitivity to the number of recorded frames, the mutual orthogonality of different probe modes, overlap between neighboring probe beams and the cutoff frequency of the ptychographic microscope – properties that are generally of paramount importance for ptychographic reconstruction algorithms.

    G. Serussi, T. Shor, T. Hirshberg, C. Baskin, A. M. Bronstein, Active propulsion noise shaping for multi-rotor aircraft localization, arXiv:2402.17289, 2023

    Multi-rotor aerial autonomous vehicles (MAVs) primarily rely on vision for navigation purposes. However, visual localization and odometry techniques suffer from poor performance in low or direct sunlight, a limited field of view, and vulnerability to occlusions. Acoustic sensing can serve as a complementary or even alternative modality to vision in many situations, and it also has the added benefits of lower system cost and energy footprint, which is especially important for micro aircraft. This paper proposes actively controlling and shaping the aircraft propulsion noise generated by the rotors to benefit localization tasks, rather than considering it a harmful nuisance. We present a neural network architecture for self-noise-based localization in a known environment. We show that training it simultaneously with learning time-varying rotor phase modulation achieves accurate and robust localization. The proposed methods are evaluated using a computationally affordable simulation of MAV rotor noise in 2D acoustic environments that is fitted to real recordings of rotor pressure fields.

    M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein, Vector quantile regression on manifolds, Proc. AIStats, 2024

    Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multi-variate distributions, QR for multi-variate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate and geological phenomena), and tori (dihedral angles in proteins). By leveraging optimal transport theory and c-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets and likelihoods. We demonstrate the approach’s efficacy and provide insights regarding the meaning of non-Euclidean quantiles through synthetic and real data experiments.

    Y. Chen, H. Ye, S. Vedula, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati, Demystifying graph sparsification algorithms in graph properties preservation, Proc. Int'l Conf. on Very Large Databases (VLDB), 2024

    Graph sparsification is a technique that approximates a given graph by a sparse graph with a subset of vertices and/or edges. The goal of an effective sparsification algorithm is to maintain specific graph properties relevant to the downstream task while minimizing the graph’s size. Graph algorithms often suffer from long execution time due to the irregularity and the large real-world graph size. Graph sparsification can be applied to greatly reduce the run time of graph algorithms by substituting the full graph with a much smaller sparsified graph, without significantly degrading the output quality. However, the interaction between numerous sparsifiers and graph properties is not widely explored, and the potential of graph sparsification is not fully understood.
    In this work, we cover 16 widely-used graph metrics, 12 representative graph sparsification algorithms, and 14 real-world input graphs spanning various categories, exhibiting diverse characteristics, sizes, and densities. We developed a framework to extensively assess the performance of these sparsification algorithms against graph metrics, and provide insights into the results. Our study shows that no single sparsifier performs best in preserving all graph properties; for example, sparsifiers that preserve distance-related graph properties (eccentricity) struggle to perform well on Graph Neural Networks (GNN). This paper presents a comprehensive experimental study evaluating the performance of sparsification algorithms in preserving essential graph metrics. The insights inform future research on matching graph sparsification to graph algorithms to maximize benefits while minimizing quality degradation. Furthermore, we provide a framework to facilitate the future evaluation of evolving sparsification algorithms, graph metrics, and ever-growing graph data.
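    As a rough illustration of the kind of experiment the abstract describes, the sketch below pairs one very simple sparsifier (uniform random edge sampling) with one simple property check (how well vertex degrees are preserved after rescaling). The function names, the toy graph, and the choice of sparsifier and metric are assumptions made for illustration; they are not taken from the paper's framework.

```python
import random


def random_edge_sparsify(edges, keep_fraction, seed=0):
    """Keep a uniformly random subset of the edges (a baseline sparsifier)."""
    rng = random.Random(seed)
    k = round(keep_fraction * len(edges))
    return rng.sample(edges, k)


def degrees(num_vertices, edges):
    """Degree of every vertex in an undirected edge list."""
    deg = [0] * num_vertices
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def mean_abs_degree_error(num_vertices, full_edges, sparse_edges, keep_fraction):
    """Mean gap between sparsified degrees and full degrees scaled by keep_fraction."""
    full = degrees(num_vertices, full_edges)
    sparse = degrees(num_vertices, sparse_edges)
    return sum(abs(s - keep_fraction * f) for f, s in zip(full, sparse)) / num_vertices


# Toy graph (hypothetical): a 6-cycle with two chords.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3), (1, 4)]
sparse = random_edge_sparsify(edges, keep_fraction=0.5, seed=1)
error = mean_abs_degree_error(6, edges, sparse, keep_fraction=0.5)
```

    Swapping in a different sparsifier or metric only requires replacing one of the two functions, which is the sense in which such a comparison can be run across many sparsifier and metric pairs.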

    A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein, An amino-domino model described by a cross-peptide-bond Ramachandran plot defines amino acid pairs as local structural units, Proc. US National Academy of Sciences (PNAS), 2023

    Protein structure, both at the global and local level, dictates function. Proteins fold from chains of amino acids, forming secondary structures, α-helices and β-strands, that, at least for globular proteins, subsequently fold into a three-dimensional structure. Here, we show that a Ramachandran-type plot focusing on the two dihedral angles separated by the peptide bond, and entirely contained within an amino acid pair, defines a local structural unit. We further demonstrate the usefulness of this cross-peptide-bond Ramachandran plot by showing that it captures β-turn conformations in coil regions, that traditional Ramachandran plot outliers fall into occupied regions of our plot, and that thermophilic proteins prefer specific amino acid pair conformations. Further, we demonstrate experimentally that the effect of a point mutation on backbone conformation and protein stability depends on the amino acid pair context, i.e., the identity of the adjacent amino acid, in a manner predictable by our method.

    T. Weiss, L. Cosmo, E. Mayo Yanes, S. Chakraborty, A. M. Bronstein, R. Gershoni-Poranne, Guided diffusion for inverse molecular design, Nature Computational Science 3(10), 873–882, 2023

    The holy grail of materials science is de novo molecular design — i.e., the ability to engineer molecules with desired characteristics. Recently, this goal has become increasingly achievable thanks to developments such as equivariant graph neural networks that can better predict molecular properties, and to the improved performance of generation tasks, in particular of conditional generation, in text-to-image generators and large language models. Herein, we introduce GaUDI, a guided diffusion model for inverse molecular design, which combines these advances and can generate novel molecules with desired properties. GaUDI decouples the generator and the property-predicting models and can be guided using both point-wise targets and open-ended targets (e.g., minimum/maximum). We demonstrate GaUDI’s effectiveness using single- and multiple-objective tasks applied to newly-generated data sets of polycyclic aromatic systems, achieving nearly 100% validity of generated molecules. Further, for some tasks, GaUDI discovers better molecules than those present in our data set of 475k molecules.

    E. Schwartz, A. M. Bronstein, R. Giryes, ISP distillation, IEEE Open Journal of Signal Processing 4, 12-20, 2023

    Nowadays, many of the images captured are ‘observed’ by machines only and not by humans, e.g., in autonomous systems. High-level machine vision models, such as object recognition or semantic segmentation, assume images are transformed into some canonical image space by the camera Image Signal Processor (ISP). However, the camera ISP is optimized for producing visually pleasing images for human observers and not for machines. Therefore, one may spare the ISP compute time and apply vision models directly to RAW images. Yet, it has been shown that training such models directly on RAW images results in a performance drop. To mitigate this drop, we use a dataset of paired RAW and RGB images, which can be easily acquired with no human labeling. We then train a model that is applied directly to the RAW data by using knowledge distillation, such that the model predictions for RAW images are aligned with the predictions of an off-the-shelf pre-trained model for processed RGB images. Our experiments show that our performance on RAW images for object classification and semantic segmentation is significantly better than that of models trained on labeled RAW images. It also reasonably matches the predictions of a pre-trained model on processed RGB images, while saving the ISP compute overhead.
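    The alignment objective can be sketched as a divergence between the teacher's prediction on the RGB image and the student's prediction on the paired RAW image. The abstract does not state which loss is used, so the KL divergence below, and the names `softmax` and `distillation_loss`, are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits):
    """KL(teacher || student) between the two predicted class distributions.

    teacher_logits: output of the pre-trained model on the processed RGB image.
    student_logits: output of the RAW-input model on the paired RAW image.
    """
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits for a single RAW/RGB pair with three classes.
loss_matched = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
loss_mismatched = distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

    No class labels appear anywhere in the loss, which is why paired RAW and RGB captures are enough to train the student.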

    T. Blau, R. Ganz, C. Baskin, M. Elad, A. M. Bronstein, Classifier robustness enhancement via test-time transformation, arXiv:2303.15409, 2023

    It has been recently discovered that adversarially trained classifiers exhibit an intriguing property, referred to as perceptually aligned gradients (PAG). PAG implies that the gradients of such classifiers possess a meaningful structure, aligned with human perception. Adversarial training is currently the best-known way to achieve classification robustness under adversarial attacks. The PAG property, however, has yet to be leveraged for further improving classifier robustness. In this work, we introduce Classifier Robustness Enhancement Via Test-Time Transformation (TETRA) — a novel defense method that utilizes PAG, enhancing the performance of trained robust classifiers. Our method operates in two phases. First, it modifies the input image via a designated targeted adversarial attack into each of the dataset’s classes. Then, it classifies the input image based on the distance to each of the modified instances, with the assumption that the shortest distance relates to the true class. We show that the proposed method achieves state-of-the-art results and validate our claim through extensive experiments on a variety of defense methods, classifier architectures, and datasets. We also empirically demonstrate that TETRA can boost the accuracy of any differentiable adversarial training classifier across a variety of attacks, including ones unseen at training. Specifically, applying TETRA leads to substantial improvement of up to +23%, +20%, and +26% on CIFAR10, CIFAR100, and ImageNet, respectively.
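    The two phases described above can be sketched on a toy model. The example below assumes a linear softmax classifier, plain gradient steps as the targeted attack, and Euclidean distance; the step count, step size, weights, and function names are all hypothetical, and a toy linear model has none of the perceptually aligned gradient structure that the method relies on in practice.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_of(W, b, x):
    """Logits of a linear classifier: W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def targeted_attack(W, b, x, target, steps=10, lr=0.1):
    """Phase 1: gradient steps on the input that lower the loss toward `target`."""
    x = list(x)
    for _ in range(steps):
        p = softmax(logits_of(W, b, x))
        # For a linear softmax model, d(-log p_target)/dx = W^T (p - onehot(target)).
        residual = [pk - (1.0 if k == target else 0.0) for k, pk in enumerate(p)]
        grad = [sum(residual[k] * W[k][j] for k in range(len(W))) for j in range(len(x))]
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    return x


def tetra_style_predict(W, b, x, steps=10, lr=0.1):
    """Phase 2: pick the class whose targeted modification stays closest to x."""
    distances = [
        math.dist(x, targeted_attack(W, b, x, target, steps, lr))
        for target in range(len(W))
    ]
    return min(range(len(W)), key=distances.__getitem__)


# Hypothetical 3-class classifier on 2-D inputs.
W = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0, 0.0]
prediction = tetra_style_predict(W, b, [2.0, 0.0])
```

    For this input the classifier already favors class 0, so the attack toward class 0 barely moves the point while the attacks toward the other classes travel much further, and the shortest-distance rule returns class 0.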

    E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie, Designing nonlinear photonic crystals for high-dimensional quantum state engineering, ICLR Workshop on Machine Learning for Materials, 2023

    We propose a novel, physically-constrained and differentiable approach for the generation of D-dimensional qudit states via spontaneous parametric downconversion (SPDC) in quantum optics. We circumvent any limitations imposed by the inherently stochastic nature of the physical process and incorporate a set of stochastic dynamical equations governing its evolution under the SPDC Hamiltonian. We demonstrate the effectiveness of our model through the design of structured nonlinear photonic crystals (NLPCs) and shaped pump beams; and show, theoretically and experimentally, how to generate maximally entangled states in the spatial degree of freedom. The learning of NLPC structures offers a promising new avenue for shaping and controlling arbitrary quantum states and enables all-optical coherent control of the generated states. We believe that this approach can readily be extended from bulky crystals to thin metasurfaces and potentially applied to other quantum systems sharing a similar Hamiltonian structure, such as superfluids and superconductors.

    E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie, A machine learning approach to generate quantum light, ICLR Workshop on Physics for Machine Learning, 2023

    Spontaneous parametric down-conversion (SPDC) is a key technique in quantum optics used to generate entangled photon pairs. However, generating a desirable D-dimensional qudit state in the SPDC process remains a challenge. In this paper, we introduce a physically-constrained and differentiable model to overcome this challenge, and demonstrate its effectiveness through the design of shaped pump beams and structured nonlinear photonic crystals. We avoid any restrictions induced by the stochastic nature of our physical process and integrate a set of stochastic dynamical equations governing its evolution under the SPDC Hamiltonian. Our model is capable of learning the relevant interaction parameters and designing nonlinear quantum optical systems that achieve desired quantum states. We show, theoretically and experimentally, how to generate maximally entangled states in the spatial degree of freedom. Additionally, we demonstrate all-optical coherent control of the generated state by reshaping the pump beam. Our work has potential applications in high-dimensional quantum key distribution and quantum information processing.

    H. Ye, S. Vedula, Y. Chen, Y. Yang, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati, GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference, Proc. ACM Int'l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023

    The high memory bandwidth demand of sparse embedding layers continues to be a critical challenge in scaling the performance of recommendation models. While prior works have exploited heterogeneous memory system designs and partial embedding sum memoization techniques, they offer limited benefits. This is because prior designs either target a very small subset of embeddings to simplify their analysis or incur a high processing cost to account for all embeddings, which does not scale with the large sizes of modern embedding tables. This paper proposes GRACE, a lightweight and scalable graph-based algorithm-system co-design framework to significantly improve the embedding layer performance of recommendation models. GRACE proposes a novel Item Co-occurrence Graph (ICG) that scalably records item co-occurrences. GRACE then presents a new system-aware ICG clustering algorithm to find frequently accessed item combinations of arbitrary lengths to compute and memoize their partial sums. High-frequency partial sums are stored in a software-managed cache space to reduce memory traffic and improve the throughput of computing sparse features. We further present a cache data layout and low-cost address computation logic to efficiently look up item embeddings and their partial sums. Our evaluation shows that GRACE significantly outperforms the state-of-the-art techniques SPACE and MERCI by 1.5× and 1.4×, respectively.
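    A much-simplified sketch of the idea: count how often pairs of items are looked up together, memoize the embedding sums of the frequent pairs, and reuse them when pooling. It is restricted to pairs and uses a plain threshold instead of the system-aware clustering over arbitrary-length combinations that the abstract describes; all names, the threshold, and the toy data are assumptions, and each lookup is assumed to list distinct item IDs.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(queries):
    """Edge weights count how often two item IDs appear in the same lookup."""
    graph = Counter()
    for items in queries:
        for pair in combinations(sorted(set(items)), 2):
            graph[pair] += 1
    return graph


def memoize_partial_sums(graph, table, min_count):
    """Precompute embedding sums for pairs seen at least `min_count` times."""
    cache = {}
    for (a, b), count in graph.items():
        if count >= min_count:
            cache[(a, b)] = [x + y for x, y in zip(table[a], table[b])]
    return cache


def pooled_embedding(items, table, cache):
    """Sum the embeddings of `items`, reusing cached pair sums where possible."""
    remaining = sorted(set(items))
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    used = set()
    for pair in combinations(remaining, 2):
        # Use a cached pair only if neither item is already covered.
        if pair in cache and not (used & set(pair)):
            total = [t + c for t, c in zip(total, cache[pair])]
            used.update(pair)
    for item in remaining:
        if item not in used:
            total = [t + e for t, e in zip(total, table[item])]
    return total


# Hypothetical embedding table and lookup trace.
table = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [2.0, 2.0]}
queries = [[0, 1], [0, 1, 2], [1, 2], [0, 1]]
graph = build_cooccurrence_graph(queries)
cache = memoize_partial_sums(graph, table, min_count=3)
pooled = pooled_embedding([0, 1, 2], table, cache)
```

    Here the pair (0, 1) co-occurs three times and is cached, so pooling items 0, 1, and 2 reads one cached sum and one embedding instead of three embeddings.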

    S. Vedula, I. Tallini, A. A. Rosenberg, M. Pegoraro, E. Rodolà, Y. Romano, A. M. Bronstein, Continuous vector quantile regression, Proc. ICML Workshop Frontiers4LCD, 2023

    Vector quantile regression (VQR) estimates the conditional vector quantile function (CVQF), a fundamental quantity which fully represents the conditional distribution of Y|X. VQR is formulated as an optimal transport (OT) problem between a uniform U~μ and the target (X,Y)~ν, the solution of which is a unique transport map, co-monotonic with U. Recently, NL-VQR has been proposed to estimate non-linear CVQFs, together with fast solvers which enabled the use of this tool in practical applications. Despite its utility, the scalability and estimation quality of NL-VQR are limited due to a discretization of the OT problem onto a grid of quantile levels. We propose a novel continuous formulation and parametrization of VQR using partial input-convex neural networks (PICNNs). Our approach allows for accurate, scalable, differentiable and invertible estimation of non-linear CVQFs. We further demonstrate, theoretically and experimentally, how continuous CVQFs can be used for general statistical inference tasks: estimation of likelihoods, CDFs, confidence sets, coverage, sampling, and more. This work is an important step towards unlocking the full potential of VQR.

    M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein, Vector quantile regression on manifolds, ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems, 2023

    Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multi-variate distributions, QR for multi-variate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate measurements), tori (dihedral angles in proteins), or Lie groups (attitude in navigation). By leveraging optimal transport theory and the notion of c-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets. We demonstrate the approach’s efficacy and provide insights regarding the meaning of non-Euclidean quantiles through preliminary synthetic data experiments.

    T. Weiss, A. Wahab, A. M. Bronstein, R. Gershoni-Poranne, Interpretable deep learning unveils structure-property relationships in polybenzenoid hydrocarbons, Journal of Organic Chemistry, 2023

    In this work, interpretable deep learning was used to identify structure-property relationships governing the HOMO-LUMO gap and relative stability of polybenzenoid hydrocarbons (PBHs). To this end, a ring-based graph representation was used. In addition to affording reduced training times and excellent predictive ability, this representation could be combined with a subunit-based perception of PBHs, allowing chemical insights to be presented in terms of intuitive and simple structural motifs. The resulting insights agree with conventional organic chemistry knowledge and electronic structure-based analyses, and also reveal new behaviors and identify influential structural motifs. In particular, we evaluated and compared the effects of linear, angular, and branching motifs on these two molecular properties, as well as explored the role of dispersion in mitigating torsional strain inherent in non-planar PBHs. Hence, the observed regularities and the proposed analysis contribute to a deeper understanding of the behavior of PBHs and form the foundation for design strategies for new functional PBHs.

    A. A. Rosenberg, S. Vedula, Y. Romano, A. M. Bronstein, Fast nonlinear vector quantile regression, Proc. ICML, 2023

    Quantile regression (QR) is a powerful tool for estimating one or more conditional quantiles of a target variable Y given explanatory features X. A limitation of QR is that it is only defined for scalar target variables, due to the formulation of its objective function, and since the notion of quantiles has no standard definition for multivariate distributions. Recently, vector quantile regression (VQR) was proposed as an extension of QR for high-dimensional target variables, thanks to a meaningful generalization of the notion of quantiles to multivariate distributions. Despite its elegance, VQR is arguably not applicable in practice due to several limitations: (i) it assumes a linear model for the quantiles of the target Y given the features X; (ii) its exact formulation is intractable even for modestly-sized problems in terms of target dimensions, the number of regressed quantile levels, or the number of features, and its relaxed dual formulation may violate the monotonicity of the estimated quantiles; (iii) no fast or scalable solvers for VQR currently exist. In this work we fully address these limitations, namely: (i) We extend VQR to the non-linear case, showing substantial improvement over linear VQR; (ii) We propose vector monotone rearrangement, a method which ensures the estimates obtained by VQR relaxations are monotone functions; (iii) We provide fast, GPU-accelerated solvers for linear and nonlinear VQR which maintain a fixed memory footprint with the number of samples and quantile levels, and demonstrate that they scale to millions of samples and thousands of quantile levels; (iv) We release an optimized Python package of our solvers to encourage widespread use of VQR in real-world applications.
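    For context, the scalar objective that the abstract refers to is the standard pinball (check) loss, whose minimizer over a sample is an empirical quantile. The sketch below covers only this classical scalar, unconditional case, not VQR or the solvers from the paper; the function names and sample values are illustrative assumptions.

```python
def pinball_loss(y, q, tau):
    """Pinball loss of predicting quantile value q for observation y at level tau."""
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def empirical_quantile(samples, tau):
    """Return the sample value that minimizes the total pinball loss at level tau."""
    return min(samples, key=lambda q: sum(pinball_loss(y, q, tau) for y in samples))


# Hypothetical sample with one outlier.
samples = [1.0, 2.0, 3.0, 4.0, 100.0]
median = empirical_quantile(samples, 0.5)
upper = empirical_quantile(samples, 0.9)
```

    The loss compares a scalar observation with a scalar candidate through the sign of their difference, which has no direct counterpart for vector-valued targets; this is the limitation the abstract attributes to the QR objective.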

    D. Zadok, O. Salzman, A. Wolf, A. M. Bronstein, Towards predicting fine finger motions from ultrasound images via kinematic representation, Proc. ICRA, 2023

    A central challenge in building robotic prostheses is the creation of a sensor-based system able to read physiological signals from the lower limb and instruct a robotic hand to perform various tasks. Existing systems typically perform discrete gestures such as pointing or grasping, by employing electromyography (EMG) or ultrasound (US) technologies to analyze the state of the muscles. In this work, we study the inference problem of identifying the activation of specific fingers from a sequence of US images when performing dexterous tasks such as keyboard typing or playing the piano. While estimating finger gestures has been done in the past by detecting prominent gestures, we are interested in classification done in the context of fine motions that evolve over time. We consider this task an important step towards higher adoption rates of robotic prostheses among arm amputees, as it has the potential to dramatically increase functionality in performing daily tasks. Our key observation, motivating this work, is that modeling the hand as a robotic manipulator allows us to encode an intermediate representation wherein US images are mapped to said configurations. Given a sequence of such learned configurations, coupled with a neural-network architecture that exploits temporal coherence, we are able to infer fine finger motions. We evaluated our method by collecting data from a group of subjects and demonstrating how our framework can be used to replay music played or text typed. To the best of our knowledge, this is the first study demonstrating these downstream tasks within an end-to-end system.

    A. M. Bronstein, A. Marx, Water stabilizes an alternate turn conformation in horse heart myoglobin, Nature Scientific Reports, 2023

    Comparison of myoglobin structures reveals that protein isolated from horse heart consistently adopts an alternate turn conformation in comparison to its homologues. Analysis of hundreds of high-resolution structures discounts crystallization conditions or the surrounding amino acid protein environment as explaining this difference, which is also not captured by the AlphaFold prediction. Rather, a water molecule is identified as stabilizing the conformation in the horse heart structure, which immediately reverts to the whale conformation in molecular dynamics simulations excluding that structural water.

    B. Gahtan, R. Cohen, A. M. Bronstein, G. Kedar, Using deep reinforcement learning for mmWave real-time scheduling, Proc. Int'l Conf. Network of the Future (NoF), 2023

    We study the problem of real-time scheduling in a multi-hop millimeter-wave (mmWave) mesh. We develop a model-free deep reinforcement learning algorithm called Adaptive Activator RL (AARL), which determines the subset of mmWave links that should be activated during each time slot and the power level for each link. The most important property of AARL is its ability to make scheduling decisions within the strict time slot constraints of typical 5G mmWave networks. AARL can handle a variety of network topologies, network loads, and interference models; it can also adapt to different workloads. We demonstrate the operation of AARL on several topologies: a small topology with 10 links, a moderately-sized mesh with 48 links, and a large topology with 96 links. For each topology, we compare the throughput obtained by AARL to that of a benchmark algorithm called RPMA (Residual Profit Maximizer Algorithm). The most important advantage of AARL compared to RPMA is that it is much faster and can make the necessary scheduling decisions very rapidly during every time slot, while RPMA cannot. In addition, the scheduling decisions made by AARL are of higher quality than those made by RPMA.

    T. Shor, T. Weiss, D. Noti, A. M. Bronstein, Multi PILOT: Feasible learned multiple acquisition trajectories for dynamic MRI, Proc. Medical Imaging with Deep Learning (MIDL), 2023 details

    Multi PILOT: Feasible learned multiple acquisition trajectories for dynamic MRI

    T. Shor, T. Weiss, D. Noti, A. M. Bronstein
    Proc. Medical Imaging with Deep Learning (MIDL), 2023

    Dynamic Magnetic Resonance Imaging (MRI) is known to be a powerful and reliable technique for the dynamic imaging of internal organs and tissues, making it a leading diagnostic tool. A major difficulty in using MRI in this setting is the relatively long acquisition time (and, hence, increased cost) required for imaging in high spatio-temporal resolution, leading to the appearance of related motion artifacts and a decrease in resolution. Compressed Sensing (CS) techniques have become a common tool to reduce MRI acquisition time by subsampling images in the k-space according to some acquisition trajectory. Several studies have particularly focused on applying deep learning techniques to learn these acquisition trajectories in order to attain better image reconstruction, rather than using some predefined set of trajectories. To the best of our knowledge, learning acquisition trajectories has only been explored in the context of static MRI. In this study, we consider acquisition trajectory learning in the dynamic imaging setting. We design an end-to-end pipeline for the joint optimization of multiple per-frame acquisition trajectories along with a reconstruction neural network, and demonstrate improved image reconstruction quality in shorter acquisition times.

    A. B. Bainson, J. Hermanns, P. Petsinis, N. Aavad, C. Dam Larsen, T. Swayne, A. Boyarski, D. Mottin, A. M. Bronstein, P. Karras, Spectral subgraph localization, Proc. Learning on Graphs Conference, 2023 details

    Spectral subgraph localization

    A. B. Bainson, J. Hermanns, P. Petsinis, N. Aavad, C. Dam Larsen, T. Swayne, A. Boyarski, D. Mottin, A. M. Bronstein, P. Karras
    Proc. Learning on Graphs Conference, 2023

    Several graph analysis problems are based on some variant of subgraph isomorphism: Given two graphs, G and Q, does G contain a subgraph isomorphic to Q? As this problem is NP-complete, past work usually avoids addressing it explicitly. In this paper, we propose a method that localizes Q in G, i.e., finds its best-match position, by aligning their Laplacian spectra, and we enhance its stability via bagging strategies; we relegate the finding of an exact node correspondence from Q to G to a subsequent and separate graph alignment task. We demonstrate that our localization strategy outperforms a baseline based on the state-of-the-art method for graph alignment in terms of accuracy on real graphs and scales to hundreds of nodes as no other method does.

    J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, P. Karras, GRASP: Graph Alignment through Spectral Signatures, Proc. Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data, 2022 details

    GRASP: Graph Alignment through Spectral Signatures

    J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, P. Karras
    Proc. Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data, 2022

    What is the best way to match the nodes of two graphs? This graph alignment problem generalizes graph isomorphism and arises in applications from social network analysis to bioinformatics. Some solutions assume that auxiliary information on known matches or node or edge attributes is available, or utilize arbitrary graph features. Such methods fare poorly in the pure form of the problem, in which only graph structures are given. Other proposals translate the problem to one of aligning node embeddings, yet, by doing so, provide only a single-scale view of the graph. In this paper, we transfer the shape-analysis concept of functional maps from the continuous to the discrete case, and treat the graph alignment problem as a special case of the problem of finding a mapping between functions on graphs. We present GRASP, a method that first establishes a correspondence between functions derived from Laplacian matrix eigenvectors, which capture multiscale structural characteristics, and then exploits this correspondence to align nodes. Our experimental study, featuring noise levels higher than anything used in previous studies, shows that GRASP outperforms state-of-the-art methods for graph alignment across noise levels and graph types.

    P. Kang, Z. Lin, Z. Yang, A. M. Bronstein, Q. Li, W. Liu, Deep fused two-step cross-modal hashing with multiple semantic supervision, Multimedia Tools and Applications, 2022 details

    Deep fused two-step cross-modal hashing with multiple semantic supervision

    P. Kang, Z. Lin, Z. Yang, A. M. Bronstein, Q. Li, W. Liu
    Multimedia Tools and Applications, 2022

    Existing cross-modal hashing methods ignore the informative multimodal joint information and cannot fully exploit the semantic labels. In this paper, we propose a deep fused two-step cross-modal hashing (DFTH) framework with multiple semantic supervision. In the first step, DFTH learns unified hash codes for instances by a fusion network. Semantic label and similarity reconstruction are introduced to acquire binary codes that are informative, discriminative and semantic similarity preserving. In the second step, two modality-specific hash networks are learned under the supervision of common hash code reconstruction, label reconstruction, and intra-modal and inter-modal semantic similarity reconstruction. The modality-specific hash networks can generate semantics-preserving binary codes for out-of-sample queries. To deal with the vanishing gradients of binarization, a continuous, differentiable tanh function is introduced to approximate the discrete sign function, allowing the networks to be trained by back-propagation with automatic gradient computation. Extensive experiments on MIRFlickr25K and NUS-WIDE show the superiority of DFTH over state-of-the-art methods.
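
    The tanh relaxation mentioned in this abstract can be illustrated with a minimal sketch. This is not the DFTH implementation: the function names and the scaling factor `beta` are assumptions introduced here, and the abstract does not say whether any scaling is used. The point is only that sign has a zero gradient almost everywhere, while tanh gives a usable gradient and still agrees with sign after thresholding.

```python
import math


def sign(x: float) -> float:
    """Discrete sign used for binarization; its derivative is zero almost everywhere."""
    return 1.0 if x >= 0 else -1.0


def soft_sign(x: float, beta: float = 1.0) -> float:
    """Smooth surrogate tanh(beta * x); beta is a hypothetical sharpness parameter."""
    return math.tanh(beta * x)


def soft_sign_grad(x: float, beta: float = 1.0) -> float:
    """Derivative of tanh(beta * x) with respect to x: beta * (1 - tanh(beta * x)^2)."""
    t = math.tanh(beta * x)
    return beta * (1.0 - t * t)


# Made-up real-valued network outputs before binarization.
activations = [-2.0, -0.3, 0.1, 1.5]
relaxed = [soft_sign(a, beta=5.0) for a in activations]  # used during training
codes = [sign(a) for a in activations]                   # used at query time
```

    Thresholding the relaxed values recovers the same binary codes, so the surrogate changes how gradients flow during training but not the codes produced at query time.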

    P. Kang, Z. Lin, Z. Yang, X. Fang, A. M. Bronstein, Q. Li, W. Liu, Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval, Applied Intelligence, 52(1), pp. 33-54, 2022 details

    Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval

    P. Kang, Z. Lin, Z. Yang, X. Fang, A. M. Bronstein, Q. Li, W. Liu
    Applied Intelligence, 52(1), pp. 33-54, 2022

    Cross-modal retrieval aims to retrieve related items across different modalities, for example, using an image query to retrieve related text. The existing deep methods ignore both the intra-modal and inter-modal intra-class low-rank structures when fusing various modalities, which decreases the retrieval performance. In this paper, two deep models (denoted as ILCMR and Semi-ILCMR) based on intra-class low-rank regularization are proposed for supervised and semi-supervised cross-modal retrieval, respectively. Specifically, ILCMR integrates the image network and text network into a unified framework to learn a common feature space by imposing three regularization terms to fuse the cross-modal data. First, to align them in the label space, we utilize semantic consistency regularization to convert the data representations to probability distributions over the classes. Second, we introduce an intra-modal low-rank regularization, which encourages the intra-class samples that originate from the same space to be more relevant in the common feature space. Third, an inter-modal low-rank regularization is applied to reduce the cross-modal discrepancy. To enable the low-rank regularization to be optimized using automatic gradients during network back-propagation, we propose the rank-r approximation and specify the explicit gradients for theoretical completeness. In addition to the three regularization terms that rely on label information incorporated by ILCMR, we propose Semi-ILCMR in the semi-supervised regime, which introduces a low-rank constraint before projecting the general representations into the common feature space. Extensive experiments on four public cross-modal datasets demonstrate the superiority of ILCMR and Semi-ILCMR over other state-of-the-art methods.

    Y. Nemcovsky, M. Jacoby, A. M. Bronstein, C. Baskin, Physical passive patch adversarial attacks on visual odometry systems, Proc. ACCV, 2022 details

    Physical passive patch adversarial attacks on visual odometry systems

    Y. Nemcovsky, M. Jacoby, A. M. Bronstein, C. Baskin
    Proc. ACCV, 2022

    Deep neural networks are known to be susceptible to adversarial perturbations — small perturbations that alter the output of the network and exist under strict norm limitations. While such perturbations are usually discussed as tailored to a specific input, a universal perturbation can be constructed to alter the model’s output on a set of inputs. Universal perturbations present a more realistic case of adversarial attacks, as awareness of the model’s exact input is not required. In addition, the universal attack setting raises the subject of generalization to unseen data, where given a set of inputs, the universal perturbations aim to alter the model’s output on out-of-sample data. In this work, we study physical passive patch adversarial attacks on visual odometry-based autonomous navigation systems. A visual odometry system aims to infer the relative camera motion between two corresponding viewpoints, and is frequently used by vision-based autonomous navigation systems to estimate their state. For such navigation systems, a patch adversarial perturbation poses a severe security issue, as it can be used to mislead a system onto some collision course. To the best of our knowledge, we show for the first time that the error margin of a visual odometry model can be significantly increased by deploying patch adversarial attacks in the scene. We provide evaluation on synthetic closed-loop drone navigation data and demonstrate that a comparable vulnerability exists in real data.

    L. Ackerman-Schraier, A. A. Rosenberg, A. Marx, A. M. Bronstein, Machine learning approaches demonstrate that protein structures carry information about their genetic coding, Nature Scientific Reports, 2022 details

    Machine learning approaches demonstrate that protein structures carry information about their genetic coding

    L. Ackerman-Schraier, A. A. Rosenberg, A. Marx, A. M. Bronstein
    Nature Scientific Reports, 2022

    Synonymous codons translate into the same amino acid. Although the identity of synonymous codons is often considered inconsequential to the final protein structure, there is mounting evidence for an association between the two. Our study examined this association using regression and classification models, finding that codon sequences predict protein backbone dihedral angles with a lower error than amino acid sequences, and that models trained with true dihedral angles have better classification of synonymous codons given structural information than models trained with random dihedral angles. Using this classification approach, we investigated local codon-codon dependencies and tested whether synonymous codon identity can be predicted more accurately from codon context than amino acid context alone, and most specifically which codon context position carries the most predictive power.

    A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein, Defining amino acid pairs as structural units suggests mutation sensitivity to adjacent residues, bioRxiv/2022/513383, 2022 details

    Defining amino acid pairs as structural units suggests mutation sensitivity to adjacent residues

    A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein
    bioRxiv/2022/513383, 2022

    Proteins fold from chains of amino acids, forming secondary structures, α-helices and β-strands, that, at least for globular proteins, subsequently fold into a three-dimensional structure. A large-scale analysis of high-resolution protein structures suggests that amino acid pairs constitute another layer of ordered structure, more local than these conventionally defined secondary structures. We develop a cross-peptide-bond Ramachandran plot that captures the conformational preferences of the amino acid pairs and show that the effect of a particular mutation on the stability of a protein depends in a predictable manner on the adjacent amino acid context.

    A. Rosenberg, A. Marx, A. M. Bronstein, Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon, Nature Communications, 2022 details

    Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon

    A. Rosenberg, A. Marx, A. M. Bronstein
    Nature Communications, 2022

    Synonymous codons translate into chemically identical amino acids. Once considered inconsequential to the formation of the protein product, there is now significant evidence to suggest that codon usage affects co-translational protein folding and the final structure of the expressed protein. Here we develop a method for computing and comparing codon-specific Ramachandran plots and demonstrate that the backbone dihedral angle distributions of some synonymous codons are distinguishable with statistical significance for some secondary structures. This shows that there exists a dependence between codon identity and backbone torsion of the translated amino acid. Although these findings cannot pinpoint the causal direction of this dependence, we discuss the vast biological implications should coding be shown to directly shape protein conformation and demonstrate the usefulness of this method as a tool for probing associations between codon usage and protein structure. Finally, we urge for the inclusion of exact genetic information into structural databases.
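
    A Ramachandran plot is a two-dimensional distribution of the backbone dihedral angles φ (defined by the atoms C(i-1), N, Cα, C) and ψ (defined by N, Cα, C, N(i+1)). As a generic geometric sketch, and not the authors' code, the dihedral angle defined by four consecutive atom positions can be computed as follows; the function name `dihedral_deg` and the helper names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _scale(a, s):
    return tuple(x * s for x in a)


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond, in (-180, 180]."""
    b0 = _sub(p0, p1)                      # from p1 back to p0
    b1 = _sub(p2, p1)                      # central bond
    b2 = _sub(p3, p2)                      # from p2 forward to p3
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: four points forming a quarter turn about the z axis.
angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

    Collecting (φ, ψ) pairs per residue and grouping them by the codon that encoded the residue gives the raw samples from which codon-specific distributions could be estimated; how those distributions are estimated and compared is described in the paper and is not reproduced here.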

    E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie, Inverse design of spontaneous parametric downconversion for generation of high-dimensional qudits, Optica 9, 602-615, 2022 details

    Inverse design of spontaneous parametric downconversion for generation of high-dimensional qudits

    E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie
    Optica 9, 602-615, 2022

    Spontaneous parametric down-conversion in quantum optics is an invaluable resource for the realization of high-dimensional qudits with spatial modes of light. One of the main open challenges is how to directly generate a desirable qudit state in the SPDC process. This problem can be addressed through advanced computational learning methods; however, due to difficulties in modeling the SPDC process by a fully differentiable algorithm that takes into account all interaction effects, progress has been limited. Here, we overcome these limitations and introduce a physically-constrained and differentiable model, validated against experimental results for shaped pump beams and structured crystals, capable of learning every interaction parameter in the process. We avoid any restrictions induced by the stochastic nature of our physical model and integrate the dynamic equations governing the evolution under the SPDC Hamiltonian. We solve the inverse problem of designing a nonlinear quantum optical system that achieves the desired quantum state of down-converted photon pairs. The desired states are defined using either the second-order correlations between different spatial modes or by specifying the required density matrix. By learning nonlinear volume holograms as well as different pump shapes, we successfully show how to generate maximally entangled states. Furthermore, we simulate all-optical coherent control over the generated quantum state by actively changing the profile of the pump beam. Our work can be useful for applications such as novel designs of high-dimensional quantum key distribution and quantum information processing protocols. In addition, our method can be readily applied for controlling other degrees of freedom of light in the SPDC process, such as the spectral and temporal properties, and may even be used in condensed-matter systems having a similar interaction Hamiltonian.

    N. Talati, H. Ye, S. Vedula, K.-Y. Chen, Y. Chen, D. Liu, Y. Yuan, D. Blaauw, A. M. Bronstein, T. Mudge, R. Dreslinski, Mint: An Accelerator For Mining Temporal Motifs, Proc. MICRO, 2022 details

    Mint: An Accelerator For Mining Temporal Motifs

    N. Talati, H. Ye, S. Vedula, K.-Y. Chen, Y. Chen, D. Liu, Y. Yuan, D. Blaauw, A. M. Bronstein, T. Mudge, R. Dreslinski
    Proc. MICRO, 2022

    A variety of complex systems, including social and communication networks, financial markets, biology, and neuroscience are modeled using temporal graphs that contain a set of nodes and directed timestamped edges. Temporal motifs in temporal graphs are generalized from subgraph patterns in static graphs in that they also account for edge ordering and time duration, in addition to the graph structure. Mining temporal motifs is a fundamental problem used in several application domains. However, existing software frameworks offer suboptimal performance due to high algorithmic complexity and irregular memory accesses of temporal motif mining. This paper presents Mint, a novel accelerator architecture and a programming model for mining temporal motifs efficiently. We first divide this workload into three fundamental tasks: search, book-keeping, and backtracking. Based on this, we propose a task-centric programming model that enables decoupled, asynchronous execution. This model unlocks massive opportunities for parallelism, and allows storing task context information on-chip. To best utilize the proposed programming model, we design a domain-specific hardware accelerator using its data path and memory subsystem design to cater to the unique workload characteristics of temporal motif mining. To further improve performance, we propose a novel optimization called search index memoization that significantly reduces memory traffic. We comprehensively compare the performance of Mint with state-of-the-art temporal motif mining software frameworks (both approximate and exact) running on both CPU and GPU, and show 9×–2576× benefit in performance.

    E. Zheltonozhskii, C. Baskin, A. Mendelson, A. M. Bronstein, O. Litany, Contrast to divide: Self-supervised pre-training for learning with noisy labels, Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022 details

    Contrast to divide: Self-supervised pre-training for learning with noisy labels

    E. Zheltonozhskii, C. Baskin, A. Mendelson, A. M. Bronstein, O. Litany
    Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

    The success of learning with noisy labels (LNL) methods relies heavily on the success of a warm-up stage where standard supervised training is performed using the full (noisy) training set. In this paper, we identify a "warm-up obstacle": the inability of standard warm-up stages to train high quality feature extractors and avert memorization of noisy labels. We propose "Contrast to Divide" (C2D), a simple framework that solves this problem by pre-training the feature extractor in a self-supervised fashion. Using self-supervised pre-training boosts the performance of existing LNL approaches by drastically reducing the warm-up stage’s susceptibility to noise level, shortening its duration, and improving extracted feature quality. C2D works out of the box with existing methods and demonstrates markedly improved performance, especially in the high noise regime, where we get a boost of more than 27% for CIFAR-100 with 90% noise over the previous state of the art. In real-life noise settings, C2D trained on mini-WebVision outperforms previous works both in WebVision and ImageNet validation sets by 3% top-1 accuracy. We perform an in-depth analysis of the framework, including investigating the performance of different pre-training approaches and estimating the effective upper bound of the LNL performance with semi-supervised learning.

    G. Pai, A. Bronstein, R. Talmon, R. Kimmel, Deep isometric maps, Image and Vision Computing, 2022 details

    Deep isometric maps

    G. Pai, A. Bronstein, R. Talmon, R. Kimmel
    Image and Vision Computing, 2022

    Isometric feature mapping is an established, time-honored algorithm in manifold learning and non-linear dimensionality reduction. Its prominence can be attributed to the output of a coherent global low-dimensional representation of data by preserving intrinsic distances. In order to enable an efficient and more applicable isometric feature mapping, a diverse set of sophisticated extensions to the original algorithm has been proposed, incorporating important factors such as sparsity of computation, conformality, topological constraints and spectral geometry. However, a significant shortcoming of most approaches is the dependence on large-scale dense spectral decompositions or the inability to generalize to points far away from the sampling of the manifold.

    In this paper, we explore an unsupervised deep learning approach for computing distance-preserving maps for non-linear dimensionality reduction. We demonstrate that our framework is general enough to incorporate all previous advancements and show a significantly improved local and non-local generalization of the isometric mapping. Our approach involves training with only a few landmark datapoints and therefore avoids the need for population of dense matrices as well as computing their spectral decomposition.

    N. Diamant, N. Shandor, A. M. Bronstein, Delta-GAN-Encoder: Encoding semantic changes for explicit image editing, using few synthetic samples, arXiv:2111.08419, 2022 details

    Delta-GAN-Encoder: Encoding semantic changes for explicit image editing, using few synthetic samples

    N. Diamant, N. Shandor, A. M. Bronstein
    arXiv:2111.08419, 2022

    Understanding and controlling generative models’ latent space is a complex task. In this paper, we propose a novel method for learning to control any desired attribute in a pre-trained GAN’s latent space, for the purpose of editing synthesized and real-world data samples accordingly. We perform Sim2Real learning, relying on minimal samples to achieve an unlimited amount of continuous precise edits. We present an Autoencoder-based model that learns to encode the semantics of changes between images as a basis for editing new samples later on, achieving precise desired results – example shown in Fig. 1. While previous editing methods rely on a known structure of latent spaces (e.g., linearity of some semantics in StyleGAN), our method inherently does not require any structural constraints. We demonstrate our method in the domain of facial imagery: editing different expressions, poses, and lighting attributes, achieving state-of-the-art results.

    T. Blau, R. Ganz, B. Kawar, A. M. Bronstein, M. Elad, Threat model-agnostic adversarial defense using diffusion models, arXiv preprint arXiv:2207.08089, 2022 details

    Threat model-agnostic adversarial defense using diffusion models

    T. Blau, R. Ganz, B. Kawar, A. M. Bronstein, M. Elad
    arXiv preprint arXiv:2207.08089, 2022

    Deep Neural Networks (DNNs) are highly sensitive to imperceptible malicious perturbations, known as adversarial attacks. Following the discovery of this vulnerability in real-world imaging and vision applications, the associated safety concerns have attracted vast research attention, and many defense techniques have been developed. Most of these defense methods rely on adversarial training (AT), that is, training the classification network on images perturbed according to a specific threat model, which defines the magnitude of the allowed modification. Although AT leads to promising results, training on a specific threat model fails to generalize to other types of perturbations. A different approach utilizes a preprocessing step to remove the adversarial perturbation from the attacked image. In this work, we follow the latter path and aim to develop a technique that leads to robust classifiers across various realizations of threat models. To this end, we harness the recent advances in stochastic generative modeling, and means to leverage these for sampling from conditional distributions. Our defense relies on the addition of i.i.d. Gaussian noise to the attacked image, followed by a pretrained diffusion process, an architecture that performs a stochastic iterative process over a denoising network, yielding a denoised outcome of high perceptual quality. The robustness obtained with this stochastic preprocessing step is validated through extensive experiments on the CIFAR-10 dataset, showing that our method outperforms the leading defense methods under various threat models.

    A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky, Detector-free weakly supervised grounding by separation, Proc. CVPR, 2022 details

    Detector-free weakly supervised grounding by separation

    A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky
    Proc. CVPR, 2022

    Nowadays, there is an abundance of data involving images and surrounding free-form text weakly corresponding to those images. Weakly Supervised phrase-Grounding (WSG) deals with the task of using this data to learn to localize (or to ground) arbitrary text phrases in images without any additional annotations. However, most recent SotA methods for WSG assume the existence of a pre-trained object detector, relying on it to produce the ROIs for localization. In this work, we focus on the task of Detector-Free WSG (DF-WSG) to solve WSG without relying on a pre-trained detector. We directly learn everything from the images and associated free-form text pairs, thus potentially gaining an advantage on the categories unsupported by the detector. The key idea behind our proposed Grounding by Separation (GbS) method is synthesizing 'text to image-regions' associations by random alpha-blending of arbitrary image pairs and using the corresponding texts of the pair as conditions to recover the alpha map from the blended image via a segmentation network. At test time, this allows using the query phrase as a condition for a non-blended query image, thus interpreting the test image as a composition of a region corresponding to the phrase and the complement region. Using this approach we demonstrate a significant accuracy improvement, of up to 8.5% over previous DF-WSG SotA, for a range of benchmarks including Flickr30K, Visual Genome, and ReferIt, as well as a significant complementary improvement (above 7%) over the detector-based approaches for WSG.
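
    The data-synthesis step described in this abstract, blending two images with an alpha map that then serves as the segmentation target, can be sketched as follows. This is an illustration only: the abstract does not specify how the random alpha maps are generated, so the single vertical split used in `random_alpha_map` is an assumption made here, as are all function names. Images are plain 2-D lists of grayscale values to keep the example free of dependencies.

```python
import random


def random_alpha_map(height, width, rng):
    """Hypothetical alpha generator: alpha is 1 left of a random split and 0 right of it."""
    split = rng.randint(1, width - 1)  # inclusive bounds; requires width >= 2
    return [[1.0 if x < split else 0.0 for x in range(width)] for _ in range(height)]


def alpha_blend(img_a, img_b, alpha):
    """Pixel-wise blend alpha * a + (1 - alpha) * b for equally sized 2-D lists."""
    return [
        [al * a + (1.0 - al) * b for a, b, al in zip(row_a, row_b, row_al)]
        for row_a, row_b, row_al in zip(img_a, img_b, alpha)
    ]


# Made-up 2x4 "images": one all dark, one all bright.
rng = random.Random(0)
img_a = [[0.0] * 4 for _ in range(2)]
img_b = [[1.0] * 4 for _ in range(2)]
alpha_target = random_alpha_map(2, 4, rng)          # what a segmentation network would recover
blended = alpha_blend(img_a, img_b, alpha_target)   # what the network would see as input
```

    In the setting the abstract describes, the text paired with one of the two source images conditions the network, and the alpha map is the training target; the conditioning and the network itself are outside the scope of this sketch.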

    D. E. Fordham, D. Rosentraub, A. L. Polsky, T. Aviram, Y. Wolf, O. Perl, A. Devir, S. Rosentraub, D. H. Silver, Y. Gold Zamir, A. M. Bronstein, M. Lara Lara, J. Ben Nagi, A. Alvarez, S. Munné, Embryologist agreement when assessing blastocyst implantation probability: is data-driven prediction the solution to embryo assessment subjectivity?, Human Reproduction, Volume 37, Issue 10, Pages 2275–2290, 2022 details

    Embryologist agreement when assessing blastocyst implantation probability: is data-driven prediction the solution to embryo assessment subjectivity?

    D. E. Fordham, D. Rosentraub, A. L. Polsky, T. Aviram, Y. Wolf, O. Perl, A. Devir, S. Rosentraub, D. H. Silver, Y. Gold Zamir, A. M. Bronstein, M. Lara Lara, J. Ben Nagi, A. Alvarez, S. Munné
    Human Reproduction, Volume 37, Issue 10, Pages 2275–2290, 2022

    STUDY QUESTION
    What is the accuracy and agreement of embryologists when assessing the implantation probability of blastocysts using time-lapse imaging (TLI), and can it be improved with a data-driven algorithm?

    SUMMARY ANSWER
    The overall interobserver agreement of a large panel of embryologists was moderate and prediction accuracy was modest, while the purpose-built artificial intelligence model generally resulted in higher performance metrics.

    WHAT IS KNOWN ALREADY
    Previous studies have demonstrated significant interobserver variability amongst embryologists when assessing embryo quality. However, data concerning embryologists’ ability to predict implantation probability using TLI is still lacking. Emerging technologies based on data-driven tools have shown great promise for improving embryo selection and predicting clinical outcomes.

    STUDY DESIGN, SIZE, DURATION
    TLI video files of 136 embryos with known implantation data were retrospectively collected from two clinical sites between 2018 and 2019 for the performance assessment of 36 embryologists and comparison with a deep neural network (DNN).

    PARTICIPANTS/MATERIALS, SETTING, METHODS
    We recruited 39 embryologists from 13 different countries. All participants were blinded to clinical outcomes. A total of 136 TLI videos of embryos that reached the blastocyst stage were used for this experiment. Each embryo’s likelihood of successfully implanting was assessed by 36 embryologists, providing implantation probability grades (IPGs) from 1 to 5, where 1 indicates a very low likelihood of implantation and 5 indicates a very high likelihood. Subsequently, three embryologists with over 5 years of experience provided Gardner scores. All 136 blastocysts were categorized into three quality groups based on their Gardner scores. Embryologist predictions were then converted into predictions of implantation (IPG ≥ 3) and no implantation (IPG ≤ 2). Embryologists’ performance and agreement were assessed using Fleiss kappa coefficient. A 10-fold cross-validation DNN was developed to provide IPGs for TLI video files. The model’s performance was compared to that of the embryologists.
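
    The thresholding rule and the agreement statistic named above can be sketched as follows. This is a minimal illustration on made-up grades, not the study's analysis code. It assumes agreement is computed on the binarized predictions, which the abstract does not state explicitly, and the function names are hypothetical.

```python
def binarize_ipg(grade):
    """Map an implantation probability grade (1-5) to 1 (IPG >= 3) or 0 (IPG <= 2)."""
    if not 1 <= grade <= 5:
        raise ValueError("IPG must be between 1 and 5")
    return 1 if grade >= 3 else 0


def fleiss_kappa(counts):
    """Fleiss' kappa for counts[i][j] = number of raters assigning subject i to category j.

    Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    # Per-subject agreement: fraction of rater pairs that agree.
    p_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subject) / n_subjects

    # Chance agreement from the overall category proportions.
    n_categories = len(counts[0])
    p_category = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in p_category)

    if p_expected == 1.0:
        return float("nan")  # undefined when every rating falls in one category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy data (made up for illustration): 4 embryos, each graded by 3 raters.
grades = [[5, 4, 3], [1, 2, 1], [3, 2, 4], [2, 1, 3]]
binary = [[binarize_ipg(g) for g in row] for row in grades]
counts = [[row.count(0), row.count(1)] for row in binary]
kappa = fleiss_kappa(counts)
```

    A kappa of 1 means perfect agreement and 0 means agreement no better than chance, which is the scale on which the κ values in the results below are reported.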

    MAIN RESULTS AND THE ROLE OF CHANCE
    Logistic regression was employed for the following confounding variables: country of residence, academic level, embryo scoring system, log years of experience and experience using TLI. None were found to have a statistically significant impact on embryologist performance at α = 0.05. The average implantation prediction accuracy for the embryologists was 51.9% for all embryos (N = 136). The average accuracy of the embryologists when assessing top quality and poor quality embryos (according to the Gardner score categorizations) was 57.5% and 57.4%, respectively, and 44.6% for fair quality embryos. Overall interobserver agreement was moderate (κ = 0.56, N = 136). The best agreement was achieved in the poor + top quality group (κ = 0.65, N = 77), while the agreement in the fair quality group was lower (κ = 0.25, N = 59). The DNN showed an overall accuracy rate of 62.5%, with accuracies of 62.2%, 61% and 65.6% for the poor, fair and top quality groups, respectively. The AUC for the DNN was higher than that of the embryologists overall (0.70 DNN vs 0.61 embryologists) as well as in all of the Gardner groups (DNN vs embryologists—Poor: 0.69 vs 0.62; Fair: 0.67 vs 0.53; Top: 0.77 vs 0.54).

    LIMITATIONS, REASONS FOR CAUTION
    Blastocyst assessment was performed using video files acquired from time-lapse incubators, where each video contained data from a single focal plane. Clinical data regarding the underlying cause of infertility and endometrial thickness before the transfer was not available, yet may explain implantation failure and lower accuracy of IPGs. Implantation was defined as the presence of a gestational sac, whereas the detection of fetal heartbeat is a more robust marker of embryo viability. The raw data were anonymized to the extent that it was not possible to quantify the number of unique patients and cycles included in the study, potentially masking the effect of bias from a limited patient pool. Furthermore, the lack of demographic data makes it difficult to draw conclusions on how representative the dataset was of the wider population. Finally, embryologists were required to assess the implantation potential, not embryo quality. Although this is not the traditional approach to embryo evaluation, morphology/morphokinetics as a means of assessing embryo quality is believed to be strongly correlated with viability and, for some methods, implantation potential.

    WIDER IMPLICATIONS OF THE FINDINGS
    Embryo selection is a key element in IVF success and continues to be a challenge. Improving the predictive ability could assist in optimizing implantation success rates and other clinical outcomes and could minimize the financial and emotional burden on the patient. This study demonstrates moderate agreement rates between embryologists, likely due to the subjective nature of embryo assessment. In particular, we found that average embryologist accuracy and agreement were significantly lower for fair quality embryos when compared with that for top and poor quality embryos. Using data-driven algorithms as an assistive tool may help IVF professionals increase success rates and promote much needed standardization in the IVF clinic. Our results indicate a need for further research regarding technological advancement in this field.

    S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky, MetAdapt: Meta-learned task-adaptive architecture for few-shot classification, Pattern Recognition Letters, 2021

    MetAdapt: Meta-learned task-adaptive architecture for few-shot classification

    S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky
    Pattern Recognition Letters, 2021

    Few-Shot Learning (FSL) is a topic of rapidly growing interest. Typically, in FSL a model is trained on a dataset consisting of many small tasks (meta-tasks) and learns to adapt to novel tasks that it will encounter during test time. This is also referred to as meta-learning. Another topic closely related to meta-learning with a lot of interest in the community is Neural Architecture Search (NAS), automatically finding optimal architecture instead of engineering it manually. In this work we combine these two aspects of meta-learning. So far, meta-learning FSL methods have focused on optimizing parameters of pre-defined network architectures, in order to make them easily adaptable to novel tasks. Moreover, it was observed that, in general, larger architectures perform better than smaller ones up to a certain saturation point (where they start to degrade due to over-fitting). However, little attention has been given to explicitly optimizing the architectures for FSL, nor to an adaptation of the architecture at test time to particular novel tasks. In this work, we propose to employ tools inspired by the Differentiable Neural Architecture Search (D-NAS) literature in order to optimize the architecture for FSL without over-fitting. Additionally, to make the architecture task adaptive, we propose the concept of ‘MetAdapt Controller’ modules. These modules are added to the model and are meta-trained to predict the optimal network connections for a given novel task. Using the proposed approach we observe state-of-the-art resu

    T. Weiss, N. Peretz, S. Vedula, A. Feuer, A. M. Bronstein, Joint optimization of system design and reconstruction in MIMO radar imaging, Proc. IEEE Int'l Workshop on Machine Learning for Signal Processing, 2021

    Joint optimization of system design and reconstruction in MIMO radar imaging

    T. Weiss, N. Peretz, S. Vedula, A. Feuer, A. M. Bronstein
    Proc. IEEE Int'l Workshop on Machine Learning for Signal Processing, 2021

    Multiple-input multiple-output (MIMO) radar is one of the leading depth sensing modalities. However, the use of multiple receive channels leads to relatively high costs and prevents the penetration of MIMO radars into many areas, such as the automotive industry. Over the last few years, a few studies have concentrated on designing reduced measurement schemes and image reconstruction schemes for MIMO radars; however, these problems have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated the growing success of simultaneous learning-based design of the acquisition and reconstruction schemes, manifesting significant improvement in the reconstruction quality. Inspired by these successes, in this work we propose to learn MIMO acquisition parameters, in the form of receive (Rx) antenna element locations, jointly with a neural-network-based image reconstruction. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using our learned acquisition parameters with and without the neural-network reconstruction.

    Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, Loss aware post-training quantization, Machine Learning, 2021

    Loss aware post-training quantization

    Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson
    Machine Learning, 2021

    Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or above). In this work, we study the effect of quantization on the structure of the loss landscape. We show that the structure is flat and separable for mild quantization, enabling straightforward post-training quantization methods to achieve good results. We show that with more aggressive quantization, the loss landscape becomes highly non-separable with steep curvature, making the selection of quantization parameters more challenging. Armed with this understanding, we design a method that quantizes the layer parameters jointly, enabling significant accuracy improvement over current post-training quantization methods.
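    To make the INT8 versus INT4 gap concrete, the following is a minimal sketch of a plain symmetric uniform quantizer with a clipping threshold. It is not the joint, loss-aware method of the paper; the names `quantize_symmetric` and `mean_squared_error` and the choice of a single scalar `clip` per tensor are assumptions made for illustration only.

```python
from typing import List, Sequence


def quantize_symmetric(values: Sequence[float], num_bits: int, clip: float) -> List[float]:
    """Clip to [-clip, clip], round to a signed integer grid, and map back to floats."""
    if num_bits < 2 or clip <= 0:
        raise ValueError("need num_bits >= 2 and clip > 0")
    q_max = 2 ** (num_bits - 1) - 1       # 127 for INT8, 7 for INT4
    step = clip / q_max                   # width of one quantization bin
    out = []
    for v in values:
        q = round(v / step)               # nearest integer level
        q = max(-q_max, min(q_max, q))    # saturate values outside the clipping range
        out.append(q * step)
    return out


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Average squared difference between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
```

    Sweeping `clip` for a fixed `num_bits` and recording the error gives a one-dimensional view of how sensitive a tensor is to its quantization parameter. The abstract's point is that at low bit widths such per-layer choices interact, which this per-tensor sketch does not capture.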

    C. Baskin, B. Chmiel, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, CAT: Compression-aware training for bandwidth reduction, JMLR, 2021

    CAT: Compression-aware training for bandwidth reduction

    C. Baskin, B. Chmiel, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson
    JMLR, 2021

    Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving visual processing tasks. One of the major obstacles hindering the ubiquitous use of CNNs for inference is their relatively high memory bandwidth requirements, which can be a main energy consumer and throughput bottleneck in hardware accelerators. Accordingly, an efficient feature map compression method can result in substantial performance gains. Inspired by quantization-aware training approaches, we propose a compression-aware training (CAT) method that involves training the model in a way that allows better compression of feature maps during inference. Our method trains the model to achieve low-entropy feature maps, which enables efficient compression at inference time using classical transform coding methods. CAT significantly improves the state-of-the-art results reported for quantization. For example, on ResNet-34 we achieve 73.1% accuracy (0.2% degradation from the baseline) with an average representation of only 1.79 bits per value.
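    The "bits per value" figure refers to how compactly quantized feature maps can be entropy coded. As a rough illustration, the sketch below quantizes a list of activations on a uniform grid and computes the empirical Shannon entropy of the resulting symbols, which lower-bounds the average code length of a lossless coder that treats values independently. It only measures entropy and does not reproduce the training procedure; `quantize_uniform` and `entropy_bits_per_value` are hypothetical names.

```python
import math
from collections import Counter
from typing import Iterable, List, Sequence


def quantize_uniform(values: Sequence[float], step: float) -> List[int]:
    """Map each value to the index of its nearest level on a grid with spacing `step`."""
    return [round(v / step) for v in values]


def entropy_bits_per_value(symbols: Iterable[int]) -> float:
    """Empirical Shannon entropy (base 2) of a sequence of discrete symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

    A feature map whose quantized values concentrate on a few levels yields a low entropy, and therefore needs fewer bits per value after entropy coding than one whose values are spread evenly over many levels.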

    E. Amrani, A. M. Bronstein, Self-supervised classification network, Proc. ECCV, 2022

    Self-supervised classification network

    E. Amrani, A. M. Bronstein
    Proc. ECCV, 2022

    We present Self-Classifier — a novel self-supervised end-to-end classification neural network. Self-Classifier learns labels and representations simultaneously in a single-stage end-to-end manner by optimizing for same-class prediction of two augmented views of the same sample. To guarantee non-degenerate solutions (i.e., solutions where all labels are assigned to the same class), a uniform prior is asserted on the labels. We show mathematically that unlike the regular cross-entropy loss, our approach avoids such solutions. Self-Classifier is simple to implement and is scalable to practically unlimited amounts of data. Unlike other unsupervised classification approaches, it does not require any form of pre-training or the use of expectation maximization algorithms, pseudo-labelling or external clustering. Unlike other contrastive learning representation learning approaches, it does not require a memory bank or a second network. Despite its relative simplicity, our approach achieves comparable results to state-of-the-art performance with ImageNet, CIFAR10 and CIFAR100 for its two objectives: unsupervised classification and unsupervised representation learning. Furthermore, it is the first unsupervised end-to-end classification network to perform well on the large-scale ImageNet dataset. Code will be made available.

    E. Rozenberg, D. Freedman, A. M. Bronstein, Learning to localize objects using limited annotation with applications to thoracic diseases, IEEE Access Vol. 9, 2021

    Learning to localize objects using limited annotation with applications to thoracic diseases

    E. Rozenberg, D. Freedman, A. M. Bronstein
    IEEE Access Vol. 9, 2021

    Motivation: The localization of objects in images is a longstanding objective within the field of image processing. Most current techniques are based on machine learning approaches, which typically require careful annotation of training samples in the form of expensive bounding box labels. The need for such large-scale annotation has only been exacerbated by the widespread adoption of deep learning techniques within the image processing community: deep learning is notoriously data-hungry. Method: In this work, we attack this problem directly by providing a new method for learning to localize objects with limited annotation: most training images can simply be annotated with their whole image labels (and no bounding box), with only a small fraction marked with bounding boxes. The training is driven by a novel loss function, which is a continuous relaxation of a well-defined discrete formulation of weakly supervised learning. Care is taken to ensure that the loss is numerically well-posed. Additionally, we propose a neural network architecture which accounts for both patch dependence, through the use of Conditional Random Field layers, and shift-invariance, through the inclusion of anti-aliasing filters. Results: We demonstrate our method on the task of localizing thoracic diseases in chest X-ray images, achieving state-of-the-art performance on the ChestX-ray14 dataset. We further show that with a modicum of additional effort our technique can be extended from object localization to object detection, attaining high quality results on the Kaggle RSNA Pneumonia Detection Challenge. Conclusion: The technique presented in this paper has the potential to enable high accuracy localization in regimes in which annotated data is either scarce or expensive to acquire. Future work will focus on applying the ideas presented in this paper to the realm of semantic segmentation.

    T. Weiss, O. Senouf, S. Vedula, O. Michailovich, M. Zibulevsky, A. M. Bronstein, PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI, Journal of Machine Learning for Biomedical Imaging (MELBA), 2021

    PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI

    T. Weiss, O. Senouf, S. Vedula, O. Michailovich, M. Zibulevsky, A. M. Bronstein
    Journal of Machine Learning for Biomedical Imaging (MELBA), 2021

    Magnetic Resonance Imaging (MRI) has long been considered to be among “the gold standards” of diagnostic medical imaging. The long acquisition times, however, render MRI prone to motion artifacts, let alone their adverse contribution to the relatively high costs of MRI examination. Over the last few decades, multiple studies have focused on the development of both physical and post-processing methods for accelerated acquisition of MRI scans. These two approaches, however, have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of the concurrent learning-based design of data acquisition and image reconstruction schemes. Such schemes have already demonstrated substantial effectiveness, leading to considerably shorter acquisition times and improved quality of image reconstruction. Inspired by this initial success, in this work, we propose a novel approach to the learning of optimal schemes for conjoint acquisition and reconstruction of MRI scans, with the optimization carried out simultaneously with respect to the time-efficiency of data acquisition and the quality of resulting reconstructions. To be of practical value, the schemes are encoded in the form of general k-space trajectories, whose associated magnetic gradients are constrained to obey a set of predefined hardware requirements (as defined in terms of, e.g., peak currents and maximum slew rates of magnetic gradients). With this proviso in mind, we propose a novel algorithm for the end-to-end training of a combined acquisition-reconstruction pipeline using a deep neural network with differentiable forward- and backpropagation operators. We also demonstrate the effectiveness of the proposed solution in application to both image reconstruction and image segmentation, reporting substantial improvements in terms of acceleration factors as well as the quality of these end tasks.

    Y. Elul, A. Rosenberg, A. Schuster, A. M. Bronstein, Y. Yaniv, Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis, Proc. US National Academy of Sciences (PNAS), 2021

    Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis

    Y. Elul, A. Rosenberg, A. Schuster, A. M. Bronstein, Y. Yaniv
    Proc. US National Academy of Sciences (PNAS), 2021

    Despite their great promise, artificial intelligence (AI) systems have yet to become ubiquitous in the daily practice of medicine largely due to several crucial unmet needs of healthcare practitioners. These include lack of explanations in clinically meaningful terms, handling the presence of unknown medical conditions, and transparency regarding the system’s limitations, both in terms of statistical performance as well as recognizing situations for which the system’s predictions are irrelevant. We articulate these unmet clinical needs as machine-learning (ML) problems and systematically address them with cutting-edge ML techniques. We focus on electrocardiogram (ECG) analysis as an example domain in which AI has great potential and tackle two challenging tasks: the detection of a heterogeneous mix of known and unknown arrhythmias from ECG and the identification of underlying cardio-pathology from segments annotated as normal sinus rhythm recorded in patients with an intermittent arrhythmia. We validate our methods by simulating a screening for arrhythmias in a large-scale population while adhering to statistical significance requirements. Specifically, our system 1) visualizes the relative importance of each part of an ECG segment for the final model decision; 2) upholds specified statistical constraints on its out-of-sample performance and provides uncertainty estimation for its predictions; 3) handles inputs containing unknown rhythm types; and 4) handles data from unseen patients while also flagging cases in which the model’s outputs are not usable for a specific patient. This work represents a significant step toward overcoming the limitations currently impeding the integration of AI into clinical practice in cardiology and medicine in general.

    L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. M. Bronstein, R. Giryes, StarNet: towards weakly supervised few-shot detection and explainable few-shot classification, Proc. AAAI, 2021

    StarNet: towards weakly supervised few-shot detection and explainable few-shot classification

    L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. M. Bronstein, R. Giryes
    Proc. AAAI, 2021

    In this paper, we propose a new few-shot learning method called StarNet, which is an end-to-end trainable non-parametric star-model few-shot classifier. While being meta-trained using only image-level class labels, StarNet learns not only to predict the class labels for each query image of a few-shot task, but also to localize (via a heatmap) what it believes to be the key image regions supporting its prediction, thus effectively detecting the instances of the novel categories. The localization is enabled by the StarNet’s ability to find large, arbitrarily shaped, semantically matching regions between all pairs of support and query images of a few-shot task. We evaluate StarNet on multiple few-shot classification benchmarks attaining significant state-of-the-art improvement on the CUB and ImageNetLOC-FS, and smaller improvements on other benchmarks. At the same time, in many cases, StarNet provides plausible explanations for its class label predictions, by highlighting the correctly paired novel category instances on the query and on its best matching support (for the predicted class). In addition, we test the proposed approach on the previously unexplored and challenging task of Weakly Supervised Few-Shot Object Detection (WS-FSOD), obtaining significant improvements over the baselines.

    E. Amrani, R. Ben-Ari, D. Rotman, A. M. Bronstein, Noise estimation using density estimation for self-supervised multimodal learning, Proc. AAAI, 2021

    Noise estimation using density estimation for self-supervised multimodal learning

    E. Amrani, R. Ben-Ari, D. Rotman, A. M. Bronstein
    Proc. AAAI, 2021

    One of the key factors of enabling machine learning models to comprehend and solve real-world tasks is to leverage multimodal data. Unfortunately, the annotation of multimodal data is challenging and expensive. Recently, self-supervised multimodal methods that combine vision and language were proposed to learn multimodal representations without annotation. However, these methods choose to ignore the presence of high levels of noise and thus yield sub-optimal results. In this work, we show that the problem of noise estimation for multimodal data can be reduced to a multimodal density estimation task. Using multimodal density estimation, we propose a noise estimation building block for multimodal representation learning that is based strictly on the inherent correlation between different modalities. We demonstrate how our noise estimation can be broadly integrated and achieves comparable results to state-of-the-art performance on five different benchmark datasets for two challenging multimodal tasks: Video Question Answering and Text-To-Video Retrieval.

    O. Dahary, M. Jacoby, A. M. Bronstein, Digital Gimbal: End-to-end deep image stabilization with learnable exposure times, Proc. CVPR, 2021

    Digital Gimbal: End-to-end deep image stabilization with learnable exposure times

    O. Dahary, M. Jacoby, A. M. Bronstein
    Proc. CVPR, 2021

    Mechanical image stabilization using actuated gimbals enables capturing long-exposure shots without suffering from blur due to camera motion. These devices, however, are often physically cumbersome and expensive, limiting their widespread use. In this work, we propose to digitally emulate a mechanically stabilized system from the input of a fast unstabilized camera. To exploit the trade-off between motion blur at long exposures and low SNR at short exposures, we train a CNN that estimates a sharp high-SNR image by aggregating a burst of noisy short-exposure frames, related by unknown motion. We further suggest learning the burst’s exposure times in an end-to-end manner, thus balancing the noise and blur across the frames. We demonstrate this method’s advantage over the traditional approach of deblurring a single image or denoising a fixed-exposure burst.

    A. Boyarski, S. Vedula, A. M. Bronstein, Spectral geometric matrix completion, Proc. Mathematical and Scientific Machine Learning, 2021

    Spectral geometric matrix completion

    A. Boyarski, S. Vedula, A. M. Bronstein
    Proc. Mathematical and Scientific Machine Learning, 2021

    Deep Matrix Factorization (DMF) is an emerging approach to the problem of reconstructing a matrix from a subset of its entries. Recent works have established that gradient descent applied to a DMF model induces an implicit regularization on the rank of the recovered matrix. Despite these promising theoretical results, empirical evaluation of vanilla DMF on real benchmarks exhibits poor reconstructions which we attribute to the extremely low number of samples available. We propose an explicit spectral regularization scheme that is able to make DMF models competitive on real benchmarks, while still maintaining the implicit regularization induced by gradient descent, thus enjoying the best of both worlds.

    E. Rozenberg, A. Karnieli, O. Yesharim, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie, Inverse design of quantum holograms in three-dimensional nonlinear photonic crystals, CLEO, 2021

    Inverse design of quantum holograms in three-dimensional nonlinear photonic crystals

    E. Rozenberg, A. Karnieli, O. Yesharim, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie
    CLEO, 2021

    We introduce a systematic approach for designing 3D nonlinear photonic crystals and pump beams for generating desired quantum correlations between structured photon-pairs. Our model is fully differentiable, allowing accurate and efficient learning and discovery of novel designs.

    A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson, Early-stage neural network hardware performance analysis, Sustainability 13(2):717, 2021

    Early-stage neural network hardware performance analysis

    A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson
    Sustainability 13(2):717, 2021

    The demand for running NNs in embedded environments has increased significantly in recent years due to the significant success of convolutional neural network (CNN) approaches in various tasks, including image recognition and generation. The task of achieving high accuracy on resource-restricted devices, however, is still considered to be challenging, which is mainly due to the vast number of design parameters that need to be balanced. While the quantization of CNN parameters leads to a reduction of power and area, it can also generate unexpected changes in the balance between communication and computation. This change is hard to evaluate, and the lack of balance may lead to lower utilization of either memory bandwidth or computational resources, thereby reducing performance. This paper introduces a hardware performance analysis framework for identifying bottlenecks in the early stages of CNN hardware design. We demonstrate how the proposed method can help in evaluating different architecture alternatives of resource-restricted CNN accelerators (e.g., part of real-time embedded systems) early in design stages and, thus, prevent making design mistakes.

    Keywords: neural networks; accelerators; quantization; CNN architecture

    C. Baskin, E. Schwartz, E. Zheltonozhskii, N. Liss, R. Giryes, A. M. Bronstein, A. Mendelson, UNIQ: Uniform noise injection for non-uniform quantization of neural networks, ACM Transactions on Computer Systems (TOCS), 2020

    UNIQ: Uniform noise injection for non-uniform quantization of neural networks

    C. Baskin, E. Schwartz, E. Zheltonozhskii, N. Liss, R. Giryes, A. M. Bronstein, A. Mendelson
    ACM Transactions on Computer Systems (TOCS), 2020

    We present a novel method for training a neural network amenable to inference in low-precision arithmetic with quantized weights and activations. The training is performed in full precision with random noise injection emulating quantization noise. In order to circumvent the need to simulate realistic quantization noise distributions, the weight distributions are uniformized by a non-linear transformation, and uniform noise is injected. This procedure emulates a non-uniform k-quantile quantizer at inference time, which adapts to the specific distribution of the quantized parameters. As a by-product of injecting noise to weights, we find that activations can also be quantized to as low as 8-bit with only a minor accuracy degradation. The method achieves state-of-the-art results for training low-precision networks on ImageNet. In particular, we observe no degradation in accuracy for MobileNet and ResNet-18/34/50 on ImageNet with as low as 4-bit quantization of weights. Our solution achieves the state-of-the-art results in accuracy, in the low computational budget regime, compared to similar models.
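    A k-quantile quantizer of the kind mentioned above can be sketched as follows. The sketch works on empirical quantiles of a weight list and represents each equal-probability bin by its median; both choices, as well as the names `fit_k_quantile` and `quantize_k_quantile`, are assumptions made for illustration and may differ from the paper's implementation, which also covers the noise-injection training that is not shown here.

```python
from bisect import bisect_right
from statistics import median
from typing import List, Sequence, Tuple


def fit_k_quantile(weights: Sequence[float], k: int) -> Tuple[List[float], List[float]]:
    """Split the sorted weights into k equally populated bins.

    Returns (thresholds, levels): k - 1 bin boundaries and k representation levels.
    """
    if k < 1 or len(weights) < k:
        raise ValueError("need 1 <= k <= number of weights")
    ordered = sorted(weights)
    n = len(ordered)
    # Bin b holds ordered[bounds[b]:bounds[b + 1]]; bins are non-empty because n >= k.
    bounds = [(b * n) // k for b in range(k + 1)]
    # Assumption: each bin is represented by the median of the weights it holds.
    levels = [median(ordered[bounds[b]:bounds[b + 1]]) for b in range(k)]
    # Boundary between neighbouring bins: midpoint of the two adjacent samples.
    thresholds = [
        (ordered[bounds[b] - 1] + ordered[bounds[b]]) / 2 for b in range(1, k)
    ]
    return thresholds, levels


def quantize_k_quantile(
    weights: Sequence[float], thresholds: Sequence[float], levels: Sequence[float]
) -> List[float]:
    """Replace each weight by the level of the bin it falls into."""
    return [levels[bisect_right(thresholds, w)] for w in weights]
```

    Because the bins are equally populated rather than equally wide, the levels are dense where the weights are dense, which is what makes the quantizer non-uniform and distribution-adaptive.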

    B. Finkelshtein, C. Baskin, E. Zheltonozhskii, U. Alon, Single-node attack for fooling graph neural networks, arXiv:2011.03574, 2020

    Single-node attack for fooling graph neural networks

    B. Finkelshtein, C. Baskin, E. Zheltonozhskii, U. Alon
    arXiv:2011.03574, 2020

    Graph neural networks (GNNs) have shown broad applicability in a variety of domains. Some of these domains, such as social networks and product recommendations, are fertile ground for malicious users and behavior. In this paper, we show that GNNs are vulnerable to the extremely limited scenario of a single-node adversarial example, where the node cannot be picked by the attacker. That is, an attacker can force the GNN to classify any target node to a chosen label by only slightly perturbing another single arbitrary node in the graph, even when not being able to pick that specific attacker node. When the adversary is allowed to pick a specific attacker node, the attack is even more effective. We show that this attack is effective across various GNN types, such as GraphSAGE, GCN, GAT, and GIN, across a variety of real-world datasets, and as a targeted and a non-targeted attack.

    J. Alush-Aben, L. Ackerman-Schraier, T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, 3D FLAT: Feasible Learned Acquisition Trajectories for Accelerated MRI, Proc. Machine Learning for Medical Image Reconstruction, MICCAI 2020

    3D FLAT: Feasible Learned Acquisition Trajectories for Accelerated MRI

    J. Alush-Aben, L. Ackerman-Schraier, T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein
    Proc. Machine Learning for Medical Image Reconstruction, MICCAI 2020

    Magnetic Resonance Imaging (MRI) has long been considered to be among the gold standards of today’s diagnostic imaging. The most significant drawback of MRI is long acquisition times, prohibiting its use in standard practice for some applications. Compressed sensing (CS) proposes to subsample the k-space (the Fourier domain dual to the physical space of spatial coordinates) leading to significantly accelerated acquisition. However, the benefit of compressed sensing has not been fully exploited; most of the sampling densities obtained through CS do not produce a trajectory that obeys the stringent constraints of the MRI machine imposed in practice. Inspired by recent success of deep learning-based approaches for image reconstruction and ideas from computational imaging on learning-based design of imaging systems, we introduce 3D FLAT, a novel protocol for data-driven design of 3D non-Cartesian accelerated trajectories in MRI. Our proposal leverages the entire 3D k-space to simultaneously learn a physically feasible acquisition trajectory with a reconstruction method. Experimental results, performed as a proof-of-concept, suggest that 3D FLAT achieves higher image quality for a given readout time compared to standard trajectories such as radial, stack-of-stars, or 2D learned trajectories (trajectories that evolve only in the 2D plane while fully sampling along the third dimension). Furthermore, we demonstrate evidence supporting the significant benefit of performing MRI acquisitions using non-Cartesian 3D trajectories over 2D non-Cartesian trajectories acquired slice-wise.

    T. Weiss, S. Vedula, O. Senouf, O. Michailovich, A. M. Bronstein, Towards learned optimal q-space sampling in diffusion MRI, Proc. Computational Diffusion MRI, MICCAI 2020 details

    Towards learned optimal q-space sampling in diffusion MRI

    T. Weiss, S. Vedula, O. Senouf, O. Michailovich, A. M. Bronstein
    Proc. Computational Diffusion MRI, MICCAI 2020

    Fiber tractography is an important tool of computational neuroscience that enables reconstructing the spatial connectivity and organization of white matter of the brain. Fiber tractography takes advantage of diffusion Magnetic Resonance Imaging (dMRI), which allows measuring the apparent diffusivity of cerebral water along different spatial directions. Unfortunately, collecting such data comes at the price of reduced spatial resolution and substantially elevated acquisition times, which limits the clinical applicability of dMRI. This problem has thus far been addressed using two principal strategies. Most efforts have been directed towards improving the quality of signal estimation for an arbitrary but fixed sampling scheme (defined through the choice of diffusion encoding gradients). On the other hand, optimization over the sampling scheme has also proven to be effective. Inspired by the previous results, the present work consolidates the above strategies into a unified estimation framework, in which the optimization is carried out with respect to both the estimation model and the sampling design concurrently. The proposed solution offers substantial improvements in the quality of signal estimation as well as the accuracy of the ensuing analysis by means of fiber tractography. While proving the optimality of the learned estimation models would probably need more extensive evaluation, we nevertheless claim that the learned sampling schemes can be of immediate use, offering a way to improve the dMRI analysis without the necessity of deploying the neural network used for their estimation. We present a comprehensive comparative analysis based on the Human Connectome Project data.

    E. Zheltonozhskii, C. Baskin, A. M. Bronstein, A. Mendelson, Self-supervised learning for large-scale unsupervised image clustering, NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020 details

    Self-supervised learning for large-scale unsupervised image clustering

    E. Zheltonozhskii, C. Baskin, A. M. Bronstein, A. Mendelson
    NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020

    Unsupervised learning has always been appealing to machine learning researchers and practitioners, allowing them to avoid an expensive and complicated process of labeling the data. However, unsupervised learning of complex data is challenging, and even the best approaches show much weaker performance than their supervised counterparts. Self-supervised deep learning has become a strong instrument for representation learning in computer vision. However, those methods have not been evaluated in a fully unsupervised setting.
    In this paper, we propose a simple scheme for unsupervised classification based on self-supervised representations. We evaluate the proposed approach with several recent self-supervised methods showing that it achieves competitive results for ImageNet classification (39% accuracy on ImageNet with 1000 clusters and 46% with overclustering). We suggest adding the unsupervised evaluation to a set of standard benchmarks for self-supervised learning.
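
    The abstract does not spell out how cluster assignments are scored against ImageNet labels. As a minimal sketch, assuming a many-to-one evaluation in which every cluster is mapped to its most frequent ground-truth class (a common choice when there are more clusters than classes), accuracy could be computed as below. The function name `majority_vote_accuracy` and the toy data are illustrative and not taken from the paper.

```python
from collections import Counter, defaultdict


def majority_vote_accuracy(cluster_ids, labels):
    """Map each cluster to its most frequent label and return the resulting accuracy."""
    if len(cluster_ids) != len(labels):
        raise ValueError("cluster_ids and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")

    # Count how often each ground-truth label occurs inside each cluster.
    per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, labels):
        per_cluster[cluster][label] += 1

    # A sample counts as correct if it carries its cluster's majority label.
    correct = sum(counts.most_common(1)[0][1] for counts in per_cluster.values())
    return correct / len(labels)


# Toy example (invented data): 3 clusters for 2 classes, i.e. overclustering.
clusters = [0, 0, 0, 1, 1, 2, 2, 2]
labels = ["cat", "cat", "dog", "dog", "dog", "cat", "cat", "cat"]
accuracy = majority_vote_accuracy(clusters, labels)
```

    When the number of clusters equals the number of classes, a one-to-one assignment (for example via the Hungarian algorithm) is the usual alternative to the majority vote used here.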

     

    G. Mariani, L. Cosmo, A. M. Bronstein, E. Rodolà, Generating adversarial surfaces via band-limited perturbations, Computer Graphics Forum, 2020 details

    Generating adversarial surfaces via band-limited perturbations

    G. Mariani, L. Cosmo, A. M. Bronstein, E. Rodolà
    Computer Graphics Forum, 2020

    Adversarial attacks have demonstrated remarkable efficacy in altering the output of a learning model by applying a minimal perturbation to the input data. While increasing attention has been placed on the image domain, the study of adversarial perturbations for geometric data has been notably lagging behind. In this paper, we show that effective adversarial attacks can be concocted for surfaces embedded in 3D, under weak smoothness assumptions on the perceptibility of the attack. We address the case of deformable 3D shapes in particular, and introduce a general model that is not tailored to any specific surface representation, nor does it assume access to a parametric description of the 3D object. In this context, we consider targeted and untargeted variants of the attack, demonstrating compelling results in either case. We further show how discovering adversarial examples, and then using them for adversarial training, leads to an increase in both robustness and accuracy. Our findings are confirmed empirically over multiple datasets spanning different semantic classes and deformations.

    B. Chmiel, C. Baskin, R. Banner, E. Zheltonozhskii, Y. Yermolin, A. Karbachevsky, A. M. Bronstein, A. Mendelson, Feature map transform coding for energy-efficient CNN inference, Proc. Intl. Joint Conf. on Neural Networks (IJCNN), 2020 details

    Feature map transform coding for energy-efficient CNN inference

    B. Chmiel, C. Baskin, R. Banner, E. Zheltonozhskii, Y. Yermolin, A. Karbachevsky, A. M. Bronstein, A. Mendelson
    Proc. Intl. Joint Conf. on Neural Networks (IJCNN), 2020

    Convolutional neural networks (CNNs) achieve state-of-the-art accuracy in a variety of tasks in computer vision and beyond. One of the major obstacles hindering the ubiquitous use of CNNs for inference on low-power edge devices is their relatively high computational complexity and memory bandwidth requirements. The latter often dominates the energy footprint on modern hardware. In this paper, we introduce a lossy transform coding approach, inspired by image and video compression, designed to reduce the memory bandwidth due to the storage of intermediate activation calculation results. Our method exploits the high correlations between feature maps and adjacent pixels and makes it possible to halve the data transfer volumes to the main memory without re-training. We analyze the performance of our approach on a variety of CNN architectures and demonstrate that an FPGA implementation of ResNet18 with our approach results in a reduction of around 40% in the memory energy footprint compared to a quantized network, with negligible impact on accuracy. A reference implementation accompanies the paper.

    E. Amrani, R. Ben-Ari, T. Hakim, A. M. Bronstein, Self-Supervised Object Detection and Retrieval Using Unlabeled Videos, CVPR workshop, 2020 details

    Self-Supervised Object Detection and Retrieval Using Unlabeled Videos

    E. Amrani, R. Ben-Ari, T. Hakim, A. M. Bronstein
    CVPR workshop, 2020

    Unlabeled video in the wild presents a valuable, yet so far unharnessed, source of information for learning vision tasks. We present the first attempt at fully self-supervised learning of object detection from subtitled videos without any manual object annotation. To this end, we use the How2 multi-modal collection of instructional videos with English subtitles. We pose the problem as learning with weakly- and noisily-labeled data, and propose a novel training model that can confront high noise levels and yet train a classifier to localize the object of interest in the video frames, without any manual labeling involved. We evaluate our approach on a set of 11 manually annotated objects in over 5000 frames and compare it to an existing weakly-supervised approach as a baseline. Benchmark data and code will be released upon acceptance of the paper.

    D. H. Silver, M. Feder, Y. Gold-Zamir, A. L. Polsky, S. Rosentraub, E. Shachor, A. Weinberger, P. Mazur, V. D. Zukin, A. M. Bronstein, Data-driven prediction of embryo implantation probability using IVF time-lapse imaging, Proc. MIDL, 2020 details

    Data-driven prediction of embryo implantation probability using IVF time-lapse imaging

    D. H. Silver, M. Feder, Y. Gold-Zamir, A. L. Polsky, S. Rosentraub, E. Shachor, A. Weinberger, P. Mazur, V. D. Zukin, A. M. Bronstein
    Proc. MIDL, 2020

    The process of fertilizing a human egg outside the body in order to help those suffering from infertility to conceive is known as in vitro fertilization (IVF). Despite being the most effective method of assisted reproductive technology (ART), the average success rate of IVF is a mere 20-40%. One step that is critical to the success of the procedure is selecting which embryo to transfer to the patient, a process typically conducted manually and without any universally accepted and standardized criteria. In this paper, we describe a novel data-driven system trained to directly predict embryo implantation probability from embryogenesis time-lapse imaging videos. Using retrospectively collected videos from 272 embryos, we demonstrate that, when compared to an external panel of embryologists, our algorithm results in a 12% increase of positive predictive value and a 29% increase of negative predictive value.
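
    For readers less familiar with the two metrics reported above, the following minimal sketch shows how positive and negative predictive value are defined in terms of confusion-matrix counts. The function name and the counts are invented for illustration only; they are not results from the paper.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from true/false positive and negative counts."""
    if tp + fp == 0:
        raise ValueError("PPV is undefined without any positive predictions")
    if tn + fn == 0:
        raise ValueError("NPV is undefined without any negative predictions")
    ppv = tp / (tp + fp)  # fraction of predicted positives that are truly positive
    npv = tn / (tn + fn)  # fraction of predicted negatives that are truly negative
    return ppv, npv


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
ppv, npv = predictive_values(tp=30, fp=20, tn=40, fn=10)
```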

    T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, O. Michailovich, M. Zibulevsky, Joint learning of Cartesian undersampling and reconstruction for accelerated MRI, Proc. Int’l Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2020 details

    Joint learning of Cartesian undersampling and reconstruction for accelerated MRI

    T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, O. Michailovich, M. Zibulevsky
    Proc. Int’l Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2020

    Magnetic Resonance Imaging (MRI) is considered today the gold-standard modality for soft tissues. The long acquisition times, however, make it more prone to motion artifacts and contribute to the relatively high costs of this examination. Over the years, multiple studies have concentrated on designing reduced measurement schemes and image reconstruction schemes for MRI; however, these problems have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated the growing success of the simultaneous learning-based design of the acquisition and reconstruction schemes, manifesting significant improvement in the reconstruction quality with a constrained time budget. Inspired by these successes, in this work, we propose to learn accelerated MR acquisition schemes (in the form of Cartesian trajectories) jointly with the image reconstruction operator. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using the learned Cartesian trajectories at different speed-up rates.
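
    To make the notion of Cartesian undersampling concrete, here is a minimal NumPy sketch of the forward model only: whole k-space rows are kept or discarded, and the image is recovered by zero-filled inverse FFT. It uses a fixed, hand-picked mask and a random test image; the learned mask and the learned reconstruction operator described in the abstract are not reproduced here. The function name `cartesian_undersample` and the choice of rows as the undersampled axis are assumptions made for illustration.

```python
import numpy as np


def cartesian_undersample(image, kept_rows):
    """Keep only the selected k-space rows and return the zero-filled reconstruction."""
    kspace = np.fft.fft2(image)                # unshifted layout: row 0 holds the DC line
    mask = np.zeros(image.shape[0], dtype=bool)
    mask[list(kept_rows)] = True
    undersampled = kspace * mask[:, None]      # zero out entire rows (Cartesian lines)
    zero_filled = np.fft.ifft2(undersampled)   # naive reconstruction, generally complex
    return mask, undersampled, zero_filled


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))            # stand-in for an MR image
mask, undersampled, recon = cartesian_undersample(image, kept_rows=[0, 1, 7])
acceleration = image.shape[0] / mask.sum()     # 8 rows / 3 acquired rows
```

    In a jointly learned pipeline, the binary mask would be replaced by trainable parameters and the inverse FFT would be followed by a trainable reconstruction network, so that both can be optimized against the same reconstruction loss.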

    S. Sommer, A. M. Bronstein, Horizontal flows and manifold stochastics in geometric deep learning, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2020 details

    Horizontal flows and manifold stochastics in geometric deep learning

    S. Sommer, A. M. Bronstein
    IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2020

    We introduce two constructions in geometric deep learning for 1) transporting orientation-dependent convolutional filters over a manifold in a continuous way and thereby defining a convolution operator that naturally incorporates the rotational effect of holonomy; and 2) allowing efficient evaluation of manifold convolution layers by sampling manifold-valued random variables that center around a weighted Brownian motion maximum likelihood mean. Both methods are inspired by stochastics on manifolds and geometric statistics, and provide examples of how stochastic methods (here, horizontal frame bundle flows and non-linear bridge sampling schemes) can be used in geometric deep learning. We outline the theoretical foundation of the two methods, discuss their relation to Euclidean deep networks and existing methodology in geometric deep learning, and establish important properties of the proposed constructions.

    K. Rotker, D. Ben-Bashat, A. M. Bronstein, Over-parameterized models for vector fields, SIAM Journal on Imaging Sciences (SIIMS), 2020 details

    Over-parameterized models for vector fields

    K. Rotker, D. Ben-Bashat, A. M. Bronstein
    SIAM Journal on Imaging Sciences (SIIMS), 2020

    Vector fields arise in a variety of measurement and visualization techniques such as fluid flow imaging, motion estimation, deformation measures, and color imaging, leading to a better understanding of physical phenomena. Recent progress in vector field imaging technologies has emphasized the need for efficient noise removal and reconstruction algorithms. A key ingredient in the success of extracting signals from noisy measurements is prior information, which can often be represented as a parameterized model. In this work, we extend the over-parameterization variational framework in order to perform model-based reconstruction of vector fields. The over-parameterization methodology combines local modeling of the data with global model parameter regularization. By considering the vector field as a linear combination of basis vector fields and appropriate scale and rotation coefficients, the denoising problem reduces to a simpler form of coefficient recovery. We introduce two versions of the over-parameterization framework: a total variation-based method and a sparsity-based method, the latter relying on the co-sparse analysis model. We demonstrate the efficiency of the proposed frameworks for two- and three-dimensional vector fields with linear and quadratic over-parameterization models.

    A. Tsitsulin, M. Munkhoeva, D. Mottin, P. Karras, A. M. Bronstein, I. Oseledets, E. Müller, Intrinsic multi-scale evaluation of generative models, Proc. ICLR, 2020 details

    Intrinsic multi-scale evaluation of generative models

    A. Tsitsulin, M. Munkhoeva, D. Mottin, P. Karras, A. M. Bronstein, I. Oseledets, E. Müller
    Proc. ICLR, 2020

    Generative models are often used to sample high-dimensional data points from a manifold with small intrinsic dimension. Existing techniques for comparing generative models focus on global data properties such as mean and covariance; in that sense, they are extrinsic and uni-scale. We develop the first, to our knowledge, intrinsic and multi-scale method for characterizing and comparing underlying data manifolds, based on comparing all data moments by lower-bounding the spectral notion of the Gromov-Wasserstein distance between manifolds. In a thorough experimental study, we demonstrate that our method effectively evaluates the quality of generative models; further, we showcase its efficacy in discerning the disentanglement process in neural networks.

    A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson, HCM: Hardware-aware complexity metric for neural network architectures, arXiv:2004.08906, 2020 details

    HCM: Hardware-aware complexity metric for neural network architectures

    A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson
    arXiv:2004.08906, 2020

    Convolutional Neural Networks (CNNs) have become common in many fields including computer vision, speech recognition, and natural language processing. Although CNN hardware accelerators are already included as part of many SoC architectures, the task of achieving high accuracy on resource-restricted devices is still considered challenging, mainly due to the vast number of design parameters that need to be balanced to achieve an efficient solution. Quantization techniques, when applied to the network parameters, lead to a reduction of power and area and may also change the ratio between communication and computation. As a result, some algorithmic solutions may suffer from a lack of memory bandwidth or computational resources and fail to achieve the expected performance due to hardware constraints. Thus, the system designer and the micro-architect need to understand at early development stages the impact of their high-level decisions (e.g., the architecture of the CNN and the number of bits used to represent its parameters) on the final product (e.g., the expected power saving, area, and accuracy). Unfortunately, existing tools fall short of supporting such decisions. This paper introduces a hardware-aware complexity metric that aims to assist the system designer of neural network architectures throughout the entire project lifetime (especially at its early stages) by predicting the impact of architectural and micro-architectural decisions on the final product. We demonstrate how the proposed metric can help evaluate different design alternatives of neural network models on resource-restricted devices such as real-time embedded systems, and avoid design mistakes at early stages.

    E. Zheltonozhskii, C. Baskin, Y. Nemcovsky, B. Chmiel, A. Mendelson, A. M. Bronstein, Colored noise injection for training adversarially robust neural networks, arXiv:2003.02188, 2020 details

    Colored noise injection for training adversarially robust neural networks

    E. Zheltonozhskii, C. Baskin, Y. Nemcovsky, B. Chmiel, A. Mendelson, A. M. Bronstein
    arXiv:2003.02188, 2020

    Even though deep learning has shown unmatched performance on various tasks, neural networks have been shown to be vulnerable to small adversarial perturbations of the input, which lead to significant performance degradation. In this work, we extend the idea of adding independent Gaussian noise to weights and activations during adversarial training (PNI) to the injection of colored noise for defense against common white-box and black-box attacks. We show that our approach outperforms PNI and various previous approaches in terms of adversarial accuracy on the CIFAR-10 dataset. In addition, we provide an extensive ablation study of the proposed method justifying the chosen configurations.
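
    The difference between independent and colored noise can be illustrated with a short NumPy sketch. It assumes that "colored" means zero-mean Gaussian noise with a non-diagonal covariance, produced by multiplying white noise with a factor matrix; how the paper parameterizes or learns that covariance is not stated in the abstract, so the helper names and the 2x2 factor below are purely hypothetical. Only the sampling step is shown, not the adversarial training loop.

```python
import numpy as np


def white_noise(rng, n_samples, dim, sigma):
    """Independent Gaussian noise: covariance sigma**2 * I."""
    return sigma * rng.standard_normal((n_samples, dim))


def colored_noise(rng, n_samples, factor):
    """Correlated Gaussian noise with covariance factor @ factor.T."""
    z = rng.standard_normal((n_samples, factor.shape[1]))
    return z @ factor.T


rng = np.random.default_rng(0)

# Hypothetical factor; in training it would be a learnable parameter.
factor = np.array([[1.0, 0.0],
                   [0.8, 0.6]])

samples = colored_noise(rng, 200_000, factor)
empirical_cov = np.cov(samples, rowvar=False)   # estimated from the samples
target_cov = factor @ factor.T                  # [[1.0, 0.8], [0.8, 1.0]]
```

    With a diagonal factor the construction reduces to independent per-coordinate noise, which is the setting the abstract attributes to PNI.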

    A. Livne, A. M. Bronstein, R. Kimmel, Z. Aviv, S. Grofit, Do we need depth in state-of-the-art face authentication?, Proc. IEEE Int'l Conf. on 3D Vision (3DV), 2020 details

    Do we need depth in state-of-the-art face authentication?

    A. Livne, A. M. Bronstein, R. Kimmel, Z. Aviv, S. Grofit
    Proc. IEEE Int'l Conf. on 3D Vision (3DV), 2020

    Some face recognition methods are designed to utilize geometric features extracted from depth sensors to handle the challenges of single-image based recognition technologies. However, calculating the geometrical data is an expensive and challenging process. Here, we introduce a novel method that learns distinctive geometric features from stereo camera systems without the need to explicitly compute the facial surface or depth map. The raw face stereo images along with coordinate maps allow a CNN to learn geometric features. This way, we keep the simplicity and cost-efficiency of recogn