# Publications

Topics:
1. C. Baskin, B. Chmiel, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, CAT: Compression-aware training for bandwidth reduction, JMLR, 2021 details

## CAT: Compression-aware training for bandwidth reduction

C. Baskin, B. Chmiel, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson
JMLR, 2021
--->>

Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving visual processing tasks. One of the major obstacles hindering the ubiquitous use of CNNs for inference is their relatively high memory bandwidth requirements, which can be a main energy consumer and throughput bottleneck in hardware accelerators. Accordingly, an efficient feature map compression method can result in substantial performance gains. Inspired by quantization-aware training approaches, we propose a compression-aware training (CAT) method that involves training the model in a way that allows better compression of feature maps during inference. Our method trains the model to achieve low-entropy feature maps, which enables efficient compression at inference time using classical transform coding methods. CAT significantly improves the state-of-the-art results reported for quantization. For example, on ResNet-34 we achieve 73.1% accuracy (0.2% degradation from the baseline) with an average representation of only 1.79 bits per value.

E. Amrani, A. M. Bronstein, Self-supervised classification network, arXiv:2103.10994, 2021 details

## Self-supervised classification network

E. Amrani, A. M. Bronstein
arXiv:2103.10994, 2021
--->>

We present Self-Classifier — a novel self-supervised end-to-end classification neural network. Self-Classifier learns labels and representations simultaneously in a single-stage end-to-end manner by optimizing for same-class prediction of two augmented views of the same sample. To guarantee non-degenerate solutions (i.e., solutions where all labels are assigned to the same class), a uniform prior is asserted on the labels. We show mathematically that unlike the regular cross-entropy loss, our approach avoids such solutions. Self-Classifier is simple to implement and is scalable to practically unlimited amounts of data. Unlike other unsupervised classification approaches, it does not require any form of pre-training or the use of expectation maximization algorithms, pseudo-labelling or external clustering. Unlike other contrastive learning representation learning approaches, it does not require a memory bank or a second network. Despite its relative simplicity, our approach achieves comparable results to state-of-the-art performance with ImageNet, CIFAR10 and CIFAR100 for its two objectives: unsupervised classification and unsupervised representation learning. Furthermore, it is the first unsupervised end-to-end classification network to perform well on the large-scale ImageNet dataset. Code will be made available.

E. Rozenberg, D. Freedman, A. M. Bronstein, Learning to Localize Objects Using Limited Annotation, With Applications to Thoracic Diseases, IEEE Access Vol. 9 details

## Learning to Localize Objects Using Limited Annotation, With Applications to Thoracic Diseases

E. Rozenberg, D. Freedman, A. M. Bronstein
IEEE Access Vol. 9
--->>

Motivation: The localization of objects in images is a longstanding objective within the field of image processing. Most current techniques are based on machine learning approaches, which typically require careful annotation of training samples in the form of expensive bounding box labels. The need for such large-scale annotation has only been exacerbated by the widespread adoption of deep learning techniques within the image processing community: deep learning is notoriously data-hungry. Method: In this work, we attack this problem directly by providing a new method for learning to localize objects with limited annotation: most training images can simply be annotated with their whole image labels (and no bounding box), with only a small fraction marked with bounding boxes. The training is driven by a novel loss function, which is a continuous relaxation of a well-defined discrete formulation of weakly supervised learning. Care is taken to ensure that the loss is numerically well-posed. Additionally, we propose a neural network architecture which accounts for both patch dependence, through the use of Conditional Random Field layers, and shift-invariance, through the inclusion of anti-aliasing filters. Results: We demonstrate our method on the task of localizing thoracic diseases in chest X-ray images, achieving state-of-the-art performance on the ChestX-ray14 dataset. We further show that with a modicum of additional effort our technique can be extended from object localization to object detection, attaining high quality results on the Kaggle RSNA Pneumonia Detection Challenge. Conclusion: The technique presented in this paper has the potential to enable high accuracy localization in regimes in which annotated data is either scarce or expensive to acquire. Future work will focus on applying the ideas presented in this paper to the realm of semantic segmentation.

T. Weiss, O. Senouf, S. Vedula, O. Michailovich, M. Zibulevsky, A. M. Bronstein, PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI, Journal of Machine Learning for Biomedical Imaging (MELBA), 2021 details

## PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI

T. Weiss, O. Senouf, S. Vedula, O. Michailovich, M. Zibulevsky, A. M. Bronstein
Journal of Machine Learning for Biomedical Imaging (MELBA), 2021
--->>

Magnetic Resonance Imaging (MRI) has long been considered to be among “the gold standards” of diagnostic medical imaging. The long acquisition times, however, render MRI prone to motion artifacts, let alone their adverse contribution to the relatively high costs of MRI examination. Over the last few decades, multiple studies have focused on the development of both physical and post-processing methods for accelerated acquisition of MRI scans. These two approaches, however, have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of the concurrent learning-based design of data acquisition and image reconstruction schemes. Such schemes have already demonstrated substantial effectiveness, leading to considerably shorter acquisition times and improved quality of image reconstruction. Inspired by this initial success, in this work, we propose a novel approach to the learning of optimal schemes for conjoint acquisition and reconstruction of MRI scans, with the optimization, carried out simultaneously with respect to the time-efficiency of data acquisition and the quality of resulting reconstructions. To be of practical value, the schemes are encoded in the form of general k-space trajectories, whose associated magnetic gradients are constrained to obey a set of predefined hardware requirements (as defined in terms of, e.g., peak currents and maximum slew rates of magnetic gradients). With this proviso in mind, we propose a novel algorithm for the end-to-end training of a combined acquisition-reconstruction pipeline using a deep neural network with differentiable forward- and backpropagation operators. We also demonstrate the effectiveness of the proposed solution in application to both image reconstruction and image segmentation, reporting substantial improvements in terms of acceleration factors as well as the quality of these end tasks.

A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky, Detector-free weakly supervised grounding by separation, arXiv:2104.09829, 2021 details

## Detector-free weakly supervised grounding by separation

A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky
arXiv:2104.09829, 2021
--->>

Nowadays, there is an abundance of data involving images and surrounding free-form text weakly corresponding to those images. Weakly Supervised phrase-Grounding (WSG) deals with the task of using this data to learn to localize (or to ground) arbitrary text phrases in images without any additional annotations. However, most recent SotA methods for WSG assume the existence of a pre-trained object detector, relying on it to produce the ROIs for localization. In this work, we focus on the task of Detector-Free WSG (DF-WSG) to solve WSG without relying on a pre-trained detector. We directly learn everything from the images and associated free-form text pairs, thus potentially gaining an advantage on the categories unsupported by the detector. The key idea behind our proposed Grounding by Separation (GbS) method is synthesizing text to image-regions’ associations by random alpha-blending of arbitrary image pairs and using the corresponding texts of the pair as conditions to recover the alpha map from the blended image via a segmentation network. At test time, this allows using the query phrase as a condition for a non-blended query image, thus interpreting the test image as a composition of a region corresponding to the phrase and the complement region. Using this approach we demonstrate a significant accuracy improvement, of up to 8.5% over previous DF-WSG SotA, for a range of benchmarks including Flickr30K, Visual Genome, and ReferIt, as well as a significant complementary improvement (above 7%) over the detector-based approaches for WSG.

Y. Elul, A. Rosenberg, A. Schuster, A. M. Bronstein, Y. Yaniv, Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis, PNAS 2021 details

## Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis

Y. Elul, A. Rosenberg, A. Schuster, A. M. Bronstein, Y. Yaniv
PNAS 2021
--->>

Despite their great promise, artificial intelligence (AI) systems have yet to become ubiquitous in the daily practice of medicine largely due to several crucial unmet needs of healthcare practitioners. These include lack of explanations in clinically meaningful terms, handling the presence of unknown medical conditions, and transparency regarding the system’s limitations, both in terms of statistical performance as well as recognizing situations for which the system’s predictions are irrelevant. We articulate these unmet clinical needs as machine-learning (ML) problems and systematically address them with cutting-edge ML techniques. We focus on electrocardiogram (ECG) analysis as an example domain in which AI has great potential and tackle two challenging tasks: the detection of a heterogeneous mix of known and unknown arrhythmias from ECG and the identification of underlying cardio-pathology from segments annotated as normal sinus rhythm recorded in patients with an intermittent arrhythmia. We validate our methods by simulating a screening for arrhythmias in a large-scale population while adhering to statistical significance requirements. Specifically, our system 1) visualizes the relative importance of each part of an ECG segment for the final model decision; 2) upholds specified statistical constraints on its out-of-sample performance and provides uncertainty estimation for its predictions; 3) handles inputs containing unknown rhythm types; and 4) handles data from unseen patients while also flagging cases in which the model’s outputs are not usable for a specific patient. This work represents a significant step toward overcoming the limitations currently impeding the integration of AI into clinical practice in cardiology and medicine in general.

E. Amrani, R. Ben-Ari, D. Rotman, A. M. Bronstein, Noise estimation using density estimation for self-supervised multimodal learning, AAAI, 2021 details

## Noise estimation using density estimation for self-supervised multimodal learning

E. Amrani, R. Ben-Ari, D. Rotman, A. M. Bronstein
AAAI, 2021
--->>

One of the key factors of enabling machine learning models to comprehend and solve real-world tasks is to leverage multimodal data. Unfortunately, the annotation of multimodal data is challenging and expensive. Recently, self-supervised multimodal methods that combine vision and language were proposed to learn multimodal representations without annotation. However, these methods choose to ignore the presence of high levels of noise and thus yield sub-optimal results. In this work, we show that the problem of noise estimation for multimodal data can be reduced to a multimodal density estimation task. Using multimodal density estimation, we propose a noise estimation building block for multimodal representation learning that is based strictly on the inherent correlation between different modalities. We demonstrate how our noise estimation can be broadly integrated and achieves comparable results to state-of-the-art performance on five different benchmark datasets for two challenging multimodal tasks: Video Question Answering and Text-To-Video Retrieval.

O. Dahary, M. Jacoby, A. M. Bronstein, Digital Gimbal: End-to-end deep image stabilization with learnable exposure times, Proc. CVPR, 2021 details

## Digital Gimbal: End-to-end deep image stabilization with learnable exposure times

O. Dahary, M. Jacoby, A. M. Bronstein
Proc. CVPR, 2021
--->>

Mechanical image stabilization using actuated gimbals enables capturing long-exposure shots without suffering from blur due to camera motion. These devices, however, are often physically cumbersome and expensive, limiting their widespread use. In this work, we propose to digitally emulate a mechanically stabilized system from the input of a fast unstabilized camera. To exploit the trade-off between motion blur at long exposures and low SNR at short exposures, we train a CNN that estimates a sharp high-SNR image by aggregating a burst of noisy short-exposure frames, related by unknown motion. We further suggest learning the burst’s exposure times in an end-to-end manner, thus balancing the noise and blur across the frames. We demonstrate this method’s advantage over the traditional approach of deblurring a single image or denoising a fixed-exposure burst.

A. Boyarski, S. Vedula, A. M. Bronstein, Spectral geometric matrix completion, Proc. Mathematical and Scientific Machine Learning, 2021 details

## Spectral geometric matrix completion

A. Boyarski, S. Vedula, A. M. Bronstein
Proc. Mathematical and Scientific Machine Learning, 2021
--->>

Deep Matrix Factorization (DMF) is an emerging approach to the problem of reconstructing a matrix from a subset of its entries. Recent works have established that gradient descent applied to a DMF model induces an implicit regularization on the rank of the recovered matrix. Despite these promising theoretical results, empirical evaluation of vanilla DMF on real benchmarks exhibits poor reconstructions which we attribute to the extremely low number of samples available. We propose an explicit spectral regularization scheme that is able to make DMF models competitive on real benchmarks, while still maintaining the implicit regularization induced by gradient descent, thus enjoying the best of both worlds.

E. Rozenberg, A. Karnieli, O. Yesharim, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie, Inverse Design of Quantum Holograms in Three-Dimensional Nonlinear Photonic Crystals, CLEO, 2021 details

## Inverse Design of Quantum Holograms in Three-Dimensional Nonlinear Photonic Crystals

E. Rozenberg, A. Karnieli, O. Yesharim, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie
CLEO, 2021
--->>

We introduce a systematic approach for designing 3D nonlinear photonic crystals and pump beams for generating desired quantum correlations between structured photon-pairs. Our model is fully differentiable, allowing accurate and efficient learning and discovery of novel designs.

A. Karbachevsky, C. Baskin, E. Zheltonozshkii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson, Early-stage neural network hardware performance analysis, Sustainability 13(2):717, 2021 details

## Early-stage neural network hardware performance analysis

A. Karbachevsky, C. Baskin, E. Zheltonozshkii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson
Sustainability 13(2):717, 2021
--->>
The demand for running NNs in embedded environments has increased significantly in recent years due to the significant success of convolutional neural network (CNN) approaches in various tasks, including image recognition and generation. The task of achieving high accuracy on resource-restricted devices, however, is still considered to be challenging, which is mainly due to the vast number of design parameters that need to be balanced. While the quantization of CNN parameters leads to a reduction of power and area, it can also generate unexpected changes in the balance between communication and computation. This change is hard to evaluate, and the lack of balance may lead to lower utilization of either memory bandwidth or computational resources, thereby reducing performance. This paper introduces a hardware performance analysis framework for identifying bottlenecks in the early stages of CNN hardware design. We demonstrate how the proposed method can help in evaluating different architecture alternatives of resource-restricted CNN accelerators (e.g., part of real-time embedded systems) early in design stages and, thus, prevent making design mistakes.
Keywords: neural networks; accelerators; quantization; CNN architecture
C. Baskin, E. Schwartz, E. Zheltonozhskii, N. Liss, R. Giryes, A. M. Bronstein, A. Mendelson, UNIQ: Uniform noise injection for non-uniform quantization of neural networks, ACM Transactions on Computer Systems (TOCS), 2020 details

## UNIQ: Uniform noise injection for non-uniform quantization of neural networks

C. Baskin, E. Schwartz, E. Zheltonozhskii, N. Liss, R. Giryes, A. M. Bronstein, A. Mendelson
ACM Transactions on Computer Systems (TOCS), 2020
--->>

We present a novel method for training a neural network amenable to inference in low-precision arithmetic with quantized weights and activations. The training is performed in full precision with random noise injection emulating quantization noise. In order to circumvent the need to simulate realistic quantization noise distributions, the weight distributions are uniformized by a non-linear transfor- mation, and uniform noise is injected. This procedure emulates a non-uniform k-quantile quantizer at inference time, which adapts to the specific distribution of the quantized parameters. As a by-product of injecting noise to weights, we find that activations can also be quantized to as low as 8-bit with only a minor accuracy degradation. The method achieves state-of-the-art results for training low-precision networks on ImageNet. In particular, we observe no degradation in accuracy for MobileNet and ResNet-18/34/50 on ImageNet with as low as 4-bit quantization of weights. Our solution achieves the state-of-the-art results in accuracy, in the low computational budget regime, compared to similar models.

B. Finkelshtein, C. Baskin, E. Zheltonozhskii, U. Alon, Single-node attack for fooling graph neural networks, arXiv:2011.03574, 2020 details

## Single-node attack for fooling graph neural networks

B. Finkelshtein, C. Baskin, E. Zheltonozhskii, U. Alon
arXiv:2011.03574, 2020
--->>

Graph neural networks (GNNs) have shown broad applicability in a variety of domains. Some of these domains, such as social networks and product recommendations, are fertile ground for malicious users and behavior. In this paper, we show that GNNs are vulnerable to the extremely limited scenario of a single-node adversarial example, where the node cannot be picked by the attacker. That is, an attacker can force the GNN to classify any target node to a chosen label by only slightly perturbing another single arbitrary node in the graph, even when not being able to pick that specific attacker node. When the adversary is allowed to pick a specific attacker node, the attack is even more effective. We show that this attack is effective across various GNN types, such as GraphSAGE, GCN, GAT, and GIN, across a variety of real-world datasets, and as a targeted and a non-targeted attack.

J. Alush-Aben, L. Ackerman-Schraier, T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, 3D FLAT: Feasible Learned Acquisition Trajectories for Accelerated MRI, Proc. Machine Learning for Medical Image Reconstruction, MICCAI 2020 details

## 3D FLAT: Feasible Learned Acquisition Trajectories for Accelerated MRI

J. Alush-Aben, L. Ackerman-Schraier, T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein
Proc. Machine Learning for Medical Image Reconstruction, MICCAI 2020
--->>

Magnetic Resonance Imaging (MRI) has long been considered to be among the gold standards of today’s diagnostic imaging. The most significant drawback of MRI is long acquisition times, prohibiting its use in standard practice for some applications. Compressed sensing (CS) proposes to subsample the k-space (the Fourier domain dual to the physical space of spatial coordinates) leading to significantly accelerated acquisition. However, the benefit of compressed sensing has not been fully  exploited; most of the sampling densities obtained through CS do not produce a trajectory that obeys the stringent constraints of the MRI machine imposed in practice. Inspired by recent success of deep learning-based approaches for image reconstruction and ideas from computational imaging on learning-based design of imaging systems, we introduce 3D FLAT, a novel protocol for data-driven design of 3D non-Cartesian accelerated trajectories in MRI. Our proposal leverages the entire 3D k-space to simultaneously learn a physically feasible acquisition trajectory with a reconstruction method. Experimental results, performed as a proof-of-concept, suggest that 3D FLAT achieves higher image quality for a given readout time compared to standard trajectories such as radial, stack-of-stars, or 2D learned trajectories (trajectories that evolve only in the 2D plane while fully sampling along the third dimension). Furthermore, we demonstrate evidence supporting the significant benefit of performing MRI acquisitions using non-Cartesian 3D trajectories over 2D non-Cartesian trajectories acquired slice-wise.

T. Weiss, S. Vedula, O. Senouf, O. Michailovich, A. M. Bronstein, Towards learned optimal q-space sampling in diffusion MRI, Proc. Computational Diffusion MRI, MICCAI 2020 details

## Towards learned optimal q-space sampling in diffusion MRI

T. Weiss, S. Vedula, O. Senouf, O. Michailovich, A. M. Bronstein
Proc. Computational Diffusion MRI, MICCAI 2020
--->>

Fiber tractography is an important tool of computational neuroscience that enables reconstructing the spatial connectivity and organization of white matter of the brain. Fiber tractography takes advantage of diffusion Magnetic Resonance Imaging (dMRI) which allows measuring the apparent diffusivity of cerebral water along different spatial directions. Unfortunately, collecting such data comes at the price of reduced spatial resolution and substantially elevated acquisition times, which limits the clinical applicability of dMRI. This problem has been thus far addressed using two principal strategies. Most of the efforts have been extended towards improving the quality of signal estimation for any, yet fixed sampling scheme (defined through the choice of diffusion encoding gradients). On the other hand, optimization over the sampling scheme has also proven to be effective. Inspired by the previous results, the present work consolidates the above strategies into a unified estimation framework, in which the optimization is carried out with respect to both estimation model and sampling design concurrently. The proposed solution offers substantial improvements in the quality of signal estimation as well as the accuracy of ensuing analysis by means of fiber tractography. While proving the optimality of the learned estimation models would probably need more extensive evaluation, we nevertheless claim that the learned sampling schemes can be of immediate use, offering a way to improve the dMRI analysis without the necessity of deploying the neural network used for their estimation. We present a comprehensive comparative analysis based on the Human Connectome Project data.

E. Zheltonozhskii, C. Baskin, A. M. Bronstein, A. Mendelson, Self-supervised learning for large-scale unsupervised image clustering, NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020 details

## Self-supervised learning for large-scale unsupervised image clustering

E. Zheltonozhskii, C. Baskin, A. M. Bronstein, A. Mendelson
NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020
--->>

Unsupervised learning has always been appealing to machine learning researchers and practitioners, allowing them to avoid an expensive and complicated process of labeling the data. However, unsupervised learning of complex data is challenging, and even the best approaches show much weaker performance than their supervised counterparts. Self-supervised deep learning has become a strong instrument for representation learning in computer vision. However, those methods have not been evaluated in a fully unsupervised setting.
In this paper, we propose a simple scheme for unsupervised classification based on self-supervised representations. We evaluate the proposed approach with several recent self-supervised methods showing that it achieves competitive results for ImageNet classification (39% accuracy on ImageNet with 1000 clusters and 46% with overclustering). We suggest adding the unsupervised evaluation to a set of standard benchmarks for self-supervised learning.

G. Mariani, L. Cosmo, A. M. Bronstein, E. Rodolà, Generating adversarial surfaces via band-limited perturbations, Computer Graphics Forum, 2020 details

## Generating adversarial surfaces via band-limited perturbations

G. Mariani, L. Cosmo, A. M. Bronstein, E. Rodolà
Computer Graphics Forum, 2020
--->>

Adversarial attacks have demonstrated remarkable efﬁcacy in altering the output of a learning model by applying a minimal perturbation to the input data. While increasing attention has been placed on the image domain, however, the study of adversarial perturbations for geometric data has been notably lagging behind. In this paper, we show that effective adversarial attacks can be concocted for surfaces embedded in 3D, under weak smoothness assumptions on the perceptibility of the attack. We address the case of deformable 3D shapes in particular, and introduce a general model that is not tailored to any speciﬁc surface representation, nor does it assume access to a parametric description of the 3D object.In this context, we consider targeted and untargeted variants of the attack, demonstrating compelling results in either case. We further show how discovering adversarial examples, and then using them for adversarial training, leads to an increase in both robustness and accuracy. Our ﬁndings are conﬁrmed empirically over multiple datasets spanning different semantic classes and deformations.

E. Amrani, R. Ben-Ari, T. Hakim, A. M. Bronstein, Self-Supervised Object Detection and Retrieval Using Unlabeled Videos, CVPR workshop, 2020 details

## Self-Supervised Object Detection and Retrieval Using Unlabeled Videos

E. Amrani, R. Ben-Ari, T. Hakim, A. M. Bronstein
CVPR workshop, 2020
--->>

Unlabeled video in the wild presents a valuable, yet so far unharnessed, source of information for learning vision tasks. We present the first attempt of fully self-supervised learning of object detection from subtitled videos without any manual object annotation. To this end, we use the How2 multi-modal collection of instructional videos with English subtitles. We pose the problem as learning with a weakly- and noisily-labeled data, and propose a novel training model that can confront high noise levels, and yet train a classifier to localize the object of interest in the video frames, without any manual labeling involved. We evaluate our approach on a set of 11 manually annotated objects in over 5000 frames and compare it to an existing weakly-supervised approach as baseline. Benchmark data and code will be released upon acceptance of the paper.

D. H. Silver, M. Feder, Y. Gold-Zamir, A. L. Polsky, S. Rosentraub, E. Shachor, A. Weinberger, P. Mazur, V. D. Zukin, A. M. Bronstein, Data-driven prediction of embryo implantation probability using IVF time-lapse imaging, Proc. MIDL, 2020 details

## Data-driven prediction of embryo implantation probability using IVF time-lapse imaging

D. H. Silver, M. Feder, Y. Gold-Zamir, A. L. Polsky, S. Rosentraub, E. Shachor, A. Weinberger, P. Mazur, V. D. Zukin, A. M. Bronstein
Proc. MIDL, 2020

The process of fertilizing a human egg outside the body in order to help those suffering from infertility to conceive is known as in vitro fertilization (IVF). Despite being the most effective method of assisted reproductive technology (ART), the average success rate of IVF is a mere 20-40%. One step that is critical to the success of the procedure is selecting which embryo to transfer to the patient, a process typically conducted manually and without any universally accepted and standardized criteria. In this paper, we describe a novel data-driven system trained to directly predict embryo implantation probability from embryogenesis time-lapse imaging videos. Using retrospectively collected videos from 272 embryos, we demonstrate that, when compared to an external panel of embryologists, our algorithm results in a 12% increase of positive predictive value and a 29% increase of negative predictive value.

S. Sommer, A. M. Bronstein, Horizontal flows and manifold stochastics in geometric deep learning, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2020 details

## Horizontal flows and manifold stochastics in geometric deep learning

S. Sommer, A. M. Bronstein
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2020
--->>

We introduce two constructions in geometric deep learning for 1) transporting orientation-dependent convolutional filters over a manifold in a continuous way and thereby defining a convolution operator that naturally incorporates the rotational effect of holonomy; and 2) allowing efficient evaluation of manifold convolution layers by sampling manifold valued random variables that center around a weighted Brownian motion maximum likelihood mean. Both methods are inspired by stochastics on manifolds and geometric statistics, and provide examples of how stochastic methods — here horizontal frame bundle flows and non-linear bridge sampling schemes, can be used in geometric deep learning. We outline the theoretical foundation of the two methods, discuss their relation to Euclidean deep networks and existing methodology in geometric deep learning, and establish important properties of the proposed constructions.

K. Rotker, D. Ben-Bashat, A. M. Bronstein, Over-parameterized models for vector fields, SIAM Journal on Imaging Sciences (SIIMS), 2020 details

## Over-parameterized models for vector fields

K. Rotker, D. Ben-Bashat, A. M. Bronstein
SIAM Journal on Imaging Sciences (SIIMS), 2020
--->>

Vector fields arise in a variety of quantity measure and visualization techniques such as fluid flow imaging, motion estimation, deformation measures, and color imaging, leading to a better understanding of physical phenomena. Recent progress in vector field imaging technologies has emphasized the need for efficient noise removal and reconstruction algorithms. A key ingredient in the success of extracting signals from noisy measurements is prior information, which can often be represented as a parameterized model. In this work, we extend the over-parameterization variational framework in order to perform model-based reconstruction of vector fields. The over-parameterization methodology combines local modeling of the data with global model parameter regularization. By considering the vector field as a linear combination of basis vector fields and appropriate scale and rotation coefficients, the denoising problem reduces to a simpler form of coefficient recovery. We introduce two versions of the over-parameterization framework: total variation-based method and sparsity-based method, relying on the co-sparse analysis model. We demonstrate the efficiency of the proposed frameworks for two- and three-dimensional vector fields with linear and quadratic over-parameterization models.

A. Tsitsulin, M. Munkhoeva, D. Mottin, P. Karras. A. M. Bronstein, I. Oseledets, E. Müller, Intrinsic multi-scale evaluation of generative models, Proc. ICLR, 2020 details

## Intrinsic multi-scale evaluation of generative models

A. Tsitsulin, M. Munkhoeva, D. Mottin, P. Karras. A. M. Bronstein, I. Oseledets, E. Müller
Proc. ICLR, 2020
--->>

Generative models are often used to sample high-dimensional data points from a manifold with small intrinsic dimension. Existing techniques for comparing generative models focus on global data properties such as mean and covariance; in that sense, they are extrinsic and uni-scale. We develop the first, to our knowledge, intrinsic and multi-scale method for characterizing and comparing underlying data manifolds, based on comparing all data moments by lower-bounding the spectral notion of the Gromov-Wasserstein distance between manifolds. In a thorough experimental study, we demonstrate that our method effectively evaluates the quality of generative models; further, we showcase its efficacy in discerning the disentanglement process in neural networks.

A. Karbachevsky, C. Baskin, E. Zheltonozshkii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson, HCM: Hardware-aware complexity metric for neural network architectures, arXiv:2004.08906, 2020 details

## HCM: Hardware-aware complexity metric for neural network architectures

A. Karbachevsky, C. Baskin, E. Zheltonozshkii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson
arXiv:2004.08906, 2020
--->>

Convolutional Neural Networks (CNNs) have become common in many fields including computer vision, speech recognition, and natural language processing. Although CNN hardware accelerators are already included as part of many SoC architectures, the task of achieving high accuracy on resource-restricted devices is still considered challenging, mainly due to the vast number of design parameters that need to be balanced to achieve an efficient solution. Quantization techniques, when applied to the network parameters, lead to a reduction of power and area and may also change the ratio between communication and computation. As a result, some algorithmic solutions may suffer from lack of memory bandwidth or computational resources and fail to achieve the expected performance due to hardware constraints. Thus, the system designer and the micro-architect need to understand at early development stages the impact of their high-level decisions (e.g., the architecture of the CNN and the amount of bits used to represent its parameters) on the final product (e.g., the expected power saving, area, and accuracy). Unfortunately, existing tools fall short of supporting such decisions. This paper introduces a hardware-aware complexity metric that aims to assist the system designer of the neural network architectures, through the entire project lifetime (especially at its early stages) by predicting the impact of architectural and micro-architectural decisions on the final product. We demonstrate how the proposed metric can help evaluate different design alternatives of neural network models on resource-restricted devices such as real-time embedded systems, and to avoid making design mistakes at early stages.

L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. M. Bronstein, R. Giryes, StarNet: towards weakly supervised few-shot detection and explainable few-shot classification, AAAI, 2021 details

## StarNet: towards weakly supervised few-shot detection and explainable few-shot classification

L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. M. Bronstein, R. Giryes
AAAI, 2021
--->>

In this paper, we propose a new few-shot learning method called StarNet, which is an end-to-end trainable non-parametric star-model few-shot classifier. While being meta-trained using only image-level class labels, StarNet learns not only to predict the class labels for each query image of a few-shot task, but also to localize (via a heatmap) what it believes to be the key image regions supporting its prediction, thus effectively detecting the instances of the novel categories. The localization is enabled by the StarNet’s ability to find large, arbitrarily shaped, semantically matching regions between all pairs of support and query images of a few-shot task. We evaluate StarNet on multiple few-shot classification benchmarks attaining significant state-of-the-art improvement on the CUB and ImageNetLOC-FS, and smaller improvements on other benchmarks. At the same time, in many cases, StarNet provides plausible explanations for its class label predictions, by highlighting the correctly paired novel category instances on the query and on its best matching support (for the predicted class). In addition, we test the proposed approach on the previously unexplored and challenging task of Weakly Supervised Few-Shot Object Detection (WS-FSOD), obtaining significant improvements over the baselines.

E. Zheltonozhskii, C. Baskin, Y. Nemcovsky, B. Chmiel, A. Mendelson, A. M. Bronstein, Colored noise injection for training adversarially robust neural networks, arXiv:2003.02188, 2020 details

## Colored noise injection for training adversarially robust neural networks

E. Zheltonozhskii, C. Baskin, Y. Nemcovsky, B. Chmiel, A. Mendelson, A. M. Bronstein
arXiv:2003.02188, 2020
--->>

Even though deep learning have shown unmatched performance on various tasks, neural networks has been shown to be vulnerable to small adversarial perturbation of the input which lead to significant performance degradation. In this work we extend the idea of adding independent Gaussian noise to weights and activation during adversarial training (PNI) to injection of colored noise for defense against common white-box and black-box attacks. We show that our approach outperforms PNI and various previous approaches in terms of adversarial accuracy on CIFAR-10 dataset. In addition, we provide an extensive ablation study of the proposed method justifying the chosen configurations.

A. Livne, A. M. Bronstein, R. Kimmel, Z. Aviv, S. Grofit, Do we need depth in state-of-the-art face authentication?, arXiv:2003.10895 2020 details

## Do we need depth in state-of-the-art face authentication?

A. Livne, A. M. Bronstein, R. Kimmel, Z. Aviv, S. Grofit
arXiv:2003.10895 2020
--->>

Some face recognition methods are designed to utilize geometric features extracted from depth sensors to handle the challenges of single-image based recognition technologies. However, calculating the geometrical data is an expensive and challenging process. Here, we introduce a novel method that learns distinctive geometric features from stereo camera systems without the need to explicitly compute the facial surface or depth map. The raw face stereo images along with coordinate maps allow a CNN to learn geometric features. This way, we keep the simplicity and cost-efficiency of recognition from a single image, while enjoying the benefits of geometric data without explicitly reconstructing it. We demonstrate that the suggested method outperforms both existing single-image and explicit depth-based methods on large-scale benchmarks. We also provide an ablation study to show that the suggested method uses the coordinate maps to encode more informative features.

M. Shkolnik, B. Chmiel, R. Banner, G. Shomron, Y. Nahshan, A. M. Bronstein, U. Weiser, Robust Quantization: One Model to Rule Them All, NeurIPS 2020 details

## Robust Quantization: One Model to Rule Them All

M. Shkolnik, B. Chmiel, R. Banner, G. Shomron, Y. Nahshan, A. M. Bronstein, U. Weiser
NeurIPS 2020
--->>

Neural network quantization methods often involve simulating the quantization process during training. This makes the trained model highly dependent on the precise way quantization is performed. Since low-precision accelerators differ in their quantization policies and their supported mix of data-types, a model trained for one accelerator may not be suitable for another. To address this issue, we propose KURE, a method that provides intrinsic robustness to the model against a broad range of quantization implementations. We show that KURE yields a generic model that may be deployed on numerous inference accelerators without a significant loss in accuracy

A. Boyarski, S. Vedula, A. M. Bronstein, Deep matrix factorization with spectral geometric regularization, arXiv: 1911.07255, 2019 details

## Deep matrix factorization with spectral geometric regularization

A. Boyarski, S. Vedula, A. M. Bronstein
arXiv: 1911.07255, 2019
--->>

We address the problem of reconstructing a matrix from a subset of its entries. Current methods, branded as geometric matrix completion, augment classical rank regularization techniques by incorporating geometric information into the solution. This information is usually provided as graphs encoding relations between rows/columns. In this work, we propose a simple spectral approach for solving the matrix completion problem, via the framework of functional maps. We introduce the zoomout loss, a multiresolution spectral geometric loss inspired by recent advances in shape correspondence, whose minimization leads to state-of-the-art results on various recommender systems datasets. Surprisingly, for some datasets, we were able to achieve comparable results even without incorporating geometric information. This puts into question both the quality of such information and current methods’ ability to use it in a meaningful and efficient way.

Code is available either as Google Colab notebook, or via https://github.com/amitboy/SGMC

Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, Loss aware post-training quantization, arXiv: 1911.07190, 2019 details

## Loss aware post-training quantization

Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson
arXiv: 1911.07190, 2019
--->>

Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or above). In this work, we study the effect of quantization on the structure of the loss landscape. We show that the structure is flat and separable for mild quantization, enabling straightforward post-training quantization methods to achieve good results. On the other hand, we show that with more aggressive quantization, the loss landscape becomes highly non-separable with sharp minima points, making the selection of quantization parameters more challenging. Armed with this understanding, we design a method that quantizes the layer parameters jointly, enabling significant accuracy improvement over current post-training quantization methods. Reference implementation accompanies the paper.

Y. Nemcovsky, E. Zheltonozhskii, C. Baskin, B. Chmiel, A. M. Bronstein, A. Mendelson, Smoothed inference for adversarially-trained models, arXiv: 1911.07198, 2019 details

## Smoothed inference for adversarially-trained models

Y. Nemcovsky, E. Zheltonozhskii, C. Baskin, B. Chmiel, A. M. Bronstein, A. Mendelson
arXiv: 1911.07198, 2019
--->>

Deep neural networks are known to be vulnerable to inputs with maliciously constructed adversarial perturbations aimed at forcing misclassification. We study randomized smoothing as a way to both improve performance on unperturbed data as well as increase robustness to adversarial attacks. Moreover, we extend the method proposed by arXiv:1811.09310 by adding low-rank multivariate noise, which we then use as a base model for smoothing. The proposed method achieves 58.5% top-1 accuracy on CIFAR-10 under PGD attack and outperforms previous works by 4%. In addition, we consider a family of attacks, which were previously used for training purposes in the certified robustness scheme. We demonstrate that the proposed attacks are more effective than PGD against both smoothed and non-smoothed models. Since our method is based on sampling, it lends itself well for trading-off between the model inference complexity and its performance. A reference implementation of the proposed techniques is provided.

S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky, MetAdapt: Meta-learned task-adaptive architecture for few-shot classification, arXiv: 1912.00412, 2019 details

S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky
arXiv: 1912.00412, 2019
--->>

E. Rozenberg, D. Freedman, A. M. Bronstein, Localization with limited annotation for chest X-rays, ML4H, NeuralIPS 2019 details

## Localization with limited annotation for chest X-rays

E. Rozenberg, D. Freedman, A. M. Bronstein
ML4H, NeuralIPS 2019
--->>

Localization of an object within an image is a common task in medical imaging. Learning to localize or detect objects typically requires the collection of data which has been labelled with bounding boxes or similar annotations, which can be very time consuming and expensive. A technique which could perform such learning with much less annotation would, therefore, be quite valuable. We present such a technique for localization with limited annotation, in which the number of images with bounding boxes can be a small fraction of the total dataset (e.g. less than 1%); all other images only possess a whole image label and no bounding box. We propose a novel loss function for tackling this problem; the loss is a continuous relaxation of a well-defined discrete formulation of weakly supervised learning and is numerically well-posed. Furthermore, we propose a new architecture which accounts for both patch dependence and shift-invariance, through the inclusion of CRF layers and anti-aliasing filters, respectively. We apply our technique to the localization of thoracic diseases in chest X-ray images and demonstrate state-of-the-art localization performance on the ChestX-ray14 dataset.

S. Vedula, O. Senouf, G. Zurakov, A. M. Bronstein, O. Michailovich, M. Zibulevsky, Learning beamforming in ultrasound imaging, Proc. Medical Imaging with Deep Learning (MIDL), 2019 details

## Learning beamforming in ultrasound imaging

S. Vedula, O. Senouf, G. Zurakov, A. M. Bronstein, O. Michailovich, M. Zibulevsky
Proc. Medical Imaging with Deep Learning (MIDL), 2019
--->>
Medical ultrasound (US) is a widespread imaging modality owing its popularity to cost-efficiency, portability, speed, and lack of harmful ionizing radiation. In this paper, we demonstrate that replacing the traditional ultrasound processing pipeline with a data-driven, learnable counterpart leads to signi cant improvement in image quality. Moreover, we demonstrate that greater improvement can be achieved through a learning-based design of the transmitted beam patterns simultaneously with learning an image reconstruction pipeline. We evaluate our method on an in-vivo fi rst-harmonic cardiac ultrasound dataset acquired from volunteers and demonstrate the signi cance of the learned pipeline and transmit beam patterns on the image quality when compared to standard transmit and receive beamformers used in high frame-rate US imaging. We believe that the presented methodology provides a fundamentally di erent perspective on the classical problem of ultrasound beam pattern design.
E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. Bronstein, RepMet: Representative-based metric learning for classification and one-shot object detection, Proc. Computer Vision and Pattern Recognition (CVPR), 2019 details

## RepMet: Representative-based metric learning for classification and one-shot object detection

E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. Bronstein
Proc. Computer Vision and Pattern Recognition (CVPR), 2019
--->>

Distance metric learning (DML) has been successfully applied to object classification, both in the standard regime of rich training data and in the few-shot scenario, where each category is represented by only few examples. In this work, we propose a new method for DML, featuring a joint learning of the embedding space and of the data distribution of the training categories, in a single training process. Our method improves upon leading algorithms for DML-based object classification. Furthermore, it opens the door for a new task in computer vision — a few-shot object detection, since the proposed DML architecture can be naturally embedded as the classification head of any standard object detector. In numerous experiments, we achieve state-of-the-art classification results on a variety of fine-grained datasets, and offer the community a benchmark on the few-shot detection task, performed on the Imagenet-LOC dataset.

O. Halimi, O. Litany, E. Rodolà, A. M. Bronstein, R. Kimmel, Self-supervised learning of dense shape correspondence, Proc. Computer Vision and Pattern Recognition (CVPR), 2019 details

## Self-supervised learning of dense shape correspondence

O. Halimi, O. Litany, E. Rodolà, A. M. Bronstein, R. Kimmel
Proc. Computer Vision and Pattern Recognition (CVPR), 2019
--->>

We introduce the first completely unsupervised correspondence learning approach for deformable 3D shapes. Key to our model is the understanding that natural deformations (such as changes in the pose) approximately preserve the metric structure of the surface, yielding a natural criterion to drive the learning process toward distortion-minimizing predictions. On this basis, we overcome the need for annotated data and replace it with a purely geometric criterion. The resulting learning model is class-agnostic and is able to leverage any type of deformable geometric data for the training phase. In contrast to existing supervised approaches which specialize in the class seen at training time, we demonstrate stronger generalization as well as applicability to a variety of challenging settings. We showcase our method on a wide selection of correspondence benchmarks, where we outperform other methods in terms of accuracy, generalization, and efficiency.

A. Alfassy, L. Karlinsky, A. Aides, J. Shtok, S. Harary, R. Feris, R. Giryes, A. M. Bronstein, LaSO: Label-Set Operations networks for multi-label few-shot learning, Proc. Computer Vision and Pattern Recognition (CVPR), 2019 details

## LaSO: Label-Set Operations networks for multi-label few-shot learning

A. Alfassy, L. Karlinsky, A. Aides, J. Shtok, S. Harary, R. Feris, R. Giryes, A. M. Bronstein
Proc. Computer Vision and Pattern Recognition (CVPR), 2019
--->>

Example synthesis is one of the leading methods to tackle the problem of few-shot learning, where only a small number of samples per class are available. However, current synthesis approaches only address the scenario of a single category label per image. In this work, we propose a novel technique for synthesizing samples with multiple labels for the (yet unhandled) multi-label few-shot classification scenario. We propose to combine pairs of given examples in feature space, so that the resulting synthesized feature vectors will correspond to examples whose label sets are obtained through certain set operations on the label sets of the corresponding input pairs. Thus, our method is capable of producing a sample containing the intersection, union or set-difference of labels present in two input samples. As we show, these set operations generalize to labels unseen during training. This enables performing augmentation on examples of novel categories, thus, facilitating multi-label few-shot classifier learning. We conduct numerous experiments showing promising results for the label-set manipulation capabilities of the proposed approach, both directly (using the classification and retrieval metrics), and in the context of performing data augmentation for multi-label few-shot learning. We propose a benchmark for this new and challenging task and show that our method compares favorably to all the common baselines.

A. Zabatani, V. Surazhsky, E. Sperling, S. Ben Moshe, O. Menashe, D. H. Silver, Z. Karni, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Intel RealSense SR300 Coded light depth Camera, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2019 details

## Intel RealSense SR300 Coded light depth Camera

A. Zabatani, V. Surazhsky, E. Sperling, S. Ben Moshe, O. Menashe, D. H. Silver, Z. Karni, A. M. Bronstein, M. M. Bronstein, R. Kimmel
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2019
--->>

Intel RealSense SR300 is a depth camera capable of providing a VGA-size depth map at 60 fps and 0.125mm depth resolution. In addition, it outputs an infrared VGA-resolution image and a 1080p color texture image at 30 fps.
SR300 form-factor enables it to be integrated into small consumer products and as a front-facing camera in laptops and Ultrabooks. The SR300 depth camera is based on a coded-light technology where triangulation between projected patterns and images captured by a dedicated sensor is used to produce the depth map. Each projected line is coded by a special temporal optical code, that enables a dense depth map reconstruction from its reflection. The solid mechanical assembly of the camera allows it to stay calibrated throughout temperature and pressure changes, drops, and hits. In addition, active dynamic control maintains a calibrated depth output. An extended API LibRS released with the camera allows developers to integrate the camera in various applications. Algorithms for 3D scanning, facial analysis, hand gesture recognition, and tracking are within reach for applications using the SR300. In this paper, we describe the underlying technology, hardware, and algorithms of the SR300, as well as its calibration procedure, and outline some use cases. We believe that this paper will provide a full case study of a mass-produced depth sensing product and technology.

Y. Zur, C. Baskin, E. Zheltonozhskii, B. Chmiel, I. Evron, A. M. Bronstein, A. Mendelson, Towards learning of filter-level heterogeneous compression of convolutional neural networks, Proc. AutoML Workshop, Int'l Conf. on Machine Learning (ICML), 2019 details

## Towards learning of filter-level heterogeneous compression of convolutional neural networks

Y. Zur, C. Baskin, E. Zheltonozhskii, B. Chmiel, I. Evron, A. M. Bronstein, A. Mendelson
Proc. AutoML Workshop, Int'l Conf. on Machine Learning (ICML), 2019
--->>

Recently, deep learning has become a de facto standard in machine learning with convolutional neural networks (CNNs) demonstrating spectacular success on a wide variety of tasks. However, CNNs are typically very demanding computationally at inference time. One of the ways to alleviate  this burden on certain hardware platforms is quantization relying on the use of low-precision arithmetic representation for the weights and the activations. Another popular method is the pruning of the number of filters in each layer. While mainstream deep learning methods train the neural networks weights while keeping the network architecture fixed, the emerging neural architecture search (NAS) techniques make the latter also amenable to training. In this paper, we formulate optimal arithmetic bit length allocation and neural network pruning as a NAS problem, searching for the configurations satisfying a computational complexity budget while maximizing the accuracy. We use a differentiable search method based on the continuous relaxation of the search space proposed by Liu et al. (2019a). We show, by grid search, that heterogeneous quantized networks suffer from a high variance which renders the benefit of the search questionable. For pruning, improvement over homogeneous cases is possible, but it is still challenging to find those configurations with the proposed method.  The code is publicly available at https://github.com/yochaiz/Slimmable and https://github.com/yochaiz/darts-UNIQ.

T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, O. Michailovich, M. Zibulevsky, Joint learning of Cartesian undersampling and reconstruction for accelerated MRI, Proc. Int’l Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2020 details

## Joint learning of Cartesian undersampling and reconstruction for accelerated MRI

T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, O. Michailovich, M. Zibulevsky
Proc. Int’l Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2020
--->>

Magnetic Resonance Imaging (MRI) is considered today the golden-standard modality for soft tissues. The long acquisition times, however, make it more prone to motion artifacts as well as contribute to the relatively high costs of this examination. Over the years, multiple studies concentrated on designing reduced measurement schemes and image reconstruction schemes for MRI, however, these problems have been so far addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of the simultaneous learning-based design of the acquisition and reconstruction schemes manifesting significant improvement in the reconstruction quality with a constrained time budget. Inspired by these successes, in this work, we propose to learn accelerated MR acquisition schemes (in the form of Cartesian trajectories) jointly with the image reconstruction operator. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using the learned Cartesian trajectories at different speed up rates.

B. Chmiel, C. Baskin, R. Banner, E. Zheltonozshkii, Y. Yermolin, A. Karbachevsky, A. M. Bronstein, A. Mendelson, Feature map transform coding for energy-efficient CNN inference, Proc. Intl. Joint Conf. on Neural Networks (IJCNN), 2020 details

## Feature map transform coding for energy-efficient CNN inference

B. Chmiel, C. Baskin, R. Banner, E. Zheltonozshkii, Y. Yermolin, A. Karbachevsky, A. M. Bronstein, A. Mendelson
Proc. Intl. Joint Conf. on Neural Networks (IJCNN), 2020
--->>

Convolutional neural networks (CNNs) achieve state-of-the-art accuracy in a variety of tasks in computer vision and beyond. One of the major obstacles hindering the ubiquitous use of CNNs for inference on low-power edge devices is their relatively high computational complexity and memory bandwidth requirements. The latter often dominates the energy footprint on modern hardware. In this paper, we introduce a lossy transform coding approach, inspired by image and video compression, designed to reduce the memory bandwidth due to the storage of intermediate activation calculation results. Our method exploits the high correlations between feature maps and adjacent pixels and allows to halve the data transfer volumes to the main memory without re-training. We analyze the performance of our approach on a variety of CNN architectures and demonstrated FPGA implementation of ResNet18 with our approach results in a reduction of around 40% in the memory energy footprint compared to quantized network with negligible impact on accuracy. A reference implementation accompanies the paper.

E. Schwartz, L. Karlinsky, R. Feris, R. Giryes, A. M. Bronstein, Baby steps towards few-shot learning with multiple semantics, arXiv:1906.01905, 2019 details

## Baby steps towards few-shot learning with multiple semantics

E. Schwartz, L. Karlinsky, R. Feris, R. Giryes, A. M. Bronstein
arXiv:1906.01905, 2019
--->>

Learning from one or few visual examples is one of the key capabilities of humans since early infancy, but is still a significant challenge for modern AI systems. While considerable progress has been achieved in few-shot learning from a few image examples, much less attention has been given to the verbal descriptions that are usually provided to infants when they are presented with a new object. In this paper, we focus on the role of additional semantics that can significantly facilitate few-shot visual learning. Building upon recent advances in few-shot learning with additional semantic information, we demonstrate that further improvements are possible using richer semantics and multiple semantic sources. Using these ideas, we offer the community a new result on the one-shot test of the popular miniImageNet benchmark, comparing favorably to the previous state-of-the-art results for both visual only and visual plus semantics-based approaches. We also performed an ablation study investigating the components and design choices of our approach.

A. Rampini, I. Tallini, M. Ovsjanikov, A. M. Bronstein, E. Rodola, Correspondence-free region localization for partial shape similarity via Hamiltonian spectrum alignment, Proc. 3D Vision (3DV), 2019 (Best paper award) details

## Correspondence-free region localization for partial shape similarity via Hamiltonian spectrum alignment

A. Rampini, I. Tallini, M. Ovsjanikov, A. M. Bronstein, E. Rodola
Proc. 3D Vision (3DV), 2019 (Best paper award)

We consider the problem of localizing relevant subsets of non-rigid geometric shapes given only a partial 3D query as the input. Such problems arise in several challenging tasks in 3D vision and graphics, including partial shape similarity, retrieval, and non-rigid correspondence. We phrase the problem as one of alignment between short sequences of eigenvalues of basic differential operators, which are constructed upon a scalar function defined on the 3D surfaces. Our method therefore seeks for a scalar function that entails this alignment. Differently from existing approaches, we do not require solving for a correspondence between the query and the target, therefore greatly simplifying the optimization process; our core technique is also descriptor-free, as it is driven by the geometry of the two objects as encoded in their operator spectra. We further show that our spectral alignment algorithm provides a remarkably simple alternative to the recent shape-from-spectrum reconstruction approaches. For both applications, we demonstrate improvement over the state-of-the-art either in terms of accuracy or computational cost.

O. Senouf, S. Vedula, T. Weiss, A. M. Bronstein, O. Michailovich, M. Zibulevsky, Self-supervised learning of inverse problem solvers in medical imaging, Proc. Medical Image Learning with Less Labels and Imperfect Data, MICCAI 2019 details

## Self-supervised learning of inverse problem solvers in medical imaging

O. Senouf, S. Vedula, T. Weiss, A. M. Bronstein, O. Michailovich, M. Zibulevsky
Proc. Medical Image Learning with Less Labels and Imperfect Data, MICCAI 2019
--->>

In the past few years, deep learning-based methods have demonstrated enormous success for solving inverse problems in medical imaging. In this work, we address the following question: Given a set of measurements obtained from real imaging experiments, what is the best way to use a learnable model and the physics of the modality to solve the inverse problem and reconstruct the latent image? Standard supervised learning based methods approach this problem by collecting data sets of known latent images and their corresponding measurements. However, these methods are often impractical due to the lack of availability of appropriately sized training sets, and, more generally, due to the inherent difficulty in measuring the “groundtruth” latent image. In light of this, we propose a self-supervised approach to training inverse models in medical imaging in the absence of aligned data. Our method only requiring access to the measurements and the forward model at training. We showcase its effectiveness on inverse problems arising in accelerated magnetic resonance imaging (MRI).

N. Diamant, D. Zadok, C. Baskin, E. Schwartz, A. M. Bronstein, Beholder-GAN: Generation and beautification of facial images with conditioning on their beauty level, Proc. Int'l Conf. on Image Processing (ICIP), 2019 details

## Beholder-GAN: Generation and beautification of facial images with conditioning on their beauty level

Proc. Int'l Conf. on Image Processing (ICIP), 2019
--->>

Beauty is in the eye of the beholder. This maxim, emphasizing the subjectivity of the perception of beauty, has enjoyed a wide consensus since ancient times. In the digital era, data-driven methods have been shown to be able to predict human-assigned beauty scores for facial images. In this work, we augment this ability and train a generative model that generates faces conditioned on a requested beauty score. In addition, we show how this trained generator can be used to beautify an input face image. By doing so, we achieve an unsupervised beautification model, in the sense that it relies on no ground truth target images.

G. Pai, R. Talmon, A. M. Bronstein, R. Kimmel, DIMAL: Deep isometric manifold learning using sparse geodesic sampling, Proc. IEEE Winter Conf. on Applications of Computer Vision (WACV), 2019 details

## DIMAL: Deep isometric manifold learning using sparse geodesic sampling

G. Pai, R. Talmon, A. M. Bronstein, R. Kimmel
Proc. IEEE Winter Conf. on Applications of Computer Vision (WACV), 2019
--->>

This paper explores a fully unsupervised deep learning approach for computing distance-preserving maps that generate low-dimensional embeddings for a certain class of manifolds. We use the Siamese configuration to train a neural network to solve the problem of least squares multidimensional scaling for generating maps that approximately preserve geodesic distances. By training with only a few landmarks, we show a significantly improved local and nonlocal generalization of the isometric mapping as compared to analogous non-parametric counterparts. Importantly, the combination of a deep-learning framework with a multidimensional scaling objective enables a numerical analysis of network architectures to aid in understanding their representation power. This provides a geometric perspective to the generalizability of deep learning.

O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, D. Cremers, Partial single- and multi-shape dense correspondence using functional maps, Chapter in The Handbook of Numerical Analysis - Processing, Analyzing and Learning of Images, Shapes, and Forms, Elsevier, 2019 details

## Partial single- and multi-shape dense correspondence using functional maps

O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, D. Cremers
Chapter in The Handbook of Numerical Analysis - Processing, Analyzing and Learning of Images, Shapes, and Forms, Elsevier, 2019
--->>

Shape correspondence is a fundamental problem in computer graphics and vision, with applications in various problems including animation, texture mapping, robotic vision, medical imaging, archaeology and many more. In settings where the shapes are allowed to undergo non-rigid deformations and only partial views are available, the problem becomes very challenging. In this chapter we describe recent techniques designed to tackle such problems. Specifically, we explain how the renown functional maps framework can be extended to tackle the partial setting. We then present a further extension to the mutli-part case in which one tries to establish correspondence between a collection of shapes. Finally, we focus on improving the technique efficiency, by disposing of its spatial ingredient and thus keeping the computation in the spectral domain. Extensive experimental results are provided along with the theoretical explanations, to demonstrate the effectiveness of the described methods in these challenging scenarios.

A. Boyarski, A. M. Bronstein, Multidimensional scaling, Computer Vision: A Reference Guide, (Katsushi Ikeuchi, Ed.) details

## Multidimensional scaling

A. Boyarski, A. M. Bronstein
Computer Vision: A Reference Guide, (Katsushi Ikeuchi, Ed.)
--->>

The various multidimensional scaling models can be broadly classified into metric vs. non-metric, and strain (classical scaling) vs. stress (distance scaling) based MDS models. In metric MDS the goal is to maintain the distances in the embedding space as close as possible to the given dissimilarities, while in nonmetric MDS only the order relations between the dissimilarities are important. Strain-based MDS is an algebraic version of the problem that can be solved by eigenvalue decomposition. Stress-based MDS uses a geometric distortion criterion which results in a non-linear and non-convex optimization problem. Each of these models has its own merits and drawbacks, both numerically and application-wise. On top of these basic models, there exist numerous generalizations, including embedding into non-Euclidean domains, working with different stress models, working in different subspaces, and incorporating machine learning approaches to obtain faster, more accurate and more robust embeddings. This chapter reviews these models, with emphasis on their role in computer vision applications.

E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. Bronstein, ∆-encoder: an effective sample synthesis method for few-shot object recognition, Proc. Neural Information Processing Systems (NIPS), 2018 details

## ∆-encoder: an effective sample synthesis method for few-shot object recognition

E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. Bronstein
Proc. Neural Information Processing Systems (NIPS), 2018
--->>

Learning to classify new categories based on just one or a few examples is a long-standing challenge in modern computer vision. In this work, we propose a simple yet effective method for few-shot (and one-shot) object recognition. Our approach is based on a modified auto-encoder, denoted ∆-encoder, that learns to synthesize new samples for an unseen category just by seeing few examples from it. The synthesized samples are then used to train a classifier. The proposed approach learns to both extract transferable intra-class deformations, or “deltas”, between same-class pairs of training examples, and to apply those deltas to the few provided examples of a novel class (unseen during training) in order to efficiently synthesize samples from that new class. The proposed method improves over the state-of-the-art in one-shot object-recognition and compares favorably in the few-shot case.

E. Rodolà, Z. Lähner, A. M. Bronstein, M. M. Bronstein, J. Solomon, Functional maps representation on product manifolds, arXiv:1809.10940, 2018 details

## Functional maps representation on product manifolds

E. Rodolà, Z. Lähner, A. M. Bronstein, M. M. Bronstein, J. Solomon
arXiv:1809.10940, 2018
--->>

We consider the tasks of representing, analyzing and manipulating maps between shapes. We model maps as densities over the product manifold of the input shapes; these densities can be treated as scalar functions and therefore are manipulable using the language of signal processing on manifolds. Being a manifold itself, the product space endows the set of maps with a geometry of its own, which we exploit to define map operations in the spectral domain; we also derive relationships with other existing representations (soft maps and functional maps). To apply these ideas in practice, we discretize product manifolds and their Laplace-Beltrami operators, and we introduce localized spectral analysis of the product manifold as a novel tool for map processing. Our framework applies to maps defined between and across 2D and 3D shapes without requiring special adjustment, and it can be implemented efficiently with simple operations on sparse matrices.

C. Baskin, N. Liss, Y. Chai, E. Zheltonozhskii, E. Schwartz, R. Giryes, A. Mendelson, A. M. Bronstein, NICE: noise injection and clamping estimation for neural network quantization, arXiv:1810.00162, 2018 details

## NICE: noise injection and clamping estimation for neural network quantization

C. Baskin, N. Liss, Y. Chai, E. Zheltonozhskii, E. Schwartz, R. Giryes, A. Mendelson, A. M. Bronstein
arXiv:1810.00162, 2018
--->>

Convolutional Neural Networks (CNN) are very popular in many fields including computer vision, speech recognition, natural language processing, to name a few. Though deep learning leads to groundbreaking performance in these domains, the networks used are very demanding computationally and are far from real-time even on a GPU, which is not power efficient and therefore does not suit low power systems such as mobile devices. To overcome this challenge, some solutions have been proposed for quantizing the weights and activations of these networks, which accelerate the runtime significantly. Yet, this acceleration comes at the cost of a larger error. The uniqname method proposed in this work trains quantized neural networks by noise injection and a learned clamping, which improve the accuracy. This leads to state-of-the-art results on various regression and classification tasks, e.g., ImageNet classification with architectures such as ResNet-18/34/50 with low as 3-bit weights and activations. We implement the proposed solution on an FPGA to demonstrate its applicability for low power real-time applications.

Q. Qiu, J. Lezama, A. M. Bronstein, G. Sapiro, ForestHash: Semantic hashing with shallow random forests and tiny convolutional networks, Proc. European Conf. on Computer Vision (ECCV), 2018 details

## ForestHash: Semantic hashing with shallow random forests and tiny convolutional networks

Q. Qiu, J. Lezama, A. M. Bronstein, G. Sapiro
Proc. European Conf. on Computer Vision (ECCV), 2018
--->>

Hash codes are efficient data representations for coping with the ever growing amounts of data. In this paper, we introduce a random forest semantic hashing scheme that embeds tiny convolutional neural networks (CNN) into shallow random forests, with near-optimal information-theoretic code aggregation among trees. We start with a simple hashing scheme, where random trees in a forest act as hashing functions by setting 1′ for the visited tree leaf, and 0′ for the rest. We show that traditional random forests fail to generate hashes that preserve the underlying similarity between the trees, rendering the random forests approach to hashing challenging. To address this, we propose to first randomly group arriving classes at each tee split node into two groups, obtaining a significantly simplified two-class classification problem, which can be handled using a light-weight CNN weak learner. Such random class grouping scheme enables code uniqueness by enforcing each class to share its code with different classes in different trees. A non-conventional low-rank loss is further adopted for the CNN weak learners to encourage code consistency by minimizing intra-class variations and maximizing inter-class distance for the two random class groups. Finally, we introduce an information-theoretic approach for aggregating codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. The proposed approach significantly outperforms state-of-the-art hashing methods for image retrieval tasks on large-scale public datasets, while performing at the level of other state-of-the-art image classification techniques while utilizing a more compact and efficient scalable representation. This work proposes a principled and robust procedure to train and deploy in parallel an ensemble of light-weight CNNs, instead of simply going deeper.

T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Class-aware fully-convolutional Gaussian and Poisson denoising, IEEE Trans. Image Processing, Vol. 27(11), 2018 details

## Class-aware fully-convolutional Gaussian and Poisson denoising

T. Remez, O. Litany, R. Giryes, A. M. Bronstein
IEEE Trans. Image Processing, Vol. 27(11), 2018
--->>

We propose a fully-convolutional neural-network architecture for image denoising which is simple yet powerful. Its structure allows to exploit the gradual nature of the denoising process, in which shallow layers handle local noise statistics, while deeper layers recover edges and enhance textures. Our method advances the state-of-the-art when trained for different noise levels and distributions (both Gaussian and Poisson). In addition, we show that making the denoiser class-aware by exploiting semantic class information boosts performance, enhances textures and reduces artifacts.

A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E, Mueller, NetLSD: Hearing the shape of a graph, Proc. ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD), 2018 details

## NetLSD: Hearing the shape of a graph

A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E, Mueller
Proc. ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD), 2018

Comparison among graphs is ubiquitous in graph analytics. However, it is a hard task in terms of the expressiveness of the employed similarity measure and the efficiency of its computation. Ideally, graph comparison should be invariant to the order of nodes and the sizes of compared graphs, adaptive to the scale of graph patterns, and scalable. Unfortunately, these properties have not been addressed together. Graph comparisons still rely on direct approaches, graph kernels, or representation-based methods, which are all inefficient and impractical for large graph collections. In this paper, we propose the Network Laplacian Spectral Descriptor (NetLSD): the first, to our knowledge, permutation- and size-invariant, scale-adaptive, and efficiently computable graph representation method that allows for straightforward comparisons of large graphs. NetLSD extracts a compact signature that inherits the formal properties of the Laplacian spectrum, specifically its heat or wave kernel; thus, it hears the shape of a graph. Our evaluation on a variety of real-world graphs demonstrates that it outperforms previous works in both expressiveness and efficiency.

O. Senouf, S. Vedula, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Blondheim, High frame-rate cardiac ultrasound imaging with deep learning, Proc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2018 details

## High frame-rate cardiac ultrasound imaging with deep learning

O. Senouf, S. Vedula, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Blondheim
Proc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2018
--->>

Cardiac ultrasound imaging requires a high frame rate in order to capture rapid motion. This can be achieved by multi-line acquisition (MLA), where several narrow-focused received lines are obtained from each wide-focused transmitted line. This shortens the acquisition time at the expense of introducing block artifacts. In this paper, we propose a data-driven learning-based approach to improve the MLA image quality. We train an end-to-end convolutional neural network on pairs of real ultrasound cardiac data, acquired through MLA and the corresponding single-line acquisition (SLA). The network achieves a significant improvement in image quality for both 5- and 7-line MLA resulting in a decorrelation measure similar to that of SLA while having the frame rate of MLA.

S. Vedula, O. Senouf, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Gaitini, High quality ultrasonic multi-line transmission through deep learning, Proc. Machine Learning for Medical Image Reconstruction (MLMIR), 2018 details

## High quality ultrasonic multi-line transmission through deep learning

S. Vedula, O. Senouf, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Gaitini
Proc. Machine Learning for Medical Image Reconstruction (MLMIR), 2018
--->>

Frame rate is a crucial consideration in cardiac ultrasound imaging and 3D sonography. Several methods have been proposed in the medical ultrasound literature aiming at accelerating the image acquisition. In this paper, we consider one such method called multi-line transmission (MLT), in which several evenly separated focused beams are transmitted simultaneously. While MLT reduces the acquisition time, it comes at the expense of a heavy loss of contrast due to the interactions between the beams (cross-talk artifact). In this paper, we introduce a data-driven method to reduce the artifacts arising in MLT. To this end, we propose to train an end-to-end convolutional neural network consisting of correction layers followed by a constant apodization layer. The network is trained on pairs of raw data obtained through MLT and the corresponding single-line transmission (SLT) data. Experimental evaluation demonstrates signi cant improvement both in the visual image quality and in objective measures such as contrast ratio and contrast-to-noise ratio, while preserving resolution unlike traditional apodization-based methods. We show that the proposed method is able to generalize
well across di erent patients and anatomies on real and phantom data.

A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E, Mueller, SGR: Self-supervised spectral graph representation learning, Proc. KDD Deep Learning Day, 2018 details

## SGR: Self-supervised spectral graph representation learning

A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E, Mueller
Proc. KDD Deep Learning Day, 2018
--->>

Representing a graph as a vector is a challenging task; ideally, the representation should be easily computable and conducive to efficient comparisons among graphs, tailored to the particular data and an analytical task at hand. Unfortunately, a “one-size-fits-all” solution is unattainable, as different analytical tasks may require different attention to global or local graph features. We develop SGR, the first, to our knowledge, method for learning graph representations in a self-supervised manner. Grounded on spectral graph analysis, SGR seamlessly combines all aforementioned desirable properties. In extensive experiments, we show how our approach works on large graph collections, facilitates self-supervised representation learning across a variety of application domains, and performs competitively to state-of-the-art methods without re-training.

E. Schwartz, R. Giryes, A. M. Bronstein, DeepISP: Towards learning an end-to-end image processing pipeline, IEEE Trans. on Image Processing, 2018 details

## DeepISP: Towards learning an end-to-end image processing pipeline

E. Schwartz, R. Giryes, A. M. Bronstein
IEEE Trans. on Image Processing, 2018
--->>

We present DeepISP, a full end-to-end deep neural model of the camera image signal processing (ISP) pipeline. Our model learns a mapping from the raw low-light mosaiced image to the final visually compelling image and encompasses low-level tasks such as demosaicing and denoising as well as higher-level tasks such as color correction and image adjustment. The training and evaluation of the pipeline were performed on a dedicated dataset containing pairs of low-light and well-lit images captured by a Samsung S7 smartphone camera in both raw and processed JPEG formats. The proposed solution achieves state-of-the-art performance in the objective evaluation of PSNR on the subtask of joint denoising and demosaicing. For the full end-to-end pipeline, it achieves better visual quality compared to the manufacturer ISP, in both a subjective human assessment and when rated by a deep model trained for assessing image quality.

H. Haim, S. Elmalem, R. Giryes, A. M. Bronstein, E. Marom, Depth estimation from a single image using deep learned phase coded mask, IEEE Trans. Computational Imaging, Vol. 2(3), 2018 (Winner of the OSA Student Grand Challenge The Optical System of the Future) details

## Depth estimation from a single image using deep learned phase coded mask

H. Haim, S. Elmalem, R. Giryes, A. M. Bronstein, E. Marom
IEEE Trans. Computational Imaging, Vol. 2(3), 2018 (Winner of the OSA Student Grand Challenge The Optical System of the Future)
--->>

Depth estimation from a single image is a well-known challenge in computer vision. With the advent of deep learning, several approaches for monocular depth estimation have been proposed, all of which have inherent limitations due to the scarce depth cues that exist in a single image. Moreover, these methods are very demanding computationally, which makes them inadequate for systems with limited processing power. In this paper, a phase-coded aperture camera for depth estimation is proposed. The camera is equipped with an optical phase mask that provides unambiguous depth-related color characteristics for the captured image. These are used for estimating the scene depth map using a fully convolutional neural network. The phase-coded aperture structure is learned jointly with the network weights using backpropagation. The strong depth cues (encoded in the image by the phase mask, designed together with the network weights) allow a much simpler neural network architecture for faster and more accurate depth estimation. Performance achieved on simulated images as well as on a real optical setup is superior to the state-of-the-art monocular depth estimation methods (both with respect to the depth accuracy and required processing power), and is competitive with more complex and expensive depth estimation methods such as light-field cameras.

E. Tsitsin, A. M. Bronstein, T. Hendler, M. Medvedovsky, Passive electric impedance tomography, Proc. Electric Impedance Tomography (EIT), 2018 details

## Passive electric impedance tomography

E. Tsitsin, A. M. Bronstein, T. Hendler, M. Medvedovsky
Proc. Electric Impedance Tomography (EIT), 2018
--->>

We introduce an electric impedance tomography modality without any active current injection. By loading the probe electrodes with a time-varying network of impedances, the proposed technique exploits electrical fields existing in the medium due to biological activity or EM interference from the environment or an implantable device. A phantom validation of the technique is presented.

E. Tsitsin, T. Mund, A. M. Bronstein, Printable anisotropic phantom for EEG with distributed current sources, Proc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018 details

## Printable anisotropic phantom for EEG with distributed current sources

E. Tsitsin, T. Mund, A. M. Bronstein
Proc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018
--->>

We introduce an electric impedance tomography modality without any active current injection. By loading the probe electrodes with a time-varying network of impedances, the proposed technique exploits electrical fields existing in the medium due to biological activity or EM interference from the environment or an implaPresented is the phantom mimicking the electromagnetic properties of the human head. The fabrication is based on the additive manufacturing (3d-printing) technology combined with the electrically conductive gel. The novel key features of the phantom are the controllable anisotropic electrical conductivity of the skull and the densely packed actively multiplexed monopolar current sources permitting interpolation of the measured gain function to any dipolar current source position and orientation within the head. The phantom was tested in realistic environment successfully simulating the possible signals from neural activations situated at any depth within the brain as well as EMI and motion artifacts. The proposed design can be readily repeated in any lab having an access to a standard 100 micron precision 3d-printer. The meshes of the phantom are available from the corresponding author.ntable device. A phantom validation of the technique is presented.

E. Tsitsin, M. Medvedovsky, A. M. Bronstein, VibroEEG: Improved EEG source reconstruction by combined acoustic-electric imaging, Proc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018 details

## VibroEEG: Improved EEG source reconstruction by combined acoustic-electric imaging

E. Tsitsin, M. Medvedovsky, A. M. Bronstein
Proc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018
--->>

Electroencephalography (EEG) is the electrical neural activity recording modality with high temporal and low spatial resolution. Here we propose a novel technique that we call vibroEEG improving significantly the source localization accuracy of EEG. Our method combines electric potential acquisition in concert with acoustic excitation of the vibrational modes of the electrically active cerebral cortex which displace periodically the sources of the low frequency neural electrical activity. The sources residing on the maxima of the induced modes will be maximally weighted in the corresponding spectral components of the broadband signals measured on the noninvasive electrodes. In vibroEEG, for the first time the rich internal geometry of the cerebral cortex can be utilized to separate sources of neural activity lying close in the sense of the Euclidean metric. When the modes are excited locally using phased arrays the neural activity can essentially be probed at any cortical location. When a single transducer is used to induce the excitations, the EEG gain matrix is still being enriched with numerous independent gain vectors increasing its rank. We show theoretically and on numerical simulation that in both cases the source localization accuracy improves substantially.

C. Baskin, N. Liss, E. Zheltonozhskii, A. M. Bronstein, A. Mendelson, Streaming architectures for large-scale quantized neural networks on an FPGA-based dataflow platform, IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2018 details

## Streaming architectures for large-scale quantized neural networks on an FPGA-based dataflow platform

C. Baskin, N. Liss, E. Zheltonozhskii, A. M. Bronstein, A. Mendelson
IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2018
--->>

Deep neural networks (DNNs) are used by different applications that are executed on a range of computer architectures, from IoT devices to supercomputers. The footprint of these networks is huge as well as their computational and communication needs. In order to ease the pressure on resources, research indicates that in many cases a low precision representation (1-2 bit per parameter) of weights and other parameters can achieve similar accuracy while requiring less resources. Using quantized values enables the use of FPGAs to run NNs, since FPGAs are well fitted to these primitives; e.g., FPGAs provide efficient support for bitwise operations and can work with arbitrary-precision representation of numbers. This paper presents a new streaming architecture for running QNNs on FPGAs. The proposed architecture scales out better than alternatives, allowing us to take advantage of systems with multiple FPGAs. We also included support for skip connections, that are used in state-of-the art NNs, and shown that our architecture allows to add those connections almost for free. All this allowed us to implement an 18-layer ResNet for 224×224 images classification, achieving 57.5% top-1 accuracy. In addition, we implemented a full-sized quantized AlexNet. In contrast to previous works, we use 2-bit activations instead of 1-bit ones, which improves AlexNet’s top-1 accuracy from 41.8% to 51.03% for the ImageNet classification. Both AlexNet and ResNet can handle 1000-class real-time classification on an FPGA. Our implementation of ResNet-18 consumes 5× less power and is 4× slower for ImageNet, when compared to the same NN on the latest Nvidia GPUs. Smaller NNs, that fit a single FPGA, are running faster then on GPUs on small (32×32) inputs, while consuming up to 20× less energy and power.

R. Giryes, Y. C. Eldar, A. M. Bronstein, G. Sapiro, Tradeoffs between convergence speed and reconstruction accuracy in inverse problems, IEEE Trans. on Signal Processing, Vol. 66(7), 2018 details

## Tradeoffs between convergence speed and reconstruction accuracy in inverse problems

R. Giryes, Y. C. Eldar, A. M. Bronstein, G. Sapiro
IEEE Trans. on Signal Processing, Vol. 66(7), 2018
--->>

Solving inverse problems with iterative algorithms is popular, especially for large data. Due to time constraints, the number of possible iterations is usually limited, potentially affecting the achievable accuracy. Given an error one is willing to tolerate, an important question is whether it is possible to modify the original iterations to obtain faster convergence to a minimizer achieving the allowed error without increasing the computational cost of each iteration considerably. Relying on recent recovery techniques developed for settings in which the desired signal belongs to some low-dimensional set, we show that using a coarse estimate of this set may lead to faster convergence at the cost of an additional reconstruction error related to the accuracy of the set approximation. Our theory ties to recent advances in sparse recovery, compressed sensing, and deep learning. Particularly, it may provide a possible explanation to the successful approximation of the L1-minimization solution by neural networks with layers representing iterations, as practiced in the learned iterative shrinkage-thresholding algorithm.

S. Vedula, O. Senouf, A. M. Bronstein, O. V. Michailovich, M. Zibulevsky, Towards CT-quality ultrasound imaging using deep learning, arXiv:1710.06304, 2017 details

## Towards CT-quality ultrasound imaging using deep learning

S. Vedula, O. Senouf, A. M. Bronstein, O. V. Michailovich, M. Zibulevsky
arXiv:1710.06304, 2017
--->>

The cost-effectiveness and practical harmlessness of ultra- sound imaging have made it one of the most widespread tools for medical diagnosis. Unfortunately, the beam-forming based image formation produces granular speckle noise, blur- ring, shading and other artifacts. To overcome these effects, the ultimate goal would be to reconstruct the tissue acoustic properties by solving a full wave propagation inverse prob- lem. In this work, we make a step towards this goal, using Multi-Resolution Convolutional Neural Networks (CNN). As a result, we are able to reconstruct CT-quality images from the reflected ultrasound radio-frequency(RF) data obtained by simulation from real CT scans of a human body. We also show that CNN is able to imitate existing computationally heavy despeckling methods, thereby saving orders of magni- tude in computations and making them amenable to real-time applications.

O. Litany, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein, Deep Functional Maps: Structured prediction for dense shape correspondence, Proc. Int'l Conf. on Computer Vision (ICCV), 2017 details

## Deep Functional Maps: Structured prediction for dense shape correspondence

O. Litany, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein
Proc. Int'l Conf. on Computer Vision (ICCV), 2017
--->>

We introduce a new framework for learning dense correspondence between deformable 3D shapes. Existing learning based approaches model shape correspondence as a labelling problem, where each point of a query shape receives a label identifying a point on some reference domain; the correspondence is then constructed a posteriori by composing the label predictions of two input shapes. We propose a paradigm shift and design a structured prediction model in the space of functional maps, linear operators that provide a compact representation of the correspondence. We model the learning process via a deep residual network which takes dense descriptor fields defined on two shapes as input, and outputs a soft map between the two given objects. The resulting correspondence is shown to be accurate on several challenging benchmarks comprising multiple categories, synthetic models, real scans with acquisition artifacts, topological noise, and partiality.

Z. Laehner, M. Vestner, A. Boyarski, O. Litany, R. Slossberg, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein, R. Kimmel, D. Cremers, Efficient deformable shape correspondence via kernel matching, Proc. 3D Vision (3DV), 2017 details

## Efficient deformable shape correspondence via kernel matching

Z. Laehner, M. Vestner, A. Boyarski, O. Litany, R. Slossberg, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein, R. Kimmel, D. Cremers
Proc. 3D Vision (3DV), 2017

We present a method to match three dimensional shapes under non-isometric deformations, topology changes and partiality. We formulate the problem as matching between a set of pair-wise and point-wise descriptors, imposing a continuity prior on the mapping, and propose a projected descent optimization procedure inspired by difference of convex functions (DC) programming. Surprisingly, in spite of the highly non-convex nature of the resulting quadratic assignment problem, our method converges to a semantically meaningful and continuous mapping in most of our experiments, and scales well. We provide preliminary theoretical analysis and several interpretations of the method.

G. Alexandroni, Y. Podolsky, H. Greenspan, T. Remez, O. Litany, A. M. Bronstein, R. Giryes, White matter fiber representation using continuous dictionary learning, Proc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2017 details

## White matter fiber representation using continuous dictionary learning

G. Alexandroni, Y. Podolsky, H. Greenspan, T. Remez, O. Litany, A. M. Bronstein, R. Giryes
Proc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2017
--->>

With increasingly sophisticated Diffusion Weighted MRI acquisition methods and modelling techniques, very large sets of streamlines (fibers) are presently generated per imaged brain. These reconstructions of white matter architecture, which are important for human brain research and pre-surgical planning, require a large amount of storage and are often unwieldy and difficult to manipulate and analyze. This work proposes a novel continuous parsimonious framework in which signals are sparsely represented in a dictionary with continuous atoms. The significant innovation in our new methodology is the ability to train such continuous dictionaries, unlike previous approaches that either used pre-fixed continuous transforms or training with finite atoms. This leads to an innovative fiber representation method, which uses Continuous Dictionary Learning to sparsely code each fiber with high accuracy. This method is tested on numerous tractograms produced from the Human Connectome Project data and achieves state-of-the-art performances in compression ratio and reconstruction error.

M. Vestner, R. Litman, E. Rodolà, A. M. Bronstein, D. Cremers, Product Manifold Filter: Non-rigid shape correspondence via kernel density estimation in the product space, Proc. Computer Vision and Pattern Recognition (CVPR), 2017 details

## Product Manifold Filter: Non-rigid shape correspondence via kernel density estimation in the product space

M. Vestner, R. Litman, E. Rodolà, A. M. Bronstein, D. Cremers
Proc. Computer Vision and Pattern Recognition (CVPR), 2017
--->>

Many algorithms for the computation of correspondences between deformable shapes rely on some variant of nearest neighbor matching in a descriptor space. Such are, for example, various point-wise correspondence recovery algorithms used as a post-processing stage in the functional correspondence framework. Such frequently used techniques implicitly make restrictive assumptions (e.g., near-isometry) on the considered shapes and in practice suffer from a lack of accuracy and result in poor surjectivity. We propose an alternative recovery technique capable of guaranteeing a bijective correspondence and producing significantly higher accuracy and smoothness. Unlike other methods, our approach does not depend on the assumption that the analyzed shapes are isometric. We derive the proposed method from the statistical framework of kernel density estimation and demonstrate its performance on several challenging deformable 3D shape matching datasets.

O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, Fully spectral partial shape matching, Computer Graphics Forum, Vol. 36(2), 2017 details

## Fully spectral partial shape matching

O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein
Computer Graphics Forum, Vol. 36(2), 2017
--->>

We propose an efficient procedure for calculating partial dense intrinsic correspondence between deformable shapes performed entirely in the spectral domain. Our technique relies on the recently introduced partial functional maps formalism and on the joint approximate diagonalization (JAD) of the Laplace-Beltrami operators previously introduced for matching non-isometric shapes. We show that a variant of the JAD problem with an appropriately modified coupling term (surprisingly) allows to construct quasi-harmonic bases localized on the latent corresponding parts. This circumvents the need to explicitly compute the unknown parts by means of the cumbersome alternating minimization used in the previous approaches, and allows performing all the calculations in the spectral domain with constant complexity independent of the number of shape vertices. We provide an extensive evaluation of the proposed technique on standard non-rigid correspondence benchmarks and show state-of-the-art performance in various settings, including partiality and the presence of topological noise.

A. Boyarski, A. M. Bronstein, M. M. Bronstein, Subspace least squares multidimensional scaling, Proc. Scale Space and Variational Methods (SSVM), 2017 details

## Subspace least squares multidimensional scaling

A. Boyarski, A. M. Bronstein, M. M. Bronstein
Proc. Scale Space and Variational Methods (SSVM), 2017

Multidimensional Scaling (MDS) is one of the most popular methods for dimensionality reduction and visualization of high dimensional data. Apart from these tasks, it also found applications in the field of geometry processing for the analysis and reconstruction of non-rigid shapes. In this regard, MDS can be thought of as a shape from metric algorithm, consisting of finding a configuration of points in the Euclidean space that realize, as isometrically as possible, some given distance structure. In the present work we cast the least squares variant of MDS (LS-MDS) in the spectral domain. This uncovers a multiresolution property of distance scaling which speeds up the optimization by a significant amount, while producing comparable, and sometimes even better, embeddings.

T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Deep class-aware image denoising, Proc. Int'l Conf. on Image Processing (ICIP), 2017 details

## Deep class-aware image denoising

T. Remez, O. Litany, R. Giryes, A. M. Bronstein
Proc. Int'l Conf. on Image Processing (ICIP), 2017
--->>

The increasing demand for high image quality in mobile devices brings forth the need for better computational enhancement techniques, and image denoising in particular. To this end, we propose a new fully convolutional deep neural network architecture which is simple yet powerful and achieves state-of-the-art performance for additive Gaussian noise removal. Furthermore, we claim that the personal photo-collections can usually be categorized into a small set of semantic classes. However simple, this observation has not been exploited in image denoising until now. We show that a significant boost in performance of up to 0.4dB PSNR can be achieved by making our network class-aware, namely, by fine-tuning it for images belonging to a specific semantic class. Relying on the hugely successful existing image classifiers, this research advocates for using a class-aware approach in all image enhancement tasks.

O. Litany, T. Remez, A. M. Bronstein, Cloud Dictionary: Sparse coding and modeling for point clouds, arXiv:1612.04956 details

## Cloud Dictionary: Sparse coding and modeling for point clouds

O. Litany, T. Remez, A. M. Bronstein
arXiv:1612.04956
--->>

With the development of range sensors such as LIDAR and time-of-flight cameras, 3D point cloud scans have become ubiquitous in computer vision applications, the most prominent ones being gesture recognition and autonomous driving. Parsimony-based algorithms have shown great success on images and videos where data points are sampled on a regular Cartesian grid. We propose an adaptation of these techniques to irregularly sampled signals by using continuous dictionaries. We present an example application in the form of point cloud denoising.

T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Deep class-aware denoising, arXiv:1701.01698 details

## Deep class-aware denoising

T. Remez, O. Litany, R. Giryes, A. M. Bronstein
arXiv:1701.01698
--->>

The increasing demand for high image quality in mobile devices brings forth the need for better computational enhancement techniques, and image denoising in particular. At the same time, the images captured by these devices can be categorized into a small set of semantic classes. However simple, this observation has not been exploited in image denoising until now. In this paper, we demonstrate how the reconstruction quality improves when a denoiser is aware of the type of content in the image. To this end, we first propose a new fully convolutional deep neural network architecture which is simple yet powerful as it achieves state-of-the-art performance even without be- ing class-aware. We further show that a significant boost in performance of up to 0.4 dB PSNR can be achieved by making our network class-aware, namely, by fine-tuning it for images belonging to a specific semantic class. Relying on the hugely successful existing image classifiers, this research advocates for using a class-aware approach in all image enhancement tasks.

T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Deep convolutional denoising of low-light images, arXiv:1701.01687 details

## Deep convolutional denoising of low-light images

T. Remez, O. Litany, R. Giryes, A. M. Bronstein
arXiv:1701.01687
--->>

Poisson distribution is used for modeling noise in photon-limited imaging. While canonical examples include relatively exotic types of sensing like spectral imaging or astronomy, the problem is relevant to regular photography now more than ever due to the booming market for mobile cameras. Restricted form factor limits the amount of absorbed light, thus computational post-processing is called for. In this paper, we make use of the powerful framework of deep convolutional neural networks for Poisson denoising. We demonstrate how by training the same network with images having a specific peak value, our denoiser outperforms previous state-of-the-art by a large margin both visually and quantitatively. Being flexible and data-driven, our solution resolves the heavy ad hoc engineering used in previous methods and is an order of magnitude faster. We further show that by adding a reasonable prior on the class of the image being processed, another significant boost in performance is achieved.

O. Litany, T. Remez, D. Freedman, L. Shapira, A. M. Bronstein, R. Gal, ASIST: Automatic Semantically Invariant Scene Transformation, Computer Vision and Image Understanding, Vol. 157, 2017 details

## ASIST: Automatic Semantically Invariant Scene Transformation

O. Litany, T. Remez, D. Freedman, L. Shapira, A. M. Bronstein, R. Gal
Computer Vision and Image Understanding, Vol. 157, 2017
--->>

We present ASIST, a technique for transforming point clouds by replacing objects with their semantically equivalent counterparts. Transformations of this kind have applications in virtual reality, repair of fused scans, and robotics. ASIST is based on a unified formulation of semantic labeling and object replacement; both result from minimizing a single objective. We present numerical tools for the efficient solution of this optimization problem. The method is experimentally assessed on new datasets of both synthetic and real point clouds, and is additionally compared to two recent works on object replacement on data from the corresponding papers.

M. Ovsjanikov, E. Corman, M. M. Bronstein, E. Rodolà, M. Ben-Chen, L. Guibas, F. Chazal, A. M. Bronstein, Computing and processing correspondences with functional maps, SIGGRAPH Courses, 2017 details

## Computing and processing correspondences with functional maps

M. Ovsjanikov, E. Corman, M. M. Bronstein, E. Rodolà, M. Ben-Chen, L. Guibas, F. Chazal, A. M. Bronstein
SIGGRAPH Courses, 2017
--->>

Notions of similarity and correspondence between geometric shapes and images are central to many tasks in geometry processing, computer vision, and computer graphics. The goal of this course is to familiarize the audience with a set of recent techniques that greatly facilitate the computation of mappings or correspondences between geometric datasets, such as 3D shapes or 2D images by formulating them as mappings between functions rather than points or triangles. Methods based on the functional map framework have recently led to state-of-the-art results in problems as diverse as non-rigid shape matching, image co-segmentation and even some aspects of tangent vector field design. One challenge in adopting these methods in practice, however, is that their exposition often assumes a significant amount of background in geometry processing, spectral methods and functional analysis, which can make it difficult to gain an intuition about their performance or about their applicability to real-life problems. In this course, we try to provide all the tools necessary to appreciate and use these techniques, while assuming very little background knowledge. We also give a unifying treatment of these techniques, which may be difficult to extract from the individual publications and, at the same time, hint at the generality of this point of view, which can help tackle many problems in the analysis and creation of visual content. This course is structured as a half day course. We will assume that the participants have knowledge of basic linear algebra and some knowledge of differential geometry, to the extent of being familiar with the concepts of a manifold and a tangent vector space. We will discuss in detail the functional approach to finding correspondences between non-rigid shapes, the design and analysis of tangent vector fields on surfaces, consistent map estimation in networks of shapes and applications to shape and image segmentation, shape variability analysis, and other areas.

Y. Choukroun, A. Shtern, A. M. Bronstein, R. Kimmel, Hamiltonian operator for spectral shape analysis, arXiv:1611.01990, 2016 details

## Hamiltonian operator for spectral shape analysis

Y. Choukroun, A. Shtern, A. M. Bronstein, R. Kimmel
arXiv:1611.01990, 2016
--->>

Many shape analysis methods treat the geometry of an object as a metric space that can be captured by the Laplace-Beltrami operator. In this paper, we propose to adapt the classical Hamiltonian operator from quantum me- chanics to the field of shape analysis. To this end we study the addition of a potential function to the Laplacian as a generator for dual spaces in which shape processing is performed. We present a general optimization approach for solving variational problems involving the basis defined by the Hamilto- nian using perturbation theory for its eigenvectors. The suggested operator is shown to produce better functional spaces to operate with, as demon- strated on different shape analysis tasks.

A. M. Bronstein, Y. Choukroun, R. Kimmel, M. Sela, Consistent discretization and minimization of the L1 norm on manifolds, Proc. 3D Vision (3DV), 2016 details

## Consistent discretization and minimization of the L1 norm on manifolds

A. M. Bronstein, Y. Choukroun, R. Kimmel, M. Sela
Proc. 3D Vision (3DV), 2016
--->>

The L1 norm has been tremendously popular in signal and image processing in the past two decades due to its sparsity-promoting properties. More recently, its generalization to non-Euclidean domains has been found useful in shape analysis applications. For example, in conjunction with the minimization of the Dirichlet energy, it was shown to produce a compactly supported quasi-harmonic orthonormal basis, dubbed as compressed manifold modes. The continuous L1 norm on the manifold is often replaced by the vector l1 norm applied to sampled functions. We show that such an approach is incorrect in the sense that it does not consistently discretize the continuous norm and warn against its sensitivity to the specific sampling. We propose two alternative discretizations resulting in an iteratively-reweighed l2 norm. We demonstrate the proposed strategy on the compressed modes problem, which reduces to a sequence of simple eigendecomposition problems not requiring non-convex optimization on Stiefel manifolds and producing more stable and accurate results.

R. Litman, A. M. Bronstein, SpectroMeter: Amortized sublinear spectral approximation of distance on graphs, Proc. 3D Vision (3DV), 2016 details

## SpectroMeter: Amortized sublinear spectral approximation of distance on graphs

R. Litman, A. M. Bronstein
Proc. 3D Vision (3DV), 2016
--->>

We present a method to approximate pairwise distance on a graph, having an amortized sub-linear complexity in its size. The proposed method follows the so-called heat method due to Crane et al. The only additional input is the values of the eigenfunctions of the graph Laplacian at a subset of the vertices. Using these values we estimate a random walk from the source points, and normalize the result into a unit gradient function. The eigenfunctions are then used to synthesize distance values abiding by these constraints at desired locations. We show that this method works in practice on different types of inputs ranging from triangular meshes to general graphs. We also demonstrate that the resulting approximate distance is accurate enough to be used as the input to a recent method for intrinsic shape correspondence computation.

T. Remez, O. Litany, S. Yoseff, H. Haim, A. M. Bronstein, FPGA system for real-time computational extended depth of field imaging using phase aperture coding, arXiv:1608.01074 details

## FPGA system for real-time computational extended depth of field imaging using phase aperture coding

T. Remez, O. Litany, S. Yoseff, H. Haim, A. M. Bronstein
arXiv:1608.01074
--->>

We present a proof-of-concept end-to-end system for computational extended depth of field (EDOF) imaging. The acquisition is performed through a phase-coded aperture implemented by placing a thin wavelength-dependent op- tical mask inside the pupil of a conventional camera lens, as a result of which, each color channel is focused at a different depth. The reconstruction process re- ceives the raw Bayer image as the input, and performs blind estimation of the output color image in focus at an extended range of depths using a patch-wise sparse prior. We present a fast non-iterative reconstruction algorithm operating with constant latency in fixed-point arithmetics and achieving real-time perfor- mance in a prototype FPGA implementation. The output of the system, on simu- lated and real-life scenes, is qualitatively and quantitatively better than the result of clear-aperture imaging followed by state-of-the-art blind deblurring.

R. Giryes, G. Sapiro, A. M. Bronstein, Deep neural networks with random Gaussian weights: A universal classification strategy?, IEEE Trans. Signal Processing, Vol. 64(13), 2016 details

## Deep neural networks with random Gaussian weights: A universal classification strategy?

R. Giryes, G. Sapiro, A. M. Bronstein
IEEE Trans. Signal Processing, Vol. 64(13), 2016

Three important properties of a classification machinery are: (i) the system preserves the important information of the input data; (ii) the training examples convey information for unseen data; and (iii) the system is able to treat differently points from different classes. In this work, we show that these fundamental properties are inherited by the architecture of deep neural networks. We formally prove that these networks with random Gaussian weights perform a distance-preserving embedding of the data, with a special treatment for in-class and out-of-class data. Similar points at the input of the network are likely to have the same The theoretical analysis of deep networks here presented exploits tools used in the compressed sensing and dictionary learning literature, thereby making a formal connection between these important topics. The derived results allow drawing conclusions on the metric learning properties of the network and their relation to its structure; and provide bounds on the required size of the training set such that the training examples would represent faithfully the unseen data. The results are validated with state-of-the-art trained networks.

O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, D. Cremers, Non-rigid puzzles, Computer Graphics Forum, Vol. 35(5), 2016 (SGP Best Paper Award) details

## Non-rigid puzzles

O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, D. Cremers
Computer Graphics Forum, Vol. 35(5), 2016 (SGP Best Paper Award)
--->>

Shape correspondence is a fundamental problem in computer graphics and vision, with applications in various problems including animation, texture mapping, robotic vision, medical imaging, archaeology and many more. In settings where the shapes are allowed to undergo non-rigid deformations and only partial views are available, the problem becomes very challenging. To this end, we present a non-rigid multi-part shape matching algorithm. We assume to be given a reference shape and its multiple parts undergoing a non-rigid deformation. Each of these query parts can be additionally contaminated by clutter, may overlap with other parts, and there might be missing parts or redundant ones. Our method simultaneously solves for the segmentation of the reference model, and for a dense correspondence to (subsets of) the parts. Experimental results on synthetic as well as real scans demonstrate the effectiveness of our method in dealing with this challenging matching scenario.

X. Bian, H. Krim, A. M. Bronstein, L. Dai, Sparsity and nullity: paradigms for analysis dictionary learning, SIAM J. Imaging Sci., Vol. 9(3), 2016 details

## Sparsity and nullity: paradigms for analysis dictionary learning

X. Bian, H. Krim, A. M. Bronstein, L. Dai
SIAM J. Imaging Sci., Vol. 9(3), 2016
--->>

Sparse models in dictionary learning have been successfully applied in a wide variety of machine learning and computer vision problems, and as a result, have recently attracted increased research interest. Another interesting related problem based on linear equality constraints, namely the sparse null space (SNS) problem, first appeared in 1986 and has since inspired results on sparse basis pursuit. In this paper, we investigate the relation between the SNS problem and the analysis dictionary learning (ADL) problem, and show that the SNS problem plays a central role, and may be utilized to solve dictionary learning problems. Moreover, we propose an efficient algorithm of sparse null space basis pursuit (SNS-BP) and extend it to a solution of ADL. Experimental results on numerical synthetic data and real-world data are further presented to validate the performance of our method.

D. Pickup, X. Sun, P. L. Rosin, R. R. Martin, Z. Cheng, Z. Lian, M. Aono, A. Ben Hamza, A. M. Bronstein, M. M. Bronstein, S. Bu, U. Castellani, S. Cheng, V. Garro, A. Giachetti, A. Godil, J. Han, H. Johan, L. Lai, B. Li, C. Li, H. Li, R. Litman, X. Liu, Z. Liu, Y. Lu, A. Tatsuma, J. Ye, Shape retrieval of non-rigid 3D human models, Intl. Journal of Computer Vision (IJCV), 2016 details

## Shape retrieval of non-rigid 3D human models

D. Pickup, X. Sun, P. L. Rosin, R. R. Martin, Z. Cheng, Z. Lian, M. Aono, A. Ben Hamza, A. M. Bronstein, M. M. Bronstein, S. Bu, U. Castellani, S. Cheng, V. Garro, A. Giachetti, A. Godil, J. Han, H. Johan, L. Lai, B. Li, C. Li, H. Li, R. Litman, X. Liu, Z. Liu, Y. Lu, A. Tatsuma, J. Ye
Intl. Journal of Computer Vision (IJCV), 2016
--->>

3D models of humans are commonly used within computer graphics and vision, and so the ability to distinguish between body shapes is an important shape retrieval problem. We extend our recent paper which provided a benchmark for testing non-rigid 3D shape retrieval algorithms on 3D human models. This benchmark provided a far stricter challenge than previous shape benchmarks.We have added 145 new models for use as a separate training set, in order to standardise the training data used and provide a fairer comparison. We have also included experiments with the FAUST dataset of human scans. All participants of the previous benchmark study have taken part in the new tests reported here, many providing updated results using the new data. In addition, further participants have also taken part, and we provide extra analysis of the retrieval results. A total of 25 different shape retrieval methods are compared.

O. Litany, T. Remez, A. M. Bronstein, Image reconstruction from dense binary pixels, arXiv:1512.01774
D. Eynard, A. Kovnatsky, M. M. Bronstein, K. Glashoff, A. M. Bronstein, Multimodal manifold analysis using simultaneous diagonalization of Laplacians, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 37(12), 2015 details

## Multimodal manifold analysis using simultaneous diagonalization of Laplacians

D. Eynard, A. Kovnatsky, M. M. Bronstein, K. Glashoff, A. M. Bronstein
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 37(12), 2015
--->>

We construct an extension of spectral and diffusion geometry to multiple modalities through simultaneous diagonalization of Laplacian matrices. This naturally extends classical data analysis tools based on spectral geometry, such as diffusion maps and spectral clustering. We provide several synthetic and real examples of manifold learning, retrieval, and clustering demonstrating that the joint spectral geometry frequently better captures the inherent structure of multi-modal data. We also show the relation of many previous approaches to multimodal manifold analysis to our framework, of which the can be seen as particular cases.

T. Remez, O. Litany, A. M. Bronstein, A Picture is Worth a Billion Bits: Real-time image reconstruction from dense binary pixels, arXiv:1510.04601 details

## A Picture is Worth a Billion Bits: Real-time image reconstruction from dense binary pixels

T. Remez, O. Litany, A. M. Bronstein
arXiv:1510.04601
--->>

The pursuit of smaller pixel sizes at ever-increasing resolution in digital image sensors is mainly driven by the stringent price and form-factor requirements of sensors and optics in the cellular phone market. Recently, Eric Fossum proposed a novel concept of an image sensor with dense sub-diffraction limit one-bit pixels (jots), which can be considered a digital emulation of silver halide photographic film. This idea has been recently embodied as the EPFL Gigavision camera. A major bottleneck in the design of such sensors is the image reconstruction process, producing a continuous high dynamic range image from oversampled bi- nary measurements. The extreme quantization of the Pois- son statistics is incompatible with the assumptions of most standard image processing and enhancement frameworks. The recently proposed maximum-likelihood (ML) approach addresses this difficulty, but suffers from image artifacts and has impractically high computational complexity. In this work, we study a variant of a sensor with binary thresh- old pixels and propose a reconstruction algorithm combin- ing an ML data fitting term with a sparse synthesis prior. We also show an efficient hardware-friendly real-time approximation of this inverse operator. Promising results are shown on synthetic data as well as on HDR data emulated using multiple exposures of a regular CMOS sensor.

A. M. Bronstein, New dimensions of media, Universidad La Salle, Revista de ciencias de la computación, Vol. 3(1), 2015
H. Haim, A. M. Bronstein, E. Marom, Computational all-in-focus imaging using an optical phase mask, OSA Optics Express, Vol. 23(19), 2015 details

## Computational all-in-focus imaging using an optical phase mask

H. Haim, A. M. Bronstein, E. Marom
OSA Optics Express, Vol. 23(19), 2015
--->>

A method for extended depth of field imaging based on image acquisition through a thin binary phase plate followed by fast automatic computational post-processing is presented. By placing a wavelength dependent optical mask inside the pupil of a conventional camera lens, one acquires a unique response for each of the three main color channels, which adds valuable information that allows blind reconstruction of blurred images without the need of an iterative search process for estimating the blurring kernel. The presented simulation as well as capture of a real life scene show how acquiring a one-shot image focused at a single plane, enable generating a de-blurred scene over an extended range in space.

R. Litman, S. Korman, A. M. Bronstein, S. Avidan, GMD: Global model detection via inlier rate estimation, Proc. Computer Vision and Pattern Recognition (CVPR), 2015 details

## GMD: Global model detection via inlier rate estimation

R. Litman, S. Korman, A. M. Bronstein, S. Avidan
Proc. Computer Vision and Pattern Recognition (CVPR), 2015
--->>

This work presents a novel approach for detecting inliers in a given set of correspondences (matches). It does so without explicitly identifying any consensus set, based on a method for inlier rate estimation (IRE). Given such an estimator for the inlier rate, we also present an algorithm that detects a globally optimal transformation. We provide a theoretical analysis of the IRE method using a stochastic generative model on the continuous spaces of matches and transformations. This model allows rigorous investigation of the limits of our IRE method for the case of 2D translation, further giving bounds and insights for the more general case. Our theoretical analysis is validated empirically and is shown to hold in practice for the more general case of 2D affinities. In addition, we show that the combined framework works on challenging cases of 2D homography estimation, with very few and possibly noisy inliers, where RANSAC generally fails.

I. Sipiran, B. Bustos, T. Schreck, A. M. Bronstein, M. M. Bronstein, U. Castellani, S. Choi, L. Lai, H. Li, R. Litman, L. Sun, SHREC'15 Track: Scalability of non-rigid 3D shape retrieval, Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2015 details

## SHREC'15 Track: Scalability of non-rigid 3D shape retrieval

I. Sipiran, B. Bustos, T. Schreck, A. M. Bronstein, M. M. Bronstein, U. Castellani, S. Choi, L. Lai, H. Li, R. Litman, L. Sun
Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2015
--->>

Due to recent advances in 3D acquisition and modeling, increasingly large amounts of 3D shape data become available in many application domains. This rises not only the need for effective methods for 3D shape retrieval, but also efficient retrieval and robust implementations. Previous 3D retrieval challenges have mainly considered data sets in the range of a few thousands of queries. In the 2015 SHREC track on Scalability of 3D Shape Retrieval we provide a benchmark with more than 96 thousand shapes. The data set is based on a non-rigid retrieval benchmark enhanced by other existing shape benchmarks. From the baseline models, a large set of partial objects were automatically created by simulating a range-image acquisition process. Four teams have participated in the track, with most methods providing very good to near-perfect retrieval results, and one less complex baseline method providing fair performance. Timing results indicate that three of the methods including the latter baseline one provide near- interactive time query execution. Generally, the cost of data pre-processing varies depending on the method.

X. Bian, H. Krim, A. M. Bronstein, L. Dai, Sparse null space basis pursuit and analysis dictionary learning for high-dimensional data analysis, Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2015 details

## Sparse null space basis pursuit and analysis dictionary learning for high-dimensional data analysis

X. Bian, H. Krim, A. M. Bronstein, L. Dai
Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2015
--->>

Sparse models in dictionary learning have been successfully applied in a wide variety of machine learning and computer vision problems, and have also recently been of increasing research interest. Another interesting related problem based on a linear equality constraint, namely the sparse null space problem (SNS), first appeared in 1986, and has since inspired results on sparse basis pursuit. In this paper, we investigate the relation between the SNS problem and the analysis dictionary learning problem, and show that the SNS problem plays a central role, and may be utilized to solve dictionary learning problems. Moreover, we propose an efficient algorithm of sparse null space basis pursuit, and extend it to a solution of analysis dictionary learning. Experimental results on numerical synthetic data and realworld data are further presented to validate the performance of our method.

Y. Aflalo, A. M. Bronstein, R. Kimmel, On convex relaxation of graph isomorphism, Proc. US National Academy of Sciences (PNAS), 2015 details

## On convex relaxation of graph isomorphism

Y. Aflalo, A. M. Bronstein, R. Kimmel
Proc. US National Academy of Sciences (PNAS), 2015

We consider the problem of exact and inexact matching of weighted undirected graphs, in which a bijective correspondence is sought to minimize a quadratic weight disagreement. This computationally challenging problem is often relaxed as a convex quadratic program, in which the space of permutations is replaced by the space of doubly stochastic matrices. However, the applicability of such a relaxation is poorly understood. We define a broad class of friendly graphs characterized by an easily verifiable spectral property. We prove that for friendly graphs, the convex relaxation is guaranteed to find the exact isomorphism or certify its inexistence. This result is further extended to approximately isomorphic graphs, for which we develop an explicit bound on the amount of weight disagreement under which the relaxation is guaranteed to find the globally optimal approximate isomorphism. We also show that in many cases, the graph matching problem can be further harmlessly relaxed to a convex quadratic program with only n separable linear equality constraints, which is substantially more efficient than the standard relaxation involving 2n equality and n2 inequality constraints. Finally, we show that our results are still valid for unfriendly graphs if additional information in the form of seeds or attributes is allowed, with the latter satisfying an easy to verify spectral characteristic.

P. Sprechmann, A. M. Bronstein, G. Sapiro, Learning efficient sparse and low-rank models, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 37(9), 2015 details

## Learning efficient sparse and low-rank models

P. Sprechmann, A. M. Bronstein, G. Sapiro
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 37(9), 2015
--->>

Parsimony, including sparsity and low rank, has been shown to successfully model data in numerous machine learning and signal processing tasks. Traditionally, parsimonious modeling approaches rely on an iterative algorithm that minimizes an objective function with parsimony-promoting terms. The inherently sequential structure and data-dependent complexity and latency of iterative optimization constitute a major limitation in many applications requiring real-time performance or involving large-scale data. Another limitation encountered by these models is the difficulty of their inclusion in supervised learning scenarios, where the higher-level training objective would depend on the solution of the lower-level pursuit problem. The resulting bilevel optimization problems are in general notoriously difficult to solve. In this paper, we propose to move the emphasis from the model to the pursuit algorithm, and develop a process-centric view of parsimonious modeling, in which a deterministic fixed-complexity pursuit process is used in lieu of iterative optimization. We show a principled way to construct learnable pursuit process architectures for structured sparse and robust low rank models from the iteration of proximal descent algorithms. These architectures approximate the exact parsimonious representation with a fraction of the complexity of the standard optimization methods. We also show that carefully chosen training regimes allow to naturally extend parsimonious models to discriminative settings. State-of-the-art results are demonstrated on several challenging problems in image and audio processing with several orders of magnitude speedup compared to the exact optimization algorithms.

P. Sprechmann, A. M. Bronstein, G. Sapiro, Supervised non-negative matrix factorization for audio source separation, Chapter in Excursions in Harmonic Analysis (R. Balan, M. Begue, J. J. Benedetto, W. Czaja, K. Okoudjou Eds.), Birkhaeuser, 2015 details

## Supervised non-negative matrix factorization for audio source separation

P. Sprechmann, A. M. Bronstein, G. Sapiro
Chapter in Excursions in Harmonic Analysis (R. Balan, M. Begue, J. J. Benedetto, W. Czaja, K. Okoudjou Eds.), Birkhaeuser, 2015
--->>

Source separation is a widely studied problems in signal processing. Despite the permanent progress reported in the literature it is still considered a significant challenge. This chapter first reviews the use of non-negative matrix factorization (NMF) algorithms for solving source separation problems, and proposes a new way for the supervised training in NMF. Matrix factorization methods have received a lot of attention in recent year in the audio processing community, producing particularly good results in source separation. Traditionally, NMF algorithms consist of two separate stages: a training stage, in which a generative model is learned; and a testing stage in which the pre-learned model is used in a high level task such as enhancement, separation, or classification. As an alternative, we propose a tasksupervised NMF method for the adaptation of the basis spectra learned in the first stage to enhance the performance on the specific task used in the second stage. We cast this problem as a bilevel optimization program efficiently solved via stochastic gradient descent. The proposed approach is general enough to handle sparsity priors of the activations, and allow non-Euclidean data terms such as beta-divergences. The framework is evaluated on speech enhancement.

Q. Qiu, G. Sapiro, A. M. Bronstein, Random forests can hash, arXiv:1412.5083 details

## Random forests can hash

Q. Qiu, G. Sapiro, A. M. Bronstein
arXiv:1412.5083
--->>

Hash codes are a very efficient data representation needed to be able to cope with the ever growing amounts of data. We introduce a random forest semantic hashing scheme with information-theoretic code aggregation, showing for the first time how random forest, a technique that together with deep learning have shown spectacular results in classification, can also be extended to large-scale retrieval. Traditional random forest fails to enforce the consistency of hashes generated from each tree for the same class data, i.e., to preserve the underlying similarity, and it also lacks a principled way for code aggregation across trees. We start with a simple hashing scheme, where independently trained random trees in a forest are acting as hashing functions. We the propose a subspace model as the splitting function, and show that it enforces the hash consistency in a tree for data from the same class. We also introduce an information-theoretic approach for aggregating codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. Experiments on large-scale public datasets are presented, showing that the proposed approach significantly outperforms state-of-the-art hashing methods for retrieval tasks.

P. Sprechmann, A. M. Bronstein, G. Sapiro, Supervised non-Euclidean sparse NMF via bilevel optimization with applications to speech enhancement, Proc. Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), 2014 details

## Supervised non-Euclidean sparse NMF via bilevel optimization with applications to speech enhancement

P. Sprechmann, A. M. Bronstein, G. Sapiro
Proc. Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), 2014
--->>

Traditionally, NMF algorithms consist of two separate stages: a training stage, in which a generative model is learned; and a testing stage in which the pre-learned model is used in a high level task such as enhancement, separation, or classification. As an alternative, we propose a task-supervised NMF method for the adaptation of the basis spectra learned in the first stage to enhance the performance on the specific task used in the second stage. We cast this problem as a bilevel optimization program that can be efficiently solved via stochastic gradient descent. The proposed approach is general enough to handle sparsity priors of the activations, and allow non-Euclidean data terms such as beta-divergences. The framework is evaluated on single-channel speech enhancement tasks.

S. Korman, R. Litman, S. Avidan, A. M. Bronstein, Probably approximately symmetric: Fast rigid symmetry detection with global guarantees, Computer Graphics Forum (CGF), Vol. 34(1), 2014 details

## Probably approximately symmetric: Fast rigid symmetry detection with global guarantees

S. Korman, R. Litman, S. Avidan, A. M. Bronstein
Computer Graphics Forum (CGF), Vol. 34(1), 2014
--->>

We present a fast algorithm for global 3D symmetry detection with approximation guarantees. The algorithm is guaranteed to find the best approximate symmetry of a given shape, to within a user-specified threshold, with very high probability. Our method uses a carefully designed sampling of the transformation space, where each transformation is efficiently evaluated using a sub-linear algorithm. We prove that the density of the sampling depends on the total variation of the shape, allowing us to derive formal bounds on the algorithm’s complexity and approximation quality. We further investigate different volumetric shape representations (in the form of truncated distance transforms), and in such a way control the total variation of the shape and hence the sampling density and the runtime of the algorithm. A comprehensive set of experiments assesses the proposed method, including an evaluation on the eight categories of the COSEG data-set. This is the first large-scale evaluation of any symmetry detection technique that we are aware of.

R. Litman, A. M. Bronstein, M. M. Bronstein, U. Castellani, Supervised learning of bag-of-features shape descriptors using sparse coding, Computer Graphics Forum (CGF), Vol. 33(5), 2014 details

## Supervised learning of bag-of-features shape descriptors using sparse coding

R. Litman, A. M. Bronstein, M. M. Bronstein, U. Castellani
Computer Graphics Forum (CGF), Vol. 33(5), 2014
--->>

We present a method for supervised learning of shape descriptors for shape retrieval applications. Many content-based shape retrieval approaches follow the bag-of-features (BoF) paradigm commonly used in text and image retrieval by first computing local shape descriptors, and then representing them in a `geometric dictionary’ using vector quantization. A major drawback of such approaches is that the dictionary is constructed in an unsupervised manner using clustering, unaware of the last stage of the process (pooling of the local descriptors into a BoF, and comparison of the latter using some metric). In this paper, we replace the clustering with dictionary learning, where every atom acts as a feature, followed by sparse coding and pooling to get the final BoF descriptor. Both the dictionary and the sparse codes can be learned in the supervised regime via bi-level optimization using a task-specific objective that promotes invariance desired in the specific application. We show significant performance improvement on several standard shape retrieval benchmarks.

O. Menashe, A. M. Bronstein, Real-time compressed imaging of scattering volumes, Proc. Int'l Conf. on Image Processing (ICIP), 2014 details

## Real-time compressed imaging of scattering volumes

O. Menashe, A. M. Bronstein
Proc. Int'l Conf. on Image Processing (ICIP), 2014
--->>

We propose a method and a prototype imaging system for real-time reconstruction of volumetric piecewise-smooth scattering media. The volume is illuminated by a sequence of structured binary patterns emitted from a fan beam projector, and the scattered light is collected by a two-dimensional sensor, thus creating an under-complete set of compressed measurements. We show a fixed-complexity and latency reconstruction algorithm capable of estimating the scattering coefficients in real-time. We also show a simple greedy algorithm for learning the optimal illumination patterns. Our results demonstrate faithful reconstruction from highly compressed measurements. Furthermore, a method for compressed registration of the measured volume to a known template is presented, showing excellent alignment with just a single projection. Though our prototype system operates in visible light, the presented methodology is suitable for fast x-ray scattering imaging, in particular in real-time vascular medical imaging.

S. Biasotti, A. Cerri, A. M. Bronstein, M. M. Bronstein, Quantifying 3D shape similarity using maps: Recent trends, applications and perspectives, Proc. EUROGRAPHICS STARS, 2014 details

## Quantifying 3D shape similarity using maps: Recent trends, applications and perspectives

S. Biasotti, A. Cerri, A. M. Bronstein, M. M. Bronstein
Proc. EUROGRAPHICS STARS, 2014
--->>

Shape similarity is an acute issue in Computer Vision and Computer Graphics that involves many aspects of human perception of the real world, including judged and perceived similarity concepts, deterministic and probabilistic decisions and their formalization. 3D models carry multiple information with them (e.g., geometry, topology, texture, time evolution, appearance), which can be thought as the filter that drives the recognition process. Assessing and quantifying the similarity between 3D shapes is necessary to explore large dataset of shapes, and tune the analysis framework to the userÕs needs. Many efforts have been done in this sense, including several attempts to formalize suitable notions of similarity and distance among 3D objects and their shapes. In the last years, 3D shape analysis knew a rapidly growing interest in a number of challenging issues, ranging from deformable shape similarity to partial matching and view-point selection. In this panorama, we focus on methods which quantify shape similarity (between two objects and sets of models) and compare these shapes in terms of their properties (i.e., global and local, geometric, differential and topological) conveyed by (sets of) maps. After presenting in detail the theoretical foundations underlying these methods, we review their usage in a number of 3D shape application domains, ranging from matching and retrieval to annotation and segmentation. Particular emphasis will be given to analyze the suitability of the different methods for specific classes of shapes (e.g. rigid or isometric shapes), as well as the flexibility of the various methods at the different stages of the shape comparison process. Finally, the most promising directions for future research developments are discussed.

J. Masci, M. M. Bronstein, A. M. Bronstein, J. Schmidhuber, Multimodal similarity-preserving hashing, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 36(4), 2014 details

## Multimodal similarity-preserving hashing

J. Masci, M. M. Bronstein, A. M. Bronstein, J. Schmidhuber
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 36(4), 2014
--->>

We introduce an efficient computational framework for hashing data belonging to multiple modalities into a single representation space where they become mutually comparable. The proposed approach is based on a novel coupled siamese neural network architecture and allows unified treatment of intra- and inter-modality similarity learning. Unlike existing cross-modality similarity learning approaches, our hashing functions are not limited to binarized linear projections and can assume arbitrarily complex forms. We show experimentally that our method significantly outperforms state-of-the-art hashing approaches on multimedia retrieval tasks.

J. Masci, A. M. Bronstein, M. M. Bronstein, P. Sprechmann, G. Sapiro, Sparse similarity-preserving hashing, Proc. Int'l Conf. on Learning Representations (ICLR), 2014 details

## Sparse similarity-preserving hashing

J. Masci, A. M. Bronstein, M. M. Bronstein, P. Sprechmann, G. Sapiro
Proc. Int'l Conf. on Learning Representations (ICLR), 2014
--->>

In recent years, a lot of attention has been devoted to efficient nearest neighbor search by means of similarity-preserving hashing. One of the plights of existing hashing techniques is the intrinsic trade-off between performance and computational complexity: while longer hash codes allow for lower false positive rates, it is very difficult to increase the embedding dimensionality without incurring in very high false negatives rates or prohibiting computational costs. In this paper, we propose a way to overcome this limitation by enforcing the hash codes to be sparse. Sparse high-dimensional codes enjoy from the low false positive rates typical of long hashes, while keeping the false negative rates similar to those of a shorter dense hashing scheme with equal number of degrees of freedom. We use a tailored feed-forward neural network for the hashing function. Extensive experimental evaluation involving visual and multi-modal data shows the benefits of the proposed method.

D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen, Equi-affine invariant intrinsic geometries for bendable shapes analysis, Journal of Mathematical Imaging and Vision (JMIV), Vol. 50(1), 2014 details

## Equi-affine invariant intrinsic geometries for bendable shapes analysis

D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen
Journal of Mathematical Imaging and Vision (JMIV), Vol. 50(1), 2014
--->>

Traditional models of bendable surfaces are based on the exact or approximate invariance to deformations that do not tear or stretch the shape, leaving intact an intrinsic geometry associated with it. Intrinsic geometries are typically defined using either the shortest path length (geodesic distance), or properties of heat diffusion (diffusion distance) on the surface. Both ways are implicitly derived from the metric induced by the ambient Euclidean space. In this paper, we depart from this restrictive assumption by observing that a different choice of the metric results in a richer set of geometric invariants. We extend the classic equi-affine arclength, defined on convex surfaces, to arbitrary shapes with non-vanishing gaussian curvature. As a result, a family of affine- invariant intrinsic geometries is obtained. The potential of this novel framework is explored in a wide range of applications such as shape matching and retrieval, symmetry detection, and computation of Voroni tessellation. We show that in some shape analysis tasks, our affine-invariant intrinsic geometries often outperform their Euclidean-based counterparts.

D. Pickup, X. Sun, P. L. Rosin, R. R. Martin, Z. Cheng, Z. Lian, M. Aono, A. Ben Hamza, A. M. Bronstein, M. M. Bronstein, S. Bu, U. Castellani, S. Cheng, V. Garro, A. Giachetti, A. Godil, J. Han, H. Johan, L. Lai, B. Li, C. Li, H. Li, R. Litman, X. Liu, Z. Liu, Y. Lu, A. Tatsuma, J. Ye, Shape retrieval of non-rigid 3D human models, Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2014 details

## Shape retrieval of non-rigid 3D human models

D. Pickup, X. Sun, P. L. Rosin, R. R. Martin, Z. Cheng, Z. Lian, M. Aono, A. Ben Hamza, A. M. Bronstein, M. M. Bronstein, S. Bu, U. Castellani, S. Cheng, V. Garro, A. Giachetti, A. Godil, J. Han, H. Johan, L. Lai, B. Li, C. Li, H. Li, R. Litman, X. Liu, Z. Liu, Y. Lu, A. Tatsuma, J. Ye
Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2014
--->>

We have created a new dataset for non-rigid 3D shape retrieval, one that is much more challenging than existing datasets. Our dataset features exclusively human models, in a variety of body shapes and poses. 3D models of humans are commonly used within computer graphics and vision, therefore the ability to distinguish between body shapes is an important feature for shape retrieval methods. In this track nine groups have submitted the results of a total of 22 different methods which have been tested on our new dataset.

P. Sprechmann, R. Litman, T. Ben Yakar, A. M. Bronstein, G. Sapiro, Efficient supervised sparse analysis and synthesis operators, Proc. Neural Information Proc. Systems (NIPS), 2013 details

## Efficient supervised sparse analysis and synthesis operators

P. Sprechmann, R. Litman, T. Ben Yakar, A. M. Bronstein, G. Sapiro
Proc. Neural Information Proc. Systems (NIPS), 2013
--->>

In this paper, we propose a new and computationally efficient framework for learning sparse models. We formulate a unified approach that contains as particular cases models promoting sparse synthesis and analysis type of priors, and mixtures thereof. The supervised training of the proposed model is formulated as a bilevel optimization problem, in which the operators are optimized to achieve the best possible performance on a specific task, e.g., reconstruction or classification. By restricting the operators to be shift invariant, our approach can be thought as a way of learning analysis+synthesis sparsity-promoting convolutional operators. Leveraging recent ideas on fast trainable regressors designed to approximate exact sparse codes, we propose a way of constructing feed-forward neural networks capable of approximating the learned models at a fraction of the computational cost of exact solvers. In the shift-invariant case, this leads to a principled way of constructing task-specific convolutional networks. We illustrate the proposed models on several experiments in music analysis and image processing applications.

T. Ben Yakar, R. Litman, P. Sprechmann, A. M. Bronstein, G. Sapiro, Bilevel sparse models for polyphonic music transcription, Proc. Annual Conf. of the Int'l Society for Music Info. Retrieval (ISMIR), 2013 details

## Bilevel sparse models for polyphonic music transcription

T. Ben Yakar, R. Litman, P. Sprechmann, A. M. Bronstein, G. Sapiro
Proc. Annual Conf. of the Int'l Society for Music Info. Retrieval (ISMIR), 2013
--->>

In this work, we propose a trainable sparse model for automatic polyphonic music transcription, which incorporates several successful approaches into a unified optimization framework. Our model combines unsupervised synthesis models similar to latent component analysis and nonnegative factorization with metric learning techniques that allow supervised discriminative learning. We develop efficient stochastic gradient training schemes allowing unsupervised, semi-, and fully supervised training of the model as well its adaptation to test data. We show efficient fixed complexity and latency approximation that can replace iterative minimization algorithms in time-critical applications. Experimental evaluation on synthetic and real data shows promising initial results.

J. Pokrass, A. M. Bronstein, M. M. Bronstein, P. Sprechmann, G. Sapiro, Sparse modeling of intrinsic correspondences, Computer Graphics Forum (CGF), Vol. 32(2), 2013 details

## Sparse modeling of intrinsic correspondences

J. Pokrass, A. M. Bronstein, M. M. Bronstein, P. Sprechmann, G. Sapiro
Computer Graphics Forum (CGF), Vol. 32(2), 2013
--->>

We present a novel sparse modeling approach to non-rigid shape matching using only the ability to detect repeatable regions. As the input to our algorithm, we are given only two sets of regions in two shapes; no descriptors are provided so the correspondence between the regions is not know, nor we know how many regions correspond in the two shapes. We show that even with such scarce information, it is possible to establish very accurate correspondence between the shapes by using methods from the field of sparse modeling, being this, the first non-trivial use of sparse models in shape correspondence. We formulate the problem of permuted sparse coding, in which we solve simultaneously for an unknown permutation ordering the regions on two shapes and for an unknown correspondence in functional representation. We also propose a robust variant capable of handling incomplete matches. Numerically, the problem is solved efficiently by alternating the solution of a linear assignment and a sparse coding problem. The proposed methods are evaluated qualitatively and quantitatively on standard benchmarks containing both synthetic and scanned objects.

A. Kovnatsky, M. M. Bronstein, A. M. Bronstein, K. Glashoff, R. Kimmel, Coupled quasi-harmonic bases, Computer Graphics Forum (CGF), Vol. 32(2), 2013 details

## Coupled quasi-harmonic bases

A. Kovnatsky, M. M. Bronstein, A. M. Bronstein, K. Glashoff, R. Kimmel
Computer Graphics Forum (CGF), Vol. 32(2), 2013
--->>

State-of-the-art approaches to shape analysis, synthesis, and correspondence rely on these natural harmonic bases that allow using classical tools from harmonic analysis on manifolds. However, many applications involving multiple shapes are obstacled by the fact that Laplacian eigenbases computed independently on different shapes are often incompatible with each other. In this paper, we propose the construction of common approximate eigenbases for multiple shapes using approximate joint diagonalization algorithms, taking as input a set of corresponding functions (e.g. indicator functions of stable regions) on the two shapes. We illustrate the benefits of the proposed approach on tasks from shape editing, pose transfer, correspondence, and similarity.

P. Sprechmann, A. M. Bronstein, J.-M. Morel, G. Sapiro, Audio restoration from multiple copies, Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2013 details

## Audio restoration from multiple copies

P. Sprechmann, A. M. Bronstein, J.-M. Morel, G. Sapiro
Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2013
--->>

A method for removing impulse noise from audio signals by fusing multiple copies of the same recording is introduced in this paper. The proposed algorithm exploits the fact that while in general multiple copies of a given recording are available, all sharing the same master, most degradations in audio signals are record-dependent. Our method first seeks for the optimal non-rigid alignment of the signals that is robust to the presence of sparse outliers with arbitrary magnitude. Unlike previous approaches, we simultaneously find the optimal alignment of the signals and impulsive degradation. This is obtained via continuous dynamic time warping computed solving an Eikonal equation. We propose to use our approach in the derivative domain, reconstructing the signal by solving an inverse problem that resembles the Poisson image editing technique. The proposed framework is here illustrated and tested in the restoration of old gramophone recordings showing promising results; however, it can be used in other application where different copies of the signal of interest are available and the degradations are copy-dependent.

P. Sprechmann, A. M. Bronstein, M. M. Bronstein, G. Sapiro, Learnable low rank sparse models for speech denoising, Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2013 details

## Learnable low rank sparse models for speech denoising

P. Sprechmann, A. M. Bronstein, M. M. Bronstein, G. Sapiro
Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2013
--->>

In this paper we present a framework for real time enhancement of speech signals. Our method leverages a new process-centric approach for sparse and parsimonious models, where the representation pursuit is obtained applying a deterministic function or process rather than solving an optimization problem. We first propose a rank-regularized robust version of non-negative matrix factorization (NMF) for modeling time-frequency representations of speech signals in which the spectral frames are decomposed as sparse linear combinations of atoms of a low-rank dictionary. Then, a parametric family of pursuit processes is derived from the iteration of the proximal descent method for solving this model. We present several experiments showing successful results and the potential of the proposed framework. Incorporating discriminative learning makes the proposed method significantly outperform exact NMF algorithms, with fixed latency and at a fraction of it’s computational complexity.

A. Kovnatski, D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Geometric and photometric data fusion in non-rigid shape analysis, Numerical Mathematics: Theory, Methods and Applications (NM-TMA), Vol. 6(1), 2013 details

## Geometric and photometric data fusion in non-rigid shape analysis

A. Kovnatski, D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel
Numerical Mathematics: Theory, Methods and Applications (NM-TMA), Vol. 6(1), 2013
--->>

In this paper, we explore the use of the diffusion geometry framework for the fusion of geometric and photometric information in local and global shape descriptors. Our construction is based on the definition of a diffusion process on the shape manifold embedded into a high-dimensional space where the embedding coordinates represent the photometric information. Experimental results show that such data fusion is useful in coping with different challenges of shape analysis where pure geometric and pure photometric methods fail.

J. Pokrass, A. M. Bronstein, M. M. Bronstein, Partial shape matching without point-wise correspondence, Numerical Mathematics: Theory, Methods and Applications (NM-TMA), Vol. 6(1), 2013 details

## Partial shape matching without point-wise correspondence

J. Pokrass, A. M. Bronstein, M. M. Bronstein
Numerical Mathematics: Theory, Methods and Applications (NM-TMA), Vol. 6(1), 2013
--->>

Partial similarity of shapes in a challenging problem arising in many important applications in computer vision, shape analysis, and graphics, e.g. when one has to deal with partial information and acquisition artifacts. The problem is especially hard when the underlying shapes are non-rigid and are given up to a deformation. Partial matching is usually approached by computing local descriptors on a pair of shapes and then establishing a point-wise non-bijective correspondence between the two, taking into account possibly different parts. In this paper, we introduce an alternative correspondence-less approach to matching fragments to an entire shape undergoing a non-rigid deformation. We use diffusion geometric descriptors and optimize over the integration domains on which the integral descriptors of the two parts match. The problem is regularized using the Mumford-Shah functional. We show an efficient discretization based on the Ambrosio-Tortorelli approximation generalized to triangular meshes and point clouds, and present experiments demonstrating the success of the proposed method.

R. Litman, and A. M. Bronstein, Learning spectral descriptors for deformable shape correspondence, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 36(1), 2013 details

## Learning spectral descriptors for deformable shape correspondence

R. Litman, and A. M. Bronstein
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 36(1), 2013
--->>

Informative and discriminative feature descriptors play a fundamental role in deformable shape analysis. For example, they have been successfully employed in correspondence, registration, and retrieval tasks. In the recent years, significant attention has been devoted to descriptors obtained from the spectral decomposition of the Laplace-Beltrami operator associated with the shape. Notable examples in this family are the heat kernel signature (HKS) and the recently introduced wave kernel signature (WKS). Laplacian-based descriptors achieve state-of-the-art performance in numerous shape analysis tasks; they are computationally efficient, isometry-invariant by construction, and can gracefully cope with a variety of transformations. In this paper, we formulate a generic family of parametric spectral descriptors. We argue that in order to be optimized for a specific task, the descriptor should take into account the statistics of the corpus of shapes to which it is applied (the “signal”) and those of the class of transformations to which it is made insensitive (the “noise”). While such statistics are hard to model axiomatically, they can be learned from examples. Following the spirit of the Wiener filter in signal processing, we show a learning scheme for the construction of optimized spectral descriptors and relate it to Mahalanobis metric learning. The superiority of the proposed approach in generating correspondences is demonstrated on synthetic and scanned human figures. We also show that the learned descriptors are robust enough to be learned on synthetic data and transferred successfully to scanned shapes.

P. Sprechmann, A. M. Bronstein, G. Sapiro, Real-time online singing voice separation from monaural recordings using robust low-rank modeling, Proc. Annual Conference of the Int'l Society for Music Information Retrieval (ISMIR), 2012 (Best poster presentation award) details

## Real-time online singing voice separation from monaural recordings using robust low-rank modeling

P. Sprechmann, A. M. Bronstein, G. Sapiro
Proc. Annual Conference of the Int'l Society for Music Information Retrieval (ISMIR), 2012 (Best poster presentation award)
--->>

Separating the leading vocals from the musical accompaniment is a challenging task that appears naturally in several music processing applications. Robust principal component analysis (RPCA) has been recently employed to this problem producing very successful results. The method decomposes the signal into a low-rank component corresponding to the accompaniment with its repetitive structure, and a sparse component corresponding to the voice with its quasi-harmonic structure. In this paper, we first introduce a non-negative variant of RPCA, termed as robust low-rank non-negative matrix factorization (RNMF). This new framework better suits audio applications. We then propose two efficient feed-forward architectures that approximate the RPCA and RNMF with low latency and a fraction of the complexity of the original optimization method. These approximants allow incorporating elements of unsupervised, semi- and fully-supervised learning into the RPCA and RNMF frameworks. Our basic implementation shows several orders of magnitude speedup compared to the exact solvers with no performance degradation, and allows online and faster-than-real-time processing. Evaluation on the MIR-1K dataset demonstrates state-of-the-art performance.

O. Litany, A. M. Bronstein, M. M. Bronstein, Putting the pieces together: regularized multi-shape partial matching, Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2012 details

## Putting the pieces together: regularized multi-shape partial matching

O. Litany, A. M. Bronstein, M. M. Bronstein
Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2012
--->>

Multi-part shape matching in an important class of problems, arising in many fields such as computational archaeology, biology, geometry processing, computer graphics and vision. In this paper, we address the problem of simultaneous matching and segmentation of multiple shapes. We assume to be given a reference shape and multiple parts partially matching the reference. Each of these parts can have additional clutter, have overlap with other parts, or there might be missing parts. We show experimental results of efficient and accurate assembly of fractured synthetic and real objects.

A. Kovnatsky, A. M. Bronstein, M. M. Bronstein, Stable spectral mesh filtering, Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2012 details

## Stable spectral mesh filtering

A. Kovnatsky, A. M. Bronstein, M. M. Bronstein
Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2012
--->>

The rapid development of 3D acquisition technology has brought with itself the need to perform standard signal processing operations such as filters on 3D data. It has been shown that the eigenfunctions of the Laplace-Beltrami operator (manifold harmonics) of a surface play the role of the Fourier basis in the Euclidean space; it is thus possible to formulate signal analysis and synthesis in the manifold harmonics basis. In particular, geometry filtering can be carried out in the manifold harmonics domain by decomposing the embedding coordinates of the shape in this basis. However, since the basis functions depend on the shape itself, such filtering is valid only for weak (near all-pass) filters, and produces severe artifacts otherwise. In this paper, we analyze this problem and propose the fractional filtering approach, wherein we apply iteratively weak fractional powers of the filter, followed by the update of the basis functions. Experimental results show that such a process produces more plausible and meaningful results.

I. Kokkinos, M. M. Bronstein, R. Litman, A. M. Bronstein, Intrinsic shape context descriptors for deformable shapes, Proc. Computer Vision and Pattern Recognition (CVPR), 2012 details

## Intrinsic shape context descriptors for deformable shapes

I. Kokkinos, M. M. Bronstein, R. Litman, A. M. Bronstein
Proc. Computer Vision and Pattern Recognition (CVPR), 2012
--->>

In this work, we present intrinsic shape context (ISC) descriptors for 3D shapes. We generalize to surfaces the polar sampling of the image domai