Publications

    Y. Nemcovsky, M. Jacoby, A. M. Bronstein, C. Baskin, Physical passive patch adversarial attacks on visual odometry systems, Proc. ACCV, 2022

    Deep neural networks are known to be susceptible to adversarial perturbations — small perturbations that alter the output of the network and exist under strict norm limitations. While such perturbations are usually discussed as tailored to a specific input, a universal perturbation can be constructed to alter the model’s output on a set of inputs. Universal perturbations present a more realistic case of adversarial attacks, as awareness of the model’s exact input is not required. In addition, the universal attack setting raises the subject of generalization to unseen data, where given a set of inputs, the universal perturbations aim to alter the model’s output on out-of-sample data. In this work, we study physical passive patch adversarial attacks on visual odometry-based autonomous navigation systems. A visual odometry system aims to infer the relative camera motion between two corresponding viewpoints, and is frequently used by vision-based autonomous navigation systems to estimate their state. For such navigation systems, a patch adversarial perturbation poses a severe security issue, as it can be used to mislead a system onto some collision course. To the best of our knowledge, we show for the first time that the error margin of a visual odometry model can be significantly increased by deploying patch adversarial attacks in the scene. We provide evaluation on synthetic closed-loop drone navigation data and demonstrate that a comparable vulnerability exists in real data.

    L. Ackerman-Schraier, A. A. Rosenberg, A. Marx, A. M. Bronstein, Machine learning approaches demonstrate that protein structures carry information about their genetic coding, Preprint, 2022

    Synonymous codons translate into the same amino acid. Although the identity of synonymous codons is often considered inconsequential to the final protein structure, there is mounting evidence for an association between the two. Our study examined this association using regression and classification models, finding that codon sequences predict protein backbone dihedral angles with a lower error than amino acid sequences, and that models trained with true dihedral angles achieve better classification of synonymous codons given structural information than models trained with random dihedral angles. Using this classification approach, we investigated local codon-codon dependencies and tested whether synonymous codon identity can be predicted more accurately from codon context than from amino acid context alone and, more specifically, which codon context position carries the most predictive power.

    A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein, Defining amino acid pairs as structural units suggests mutation sensitivity to adjacent residues, bioRxiv/2022/513383, 2022

    Proteins fold from chains of amino acids, forming secondary structures, α-helices and β-strands, that, at least for globular proteins, subsequently fold into a three-dimensional structure. A large-scale analysis of high-resolution protein structures suggests that amino acid pairs constitute another layer of ordered structure, more local than these conventionally defined secondary structures. We develop a cross-peptide-bond Ramachandran plot that captures the conformational preferences of the amino acid pairs and show that the effect of a particular mutation on the stability of a protein depends in a predictable manner on the adjacent amino acid context.

    A. Rosenberg, A. Marx, A. M. Bronstein, Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon, Nature Communications, 2022

    Synonymous codons translate into chemically identical amino acids. Although codon choice was once considered inconsequential to the formation of the protein product, there is now significant evidence that codon usage affects co-translational protein folding and the final structure of the expressed protein. Here we develop a method for computing and comparing codon-specific Ramachandran plots and demonstrate that the backbone dihedral angle distributions of some synonymous codons are distinguishable with statistical significance for some secondary structures. This shows that there exists a dependence between codon identity and backbone torsion of the translated amino acid. Although these findings cannot pinpoint the causal direction of this dependence, we discuss the vast biological implications should coding be shown to directly shape protein conformation and demonstrate the usefulness of this method as a tool for probing associations between codon usage and protein structure. Finally, we urge the inclusion of exact genetic information in structural databases.
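
    A Ramachandran plot is built from the backbone dihedral angles (φ, ψ) of each residue, and each of these is the torsion defined by four consecutive backbone atoms. The following is a minimal sketch of that building block only, not the authors' method for comparing codon-specific distributions. The function name `dihedral_deg` and the helper names are hypothetical, and the sign of the result follows the atan2 formula used in the code.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle in degrees defined by four 3D points (hypothetical helper)."""
    b1 = _sub(p1, p0)  # first bond vector
    b2 = _sub(p2, p1)  # central bond: the rotation axis
    b3 = _sub(p3, p2)  # last bond vector
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Four points lying in one plane on the same side of the axis give 0 degrees.
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
```

    For φ the four points would be the C, N, Cα, C atoms spanning the previous and current residue, and for ψ the N, Cα, C, N atoms spanning the current and next residue.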

    E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie, Inverse design of spontaneous parametric downconversion for generation of high-dimensional qudits, Optica 9, 602-615, 2022

    Spontaneous parametric down-conversion in quantum optics is an invaluable resource for the realization of high-dimensional qudits with spatial modes of light. One of the main open challenges is how to directly generate a desirable qudit state in the SPDC process. This problem can be addressed through advanced computational learning methods; however, due to difficulties in modeling the SPDC process by a fully differentiable algorithm that takes into account all interaction effects, progress has been limited. Here, we overcome these limitations and introduce a physically-constrained and differentiable model, validated against experimental results for shaped pump beams and structured crystals, capable of learning every interaction parameter in the process. We avoid any restrictions induced by the stochastic nature of our physical model and integrate the dynamic equations governing the evolution under the SPDC Hamiltonian. We solve the inverse problem of designing a nonlinear quantum optical system that achieves the desired quantum state of down-converted photon pairs. The desired states are defined using either the second-order correlations between different spatial modes or by specifying the required density matrix. By learning nonlinear volume holograms as well as different pump shapes, we successfully show how to generate maximally entangled states. Furthermore, we simulate all-optical coherent control over the generated quantum state by actively changing the profile of the pump beam. Our work can be useful for applications such as novel designs of high-dimensional quantum key distribution and quantum information processing protocols. In addition, our method can be readily applied for controlling other degrees of freedom of light in the SPDC process, such as the spectral and temporal properties, and may even be used in condensed-matter systems having a similar interaction Hamiltonian.

    N. Talati, H. Ye, S. Vedula, K.-Y. Chen, Y. Chen, D. Liu, Y. Yuan, D. Blaauw, A. M. Bronstein, T. Mudge, R. Dreslinski, Mint: An Accelerator For Mining Temporal Motifs, Proc. MICRO, 2022

    A variety of complex systems, including social and communication networks, financial markets, biology, and neuroscience, are modeled using temporal graphs that contain a set of nodes and directed timestamped edges. Temporal motifs in temporal graphs are generalized from subgraph patterns in static graphs in that they also account for edge ordering and time duration, in addition to the graph structure. Mining temporal motifs is a fundamental problem used in several application domains. However, existing software frameworks offer suboptimal performance due to high algorithmic complexity and irregular memory accesses of temporal motif mining. This paper presents Mint, a novel accelerator architecture and a programming model for mining temporal motifs efficiently. We first divide this workload into three fundamental tasks: search, book-keeping, and backtracking. Based on this, we propose a task-centric programming model that enables decoupled, asynchronous execution. This model unlocks massive opportunities for parallelism, and allows storing task context information on-chip. To best utilize the proposed programming model, we design a domain-specific hardware accelerator using its data path and memory subsystem design to cater to the unique workload characteristics of temporal motif mining. To further improve performance, we propose a novel optimization called search index memoization that significantly reduces memory traffic. We comprehensively compare the performance of Mint with state-of-the-art temporal motif mining software frameworks (both approximate and exact) running on both CPU and GPU, and show 9×–2576× benefit in performance.
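
    To make the notion of a temporal motif concrete, here is a small brute-force sketch that counts one simple motif: a two-edge path a→b followed by b→c, where the second edge is strictly later than the first and falls within a time window `delta`. This is an illustration only and is unrelated to Mint's accelerator design or its task-centric model. The function name, the `(src, dst, time)` edge format, and the choice of motif are assumptions made for the example.

```python
def count_temporal_paths(edges, delta):
    """Count edge pairs (a->b at t1, b->c at t2) with t1 < t2 <= t1 + delta and c != a.

    `edges` is a list of (src, dst, timestamp) tuples (assumed format).
    """
    edges = sorted(edges, key=lambda e: e[2])  # process edges in time order
    count = 0
    for i, (a, b, t1) in enumerate(edges):
        for (src, c, t2) in edges[i + 1:]:
            if t2 - t1 > delta:
                break  # time-sorted, so all later edges are outside the window too
            if t2 > t1 and src == b and c != a:
                count += 1  # edge ordering and duration constraints both hold
    return count

example_edges = [(0, 1, 1), (1, 2, 3), (1, 2, 10), (2, 0, 4)]
narrow = count_temporal_paths(example_edges, delta=5)
wide = count_temporal_paths(example_edges, delta=10)
```

    Widening the window can only add matches, which is why the same edge list yields a larger count for the larger `delta`. The nested scan is quadratic in the worst case, which hints at why dedicated acceleration is attractive for larger motifs and graphs.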

    T. Weiss, A. Wahab, A. M. Bronstein, R. Gershoni-Poranne, Interpretable deep learning unveils structure-property relationships in polybenzenoid hydrocarbons, ChemRxiv 10.26434/chemrxiv-2022-krng1, 2022

    In this work, interpretable deep learning was used to identify structure-property relationships governing the HOMO-LUMO gap and relative stability of polybenzenoid hydrocarbons (PBHs). To this end, a ring-based graph representation was used. In addition to affording reduced training times and excellent predictive ability, this representation could be combined with a subunit-based perception of PBHs, allowing chemical insights to be presented in terms of intuitive and simple structural motifs. The resulting insights agree with conventional organic chemistry knowledge and electronic structure-based analyses, and also reveal new behaviors and identify influential structural motifs. In particular, we evaluated and compared the effects of linear, angular, and branching motifs on these two molecular properties, as well as explored the role of dispersion in mitigating torsional strain inherent in non-planar PBHs. Hence, the observed regularities and the proposed analysis contribute to a deeper understanding of the behavior of PBHs and form the foundation for design strategies for new functional PBHs.

    A. A. Rosenberg, S. Vedula, Y. Romano, A. M. Bronstein, Fast nonlinear vector quantile regression, arXiv preprint arXiv:2205.14977, 2022

    Quantile regression (QR) is a powerful tool for estimating one or more conditional quantiles of a target variable Y given explanatory features X. A limitation of QR is that it is only defined for scalar target variables, due to the formulation of its objective function, and since the notion of quantiles has no standard definition for multivariate distributions. Recently, vector quantile regression (VQR) was proposed as an extension of QR for high-dimensional target variables, thanks to a meaningful generalization of the notion of quantiles to multivariate distributions. Despite its elegance, VQR is arguably not applicable in practice due to several limitations: (i) it assumes a linear model for the quantiles of the target Y given the features X; (ii) its exact formulation is intractable even for modestly-sized problems in terms of target dimensions, the number of regressed quantile levels, or the number of features, and its relaxed dual formulation may violate the monotonicity of the estimated quantiles; (iii) no fast or scalable solvers for VQR currently exist. In this work we fully address these limitations, namely: (i) We extend VQR to the non-linear case, showing substantial improvement over linear VQR; (ii) We propose vector monotone rearrangement, a method which ensures the estimates obtained by VQR relaxations are monotone functions; (iii) We provide fast, GPU-accelerated solvers for linear and nonlinear VQR which maintain a fixed memory footprint with the number of samples and quantile levels, and demonstrate that they scale to millions of samples and thousands of quantile levels; (iv) We release an optimized Python package of our solvers to encourage widespread use of VQR in real-world applications.
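
    As background for the scalar case that VQR generalizes, the sketch below shows the standard pinball (check) loss whose minimizer is a quantile, together with the scalar form of monotone rearrangement, which simply sorts the estimates across increasing quantile levels. It is not the vector method proposed in the paper and it does not use the authors' package. All function names here are hypothetical.

```python
def pinball_loss(y, q, tau):
    """Pinball loss of predicting quantile value q for observation y at level tau."""
    u = y - q
    return tau * u if u >= 0 else (tau - 1.0) * u

def mean_pinball(ys, q, tau):
    """Average pinball loss of a constant prediction q over the sample ys."""
    return sum(pinball_loss(y, q, tau) for y in ys) / len(ys)

def monotone_rearrangement(quantile_estimates):
    """Scalar rearrangement: sort estimates given in order of increasing level."""
    return sorted(quantile_estimates)

ys = [1, 2, 3, 4, 5]
# Among the sample points, the median minimizes the tau = 0.5 pinball loss.
best_q = min(ys, key=lambda q: mean_pinball(ys, q, 0.5))
# Estimates for levels 0.25, 0.5, 0.75 that cross are put back in order.
fixed = monotone_rearrangement([1.0, 0.8, 1.5])
```

    Sorting restores monotonicity in one dimension because scalar quantiles must be non-decreasing in the level. No such total order exists for vector-valued targets, which is the gap the paper's vector monotone rearrangement is described as addressing.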

    E. Zheltonozhskii, C. Baskin, A. Mendelson, A. M. Bronstein, O. Litany, Contrast to divide: Self-supervised pre-training for learning with noisy labels, Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022

    The success of learning with noisy labels (LNL) methods relies heavily on the success of a warm-up stage where standard supervised training is performed using the full (noisy) training set. In this paper, we identify a “warm-up obstacle”: the inability of standard warm-up stages to train high quality feature extractors and avert memorization of noisy labels. We propose “Contrast to Divide” (C2D), a simple framework that solves this problem by pre-training the feature extractor in a self-supervised fashion. Using self-supervised pre-training boosts the performance of existing LNL approaches by drastically reducing the warm-up stage’s susceptibility to noise level, shortening its duration, and improving extracted feature quality. C2D works out of the box with existing methods and demonstrates markedly improved performance, especially in the high noise regime, where we get a boost of more than 27% for CIFAR-100 with 90% noise over the previous state of the art. In real-life noise settings, C2D trained on mini-WebVision outperforms previous works both in WebVision and ImageNet validation sets by 3% top-1 accuracy. We perform an in-depth analysis of the framework, including investigating the performance of different pre-training approaches and estimating the effective upper bound of the LNL performance with semi-supervised learning.

    D. Zadok, O. Salzman, A. Wolf, A. M. Bronstein, Towards predicting fine finger motions from ultrasound images via kinematic representation, arXiv preprint arXiv:2202.05204, 2022

    A central challenge in building robotic prostheses is the creation of a sensor-based system able to read physiological signals from the lower limb and instruct a robotic hand to perform various tasks. Existing systems typically perform discrete gestures such as pointing or grasping, by employing electromyography (EMG) or ultrasound (US) technologies to analyze the state of the muscles. In this work, we study the inference problem of identifying the activation of specific fingers from a sequence of US images when performing dexterous tasks such as keyboard typing or playing the piano. While estimating finger gestures has been done in the past by detecting prominent gestures, we are interested in classification done in the context of fine motions that evolve over time. We consider this task an important step towards higher adoption rates of robotic prostheses among arm amputees, as it has the potential to dramatically increase functionality in performing daily tasks. Our key observation, motivating this work, is that modeling the hand as a robotic manipulator allows us to encode an intermediate representation wherein US images are mapped to said configurations. Given a sequence of such learned configurations, coupled with a neural-network architecture that exploits temporal coherence, we are able to infer fine finger motions. We evaluated our method by collecting data from a group of subjects and demonstrating how our framework can be used to replay music played or text typed. To the best of our knowledge, this is the first study demonstrating these downstream tasks within an end-to-end system.

    G. Pai, A. Bronstein, R. Talmon, R. Kimmel, Deep isometric maps, Image and Vision Computing, 2022

    Isometric feature mapping is an established time-honored algorithm in manifold learning and non-linear dimensionality reduction. Its prominence can be attributed to the output of a coherent global low-dimensional representation of data by preserving intrinsic distances. In order to enable an efficient and more applicable isometric feature mapping, a diverse set of sophisticated advancements have been proposed to the original algorithm to incorporate important factors like sparsity of computation, conformality, topological constraints and spectral geometry. However, a significant shortcoming of most approaches is the dependence on large scale dense-spectral decompositions or the inability to generalize to points far away from the sampling of the manifold.

    In this paper, we explore an unsupervised deep learning approach for computing distance-preserving maps for non-linear dimensionality reduction. We demonstrate that our framework is general enough to incorporate all previous advancements and show a significantly improved local and non-local generalization of the isometric mapping. Our approach involves training with only a few landmark datapoints and therefore avoids the need for population of dense matrices as well as computing their spectral decomposition.

    S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky, MetAdapt: Meta-learned task-adaptive architecture for few-shot classification, Pattern Recognition Letters

    Few-Shot Learning (FSL) is a topic of rapidly growing interest. Typically, in FSL a model is trained on a dataset consisting of many small tasks (meta-tasks) and learns to adapt to novel tasks that it will encounter during test time. This is also referred to as meta-learning. Another topic closely related to meta-learning with a lot of interest in the community is Neural Architecture Search (NAS), automatically finding optimal architecture instead of engineering it manually. In this work we combine these two aspects of meta-learning. So far, meta-learning FSL methods have focused on optimizing parameters of pre-defined network architectures, in order to make them easily adaptable to novel tasks. Moreover, it was observed that, in general, larger architectures perform better than smaller ones up to a certain saturation point (where they start to degrade due to over-fitting). However, little attention has been given to explicitly optimizing the architectures for FSL, nor to an adaptation of the architecture at test time to particular novel tasks. In this work, we propose to employ tools inspired by the Differentiable Neural Architecture Search (D-NAS) literature in order to optimize the architecture for FSL without over-fitting. Additionally, to make the architecture task adaptive, we propose the concept of ‘MetAdapt Controller’ modules. These modules are added to the model and are meta-trained to predict the optimal network connections for a given novel task. Using the proposed approach we observe state-of-the-art resu

    T. Weiss, N. Peretz, S. Vedula, A. Feuer, A. M. Bronstein, Joint optimization of system design and reconstruction in MIMO radar imaging, Proc. IEEE Int'l Workshop on Machine Learning for Signal Processing

    Multiple-input multiple-output (MIMO) radar is one of the leading depth sensing modalities. However, the use of multiple receive channels leads to relatively high costs and prevents the penetration of MIMO radars into many areas, such as the automotive industry. In recent years, a few studies have concentrated on designing reduced measurement schemes and image reconstruction schemes for MIMO radars; however, these problems have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated the growing success of simultaneous learning-based design of the acquisition and reconstruction schemes, manifesting significant improvement in the reconstruction quality. Inspired by these successes, in this work, we propose to learn MIMO acquisition parameters in the form of receive (Rx) antenna element locations jointly with a neural-network-based image reconstruction. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using our learned acquisition parameters with and without the neural-network reconstruction.

    Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, Loss aware post-training quantization, Machine Learning

    Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or above). In this work, we study the effect of quantization on the structure of the loss landscape. We show that the structure is flat and separable for mild quantization, enabling straightforward post-training quantization methods to achieve good results. We show that with more aggressive quantization, the loss landscape becomes highly non-separable with steep curvature, making the selection of quantization parameters more challenging. Armed with this understanding, we design a method that quantizes the layer parameters jointly, enabling significant accuracy improvement over current post-training quantization methods.
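
    The gap between mild and aggressive quantization can be illustrated with a generic symmetric uniform quantizer applied after training. This is a textbook-style sketch, not the loss-aware method proposed in the paper. The function names and the max-abs choice of scale are assumptions made for the example.

```python
def quantize_dequantize(values, bits):
    """Round values to a signed uniform grid with `bits` bits, then map back to floats."""
    max_abs = max(abs(v) for v in values)
    levels = 2 ** (bits - 1) - 1  # e.g. 127 for INT8, 7 for INT4
    scale = max_abs / levels if max_abs > 0 else 1.0  # assumed: scale set by max |v|
    out = []
    for v in values:
        q = max(-levels, min(levels, round(v / scale)))  # integer code, clipped to range
        out.append(q * scale)
    return out

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

weights = [0.1, -0.5, 0.33, 1.0, -0.27]
err_int8 = mse(weights, quantize_dequantize(weights, 8))
err_int4 = mse(weights, quantize_dequantize(weights, 4))
```

    On this toy vector the 4-bit reconstruction error is much larger than the 8-bit one, since the grid spacing grows from 1/127 to 1/7 of the largest magnitude. Here each tensor gets its scale independently, which is the kind of per-layer choice the abstract contrasts with quantizing layer parameters jointly.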

    J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, P. Karras, GRASP: Graph Alignment through Spectral Signatures, Proc. Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data

    What is the best way to match the nodes of two graphs? This graph alignment problem generalizes graph isomorphism and arises in applications from social network analysis to bioinformatics. Some solutions assume that auxiliary information on known matches or node or edge attributes is available, or utilize arbitrary graph features. Such methods fare poorly in the pure form of the problem, in which only graph structures are given. Other proposals translate the problem to one of aligning node embeddings, yet, by doing so, provide only a single-scale view of the graph. In this paper, we transfer the shape-analysis concept of functional maps from the continuous to the discrete case, and treat the graph alignment problem as a special case of the problem of finding a mapping between functions on graphs. We present GRASP, a method that first establishes a correspondence between functions derived from Laplacian matrix eigenvectors, which capture multiscale structural characteristics, and then exploits this correspondence to align nodes. Our experimental study, featuring noise levels higher than anything used in previous studies, shows that GRASP outperforms state-of-the-art methods for graph alignment across noise levels and graph types.

    P. Kang, Z. Lin, Z. Yang, A. M. Bronstein, Q. Li, W. Liu, Deep fused two-step cross-modal hashing with multiple semantic supervision, Multimedia Tools and Applications, 2022

    Existing cross-modal hashing methods ignore the informative multimodal joint information and cannot fully exploit the semantic labels. In this paper, we propose a deep fused two-step cross-modal hashing (DFTH) framework with multiple semantic supervision. In the first step, DFTH learns unified hash codes for instances by a fusion network. Semantic label and similarity reconstruction have been introduced to acquire binary codes that are informative, discriminative and semantic similarity preserving. In the second step, two modality-specific hash networks are learned under the supervision of common hash codes reconstruction, label reconstruction, and intra-modal and inter-modal semantic similarity reconstruction. The modality-specific hash networks can generate semantic preserving binary codes for out-of-sample queries. To deal with the vanishing gradients of binarization, continuous differentiable tanh is introduced to approximate the discrete sign function, making the networks able to back-propagate by automatic gradient computation. Extensive experiments on MIRFlickr25K and NUS-WIDE show the superiority of DFTH over state-of-the-art methods.
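
    The role of tanh as a differentiable stand-in for the sign function can be seen in a few lines. The sketch below is illustrative only: the sharpness parameter `beta` and both function names are assumptions, and the abstract does not say whether DFTH scales the tanh argument.

```python
import math

def relaxed_sign(x, beta=1.0):
    """Smooth surrogate for sign(x); larger beta gives a sharper step (assumed knob)."""
    return math.tanh(beta * x)

def relaxed_sign_grad(x, beta=1.0):
    """Derivative of tanh(beta * x) with respect to x."""
    t = math.tanh(beta * x)
    return beta * (1.0 - t * t)

# The true sign function has zero gradient everywhere except at 0, so no
# learning signal flows through it; the surrogate has a nonzero slope.
code_bit = relaxed_sign(0.5, beta=10.0)       # close to +1
slope = relaxed_sign_grad(0.1, beta=5.0)      # strictly positive
```

    At query time the continuous output would typically be thresholded at zero to obtain the binary code, which is the discrete step that the surrogate replaces during training.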

    P. Kang, Z. Lin, Z. Yang, X. Fang, A. M. Bronstein, Q. Li, W. Liu, Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval, Applied Intelligence, 52(1), pp. 33-54, 2022

    Cross-modal retrieval aims to retrieve related items across different modalities, for example, using an image query to retrieve related text. The existing deep methods ignore both the intra-modal and inter-modal intra-class low-rank structures when fusing various modalities, which decreases the retrieval performance. In this paper, two deep models (denoted as ILCMR and Semi-ILCMR) based on intra-class low-rank regularization are proposed for supervised and semi-supervised cross-modal retrieval, respectively. Specifically, ILCMR integrates the image network and text network into a unified framework to learn a common feature space by imposing three regularization terms to fuse the cross-modal data. First, to align them in the label space, we utilize semantic consistency regularization to convert the data representations to probability distributions over the classes. Second, we introduce an intra-modal low-rank regularization, which encourages the intra-class samples that originate from the same space to be more relevant in the common feature space. Third, an inter-modal low-rank regularization is applied to reduce the cross-modal discrepancy. To enable the low-rank regularization to be optimized using automatic gradients during network back-propagation, we propose the rank-r approximation and specify the explicit gradients for theoretical completeness. In addition to the three regularization terms that rely on label information incorporated by ILCMR, we propose Semi-ILCMR in the semi-supervised regime, which introduces a low-rank constraint before projecting the general representations into the common feature space. Extensive experiments on four public cross-modal datasets demonstrate the superiority of ILCMR and Semi-ILCMR over other state-of-the-art methods.

    N. Diamant, N. Shandor, A. M. Bronstein, Delta-GAN-Encoder: Encoding semantic changes for explicit image editing, using few synthetic samples, arXiv:2111.08419

    Understanding and controlling generative models’ latent space is a complex task. In this paper, we propose a novel method for learning to control any desired attribute in a pre-trained GAN’s latent space, for the purpose of editing synthesized and real-world data samples accordingly. We perform Sim2Real learning, relying on minimal samples to achieve an unlimited amount of continuous precise edits. We present an Autoencoder-based model that learns to encode the semantics of changes between images as a basis for editing new samples later on, achieving precise desired results (example shown in Fig. 1). While previous editing methods rely on a known structure of latent spaces (e.g., linearity of some semantics in StyleGAN), our method inherently does not require any structural constraints. We demonstrate our method in the domain of facial imagery: editing different expressions, poses, and lighting attributes, achieving state-of-the-art results.

    C. Baskin, B. Chmiel, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, CAT: Compression-aware training for bandwidth reduction, JMLR, 2021

    CAT: Compression-aware training for bandwidth reduction

    C. Baskin, B. Chmiel, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson
    JMLR, 2021

    Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving visual processing tasks. One of the major obstacles hindering the ubiquitous use of CNNs for inference is their relatively high memory bandwidth requirements, which can be a main energy consumer and throughput bottleneck in hardware accelerators. Accordingly, an efficient feature map compression method can result in substantial performance gains. Inspired by quantization-aware training approaches, we propose a compression-aware training (CAT) method that involves training the model in a way that allows better compression of feature maps during inference. Our method trains the model to achieve low-entropy feature maps, which enables efficient compression at inference time using classical transform coding methods. CAT significantly improves the state-of-the-art results reported for quantization. For example, on ResNet-34 we achieve 73.1% accuracy (0.2% degradation from the baseline) with an average representation of only 1.79 bits per value.
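
    To make the link between low-entropy feature maps and compressibility concrete, the following minimal sketch computes the empirical Shannon entropy (in bits per value) of a quantized feature map. It is an illustration only, not the authors' training procedure; the function name `entropy_bits_per_value` and the two toy feature maps are hypothetical.

```python
import math
from collections import Counter


def entropy_bits_per_value(values):
    """Empirical Shannon entropy (bits per value) of a sequence of quantized values."""
    counts = Counter(values)
    total = len(values)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Two hypothetical quantized feature maps with eight values each.
spread_out = [0, 1, 2, 3, 4, 5, 6, 7]      # every symbol equally likely
concentrated = [0, 0, 0, 0, 0, 0, 1, 2]    # mostly zeros

print(entropy_bits_per_value(spread_out))
print(entropy_bits_per_value(concentrated))
```

    An ideal entropy coder needs roughly this many bits per value on average, so the concentrated map is cheaper to store and transfer than the spread-out one.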

    E. Amrani, A. M. Bronstein, Self-supervised classification network, arXiv:2103.10994, 2021

    Self-supervised classification network

    E. Amrani, A. M. Bronstein
    arXiv:2103.10994, 2021

    We present Self-Classifier, a novel self-supervised end-to-end classification neural network. Self-Classifier learns labels and representations simultaneously in a single-stage end-to-end manner by optimizing for same-class prediction of two augmented views of the same sample. To guarantee non-degenerate solutions (i.e., to exclude solutions where all labels are assigned to the same class), a uniform prior is asserted on the labels. We show mathematically that, unlike the regular cross-entropy loss, our approach avoids such solutions. Self-Classifier is simple to implement and is scalable to practically unlimited amounts of data. Unlike other unsupervised classification approaches, it does not require any form of pre-training or the use of expectation maximization algorithms, pseudo-labelling or external clustering. Unlike other contrastive representation learning approaches, it does not require a memory bank or a second network. Despite its relative simplicity, our approach achieves results comparable to the state of the art on ImageNet, CIFAR10 and CIFAR100 for its two objectives: unsupervised classification and unsupervised representation learning. Furthermore, it is the first unsupervised end-to-end classification network to perform well on the large-scale ImageNet dataset. Code will be made available.

    E. Rozenberg, D. Freedman, A. M. Bronstein, Learning to localize objects using limited annotation with applications to thoracic diseases, IEEE Access Vol. 9

    Learning to localize objects using limited annotation with applications to thoracic diseases

    E. Rozenberg, D. Freedman, A. M. Bronstein
    IEEE Access Vol. 9

    Motivation: The localization of objects in images is a longstanding objective within the field of image processing. Most current techniques are based on machine learning approaches, which typically require careful annotation of training samples in the form of expensive bounding box labels. The need for such large-scale annotation has only been exacerbated by the widespread adoption of deep learning techniques within the image processing community: deep learning is notoriously data-hungry. Method: In this work, we attack this problem directly by providing a new method for learning to localize objects with limited annotation: most training images can simply be annotated with their whole image labels (and no bounding box), with only a small fraction marked with bounding boxes. The training is driven by a novel loss function, which is a continuous relaxation of a well-defined discrete formulation of weakly supervised learning. Care is taken to ensure that the loss is numerically well-posed. Additionally, we propose a neural network architecture which accounts for both patch dependence, through the use of Conditional Random Field layers, and shift-invariance, through the inclusion of anti-aliasing filters. Results: We demonstrate our method on the task of localizing thoracic diseases in chest X-ray images, achieving state-of-the-art performance on the ChestX-ray14 dataset. We further show that with a modicum of additional effort our technique can be extended from object localization to object detection, attaining high quality results on the Kaggle RSNA Pneumonia Detection Challenge. Conclusion: The technique presented in this paper has the potential to enable high accuracy localization in regimes in which annotated data is either scarce or expensive to acquire. Future work will focus on applying the ideas presented in this paper to the realm of semantic segmentation.

    T. Weiss, O. Senouf, S. Vedula, O. Michailovich, M. Zibulevsky, A. M. Bronstein, PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI, Journal of Machine Learning for Biomedical Imaging (MELBA), 2021

    PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI

    T. Weiss, O. Senouf, S. Vedula, O. Michailovich, M. Zibulevsky, A. M. Bronstein
    Journal of Machine Learning for Biomedical Imaging (MELBA), 2021

    Magnetic Resonance Imaging (MRI) has long been considered to be among “the gold standards” of diagnostic medical imaging. The long acquisition times, however, render MRI prone to motion artifacts, let alone their adverse contribution to the relatively high costs of MRI examination. Over the last few decades, multiple studies have focused on the development of both physical and post-processing methods for accelerated acquisition of MRI scans. These two approaches, however, have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of the concurrent learning-based design of data acquisition and image reconstruction schemes. Such schemes have already demonstrated substantial effectiveness, leading to considerably shorter acquisition times and improved quality of image reconstruction. Inspired by this initial success, in this work, we propose a novel approach to the learning of optimal schemes for conjoint acquisition and reconstruction of MRI scans, with the optimization carried out simultaneously with respect to the time-efficiency of data acquisition and the quality of resulting reconstructions. To be of practical value, the schemes are encoded in the form of general k-space trajectories, whose associated magnetic gradients are constrained to obey a set of predefined hardware requirements (as defined in terms of, e.g., peak currents and maximum slew rates of magnetic gradients). With this proviso in mind, we propose a novel algorithm for the end-to-end training of a combined acquisition-reconstruction pipeline using a deep neural network with differentiable forward- and backpropagation operators. We also demonstrate the effectiveness of the proposed solution in application to both image reconstruction and image segmentation, reporting substantial improvements in terms of acceleration factors as well as the quality of these end tasks.

    A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky, Detector-free weakly supervised grounding by separation, Proc. CVPR, 2022

    Detector-free weakly supervised grounding by separation

    A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky
    Proc. CVPR, 2022

    Nowadays, there is an abundance of data involving images and surrounding free-form text weakly corresponding to those images. Weakly Supervised phrase-Grounding (WSG) deals with the task of using this data to learn to localize (or to ground) arbitrary text phrases in images without any additional annotations. However, most recent SotA methods for WSG assume the existence of a pre-trained object detector, relying on it to produce the ROIs for localization. In this work, we focus on the task of Detector-Free WSG (DF-WSG) to solve WSG without relying on a pre-trained detector. We directly learn everything from the images and associated free-form text pairs, thus potentially gaining an advantage on the categories unsupported by the detector. The key idea behind our proposed Grounding by Separation (GbS) method is synthesizing ‘text to image-regions’ associations by random alpha-blending of arbitrary image pairs and using the corresponding texts of the pair as conditions to recover the alpha map from the blended image via a segmentation network. At test time, this allows using the query phrase as a condition for a non-blended query image, thus interpreting the test image as a composition of a region corresponding to the phrase and the complement region. Using this approach we demonstrate a significant accuracy improvement, of up to 8.5% over previous DF-WSG SotA, for a range of benchmarks including Flickr30K, Visual Genome, and ReferIt, as well as a significant complementary improvement (above 7%) over the detector-based approaches for WSG.
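
    The data-synthesis step described above, random alpha-blending of an image pair, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: images are small grayscale nested lists, the alpha map is drawn independently per pixel (the paper's sampling scheme for alpha maps is not given in the abstract), and the names `random_alpha_map` and `alpha_blend` are hypothetical. The segmentation network that recovers the alpha map is not shown.

```python
import random


def random_alpha_map(height, width, rng):
    """Hypothetical blending map: one random alpha in [0, 1) per pixel."""
    return [[rng.random() for _ in range(width)] for _ in range(height)]


def alpha_blend(image_a, image_b, alpha):
    """Per-pixel blend alpha * A + (1 - alpha) * B for grayscale nested lists."""
    return [
        [a * w + b * (1.0 - w) for a, b, w in zip(row_a, row_b, row_w)]
        for row_a, row_b, row_w in zip(image_a, image_b, alpha)
    ]


rng = random.Random(0)
image_a = [[1.0, 1.0], [1.0, 1.0]]   # stand-in for the first image of the pair
image_b = [[0.0, 0.0], [0.0, 0.0]]   # stand-in for the second image of the pair
alpha = random_alpha_map(2, 2, rng)
blended = alpha_blend(image_a, image_b, alpha)
```

    In this reading, a network conditioned on the text of the first image would be trained to predict `alpha` from `blended`, and one conditioned on the text of the second image would predict its complement.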

    Y. Elul, A. Rosenberg, A. Schuster, A. M. Bronstein, Y. Yaniv, Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis, PNAS 2021

    Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis

    Y. Elul, A. Rosenberg, A. Schuster, A. M. Bronstein, Y. Yaniv
    PNAS 2021

    Despite their great promise, artificial intelligence (AI) systems have yet to become ubiquitous in the daily practice of medicine largely due to several crucial unmet needs of healthcare practitioners. These include lack of explanations in clinically meaningful terms, handling the presence of unknown medical conditions, and transparency regarding the system’s limitations, both in terms of statistical performance as well as recognizing situations for which the system’s predictions are irrelevant. We articulate these unmet clinical needs as machine-learning (ML) problems and systematically address them with cutting-edge ML techniques. We focus on electrocardiogram (ECG) analysis as an example domain in which AI has great potential and tackle two challenging tasks: the detection of a heterogeneous mix of known and unknown arrhythmias from ECG and the identification of underlying cardio-pathology from segments annotated as normal sinus rhythm recorded in patients with an intermittent arrhythmia. We validate our methods by simulating a screening for arrhythmias in a large-scale population while adhering to statistical significance requirements. Specifically, our system 1) visualizes the relative importance of each part of an ECG segment for the final model decision; 2) upholds specified statistical constraints on its out-of-sample performance and provides uncertainty estimation for its predictions; 3) handles inputs containing unknown rhythm types; and 4) handles data from unseen patients while also flagging cases in which the model’s outputs are not usable for a specific patient. This work represents a significant step toward overcoming the limitations currently impeding the integration of AI into clinical practice in cardiology and medicine in general.

    E. Amrani, R. Ben-Ari, D. Rotman, A. M. Bronstein, Noise estimation using density estimation for self-supervised multimodal learning, AAAI, 2021

    Noise estimation using density estimation for self-supervised multimodal learning

    E. Amrani, R. Ben-Ari, D. Rotman, A. M. Bronstein
    AAAI, 2021

    One of the key factors in enabling machine learning models to comprehend and solve real-world tasks is to leverage multimodal data. Unfortunately, the annotation of multimodal data is challenging and expensive. Recently, self-supervised multimodal methods that combine vision and language were proposed to learn multimodal representations without annotation. However, these methods choose to ignore the presence of high levels of noise and thus yield sub-optimal results. In this work, we show that the problem of noise estimation for multimodal data can be reduced to a multimodal density estimation task. Using multimodal density estimation, we propose a noise estimation building block for multimodal representation learning that is based strictly on the inherent correlation between different modalities. We demonstrate how our noise estimation can be broadly integrated and achieves results comparable to state-of-the-art performance on five different benchmark datasets for two challenging multimodal tasks: Video Question Answering and Text-To-Video Retrieval.

    O. Dahary, M. Jacoby, A. M. Bronstein, Digital Gimbal: End-to-end deep image stabilization with learnable exposure times, Proc. CVPR, 2021

    Digital Gimbal: End-to-end deep image stabilization with learnable exposure times

    O. Dahary, M. Jacoby, A. M. Bronstein
    Proc. CVPR, 2021

    Mechanical image stabilization using actuated gimbals enables capturing long-exposure shots without suffering from blur due to camera motion. These devices, however, are often physically cumbersome and expensive, limiting their widespread use. In this work, we propose to digitally emulate a mechanically stabilized system from the input of a fast unstabilized camera. To exploit the trade-off between motion blur at long exposures and low SNR at short exposures, we train a CNN that estimates a sharp high-SNR image by aggregating a burst of noisy short-exposure frames, related by unknown motion. We further suggest learning the burst’s exposure times in an end-to-end manner, thus balancing the noise and blur across the frames. We demonstrate this method’s advantage over the traditional approach of deblurring a single image or denoising a fixed-exposure burst.

    A. Boyarski, S. Vedula, A. M. Bronstein, Spectral geometric matrix completion, Proc. Mathematical and Scientific Machine Learning, 2021

    Spectral geometric matrix completion

    A. Boyarski, S. Vedula, A. M. Bronstein
    Proc. Mathematical and Scientific Machine Learning, 2021

    Deep Matrix Factorization (DMF) is an emerging approach to the problem of reconstructing a matrix from a subset of its entries. Recent works have established that gradient descent applied to a DMF model induces an implicit regularization on the rank of the recovered matrix. Despite these promising theoretical results, empirical evaluation of vanilla DMF on real benchmarks exhibits poor reconstructions, which we attribute to the extremely low number of samples available. We propose an explicit spectral regularization scheme that makes DMF models competitive on real benchmarks while still maintaining the implicit regularization induced by gradient descent, thus enjoying the best of both worlds.

    E. Rozenberg, A. Karnieli, O. Yesharim, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie, Inverse design of quantum holograms in three-dimensional nonlinear photonic crystals, CLEO, 2021

    Inverse design of quantum holograms in three-dimensional nonlinear photonic crystals

    E. Rozenberg, A. Karnieli, O. Yesharim, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie
    CLEO, 2021

    We introduce a systematic approach for designing 3D nonlinear photonic crystals and pump beams for generating desired quantum correlations between structured photon-pairs. Our model is fully differentiable, allowing accurate and efficient learning and discovery of novel designs.

    A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson, Early-stage neural network hardware performance analysis, Sustainability 13(2):717, 2021

    Early-stage neural network hardware performance analysis

    A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson
    Sustainability 13(2):717, 2021

    The demand for running neural networks (NNs) in embedded environments has increased significantly in recent years due to the success of convolutional neural network (CNN) approaches in various tasks, including image recognition and generation. Achieving high accuracy on resource-restricted devices, however, is still considered challenging, mainly because of the vast number of design parameters that need to be balanced. While the quantization of CNN parameters leads to a reduction of power and area, it can also generate unexpected changes in the balance between communication and computation. This change is hard to evaluate, and the lack of balance may lead to lower utilization of either memory bandwidth or computational resources, thereby reducing performance. This paper introduces a hardware performance analysis framework for identifying bottlenecks in the early stages of CNN hardware design. We demonstrate how the proposed method can help in evaluating different architecture alternatives of resource-restricted CNN accelerators (e.g., part of real-time embedded systems) early in design stages and thus avoid design mistakes.
    Keywords: neural networks; accelerators; quantization; CNN architecture

    C. Baskin, E. Schwartz, E. Zheltonozhskii, N. Liss, R. Giryes, A. M. Bronstein, A. Mendelson, UNIQ: Uniform noise injection for non-uniform quantization of neural networks, ACM Transactions on Computer Systems (TOCS), 2020

    UNIQ: Uniform noise injection for non-uniform quantization of neural networks

    C. Baskin, E. Schwartz, E. Zheltonozhskii, N. Liss, R. Giryes, A. M. Bronstein, A. Mendelson
    ACM Transactions on Computer Systems (TOCS), 2020

    We present a novel method for training a neural network amenable to inference in low-precision arithmetic with quantized weights and activations. The training is performed in full precision with random noise injection emulating quantization noise. In order to circumvent the need to simulate realistic quantization noise distributions, the weight distributions are uniformized by a non-linear transformation, and uniform noise is injected. This procedure emulates a non-uniform k-quantile quantizer at inference time, which adapts to the specific distribution of the quantized parameters. As a by-product of injecting noise to weights, we find that activations can also be quantized to as low as 8-bit with only a minor accuracy degradation. The method achieves state-of-the-art results for training low-precision networks on ImageNet. In particular, we observe no degradation in accuracy for MobileNet and ResNet-18/34/50 on ImageNet with as low as 4-bit quantization of weights. Our solution achieves the state-of-the-art results in accuracy, in the low computational budget regime, compared to similar models.
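
    The two ingredients named above, a k-quantile quantizer and uniform noise injected in the uniformized domain, can be sketched as follows. This is one plausible reading of the abstract and not the authors' implementation: the empirical CDF stands in for the non-linear transformation, the bin median is assumed as the representation level, and the names `k_quantile_quantize` and `inject_uniform_noise` are hypothetical.

```python
import bisect
import random
import statistics


def k_quantile_quantize(weights, k):
    """Split the sorted weights into k equal-mass bins; map each weight to its bin median."""
    n = len(weights)
    order = sorted(range(n), key=lambda i: weights[i])
    quantized = [0.0] * n
    for b in range(k):
        members = order[b * n // k:(b + 1) * n // k]
        if not members:
            continue
        # Assumption: the bin median serves as the representation level.
        level = statistics.median(weights[i] for i in members)
        for i in members:
            quantized[i] = level
    return quantized


def inject_uniform_noise(weights, k, rng):
    """Uniformize weights with the empirical CDF, add uniform noise of one bin width, map back."""
    n = len(weights)
    sorted_w = sorted(weights)
    noisy = []
    for w in weights:
        u = (bisect.bisect_right(sorted_w, w) - 0.5) / n            # empirical CDF value in (0, 1)
        u_noisy = min(max(u + rng.uniform(-0.5 / k, 0.5 / k), 0.0), 1.0)
        noisy.append(sorted_w[min(int(u_noisy * n), n - 1)])        # inverse empirical CDF
    return noisy


rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(1000)]
quantized = k_quantile_quantize(weights, k=4)            # inference-time quantizer
noisy = inject_uniform_noise(weights, k=4, rng=rng)      # training-time emulation
```

    Because every bin holds the same number of weights, the levels are dense where the weight distribution is dense, which is what makes the quantizer non-uniform in the original weight domain.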

    B. Finkelshtein, C. Baskin, E. Zheltonozhskii, U. Alon, Single-node attack for fooling graph neural networks, arXiv:2011.03574, 2020

    Single-node attack for fooling graph neural networks

    B. Finkelshtein, C. Baskin, E. Zheltonozhskii, U. Alon
    arXiv:2011.03574, 2020

    Graph neural networks (GNNs) have shown broad applicability in a variety of domains. Some of these domains, such as social networks and product recommendations, are fertile ground for malicious users and behavior. In this paper, we show that GNNs are vulnerable to the extremely limited scenario of a single-node adversarial example, where the node cannot be picked by the attacker. That is, an attacker can force the GNN to classify any target node to a chosen label by only slightly perturbing another single arbitrary node in the graph, even when not being able to pick that specific attacker node. When the adversary is allowed to pick a specific attacker node, the attack is even more effective. We show that this attack is effective across various GNN types, such as GraphSAGE, GCN, GAT, and GIN, across a variety of real-world datasets, and as a targeted and a non-targeted attack.

    J. Alush-Aben, L. Ackerman-Schraier, T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, 3D FLAT: Feasible Learned Acquisition Trajectories for Accelerated MRI, Proc. Machine Learning for Medical Image Reconstruction, MICCAI 2020

    3D FLAT: Feasible Learned Acquisition Trajectories for Accelerated MRI

    J. Alush-Aben, L. Ackerman-Schraier, T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein
    Proc. Machine Learning for Medical Image Reconstruction, MICCAI 2020

    Magnetic Resonance Imaging (MRI) has long been considered to be among the gold standards of today’s diagnostic imaging. The most significant drawback of MRI is long acquisition times, prohibiting its use in standard practice for some applications. Compressed sensing (CS) proposes to subsample the k-space (the Fourier domain dual to the physical space of spatial coordinates) leading to significantly accelerated acquisition. However, the benefit of compressed sensing has not been fully exploited; most of the sampling densities obtained through CS do not produce a trajectory that obeys the stringent constraints of the MRI machine imposed in practice. Inspired by recent success of deep learning-based approaches for image reconstruction and ideas from computational imaging on learning-based design of imaging systems, we introduce 3D FLAT, a novel protocol for data-driven design of 3D non-Cartesian accelerated trajectories in MRI. Our proposal leverages the entire 3D k-space to simultaneously learn a physically feasible acquisition trajectory with a reconstruction method. Experimental results, performed as a proof-of-concept, suggest that 3D FLAT achieves higher image quality for a given readout time compared to standard trajectories such as radial, stack-of-stars, or 2D learned trajectories (trajectories that evolve only in the 2D plane while fully sampling along the third dimension). Furthermore, we demonstrate evidence supporting the significant benefit of performing MRI acquisitions using non-Cartesian 3D trajectories over 2D non-Cartesian trajectories acquired slice-wise.

    T. Weiss, S. Vedula, O. Senouf, O. Michailovich, A. M. Bronstein, Towards learned optimal q-space sampling in diffusion MRI, Proc. Computational Diffusion MRI, MICCAI 2020

    Towards learned optimal q-space sampling in diffusion MRI

    T. Weiss, S. Vedula, O. Senouf, O. Michailovich, A. M. Bronstein
    Proc. Computational Diffusion MRI, MICCAI 2020

    Fiber tractography is an important tool of computational neuroscience that enables reconstructing the spatial connectivity and organization of white matter of the brain. Fiber tractography takes advantage of diffusion Magnetic Resonance Imaging (dMRI) which allows measuring the apparent diffusivity of cerebral water along different spatial directions. Unfortunately, collecting such data comes at the price of reduced spatial resolution and substantially elevated acquisition times, which limits the clinical applicability of dMRI. This problem has been thus far addressed using two principal strategies. Most of the efforts have been extended towards improving the quality of signal estimation for any, yet fixed sampling scheme (defined through the choice of diffusion encoding gradients). On the other hand, optimization over the sampling scheme has also proven to be effective. Inspired by the previous results, the present work consolidates the above strategies into a unified estimation framework, in which the optimization is carried out with respect to both estimation model and sampling design concurrently. The proposed solution offers substantial improvements in the quality of signal estimation as well as the accuracy of ensuing analysis by means of fiber tractography. While proving the optimality of the learned estimation models would probably need more extensive evaluation, we nevertheless claim that the learned sampling schemes can be of immediate use, offering a way to improve the dMRI analysis without the necessity of deploying the neural network used for their estimation. We present a comprehensive comparative analysis based on the Human Connectome Project data.

    E. Zheltonozhskii, C. Baskin, A. M. Bronstein, A. Mendelson, Self-supervised learning for large-scale unsupervised image clustering, NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020

    Self-supervised learning for large-scale unsupervised image clustering

    E. Zheltonozhskii, C. Baskin, A. M. Bronstein, A. Mendelson
    NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020

    Unsupervised learning has always been appealing to machine learning researchers and practitioners, allowing them to avoid an expensive and complicated process of labeling the data. However, unsupervised learning of complex data is challenging, and even the best approaches show much weaker performance than their supervised counterparts. Self-supervised deep learning has become a strong instrument for representation learning in computer vision. However, those methods have not been evaluated in a fully unsupervised setting.
    In this paper, we propose a simple scheme for unsupervised classification based on self-supervised representations. We evaluate the proposed approach with several recent self-supervised methods showing that it achieves competitive results for ImageNet classification (39% accuracy on ImageNet with 1000 clusters and 46% with overclustering). We suggest adding the unsupervised evaluation to a set of standard benchmarks for self-supervised learning.

    G. Mariani, L. Cosmo, A. M. Bronstein, E. Rodolà, Generating adversarial surfaces via band-limited perturbations, Computer Graphics Forum, 2020

    Generating adversarial surfaces via band-limited perturbations

    G. Mariani, L. Cosmo, A. M. Bronstein, E. Rodolà
    Computer Graphics Forum, 2020

    Adversarial attacks have demonstrated remarkable efficacy in altering the output of a learning model by applying a minimal perturbation to the input data. While increasing attention has been placed on the image domain, the study of adversarial perturbations for geometric data has been notably lagging behind. In this paper, we show that effective adversarial attacks can be concocted for surfaces embedded in 3D, under weak smoothness assumptions on the perceptibility of the attack. We address the case of deformable 3D shapes in particular, and introduce a general model that is not tailored to any specific surface representation, nor does it assume access to a parametric description of the 3D object. In this context, we consider targeted and untargeted variants of the attack, demonstrating compelling results in either case. We further show how discovering adversarial examples, and then using them for adversarial training, leads to an increase in both robustness and accuracy. Our findings are confirmed empirically over multiple datasets spanning different semantic classes and deformations.

    E. Amrani, R. Ben-Ari, T. Hakim, A. M. Bronstein, Self-Supervised Object Detection and Retrieval Using Unlabeled Videos, CVPR workshop, 2020

    Self-Supervised Object Detection and Retrieval Using Unlabeled Videos

    E. Amrani, R. Ben-Ari, T. Hakim, A. M. Bronstein
    CVPR workshop, 2020

    Unlabeled video in the wild presents a valuable, yet so far unharnessed, source of information for learning vision tasks. We present the first attempt at fully self-supervised learning of object detection from subtitled videos without any manual object annotation. To this end, we use the How2 multi-modal collection of instructional videos with English subtitles. We pose the problem as learning with weakly- and noisily-labeled data, and propose a novel training model that can confront high noise levels and yet train a classifier to localize the object of interest in the video frames, without any manual labeling involved. We evaluate our approach on a set of 11 manually annotated objects in over 5000 frames and compare it to an existing weakly-supervised approach as a baseline. Benchmark data and code will be released upon acceptance of the paper.

    D. H. Silver, M. Feder, Y. Gold-Zamir, A. L. Polsky, S. Rosentraub, E. Shachor, A. Weinberger, P. Mazur, V. D. Zukin, A. M. Bronstein, Data-driven prediction of embryo implantation probability using IVF time-lapse imaging, Proc. MIDL, 2020

    Data-driven prediction of embryo implantation probability using IVF time-lapse imaging

    D. H. Silver, M. Feder, Y. Gold-Zamir, A. L. Polsky, S. Rosentraub, E. Shachor, A. Weinberger, P. Mazur, V. D. Zukin, A. M. Bronstein
    Proc. MIDL, 2020

    The process of fertilizing a human egg outside the body in order to help those suffering from infertility to conceive is known as in vitro fertilization (IVF). Despite being the most effective method of assisted reproductive technology (ART), the average success rate of IVF is a mere 20-40%. One step that is critical to the success of the procedure is selecting which embryo to transfer to the patient, a process typically conducted manually and without any universally accepted and standardized criteria. In this paper, we describe a novel data-driven system trained to directly predict embryo implantation probability from embryogenesis time-lapse imaging videos. Using retrospectively collected videos from 272 embryos, we demonstrate that, when compared to an external panel of embryologists, our algorithm results in a 12% increase of positive predictive value and a 29% increase of negative predictive value.

    S. Sommer, A. M. Bronstein, Horizontal flows and manifold stochastics in geometric deep learning, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2020

    Horizontal flows and manifold stochastics in geometric deep learning

    S. Sommer, A. M. Bronstein
    IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2020

    We introduce two constructions in geometric deep learning for 1) transporting orientation-dependent convolutional filters over a manifold in a continuous way and thereby defining a convolution operator that naturally incorporates the rotational effect of holonomy; and 2) allowing efficient evaluation of manifold convolution layers by sampling manifold-valued random variables that center around a weighted Brownian motion maximum likelihood mean. Both methods are inspired by stochastics on manifolds and geometric statistics, and provide examples of how stochastic methods, here horizontal frame bundle flows and non-linear bridge sampling schemes, can be used in geometric deep learning. We outline the theoretical foundation of the two methods, discuss their relation to Euclidean deep networks and existing methodology in geometric deep learning, and establish important properties of the proposed constructions.

    K. Rotker, D. Ben-Bashat, A. M. Bronstein, Over-parameterized models for vector fields, SIAM Journal on Imaging Sciences (SIIMS), 2020 details

    Over-parameterized models for vector fields

    K. Rotker, D. Ben-Bashat, A. M. Bronstein
    SIAM Journal on Imaging Sciences (SIIMS), 2020

    Vector fields arise in a variety of measurement and visualization techniques such as fluid flow imaging, motion estimation, deformation measures, and color imaging, leading to a better understanding of physical phenomena. Recent progress in vector field imaging technologies has emphasized the need for efficient noise removal and reconstruction algorithms. A key ingredient in the success of extracting signals from noisy measurements is prior information, which can often be represented as a parameterized model. In this work, we extend the over-parameterization variational framework in order to perform model-based reconstruction of vector fields. The over-parameterization methodology combines local modeling of the data with global model parameter regularization. By considering the vector field as a linear combination of basis vector fields and appropriate scale and rotation coefficients, the denoising problem reduces to a simpler form of coefficient recovery. We introduce two versions of the over-parameterization framework: a total variation-based method and a sparsity-based method, the latter relying on the co-sparse analysis model. We demonstrate the efficiency of the proposed frameworks for two- and three-dimensional vector fields with linear and quadratic over-parameterization models.

    A. Tsitsulin, M. Munkhoeva, D. Mottin, P. Karras, A. M. Bronstein, I. Oseledets, E. Müller, Intrinsic multi-scale evaluation of generative models, Proc. ICLR, 2020 details

    Intrinsic multi-scale evaluation of generative models

    A. Tsitsulin, M. Munkhoeva, D. Mottin, P. Karras, A. M. Bronstein, I. Oseledets, E. Müller
    Proc. ICLR, 2020

    Generative models are often used to sample high-dimensional data points from a manifold with small intrinsic dimension. Existing techniques for comparing generative models focus on global data properties such as mean and covariance; in that sense, they are extrinsic and uni-scale. We develop the first, to our knowledge, intrinsic and multi-scale method for characterizing and comparing underlying data manifolds, based on comparing all data moments by lower-bounding the spectral notion of the Gromov-Wasserstein distance between manifolds. In a thorough experimental study, we demonstrate that our method effectively evaluates the quality of generative models; further, we showcase its efficacy in discerning the disentanglement process in neural networks.

    A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson, HCM: Hardware-aware complexity metric for neural network architectures, arXiv:2004.08906, 2020 details

    HCM: Hardware-aware complexity metric for neural network architectures

    A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson
    arXiv:2004.08906, 2020

    Convolutional Neural Networks (CNNs) have become common in many fields including computer vision, speech recognition, and natural language processing. Although CNN hardware accelerators are already included as part of many SoC architectures, the task of achieving high accuracy on resource-restricted devices is still considered challenging, mainly due to the vast number of design parameters that need to be balanced to achieve an efficient solution. Quantization techniques, when applied to the network parameters, lead to a reduction of power and area and may also change the ratio between communication and computation. As a result, some algorithmic solutions may suffer from lack of memory bandwidth or computational resources and fail to achieve the expected performance due to hardware constraints. Thus, the system designer and the micro-architect need to understand at early development stages the impact of their high-level decisions (e.g., the architecture of the CNN and the number of bits used to represent its parameters) on the final product (e.g., the expected power saving, area, and accuracy). Unfortunately, existing tools fall short of supporting such decisions. This paper introduces a hardware-aware complexity metric that aims to assist the system designer of neural network architectures throughout the entire project lifetime (especially at its early stages) by predicting the impact of architectural and micro-architectural decisions on the final product. We demonstrate how the proposed metric can help evaluate different design alternatives of neural network models on resource-restricted devices such as real-time embedded systems, and avoid design mistakes at early stages.

    L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. M. Bronstein, R. Giryes, StarNet: towards weakly supervised few-shot detection and explainable few-shot classification, AAAI, 2021 details

    StarNet: towards weakly supervised few-shot detection and explainable few-shot classification

    L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. M. Bronstein, R. Giryes
    AAAI, 2021

    In this paper, we propose a new few-shot learning method called StarNet, which is an end-to-end trainable non-parametric star-model few-shot classifier. While being meta-trained using only image-level class labels, StarNet learns not only to predict the class labels for each query image of a few-shot task, but also to localize (via a heatmap) what it believes to be the key image regions supporting its prediction, thus effectively detecting the instances of the novel categories. The localization is enabled by StarNet’s ability to find large, arbitrarily shaped, semantically matching regions between all pairs of support and query images of a few-shot task. We evaluate StarNet on multiple few-shot classification benchmarks, attaining significant state-of-the-art improvement on CUB and ImageNetLOC-FS, and smaller improvements on other benchmarks. At the same time, in many cases, StarNet provides plausible explanations for its class label predictions, by highlighting the correctly paired novel category instances on the query and on its best matching support (for the predicted class). In addition, we test the proposed approach on the previously unexplored and challenging task of Weakly Supervised Few-Shot Object Detection (WS-FSOD), obtaining significant improvements over the baselines.

    E. Zheltonozhskii, C. Baskin, Y. Nemcovsky, B. Chmiel, A. Mendelson, A. M. Bronstein, Colored noise injection for training adversarially robust neural networks, arXiv:2003.02188, 2020 details

    Colored noise injection for training adversarially robust neural networks

    E. Zheltonozhskii, C. Baskin, Y. Nemcovsky, B. Chmiel, A. Mendelson, A. M. Bronstein
    arXiv:2003.02188, 2020

    Even though deep learning has shown unmatched performance on various tasks, neural networks have been shown to be vulnerable to small adversarial perturbations of the input, which lead to significant performance degradation. In this work, we extend the idea of adding independent Gaussian noise to weights and activations during adversarial training (PNI) to the injection of colored noise for defense against common white-box and black-box attacks. We show that our approach outperforms PNI and various previous approaches in terms of adversarial accuracy on the CIFAR-10 dataset. In addition, we provide an extensive ablation study of the proposed method justifying the chosen configurations.

    A. Livne, A. M. Bronstein, R. Kimmel, Z. Aviv, S. Grofit, Do we need depth in state-of-the-art face authentication?, arXiv:2003.10895, 2020 details

    Do we need depth in state-of-the-art face authentication?

    A. Livne, A. M. Bronstein, R. Kimmel, Z. Aviv, S. Grofit
    arXiv:2003.10895, 2020

    Some face recognition methods are designed to utilize geometric features extracted from depth sensors to handle the challenges of single-image based recognition technologies. However, calculating the geometrical data is an expensive and challenging process. Here, we introduce a novel method that learns distinctive geometric features from stereo camera systems without the need to explicitly compute the facial surface or depth map. The raw face stereo images along with coordinate maps allow a CNN to learn geometric features. This way, we keep the simplicity and cost-efficiency of recognition from a single image, while enjoying the benefits of geometric data without explicitly reconstructing it. We demonstrate that the suggested method outperforms both existing single-image and explicit depth-based methods on large-scale benchmarks. We also provide an ablation study to show that the suggested method uses the coordinate maps to encode more informative features.

    M. Shkolnik, B. Chmiel, R. Banner, G. Shomron, Y. Nahshan, A. M. Bronstein, U. Weiser, Robust Quantization: One Model to Rule Them All, NeurIPS 2020 details

    Robust Quantization: One Model to Rule Them All

    M. Shkolnik, B. Chmiel, R. Banner, G. Shomron, Y. Nahshan, A. M. Bronstein, U. Weiser
    NeurIPS 2020

    Neural network quantization methods often involve simulating the quantization process during training. This makes the trained model highly dependent on the precise way quantization is performed. Since low-precision accelerators differ in their quantization policies and their supported mix of data-types, a model trained for one accelerator may not be suitable for another. To address this issue, we propose KURE, a method that provides intrinsic robustness to the model against a broad range of quantization implementations. We show that KURE yields a generic model that may be deployed on numerous inference accelerators without a significant loss in accuracy.

    A. Boyarski, S. Vedula, A. M. Bronstein, Deep matrix factorization with spectral geometric regularization, arXiv: 1911.07255, 2019 details

    Deep matrix factorization with spectral geometric regularization

    A. Boyarski, S. Vedula, A. M. Bronstein
    arXiv: 1911.07255, 2019

    We address the problem of reconstructing a matrix from a subset of its entries. Current methods, branded as geometric matrix completion, augment classical rank regularization techniques by incorporating geometric information into the solution. This information is usually provided as graphs encoding relations between rows/columns. In this work, we propose a simple spectral approach for solving the matrix completion problem, via the framework of functional maps. We introduce the zoomout loss, a multiresolution spectral geometric loss inspired by recent advances in shape correspondence, whose minimization leads to state-of-the-art results on various recommender systems datasets. Surprisingly, for some datasets, we were able to achieve comparable results even without incorporating geometric information. This puts into question both the quality of such information and current methods’ ability to use it in a meaningful and efficient way.

     

    Code is available either as a Google Colab notebook or via https://github.com/amitboy/SGMC

    Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, Loss aware post-training quantization, arXiv: 1911.07190, 2019 details

    Loss aware post-training quantization

    Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson
    arXiv: 1911.07190, 2019

    Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or above). In this work, we study the effect of quantization on the structure of the loss landscape. We show that the structure is flat and separable for mild quantization, enabling straightforward post-training quantization methods to achieve good results. On the other hand, we show that with more aggressive quantization, the loss landscape becomes highly non-separable with sharp minima points, making the selection of quantization parameters more challenging. Armed with this understanding, we design a method that quantizes the layer parameters jointly, enabling significant accuracy improvement over current post-training quantization methods. Reference implementation accompanies the paper.

    Y. Nemcovsky, E. Zheltonozhskii, C. Baskin, B. Chmiel, A. M. Bronstein, A. Mendelson, Smoothed inference for adversarially-trained models, arXiv: 1911.07198, 2019 details

    Smoothed inference for adversarially-trained models

    Y. Nemcovsky, E. Zheltonozhskii, C. Baskin, B. Chmiel, A. M. Bronstein, A. Mendelson
    arXiv: 1911.07198, 2019

    Deep neural networks are known to be vulnerable to inputs with maliciously constructed adversarial perturbations aimed at forcing misclassification. We study randomized smoothing as a way to both improve performance on unperturbed data and increase robustness to adversarial attacks. Moreover, we extend the method proposed by arXiv:1811.09310 by adding low-rank multivariate noise, which we then use as a base model for smoothing. The proposed method achieves 58.5% top-1 accuracy on CIFAR-10 under PGD attack and outperforms previous works by 4%. In addition, we consider a family of attacks, which were previously used for training purposes in the certified robustness scheme. We demonstrate that the proposed attacks are more effective than PGD against both smoothed and non-smoothed models. Since our method is based on sampling, it lends itself well to trading off the model inference complexity against its performance. A reference implementation of the proposed techniques is provided.

    S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky, MetAdapt: Meta-learned task-adaptive architecture for few-shot classification, arXiv: 1912.00412, 2019 details

    MetAdapt: Meta-learned task-adaptive architecture for few-shot classification

    S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky
    arXiv: 1912.00412, 2019

    Few-Shot Learning (FSL) is a topic of rapidly growing interest. Typically, in FSL a model is trained on a dataset consisting of many small tasks (meta-tasks) and learns to adapt to novel tasks that it will encounter during test time. This is also referred to as meta-learning. So far, meta-learning FSL methods have focused on optimizing parameters of pre-defined network architectures, in order to make them easily adaptable to novel tasks. Moreover, it was observed that, in general, larger architectures perform better than smaller ones up to a certain saturation point (and even degrade due to over-fitting). However, little attention has been given to explicitly optimizing the architectures for FSL, nor to an adaptation of the architecture at test time to particular novel tasks. In this work, we propose to employ tools borrowed from the Differentiable Neural Architecture Search (D-NAS) literature in order to optimize the architecture for FSL without over-fitting. Additionally, to make the architecture task adaptive, we propose the concept of ‘MetAdapt Controller’ modules. These modules are added to the model and are meta-trained to predict the optimal network connections for a given novel task. Using the proposed approach we observe state-of-the-art results on two popular few-shot benchmarks: miniImageNet and FC100.

    E. Rozenberg, D. Freedman, A. M. Bronstein, Localization with limited annotation for chest X-rays, ML4H, NeurIPS 2019 details

    Localization with limited annotation for chest X-rays

    E. Rozenberg, D. Freedman, A. M. Bronstein
    ML4H, NeurIPS 2019

    Localization of an object within an image is a common task in medical imaging. Learning to localize or detect objects typically requires the collection of data which has been labelled with bounding boxes or similar annotations, which can be very time consuming and expensive. A technique which could perform such learning with much less annotation would, therefore, be quite valuable. We present such a technique for localization with limited annotation, in which the number of images with bounding boxes can be a small fraction of the total dataset (e.g. less than 1%); all other images only possess a whole image label and no bounding box. We propose a novel loss function for tackling this problem; the loss is a continuous relaxation of a well-defined discrete formulation of weakly supervised learning and is numerically well-posed. Furthermore, we propose a new architecture which accounts for both patch dependence and shift-invariance, through the inclusion of CRF layers and anti-aliasing filters, respectively. We apply our technique to the localization of thoracic diseases in chest X-ray images and demonstrate state-of-the-art localization performance on the ChestX-ray14 dataset.

    S. Vedula, O. Senouf, G. Zurakov, A. M. Bronstein, O. Michailovich, M. Zibulevsky, Learning beamforming in ultrasound imaging, Proc. Medical Imaging with Deep Learning (MIDL), 2019 details

    Learning beamforming in ultrasound imaging

    S. Vedula, O. Senouf, G. Zurakov, A. M. Bronstein, O. Michailovich, M. Zibulevsky
    Proc. Medical Imaging with Deep Learning (MIDL), 2019

    Medical ultrasound (US) is a widespread imaging modality owing its popularity to cost-efficiency, portability, speed, and lack of harmful ionizing radiation. In this paper, we demonstrate that replacing the traditional ultrasound processing pipeline with a data-driven, learnable counterpart leads to significant improvement in image quality. Moreover, we demonstrate that greater improvement can be achieved through a learning-based design of the transmitted beam patterns simultaneously with learning an image reconstruction pipeline. We evaluate our method on an in-vivo first-harmonic cardiac ultrasound dataset acquired from volunteers and demonstrate the significance of the learned pipeline and transmit beam patterns on the image quality when compared to standard transmit and receive beamformers used in high frame-rate US imaging. We believe that the presented methodology provides a fundamentally different perspective on the classical problem of ultrasound beam pattern design.

    E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. Bronstein, RepMet: Representative-based metric learning for classification and one-shot object detection, Proc. Computer Vision and Pattern Recognition (CVPR), 2019 details

    RepMet: Representative-based metric learning for classification and one-shot object detection

    E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. Bronstein
    Proc. Computer Vision and Pattern Recognition (CVPR), 2019

    Distance metric learning (DML) has been successfully applied to object classification, both in the standard regime of rich training data and in the few-shot scenario, where each category is represented by only a few examples. In this work, we propose a new method for DML, featuring a joint learning of the embedding space and of the data distribution of the training categories, in a single training process. Our method improves upon leading algorithms for DML-based object classification. Furthermore, it opens the door for a new task in computer vision, few-shot object detection, since the proposed DML architecture can be naturally embedded as the classification head of any standard object detector. In numerous experiments, we achieve state-of-the-art classification results on a variety of fine-grained datasets, and offer the community a benchmark on the few-shot detection task, performed on the Imagenet-LOC dataset.

    O. Halimi, O. Litany, E. Rodolà, A. M. Bronstein, R. Kimmel, Self-supervised learning of dense shape correspondence, Proc. Computer Vision and Pattern Recognition (CVPR), 2019 details

    Self-supervised learning of dense shape correspondence

    O. Halimi, O. Litany, E. Rodolà, A. M. Bronstein, R. Kimmel
    Proc. Computer Vision and Pattern Recognition (CVPR), 2019

    We introduce the first completely unsupervised correspondence learning approach for deformable 3D shapes. Key to our model is the understanding that natural deformations (such as changes in the pose) approximately preserve the metric structure of the surface, yielding a natural criterion to drive the learning process toward distortion-minimizing predictions. On this basis, we overcome the need for annotated data and replace it with a purely geometric criterion. The resulting learning model is class-agnostic and is able to leverage any type of deformable geometric data for the training phase. In contrast to existing supervised approaches which specialize in the class seen at training time, we demonstrate stronger generalization as well as applicability to a variety of challenging settings. We showcase our method on a wide selection of correspondence benchmarks, where we outperform other methods in terms of accuracy, generalization, and efficiency.

    A. Alfassy, L. Karlinsky, A. Aides, J. Shtok, S. Harary, R. Feris, R. Giryes, A. M. Bronstein, LaSO: Label-Set Operations networks for multi-label few-shot learning, Proc. Computer Vision and Pattern Recognition (CVPR), 2019 details

    LaSO: Label-Set Operations networks for multi-label few-shot learning

    A. Alfassy, L. Karlinsky, A. Aides, J. Shtok, S. Harary, R. Feris, R. Giryes, A. M. Bronstein
    Proc. Computer Vision and Pattern Recognition (CVPR), 2019

    Example synthesis is one of the leading methods to tackle the problem of few-shot learning, where only a small number of samples per class are available. However, current synthesis approaches only address the scenario of a single category label per image. In this work, we propose a novel technique for synthesizing samples with multiple labels for the (yet unhandled) multi-label few-shot classification scenario. We propose to combine pairs of given examples in feature space, so that the resulting synthesized feature vectors will correspond to examples whose label sets are obtained through certain set operations on the label sets of the corresponding input pairs. Thus, our method is capable of producing a sample containing the intersection, union or set-difference of labels present in two input samples. As we show, these set operations generalize to labels unseen during training. This enables performing augmentation on examples of novel categories, thus facilitating multi-label few-shot classifier learning. We conduct numerous experiments showing promising results for the label-set manipulation capabilities of the proposed approach, both directly (using classification and retrieval metrics), and in the context of performing data augmentation for multi-label few-shot learning. We propose a benchmark for this new and challenging task and show that our method compares favorably to all the common baselines.

    A. Zabatani, V. Surazhsky, E. Sperling, S. Ben Moshe, O. Menashe, D. H. Silver, Z. Karni, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Intel RealSense SR300 Coded light depth Camera, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2019 details

    Intel RealSense SR300 Coded light depth Camera

    A. Zabatani, V. Surazhsky, E. Sperling, S. Ben Moshe, O. Menashe, D. H. Silver, Z. Karni, A. M. Bronstein, M. M. Bronstein, R. Kimmel
    IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2019

    Intel RealSense SR300 is a depth camera capable of providing a VGA-size depth map at 60 fps and 0.125 mm depth resolution. In addition, it outputs an infrared VGA-resolution image and a 1080p color texture image at 30 fps. The SR300 form factor enables it to be integrated into small consumer products and as a front-facing camera in laptops and Ultrabooks. The SR300 depth camera is based on a coded-light technology where triangulation between projected patterns and images captured by a dedicated sensor is used to produce the depth map. Each projected line is coded by a special temporal optical code that enables a dense depth map reconstruction from its reflection. The solid mechanical assembly of the camera allows it to stay calibrated throughout temperature and pressure changes, drops, and hits. In addition, active dynamic control maintains a calibrated depth output. An extended API LibRS released with the camera allows developers to integrate the camera in various applications. Algorithms for 3D scanning, facial analysis, hand gesture recognition, and tracking are within reach for applications using the SR300. In this paper, we describe the underlying technology, hardware, and algorithms of the SR300, as well as its calibration procedure, and outline some use cases. We believe that this paper will provide a full case study of a mass-produced depth sensing product and technology.

    Y. Zur, C. Baskin, E. Zheltonozhskii, B. Chmiel, I. Evron, A. M. Bronstein, A. Mendelson, Towards learning of filter-level heterogeneous compression of convolutional neural networks, Proc. AutoML Workshop, Int'l Conf. on Machine Learning (ICML), 2019 details

    Towards learning of filter-level heterogeneous compression of convolutional neural networks

    Y. Zur, C. Baskin, E. Zheltonozhskii, B. Chmiel, I. Evron, A. M. Bronstein, A. Mendelson
    Proc. AutoML Workshop, Int'l Conf. on Machine Learning (ICML), 2019

    Recently, deep learning has become a de facto standard in machine learning with convolutional neural networks (CNNs) demonstrating spectacular success on a wide variety of tasks. However, CNNs are typically very demanding computationally at inference time. One of the ways to alleviate this burden on certain hardware platforms is quantization, relying on the use of low-precision arithmetic representation for the weights and the activations. Another popular method is the pruning of the number of filters in each layer. While mainstream deep learning methods train the neural network weights while keeping the network architecture fixed, the emerging neural architecture search (NAS) techniques make the latter also amenable to training. In this paper, we formulate optimal arithmetic bit length allocation and neural network pruning as a NAS problem, searching for the configurations satisfying a computational complexity budget while maximizing the accuracy. We use a differentiable search method based on the continuous relaxation of the search space proposed by Liu et al. (2019a). We show, by grid search, that heterogeneous quantized networks suffer from a high variance which renders the benefit of the search questionable. For pruning, improvement over homogeneous cases is possible, but it is still challenging to find those configurations with the proposed method. The code is publicly available at https://github.com/yochaiz/Slimmable and https://github.com/yochaiz/darts-UNIQ.

    T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, O. Michailovich, M. Zibulevsky, Joint learning of Cartesian undersampling and reconstruction for accelerated MRI, Proc. Int’l Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2020 details

    Joint learning of Cartesian undersampling and reconstruction for accelerated MRI

    T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, O. Michailovich, M. Zibulevsky
    Proc. Int’l Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2020

    Magnetic Resonance Imaging (MRI) is considered today the gold-standard modality for soft tissues. The long acquisition times, however, make it more prone to motion artifacts as well as contribute to the relatively high costs of this examination. Over the years, multiple studies concentrated on designing reduced measurement schemes and image reconstruction schemes for MRI; however, these problems have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of the simultaneous learning-based design of the acquisition and reconstruction schemes, manifesting significant improvement in the reconstruction quality with a constrained time budget. Inspired by these successes, in this work, we propose to learn accelerated MR acquisition schemes (in the form of Cartesian trajectories) jointly with the image reconstruction operator. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using the learned Cartesian trajectories at different speed-up rates.

    B. Chmiel, C. Baskin, R. Banner, E. Zheltonozhskii, Y. Yermolin, A. Karbachevsky, A. M. Bronstein, A. Mendelson, Feature map transform coding for energy-efficient CNN inference, Proc. Intl. Joint Conf. on Neural Networks (IJCNN), 2020 details

    Feature map transform coding for energy-efficient CNN inference

    B. Chmiel, C. Baskin, R. Banner, E. Zheltonozshkii, Y. Yermolin, A. Karbachevsky, A. M. Bronstein, A. Mendelson
    Proc. Intl. Joint Conf. on Neural Networks (IJCNN), 2020

    Convolutional neural networks (CNNs) achieve state-of-the-art accuracy in a variety of tasks in computer vision and beyond. One of the major obstacles hindering the ubiquitous use of CNNs for inference on low-power edge devices is their relatively high computational complexity and memory bandwidth requirements. The latter often dominates the energy footprint on modern hardware. In this paper, we introduce a lossy transform coding approach, inspired by image and video compression, designed to reduce the memory bandwidth due to the storage of intermediate activation calculation results. Our method exploits the high correlations between feature maps and adjacent pixels and allows halving the data transfer volumes to the main memory without re-training. We analyze the performance of our approach on a variety of CNN architectures and demonstrate that an FPGA implementation of ResNet18 with our approach results in a reduction of around 40% in the memory energy footprint compared to a quantized network, with negligible impact on accuracy. A reference implementation accompanies the paper.

    E. Schwartz, L. Karlinsky, R. Feris, R. Giryes, A. M. Bronstein, Baby steps towards few-shot learning with multiple semantics, arXiv:1906.01905, 2019 details

    Baby steps towards few-shot learning with multiple semantics

    E. Schwartz, L. Karlinsky, R. Feris, R. Giryes, A. M. Bronstein
    arXiv:1906.01905, 2019

    Learning from one or few visual examples is one of the key capabilities of humans since early infancy, but is still a significant challenge for modern AI systems. While considerable progress has been achieved in few-shot learning from a few image examples, much less attention has been given to the verbal descriptions that are usually provided to infants when they are presented with a new object. In this paper, we focus on the role of additional semantics that can significantly facilitate few-shot visual learning. Building upon recent advances in few-shot learning with additional semantic information, we demonstrate that further improvements are possible using richer semantics and multiple semantic sources. Using these ideas, we offer the community a new result on the one-shot test of the popular miniImageNet benchmark, comparing favorably to the previous state-of-the-art results for both visual only and visual plus semantics-based approaches. We also performed an ablation study investigating the components and design choices of our approach.

    A. Rampini, I. Tallini, M. Ovsjanikov, A. M. Bronstein, E. Rodolà, Correspondence-free region localization for partial shape similarity via Hamiltonian spectrum alignment, Proc. 3D Vision (3DV), 2019 (Best paper award) details

    Correspondence-free region localization for partial shape similarity via Hamiltonian spectrum alignment

    A. Rampini, I. Tallini, M. Ovsjanikov, A. M. Bronstein, E. Rodolà
    Proc. 3D Vision (3DV), 2019 (Best paper award)

    We consider the problem of localizing relevant subsets of non-rigid geometric shapes given only a partial 3D query as the input. Such problems arise in several challenging tasks in 3D vision and graphics, including partial shape similarity, retrieval, and non-rigid correspondence. We phrase the problem as one of alignment between short sequences of eigenvalues of basic differential operators, which are constructed upon a scalar function defined on the 3D surfaces. Our method therefore seeks a scalar function that entails this alignment. Unlike existing approaches, we do not require solving for a correspondence between the query and the target, which greatly simplifies the optimization process; our core technique is also descriptor-free, as it is driven by the geometry of the two objects as encoded in their operator spectra. We further show that our spectral alignment algorithm provides a remarkably simple alternative to the recent shape-from-spectrum reconstruction approaches. For both applications, we demonstrate improvement over the state-of-the-art either in terms of accuracy or computational cost.

    O. Senouf, S. Vedula, T. Weiss, A. M. Bronstein, O. Michailovich, M. Zibulevsky, Self-supervised learning of inverse problem solvers in medical imaging, Proc. Medical Image Learning with Less Labels and Imperfect Data, MICCAI 2019 details

    Self-supervised learning of inverse problem solvers in medical imaging

    O. Senouf, S. Vedula, T. Weiss, A. M. Bronstein, O. Michailovich, M. Zibulevsky
    Proc. Medical Image Learning with Less Labels and Imperfect Data, MICCAI 2019

    In the past few years, deep learning-based methods have demonstrated enormous success in solving inverse problems in medical imaging. In this work, we address the following question: Given a set of measurements obtained from real imaging experiments, what is the best way to use a learnable model and the physics of the modality to solve the inverse problem and reconstruct the latent image? Standard supervised learning-based methods approach this problem by collecting data sets of known latent images and their corresponding measurements. However, these methods are often impractical due to the lack of availability of appropriately sized training sets, and, more generally, due to the inherent difficulty in measuring the “ground-truth” latent image. In light of this, we propose a self-supervised approach to training inverse models in medical imaging in the absence of aligned data. Our method requires access only to the measurements and the forward model at training. We showcase its effectiveness on inverse problems arising in accelerated magnetic resonance imaging (MRI).

    N. Diamant, D. Zadok, C. Baskin, E. Schwartz, A. M. Bronstein, Beholder-GAN: Generation and beautification of facial images with conditioning on their beauty level, Proc. Int'l Conf. on Image Processing (ICIP), 2019 details

    Beholder-GAN: Generation and beautification of facial images with conditioning on their beauty level

    N. Diamant, D. Zadok, C. Baskin, E. Schwartz, A. M. Bronstein
    Proc. Int'l Conf. on Image Processing (ICIP), 2019

    Beauty is in the eye of the beholder. This maxim, emphasizing the subjectivity of the perception of beauty, has enjoyed a wide consensus since ancient times. In the digital era, data-driven methods have been shown to be able to predict human-assigned beauty scores for facial images. In this work, we augment this ability and train a generative model that generates faces conditioned on a requested beauty score. In addition, we show how this trained generator can be used to beautify an input face image. By doing so, we achieve an unsupervised beautification model, in the sense that it relies on no ground truth target images.

    G. Pai, R. Talmon, A. M. Bronstein, R. Kimmel, DIMAL: Deep isometric manifold learning using sparse geodesic sampling, Proc. IEEE Winter Conf. on Applications of Computer Vision (WACV), 2019 details

    DIMAL: Deep isometric manifold learning using sparse geodesic sampling

    G. Pai, R. Talmon, A. M. Bronstein, R. Kimmel
    Proc. IEEE Winter Conf. on Applications of Computer Vision (WACV), 2019

    This paper explores a fully unsupervised deep learning approach for computing distance-preserving maps that generate low-dimensional embeddings for a certain class of manifolds. We use the Siamese configuration to train a neural network to solve the problem of least squares multidimensional scaling for generating maps that approximately preserve geodesic distances. By training with only a few landmarks, we show a significantly improved local and nonlocal generalization of the isometric mapping as compared to analogous non-parametric counterparts. Importantly, the combination of a deep-learning framework with a multidimensional scaling objective enables a numerical analysis of network architectures to aid in understanding their representation power. This provides a geometric perspective to the generalizability of deep learning.

    O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, D. Cremers, Partial single- and multi-shape dense correspondence using functional maps, Chapter in The Handbook of Numerical Analysis - Processing, Analyzing and Learning of Images, Shapes, and Forms, Elsevier, 2019 details

    Partial single- and multi-shape dense correspondence using functional maps

    O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, D. Cremers
    Chapter in The Handbook of Numerical Analysis - Processing, Analyzing and Learning of Images, Shapes, and Forms, Elsevier, 2019

    Shape correspondence is a fundamental problem in computer graphics and vision, with applications in various problems including animation, texture mapping, robotic vision, medical imaging, archaeology and many more. In settings where the shapes are allowed to undergo non-rigid deformations and only partial views are available, the problem becomes very challenging. In this chapter we describe recent techniques designed to tackle such problems. Specifically, we explain how the renowned functional maps framework can be extended to tackle the partial setting. We then present a further extension to the multi-part case in which one tries to establish correspondence between a collection of shapes. Finally, we focus on improving the technique's efficiency by disposing of its spatial ingredient and thus keeping the computation in the spectral domain. Extensive experimental results are provided along with the theoretical explanations, to demonstrate the effectiveness of the described methods in these challenging scenarios.

    A. Boyarski, A. M. Bronstein, Multidimensional scaling, Computer Vision: A Reference Guide, (Katsushi Ikeuchi, Ed.) details

    Multidimensional scaling

    A. Boyarski, A. M. Bronstein
    Computer Vision: A Reference Guide, (Katsushi Ikeuchi, Ed.)

    The various multidimensional scaling (MDS) models can be broadly classified into metric vs. non-metric, and strain-based (classical scaling) vs. stress-based (distance scaling) models. In metric MDS the goal is to keep the distances in the embedding space as close as possible to the given dissimilarities, while in non-metric MDS only the order relations between the dissimilarities are important. Strain-based MDS is an algebraic version of the problem that can be solved by eigenvalue decomposition. Stress-based MDS uses a geometric distortion criterion, which results in a non-linear and non-convex optimization problem. Each of these models has its own merits and drawbacks, both numerically and application-wise. On top of these basic models, there exist numerous generalizations, including embedding into non-Euclidean domains, working with different stress models, working in different subspaces, and incorporating machine learning approaches to obtain faster, more accurate and more robust embeddings. This chapter reviews these models, with emphasis on their role in computer vision applications.
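
    As an illustration of the stress criterion mentioned in the abstract above, the following minimal sketch evaluates the raw (unweighted) stress of a candidate embedding, i.e. the sum of squared differences between embedded Euclidean distances and the given dissimilarities. The function name `raw_stress` and the toy data are assumptions made for this example and are not taken from the chapter; other stress variants (weighted, normalized) exist.

```python
import math
from itertools import combinations


def raw_stress(points, dissimilarities):
    """Sum over pairs i<j of (embedded distance - target dissimilarity)^2.

    `points` is a list of coordinate tuples; `dissimilarities` is a
    symmetric matrix given as a list of lists (hypothetical toy input).
    """
    total = 0.0
    for i, j in combinations(range(len(points)), 2):
        d_ij = math.dist(points[i], points[j])  # Euclidean distance in the embedding
        total += (d_ij - dissimilarities[i][j]) ** 2
    return total


# Toy dissimilarities of three collinear points spaced one unit apart.
delta = [[0, 1, 2],
         [1, 0, 1],
         [2, 1, 0]]

exact = [(0.0,), (1.0,), (2.0,)]      # reproduces every dissimilarity
squeezed = [(0.0,), (0.5,), (1.0,)]   # all distances halved

print(raw_stress(exact, delta))
print(raw_stress(squeezed, delta))
```

    The exact embedding attains zero stress, while the squeezed one is penalized; a stress-based MDS solver would minimize this quantity over the point coordinates.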

    E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. Bronstein, ∆-encoder: an effective sample synthesis method for few-shot object recognition, Proc. Neural Information Processing Systems (NIPS), 2018 details

    ∆-encoder: an effective sample synthesis method for few-shot object recognition

    E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. Bronstein
    Proc. Neural Information Processing Systems (NIPS), 2018

    Learning to classify new categories based on just one or a few examples is a long-standing challenge in modern computer vision. In this work, we propose a simple yet effective method for few-shot (and one-shot) object recognition. Our approach is based on a modified auto-encoder, denoted ∆-encoder, that learns to synthesize new samples for an unseen category just by seeing a few examples from it. The synthesized samples are then used to train a classifier. The proposed approach learns both to extract transferable intra-class deformations, or “deltas”, between same-class pairs of training examples, and to apply those deltas to the few provided examples of a novel class (unseen during training) in order to efficiently synthesize samples from that new class. The proposed method improves over the state-of-the-art in one-shot object recognition and compares favorably in the few-shot case.

    E. Rodolà, Z. Lähner, A. M. Bronstein, M. M. Bronstein, J. Solomon, Functional maps representation on product manifolds, arXiv:1809.10940, 2018 details

    Functional maps representation on product manifolds

    E. Rodolà, Z. Lähner, A. M. Bronstein, M. M. Bronstein, J. Solomon
    arXiv:1809.10940, 2018

    We consider the tasks of representing, analyzing and manipulating maps between shapes. We model maps as densities over the product manifold of the input shapes; these densities can be treated as scalar functions and therefore are manipulable using the language of signal processing on manifolds. Being a manifold itself, the product space endows the set of maps with a geometry of its own, which we exploit to define map operations in the spectral domain; we also derive relationships with other existing representations (soft maps and functional maps). To apply these ideas in practice, we discretize product manifolds and their Laplace-Beltrami operators, and we introduce localized spectral analysis of the product manifold as a novel tool for map processing. Our framework applies to maps defined between and across 2D and 3D shapes without requiring special adjustment, and it can be implemented efficiently with simple operations on sparse matrices.

    C. Baskin, N. Liss, Y. Chai, E. Zheltonozhskii, E. Schwartz, R. Giryes, A. Mendelson, A. M. Bronstein, NICE: noise injection and clamping estimation for neural network quantization, arXiv:1810.00162, 2018 details

    NICE: noise injection and clamping estimation for neural network quantization

    C. Baskin, N. Liss, Y. Chai, E. Zheltonozhskii, E. Schwartz, R. Giryes, A. Mendelson, A. M. Bronstein
    arXiv:1810.00162, 2018

    Convolutional neural networks (CNNs) are very popular in many fields, including computer vision, speech recognition, and natural language processing, to name a few. Though deep learning leads to groundbreaking performance in these domains, the networks used are very demanding computationally and are far from real-time even on a GPU, which is not power efficient and therefore does not suit low-power systems such as mobile devices. To overcome this challenge, some solutions have been proposed for quantizing the weights and activations of these networks, which accelerate the runtime significantly. Yet, this acceleration comes at the cost of a larger error. The NICE method proposed in this work trains quantized neural networks by noise injection and learned clamping, which improve the accuracy. This leads to state-of-the-art results on various regression and classification tasks, e.g., ImageNet classification with architectures such as ResNet-18/34/50 with as low as 3-bit weights and activations. We implement the proposed solution on an FPGA to demonstrate its applicability for low-power real-time applications.
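
    To make the two ingredients named in the abstract above concrete, here is a minimal sketch of a clamped uniform quantizer together with a training-time surrogate that injects noise instead of rounding. It is not the paper's procedure: the clamp value is fixed here rather than learned, the uniform noise distribution is an assumption, and the names `quantize` and `inject_noise` are hypothetical.

```python
import random


def clamp(x, lo, hi):
    """Restrict x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def quantize(x, clamp_max, bits):
    """Uniformly quantize x to 2**bits levels on [0, clamp_max]."""
    levels = 2 ** bits - 1
    step = clamp_max / levels
    return round(clamp(x, 0.0, clamp_max) / step) * step


def inject_noise(x, clamp_max, bits, rng):
    """Training-time surrogate (assumed form): clamp, then add uniform
    noise spanning one quantization step instead of rounding."""
    step = clamp_max / (2 ** bits - 1)
    return clamp(x, 0.0, clamp_max) + rng.uniform(-step / 2, step / 2)


rng = random.Random(0)
print(quantize(0.7, 1.0, 2))           # nearest of the 4 levels {0, 1/3, 2/3, 1}
print(quantize(5.0, 1.0, 3))           # values above the clamp saturate
print(inject_noise(0.5, 1.0, 3, rng))  # within half a step of 0.5
```

    In a scheme of this kind the noisy surrogate keeps the operation differentiable during training, while the hard quantizer is used at inference.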

    Q. Qiu, J. Lezama, A. M. Bronstein, G. Sapiro, ForestHash: Semantic hashing with shallow random forests and tiny convolutional networks, Proc. European Conf. on Computer Vision (ECCV), 2018 details

    ForestHash: Semantic hashing with shallow random forests and tiny convolutional networks

    Q. Qiu, J. Lezama, A. M. Bronstein, G. Sapiro
    Proc. European Conf. on Computer Vision (ECCV), 2018

    Hash codes are efficient data representations for coping with the ever-growing amounts of data. In this paper, we introduce a random forest semantic hashing scheme that embeds tiny convolutional neural networks (CNNs) into shallow random forests, with near-optimal information-theoretic code aggregation among trees. We start with a simple hashing scheme, where random trees in a forest act as hashing functions by setting '1' for the visited tree leaf, and '0' for the rest. We show that traditional random forests fail to generate hashes that preserve the underlying similarity between the trees, rendering the random forests approach to hashing challenging. To address this, we propose to first randomly group arriving classes at each tree split node into two groups, obtaining a significantly simplified two-class classification problem, which can be handled using a light-weight CNN weak learner. Such a random class grouping scheme enables code uniqueness by enforcing each class to share its code with different classes in different trees. A non-conventional low-rank loss is further adopted for the CNN weak learners to encourage code consistency by minimizing intra-class variations and maximizing inter-class distance for the two random class groups. Finally, we introduce an information-theoretic approach for aggregating codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. The proposed approach significantly outperforms state-of-the-art hashing methods for image retrieval tasks on large-scale public datasets, while performing at the level of other state-of-the-art image classification techniques and utilizing a more compact, efficient and scalable representation. This work proposes a principled and robust procedure to train and deploy in parallel an ensemble of light-weight CNNs, instead of simply going deeper.
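
    The simple starting scheme described in the abstract above (a '1' for the visited leaf of each tree and '0' elsewhere) can be sketched as follows. This covers only that baseline encoding, not the class grouping, the CNN weak learners or the information-theoretic aggregation; the function names and the assumption that every tree has the same number of leaves are introduced for illustration.

```python
def leaf_code(leaf_index, num_leaves):
    """One-hot code for a single tree: 1 at the visited leaf, 0 elsewhere."""
    return [1 if k == leaf_index else 0 for k in range(num_leaves)]


def forest_code(visited_leaves, num_leaves):
    """Concatenate the per-tree one-hot codes into one binary code.

    `visited_leaves[t]` is the index of the leaf reached in tree t
    (hypothetical input; a real forest would produce it by traversal).
    """
    code = []
    for leaf in visited_leaves:
        code.extend(leaf_code(leaf, num_leaves))
    return code


# A sample reaching leaves 2, 0 and 3 in a forest of three 4-leaf trees.
print(forest_code([2, 0, 3], 4))
# prints [0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1]
```

    Each tree contributes exactly one set bit, so the code length grows with the number of leaves per tree times the number of trees.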

    T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Class-aware fully-convolutional Gaussian and Poisson denoising, IEEE Trans. Image Processing, Vol. 27(11), 2018 details

    Class-aware fully-convolutional Gaussian and Poisson denoising

    T. Remez, O. Litany, R. Giryes, A. M. Bronstein
    IEEE Trans. Image Processing, Vol. 27(11), 2018

    We propose a fully-convolutional neural-network architecture for image denoising which is simple yet powerful. Its structure allows exploiting the gradual nature of the denoising process, in which shallow layers handle local noise statistics, while deeper layers recover edges and enhance textures. Our method advances the state-of-the-art when trained for different noise levels and distributions (both Gaussian and Poisson). In addition, we show that making the denoiser class-aware by exploiting semantic class information boosts performance, enhances textures and reduces artifacts.

    A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E. Mueller, NetLSD: Hearing the shape of a graph, Proc. ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD), 2018 details

    NetLSD: Hearing the shape of a graph

    A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E. Mueller
    Proc. ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD), 2018

    Comparison among graphs is ubiquitous in graph analytics. However, it is a hard task in terms of the expressiveness of the employed similarity measure and the efficiency of its computation. Ideally, graph comparison should be invariant to the order of nodes and the sizes of compared graphs, adaptive to the scale of graph patterns, and scalable. Unfortunately, these properties have not been addressed together. Graph comparisons still rely on direct approaches, graph kernels, or representation-based methods, which are all inefficient and impractical for large graph collections. In this paper, we propose the Network Laplacian Spectral Descriptor (NetLSD): the first, to our knowledge, permutation- and size-invariant, scale-adaptive, and efficiently computable graph representation method that allows for straightforward comparisons of large graphs. NetLSD extracts a compact signature that inherits the formal properties of the Laplacian spectrum, specifically its heat or wave kernel; thus, it hears the shape of a graph. Our evaluation on a variety of real-world graphs demonstrates that it outperforms previous works in both expressiveness and efficiency.
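
    As a rough illustration of a spectrum-based signature of the heat-kernel type mentioned in the abstract above, the sketch below evaluates the heat trace, the sum of exp(-t·λ) over Laplacian eigenvalues λ, at a few time scales t. It assumes the normalized Laplacian and, to stay dependency-free, takes the eigenvalues of a complete graph from their known closed form rather than computing them; the function names and the chosen time scales are hypothetical.

```python
import math


def heat_trace(eigenvalues, times):
    """Heat trace h(t) = sum_j exp(-t * lambda_j), sampled at each t."""
    return [sum(math.exp(-t * lam) for lam in eigenvalues) for t in times]


def complete_graph_spectrum(n):
    """Normalized Laplacian eigenvalues of the complete graph K_n:
    0 once, and n / (n - 1) with multiplicity n - 1."""
    return [0.0] + [n / (n - 1)] * (n - 1)


times = [0.1, 1.0, 10.0]  # small t probes local structure, large t global
signature = heat_trace(complete_graph_spectrum(5), times)
print(signature)
```

    Because the signature depends only on the eigenvalues, it is unchanged by any relabeling of the nodes; at t = 0 it equals the number of nodes, and it decays towards the number of connected components as t grows.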

    O. Senouf, S. Vedula, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Blondheim, High frame-rate cardiac ultrasound imaging with deep learning, Proc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2018 details

    High frame-rate cardiac ultrasound imaging with deep learning

    O. Senouf, S. Vedula, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Blondheim
    Proc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2018

    Cardiac ultrasound imaging requires a high frame rate in order to capture rapid motion. This can be achieved by multi-line acquisition (MLA), where several narrow-focused received lines are obtained from each wide-focused transmitted line. This shortens the acquisition time at the expense of introducing block artifacts. In this paper, we propose a data-driven learning-based approach to improve the MLA image quality. We train an end-to-end convolutional neural network on pairs of real ultrasound cardiac data, acquired through MLA and the corresponding single-line acquisition (SLA). The network achieves a significant improvement in image quality for both 5- and 7-line MLA resulting in a decorrelation measure similar to that of SLA while having the frame rate of MLA.

    S. Vedula, O. Senouf, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Gaitini, High quality ultrasonic multi-line transmission through deep learning, Proc. Machine Learning for Medical Image Reconstruction (MLMIR), 2018 details

    High quality ultrasonic multi-line transmission through deep learning

    S. Vedula, O. Senouf, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Gaitini
    Proc. Machine Learning for Medical Image Reconstruction (MLMIR), 2018

    Frame rate is a crucial consideration in cardiac ultrasound imaging and 3D sonography. Several methods have been proposed in the medical ultrasound literature aiming at accelerating the image acquisition. In this paper, we consider one such method called multi-line transmission (MLT), in which several evenly separated focused beams are transmitted simultaneously. While MLT reduces the acquisition time, it comes at the expense of a heavy loss of contrast due to the interactions between the beams (cross-talk artifact). In this paper, we introduce a data-driven method to reduce the artifacts arising in MLT. To this end, we propose to train an end-to-end convolutional neural network consisting of correction layers followed by a constant apodization layer. The network is trained on pairs of raw data obtained through MLT and the corresponding single-line transmission (SLT) data. Experimental evaluation demonstrates significant improvement both in the visual image quality and in objective measures such as contrast ratio and contrast-to-noise ratio, while preserving resolution, unlike traditional apodization-based methods. We show that the proposed method is able to generalize well across different patients and anatomies on real and phantom data.

    A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E. Mueller, SGR: Self-supervised spectral graph representation learning, Proc. KDD Deep Learning Day, 2018 details

    SGR: Self-supervised spectral graph representation learning

    A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E. Mueller
    Proc. KDD Deep Learning Day, 2018

    Representing a graph as a vector is a challenging task; ideally, the representation should be easily computable and conducive to efficient comparisons among graphs, tailored to the particular data and the analytical task at hand. Unfortunately, a “one-size-fits-all” solution is unattainable, as different analytical tasks may require different attention to global or local graph features. We develop SGR, the first, to our knowledge, method for learning graph representations in a self-supervised manner. Grounded on spectral graph analysis, SGR seamlessly combines all aforementioned desirable properties. In extensive experiments, we show how our approach works on large graph collections, facilitates self-supervised representation learning across a variety of application domains, and performs competitively with state-of-the-art methods without re-training.

    E. Schwartz, R. Giryes, A. M. Bronstein, DeepISP: Towards learning an end-to-end image processing pipeline, IEEE Trans. on Image Processing, 2018 details

    DeepISP: Towards learning an end-to-end image processing pipeline

    E. Schwartz, R. Giryes, A. M. Bronstein
    IEEE Trans. on Image Processing, 2018

    We present DeepISP, a full end-to-end deep neural model of the camera image signal processing (ISP) pipeline. Our model learns a mapping from the raw low-light mosaiced image to the final visually compelling image and encompasses low-level tasks such as demosaicing and denoising as well as higher-level tasks such as color correction and image adjustment. The training and evaluation of the pipeline were performed on a dedicated dataset containing pairs of low-light and well-lit images captured by a Samsung S7 smartphone camera in both raw and processed JPEG formats. The proposed solution achieves state-of-the-art performance in the objective evaluation of PSNR on the subtask of joint denoising and demosaicing. For the full end-to-end pipeline, it achieves better visual quality compared to the manufacturer ISP, in both a subjective human assessment and when rated by a deep model trained for assessing image quality.

    H. Haim, S. Elmalem, R. Giryes, A. M. Bronstein, E. Marom, Depth estimation from a single image using deep learned phase coded mask, IEEE Trans. Computational Imaging, Vol. 2(3), 2018 (Winner of the OSA Student Grand Challenge The Optical System of the Future) details

    Depth estimation from a single image using deep learned phase coded mask

    H. Haim, S. Elmalem, R. Giryes, A. M. Bronstein, E. Marom
    IEEE Trans. Computational Imaging, Vol. 2(3), 2018 (Winner of the OSA Student Grand Challenge The Optical System of the Future)

    Depth estimation from a single image is a well-known challenge in computer vision. With the advent of deep learning, several approaches for monocular depth estimation have been proposed, all of which have inherent limitations due to the scarce depth cues that exist in a single image. Moreover, these methods are very demanding computationally, which makes them inadequate for systems with limited processing power. In this paper, a phase-coded aperture camera for depth estimation is proposed. The camera is equipped with an optical phase mask that provides unambiguous depth-related color characteristics for the captured image. These are used for estimating the scene depth map using a fully convolutional neural network. The phase-coded aperture structure is learned jointly with the network weights using backpropagation. The strong depth cues (encoded in the image by the phase mask, designed together with the network weights) allow a much simpler neural network architecture for faster and more accurate depth estimation. Performance achieved on simulated images as well as on a real optical setup is superior to the state-of-the-art monocular depth estimation methods (both with respect to the depth accuracy and required processing power), and is competitive with more complex and expensive depth estimation methods such as light-field cameras.

    E. Tsitsin, A. M. Bronstein, T. Hendler, M. Medvedovsky, Passive electric impedance tomography, Proc. Electric Impedance Tomography (EIT), 2018 details

    Passive electric impedance tomography

    E. Tsitsin, A. M. Bronstein, T. Hendler, M. Medvedovsky
    Proc. Electric Impedance Tomography (EIT), 2018

    We introduce an electric impedance tomography modality without any active current injection. By loading the probe electrodes with a time-varying network of impedances, the proposed technique exploits electrical fields existing in the medium due to biological activity or EM interference from the environment or an implantable device. A phantom validation of the technique is presented.

    E. Tsitsin, T. Mund, A. M. Bronstein, Printable anisotropic phantom for EEG with distributed current sources, Proc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018 details

    Printable anisotropic phantom for EEG with distributed current sources

    E. Tsitsin, T. Mund, A. M. Bronstein
    Proc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018
    Picture for Printable anisotropic phantom for EEG with distributed current sources

    Presented is a phantom mimicking the electromagnetic properties of the human head. The fabrication is based on additive manufacturing (3D-printing) technology combined with an electrically conductive gel. The novel key features of the phantom are the controllable anisotropic electrical conductivity of the skull and the densely packed, actively multiplexed monopolar current sources, permitting interpolation of the measured gain function to any dipolar current source position and orientation within the head. The phantom was tested in a realistic environment, successfully simulating the possible signals from neural activations situated at any depth within the brain as well as EMI and motion artifacts. The proposed design can be readily repeated in any lab having access to a standard 100 micron precision 3D-printer. The meshes of the phantom are available from the corresponding author.

    E. Tsitsin, M. Medvedovsky, A. M. Bronstein, VibroEEG: Improved EEG source reconstruction by combined acoustic-electric imaging, Proc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018 details

    VibroEEG: Improved EEG source reconstruction by combined acoustic-electric imaging

    E. Tsitsin, M. Medvedovsky, A. M. Bronstein
    Proc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018

    Electroencephalography (EEG) is a modality for recording electrical neural activity with high temporal and low spatial resolution. Here we propose a novel technique, which we call vibroEEG, that significantly improves the source localization accuracy of EEG. Our method combines electric potential acquisition with acoustic excitation of the vibrational modes of the electrically active cerebral cortex, which periodically displace the sources of the low-frequency neural electrical activity. The sources residing on the maxima of the induced modes will be maximally weighted in the corresponding spectral components of the broadband signals measured on the noninvasive electrodes. In vibroEEG, for the first time, the rich internal geometry of the cerebral cortex can be utilized to separate sources of neural activity lying close in the sense of the Euclidean metric. When the modes are excited locally using phased arrays, the neural activity can essentially be probed at any cortical location. When a single transducer is used to induce the excitations, the EEG gain matrix is still enriched with numerous independent gain vectors, increasing its rank. We show theoretically and in numerical simulations that in both cases the source localization accuracy improves substantially.

    C. Baskin, N. Liss, E. Zheltonozhskii, A. M. Bronstein, A. Mendelson, Streaming architectures for large-scale quantized neural networks on an FPGA-based dataflow platform, IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2018

    Streaming architectures for large-scale quantized neural networks on an FPGA-based dataflow platform

    C. Baskin, N. Liss, E. Zheltonozhskii, A. M. Bronstein, A. Mendelson
    IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2018

    Deep neural networks (DNNs) are used by different applications that are executed on a range of computer architectures, from IoT devices to supercomputers. The footprint of these networks is huge, as are their computational and communication needs. In order to ease the pressure on resources, research indicates that in many cases a low-precision representation (1-2 bits per parameter) of weights and other parameters can achieve similar accuracy while requiring fewer resources. Using quantized values enables the use of FPGAs to run NNs, since FPGAs are well suited to these primitives; e.g., FPGAs provide efficient support for bitwise operations and can work with arbitrary-precision representations of numbers. This paper presents a new streaming architecture for running quantized neural networks (QNNs) on FPGAs. The proposed architecture scales out better than alternatives, allowing us to take advantage of systems with multiple FPGAs. We also include support for skip connections, which are used in state-of-the-art NNs, and show that our architecture allows adding those connections almost for free. All this allowed us to implement an 18-layer ResNet for classification of 224×224 images, achieving 57.5% top-1 accuracy. In addition, we implemented a full-sized quantized AlexNet. In contrast to previous works, we use 2-bit activations instead of 1-bit ones, which improves AlexNet’s top-1 accuracy from 41.8% to 51.03% for ImageNet classification. Both AlexNet and ResNet can handle 1000-class real-time classification on an FPGA. Our implementation of ResNet-18 consumes 5× less power and is 4× slower for ImageNet when compared to the same NN on the latest Nvidia GPUs. Smaller NNs that fit on a single FPGA run faster than on GPUs on small (32×32) inputs, while consuming up to 20× less energy and power.

    R. Giryes, Y. C. Eldar, A. M. Bronstein, G. Sapiro, Tradeoffs between convergence speed and reconstruction accuracy in inverse problems, IEEE Trans. on Signal Processing, Vol. 66(7), 2018

    Tradeoffs between convergence speed and reconstruction accuracy in inverse problems

    R. Giryes, Y. C. Eldar, A. M. Bronstein, G. Sapiro
    IEEE Trans. on Signal Processing, Vol. 66(7), 2018

    Solving inverse problems with iterative algorithms is popular, especially for large data. Due to time constraints, the number of possible iterations is usually limited, potentially affecting the achievable accuracy. Given an error one is willing to tolerate, an important question is whether it is possible to modify the original iterations to obtain faster convergence to a minimizer achieving the allowed error without considerably increasing the computational cost of each iteration. Relying on recent recovery techniques developed for settings in which the desired signal belongs to some low-dimensional set, we show that using a coarse estimate of this set may lead to faster convergence at the cost of an additional reconstruction error related to the accuracy of the set approximation. Our theory ties to recent advances in sparse recovery, compressed sensing, and deep learning. In particular, it may provide a possible explanation for the successful approximation of the L1-minimization solution by neural networks with layers representing iterations, as practiced in the learned iterative shrinkage-thresholding algorithm.

    S. Vedula, O. Senouf, A. M. Bronstein, O. V. Michailovich, M. Zibulevsky, Towards CT-quality ultrasound imaging using deep learning, arXiv:1710.06304, 2017

    Towards CT-quality ultrasound imaging using deep learning

    S. Vedula, O. Senouf, A. M. Bronstein, O. V. Michailovich, M. Zibulevsky
    arXiv:1710.06304, 2017

    The cost-effectiveness and practical harmlessness of ultrasound imaging have made it one of the most widespread tools for medical diagnosis. Unfortunately, the beam-forming based image formation produces granular speckle noise, blurring, shading and other artifacts. To overcome these effects, the ultimate goal would be to reconstruct the tissue acoustic properties by solving a full wave propagation inverse problem. In this work, we make a step towards this goal, using Multi-Resolution Convolutional Neural Networks (CNN). As a result, we are able to reconstruct CT-quality images from the reflected ultrasound radio-frequency (RF) data obtained by simulation from real CT scans of a human body. We also show that CNN is able to imitate existing computationally heavy despeckling methods, thereby saving orders of magnitude in computations and making them amenable to real-time applications.

    O. Litany, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein, Deep Functional Maps: Structured prediction for dense shape correspondence, Proc. Int'l Conf. on Computer Vision (ICCV), 2017

    Deep Functional Maps: Structured prediction for dense shape correspondence

    O. Litany, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein
    Proc. Int'l Conf. on Computer Vision (ICCV), 2017

    We introduce a new framework for learning dense correspondence between deformable 3D shapes. Existing learning-based approaches model shape correspondence as a labelling problem, where each point of a query shape receives a label identifying a point on some reference domain; the correspondence is then constructed a posteriori by composing the label predictions of two input shapes. We propose a paradigm shift and design a structured prediction model in the space of functional maps, linear operators that provide a compact representation of the correspondence. We model the learning process via a deep residual network which takes dense descriptor fields defined on two shapes as input, and outputs a soft map between the two given objects. The resulting correspondence is shown to be accurate on several challenging benchmarks comprising multiple categories, synthetic models, real scans with acquisition artifacts, topological noise, and partiality.

    Z. Laehner, M. Vestner, A. Boyarski, O. Litany, R. Slossberg, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein, R. Kimmel, D. Cremers, Efficient deformable shape correspondence via kernel matching, Proc. 3D Vision (3DV), 2017

    Efficient deformable shape correspondence via kernel matching

    Z. Laehner, M. Vestner, A. Boyarski, O. Litany, R. Slossberg, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein, R. Kimmel, D. Cremers
    Proc. 3D Vision (3DV), 2017

    We present a method to match three-dimensional shapes under non-isometric deformations, topology changes and partiality. We formulate the problem as matching between a set of pair-wise and point-wise descriptors, imposing a continuity prior on the mapping, and propose a projected descent optimization procedure inspired by difference of convex functions (DC) programming. Surprisingly, in spite of the highly non-convex nature of the resulting quadratic assignment problem, our method converges to a semantically meaningful and continuous mapping in most of our experiments, and scales well. We provide a preliminary theoretical analysis and several interpretations of the method.

    G. Alexandroni, Y. Podolsky, H. Greenspan, T. Remez, O. Litany, A. M. Bronstein, R. Giryes, White matter fiber representation using continuous dictionary learning, Proc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2017

    White matter fiber representation using continuous dictionary learning

    G. Alexandroni, Y. Podolsky, H. Greenspan, T. Remez, O. Litany, A. M. Bronstein, R. Giryes
    Proc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2017

    With increasingly sophisticated Diffusion Weighted MRI acquisition methods and modelling techniques, very large sets of streamlines (fibers) are presently generated per imaged brain. These reconstructions of white matter architecture, which are important for human brain research and pre-surgical planning, require a large amount of storage and are often unwieldy and difficult to manipulate and analyze. This work proposes a novel continuous parsimonious framework in which signals are sparsely represented in a dictionary with continuous atoms. The significant innovation in our new methodology is the ability to train such continuous dictionaries, unlike previous approaches that either used pre-fixed continuous transforms or trained with finite atoms. This leads to an innovative fiber representation method, which uses Continuous Dictionary Learning to sparsely code each fiber with high accuracy. This method is tested on numerous tractograms produced from the Human Connectome Project data and achieves state-of-the-art performance in compression ratio and reconstruction error.

    M. Vestner, R. Litman, E. Rodolà, A. M. Bronstein, D. Cremers, Product Manifold Filter: Non-rigid shape correspondence via kernel density estimation in the product space, Proc. Computer Vision and Pattern Recognition (CVPR), 2017

    Product Manifold Filter: Non-rigid shape correspondence via kernel density estimation in the product space

    M. Vestner, R. Litman, E. Rodolà, A. M. Bronstein, D. Cremers
    Proc. Computer Vision and Pattern Recognition (CVPR), 2017

    Many algorithms for the computation of correspondences between deformable shapes rely on some variant of nearest neighbor matching in a descriptor space. Such are, for example, various point-wise correspondence recovery algorithms used as a post-processing stage in the functional correspondence framework. Such frequently used techniques implicitly make restrictive assumptions (e.g., near-isometry) on the considered shapes and in practice suffer from a lack of accuracy and result in poor surjectivity. We propose an alternative recovery technique capable of guaranteeing a bijective correspondence and producing significantly higher accuracy and smoothness. Unlike other methods, our approach does not depend on the assumption that the analyzed shapes are isometric. We derive the proposed method from the statistical framework of kernel density estimation and demonstrate its performance on several challenging deformable 3D shape matching datasets.

    O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, Fully spectral partial shape matching, Computer Graphics Forum, Vol. 36(2), 2017

    Fully spectral partial shape matching

    O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein
    Computer Graphics Forum, Vol. 36(2), 2017

    We propose an efficient procedure for calculating partial dense intrinsic correspondence between deformable shapes performed entirely in the spectral domain. Our technique relies on the recently introduced partial functional maps formalism and on the joint approximate diagonalization (JAD) of the Laplace-Beltrami operators previously introduced for matching non-isometric shapes. We show that a variant of the JAD problem with an appropriately modified coupling term (surprisingly) allows constructing quasi-harmonic bases localized on the latent corresponding parts. This circumvents the need to explicitly compute the unknown parts by means of the cumbersome alternating minimization used in previous approaches, and allows performing all the calculations in the spectral domain with constant complexity independent of the number of shape vertices. We provide an extensive evaluation of the proposed technique on standard non-rigid correspondence benchmarks and show state-of-the-art performance in various settings, including partiality and the presence of topological noise.

    A. Boyarski, A. M. Bronstein, M. M. Bronstein, Subspace least squares multidimensional scaling, Proc. Scale Space and Variational Methods (SSVM), 2017

    Subspace least squares multidimensional scaling

    A. Boyarski, A. M. Bronstein, M. M. Bronstein
    Proc. Scale Space and Variational Methods (SSVM), 2017

    Multidimensional Scaling (MDS) is one of the most popular methods for dimensionality reduction and visualization of high-dimensional data. Apart from these tasks, it has also found applications in the field of geometry processing for the analysis and reconstruction of non-rigid shapes. In this regard, MDS can be thought of as a shape-from-metric algorithm, consisting of finding a configuration of points in the Euclidean space that realizes, as isometrically as possible, some given distance structure. In the present work we cast the least squares variant of MDS (LS-MDS) in the spectral domain. This uncovers a multiresolution property of distance scaling which speeds up the optimization significantly, while producing comparable, and sometimes even better, embeddings.

    T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Deep class-aware image denoising, Proc. Int'l Conf. on Image Processing (ICIP), 2017

    Deep class-aware image denoising

    T. Remez, O. Litany, R. Giryes, A. M. Bronstein
    Proc. Int'l Conf. on Image Processing (ICIP), 2017

    The increasing demand for high image quality in mobile devices brings forth the need for better computational enhancement techniques, and image denoising in particular. To this end, we propose a new fully convolutional deep neural network architecture which is simple yet powerful and achieves state-of-the-art performance for additive Gaussian noise removal. Furthermore, we claim that personal photo collections can usually be categorized into a small set of semantic classes. However simple, this observation has not been exploited in image denoising until now. We show that a significant boost in performance of up to 0.4 dB PSNR can be achieved by making our network class-aware, namely, by fine-tuning it for images belonging to a specific semantic class. Relying on the hugely successful existing image classifiers, this research advocates for using a class-aware approach in all image enhancement tasks.

    O. Litany, T. Remez, A. M. Bronstein, Cloud Dictionary: Sparse coding and modeling for point clouds, arXiv:1612.04956

    Cloud Dictionary: Sparse coding and modeling for point clouds

    O. Litany, T. Remez, A. M. Bronstein
    arXiv:1612.04956

    With the development of range sensors such as LIDAR and time-of-flight cameras, 3D point cloud scans have become ubiquitous in computer vision applications, the most prominent ones being gesture recognition and autonomous driving. Parsimony-based algorithms have shown great success on images and videos where data points are sampled on a regular Cartesian grid. We propose an adaptation of these techniques to irregularly sampled signals by using continuous dictionaries. We present an example application in the form of point cloud denoising.

    T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Deep class-aware denoising, arXiv:1701.01698

    Deep class-aware denoising

    T. Remez, O. Litany, R. Giryes, A. M. Bronstein
    arXiv:1701.01698

    The increasing demand for high image quality in mobile devices brings forth the need for better computational enhancement techniques, and image denoising in particular. At the same time, the images captured by these devices can be categorized into a small set of semantic classes. However simple, this observation has not been exploited in image denoising until now. In this paper, we demonstrate how the reconstruction quality improves when a denoiser is aware of the type of content in the image. To this end, we first propose a new fully convolutional deep neural network architecture which is simple yet powerful as it achieves state-of-the-art performance even without being class-aware. We further show that a significant boost in performance of up to 0.4 dB PSNR can be achieved by making our network class-aware, namely, by fine-tuning it for images belonging to a specific semantic class. Relying on the hugely successful existing image classifiers, this research advocates for using a class-aware approach in all image enhancement tasks.

    T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Deep convolutional denoising of low-light images, arXiv:1701.01687

    Deep convolutional denoising of low-light images

    T. Remez, O. Litany, R. Giryes, A. M. Bronstein
    arXiv:1701.01687

    The Poisson distribution is used for modeling noise in photon-limited imaging. While canonical examples include relatively exotic types of sensing like spectral imaging or astronomy, the problem is relevant to regular photography now more than ever due to the booming market for mobile cameras. The restricted form factor limits the amount of absorbed light, so computational post-processing is called for. In this paper, we make use of the powerful framework of deep convolutional neural networks for Poisson denoising. We demonstrate how, by training the same network with images having a specific peak value, our denoiser outperforms the previous state of the art by a large margin both visually and quantitatively. Being flexible and data-driven, our solution avoids the heavy ad hoc engineering used in previous methods and is an order of magnitude faster. We further show that by adding a reasonable prior on the class of the image being processed, another significant boost in performance is achieved.

    O. Litany, T. Remez, D. Freedman, L. Shapira, A. M. Bronstein, R. Gal, ASIST: Automatic Semantically Invariant Scene Transformation, Computer Vision and Image Understanding, Vol. 157, 2017

    ASIST: Automatic Semantically Invariant Scene Transformation

    O. Litany, T. Remez, D. Freedman, L. Shapira, A. M. Bronstein, R. Gal
    Computer Vision and Image Understanding, Vol. 157, 2017

    We present ASIST, a technique for transforming point clouds by replacing objects with their semantically equivalent counterparts. Transformations of this kind have applications in virtual reality, repair of fused scans, and robotics. ASIST is based on a unified formulation of semantic labeling and object replacement; both result from minimizing a single objective. We present numerical tools for the efficient solution of this optimization problem. The method is experimentally assessed on new datasets of both synthetic and real point clouds, and is additionally compared to two recent works on object replacement on data from the corresponding papers.

    M. Ovsjanikov, E. Corman, M. M. Bronstein, E. Rodolà, M. Ben-Chen, L. Guibas, F. Chazal, A. M. Bronstein, Computing and processing correspondences with functional maps, SIGGRAPH Courses, 2017

    Computing and processing correspondences with functional maps

    M. Ovsjanikov, E. Corman, M. M. Bronstein, E. Rodolà, M. Ben-Chen, L. Guibas, F. Chazal, A. M. Bronstein
    SIGGRAPH Courses, 2017

    Notions of similarity and correspondence between geometric shapes and images are central to many tasks in geometry processing, computer vision, and com