Vincent Garcia

Copyright Notice

This material (namely, the preprints available on this page) is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

Patents

  1. Vincent Garcia
    Systems and methods for stabilizing videos
    Patent number: 10778895, Date of Patent: 2020-09-15
  2. Vincent Garcia and Jean Caillé
    Systems and methods for creating compilations based on hierarchical clustering
    Patent number: 10614114, Date of Patent: 2020-04-07
  3. Tom Medioni, Vincent Garcia
    Systems and methods for detecting moments within videos
    Patent number: 10403326, Date of Patent: 2019-09-03
  4. Tom Medioni, Vincent Garcia
    Systems and methods for identifying speech based on cepstral coefficients and support vector machines
    Patent number: 10403303, Date of Patent: 2019-09-03
  5. Vincent Garcia, Maxime Schwab, Francois Lagunas
    Systems and methods to distinguish between features depicted in images
    Patent number: 10192143, Date of Patent: 2019-01-29
  6. Vincent Garcia, Tom Medioni, Matthieu Rouif, Gabriel Lema, Francescu Santoni
    Systems and methods to detect and correlate user responses to media content
    Patent number: 10187690, Date of Patent: 2019-01-22
  7. Vincent Garcia
    Systems and methods for horizon identification in an image
    Patent number: 10186036, Date of Patent: 2019-01-22
  8. Tom Medioni, Vincent Garcia
    Systems and methods for editing videos based on shakiness measures
    Patent number: 9916863, Date of Patent: 2018-03-13

Journals

  1. Keelin Murphy, Bram van Ginneken, Joseph M. Reinhardt, Sven Kabus, Kai Ding, Xiang Deng, Kunlin Cao, Kaifang Du, Gary E. Christensen, Vincent Garcia, Tom Vercauteren, Nicholas Ayache, Olivier Commowick, Gregoire Malandain, Ben Glocker, Nikos Paragios, Nassir Navab, Vladlena Gorbunova, Jon Sporring, Marleen de Bruijne, Xiao Han, Mattias P. Heinrich, Julia A. Schnabel, Mark Jenkinson, Cristian Lorenz, Marc Modat, Jamie R. McClelland, Sebastien Ourselin, Sascha E.A. Muenzing, Max A. Viergever, Dante De Nigris, D. Louis Collins, Tal Arbel, Marta Peroni, Rui Li, Gregory C. Sharp, Alexander Schmidt-Richberg, Jan Ehrhardt, Rene Werner, Dirk Smeets, Dirk Loeckx, Gang Song, Nicholas Tustison, Brian Avants, James C. Gee, Marius Staring, Stefan Klein, Berend C. Stoel, Martin Urschler, Manuel Werlberger, Jef Vandemeulebroucke, Simon Rit, David Sarrut, and Josien P.W. Pluim
    Evaluation of Registration Methods on Thoracic CT: The EMPIRE10 Challenge
    In IEEE Transactions on Medical Imaging, 2011. BibTex
    EMPIRE10 (Evaluation of Methods for Pulmonary Image REgistration 2010) is a public platform for fair and meaningful comparison of registration algorithms applied to a database of intra-patient thoracic CT image pairs. Evaluation of non-rigid registration techniques is a non-trivial task. This is compounded by the fact that researchers typically test only on their own data, which varies widely. For this reason, reliable assessment and comparison of different registration algorithms has been virtually impossible in the past. In this work we present the results of the launch phase of EMPIRE10, which comprised the comprehensive evaluation and comparison of 20 individual algorithms from leading academic and industrial research groups. All algorithms are applied to the same set of 30 thoracic CT pairs. Algorithm settings and parameters are chosen by researchers expert in the configuration of their own method, and the evaluation is independent, using the same criteria for all participants. All results are published on the EMPIRE10 website (http://empire10.isi.uu.nl). The challenge remains ongoing and open to new participants. At the time of writing, full results from 24 algorithms have been published. This article details the organisation of the challenge, the data and evaluation methods, and the outcome of the initial launch with 20 algorithms. The gain in knowledge and future work are discussed.
  2. Vincent Garcia and Frank Nielsen
    Simplification and hierarchical representations of mixtures of exponential families
    In Signal Processing, Vol. 90, No. 12, pp. 3197-3212, 2010. BibTex
    A mixture model in statistics is a powerful framework commonly used to estimate the probability density function of a random variable. Most algorithms handling mixture models were originally designed specifically for mixtures of Gaussians. However, other distributions such as the Poisson, multinomial, and Gamma/Beta distributions have gained interest in signal processing over the past decades. These common distributions are unified in statistics under the framework of exponential families. In this paper, we present three generic clustering algorithms working on arbitrary mixtures of exponential families: Bregman soft clustering, Bregman hard clustering, and Bregman hierarchical clustering. These algorithms allow one to estimate a mixture model from observations, to simplify such a mixture model, and to automatically learn the "optimal" number of components of a simplified mixture model according to a resolution parameter. In addition, we present jMEF, an open-source Java library allowing users to create, process, and manage mixtures of exponential families. In particular, jMEF includes the three aforementioned Bregman clustering algorithms.
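
    As a rough complement to this abstract, the following is a minimal Python sketch of the generic Bregman hard clustering loop described above, instantiated with the squared Euclidean generator, for which the procedure reduces to plain k-means. It is an illustration under stated assumptions, not jMEF code (jMEF is a Java library), and all names in it are hypothetical.

        import numpy as np

        def bregman_divergence(F, grad_F, x, y):
            # Generic Bregman divergence D_F(x || y) = F(x) - F(y) - <x - y, grad F(y)>.
            return F(x) - F(y) - np.dot(x - y, grad_F(y))

        def bregman_hard_clustering(points, k, F, grad_F, n_iters=50, seed=0):
            # Lloyd-style hard clustering under a Bregman divergence. A standard property
            # of Bregman divergences is that the divergence-minimizing representative of a
            # cluster is its arithmetic mean, so the update step is the same as in k-means.
            rng = np.random.default_rng(seed)
            centers = points[rng.choice(len(points), size=k, replace=False)]
            for _ in range(n_iters):
                # Assignment step: each point joins the center of smallest divergence.
                labels = np.array([
                    np.argmin([bregman_divergence(F, grad_F, x, c) for c in centers])
                    for x in points
                ])
                # Update step: recompute each center as the mean of its members.
                for j in range(k):
                    members = points[labels == j]
                    if len(members) > 0:
                        centers[j] = members.mean(axis=0)
            return centers, labels

        if __name__ == "__main__":
            # Squared Euclidean generator: F(x) = <x, x>, grad F(x) = 2x, so
            # D_F(x || y) = ||x - y||^2 and the loop above is ordinary k-means.
            F = lambda x: np.dot(x, x)
            grad_F = lambda x: 2.0 * x
            rng = np.random.default_rng(1)
            pts = np.vstack([rng.standard_normal((100, 2)), rng.standard_normal((100, 2)) + 5.0])
            centers, labels = bregman_hard_clustering(pts, k=2, F=F, grad_F=grad_F)
            print(centers)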

ArXiv

  1. Frank Nielsen and Vincent Garcia
    Statistical exponential families: a digest with flash cards
    arXiv, http://arxiv.org/abs/0911.4863, November 2009. BibTex
  2. Vincent Garcia, Eric Debreuve, and Michel Barlaud
    Fast k nearest neighbor search using GPU
    arXiv, http://arxiv.org/abs/0804.1448, April 2008. BibTex

International Conferences and Workshops

  1. Vincent Garcia, Olivier Commowick, and Gregoire Malandain
    A Robust and Efficient Block Matching Framework for Non Linear Registration of Thoracic CT Images
    In Proceedings of the Grand Challenges in Medical Image Analysis (MICCAI workshop), Beijing, China, September 2010. Preprint BibTex
    The registration of thoracic images is a common but still challenging problem with critical clinical applications (e.g. radiotherapy and diagnosis). In the context of the EMPIRE10 challenge, we briefly introduce in this paper our registration method based on the diffeomorphic demons algorithm. Although fully automatic and generic (it applies to a large variety of images such as brain or thoracic CT scans), the proposed method appears to be a very efficient registration approach.
  2. Vincent Garcia, Tom Vercauteren, Gregoire Malandain, and Nicholas Ayache
    Diffeomorphic demons and the EMPIRE10 challenge
    In Proceedings of the Grand Challenges in Medical Image Analysis (MICCAI workshop), Beijing, China, September 2010. Preprint BibTex
    The registration of thoracic images is a challenging problem with essential clinical applications such as radiotherapy and diagnosis. In the context of the EMPIRE10 challenge, we briefly introduce a general, robust, and efficient algorithm to automatically register any type of scalar images (CT, MRI, ...) at virtually any location (brain, thorax, ...). Although fully automatic and generic, the proposed algorithm reached the 17th place out of the 34 algorithms evaluated in the EMPIRE10 challenge. Moreover, we have since further optimized the parameter set used for the challenge, and we demonstrate the ability of the algorithm to recover large displacements of the lung boundaries much better.
  3. Vincent Garcia, Eric Debreuve, Frank Nielsen, and Michel Barlaud
    k-nearest neighbor search: fast GPU-based implementations and application to high-dimensional feature matching
    In Proceedings of the IEEE International Conference on Image Processing (ICIP), Hong Kong, China, September 2010. Preprint BibTex
    The k-nearest neighbor (kNN) search problem is widely used in domains and applications such as classification, statistics, and biology. In this paper, we propose two fast GPU-based implementations of the brute-force kNN search algorithm using the CUDA and CUBLAS APIs. We show that, on synthetic data, our CUDA and CUBLAS implementations are up to 64X and 189X faster, respectively, than the highly optimized ANN C++ library, and up to 25X and 62X faster, respectively, on high-dimensional SIFT matching. (An illustrative sketch of the brute-force distance computation is given after this list.)
  4. Vincent Garcia, Frank Nielsen, and Richard Nock
    Hierarchical Gaussian mixture model
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Dallas, Texas, USA, March 2010. Preprint BibTex
    Gaussian mixture models (GMMs) are a convenient and essential tool for the estimation of probability density functions. Although GMMs are used in many research domains, from image processing to machine learning, the resulting statistical mixture models are often complex and need to be simplified. In this paper, we present a GMM simplification method based on a hierarchical clustering algorithm. Our method allows one, first, to quickly compute a compact version of the initial GMM and, second, to automatically learn the optimal number of components of the simplified GMM. Using the framework of Bregman divergences, this simplification algorithm, although presented here for GMMs, is suitable for any mixture of exponential families. (The closed-form divergence used between Gaussian components is recalled after this list.)
  5. Vincent Garcia, Frank Nielsen, and Richard Nock
    Levels of details for Gaussian mixture models
    In Proceedings of the Asian Conference on Computer Vision (ACCV), Xi'an, China, September 2009. Preprint BibTex
    Mixtures of Gaussians are a crucial statistical modeling tool at the heart of many challenging applications in computer vision and machine learning. In this paper, we first describe a novel and efficient algorithm for simplifying Gaussian mixture models using a generalization of the celebrated k-means quantization algorithm tailored to relative entropy. Our method is shown to compare favourably with the state of the art in terms of both time and quality. Second, we propose a practical enhanced approach providing a hierarchical representation of the simplified GMM while automatically computing the optimal number of Gaussians in the simplified mixture. An application to clustering-based image segmentation is reported.
  6. Frank Nielsen, Vincent Garcia, and Richard Nock
    Simplifying Gaussian mixture models via entropic quantization
    In Proceedings of the European Signal Processing Conference (EUSIPCO), Glasgow, Scotland, August 2009. Preprint BibTex
    Mixture models are a crucial statistical modeling tool at the heart of many challenging applications in computer vision, machine learning, and text classification, among others. In this paper, we describe a novel and efficient algorithm for simplifying Gaussian mixture models using a generalization of the celebrated k-means quantization algorithm tailored to relative entropy in statistical distribution spaces. Our algorithm extends easily to arbitrary mixtures of exponential families. The proposed method is shown to compare favourably with the state-of-the-art unscented transform clustering algorithm in terms of both time and quality.
  7. Vincent Garcia and Frank Nielsen
    Searching high-dimensional neighbours: CPU-based tailored data-structures versus GPU-based brute-force method
    In Proceedings of the International MIRAGE Conference, Paris, France, May 2009. BibTex
    Many image processing algorithms rely on the nearest neighbor (NN) or the k nearest neighbor (kNN) search problem. Several methods have been proposed to reduce the computation time, for instance using space partitioning. However, these methods are very slow in high-dimensional spaces. In this paper, we propose a fast implementation of the brute-force algorithm using GPU (Graphics Processing Unit) programming. We show that our implementation is up to 150 times faster than the classical approaches on synthetic data, and up to 75 times faster on real image processing algorithms (finding similar patches in images and texture synthesis).
  8. Vincent Garcia, Eric Debreuve, and Michel Barlaud
    Fast k nearest neighbor search using GPU
    In Proceedings of the CVPR Workshop on Computer Vision on GPU (CVGPU), Anchorage, Alaska, USA, June 2008. Preprint BibTex
    Statistical measures coming from information theory, such as the entropy and the Kullback-Leibler divergence, represent interesting bases for image and video processing tasks such as image retrieval and video object tracking. Accurate estimation of these measures requires adapting to the local sample density, especially if the data are high-dimensional. The k nearest neighbor (kNN) framework has been used to define efficient variable-bandwidth kernel-based estimators with such a locally adaptive property. Unfortunately, these estimators are computationally intensive since they rely on searching neighbors among large sets of d-dimensional vectors. This computational burden can be reduced by pre-structuring the data, e.g. using binary trees as proposed by the Approximate Nearest Neighbor (ANN) library. Yet, the recent opening of Graphics Processing Units (GPUs) to general-purpose computation by means of the NVIDIA CUDA API offers the image and video processing community a powerful platform with parallel calculation capabilities. In this paper, we propose a CUDA implementation of the "brute force" kNN search and we compare its performance to several CPU-based implementations, including an equivalent brute-force algorithm and ANN. We show a speed increase on synthetic and real data by up to one or two orders of magnitude depending on the data, with a quasi-linear behavior with respect to the data size in a given, practical range.
  9. Vincent Garcia, Sylvain Boltz, Eric Debreuve, and Michel Barlaud
    Outer-layer based tracking using entropy as a similarity measure
    In Proceedings of the IEEE International Conference on Image Processing (ICIP), San Antonio, Texas, USA, September 2007. Preprint BibTex
    Tracking can be achieved using region active contours based on homogeneity models (intensity, motion, ...). However, the model complexity necessary to achieve a given accuracy might be prohibitive. Methods based on salient points may not extract enough of these points for reliable motion estimation if the object is too homogeneous. Here we propose to compute the contour deformation based on its neighborhood. Motion estimation is performed at contour samples using a block matching approach. First, partial background masking is applied. Since outliers may then bias the motion estimation, a robust, nonparametric estimation using entropy as a similarity measure between blocks is proposed. Tracking results on synthetic and natural sequences are presented.
  10. Vincent Garcia, Eric Debreuve, and Michel Barlaud
    Tracking based on local motion estimation of spatio-temporally weighted salient points
    In Proceedings of the International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS), Santorini, Greece, June 2007. Preprint BibTex
    The extraction of a video object contour, called "rotoscoping" in cinematographic post-production, is usually performed manually and frame by frame. Semi-automatic algorithms have been proposed to reduce the load of this task. However, they classically use region information and are usually based on a notion of homogeneity of the object. This homogeneity description might be difficult to establish and, consequently, the tracking may not be precise enough. The proposed method relies on the analysis of the temporal trajectories of salient points, or keypoints, called tracks. The main contribution of this paper is the local estimation, both spatially and temporally, of the contour motion from these tracks. The proposed method seems accurate, robust to outliers, and allows for local deformations. Moreover, it can deal with partial occlusions.
  11. Vincent Garcia, Eric Debreuve, and Michel Barlaud
    Region-of-interest tracking based on keypoint trajectories on a group of pictures
    In Proceedings of the IEEE International Workshop on Content-Based Multimedia Indexing (CBMI), Bordeaux, France, June 2007. Preprint BibTex
    This paper deals with region-of-interest (ROI) tracking for applications such as video surveillance or cinematographic post-production. An ROI is typically delineated by a bounding box or a basic shape such as an ellipse in the first frame of the video. The tracking problem consists in detecting the ROI throughout the video as it moves and deforms. This detection can be done based on the full content of the ROI. However, since the ROI is, by definition, an approximate segmentation of the actual object of interest, it includes some background. This can make the ROI detection less accurate and induce drift. Instead, we propose to use keypoint extractors and local descriptors combined with robust motion estimation. The motion estimation relies on the analysis of the temporal trajectories, or tracks, of the keypoints in groups of pictures (GOPs). Results are presented on natural sequences. The proposed method seems accurate.
  12. Vincent Garcia, Eric Debreuve, and Michel Barlaud
    Méthode de suivi d'objets basée sur des trajectoires temporelles de points d'intérêt [Object tracking method based on temporal trajectories of interest points]
    In Proceedings of the Colloque GRETSI, Groupe d'Etudes du Traitement du Signal et des Images, Troyes, France, September 2007. Preprint BibTex
    This paper addresses the problem of tracking regions of interest (ROIs) in a video from a manual initialization, for applications such as video surveillance or cinematographic post-production. More precisely, we study ROI tracking based on temporal trajectories of interest points. Classical methods using such trajectories become unreliable as soon as they rely on too few observations. In this paper, we propose to increase this number of observations by estimating the motion of the ROI over a group of pictures. We show on real examples that our approach is accurate and reliable, and that it improves tracking quality compared to motion estimation over only two frames.
  13. Vincent Garcia, Eric Debreuve, and Michel Barlaud
    A contour tracking algorithm for rotoscopy
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Toulouse, France, May 2006. Preprint BibTex
    Rotoscopy is a crucial processing step in cinema post-production. It corresponds to the segmentation of an object of interest in a video in order to apply local processing. In the industry, rotoscopy is usually performed manually, frame by frame. Semi-automatic algorithms have been proposed to reduce the load of this task. However, they classically use contour-based information and consequently lack robustness in the case of occlusions. Here, we propose a region-based contour tracking algorithm relying on feature points which are temporally matched to build trajectories used to estimate a global or local deformation between a distant reference contour and the current frame. Then, we propose a rotoscopy algorithm based on forward and backward contour tracking. The use of region information and distant reference contours makes it possible to avoid drift and greatly reduces the influence of occlusions. The rotoscopy algorithm was applied to CIF and SD sequences.
  14. Vincent Garcia, Sylvain Boltz, Eric Debreuve, and Michel Barlaud
    Contour tracking for rotoscoping based on trajectories of feature points
    In Proceedings of the ECCV Workshop on Statistical Methods in Multi-Image and Video Processing (SMVP), Graz, Austria, May 2006. Preprint BibTex
    Rotoscoping is the generic term for methods that consist in defining the contour of a moving object in the frames of a video in order to apply local processing. It is usually performed manually and frame by frame in the cinema industry. Semi-automatic algorithms have been proposed to reduce the load of this task. However, they classically use contour-based information and consequently lack robustness in the presence of occlusions. We propose a rotoscoping method based on tracking in both the forward and backward directions. Each tracking pass relies on a motion estimation performed over a group of pictures. Motion is estimated from the temporal trajectories of region-based feature points by minimizing the entropy of the residual. The use of trajectories brings robustness to the time-variant statistics caused by occlusions, and the use of entropy brings robustness to outliers. The proposed method seems accurate and makes it possible to cope with large object occlusions.
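
    Entries 3, 7, and 8 above rely on the same brute-force kNN search. As a hedged illustration of the core idea (not the authors' CUDA or CUBLAS code), the NumPy sketch below expands the squared Euclidean distance matrix as ||q||^2 + ||r||^2 - 2 q.r^T, so that its dominant cost is a single matrix product, which is precisely the kind of operation a GPU BLAS accelerates; a partial sort then extracts the k smallest distances per query. All names are illustrative.

        import numpy as np

        def knn_bruteforce(ref, query, k):
            # Brute-force k-nearest-neighbor search.
            #   ref   : (n, d) array of reference points
            #   query : (m, d) array of query points
            # Returns (indices, distances) of shape (m, k), sorted by distance.
            sq_ref = np.sum(ref ** 2, axis=1)        # ||r||^2, shape (n,)
            sq_query = np.sum(query ** 2, axis=1)    # ||q||^2, shape (m,)
            # Squared distance matrix via ||q - r||^2 = ||q||^2 + ||r||^2 - 2 <q, r>;
            # the matrix product below is the step a CUBLAS-style version offloads.
            d2 = sq_query[:, None] + sq_ref[None, :] - 2.0 * query @ ref.T
            d2 = np.maximum(d2, 0.0)                 # clip tiny negatives from rounding
            # Partial selection of the k smallest entries per row, then sort those k.
            idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
            part = np.take_along_axis(d2, idx, axis=1)
            order = np.argsort(part, axis=1)
            idx = np.take_along_axis(idx, order, axis=1)
            dist = np.sqrt(np.take_along_axis(part, order, axis=1))
            return idx, dist

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            ref = rng.standard_normal((10000, 128)).astype(np.float32)   # SIFT-like dimensionality
            query = rng.standard_normal((500, 128)).astype(np.float32)
            idx, dist = knn_bruteforce(ref, query, k=5)
            print(idx.shape, dist.shape)   # (500, 5) (500, 5)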
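
    Entries 4, 5, and 6 simplify Gaussian mixtures with a k-means-type procedure whose distortion is the relative entropy (Kullback-Leibler divergence) between components. For reference, between two d-dimensional Gaussians this divergence has the standard closed form below (recalled from the textbook formula, not quoted from the papers):

        % Kullback-Leibler divergence between N(mu_0, Sigma_0) and N(mu_1, Sigma_1) in R^d
        \mathrm{KL}\left(\mathcal{N}(\mu_0,\Sigma_0)\,\|\,\mathcal{N}(\mu_1,\Sigma_1)\right)
          = \frac{1}{2}\left[
              \operatorname{tr}\left(\Sigma_1^{-1}\Sigma_0\right)
              + (\mu_1-\mu_0)^{\top}\Sigma_1^{-1}(\mu_1-\mu_0)
              - d
              + \ln\frac{\det\Sigma_1}{\det\Sigma_0}
            \right]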

Thesis

  1. Vincent Garcia
    Ph.D. Thesis: Suivi d'objets d'intérêt dans une séquence d'images : des points saillants aux mesures statistiques [Tracking objects of interest in an image sequence: from salient points to statistical measures]
    Université de Nice - Sophia Antipolis, Sophia Antipolis, France, December 2008. Thesis BibTex
  2. Vincent Garcia
    Master's Thesis: Estimation de mouvement subpixélique par blocs adaptée à la couleur avec modèle de mouvement [Subpixel block-based motion estimation adapted to color with a motion model]
    Université de Nice - Sophia Antipolis, Sophia Antipolis, France, September 2004. Thesis BibTex

Seminars

  1. Vincent Garcia
    Fast k Nearest Neighbor Search using CPU and GPU for Computer Vision algorithms
    Invited by Prof. Denis Caromel at INRIA, Sophia Antipolis, France, July 2009. BibTex
  2. Vincent Garcia
    A contour tracking algorithm for rotoscopy
    Invited by Prof. Janusz Konrad at Boston University, Boston, Massachusetts, USA, November 2005. BibTex