VISAPP 2008 Abstracts
Conference
Area 1 - Image Formation and Processing
Area 2 - Image Analysis
Area 3 - Image Understanding
Area 4 - Motion, Tracking and Stereo Vision

Special Sessions
Bayesian Approach for Inverse Problems in Computer Vision
Online Pattern Recognition and Machine Learning Techniques for Computer-Vision Applications

Workshops
VISAPP International Workshop on Robotic Perception (VISAPP-RoboPerc08)
The First International Workshop on Metadata Mining for Image Understanding (MMIU 2008)
The First International Workshop on Image Mining. Theory and Applications (IMTA 2008)


Area 1 - Image Formation and Processing

Paper Nr.:
18
Title:
BACKGROUND SEGMENTATION IN MICROSCOPY IMAGES
Author(s):
J.J. Charles, L.I. Kuncheva, B. Wells and I.S. Lim
Abstract:
In many applications it is necessary to segment the foreground of an image from the background. However, images from microscope slides illuminated using transmitted light have uneven background light levels, and this non-uniform illumination makes segmentation difficult. We propose to fit a set of parabolas in order to segment the image into background and foreground. Parabolas are fitted separately on horizontal and vertical stripes of the grey level intensity image. A pixel is labelled as background or foreground based on the two corresponding parabolas. The proposed method outperforms the following four standard segmentation techniques: (1) thresholding determined manually or by fitting a mixture of Gaussians, (2) clustering in the RGB space, (3) fitting a two-argument quadratic function on the whole image and (4) using the morphological closure method.
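
The idea of fitting parabolas along rows and columns can be illustrated with a minimal sketch. The abstract does not state the stripe width, the fitting procedure or the decision rule, so everything below is an assumption: one parabola per pixel row and per pixel column, an ordinary least-squares fit, and a pixel counted as background when it lies within a hypothetical tolerance `tol` of both fitted parabolas. The function names are illustrative only.

```python
def fit_parabola(values):
    """Least-squares fit of values[x] ~ a*x**2 + b*x + c for x = 0..n-1 (needs n >= 3)."""
    n = len(values)
    xs = range(n)
    # Power sums that make up the 3x3 normal equations.
    s0 = float(n)
    s1 = float(sum(xs))
    s2 = float(sum(x ** 2 for x in xs))
    s3 = float(sum(x ** 3 for x in xs))
    s4 = float(sum(x ** 4 for x in xs))
    t0 = float(sum(values))
    t1 = float(sum(x * v for x, v in zip(xs, values)))
    t2 = float(sum(x * x * v for x, v in zip(xs, values)))

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    lhs = [[s4, s3, s2], [s3, s2, s1], [s2, s1, s0]]
    rhs = [t2, t1, t0]
    d = det3(lhs)
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column of the system matrix by the right-hand side.
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)  # (a, b, c)


def background_mask(image, tol):
    """Assumed rule: background if the pixel is within tol of its row AND column parabola."""
    rows, cols = len(image), len(image[0])
    row_fits = [fit_parabola(row) for row in image]
    col_fits = [fit_parabola([image[y][x] for y in range(rows)]) for x in range(cols)]

    def evaluate(coeffs, x):
        a, b, c = coeffs
        return a * x * x + b * x + c

    return [[abs(image[y][x] - evaluate(row_fits[y], x)) <= tol
             and abs(image[y][x] - evaluate(col_fits[x], y)) <= tol
             for x in range(cols)]
            for y in range(rows)]
```

A plain least-squares fit is pulled towards foreground pixels, so a practical version would likely need a more robust fit than this sketch uses.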

Paper Nr.:
53
Title:
MULTIPLE VIEW GEOMETRY FOR MIXED DIMENSIONAL CAMERAS
Author(s):
Kazuki Kozuka and Jun Sato
Abstract:
In this paper, we analyze the multiple view geometry for the case where imaging sensors of various dimensions are used together. Although multiple view geometry has been studied extensively and extended to more general situations, all the existing multiple view geometries assume that the scene is observed by imaging sensors of the same dimension, such as 2D cameras. In this paper, we show that there exist multilinear constraints on image coordinates even if the dimensions of the camera images differ from each other. The new multilinear constraints can be used for describing the geometric relationships between 1D line sensors, 2D cameras, 3D range sensors etc., and for calibrating mixed sensor systems.

Paper Nr.:
85
Title:
ACCURACY IMPROVEMENTS AND ARTIFACTS REMOVAL IN EDGE BASED IMAGE INTERPOLATION
Author(s):
Nicola Asuni and Andrea Giachetti
Abstract:
In this paper we analyse the problem of general purpose image upscaling that preserves edge features and natural appearance, and we present the results of subjective and objective evaluation of images interpolated using different algorithms. In particular, we consider the well-known NEDI (New Edge Directed Interpolation, Li and Orchard, 2001) method, showing that by modifying it in order to reduce numerical instability and by making the region used to estimate the low resolution covariance adaptive, it is possible to obtain relevant improvements in the interpolation quality. The implementation of the new algorithm (iNEDI, improved New Edge Directed Interpolation), even if computationally heavy (like Li and Orchard's method), obtained, in both subjective and objective tests, quality scores that are notably higher than those obtained with NEDI and other methods presented in the literature.

Paper Nr.:
117
Title:
IMAGE INPAINTING CONSIDERING BRIGHTNESS CHANGE AND SPATIAL LOCALITY OF TEXTURES
Author(s):
Norihiko Kawai, Tomokazu Sato and Naokazu Yokoya
Abstract:
Image inpainting techniques have been widely investigated to remove undesired visual objects in images, such as damaged portions of photographs and people who have accidentally entered into pictures. Conventionally, the missing parts of an image are completed by optimizing an objective function which is defined based on pattern similarity between the missing region and the rest of the image. However, unnatural textures are easily generated due to two factors: (1) the available samples in the data region are quite limited, and (2) pattern similarity is one of the required conditions but is not sufficient for reproducing natural textures. In this paper, in order to improve the image quality of the completed texture, the objective function is extended by allowing brightness changes of sample textures (for (1)) and introducing spatial locality as an additional constraint (for (2)). The effectiveness of these extensions is successfully demonstrated by applying the proposed method to one hundred images and comparing the results with those obtained by the conventional methods.

Paper Nr.:
131
Title:
CONSTRAIN PROPAGATION FOR GHOST REMOVAL IN HIGH DYNAMIC RANGE IMAGES
Author(s):
Matteo Pedone and Janne Heikkilä
Abstract:
Creating high dynamic range (HDR) images of non-static scenes is currently a challenging task. Carefully preventing strong camera shake during shooting and performing image registration before combining the exposures cannot ensure that the resulting HDR image is consistent. This is ultimately due to the presence of moving objects in the scene, which cause the so-called ghosting artifacts. Different approaches have been developed so far in order to reduce the visible effects of ghosts in HDR images. Our iterative method propagates the influence of pixels that have a low chance of belonging to the static part of the scene through an image-guided energy minimization approach. Results produced with our technique show a significant reduction or total removal of ghosting artifacts.

Paper Nr.:
140
Title:
DATA EVALUATION FOR DEPTH CALIBRATION OF A CUSTOMARY PMD RANGE IMAGING SENSOR CONSIDERING OBJECTS WITH DIFFERENT ALBEDO
Author(s):
Jochen Radmer, Alexander Sabov and Jörg Krüger
Abstract:
For various applications, such as object recognition or tracking, and especially when the object is partly occluded or articulated, 3D information is crucial for the robustness of the application. A recently developed sensor to acquire distance information is based on the Photo Mixer Device (PMD) technique. Lateral and depth calibration has been carried out on a modified research sensor without considering the reflectivity of the objects. For a customary sensor of this type, data evaluation and depth calibration have not been carried out yet. For that reason this paper focuses on data evaluation and depth calibration of a customary sensor. In addition, the dependence of the distance measurement on the reflectivity of the considered objects is incorporated, which had not been considered at all so far.

Paper Nr.:
142
Title:
ANISOTROPIC DIFFUSION BY QUADRATIC REGULARIZATION
Author(s):
Marcus Hund and Bärbel Mertsching
Abstract:
Based on a regularization formulation of the problem, we present a novel approach to anisotropic diffusion that brings up a clear and easy-to-implement theory containing a problem formulation with existence and uniqueness of the solution. Unlike many iterative applications, we present a clear condition for the step size ensuring the convergence of the algorithm. The capability of our approach is demonstrated on a variety of well-known test images.

Paper Nr.:
144
Title:
ACCELERATED SKELETONIZATION ALGORITHM FOR TUBULAR STRUCTURES IN LARGE DATASETS BY RANDOMIZED EROSION
Author(s):
Gerald Zwettler, Franz Pfeifer, Roland Swoboda and Werner Backfrieder
Abstract:
Skeletonization is an important procedure in morphological analysis of three-dimensional objects. A simplified object geometry allows easy semantic interpretation at the cost of high computational effort. This paper introduces a fast morphological thinning approach for skeletonization of tubular structures and objects of arbitrary shape. With minimized constraints for erosions at the surface, the hit ratio is increased, allowing high-performance thinning of large datasets. Time-consuming neighbourhood checking is solved by use of fast indexing lookup tables. The novel algorithm homogeneously erodes the object's surface, resulting in an accurate extraction of the centerline, even when the medial axis lies between positions of the actual voxel grid. The thinning algorithm is applied for vessel tree analysis in the field of computer-based medical diagnostics, thus meeting high robustness and performance requirements.

Paper Nr.:
169
Title:
HISTORICAL DOCUMENT IMAGE BINARIZATION
Author(s):
Carlos A. B. Mello, Adriano L. I. Oliveira and Ángel Sánchez
Abstract:
Preserving and publishing historical documents is an important issue which has gained more and more interest over the years. Digital media have been used to store digital versions of the documents as image files. However, these digital images need huge storage space, as the documents are usually digitized at high resolution and in true colour for preservation purposes. In order to make the images easier to access, they can be converted into bi-level images. We present in this work a new method composed of two algorithms for binarization of historical document images based on Tsallis entropy. The new method was compared to several other well-known threshold algorithms and it achieved the best quantitative results when compared to the gold standard images of the documents, measured by precision, recall, accuracy, specificity, peak signal-to-noise ratio and mean square error.
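
For readers unfamiliar with Tsallis entropy, the following sketch shows the standard definition, S_q = (1 - sum_i p_i^q) / (q - 1), together with a threshold criterion known from the thresholding literature that maximises the pseudo-additive sum of the two class entropies. This is not the two-algorithm method of the paper, which the abstract does not describe; the function names and the default `q` are assumptions made for illustration.

```python
def tsallis_entropy(probs, q):
    """Tsallis entropy S_q = (1 - sum(p**q)) / (q - 1); requires q > 0 and q != 1."""
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def tsallis_threshold(hist, q=2.0):
    """Return the bin index t maximising S_low + S_high + (1 - q) * S_low * S_high.

    Bins 0..t form one class and bins t+1.. form the other. Splits that leave
    one class empty are skipped. Returns None if no valid split exists.
    """
    best_t, best_score = None, float("-inf")
    for t in range(len(hist) - 1):
        low, high = hist[:t + 1], hist[t + 1:]
        w_low, w_high = sum(low), sum(high)
        if w_low == 0 or w_high == 0:
            continue
        # Normalise each class to a probability distribution before taking its entropy.
        s_low = tsallis_entropy([h / w_low for h in low], q)
        s_high = tsallis_entropy([h / w_high for h in high], q)
        score = s_low + s_high + (1.0 - q) * s_low * s_high
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

Pixels with grey level at or below the returned bin would then be mapped to one binary level and the rest to the other.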

Paper Nr.:
170
Title:
A NEW RELIABILITY MEASURE FOR ESSENTIAL MATRICES SUITABLE IN MULTIPLE VIEW CALIBRATION
Author(s):
Jaume Vergés-Llahí, Daniel Moldovan and Toshikazu Wada
Abstract:
This paper presents a new technique to recover structure and motion from a large number of images acquired by an intrinsically calibrated perspective camera. We describe a method for computing reliable camera motion parameters that combines (1) a camera dependency graph and (2) an algorithm for computing the weights on the edges. A new criterion for evaluating the reliability of the epipolar constraint is introduced. It is composed of the unreliability of Kanatani's renormalization process and the decomposition error between the estimated matrix encoding the epipolar constraint and the decomposed motion parameters. Experimental results show that there exists a clear correlation between the proposed criterion and the error in the estimation of motion parameters. The performance of the proposed method is demonstrated on a long sequence of short base-line images.

Paper Nr.:
207
Title:
SELF-CALIBRATION OF CENTRAL CAMERAS BY MINIMIZING ANGULAR ERROR
Author(s):
Juho Kannala, Sami S. Brandt and Janne Heikkilä
Abstract:
This paper proposes a generic self-calibration method for central cameras. The method requires two-view point correspondences and estimates both the internal and external camera parameters by minimizing angular error. In the minimization, we use a generic camera model which is suitable for central cameras with different kinds of radial distortion models. The proposed method can hence be applied to a large range of cameras, from narrow-angle to fish-eye lenses and catadioptric cameras. Here the camera parameters are estimated by minimizing the angular error, which does not depend on the 3D coordinates of the point correspondences. However, the error still has several local minima, and in order to avoid these we propose a multi-step optimization approach. This strategy also has the advantage that it can be used together with RANSAC to provide robustness against false matches. We demonstrate our method in experiments with synthetic and real data.

Paper Nr.:
224
Title:
HIGH-SPEED IMAGE FEATURE DETECTION USING FPGA IMPLEMENTATION OF FAST ALGORITHM
Author(s):
Marek Kraft, Adam Schmidt and Andrzej Kasiński
Abstract:
Many contemporary computer and machine vision applications require finding corresponding points across multiple images. To that end, among many features, the most commonly used are corner points. Corners are formed by two or more edges, and mark the boundaries of objects or boundaries between distinctive object parts. This makes corners feature points that are used in a wide range of tasks. Therefore, numerous corner detectors with different properties have been developed. In this paper, we present a complete FPGA architecture implementing corner detection. This architecture is based on the FAST algorithm. The proposed solution is capable of processing the incoming image data at a speed of hundreds of frames per second for a 512 x 512, 8-bit gray-scale image. The speed is comparable to the results achieved by top-of-the-line general purpose processors. However, the use of an inexpensive FPGA makes it possible to cut costs and power consumption and to reduce the footprint of a complete system solution. The paper also includes a brief description of the implemented algorithm, a resource usage summary, resulting images, as well as block diagrams of the described architecture.
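
As background, the segment test at the core of the FAST detector can be sketched in software as follows. This is a plain Python illustration, not the FPGA architecture of the paper. A pixel is accepted when at least `n` contiguous pixels on a 16-pixel circle of radius 3 are all brighter than the centre plus a threshold, or all darker than the centre minus a threshold. The value `n = 9` is an assumption, since the abstract does not say which variant is implemented, and the function names are illustrative.

```python
# (dx, dy) offsets of the 16-pixel circle of radius 3, in clockwise order.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def has_circular_run(flags, n):
    """True if the circular sequence of flags contains n consecutive True values."""
    run = 0
    # Scanning the list twice handles runs that wrap around the end.
    for flag in flags + flags:
        run = run + 1 if flag else 0
        if run >= n:
            return True
    return False


def is_fast_corner(image, x, y, threshold, n=9):
    """Segment test at (x, y); the pixel must be at least 3 pixels from the border."""
    center = image[y][x]
    ring = [image[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [value > center + threshold for value in ring]
    darker = [value < center - threshold for value in ring]
    return has_circular_run(brighter, n) or has_circular_run(darker, n)
```

On a straight vertical edge only 7 contiguous circle pixels differ from the centre, so the test rejects it, which is the behaviour that separates corners from edges.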

Paper Nr.:
235
Title:
FILLING-IN GAPS IN TEXTURED IMAGES USING BIT-PLANE STATISTICS
Author(s):
E. Ardizzone, H. Dindo and G. Mazzola
Abstract:
In this paper we propose a novel approach for the texture analysis-synthesis problem, with the purpose of restoring missing zones in greyscale images. Bit-plane decomposition is used, and a dictionary is built with bit-block statistics for each plane. Gaps are reconstructed with a conditional stochastic process, to propagate global texture features into the damaged area, using information stored in the dictionary. Our restoration method is simple, easy and fast, with very good results for a large set of textured images. Results are compared with a state-of-the-art restoration algorithm.
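
The bit-plane decomposition mentioned above can be sketched as follows. The dictionary layout in `block_statistics` (counts of small sliding-window bit blocks per plane) is only an assumption about what "bit-block statistics" might look like, since the abstract gives no details; the block size and all names are hypothetical.

```python
from collections import Counter


def bit_planes(image, bits=8):
    """Split a greyscale image (lists of ints) into `bits` binary planes, LSB first."""
    return [[[(pixel >> b) & 1 for pixel in row] for row in image]
            for b in range(bits)]


def from_bit_planes(planes):
    """Recombine binary planes (LSB first) into the greyscale image."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [[sum(planes[b][y][x] << b for b in range(len(planes)))
             for x in range(cols)]
            for y in range(rows)]


def block_statistics(plane, size=2):
    """Count how often each size x size bit block occurs in one plane (assumed layout)."""
    counts = Counter()
    for y in range(len(plane) - size + 1):
        for x in range(len(plane[0]) - size + 1):
            block = tuple(plane[y + dy][x + dx]
                          for dy in range(size) for dx in range(size))
            counts[block] += 1
    return counts
```

Because the decomposition is lossless, any plane-wise reconstruction of a gap can be recombined into grey levels with `from_bit_planes`.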

Paper Nr.:
241
Title:
BINARY MORPHOLOGY AND RELATED OPERATIONS ON RUN-LENGTH REPRESENTATIONS
Author(s):
Thomas M. Breuel
Abstract:
Binary morphology on large images is compute-intensive, in particular for large structuring elements. Run-length encoding is a compact and space-saving technique for representing images. This paper describes how to implement binary morphology directly on run-length encoded binary images for rectangular structuring elements. In addition, it describes efficient algorithms for transposing and rotating run-length encoded images. The paper evaluates and compares run-length morphological processing on page images from the UW3 database with an efficient and mature bit blit-based implementation and shows that the run-length approach is several times faster than bit blit-based implementations for large images and masks. The experiments also show that complexity decreases for larger mask sizes. The paper also demonstrates running times on a simple morphology-based layout analysis algorithm on the UW3 database and shows that replacing bit blit morphology with run-length based morphology speeds up performance approximately two-fold.
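
To illustrate why morphology maps naturally onto run-length data, here is a minimal sketch of the horizontal pass for a single image row. It is not the paper's implementation: the representation (sorted, non-overlapping half-open `(start, end)` runs of foreground pixels), the function names and the treatment of the image border as background are all assumptions. A full rectangular structuring element would also need a vertical pass, for example after transposing the image as the abstract suggests.

```python
def encode_row(pixels):
    """Run-length encode one binary row into half-open (start, end) foreground runs."""
    runs, start = [], None
    for x, value in enumerate(pixels):
        if value and start is None:
            start = x
        elif not value and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(pixels)))
    return runs


def dilate_runs(runs, radius, width):
    """Dilate by a horizontal element of width 2*radius + 1: grow runs, then merge."""
    out = []
    for start, end in runs:
        s, e = max(0, start - radius), min(width, end + radius)
        if out and s <= out[-1][1]:
            # The grown run touches or overlaps the previous one: merge them.
            out[-1] = (out[-1][0], max(out[-1][1], e))
        else:
            out.append((s, e))
    return out


def erode_runs(runs, radius):
    """Erode by the same element: shrink each run and drop runs that vanish."""
    return [(s + radius, e - radius) for s, e in runs if s + radius < e - radius]
```

The cost of both operations depends on the number of runs rather than the number of pixels, and it does not grow with the mask size.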

Paper Nr.:
274
Title:
A STUDY ON ILLUMINATION NORMALIZATION FOR 2D FACE VERIFICATION
Author(s):
Qian Tao and Raymond Veldhuis
Abstract:
Illumination normalization is very important for 2D face verification. This study examines the state-of-the-art illumination normalization methods, and proposes two solutions, namely horizontal Gaussian derivative filters and local binary patterns. Experiments show that our methods significantly improve the generalization capability, while maintaining good discrimination capability of a face verification system. The proposed illumination normalization methods have low requirements on image acquisition and low computational complexity, and are very suitable for low-end 2D face verification systems.
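
A short sketch of the basic 3x3 local binary pattern operator shows why it is attractive for illumination normalization: the code at each pixel depends only on the ordering of grey levels in the neighbourhood, so a monotonically increasing change of brightness leaves it unchanged. The neighbour ordering and bit weights below are one common convention chosen for illustration, and the abstract does not say which LBP variant the study uses.

```python
# (dx, dy) offsets of the 8 neighbours, clockwise starting at the top-left pixel.
NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0)]


def lbp_code(image, x, y):
    """8-bit LBP code at an interior pixel: bit i is set if neighbour i >= centre."""
    center = image[y][x]
    code = 0
    for bit, (dx, dy) in enumerate(NEIGHBOURS):
        if image[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_image(image):
    """LBP codes for all interior pixels (the one-pixel border is dropped)."""
    return [[lbp_code(image, x, y) for x in range(1, len(image[0]) - 1)]
            for y in range(1, len(image) - 1)]
```

The low cost per pixel, eight comparisons and a few bit operations, is consistent with the low-end systems the abstract has in mind.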

Paper Nr.:
275
Title:
FACE HALLUCINATION USING PCA IN WAVELET DOMAIN
Author(s):
Abdu Rahiman V. and Jiji C. V.
Abstract:
The term face hallucination stands for recognition-based super-resolution of face images to improve the spatial resolution. In this paper, we propose two face hallucination algorithms based on principal component analysis (PCA) in the wavelet transform domain. In spatial-domain PCA-based super-resolution algorithms, a low-resolution (LR) observation is represented as a linear combination of the LR images in an image database. The super-resolved image is obtained as the linear combination of the corresponding high-resolution (HR) images in the database. In the first approach proposed in this paper, a PCA-based hallucination algorithm is applied to the wavelet coefficients of the face image. The hallucinated face image is reconstructed from the super-resolved wavelet coefficients. In the second method, the face image is split into four sub-images and the first method is applied separately to the three textured regions. The fourth region, which is relatively smooth, is interpolated using standard interpolation techniques. We compare the performance of the two proposed algorithms with their spatial-domain counterparts. The proposed methods show significant improvement over the spatial-domain approaches.

Paper Nr.:
295
Title:
FREE-VIEW POINT TV WATERMARKING EVALUATED ON GENERATED ARBITRARY VIEWS
Author(s):
Evlambios E. Apostolidis and Georgios A. Triantafyllidis
Abstract:
The recent advances in Image Based Rendering (IBR) have pioneered a new technology, free-viewpoint television, in which TV viewers freely select the viewing position and angle through the application of IBR to the transmitted multi-view video. In this paper, exhaustive tests were carried out to determine the best possible free-viewpoint TV watermarking, evaluated on arbitrary views. The watermark should not only be extractable from a generated arbitrary view, it should also be resistant to common video processing and multi-view video processing operations.

Paper Nr.:
298
Title:
MULTI-ERROR CORRECTION OF IMAGE FORMING SYSTEMS BY TRAINING SAMPLES MAINTAINING COLORS
Author(s):
Gerald Krell and Bernd Michaelis
Abstract:
Optical and electronic components of image forming devices degrade objective and subjective quality of the acquired or reproduced images. Classical restoration techniques usually require an explicit estimation or measurement of parameters for each error source. We propose to derive restoration parameters in a training phase with suitable test patterns for a particular system to be corrected. Space varying properties of different classes of image degradations are considered simultaneously. It is shown how training is performed in such a way that colors are reproduced correctly independently of the used test patterns.

Paper Nr.:
307
Title:
PROGRESSIVE DCT BASED IMAGE CODEC USING STATISTICAL PARAMETERS
Author(s):
Pooneh Bagheri Zadeh, Tom Buggy and Akbar Sheikh Akbari
Abstract:
This paper presents a novel progressive statistical and discrete cosine transform based image-coding scheme. The proposed coding scheme divides the input image into a number of non-overlapping pixel blocks. The coefficients in each block are then decorrelated into their spatial frequencies using a discrete cosine transform. Coefficients with the same spatial frequency at different blocks are put together to generate a number of matrices, where each matrix contains coefficients of a particular spatial frequency. The matrix containing DC coefficients is losslessly coded to preserve visually important information. Matrices, which consist of high frequency coefficients, are coded using a novel statistical encoder developed in this paper. Perceptual weights are used to regulate the threshold value required in the coding process of the high frequency matrices. The coded matrices generate a number of bitstreams, which are used for progressive image transmission. The proposed coding scheme, JPEG and JPEG2000 were applied to a number of test images. Results show that the proposed coding scheme outperforms JPEG and JPEG2000 subjectively and objectively at low compression ratios. Results also indicate that the decoded images using the proposed codec have superior subjective quality at high compression ratios compared to that of JPEG, while offering comparable results to that of JPEG2000.
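
The regrouping step described above, where coefficients of the same spatial frequency are collected across blocks, can be sketched as follows. This only illustrates the block DCT and the regrouping; the lossless DC coding, the statistical encoder and the perceptual weighting are not specified in the abstract and are not attempted. The orthonormal DCT-II scaling, the requirement that the image size be a multiple of the block size, and the function names are assumptions.

```python
import math


def dct_1d(signal):
    """Orthonormal 1-D DCT-II of a list of numbers."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(signal[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns. Result is [v][u]."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d([rows[y][x] for y in range(len(rows))])
            for x in range(len(rows[0]))]
    return [[cols[u][v] for u in range(len(cols))] for v in range(len(block))]


def group_by_frequency(image, block_size):
    """Map each frequency (v, u) to a matrix holding that coefficient from every block."""
    blocks_y = len(image) // block_size
    blocks_x = len(image[0]) // block_size
    bands = {(v, u): [[0.0] * blocks_x for _ in range(blocks_y)]
             for v in range(block_size) for u in range(block_size)}
    for j in range(blocks_y):
        for i in range(blocks_x):
            block = [row[i * block_size:(i + 1) * block_size]
                     for row in image[j * block_size:(j + 1) * block_size]]
            coeffs = dct_2d(block)
            for v in range(block_size):
                for u in range(block_size):
                    bands[(v, u)][j][i] = coeffs[v][u]
    return bands
```

In this layout `bands[(0, 0)]` is a scaled thumbnail of block averages, which is why coding it losslessly preserves the visually important information.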

Paper Nr.:
316
Title:
EVOLVING ROI CODING IN H.264 SVC
Author(s):
Syeda Shamikha F. Shah and Eran A. Edirisinghe
Abstract:
Region-of-Interest (ROI) based coding is an integral feature of most image/video coding techniques/standards and has important applications in content based video coding, storage and transmission. However, in the latest scalable extension of the H.264 AVC video coding standard, i.e. H.264 SVC, motion estimation across the slice group boundaries does not preserve the coding quality and compression rate of the ROI. In this paper novel enhancements to the ROI based coding for H.264 SVC have been proposed to constrain the inter frame prediction across slice group boundaries. We show that the proposed algorithms do not negatively affect the rate-distortion performance of the coded video, but provide useful additional functionality that enables the extended use of the standard in many new application domains. Further, we propose a method for supporting the coding of moving ROI in the scalable video coding domain, by adaptively changing the shape, size and position of the slice groups. We show that this additional functionality is particularly useful in video surveillance applications to effectively compress and transmit the ROI and reduce the storage and transmission requirements without any quality degradation of the ROI.

Paper Nr.:
325
Title:
APPROXIMATE POINT-TO-SURFACE REGISTRATION WITH A SINGLE CHARACTERISTIC POINT
Author(s):
Darko Dimitrov, Christian Knauer, Klaus Kriegel and Fabian Stehn
Abstract:
We present approximation algorithms for point-to-surface registration problems which have applications in medical navigation systems. One of the central tasks of such a system is to determine a "good" mapping (the registration transformation, or registration for short) of the coordinate system of the operation theatre onto the coordinate system of a 3D model M of a patient, generated from CR- or MRT scans. The registration q is computed by matching a 3D point set P measured on the skin of the patient to the 3D model M. It is chosen from a class R of admissible transformations (e.g., rigid motions) so that it approximately minimizes a suitable error function e (such as the directed Hausdorff or mean squared error distance) between q(P) and M, i.e., $q = \arg\min_{q' \in R} e(q'(P), M)$. A common technique to support the registration process is to determine, either automatically or manually, so-called characteristic points or landmarks, which are corresponding points on the model and in the point set. Since corresponding characteristic points are supposed to be mapped onto (or close to) each other, this reduces the number of degrees of freedom of the matching problem. We provide approximation algorithms which compute a rigid motion registration in the most difficult setting of only a single characteristic point.

Paper Nr.:
336
Title:
USE OF SPATIAL ADAPTATION FOR IMAGE RENDERING BASED ON AN EXTENSION OF THE CIECAM02
Author(s):
Olivier Tulet, Mohamed-Chaker Larabi and Christine Maloigne Fernandez
Abstract:
With the development and the multiplicity of imaging devices, color quality and portability have become a very challenging problem. Moreover, a color is perceived with regard to its environment. In order to take into account the variation of visual perception as a function of the environment, the CIE (Commission Internationale de l'éclairage) has standardized tools named color appearance models (CIECAM97s, CIECAM02). These models are able to take into account many phenomena related to human color vision and can predict the color of a stimulus as a function of its viewing conditions. However, these models do not deal with the influence of spatial frequencies, which can have a big impact on our perception. In this paper, an extended version of CIECAM02 is presented. This new version integrates a spatial model correcting the color in relation to its spatial frequency and its environment. Moreover, a study on the influence of the background's chromaticity has also been performed. The obtained results are sound and demonstrate the efficiency of the proposed extension.

Paper Nr.:
341
Title:
LIMITED ANGLE IMAGE RECONSTRUCTION USING FOUR HIGH RESOLUTION PROJECTION AXES AT CO-PRIME RATIO VIEW ANGLES
Author(s):
Anastasios L. Kesidis
Abstract:
This paper proposes a sequential image reconstruction algorithm for the exact reconstruction of an image from a limited number of projection angles. Specifically, four projection axes oriented at coprime ratio view angles are used. The set of proper values for the view angles as well as the overall number of samples on the projection axis are explicitly defined and are related only to the dimensions of the image. The slopes of the four projection axes are calculated according to the chosen view angle and are symmetrically oriented with respect to the horizontal and the vertical axis. The reconstruction is a non-iterative, one pass process based on a decomposition sequence which defines the order in which the image pixels are restored. Several simulation results are provided that demonstrate the feasibility of the proposed method.

Paper Nr.:
361
Title:
AN AUTOMATED VISUAL EVENT DETECTION SYSTEM FOR CABLED OBSERVATORY VIDEO
Author(s):
Danelle E. Cline, Duane R. Edgington and Jérôme Mariette
Abstract:
This paper presents an overview of a system for processing video streams from underwater cabled observatory systems based on the Automated Visual Event Detection (AVED) software. This system identifies potentially interesting visual events using a neuromorphic vision algorithm and tracks events frame-by-frame. The events can later be previewed or edited in a graphical user interface to remove false detections, and subsequently imported into a database, or used in an object classification system. We present a scalable processing system that can be used on a single computer, a Beowulf cluster, or a pool of computers, using the Condor workflow management system.

Paper Nr.:
370
Title:
EDGE-PRESERVING SMOOTHING OF NATURAL IMAGES BASED ON GEODESIC TIME FUNCTIONS
Author(s):
Jacopo Grazzini and Pierre Soille
Abstract:
In this paper, we address the problem of edge-preserving smoothing of natural images. We introduce a novel adaptive approach derived from mathematical morphology as a preprocessing stage in feature extraction and/or image segmentation. Like other filtering methods, it assumes that the local neighbourhood of a pixel contains the essential process required for the estimation of local properties. It performs a weighted averaging by combining both spatial and tonal information in a single similarity measure based on the local calculation of geodesic time functions from the estimated pixel. By designing relevant geodesic masks, it can deal with specific situations and types of images. We describe two possible strategies and show their capabilities at smoothing heterogeneous areas while preserving relevant structures in natural greyscale and color images displaying different features.

Paper Nr.:
382
Title:
COLOR QUANTIZATION BY MORPHOLOGICAL HISTOGRAM PROCESSING
Author(s):
Franklin César Flores, Leonardo Bespalhuk Facci and Roberto de Alencar Lotufo
Abstract:
In a previous paper, a graylevel quantization method based on morphological histogram processing was proposed. This paper introduces the extension of that quantization method to color images. Considering an image under the RGB color space model, this extension reduces the number of colors in the image by partitioning a 3-D histogram, similar to the RGB color space, into rectangular parallelepiped regions through an iterative process. Such partitioning is done, in each iteration, by applying the graylevel quantization method to the longest dimension of the current region which has the greatest volume. The final classified color space is used to quantize the image. This paper also shows the comparison of the proposed method to the classical median cut one.
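
The iterative partitioning loop described above can be sketched as follows. One substitution is made and should be kept in mind: the paper chooses the cut position with its morphological graylevel quantization method, which the abstract does not detail, so this sketch swaps in a simple cut at the midpoint of the longest dimension. The function names, the use of the bounding box of the colors as the region, and the mean color as the palette entry are likewise assumptions.

```python
def bounds(box):
    """Per-channel minimum and maximum of a list of (r, g, b) tuples."""
    mins = tuple(min(p[i] for p in box) for i in range(3))
    maxs = tuple(max(p[i] for p in box) for i in range(3))
    return mins, maxs


def volume(box):
    """Volume of the bounding parallelepiped of the colors in the box."""
    mins, maxs = bounds(box)
    vol = 1
    for i in range(3):
        vol *= maxs[i] - mins[i] + 1
    return vol


def partition_colors(pixels, n_regions):
    """Split the color set into at most n_regions boxes and return their mean colors."""
    boxes = [list(pixels)]
    while len(boxes) < n_regions:
        # Only boxes holding more than one distinct color can still be split.
        splittable = [i for i, box in enumerate(boxes) if len(set(box)) > 1]
        if not splittable:
            break
        box = boxes.pop(max(splittable, key=lambda i: volume(boxes[i])))
        mins, maxs = bounds(box)
        axis = max(range(3), key=lambda i: maxs[i] - mins[i])
        # Stand-in for the paper's 1-D quantization step: cut at the midpoint.
        cut = (mins[axis] + maxs[axis]) // 2
        boxes.append([p for p in box if p[axis] <= cut])
        boxes.append([p for p in box if p[axis] > cut])
    return [tuple(sum(p[i] for p in box) // len(box) for i in range(3))
            for box in boxes]
```

Each image pixel would then be replaced by the palette entry of the region that contains it.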

Paper Nr.:
427
Title:
FINGERPRINT IMAGE SEGMENTATION BASED ON BOUNDARY VALUES
Author(s):
M. Usman Akram, Anam Tariq, Shahida Jabeen and Shoab A. Khan
Abstract:
Fingerprint image segmentation strongly influences the performance of Automatic Fingerprint Identification Systems (AFIS). We propose a new enhanced segmentation technique based on unwanted boundary area gray-level values. The objective of fingerprint segmentation is to extract the region of interest (ROI) which contains the desired fingerprint impression. We present in this paper a Modified Gradient Based Method to extract the ROI. The distinct feature of our technique is that it gives highly accurate segmentation of fingerprint images, even in the case of low quality images. The proposed algorithm is applied to the FVC2004 database. Experimental results demonstrate the improved performance of the proposed scheme.

Paper Nr.:
547
Title:
NEWBORN’S BIOMETRIC IDENTIFICATION: CAN IT BE DONE?
Author(s):
Daniel Weingaertner, Olga Regina Pereira Bellon, Luciano Silva and Mônica Nunes Lima Cat
Abstract:
In this article we propose a novel biometric identification method for newborn babies using their palmprints. A new high resolution optical sensor was developed, which obtains images with enough ridge minutiae to uniquely identify the baby. The palm and footprint images of 106 newborns were analysed, leading to the conclusion that palmprints yield more detailed images than footprints. Fingerprint experts from the Identification Institute of Paraná State performed two matching tests, resulting in correct identification rates of 63.3% and 67.7%, more than three times higher than those obtained in similar experiments described in the literature. The proposed image acquisition method also opens the perspective for the creation of an automatic identification system for newborns.

Area 2 - Image Analysis

Paper Nr.:
3
Title:
SEMI-SUPERVISED DIMENSIONALITY REDUCTION USING PAIRWISE EQUIVALENCE CONSTRAINTS
Author(s):
Hakan Cevikalp, Jakob Verbeek, Frédéric Jurie and Alexander Kläser
Abstract:
To deal with the problem of insufficient labeled data, usually side information - given in the form of pairwise equivalence constraints between points - is used to discover groups within data. However, existing methods using side information typically fail in cases with high-dimensional spaces. In this paper, we address the problem of learning from side information for high-dimensional data. To this end, we propose a semi-supervised dimensionality reduction scheme that incorporates pairwise equivalence constraints for finding a better embedding space, which improves the performance of subsequent clustering and classification phases. Our method builds on the assumption that points in a sufficiently small neighborhood tend to have the same label. Equivalence constraints are employed to modify the neighborhoods and to increase the separability of different classes. Experimental results on high-dimensional image data sets show that integrating side information into the dimensionality reduction improves the clustering and classification performance.

Paper Nr.:
13
Title:
CLUSTERED CELL SEGMENTATION - Based on Iterative Voting and the Level Set Method
Author(s):
Arjan Kuijper, Yayun Zhou and Bettina Heise
Abstract:
In this paper we deal with images in which the cells cluster together and the boundaries of the cells are ambiguous. Combining the outcome of an automatic point detector with the multiphase level set method, the centre of each cell is detected and used as the "seed", in other words, the initial condition for the level set method. Then, by choosing an appropriate level set equation, the fronts of the seeds propagate and finally stop near the boundary of the cells. This method solves the cluster problem and can distinguish individual cells properly; it is therefore useful in cell segmentation. By using this method, we can count the number of cells and calculate the area of each cell. Furthermore, this information can be used to get the histogram of the cell image.

Paper Nr.:
15
Title:
CORNER DETECTION WITH MINIMAL EFFORT ON MULTIPLE SCALES
Author(s):
Ernst D. Dickmanns
Abstract:
Based on results of fitting linearly shaded blobs to rectangular image regions, a new corner detector has been developed. Theoretical results for a least-squares plane fit to the intensity distribution within a mask of four elements of the same rectangular shape and size, each holding an averaged intensity value, allow very efficient simultaneous computation of pyramid levels and of a new corner criterion at the center of the masks on these levels. The method is intended for real-time application and has thus been designed for minimal computing effort. It fits well into the ‘Unified Blob-edge-corner Method’ (UBM) developed recently. Results are given for road scenes.

Paper Nr.:
17
Title:
A MODEL-BASED APPROACH TO SHAPE FROM FOCUS
Author(s):
R. R. Sahay and A. N. Rajagopalan
Abstract:
Shape from focus (SFF) estimates the structure of a 3D object using the degree of focus as a cue in a sequence of observations. The estimate of the depth profile is, however, vulnerable to a lack of sufficient scene texture. In this paper, we propose a method to improve the estimate of the structure of the object by exploiting neighbourhood dependencies. A degradation model is used to describe the formation of space-variantly blurred observations in SFF. The shape of the object is modeled as a Markov random field and a suitably derived objective function is minimized to arrive at the final estimate of the shape.

Paper Nr.:
44
Title:
A NOVEL CHAOTIC CODING SYSTEM FOR LOSSY IMAGE COMPRESSION
Author(s):
Sebastiano Battiato and Francesco Rundo
Abstract:
In this paper a novel image compression pipeline that makes use of a controlled chaotic system is proposed. Chaos is a particular dynamic generated by nonlinear systems. Under certain conditions it is possible to manage the chaotic dynamics properly, obtaining feasible and powerful working instruments. In the proposed compression pipeline a linear feedback control strategy is used to stabilize the chaotic dynamics used to track the 1D signal generated by the input image. The pipeline is closed by an entropy encoder. Preliminary experiments and comparison with a standard JPEG engine confirm the effectiveness of the proposed chaotic coding system for both natural and graphic images. The overall performance in terms of rate-distortion is also promising.

Paper Nr.:
46
Title:
ROBUST ESTIMATION OF THE PAN-ZOOM PARAMETERS FROM A BACKGROUND AREA IN CASE OF A CRISS-CROSSING FOREGROUND OBJECT
Author(s):
J. Bruijns
Abstract:
In the field of video processing, a model of the background motion has application in deriving depth from motion. The pan-zoom parameters of our background model are estimated from the motion vectors of parts which are a priori likely to belong to the background, such as the top and side borders ("the background area"). This fails when a foreground object obscures the greater part of this background area. We have developed a method to extract a set of pan-zoom parameters for each different part of the background area. Using the pan-zoom parameters of the previous frame, we compute from these sets the pan-zoom parameters most likely to correspond to the proper background parts. This background area partition method gives more accurate pan parameters than using the entire background area for shots in which the greater part of the background area is obscured by one or more foreground objects.

Paper Nr.:
52
Title:
BINARY IMAGE SKELETON - Continuous Approach
Author(s):
Leonid Mestetskiy and Andrey Semenov
Abstract:
In this paper we propose a technique for building a correct model of the continuous skeleton of a discrete binary image. Our approach is based on approximating each connected figure in the image by a polygonal figure. The figure boundary consists of minimal-perimeter closed paths which separate foreground points from background points. The figure skeleton is constructed as the locus of centers of maximal inscribed circles. A so-called skeletal base is built from the figure skeleton by cutting off essential noise. It is shown that the constructed continuous skeleton exists and is unique for each binary image. The derived continuous skeleton has the following advantages: strict mathematical description, stability to noise, and broad capabilities for form transformation and shape comparison of objects. The proposed approach also constructs the skeleton substantially faster than discrete methods, including those that use parallel computation. This advantage is demonstrated on large real images.

Paper Nr.:
54
Title:
EFFECT OF FACIAL EXPRESSIONS ON FEATURE-BASED LANDMARK LOCALIZATION IN STATIC GREY SCALE IMAGES
Author(s):
Yulia Gizatdinova and Veikko Surakka
Abstract:
The present aim was to examine the effect of facial expressions on feature-based landmark localization in static grey scale images. In the method, local oriented edges were extracted and edge maps of the image were constructed at two levels of resolution. The landmark candidates resulting from this step were further verified by matching against the edge orientation model. The method was tested on a large database of expressive faces coded in terms of action units. Action units represented single or conjoint facial muscle activations in the upper and lower face. As the results demonstrated, eye regions were localized with high rates in both neutral and expressive datasets. Nose and mouth localization was degraded more by variations in facial expressions. The present results specified some of the critical facial behaviours which should be taken into consideration when improving automatic landmark detectors that rely on low-level edge and intensity information.

Paper Nr.:
79
Title:
4D WARPING FOR ANALYSING MORPHOLOGICAL CHANGES IN SEED DEVELOPMENT OF BARLEY GRAINS
Author(s):
Rainer Pielot, Udo Seiffert, Bertram Manz, Diana Weier, Frank Volke and Winfriede Weschke
Abstract:
NMR imaging makes it possible to obtain 3D images of biological structures non-invasively. In this study intensity-based warping is evaluated by comparing it to landmark-based warping for a four-dimensional analysis of morphological changes in seed development of barley. The datasets of barley grains are obtained at certain development stages by NMR. Warping algorithms reconstruct intermediate stages that were not physically measured. The landmark-based procedure consists of automatic definition of landmarks and subsequent distance-weighted warping. The intensity-based approach uses iterative intensity-based warping for definition of the displacement vector field and distance-weighted volume warping for generation of the virtual intermediate dataset. The approaches were tested with four datasets of barley at different development stages. As a result, the intensity-based approach is highly applicable for analysis of morphological changes in NMR datasets and serves as a tool for an extensive 4D analysis of seed development in barley grains.

Paper Nr.:
84
Title:
A NORMALIZED PARAMETRIC DOMAIN FOR THE ANALYSIS OF THE LEFT VENTRICULAR FUNCTION
Author(s):
Jaume Garcia-Barnes, Debora Gil, Sandra Pujadas, Francesc Carreras and Manel Ballester
Abstract:
Impairment of Left Ventricular (LV) contractility due to cardiovascular diseases is reflected in LV motion patterns. The mechanics of any muscle strongly depends on the spatial orientation of its muscular fibers, since the motion that the muscle undergoes mainly takes place along the fiber. The helical ventricular myocardial band concept describes the myocardial muscle as a unique muscular band that twists in space in a non-homogeneous way. The 3D anisotropy of the ventricular band fibers suggests a regional analysis of the heart motion. Computation of normality models of such motion can help in the diagnosis of any cardiac disorder. In this paper we introduce, for the first time, a normalized parametric domain that allows comparison of the left ventricle motion across patients. We address both the extraction of the LV motion from Tagged Magnetic Resonance images and the definition of a mapping of the LV to a common normalized domain. Extraction of normality motion patterns from 17 healthy volunteers shows the clinical potential of our LV parametrization.

Paper Nr.:
92
Title:
FAST AND ROBUST LOCALIZATION OF THE HEART IN CARDIAC MRI SERIES - A Cascade of Operations for Automatically Detecting the Heart in Cine MRI Series
Author(s):
Sebastian Zambal, Andreas Schöllhuber, Katja Bühler and Jiří Hladůvka
Abstract:
This work presents a robust approach for fast initialization of an Active Appearance Model for subsequent segmentation of cardiac MRI data. The method automatically determines AAM initialization parameters: position, orientation, and scaling of the model. Four steps are carried out: (1) variance images over time are calculated to find a bounding box that roughly defines the heart region; (2) circle Hough-transformation adapted to gray values is performed to detect the left ventricle; (3) thresholding is carried out to determine the orientation of the heart; (4) the optimal initialization is selected using a mean texture model. The method was evaluated on 42 MRI short axis studies coming from two MRI scanners of two different vendors. Automatic initializations are compared to manual ones. It is shown that the proposed automatic method is much faster than and achieves results qualitatively equal to manual initialization.
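Step (1) of the abstract above, a variance image over time followed by a rough bounding box, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `temporal_variance` and `bounding_box`, the nested-list image representation and the fixed variance threshold are all assumptions made for illustration.

```python
def temporal_variance(frames):
    """Per-pixel variance over time for a list of equally sized 2D frames."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    var = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [f[r][c] for f in frames]
            mean = sum(values) / n
            var[r][c] = sum((v - mean) ** 2 for v in values) / n
    return var


def bounding_box(var, threshold):
    """Smallest box (top, left, bottom, right) around pixels whose variance
    exceeds the threshold; the threshold is a hypothetical tuning parameter."""
    hits = [(r, c) for r, row in enumerate(var)
            for c, v in enumerate(row) if v > threshold]
    if not hits:
        return None
    rs = [r for r, _ in hits]
    cs = [c for _, c in hits]
    return min(rs), min(cs), max(rs), max(cs)


# Synthetic series: static background with two pixels that change over time.
frames = []
for k in range(3):
    frame = [[10] * 5 for _ in range(4)]
    frame[1][2] = 10 + 20 * k
    frame[2][3] = 10 + 30 * (k % 2)
    frames.append(frame)

box = bounding_box(temporal_variance(frames), threshold=1.0)
```

On the synthetic series, only the two moving pixels have non-zero variance, so the box encloses exactly those two positions. Real cine MRI data would need a threshold chosen from the data and some handling of noise.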

Paper Nr.:
106
Title:
CONTENT-BASED SHAPE RETRIEVAL USING DIFFERENT AFFINE SHAPE DESCRIPTORS
Author(s):
Fatma Chaker, Faouzi Ghorbel and Mohamed Tarak Bannour
Abstract:
Shape representation is a fundamental issue in newly emerging multimedia applications. In Content Based Image Retrieval (CBIR), shape is an important low-level image feature. Many shape representations have been proposed. However, for CBIR, a shape representation should satisfy several properties such as affine invariance, robustness, compactness, low computational complexity and perceptual similarity measurement. With these properties in mind, in this paper we study and compare three shape descriptors: two derived from the Fourier method and the Affine Curvature Scale Space Descriptor (ACSSD). We build a retrieval framework to compare the descriptors in terms of robustness and retrieval performance. The retrieval performance of the different descriptors is compared using two standard shape databases. Retrieval results are given to show the comparison.

Paper Nr.:
107
Title:
MODEL BASED GLOBAL IMAGE REGISTRATION
Author(s):
Niloofar Gheissari, Mostafa Kamali, Parisa Mirshams and Zohreh Sharafi
Abstract:
In this paper, we propose a model-based image registration method capable of detecting the true transformation model between two images. We incorporate a statistical model selection criterion to choose the true underlying transformation model. Therefore, the proposed algorithm is robust to degeneracy, as any degeneracy is detected by the model selection component. In addition, the algorithm is robust to noise and outliers, since any corresponding pair that does not undergo the chosen model is rejected by a robust fitting method adapted from the literature. Another important contribution of this paper is the evaluation of a number of different model selection criteria for the image registration task. We evaluated all criteria under different levels of noise. We conclude that CAIC and GBIC slightly outperform the other criteria for this application. The next choices are GIC, SSD and MDL. Finally, we create panorama images using our registration algorithm. The panorama images show the success of this algorithm.

Paper Nr.:
115
Title:
DISPLAY REGISTRATION FOR DEVICE INTERACTION - A Proof of Principle Prototype
Author(s):
Nick Pears, Patrick Olivier and Daniel Jackson
Abstract:
A method is proposed to facilitate visually-driven interactions between two devices: the client, such as a mobile phone or personal digital assistant (PDA), which must be equipped with a camera, and the server, such as a personal computer (PC) or intelligent display. The technique that we describe here requires a camera on the client to view the display on the server, such that either the client or the server (or both) can compute exactly which part of the server display is being viewed. The server display and the client's image of the server display, which can be written onto (part of) the client's display, are then registered. This basic principle, which we call "display registration", supports a very broad range of interactions (depending on the context in which the system is operating) and it will make these interactions significantly quicker, easier and more intuitive for the user to initiate and control. In addition, either the client or the server (or both) can compute the six degree-of-freedom (6 DOF) position of the client camera with respect to the server display. We have built a prototype which proves the principle and usefulness of display registration. This system employs markers on the server display for fast registration and it has been used to demonstrate a variety of operations, such as selecting and zooming into images.

Paper Nr.:
129
Title:
NONRIGID OBJECT SEGMENTATION AND OCCLUSION DETECTION IN IMAGE SEQUENCES
Author(s):
Ketut Fundana, Niels Chr. Overgaard, Anders Heyden, David Gustavsson and Mads Nielsen
Abstract:
We address the problem of nonrigid object segmentation in image sequences in the presence of occlusion. The proposed variational segmentation method is based on a region-based active contour of the Chan-Vese model augmented with a frame-to-frame interaction term as a shape prior. The interaction term is constructed to be pose-invariant by minimizing over a group of transformations and to allow moderate deformation in the shape of the contour. The segmentation method is then coupled with a novel variational contour matching formulation between two consecutive contours which gives a mapping of the intensities from the interior of the previous contour to the next. With this information occlusions can be detected from deviations from predicted intensities and the missing intensities in the occluded areas can be reconstructed. Experimental results on synthetic and real image sequences are shown.

Paper Nr.:
130
Title:
ESTIMATION OF FACIAL EXPRESSION INTENSITY BASED ON THE BELIEF THEORY
Author(s):
Khadoudja Ghanem, Alice Caplier and Sébastien Stillittano
Abstract:
This article presents a new method to estimate the intensity of a human facial expression. Supposing an expression occurring on a face has been recognized among the six universal emotions (joy, disgust, surprise, sadness, anger, fear), the estimation of the expression's intensity is based on the determination of the degree of geometrical deformation of some facial features and on the analysis of several distances computed on skeletons of expressions. These skeletons are the result of a contour segmentation of permanent facial features (eyes, brows, mouth). The proposed method uses the belief theory for data fusion. The intensity of the recognized expression is scored on a three-point ordinal scale: "low intensity", "medium intensity" or "high intensity". Experiments on a large number of images validate our method and give a good estimation of facial expression intensity. We have implemented and tested the method on the following three expressions: joy, surprise and disgust.

Paper Nr.:
132
Title:
ENTROPY-BASED SALIENCY COMPUTATION IN LOG-POLAR IMAGES
Author(s):
Nadia Tamayo and V. Javier Traver
Abstract:
Visual saliency provides a filtering mechanism to focus on a set of interesting areas in the scene, but these mechanisms often overload the computational resources of many computer vision tasks. In order to reduce such an overload and improve the computational performance, we propose to exploit the advantages of log-polar vision to detect salient regions with economy of computational resources and quite stable results. In particular, in this paper we study the application of entropy-based saliency to log-polar images. Some interesting considerations are presented in reference to the concept of "scale" and the effects of space-variant sampling on scale selection. We also propose a necessary border extension to detect objects present in peripheral areas. The original entropy-based saliency algorithm can be used in log-polar images, but the results show that our adaptations detect log-polar salient forms with more precision because they consider the information redundancy of space-variant sampling. Compared with Cartesian images, log-polar salient results allow a significant saving of computational resources.

Paper Nr.:
133
Title:
LEARNING A WARPED SUBSPACE MODEL OF FACES WITH IMAGES OF UNKNOWN POSE AND ILLUMINATION
Author(s):
Jihun Hamm and Daniel D. Lee
Abstract:
In this paper we tackle the problem of learning the appearances of a person's face from images with both unknown pose and illumination. The unknown, simultaneous change in pose and illumination makes it difficult to learn 3D face models from data without manual labeling and tracking of features. In comparison, image-based models do not require geometric knowledge of faces but only the statistics of data itself, and therefore are easier to train with images with such variations. We take an image-based approach to the problem and propose a generative model of a warped illumination subspace. Image variations due to illumination change are accounted for by a low-dimensional linear subspace, whereas variations due to pose change are approximated by a geometric warping of images in the subspace. We demonstrate that this model can be efficiently learned via MAP estimation and multiscale registration techniques. With this learned warped subspace we can jointly estimate the pose and the lighting conditions of test images and improve recognition of faces under novel poses and illuminations. We test our algorithm with synthetic faces and real images from the CMU PIE and Yale face databases. The results show improvements in prediction and recognition performance compared to other standard methods.

Paper Nr.:
136
Title:
ADDING COLOR TO GEODESIC INVARIANT FEATURES
Author(s):
Pier Paolo Campari, Matteo Matteucci and Davide Migliore
Abstract:
Geodesic invariant features were originally proposed to build a new local feature descriptor invariant not only to affine transformations, but also to general deformations. The aim of this paper is to investigate the possible improvements given by the use of color information in this kind of descriptor. We introduced color information both in geodesic feature construction and description. At the feature construction level, we extended the fast marching algorithm to use color information; at the description level, we tested several color spaces on real data and identified the opponent color space as a useful complement to intensity information. The experiments used to validate our theory are based on publicly available data and show the improvement, in precision and recall, with respect to the original intensity-based geodesic features. We also compared this kind of feature, on affine and non-affine transformations, with SIFT, steerable filters, moment invariants, spin images and GIH.

Paper Nr.:
145
Title:
POISSON LOCAL COLOR CORRECTION FOR IMAGE STITCHING
Author(s):
Mohammad Amin Sadeghi, Seyyed Mohammad Mohsen Hejrati and Niloofar Gheissari
Abstract:
A new method for seamless image stitching is presented. The proposed algorithm is a hybrid method which uses optimal seam methods and smooths the intensity transition between two images by color correction. A dynamic programming algorithm that finds an optimal seam along which gradient disparities are minimized is used. A modification of Poisson image editing is utilized to correct color differences between two images. Different boundary conditions for the Poisson equation were investigated and tested, and mixed boundary conditions generated the most accurate results. To evaluate and compare the proposed method with competing ones, a large image database consisting of more than two hundred image pairs was created. The test image pairs are taken at different lighting conditions, scene geometries and camera positions. On this database the proposed approach tested favorably compared to standard methods and has been shown to be very effective in producing visually acceptable images.
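The dynamic programming seam search mentioned in the abstract above can be sketched generically as follows. This is an illustrative minimal version, not the authors' algorithm: it assumes the overlap region has already been reduced to a 2D cost grid (for example, a per-pixel gradient disparity), that the seam runs top to bottom, and that it may shift by at most one column per row. The function name `optimal_vertical_seam` is hypothetical.

```python
def optimal_vertical_seam(cost):
    """Return (seam, total), where seam[r] is the column chosen in row r and
    total is the summed cost of the cheapest top-to-bottom 8-connected path."""
    rows, cols = len(cost), len(cost[0])
    # acc[r][c] = cheapest cost of any seam ending at (r, c).
    acc = [list(cost[0])]
    for r in range(1, rows):
        prev = acc[-1]
        row = []
        for c in range(cols):
            lo, hi = max(0, c - 1), min(cols - 1, c + 1)
            row.append(cost[r][c] + min(prev[lo:hi + 1]))
        acc.append(row)
    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda j: acc[-1][j])
    total = acc[-1][c]
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(0, c - 1), min(cols - 1, c + 1)
        c = min(range(lo, hi + 1), key=lambda j: acc[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam, total


# Toy cost grid: the cheap cells form a zig-zag path through the middle.
cost = [
    [5, 1, 5],
    [5, 5, 1],
    [5, 1, 5],
]
seam, total = optimal_vertical_seam(cost)
```

For the toy grid the seam follows the three cells of cost 1. In a stitching pipeline the two images would be joined along the returned columns before any color correction is applied.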

Paper Nr.:
147
Title:
A FRAMEWORK FOR ANALYZING TEXTURE DESCRIPTORS
Author(s):
Timo Ahonen and Matti Pietikäinen
Abstract:
This paper presents a new unified framework for texture descriptors such as Local Binary Patterns (LBP) and Maximum Response 8 (MR8) that are based on histograms of local pixel neighborhood properties. This framework is enabled by a novel filter based approach to the LBP operator which shows that it can be seen as a special filter based texture operator. Using the proposed framework, the filters to implement LBP are shown to be both simpler and more descriptive than MR8 or Gabor filters in the texture categorization task. It is also shown that when the filter responses are quantized for histogram computation, codebook based vector quantization yields slightly better results than threshold based binning at the cost of higher computational complexity.
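For readers unfamiliar with the LBP operator discussed in the abstract above, a minimal sketch of the basic 8-neighbour variant and its histogram is given here. It illustrates the standard thresholding formulation only, not the filter-based formulation proposed in the paper; the bit ordering (clockwise from the top-left neighbour) and the function names are assumptions made for illustration.

```python
# Neighbour offsets, clockwise starting at the top-left pixel (assumed order).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic 8-bit LBP code: bit i is set when neighbour i >= centre pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Bright top row, dark elsewhere: only the three upper neighbours pass.
patch = [
    [9, 9, 9],
    [0, 5, 0],
    [0, 0, 0],
]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
```

The histogram, rather than the individual codes, is what serves as the texture descriptor in approaches of this kind.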

Paper Nr.:
148
Title:
INNER LIP SEGMENTATION BY COMBINING ACTIVE CONTOURS AND PARAMETRIC MODELS
Author(s):
Sebastien Stillittano and Alice Caplier
Abstract:
Lip reading applications require accurate information about lip movement and shape, and both outer and inner contours are useful. In this paper, we introduce a new method for inner lip segmentation. From the outer lip contour given by a preexisting algorithm, we use some key points to initialize an active contour called “jumping snake”. According to some optimal information of luminance and chrominance gradient, this active contour fits the position of two parametric models: the first, composed of two cubic curves and a broken line, for a closed mouth, and the second, composed of four cubic curves, for an open mouth. These parametric models give a flexible and accurate final inner lip contour. Finally, we present several experimental results demonstrating the effectiveness of the proposed algorithm.

Paper Nr.:
168
Title:
RECOGNITION OF DYNAMIC VIDEO CONTENTS BASED ON MOTION TEXTURE STATISTICAL MODELS
Author(s):
Tomas Crivelli, Bruno Cernushi-Frias, Patrick Bouthemy and Jian-feng Yao
Abstract:
The aim of this work is to model, learn and recognize dynamic contents in video sequences, displayed mostly by natural scene elements such as rivers, smoke, moving foliage, fire, etc. We adopt the mixed-state Markov random field modeling recently introduced to represent so-called motion textures. The approach consists in describing the spatial distribution of some motion measurements which exhibit values of two types: a discrete component related to the absence of motion and a continuous part for measurements different from zero. Based on this, we present a method for recognition and classification of real motion textures using the generative statistical models that can be learned for each motion texture class. Experiments on sequences from the DynTex dynamic texture database demonstrate the performance of this novel approach.

Paper Nr.:
177
Title:
A FAST AND ROBUST METHOD FOR VOLUMETRIC MRI BRAIN EXTRACTION
Author(s):
Sami Bourouis and Kamel Hamrouni
Abstract:
This paper presents a method for magnetic resonance imaging (MRI) segmentation and the extraction of main brain tissues. The method uses an image processing technique based on level-set approach and EM-algorithm. The paper describes the main features of the method, and presents experimental results with real volumetric images in order to evaluate the performance of the method.

Paper Nr.:
178
Title:
MULTIRESOLUTION MESH SEGMENTATION OF MRI BRAIN USING CLASSIFICATION AND DISCRETE CURVATURE
Author(s):
Sami Bourouis, Kamel Hamrouni and Mounir Dhibi
Abstract:
This paper presents a method for brain tissue segmentation and characterization of magnetic resonance imaging (MRI) scans. It is based on statistical classification, differential geometry, and multiresolution representation. The Expectation Maximization algorithm and k-means clustering are applied to generate an initial mask of tissue classes of the data volume. Then, we generate a hierarchical multiresolution representation of each object. The idea is that the low-resolution description is used to determine constraints for the segmentation at the higher resolutions. Thus, our contribution is the design of a pipeline procedure for brain characterization/labeling by using discrete curvature and multiresolution representation. We have tested our method on several MRI data sets.
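The k-means step mentioned in the abstract above can be illustrated with a small sketch that clusters voxel intensities into tissue classes. This is a generic one-dimensional k-means under stated assumptions, not the authors' pipeline: intensity is the only feature, the initial centres are supplied by hand, and the EM refinement is omitted. The function name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(values, centers, iterations=20):
    """Plain 1D k-means: returns (final centres, label per value)."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        groups = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Update step: centres move to the mean of their group;
        # an empty group keeps its previous centre.
        new_centers = [sum(g) / len(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [min(range(len(centers)), key=lambda i: abs(v - centers[i]))
              for v in values]
    return centers, labels


# Toy intensities forming three well-separated groups.
intensities = [10, 12, 11, 50, 52, 51, 90, 91, 92]
centers, labels = kmeans_1d(intensities, centers=[0, 50, 100])
```

The resulting labels would form the initial class mask; the choice of three classes and of the starting centres is purely illustrative.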

Paper Nr.:
182
Title:
WAVELET TRANSFORM FOR PARTIAL SHAPE RECOGNITION USING SUB-MATRIX MATCHING
Author(s):
El-hadi Zahzah
Abstract:
In this paper, we propose a method for 2D partial shape recognition under affine transform using the translation-invariant discrete dyadic wavelet transform, well known as the Stationary Wavelet Transform (SWT). The main problem of this type of transform is its dependence on the signal starting point, since the same signal may have several representations depending on the starting point. The choice of the starting point is then necessary to match two descriptors. Moreover, the contours must be closed, which is not realistic, generally because of the image quality and the methods of contour extraction. Recently, we proposed a 2D shape recognition method based on the Discrete Wavelet Transform. This method was applied to contours represented by closed curves. The method we propose in this paper addresses partial shape matching based on contour representation using the wavelet transform. A technique of sub-matrix matching is then used to match partial shapes.

Paper Nr.:
197
Title:
INDEX, MIDDLE, AND RING FINGER EXTRACTION AND IDENTIFICATION BY INDEX, MIDDLE, AND RING FINGER OUTLINES
Author(s):
Ching-Liang Su
Abstract:
In this study, a new technique is used to extract the index, middle and ring finger outlines. The orientations and geometrical features of these outlines are calculated and compared to identify different individuals. The techniques of database SQL searching and manipulation, image dilation, object position locating, image shifting, rotation, and interpolation are used to recognize different individuals. The hand was fixed each time a photograph was taken, so one can assume that each acquired hand image was the same as the previous one. Since the photographs are the same, after the index, middle or ring fingers have been extracted from the hand image, the acquired images can be used to identify different persons.

Paper Nr.:
202
Title:
IMAGE PROCESSING IN MATERIAL ANALYSES OF ARTWORKS
Author(s):
Miroslav Beněs, Barbara Zitová, Janka Hradilová and David Hradil
Abstract:
In this paper we present Nephele, a system for processing, describing and archiving material analyses used during art restoration. The aim of the material analyses of painting layers is to identify inorganic and organic compounds using microanalytical methods, and to describe the stratigraphy and morphology of layers. The results are used to interpret the applied painting technique. The Nephele system is a database system for material analysis reports, extended with image preprocessing modules and an image retrieval facility. The implemented digital image processing methods are image registration, layer segmentation, and grain segmentation. In the archiving part of Nephele, in addition to the traditional database functions, we have incorporated image-based retrieval methods. They are based on feature descriptions such as Haralick descriptors of co-occurrence matrices or features computed using the wavelet decomposition of the images. The presented examples of achieved results show the applicability of the system.

Paper Nr.:
211
Title:
CONTENT-BASED IMAGE RETRIEVAL USING GENERIC FOURIER DESCRIPTOR AND GABOR FILTERS
Author(s):
Quan He, ZhengQiao Ji and Q. M. Jonathan Wu
Abstract:
Content-based image retrieval (CBIR) is an important research area with applications to large image databases and multimedia information. CBIR relies on three general kinds of visual content: color, texture and shape. The focus of this paper is on the problem of shape and texture feature extraction and representation for CBIR. We apply the Generic Fourier Descriptor (GFD) for shape feature extraction and Gabor Filters (GF) for texture feature extraction, and we successfully combine GFD and GF for joint shape and texture feature extraction. Experimental results show that the proposed GFD+GF is robust on all the test databases and achieves the best retrieval rate.

Paper Nr.:
232
Title:
ON THE IMPROVEMENT OF THE TOPOLOGICAL ACTIVE VOLUMES MODEL: A Tetrahedral Approach
Author(s):
N. Barreira, M. G. Penedo, M. Ortega and J. Rouco
Abstract:
The Topological Active Volumes model is a 3D active model focused on segmentation and reconstruction tasks. The segmentation process is based on the adjustment of a 3D mesh composed of polyhedra. This adjustment is guided by the minimisation of several energy functions related to the mesh. Even though the original cubic mesh achieves good segmentation results, it has difficulties in some cases due to its shape. This paper proposes a new topology for the TAV mesh based on tetrahedra that overcomes the cubic mesh difficulties. Also, the paper explains an improvement in the tetrahedral topology to increase the accuracy of the results as well as the efficiency of the overall process.

Paper Nr.:
242
Title:
MULTIREGION GRAPH CUT IMAGE SEGMENTATION
Author(s):
Mohamed Ben Salah, Ismail Ben Ayed and Amar Mitiche
Abstract:
Graph cut image segmentation, which solves a labeling problem by combinatorial optimization, has been applied successfully to a variety of images. For common objective functions, graph cut methods run significantly faster than level set methods. However, because they assign a grey level label to each pixel from the set of all possible grey levels, they lead to an implicit partition of the image domain which is generally an oversegmentation. The purpose of our study is two-fold: (1) investigate an image segmentation method which combines parametric modeling of the image data and graph cut combinatorial optimization, and (2) use a prior which allows the number of labels/regions to decrease when the number of regions is not known and the algorithm is initialized with a larger number. Experimental verification shows that the method results in good segmentations and runs faster than conventional graph cut methods.

Paper Nr.:
243
Title:
ACTIVE APPEARANCE MODEL (AAM) - From Theory to Implementation
Author(s):
Aleksandra Pizurica, Nikzad Babaii Rizvandi and Wilfried Philips
Abstract:
The Active Appearance Model (AAM) is a powerful object modeling technique and one of the best available in computer vision and computer graphics. This approach is, however, quite complex, and various parts of its implementation were addressed separately by different researchers in several recent works. In this paper, we systematically present a full implementation of the AAM with pseudocode for the crucial steps in the construction of this model.

Paper Nr.:
245
Title:
ENHANCED PHASE–BASED DISPLACEMENT ESTIMATION - An Application to Facial Feature Extraction and Tracking
Author(s):
Mohamed Dahmane and Jean Meunier
Abstract:
In this work, we develop a multi-scale approach for automatic facial feature detection and tracking. The method is based on a coarse to fine paradigm to characterize a set of facial fiducial points using a bank of Gabor filters that have interesting properties such as directionality, scalability and hierarchy. When the first face image is captured, a trained grid is used on the coarsest level to estimate a rough position for each facial feature. Afterward, a refinement stage is performed from the coarsest to the finest (original) image level to get accurate positions. These are then tracked over the subsequent frames using a modification of a fast phase–based technique. Experimental results show that facial features can be localized with high accuracy and that their tracking can be kept during long periods of free head motion.

Paper Nr.:
257
Title:
NOVEL TECHNIQUES FOR AUTOMATICALLY ENHANCED VISUALIZATION OF CORONARY ARTERIES IN MSCT DATA AND FOR DRAWING DIRECT COMPARISONS TO CONVENTIONAL ANGIOGRAPHY
Author(s):
Marion Jähne, Christina Lacalli and Stefan Wesarg
Abstract:
The new generation of multi-slice computed tomography (MSCT) scanners enables the radiologist to assess the coronary arteries in a non-invasive way. The question of particular interest is whether the quality of the findings based on MSCT data can compete with the gold standard - the coronary angiography. In this work we present novel automated methods for a reliable visualization of coronary arteries and for drawing direct visual side-by-side comparisons to conventional angiograms. Our approach comprises a new method for automatically extracting the heart from cardiac CT data and an advanced masking method for eliminating large cardiac cavities to obtain a better visibility of the coronary arteries in the rendered CT data. For drawing direct side-by-side comparisons we present a novel approach for simulating the conventional coronary angiography in an easy-to-handle manner. The new methods have been developed for and tested with contrast-enhanced cardiac CT datasets.

Paper Nr.:
296
Title:
DEPTH-BASED DETECTION OF SALIENT MOVING OBJECTS IN SONIFIED VIDEOS FOR BLIND USERS
Author(s):
Benoît Deville, Guido Bologna, Michel Vinckenbosch and Thierry Pun
Abstract:
The context of this work is the development of a mobility aid for visually impaired persons. We present here an original approach for a real time alerting system, based on the use of feature maps for detecting visual salient parts in images. In order to improve the quality of this method, we propose here to benefit from a new feature map constructed from the depth gradient. A specific distance function is described, which takes into account both stereoscopic camera limitations and user's choices. We demonstrate here that this additional depth-based feature map allows the system to detect the salient regions with good accuracy in most situations, even with noisy disparity maps.

Paper Nr.:
294
Title:
REDUCING THE EFFECT OF PARTIAL OCCLUSIONS
Author(s):
Meryem Erbilek and Önsen Toygar
Abstract:
The difficulty in human identification by iris recognition is that the captured iris images may be occluded by the eyelids and eyelashes. In that case, recognition of occluded iris patterns becomes hard and the corresponding person may not be correctly recognized. In order to reduce the effect of eyelid or eyelash occlusion on the recognition of human beings by their iris patterns, we propose a simple and efficient method for iris recognition that uses specific regions of the iris images, without applying the traditional preprocessing approach before feature extraction. First, these regions are evaluated individually; then the outputs for each region are combined using a multiple classifier combination method, with Principal Component Analysis (PCA) as the feature extraction method. The experiments on iris images, with and without occlusions, demonstrate that the proposed approach achieves better recognition rates than the holistic approaches.

Paper Nr.:
305
Title:
EVALUATION OF LOCAL ORIENTATION FOR TEXTURE CLASSIFICATION
Author(s):
Dana Elena Ilea, Ovidiu Ghita and Paul F. Whelan
Abstract:
The aim of this paper is to present a study where we evaluated the optimal inclusion of the texture orientation in the classification process. In this paper the orientation for each pixel in the image is extracted using the partial derivatives of the Gaussian function, and the main focus of our work is the evaluation of the effect of the local dominant orientation (which is calculated by combining the magnitude and local orientations) on the classification results. While the dominant orientation of the texture depends strongly on the observation scale, in this paper we propose to evaluate the macro-texture by calculating the distribution of the dominant orientations for all pixels in the image that sample the texture at micro-level. Experiments were conducted on standard texture databases, and the results indicate that the dominant orientation calculated at micro-level is an appropriate measure for texture description.

Paper Nr.:
306
Title:
AUTOMATIC SHOT BOUNDARY DETECTION USING GAUSSIAN MIXTURE MODEL
Author(s):
A. Adhipathi Reddy and Sridhar Varadharajan
Abstract:
The basic step for video analysis is the detection of shots in a given video. A shot is a sequence of frames captured in a single continuous action in time and space using a single camera. The boundary between two adjacent shots may be an abrupt change (hard cut) or a gradual change. In the literature, many shot boundary detection algorithms have been proposed for detecting hard cuts or gradual changes such as fade-in/out and dissolve. The performance of these algorithms degrades with zooming, changing lighting conditions, and fast-moving videos. In this paper, a novel algorithm based on the Gaussian Mixture Model (GMM) is developed for shot boundary detection. The behavior of the GMM under abrupt and gradual changes is used for the detection of hard cuts, fade-in/out and dissolve. Experimental results show the credibility of the proposed algorithm under zooming, changing lighting conditions, and fast-moving videos.

Paper Nr.:
313
Title:
MEAN SHIFT SEGMENTATION - Evaluation of Optimization Techniques
Author(s):
Jens N. Kaftan, André A. Bell and Til Aach
Abstract:
The mean shift algorithm is a powerful clustering technique, which is based on an iterative scheme to detect modes in a probability density function. It has been utilized for image segmentation by seeking the modes in a feature space composed of spatial and color information. Although the modes of the feature space can be efficiently calculated in that scheme, different optimization techniques have been investigated to further improve the calculation speed. Besides those techniques that improve the efficiency using specialized data structures, there are others which take advantage of heuristics and therefore affect the accuracy of the algorithm output. In this paper we discuss and evaluate different optimization strategies for mean shift based image segmentation. These optimization techniques are quantitatively evaluated on different real world images. We compare segmentation results of heuristic-based, performance-optimized implementations with the segmentation result of the original mean shift algorithm as a gold standard. To this end, we utilize different partition distance measures, identifying corresponding regions and analyzing the differences thus revealed.
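As a rough illustration of the iterative mode-seeking scheme that the abstract refers to, the following minimal sketch runs mean shift on one-dimensional samples with a flat (uniform) kernel. The function name, the kernel choice and the bandwidth value are assumptions made for illustration; none of the optimized variants evaluated in the paper is reproduced here.

```python
def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=100):
    """Follow the mean shift iteration from `start` to a mode (1-D, flat kernel)."""
    x = start
    for _ in range(max_iter):
        # Samples that fall inside the window centred on the current estimate.
        window = [p for p in points if abs(p - x) <= bandwidth]
        if not window:
            return x  # empty window: nothing to shift towards
        new_x = sum(window) / len(window)  # shift to the local mean
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Two well-separated clusters (hypothetical feature values).
samples = [0.0, 0.2, -0.2, 10.0, 10.2, 9.8]
mode_low = mean_shift_mode(samples, start=0.5, bandwidth=1.0)
mode_high = mean_shift_mode(samples, start=9.0, bandwidth=1.0)
```

In a segmentation setting the samples would be joint spatial and color feature vectors, and pixels whose iterations end at the same mode would be assigned to the same region.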

Paper Nr.:
315
Title:
A ROBUST AND EFFICIENT METHOD FOR TOPOLOGY ADAPTATIONS IN DEFORMABLE MODELS
Author(s):
Jochen Abhau
Abstract:
In this paper, we present a novel algorithm for calculating topological adaptations in explicit evolutions of surface meshes in 3D. Our topological adaptation system consists of two main ingredients: A spatial hashing technique is used to detect mesh self-collisions during the evolution. Its expected running time is linear with respect to the number of vertices. A database consisting of possible topology changes is developed in the mathematical framework of homology theory. This database allows for fast and robust topology adaptation during a mesh evolution. The algorithm works without mesh reparametrizations, global mesh smoothness assumptions or vertex sampling density conditions, making it suitable for robust, near real-time application. Furthermore, it can be integrated into existing mesh evolutions easily. Numerical examples from medical imaging are given.

Paper Nr.:
319
Title:
ESTIMATING CAMERA ROTATION PARAMETERS FROM A BLURRED IMAGE
Author(s):
Giacomo Boracchi, Vincenzo Caglioti and Alberto Danese
Abstract:
A fast rotation of the camera during image acquisition results in a blurred image, which typically shows curved smears. We propose an algorithm for estimating both the camera rotation axis and the camera angular speed from a single blurred image. The algorithm is based on local analysis of the blur smears. Contrary to the existing methods in the literature, we treat the more general case where the rotation axis need not be orthogonal to the image plane, taking into account the perspective effects that in such a case affect the smears. The algorithm is validated in experiments with synthetic and real blurred images, providing accurate estimates in both cases.

Paper Nr.:
332
Title:
LATTICE EXTRACTION BASED ON SYMMETRY ANALYSIS
Author(s):
Manuel Agustí-Melchor, Jose-Miguel Valiente-González and Ángel Rodas-Jordá
Abstract:
In many computer tasks it is necessary to structurally describe the contents of images for further processing, for example, in regular images produced in industrial processes such as textiles or ceramics. After reviewing the different approaches found in the literature, this work redefines the problem of periodicity in terms of the existence of local symmetries. Phase symmetry analysis is chosen to obtain these symmetries because of its robustness when dealing with image contrast and noise. Also, the multiresolution nature of the technique offers independence from using fixed thresholds to segment the image. Our adaptation of the original technique, based on lattice constraints, has resulted in a parameter-free algorithm for determining the lattice. It offers a significant increase in computational speed with respect to the original proposal. Given that there is no set of images for assessing this type of technique, various sets of images have been used, and a proposal to create more images for evaluating algorithms related to this task is presented. A measure to enable the evaluation of results is also introduced, so that each calculated lattice can be tagged with an index regarding its correctness. The experiments show that, using this statistic, good results are obtained on the image collections. Possible applications of the lattice extraction are suggested.

Paper Nr.:
334
Title:
BUILDING A NORMALITY SPACE OF EVENTS - A PCA Approach to Event Detection
Author(s):
Angelo Cenedese, Ruggero Frezza, Enrico Campana, Giambattista Gennari and Giorgio Raccanelli
Abstract:
The detection of events in video streams is a central task in the automatic vision paradigm, and spans heterogeneous fields of application, from the surveillance of the environment to the analysis of scientific data. Although well captured by intuition, the very definition of an event is somewhat hazy and depends on the specific application of interest. In this work, the approach to the problem of event detection is different in nature. Instead of defining the event and searching for it within the data, a normality space of the scene is built from a chosen learning sequence (which represents the only input from the human operator). The event detection algorithm works by projecting any newly acquired image onto the normality space so as to calculate a distance from it that represents the innovation of the new frame and defines the metric for triggering an event alert. The algorithm has been validated in real life situations, in indoor and outdoor environments, and presents appealing features in terms of robustness to natural motions and weather conditions.
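The projection step described above can be sketched as follows, under strong simplifying assumptions: frames are tiny flattened intensity vectors, the normality space is spanned by a single principal axis found by power iteration, and the innovation is the Euclidean norm of the residual. The helper names and the toy data are hypothetical; the abstract does not state how many components or which distance the authors use.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def learn_normality_space(frames, iters=200):
    """Return (mean, axis): the mean frame and the first principal axis."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(d)]
    centred = [[f[j] - mean[j] for j in range(d)] for f in frames]
    # Power iteration on the covariance, applied implicitly as X^T (X v).
    axis = [1.0] * d
    for _ in range(iters):
        scores = [dot(c, axis) for c in centred]
        new_axis = [sum(s * c[j] for s, c in zip(scores, centred)) for j in range(d)]
        norm = math.sqrt(dot(new_axis, new_axis))
        if norm == 0.0:
            break  # no variance along the current direction
        axis = [a / norm for a in new_axis]
    norm = math.sqrt(dot(axis, axis))
    return mean, [a / norm for a in axis]


def innovation(frame, mean, axis):
    """Distance between a frame and its projection onto the normality space."""
    centred = [x - m for x, m in zip(frame, mean)]
    score = dot(centred, axis)
    residual = [c - score * a for c, a in zip(centred, axis)]
    return math.sqrt(dot(residual, residual))


# Learning sequence: the first two "pixels" vary together, the third is static.
learning = [[1.0, 1.0, 5.0], [2.0, 2.0, 5.0], [3.0, 3.0, 5.0], [4.0, 4.0, 5.0]]
mean, axis = learn_normality_space(learning)
normal_score = innovation([5.0, 5.0, 5.0], mean, axis)  # same kind of variation
event_score = innovation([2.5, 2.5, 9.0], mean, axis)   # static pixel changes
```

An alert would be raised when the innovation exceeds a threshold; how that threshold is chosen is left open here.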

Paper Nr.:
337
Title:
A SUBJECTIVE SURFACES BASED SEGMENTATION FOR THE RECONSTRUCTION OF BIOLOGICAL CELL SHAPE
Author(s):
Matteo Campana, Cecilia Zanella, Barbara Rizzi, Paul Bourgine, Nadine Peyriéras and Alessandro Sarti
Abstract:
Confocal laser scanning microscopy provides nondestructive in vivo imaging to capture specific structures that have been fluorescently labeled, such as cell nuclei and membranes, throughout early Zebrafish embryogenesis. With this strategy we aim to reconstruct, in time and space, the biological structures of the embryo during organogenesis. In this paper we propose a method to extract bounding surfaces at the cellular-organization level from microscopy images. The shape reconstruction of membranes and nuclei is obtained first with an automatic identification of the cell center, and then a subjective surfaces based segmentation is used to extract the bounding surfaces.

Paper Nr.:
340
Title:
CHARACTERISATION AND AUTOMATIC DETECTION OF LYMPH NODES ON MR COLORECTAL IMAGES
Author(s):
Jeong-Gyoo Kim and J. Michael Brady
Abstract:
Colorectal cancer is the second most common cause of death in Western countries. It is often curable by chemoradiotherapy and/or surgery; however, accurate staging has a significant impact on patient management and outcome. Numerous clinical reports attest to the fact that staging is not currently satisfactory, and so more precise methods are required for effective treatment. The three major components of disease staging are tumour size; whether or not there is distal metastatic spread; and the extent of lymph node involvement. Of these, the last is currently by far the hardest to quantify, and it is the subject of this paper. Lymph nodes are distributed throughout the mesorectal fascia that envelops the colorectum. In practice, they are detected and assessed by clinicians using properties such as their size and shape. We are not aware of any previous image analysis approach for colorectal images that makes this subjective approach more scientific. To aid precise staging and surgery, we have developed methods that characterise lymph nodes by extracting implicit properties computed from magnetic resonance colorectal images. We first learn the probability density function (PDF) of the intensities of the mesorectal fascia and find that it closely approximates a Gaussian distribution. The parameters of a Gaussian, fitted to the PDF, were estimated and the mean intensity of a lymph node candidate was compared with it. The fitting provides an explicit criterion for a region to be classed as a lymph node: namely, it is an outlier of the Gaussian distribution. As a key part of this process, we need to segment the boundaries of the mesorectal fascia, which is enclosed by two closed contours. Clinicians recognise the outer contour as thin edges. Since the thin edges are often ambiguous and disconnected, differentiating them from neighbouring tissues is a non-trivial problem; the surrounding tissues have no significant difference from the mesorectal fascia in either intensity or texture. We employed a level set method to segment three sets of objects: the mesorectal fascia, the colorectum, and lymph node candidates. Our segmentation results led us to build a PDF and to use it for the criterion that we propose. The whole implementation of our methods is automatic, including the lookup of lymph node candidates. The results of clinical cases are summarised in the paper.
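The outlier criterion described above can be illustrated with a minimal sketch: fit a Gaussian to the fascia intensities and flag a candidate region whose mean intensity lies far from the fitted mean. The function name, the sample intensities and the cut-off of k = 3 standard deviations are assumptions for illustration only; the abstract does not specify the threshold used.

```python
import statistics


def is_lymph_node_candidate(fascia_intensities, candidate_mean, k=3.0):
    """Return True if the candidate's mean intensity is a Gaussian outlier."""
    mu = statistics.mean(fascia_intensities)      # fitted Gaussian mean
    sigma = statistics.stdev(fascia_intensities)  # fitted Gaussian std. dev.
    # k is an assumed cut-off, not a value taken from the paper.
    return abs(candidate_mean - mu) > k * sigma


# Hypothetical intensities sampled from the segmented mesorectal fascia.
fascia = [98, 101, 100, 99, 102, 100, 97, 103]
outlier = is_lymph_node_candidate(fascia, candidate_mean=120)
inlier = is_lymph_node_candidate(fascia, candidate_mean=103)
```

With these sample values the fitted mean is 100 and the standard deviation is 2, so a region with mean intensity 120 is flagged while one with mean intensity 103 is not.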

Paper Nr.:
355
Title:
SPECKLE MODELIZATION IN OCT IMAGES FOR SKIN LAYERS SEGMENTATION
Author(s):
Ali Mcheik, Clovis Tauber, Hadj Batatia, Jerome George and Jean-Michel Lagarde
Abstract:
In dermatology, optical coherence tomography (OCT) is used to visualize the skin to a depth of a few millimetres. These images are affected by speckle, which can alter the interpretation, but which also carries information that locally characterizes the visualized tissue. In this paper, we present a statistical study of the speckle distribution in OCT images. The capability of three probability density functions (pdf) (Rayleigh, Lognormal, and Nakagami) to differentiate the speckle distribution according to the skin layer is analysed. For each pdf, the vector of parameters, estimated over several images annotated by experts, is mapped onto a parameter space. Quantitative results over 30 images are compared to the manual delineations of 5 experts. Results confirm the potential of the method for the segmentation of the layers of the skin.

Paper Nr.:
362
Title:
INCORPORATING A NEW RELATIONAL FEATURE IN ARABIC ONLINE HANDWRITTEN CHARACTER RECOGNITION
Author(s):
Sara Izadi and Ching Y. Suen
Abstract:
Artificial neural networks have shown good performance in classification tasks. However, models used for learning in pattern classification are challenged when the differences between the patterns of the training set are small. Therefore, the choice of effective features is mandatory for obtaining good performance. Statistical and geometrical features alone are not suitable for recognition of hand printed characters due to variations in writing styles that may result in deformations of character shapes. We address this problem by using a relational context feature combined with a local descriptor for training a neural network-based recognition system in a user-independent online character recognition application. Our feature extraction approach provides a rich representation of the global shape characteristics in a considerably compact form. This new relational feature generally provides higher distinctiveness, which increases robustness with respect to character deformations and potentially increases the recognition rate in a user-independent system. While enhancing the recognition accuracy, the feature extraction is computationally simple. We show that the ability to discriminate between Arabic handwritten characters is increased by adopting this mechanism, which provides input to the feed-forward neural network architecture. Our experiments on Arabic character recognition show results comparable with state-of-the-art methods for online recognition of these characters.

Paper Nr.:
368
Title:
PROJECTIVE IMAGE ALIGNMENT BY USING ECC MAXIMIZATION
Author(s):
Georgios D. Evangelidis and Emmanouil Z. Psarakis
Abstract:
The nonlinear projective transformation provides exactly the number of parameters needed to account for all possible camera motions, which makes it a natural choice for problems where the objective is the alignment of two or more image profiles. Moreover, the ability of an alignment algorithm to quickly and accurately estimate the parameter values of the geometric transformation, even in cases of over-modelling of the warping process, constitutes a basic requirement for many computer vision applications. In this paper the appropriateness of the Enhanced Correlation Coefficient (ECC) function as a performance criterion in the projective image registration problem is investigated. Since this measure is a highly nonlinear function of the warp parameters, its maximization is achieved by using an iterative technique. The main theoretical results concerning the nonlinear optimization problem and an efficient approximation that leads to an optimal closed form solution (per iteration) are presented. The performance of the iterative algorithm is compared against the well known Lucas-Kanade algorithm with the help of a series of experiments involving strong or weak geometric deformations, ideal and noisy conditions and even over-modelling of the warping process. In all cases the ECC-based algorithm exhibits better behavior in speed, as well as in the probability of convergence, compared to the Lucas-Kanade scheme.
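To make the criterion concrete, the sketch below evaluates a correlation coefficient between two intensity profiles after removing their means and normalising them to unit length. This assumes, as the name suggests, that the ECC measure is of this zero-mean normalised form; the function name and the sample data are hypothetical, and the iterative maximization over warp parameters described in the abstract is not shown.

```python
import math


def ecc(reference, warped):
    """Correlation coefficient of the zero-mean, unit-norm intensity vectors."""
    def normalise(values):
        mean = sum(values) / len(values)
        centred = [v - mean for v in values]
        norm = math.sqrt(sum(c * c for c in centred))
        if norm == 0.0:
            raise ValueError("constant image profile has no contrast")
        return [c / norm for c in centred]

    r = normalise(reference)
    w = normalise(warped)
    return sum(a * b for a, b in zip(r, w))


profile = [10.0, 20.0, 15.0, 40.0, 35.0]
brighter = [2.0 * v + 5.0 for v in profile]  # gain and bias change only
shifted = profile[1:] + profile[:1]          # crude stand-in for misalignment
score_photometric = ecc(profile, brighter)
score_misaligned = ecc(profile, shifted)
```

Because of the mean removal and normalisation, a pure gain and bias change leaves the score at its maximum of 1, whereas a geometric misalignment lowers it; an alignment algorithm would search for the warp parameters that maximize this score.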

Paper Nr.:
369
Title:
PERFORMANCE EVALUATION OF ROBUST MATCHING MEASURES
Author(s):
Federico Tombari, Luigi Di Stefano, Stefano Mattoccia and Angelo Galanti
Abstract:
This paper is aimed at evaluating the performance of different measures which have been proposed in the literature for robust matching. In particular, classical matching metrics typically employed for this task are considered together with specific approaches aiming at achieving robustness. The main aspects assessed by the proposed evaluation are robustness with respect to photometric distortions, noise and occluded patterns. Specific datasets have been used for testing, which provide a very challenging framework with respect to the considered disturbance factors and can also serve as a testbed for the evaluation of future robust visual correspondence measures.

Paper Nr.:
373
Title:
HEAD POSE ESTIMATION IN FACE RECOGNITION ACROSS POSE SCENARIOS
Author(s):
M. Saquib Sarfraz and Olaf Hellwich
Abstract:
We present a robust front-end pose classification/estimation procedure to be used in face recognition scenarios. A novel discriminative feature description is proposed that encodes the underlying shape well and is insensitive to illumination and other common variations in facial appearance, such as skin colour. Using such features we generate a pose similarity feature space (PSFS) that turns the multi-class problem into a two-class one by using inter-pose and intra-pose similarities. A new classification procedure is laid down which models this feature space and copes well with discriminating between nearest poses. For a test image it outputs a measure of confidence, the so-called posterior probability, for all poses without explicitly estimating the underlying densities. The pose estimation system is evaluated using the CMU Pose, Illumination and Expression (PIE) database.

Paper Nr.:
379
Title:
IMAGE RE-SEGMENTATION - A New Approach Applied to Urban Imagery
Author(s):
Thales Sehn Korting, Leila Maria Garcia Fonseca, Luciano Vieira Dutra and Felipe Castro da Silva
Abstract:
This article presents a new approach for Image Segmentation, applied to Urban Scenes of Remote Sensing Data. Our method is called re-segmentation, since it uses the results of a previous over-segmented image, using well-known algorithms like Region Growing or Watershed. Resultant objects of the first segmentation are connected through a weighted Region Adjacency Graph, and by analyzing the connections, we look for regular shapes, i.e. rectangles and circles, within the connected nodes. The objects, or graph vertices, whose union forms more regular objects, are merged resulting in new regions with shape characteristics adequate to the urban case.

Paper Nr.:
384
Title:
SURFACE DEFECTS DETECTION ON ROLLED STEEL STRIPS BY GABOR FILTERS
Author(s):
Roberto Medina, Fernando Gayubo, Luis M. González, David Olmedo, Jaime Gómez, Eduardo Zalama and José R. Perán
Abstract:
Product material integrity and surface appearance, in steel flat products manufacturing and processing, are important attributes that affect product operation, reliability and customer confidence. Automated visual inspection has to be envisaged, but five major problems have to be overcome: (i) the variable nature of the defects, (ii) the highly reflective nature of the metallic surfaces, (iii) the presence of oil, (iv) the huge amount of visual data to be acquired and processed, and (v) the high speed in the section where inspections are performed. We have developed an automated cellular visual inspection system for flat products in a flat steel cutting factory. Among the approaches that the system uses to detect defects, we have included two-dimensional Gabor filters. In this paper a procedure for detecting defects in flat steel products based on Gabor filters is presented. The traditional methods, based on the study of the grey-level histogram and shape analysis, have shown quite good results, but they are not good enough to achieve the level of success required. Experimental results show that a greater number of defects can be readily detected using the proposed approach.

Paper Nr.:
428
Title:
CORE POINT DETECTION USING FINE ORIENTATION FIELD ESTIMATION
Author(s):
M. Usman Akram, Rabia Arshad, Rabia Anwar, Shoab A. Khan and Sarwat Nasir
Abstract:
The performance of an Automatic Fingerprint Identification System (AFIS) is greatly influenced by the detection of the core point. Extraction of the best Region Of Interest (ROI) from the image can play a vital role in core point detection. In this paper, we present an improved technique for fine orientation field estimation and core point detection. The distinct feature of our technique is that it gives a high core point detection percentage even for low quality fingerprint images. The proposed algorithm is applied to the FVC2004 database. Experimental results demonstrate improved performance in detecting the core point.

Paper Nr.:
429
Title:
FACIAL EXPRESSION RECOGNITION BASED ON FUZZY LOGIC
Author(s):
M. Usman Akram, Irfan Zafar, Wasim Siddique Khan and Zohaib Mushtaq
Abstract:
We present a novel scheme for facial expression recognition from facial features using Mamdani-type fuzzy system. Facial expression recognition is of prime importance in human-computer interaction systems (HCI). HCI has gained importance in web information systems and e-commerce and certainly has the potential to reshape the IT landscape towards value driven perspectives. We present a novel algorithm for facial region extraction from static image. These extracted facial regions are used for facial feature extraction. Facial features are fed to a Mamdani-type fuzzy rule based system for facial expression recognition. Linguistic models employed for facial features provide an additional insight into how the rules combine to form the ultimate expression output. Another distinct feature of our system is the membership function model of expression output which is based on different psychological studies and surveys. The validation of the model is further supported by the high expression recognition percentage.

Area 3 - Image Understanding

Paper Nr.:
11
Title:
A CORRECTIVE FRAMEWORK FOR FACIAL FEATURE DETECTION AND TRACKING
Author(s):
Hussein Hamshari, Steven Beauchemin, Denis Laurendeau and Normand Teasdale
Abstract:
Epidemiological studies indicate that automobile drivers from varying demographics are confronted by difficult driving contexts such as negotiating intersections, yielding, merging and overtaking. This research is based on the hypothesis that visual search patterns of at-risk drivers provide vital information required for assessing driving abilities and improving the skills of such drivers under varying conditions. We aim to detect and track the face and eyes of the driver during several driving scenarios, allowing for further processing of a driver's visual search pattern behavior. Traditionally, detection and tracking of objects in visual media has been performed using specific techniques. These techniques vary in terms of their robustness and computational cost. This research proposes a framework that is built upon a foundation synonymous to boosting. The idea of an integrated framework employing multiple trackers is advantageous in forming a globally strong tracking methodology. In order to model the effectiveness of trackers, a confidence parameter is introduced to help minimize the errors produced by incorrect matches and allow more effective trackers with a higher confidence value to correct the perceived position of the target.

Paper Nr.:
31
Title:
DIFFUSION FILTERING FOR ILLUMINATION INVARIANT FACE RECOGNITION - Illumination Approximation with Diffusion Filters within Retinex Context
Author(s):
Peter Dunker and Melanie Keller
Abstract:
Face recognition has become a very important technology in recent years for a wide variety of applications. One major problem of most state-of-the-art algorithms is varying lighting conditions, which can decrease recognition rates dramatically. To reduce the influence of illumination in the recognition process, normalization methods can be used. In this paper we introduce illumination normalization algorithms based on diffusion filters. Further, we compare our approach with retinex algorithms inspired by human perception. Finally, we present the evaluation results of our experiments with well known face recognition techniques such as principal component analysis (PCA). The results show that the diffusion filter approaches outperform known retinex algorithms, which demonstrates the capabilities of diffusion filter technology for illumination normalization.

Paper Nr.:
39
Title:
LOSS-WEIGHTED DECODING FOR ERROR-CORRECTING OUTPUT CODING
Author(s):
Sergio Escalera, Oriol Pujol and Petia Radeva
Abstract:
Multi-class classification is a challenging problem for several applications in Computer Vision. The Error Correcting Output Codes (ECOC) technique represents a general framework capable of extending any binary classification process to the multi-class case. In this work, we present a novel decoding strategy that takes advantage of the ECOC coding to outperform the existing decoding strategies. The novel decoding strategy is applied to state-of-the-art coding designs and extensively tested on the UCI Machine Learning repository database and in two real vision applications: tissue characterization in medical images and traffic sign categorization. The results show that the presented methodology considerably increases the performance of the traditional ECOC strategies and the state-of-the-art multi-classifiers.

Paper Nr.:
41
Title:
HARMONIC DEFORMATION MODEL FOR EDGE BASED TEMPLATE MATCHING
Author(s):
Andreas Hofhauser, Carsten Steger and Nassir Navab
Abstract:
The paper presents an approach to the detection of deformable objects in single images. To this end we propose a robust match metric that preserves the relative edge point neighborhood, but allows significant shape changes. Similar metrics have been used for the detection of rigid objects \cite{olson:97,steger:02}. To the best of our knowledge this adaptation to deformable objects is new. In addition, we present a fast algorithm for model deformation. In contrast to the widely used thin-plate spline \cite{bookstein:89,Gianluca:02}, it is efficient even for several thousand points. For arbitrary deformations, a forward-backward interpolation scheme is utilized. It is based on harmonic inpainting, i.e. it regularizes the displacement in order to obtain smooth deformations. Similar to optical flow, we obtain a dense deformation field, though the template contains only a sparse set of model points. Using a coarse-to-fine representation for the distortion of the template further increases efficiency. We show in a number of experiments that the presented approach is not only fast, but also very robust in detecting deformable objects.

Paper Nr.:
45
Title:
A NEW FACE RECOGNITION SYSTEM - Using HMMs Along with SVD Coefficients
Author(s):
Pooya Davari and Hossein Miar Naimi
Abstract:
In this paper, a new Hidden Markov Model (HMM)-based face recognition system is proposed. As a first novel point, in contrast to the five-state HMMs used in previous research, we use a 7-state HMM to cover more details. As a second novel point, we use a small number of quantized Singular Value Decomposition (SVD) coefficients as features describing blocks of face images. This makes the system very fast. To further reduce computational complexity and memory consumption (in hardware implementation), the images are resized to JPEG format. First, an order-statistic filter is used as a preprocessing operation. Then a top-down sequence of overlapping sub-image blocks is considered. Using quantized SVD coefficients of these blocks, each face is considered as a numerical sequence that can be easily modeled by an HMM. The system has been examined on the Olivetti Research Laboratory (ORL) face database. The experiments showed a recognition rate of 99%, using half of the images for training. Our system has been evaluated on the YALE database too. Using five and six training images, we obtained 97.78% and 100% recognition rates respectively, a record in the literature. The proposed method is compared with the best methods in the literature. The results show that the proposed method is the fastest one, with a recognition rate of approximately 100%.

Paper Nr.:
49
Title:
NEW INVARIANT DESCRIPTORS BASED ON THE MELLIN TRANSFORM
Author(s):
S. Metari and François Deschênes
Abstract:
In this paper we introduce two new classes of radiometric and combined radiometric-geometric invariant descriptors. The first class includes two types of radiometric descriptors. The first type is based on the Mellin transform and the second one is based on central moments. Both descriptors are invariant to contrast changes and to convolution with any kernel having a symmetric form with respect to the diagonals. The second class contains two subclasses of combined descriptors. The first subclass includes central-moment based descriptors invariant simultaneously to translations, to uniform and anisotropic scaling, to stretching, to contrast changes and to convolution. The second subclass includes central-complex-moment based descriptors invariant simultaneously to similarity transformation and to contrast changes. We apply those invariants to the matching of geometrically transformed and/or blurred images.

Paper Nr.:
59
Title:
EYE DETECTION USING LINE EDGE MAP TEMPLATE
Author(s):
Mihir Jain, Suman K. Mitra and Naresh Jotwani
Abstract:
The location of the eyes is an important visual cue for processes such as scaling and orientation correction, which are precursors to face recognition. This paper presents a robust algorithm for eye detection which makes use of edge information and distinctive features of eyes, starting from a roughly localized face image. Potential region pairs are generated, and then template matching is applied to match these region pairs with a generated eye line edge map template using primary line segment Hausdorff distance to get an estimate of the centers of the two eyes. This result is then refined to get iris centers and also eye centers. Experimental results demonstrate the excellent performance of the proposed algorithm.

Paper Nr.:
62
Title:
VIDEO EVENT CLASSIFICATION AND DETECTION USING 2D TRAJECTORIES
Author(s):
Alexandre Hervieu, Patrick Bouthemy and Jean-Pierre Le Cadre
Abstract:
This paper describes an original statistical trajectory-based approach which can address several issues related to dynamic video content understanding: unsupervised clustering of events, recognition of events corresponding to learnt classes of dynamic video contents, and detection of unexpected events. Appropriate local differential features combining curvature and motion magnitude are robustly computed on the trajectories. They are invariant to image translation, in-the-plane rotation and scale transformation. The temporal causality of these features is then captured by hidden Markov models whose states are properly quantized values, and similarity between trajectories is expressed by exploiting the HMM framework. We report experiments on two sets of data, a first one composed of typical classes of synthetic (noised) trajectories (such as parabola or clothoid), and a second one formed with trajectories computed in sports videos. We have also favorably compared our method to other ones, including feature histogram comparison, use of the longest common subsequence (LCSS) distance and SVM-based classification.

Paper Nr.:
63
Title:
FACE AND FACIAL FEATURE DETECTION EVALUATION - Performance Evaluation of Public Domain Haar Detectors for Face and Facial Feature Detection
Author(s):
M. Castrillón-Santana, O. Déniz-Suárez, L. Antón-Canalís and J. Lorenzo-Navarro
Abstract:
Fast and reliable face and facial feature detection is a required ability for any Human Computer Interaction approach based on Vision. Since the publication of Viola-Jones object detection framework and the more recent open source implementation, an increasing number of applications have appeared in the context of facial processing. In this sense, the OpenCV community shares a collection of public domain classifiers for this scenario. However, as far as we know these classifiers have been rarely compared. In this paper we first analyze the individual performance of all those public classifiers getting the best performance for each target. These results are valid to define a baseline for future approaches. Additionally we propose a simple hierarchical combination of those classifiers to increase facial feature detection and reduce false facial detection.

Paper Nr.:
64
Title:
ROBUST FACE ALIGNMENT USING CONVOLUTIONAL NEURAL NETWORKS
Author(s):
Stefan Duffner and Christophe Garcia
Abstract:
Face recognition in real-world images mostly relies on three successive steps: face detection, alignment and identification. The second step of face alignment is crucial as the bounding boxes produced by robust face detection algorithms are still too imprecise for most face recognition techniques, i.e. they show slight variations in position, orientation and scale. We present a novel technique based on a specific neural architecture which, without localizing any facial feature points, precisely aligns face images extracted from bounding boxes coming from a face detector. The neural network processes face images cropped using misaligned bounding boxes and is trained to simultaneously produce several geometric parameters characterizing the global misalignment. After having been trained, the neural network is able to robustly and precisely correct translations of up to ±13% of the bounding box width, in-plane rotations of up to ±30 degrees and variations in scale from 90% to 110%. Experimental results show that 94% of the face images of the BioID database and 80% of the images of a complex test set extracted from the internet are aligned with an error of less than 10% of the face bounding box width.

Paper Nr.:
65
Title:
INVARIANT FACE RECOGNITION IN A NETWORK OF CORTICAL COLUMNS
Author(s):
Philipp Wolfrum, Jörg Lücke and Christoph von der Malsburg
Abstract:
We describe a neural network for invariant object recognition. The network is generative in the sense that it explicitly represents both the recognized object and the extrinsic properties to which it is invariant (especially object position). The model is biologically plausible, being formulated as a neuronal system composed of cortical columns and dynamic links. At the same time it has competitive face recognition performance.

Paper Nr.:
82
Title:
IMAGE RETRIEVAL USING KRAWTCHOUK CHROMATICITY DISTRIBUTION MOMENTS
Author(s):
E. Tziola, K. Konstantinidis, L. Kotoulas and I. Andreadis
Abstract:
In this paper a set of Krawtchouk Chromaticity Distribution Moments (KCDMs) for the effective representation of image color content is introduced. The proposed method describes chromaticity through a set of KCDMs applied on the associated chromaticity distribution function in the L*a*b* color space. Using only a small fixed number of KCDMs the method achieves satisfactory retrieval rates. The computational requirements of this approach are relatively small compared to other methods addressing the issue of image retrieval using color features. This has a direct impact on the time required to index an image database. Furthermore, due to the short length of the KCDM feature vector, there is a direct reduction in the time needed to search the whole database. Compared to previous related work, KCDMs provide a more accurate representation of the L*a*b* chromaticity distribution functions, since no numerical approximation is involved in deriving the moments. Furthermore, unlike other orthogonal moments, Krawtchouk moments can be employed to extract local features of a chromaticity diagram. This property makes them more analytical near the centre of mass of the chromaticity distribution. The theoretical framework is validated by experiments which demonstrate the superior performance of KCDMs over other methods.

Paper Nr.:
112
Title:
AUTOMATED OBJECT SHAPE MODELLING BY CLUSTERING OF WEB IMAGES
Author(s):
Giuseppe Scardino, Ignazio Infantino and Salvatore Gaglio
Abstract:
The paper describes a framework for creating shape models of an object using images from the web. Results obtained from different image search engines using simple keywords are filtered, so that it is possible to select images showing a single object with a well-defined contour. In order to have a large set of valid images, the implemented system uses lexical web databases (e.g. WordNet) or free web encyclopedias (e.g. Wikipedia) to get more keywords correlated to the given object. The shapes extracted from the selected images are represented by Fourier descriptors and are grouped by the K-means algorithm. Finally, the most representative shapes of the main clusters are considered as prototypical contours of the object, and they can be used to search for the same object in images showing a more complex structure. Preliminary experimental results are illustrated to show the effectiveness of the proposed approach.
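As a rough illustration of the Fourier-descriptor representation mentioned in this abstract, the following Python sketch computes a contour signature that does not change under translation, scaling, rotation or a shift of the starting point. The function name, the number of coefficients and the normalisation by the largest non-DC magnitude are assumptions made for this sketch; the abstract does not state which variant of Fourier descriptors the authors use, and the K-means grouping step is not shown.

```python
import cmath
import math


def fourier_descriptor(contour, n_coeffs=8):
    """Return a normalised Fourier descriptor of a closed contour.

    `contour` is a list of (x, y) boundary points in traversal order.
    The name and the normalisation are illustrative choices, not taken
    from the paper.
    """
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    if n_coeffs >= n:
        raise ValueError("n_coeffs must be smaller than the number of points")
    # Plain O(n^2) discrete Fourier transform of the complex boundary signal.
    coeffs = [
        sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Skipping coeffs[0] removes translation. Taking magnitudes removes
    # rotation and the choice of starting point. Dividing by the largest
    # non-DC magnitude removes scale.
    scale = max(abs(c) for c in coeffs[1:])
    if scale == 0:
        raise ValueError("degenerate contour: all points coincide")
    return [abs(coeffs[k]) / scale for k in range(1, n_coeffs + 1)]
```

Descriptors computed this way are fixed-length vectors, so they could be passed directly to any K-means implementation to group similar shapes.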

Paper Nr.:
119
Title:
SPATIAL NEIGHBORING HISTOGRAM FOR SHAPE-BASED IMAGE RETRIEVAL
Author(s):
Noramiza Hashim, Patrice Boursier and Hong Tat Ewe
Abstract:
The integration of cameras in mobile phones has become standard. Recognition of man-made objects such as buildings in images taken with such devices requires a fast and efficient approach in practical applications. Our work focuses on recognizing buildings based on a novel shape-based two-dimensional histogram descriptor. It combines both a low-level feature (i.e. edge orientation) and a middle-level feature (i.e. spatial neighborhood pattern). The neighborhood pattern is coded in a 4-bit binary representation which offers a simple and efficient way to incorporate local spatial data into the histogram. We find that the proposed method increases the retrieval precision by approximately 12% compared to other similar shape-based histogram methods.

Paper Nr.:
121
Title:
TEXTURE BASED DESCRIPTION OF MOVEMENTS FOR ACTIVITY ANALYSIS
Author(s):
Kellokumpu Vili, Zhao Guoying and Pietikäinen Matti
Abstract:
Human motion can be seen as a type of moving texture pattern. In this paper, we propose a novel approach for activity analysis by describing human activities with texture features. Our approach extracts spatially enhanced local binary pattern (LBP) histograms from temporal templates (Motion History Images and Motion Energy Images) and models their temporal behavior with hidden Markov models. The description is useful for action modeling and is suitable for detecting and recognizing various kinds of activities. The method is computationally simple. We perform tests on two published databases and clearly show the good performance of our approach in classification and detection tasks. Furthermore, experimental results show that the approach performs robustly against irregularities in data, such as limping and walking with a dog, partial occlusions and low video quality.
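To make the texture feature in this abstract more concrete, here is a minimal Python sketch of the basic 8-neighbour local binary pattern (LBP) operator and its histogram. It is only the elementary operator: the bit ordering, the function name and the use of a plain nested list as the image are assumptions of this sketch, and the spatial enhancement (per-region histograms), the temporal templates and the hidden Markov models described in the abstract are not reproduced.

```python
def lbp_histogram(image):
    """Return the 256-bin histogram of basic LBP codes of a grey-level image.

    `image` is a 2-D list of grey levels. Border pixels are skipped because
    they lack a full 3x3 neighbourhood.
    """
    # Neighbour offsets, clockwise from the top-left pixel. The bit order is
    # a convention chosen for this sketch.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist
```

Applied to a Motion History Image or Motion Energy Image split into subregions, one such histogram per subregion could be concatenated to form a frame-level feature vector.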

Paper Nr.:
124
Title:
IMAGE COMPLETION USING A DIFFUSION DRIVEN MEAN CURVATURE FLOW IN A SUB-RIEMANNIAN SPACE
Author(s):
Gonzalo Sanguinetti, Giovanna Citti and Alessandro Sarti
Abstract:
In this paper we present an implementation of a perceptual completion model performed in the three-dimensional space of position and orientation of the level lines of an image. We show that the space is equipped with a natural sub-Riemannian metric. This model allows disocclusion to be performed while representing both the occluding and occluded objects simultaneously in the space. The completion is accomplished by computing minimal surfaces with respect to the non-Euclidean metric of the space. The minimality is achieved via diffusion driven mean curvature flow. Results are presented in a number of cognitively relevant cases.

Paper Nr.:
126
Title:
AN AUTOMATIC WELDING DEFECTS CLASSIFIER SYSTEM
Author(s):
Juan Zapata, Ramón Ruiz and Rafael Vilar
Abstract:
Radiographic inspection is a well-established testing method to detect weld defects. However, interpretation of radiographic films is a difficult task. The limited reliability of such interpretation and the expense of training suitable experts have motivated efforts towards automation in this field. In this paper, we describe an automatic detection system to recognise welding defects in radiographic images. In a first stage, image processing techniques, including noise reduction, contrast enhancement, thresholding and labelling, were implemented to help in the recognition of weld regions and the detection of weld defects. In a second stage, a set of geometrical features which characterise the defect shape and orientation was proposed and extracted for the defect candidates. In a third stage, an artificial neural network for weld defect classification was used under three regularisation processes with different architectures. For the input layer, the principal component analysis technique was used in order to reduce the number of feature variables; and, for the hidden layer, a different number of neurons was used with the aim of giving better performance for defect classification in both cases. The proposed classification consists of detecting the four main types of weld defects met in practice plus the non-defect type.

Paper Nr.:
128
Title:
INVARIANT CODES FOR SIMILAR TRANSFORMATION AND ITS APPLICATION TO SHAPE MATCHING
Author(s):
Eiji Yoshida and Seiichi Mita
Abstract:
In this paper, we propose a new method for the measurement of shape similarity. Our proposed method encodes the contour of an object by using the curvature of the object. If two objects are similar (under translation, rotation, and scaling) in shape, these codes themselves or their cyclic shift have the same values. We have compared our method with other methods such as Fourier descriptor, CSS (curvature scale space) and shape context. We have shown that the computational cost of our method is about one-hundredth that of CSS, and the recognition rate of our method is 90.40% for the scaling robustness test using MPEG7_CE-Shape1 and 81.82% for the similarity-based retrieval test using Kimia’s silhouette. These values are slightly better than those of CSS.

Paper Nr.:
141
Title:
DRIVING WARNING SYSTEM BASED ON VISUAL PERCEPTION OF ROAD SIGNS
Author(s):
Juan Pablo Carrasco, Arturo de la Escalera and José María Armingol
Abstract:
Advanced Driver Assistance Systems are used to increase the security of vehicles. Computer Vision is one of the main technologies used for this aim. Lane mark recognition, pedestrian detection, driver drowsiness detection, and road sign detection and recognition are examples of these systems. The last one is the goal of this paper. A system that can detect and recognize road signs based on color and shape features is presented in this article. The article focuses on detection, especially the color space used, and investigates the case of road signs under shadows. The system also tracks the road sign once it has been detected. It warns the driver in case of anomalous speed for the recognized road sign using the information from a GPS.

Paper Nr.:
146
Title:
FAST WIREFRAME-VISIBILITY ALGORITHM
Author(s):
Ezgi Gunaydin Kiper
Abstract:
In this paper, a fast wireframe-visibility algorithm is introduced. The algorithm’s inputs are the 3D wireframe model of the object and the internal and external camera calibration parameters. The algorithm outputs the 2D image of the object with only the visible lines and surfaces. The 2D image of the object is constructed by using a camera model with the given camera calibration parameters and the 3D wireframe model. The idea behind the algorithm is to find the intersection points of all the lines in the 2D image of the object. These intersection points are called critical points, and the lines containing them are called critical lines. Lines without any critical points are called normal lines. Critical lines are split into smaller lines at their critical points, and a depth calculation is performed for the middle points of these smaller lines. For a normal line, the depth of its middle point is calculated to determine whether it is visible or not. Therefore, the algorithm requires a minimal number of point depth calculations. Moreover, this makes the process much faster because it avoids the resolution and memory problems of well-known image-space algorithms such as scan-line and z-buffering.
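The splitting step described in this abstract can be sketched in Python as follows. The sketch takes already-projected 2D line segments, finds where they cross, and returns the midpoints at which a depth test would be needed. The function names are hypothetical, crossings are counted only when they fall strictly inside both segments (an assumption, so that edges sharing a vertex are not split), parallel and collinear pairs are ignored, and the depth test itself is not shown.

```python
def crossing_parameter(a, b, c, d, eps=1e-12):
    """Return t in (0, 1) such that a + t*(b - a) lies on segment cd, or None."""
    rx, ry = b[0] - a[0], b[1] - a[1]
    sx, sy = d[0] - c[0], d[1] - c[1]
    denom = rx * sy - ry * sx
    if abs(denom) < eps:
        return None  # parallel or collinear: ignored in this sketch
    t = ((c[0] - a[0]) * sy - (c[1] - a[1]) * sx) / denom
    u = ((c[0] - a[0]) * ry - (c[1] - a[1]) * rx) / denom
    if eps < t < 1 - eps and eps < u < 1 - eps:
        return t
    return None


def midpoints_to_test(segments):
    """For each 2-D segment, return the midpoints of its sub-segments.

    A normal line (no critical points) yields one midpoint; a critical line
    yields one midpoint per piece between consecutive critical points.
    """
    result = []
    for i, (a, b) in enumerate(segments):
        params = {0.0, 1.0}
        for j, (c, d) in enumerate(segments):
            if i != j:
                t = crossing_parameter(a, b, c, d)
                if t is not None:
                    params.add(t)
        ordered = sorted(params)
        mids = []
        for t0, t1 in zip(ordered, ordered[1:]):
            tm = (t0 + t1) / 2
            mids.append((a[0] + tm * (b[0] - a[0]), a[1] + tm * (b[1] - a[1])))
        result.append(mids)
    return result
```

For n projected lines this sketch compares all pairs, so it performs on the order of n² intersection tests but only one depth evaluation per resulting piece.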

Paper Nr.:
149
Title:
A MULTI-SCALE LAYOUT DESCRIPTOR BASED ON DELAUNAY TRIANGULATION FOR IMAGE RETRIEVAL
Author(s):
Agnés Borràs Angosto and Josep Lladós Canet
Abstract:
Working with large collections of videos and images requires effective and flexible techniques for retrieval and browsing. Beyond the classical color histogram approaches, layout information has proven to be a very descriptive cue for image description. We have developed a descriptor that encodes the layout of an image using a histogram-based representation. The descriptor uses a multi-layer representation that captures the saliency of the image parts. Furthermore, it encodes their relative positions using the properties of a Delaunay triangulation. The descriptor is a compact feature vector whose content is normalized. Its properties make it suitable for image retrieval and indexing applications. Finally, we have applied it to a video browsing application that detects characteristic scenes of a news program.

Paper Nr.:
150
Title:
A SIGNAL-SYMBOL LOOP MECHANISM FOR ENHANCED EDGE EXTRACTION
Author(s):
Sinan Kalkan, Florentin Wörgötter, Shi Yan, Volker Krüger and Norbert Krüger
Abstract:
The transition to symbolic information from images involves in general the loss or misclassification of information. One way to deal with this missing or wrong information is to get feedback from concrete hypotheses derived at a symbolic level to the sub-symbolic (signal) stage to amplify weak information or correct misclassifications. This paper proposes such a feedback mechanism between the symbolic level and the signal level, which we call signal symbol loop. We apply this framework for the detection of low contrast edges making use of predictions based on Rigid Body Motion. Once the Rigid Body Motion is known, the location and the properties of edges at a later frame can be predicted. We use these predictions as feedback to the signal level at a later frame to improve the detection of low contrast edges. We demonstrate our mechanism on a real example, and evaluate the results using an artificial scene, where the ground truth data is available.

Paper Nr.:
160
Title:
CLASSIFIER SELECTION FOR FACE RECOGNITION ALGORITHM BASED ON ACTIVE SHAPE MODEL
Author(s):
Andrzej Florek and Maciej Król
Abstract:
In this paper, experimental results from face contour classification tests are shown. The presented approach is dedicated to a face recognition algorithm based on the Active Shape Model. The results were obtained from experiments carried out on a set of 2700 images taken from 100 persons. Manually fitted contours (194 samples for 8 components of one face contour) were classified after feature space decomposition carried out by Linear Discriminant Analysis or by Support Vector Machine algorithms.

Paper Nr.:
167
Title:
ON THE CONTRIBUTION OF COMPRESSION TO VISUAL PATTERN RECOGNITION
Author(s):
Gunther Heidemann and Helge Ritter
Abstract:
Most pattern recognition problems are solved by highly task-specific algorithms. However, all recognition and classification architectures are related in at least one aspect: They rely on compressed representations of the input. It is therefore an interesting question how much compression itself contributes to the pattern recognition process. The question has been answered by Benedetto et al. (2002) for the domain of text, where a common compression program (gzip) is capable of language recognition and authorship attribution. The underlying principle is estimating the mutual information from the obtained compression factor. Here we show that compression achieves astonishingly high recognition rates even for far more complex tasks: Visual object recognition, texture classification, and image retrieval. Though, naturally, specialized recognition algorithms still outperform compressors, our results are remarkable, since none of the applied compression programs (gzip, bzip2) was ever designed to solve this type of task. Compression is the only known method that solves such a wide variety of tasks without any modification, data preprocessing, feature extraction, or even parametrization. We conclude that compression can be seen as the "core" of a yet-to-be-developed theory of unified pattern recognition.
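The principle of recognising patterns with an off-the-shelf compressor can be illustrated with a short Python sketch. It uses the normalised compression distance (NCD) as the similarity measure, which is a swapped-in, commonly used stand-in and not necessarily the mutual-information estimate used by the authors. It relies on `zlib`, which implements the same DEFLATE algorithm as gzip; the function names and the nearest-neighbour decision rule are assumptions of this sketch.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the DEFLATE-compressed data."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance: small when x and y share structure."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(query: bytes, examples: dict) -> str:
    """Return the label of the stored example closest to the query."""
    return min(examples, key=lambda label: ncd(examples[label], query))
```

For images, the byte strings would be raw pixel data; no features are extracted, which is the point the abstract makes about compression working without task-specific preprocessing.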

Paper Nr.:
171
Title:
POSE INVARIANT FACE RECOGNITION USING IMAGE HISTOGRAMS
Author(s):
Hasan Demirel and Gholamreza Anbarjafari
Abstract:
Faces with changing poses show significant variations in the local details of the facial features. However, the global pixel statistics, represented by the image histograms, of the same subject with pose variations are highly correlated. The image histograms are very robust features that capture the global pixel statistics of faces. In this paper, histograms of the intensity images are used as the feature vectors for the recognition of faces in different poses. Histogram matching, based on the cross correlation of the image histograms, is used as the measure of similarity in the classification process. The recognition rate of the proposed face recognition system reaches 98.80% on the HP face database, with 10 poses incorporating up to ±90° of horizontal pose variations.
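A minimal Python sketch of histogram matching by correlation is given below. It assumes 8-bit grey-level images supplied as flat lists of pixel values and uses the zero-mean normalised correlation coefficient as the similarity measure; the exact form of cross correlation used by the authors is not stated in the abstract, and the function names are hypothetical.

```python
import math


def histogram(pixels, bins=256):
    """Count how often each grey level in [0, bins) occurs."""
    hist = [0] * bins
    for p in pixels:
        hist[p] += 1
    return hist


def correlation(h1, h2):
    """Zero-mean normalised correlation of two equal-length histograms."""
    m1 = sum(h1) / len(h1)
    m2 = sum(h2) / len(h2)
    num = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    den = math.sqrt(sum((a - m1) ** 2 for a in h1) *
                    sum((b - m2) ** 2 for b in h2))
    return num / den if den else 0.0


def recognise(query_pixels, gallery):
    """Return the gallery label whose histogram correlates best with the query."""
    qh = histogram(query_pixels)
    return max(gallery, key=lambda name: correlation(qh, histogram(gallery[name])))
```

Because a histogram ignores where pixels are located, rearranging the pixels of an image leaves its score unchanged, which is the property the abstract relies on for pose robustness.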

Paper Nr.:
189
Title:
SUBJECT RECOGNITION USING A NEW APPROACH FOR FEATURE SELECTION
Author(s):
Àgata Lapedriza, David Masip and Jordi Vitria
Abstract:
In this paper we propose a feature selection method that uses the mutual information (MI) measure on a Principal Component Analysis (PCA) based decomposition. PCA finds a linear projection of the data in a non-supervised way, which preserves the larger variance components of the data under the reconstruction error criterion. Previous works suggest that using the MI among the PCA projected data and the class labels applied to feature selection can add the missing discriminability criterion to the optimal reconstruction feature set. Our proposal goes one step further, defining a global framework to add independent selection criteria in order to filter misleading PCA components while the optimal variables for classification are preserved. We apply this approach to a face recognition problem using the AR Face data set. Notice that, in this problem, PCA projection vectors strongly related to illumination changes and occlusions are usually preserved given their high variance. Our additional selection tasks are able to discard this type of features while the relevant features to perform the subject recognition classification are kept. The experiments performed show an improved feature selection process using our combined criterion.

Paper Nr.:
196
Title:
EFFICIENT OBJECT DETECTION USING PCA MODELING AND REDUCED SET SVDD
Author(s):
Venkataramana Kini and Rudra Hota
Abstract:
Object detection is traditionally tackled as a two-class problem in which the non-object class is not precisely defined. In this paper we propose a cascade of principal component modeling with associated test statistics and reduced set support vector data description for efficient object detection, both of which hinge mainly on modeling of the object class training data. The PCA modeling enables quick rejection of comparatively obvious non-objects in the initial stage of the cascade to gain a computational advantage. The reduced set SVDD is applied in the later stages of the cascade to classify relatively difficult images. This combination of PCA modeling and reduced set support vector data description leads to good object detection with simple pixel features.

Paper Nr.:
200
Title:
RELEVANCE FEEDBACK WITH MAX-MIN POSTERIOR PSEUDO-PROBABILITY FOR IMAGE RETRIEVAL
Author(s):
Yuan Deng, Xiabi Liu and Yunde Jia
Abstract:
In this paper, a new relevance feedback method for image retrieval based on the max-min posterior pseudo-probabilities (MMP) framework is proposed to learn the user’s intention during feedback. We assume that the feature vectors extracted from the relevant images follow a Gaussian mixture model (GMM) distribution. The posterior pseudo-probability function for the relevant images is used as the user intention model and serves to classify images into two categories: relevant and irrelevant. During feedback, relevant and irrelevant images labelled by the user are taken as the training data of the user intention model. The optimum parameter set of the model is learned from the training data using the MMP criterion. Experimental results on the Corel database show the effectiveness of the proposed approach.

Paper Nr.:
209
Title:
TEXT DETECTION WITH CONVOLUTIONAL NEURAL NETWORKS
Author(s):
Manolis Delakis and Christophe Garcia
Abstract:
Text detection is an important preliminary step before text can be recognized in unconstrained image environments. We present an approach based on convolutional neural networks to detect and localize horizontal text lines from raw color pixels. The network learns to extract and combine its own set of features through learning instead of using hand-crafted ones. Learning was also used in order to precisely localize the text lines by simply training the network to reject badly-cut text and without any use of tedious knowledge-based post-processing. Although the network was trained with synthetic examples, experimental results demonstrated that it can outperform other methods on the real-world test set of ICDAR'03.

Paper Nr.:
210
Title:
FAST TEMPLATE MATCHING FOR MEASURING VISIT FREQUENCIES OF DYNAMIC WEB ADVERTISEMENTS
Author(s):
Dániel Szolgay, Csaba Benedek and Tamás Szirányi
Abstract:
In this paper an on-line method is proposed for statistical evaluation of dynamic web advertisements via measuring their visit frequencies. To minimize the required user interaction, the eye movements are tracked by a special eye camera, and the hits on advertisements are automatically recognized. The detection step is mapped to a 2D template matching problem, and novel algorithms are developed to significantly decrease the processing time by quickly excluding most of the false hit candidates. We show that due to the improvements the method runs in real time in the context of the selected application. The solution has been validated on real test data, and quantitative results are provided to show the gain in recognition rate and processing time versus previous approaches.

Paper Nr.:
226
Title:
OBJECTIVE EVALUATION OF SEAM PUCKER USING AN ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM
Author(s):
K. L. Mak and Wei Li
Abstract:
Seam pucker evaluation plays a very important role in the garment manufacturing industry. At present, seam puckers are usually evaluated by human inspectors, which is subjective, unreliable and time-consuming. With the development of image processing and pattern recognition technologies, an automatic vision-based seam pucker evaluation system becomes possible. This paper presents a new approach based on an adaptive neuro-fuzzy inference system (ANFIS) to establish the relationship between seam pucker grades and textural features of seam pucker images. The evaluation procedure is performed in two stages: feature extraction with the co-occurrence matrix approach, and classification with ANFIS. Experimental results demonstrate the validity and effectiveness of the proposed ANFIS-based method.

Paper Nr.:
248
Title:
A BAYESIAN APPROACH TO 3D OBJECT RECOGNITION USING LINEAR COMBINATION OF 2D VIEWS
Author(s):
Vasileios Zografos and Bernard F. Buxton
Abstract:
In this work, we introduce a Bayesian approach for pose-invariant recognition of the images of 3d objects modelled by a small number of stored 2d intensity images taken from nearby but otherwise arbitrary viewpoints. A linear combination of views approach is used to combine images from two viewpoints of a 3d object and synthesise novel views of that object. Recognition is performed by matching a target, scene image to such a synthesised, novel view using an optimisation algorithm, constrained by construction of Bayes prior distributions on the linear combination. We have experimented with both a direct search and an evolutionary optimisation method on a real-image, public database. The Bayes priors effectively regularised the posterior distribution so that all algorithms were able to find good solutions close to the optimum. Further exploration of the parameter space has been carried out using Markov-Chain Monte-Carlo sampling.

Paper Nr.:
253
Title:
IMAGE ANNOTATION WITH RELEVANCE FEEDBACK USING A SEMI-SUPERVISED AND HIERARCHICAL APPROACH
Author(s):
Cheng-Chieh Chiang, Ming-Wei Hung, Yi-Ping Hung and Wee Kheng Leow
Abstract:
This paper presents a novel approach for image annotation with relevance feedback that interactively employs semi-supervised learning to build hierarchical classifiers associated with annotation labels. We construct individual hierarchical classifiers, each corresponding to one semantic label that is used for describing the semantic contents of the images. The proposed semi-supervised and hierarchical approach is embedded in an interactive relevance feedback scheme to assist the user in annotating images. Our semi-supervised approach for learning classifiers reduces the need for training images by using both labeled and unlabeled images. We adopt a hierarchical approach for the classifiers to divide the whole semantic concept associated with a label into several parts, such that the complex contents of images can be simplified. We also describe some experiments to show the performance of the proposed approach.

Paper Nr.:
264
Title:
MULTI-DISCRIMINANT CLASSIFICATION ALGORITHM FOR FACE VERIFICATION
Author(s):
Cheng-Ho Huang and Jhing-Fa Wang
Abstract:
Linear discriminant analysis (LDA) is a common method used for face verification. To handle the large amounts of data collected for a given face verification system, we propose a multi-discriminant classification algorithm to classify and verify large numbers of facial images. In the training phase, it indexes all discriminant features of the training data and groups them into the clients’ individual discriminant sets. To verify whether a claimant is a client, only that client’s discriminant set is checked to determine the result: acceptance or rejection. The results of comparative experiments demonstrate that our algorithm achieves an encouraging improvement in performance for large-volume face verification.

Paper Nr.:
263
Title:
TOWARDS EMBEDDED WASTE SORTING - Using Constellations of Visual Words
Author(s):
Toon Goedemé
Abstract:
In this paper, we present a method for fast and robust object recognition, especially developed for implementation on an embedded platform. As an example, the method is applied to the automatic sorting of consumer waste. Out of a stream of different thrown-away food packages, specific items (in this case beverage cartons) can be visually recognised and sorted out. To facilitate and optimise the implementation of this algorithm on an embedded platform containing parallel hardware, we developed a voting scheme for constellations of visual words, i.e. clustered local features (SURF in this case). On top of easy implementation and robust and fast performance, even with large databases, an extra advantage is that this method can handle multiple identical visual features in one model.

Paper Nr.:
270
Title:
INTRODUCING 3D VISION AND COMPUTER GRAPHICS TO ARCHAEOLOGICAL WORKFLOW - An Applicable Framework
Author(s):
Hubert Mara, Andreas Monitzer and Julian Stöttinger
Abstract:
Cataloging drawings of ancient vessels and sherds is still the most time consuming task in the typical archaeological workflow. The properties of these findings like profile, volume, and wall thickness have always been estimated and drawn by hand. Through archiving, classifying and exhibiting these ancient artifacts we wish to gather as precise information as possible. Within seconds, today's 3D-scanners provide surface meshes of ancient vessels which are more precise than any manual estimation which may take up to several hours. We propose a semi-automated, applicable framework for dealing with large 3D-meshes of ancient findings from scanning the vessels for publication. In this interactive environment we estimate the axis of vessels, estimate their profile lines and render real time visualizations using state-of-the-art 3D-hardware techniques. The results can be printed in their real size for direct use in archaeological literature. Further, these methods will give the ability to publish 3D-meshes of ancient vessels for archaeological research. Recent extended tests have been carried out on archaeological sites in Peru and Austria. These experiments showed under real life circumstances the improvement of using this system in both precision and time efficiency.

Paper Nr.:
272
Title:
LOW-LEVEL FUSION OF AUDIO AND VIDEO FEATURE FOR MULTI-MODAL EMOTION RECOGNITION
Author(s):
Matthias Wimmer, Björn Schuller, Dejan Arsic, Gerhard Rigoll and Bernd Radig
Abstract:
Bimodal emotion recognition through audiovisual feature fusion has been shown in the past to be superior to each individual modality. Still, synchronization of the two streams is a challenge, as many vision approaches work on a frame basis, as opposed to the turn or chunk basis used for audio. Therefore, late fusion schemes such as simple logic or voting strategies are commonly used for the overall estimation of the underlying affect. However, early fusion is known to be more effective in many other multimodal recognition tasks. We therefore suggest a combined analysis by descriptive statistics of audio and video Low-Level-Descriptors for subsequent static SVM classification. This strategy also allows for a combined feature-space optimization, which will be discussed herein. The high effectiveness of this approach is shown on a database of 11.5h containing six emotional situations in an airplane scenario.
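As an illustration only, the idea of early fusion through descriptive statistics can be sketched as follows. The descriptor names, the choice of four statistics and the helper names are assumptions made for this example and are not taken from the paper; the point is merely that statistics computed per turn yield a fixed-length vector even though the audio and video streams are sampled at different rates.

```python
from statistics import mean, pstdev


def functionals(series):
    """Summarise one variable-length descriptor track by fixed statistics."""
    return [mean(series), pstdev(series), min(series), max(series)]


def fuse_turn(audio_llds, video_llds):
    """Concatenate the statistics of all audio and video descriptors of one turn.

    Both arguments map a (hypothetical) descriptor name to its samples.
    The tracks may have different lengths, so no frame-level
    synchronization between the two modalities is needed.
    """
    vector = []
    for name in sorted(audio_llds):
        vector.extend(functionals(audio_llds[name]))
    for name in sorted(video_llds):
        vector.extend(functionals(video_llds[name]))
    return vector


# Hypothetical turn: four audio samples per descriptor, two video frames.
audio = {"energy": [0.1, 0.4, 0.3, 0.2], "pitch": [110.0, 120.0, 130.0, 120.0]}
video = {"mouth_opening": [2.0, 4.0]}
features = fuse_turn(audio, video)  # one fixed-length vector for a static classifier
```

The resulting vector could then be passed to any static classifier, such as the SVM mentioned in the abstract.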

Paper Nr.:
280
Title:
DETERMINATION OF THE VISUAL FIELD OF PERSONS IN A SCENE
Author(s):
Adel Lablack, Frédéric Maquet and Chabane Djeraba
Abstract:
The determination of the visual field for multiple persons in a scene is an important problem with many applications in human behavior understanding for security and customized marketing. One such application, addressed in this paper, is to capture the visual field of persons in a scene. We obtained the head pose in the image sequence manually in order to determine exactly the visual field of persons in the monitored scene. We use knowledge about human vision, trigonometric relations to calculate the length and the height of the visual field, and a quaternion approach to perform several changes of reference frame. We demonstrate this technique using a realistic data set of videos taken by surveillance cameras in shops.

Paper Nr.:
282
Title:
BUILDING DETECTION IN IKONOS IMAGES FROM DISPARITY OF EDGES
Author(s):
Charles Beumier
Abstract:
The availability of very high resolution satellite images has enabled the automatic remote detection of man-made structures for applications such as damage assessment or change detection. In particular, stereo pairs of Ikonos or Quickbird images allow for the estimation of the third dimension so distinctive for buildings. Since the areas to be studied may be quite large, we propose a simple, fast and possibly accurate approach for building detection. This approach consists of a three-step procedure which first detects linear segments independently in the left and right images, then matches segments according to their mutual coverage, orientation and plausible disparity, and finally identifies building areas thanks to the presence of elevated segments. The solution is fast as only pixels of high gradient connected into linear segments are considered. Modelling object parts with linear segments is valid for the vast majority of man-made objects and allows for rapid segment pairing for disparity computation with possible sub-pixel accuracy. This approach has been applied to an Ikonos pair for the detection of large buildings in the context of risk assessment within GMOSS, a European Network of Excellence.

Paper Nr.:
285
Title:
TOWARDS THE ESTIMATION OF CONSPICUITY WITH VISUAL
Author(s):
Ludovic Simon, Jean-Philippe Tarel and Roland Brémond
Abstract:
The estimation of conspicuity is of importance for engineers who aim at making traffic signs conspicuous enough to attract drivers' attention. Unfortunately, conspicuity remains a poorly understood attribute due to the relatively limited, although growing, knowledge about the human visual processing system. Our goal is to develop a system which estimates the conspicuity of a traffic sign based on the processing of images acquired with a camera onboard a vehicle, in order to be able to make a diagnosis regarding their conspicuity. Aside from specific features known to be of importance for road signs, there is currently no complete model for conspicuity. The previously proposed attentional conspicuity model, which is based on vision science knowledge of the low levels of the human visual processing system, was shown to be unsuitable for sign detection tasks. We thus propose a new paradigm for conspicuity estimation in search tasks based on statistical learning of the features of the searched object.

Paper Nr.:
290
Title:
EFFICIENT OBJECT DETECTION ROBUST TO RST WITH MINIMAL SET OF EXAMPLES
Author(s):
Sebastien Onis, Henri Sanson, Christophe Garcia and Jean-Luc Dugelay
Abstract:
In this paper, we present an object detection approach based on a similarity measure combining cross-correlation and affine deformation. Current object detection systems provide good results, at the expense of requiring a large training database. The use of correlation enables object detection with a very small training set but is not robust to luminosity changes and RST (rotation, scale, translation) transformations. This paper presents a detection system that first searches for the likely positions and scales of the object using image preprocessing and cross-correlation, and then uses a similarity measure based on affine deformation to confirm or reject the predetection. We apply our system to face detection and show the improvement in results due to the image preprocessing and the affine deformation.

Paper Nr.:
291
Title:
RELATIONS BETWEEN RECONSTRUCTED 3D ENTITIES
Author(s):
Nicolas Pugeault, Sinan Kalkan, Florentin Wörgötter, Emre Baseski and Norbert Krüger
Abstract:
In this paper, we first propose an analytic formulation for the position and orientation uncertainty of local 3D line descriptors reconstructed by stereo. We evaluate these predicted uncertainties with Monte Carlo simulations, and study their dependency on different parameters (position and orientation). In a second part, we use this definition to derive a new formulation for inter-feature distance and coplanarity. These new formulations take into account the predicted uncertainty, allowing for better robustness. We demonstrate the positive effect of the modified definitions on some simple scenarios.

Paper Nr.:
353
Title:
RECOGNITION OF TEXT WITH KNOWN GEOMETRIC AND GRAMMATICAL STRUCTURE
Author(s):
Jan Rathouský, Martin Urban and Vojtech Franc
Abstract:
The optical character recognition (OCR) module is a fundamental part of each automated text processing system. The OCR module translates an input image with a text line into a string of symbols. In many applications (e.g. license plate recognition) the text has some a priori known geometric and grammatical structure. This article proposes an OCR method exploiting this knowledge which restricts the set of possible strings to a limited set of feasible combinations. The recognition task is formulated as maximization of a similarity function which uses character templates as reference. These templates are estimated by a support vector machine method from a set of examples. In contrast to the common approach, the proposed method performs character segmentation and recognition simultaneously. The method was successfully evaluated in a car license plate recognition system.

Paper Nr.:
357
Title:
REPRESENTATION AND RECOGNITION OF HUMAN ACTIONS - A New Approach based on an Optimal Control Motor Model
Author(s):
Sumitra Ganesh and Ruzena Bajcsy
Abstract:
We present a novel approach to the problem of representation and recognition of human actions that uses an optimal control based model to connect the high-level goals of a human subject to the low-level movement trajectories captured by a computer vision system. These models quantify the high-level goals as a performance criterion or cost function which the human sensorimotor system optimizes by picking the control strategy that achieves the best possible performance. We show that the human body can be modeled as a hybrid linear system that can operate in one of several possible modes, where each mode corresponds to a particular high-level goal or cost function. The problem of action recognition, then, is to infer the current mode of the system from observations of the movement trajectory. We demonstrate our approach on 3D visual data of human arm motion.

Paper Nr.:
371
Title:
FACE MODEL FITTING WITH GENERIC, GROUP-SPECIFIC, AND PERSON-SPECIFIC OBJECTIVE FUNCTIONS
Author(s):
Sylvia Pietzsch, Matthias Wimmer, Freek Stulp and Bernd Radig
Abstract:
In model-based fitting, the model parameters that best fit the image are determined by searching for the optimum of an objective function. Often, this function is designed manually, based on implicit and domain-dependent knowledge. We acquire more robust objective functions by learning them from annotated images, in which many critical decisions are automated, and the remaining manual steps do not require domain knowledge. Still, the trade-off between generality and accuracy remains. General functions can be applied to a large range of objects, whereas specific functions describe a subset of objects more accurately. (Gross et al., 2005) have demonstrated this principle by comparing generic to person-specific Active Appearance Models. As it is impossible to learn a person-specific objective function for the entire human population, we automatically partition the training images and then learn partition-specific functions. The number of groups influences the specificity of the learned functions. We automatically determine the optimal partitioning given the number of groups, by minimizing the expected fitting error. Our empirical evaluation demonstrates that the group-specific objective functions more accurately describe the images of the corresponding group. The results of this paper are especially relevant to face model tracking, as individual faces will not change throughout an image sequence.

Paper Nr.:
372
Title:
SIMILARITY MEASURES FUSION USING SVM CLASSIFIER FOR FACE AUTHENTICATION
Author(s):
Mohammad T. Sadeghi, Masoumeh Samiei, Seyed Mohammad T. Almodarresi and Josef Kittler
Abstract:
In this paper, the problems of measuring similarity in LDA face space using different metrics and fusing the associated classifiers are considered. A few similarity measures used in different pattern recognition applications, including the recently proposed Gradient Direction (GD) metric, are reviewed. An automatic parameter selection algorithm is then proposed for optimising the GD metric. In extensive experimentation on the BANCA database, we show that the optimised GD metric outperforms the other metrics in various conditions. Moreover, we demonstrate that by combining the GD metric and seven other metrics at the decision level using Support Vector Machines, the performance of the resulting decision making scheme consistently improves.

Paper Nr.:
375
Title:
HIERARCHICAL EVALUATION MODEL FOR 3D FACE RECOGNITION
Author(s):
Sídnei A. Drovetto Jr., Luciano Silva and Olga R. P. Bellon
Abstract:
In this paper we propose a 3D face matching method based on alignments obtained using the Simulated Annealing global optimization algorithm guided by the Mean Squared Error with M-estimator Sample Consensus and the Surface Interpenetration Measure (SIM). The matching score is obtained by the calculation of the SIM after the registration process. Since the SIM is a sensitive measure, it needs a good alignment to give relevance to its value. Our registration approach tends to reach a near global solution and, therefore, produces the necessary precise alignments. By analyzing the matching score, the system can identify whether or not the input images come from the same subject. In a verification scenario we use a hierarchical evaluation model which maximizes the results and reduces the computing time. Extensive experiments were performed on the well-known FRGC v2.0 3D face database using five different facial regions: three regions of the nose; the region of the eyes; and the face itself. Compared to state-of-the-art works, our approach has achieved a high rank-one recognition rate and also a high verification rate.

Paper Nr.:
378
Title:
SINGLE-IMAGE 3D RECONSTRUCTION OF BALL VELOCITY AND SPIN FROM MOTION BLUR - An Experiment in Motion-from-Blur
Author(s):
Giacomo Boracchi, Vincenzo Caglioti and Alessandro Giusti
Abstract:
We present an algorithm for analyzing a single calibrated image of a ball and reconstructing its instantaneous motion (3D velocity and spin) by exploiting motion blur. We use several state-of-the-art image processing techniques for extracting information from space-variant blur, then robustly integrate such information in a geometrical model of the 3D motion. We initially handle the simpler case in which the ball's apparent translation is negligible w.r.t. its spin, then extend the technique to handle the full motion. We show extensive experimental results both on synthetic and camera images. In a broader scenario, we exploit this specific problem for discussing motivations, advantages and limits of reconstructing motion from motion blur.

Paper Nr.:
381
Title:
COMPLETE AND STABLE PROJECTIVE HARMONIC
Author(s):
Faten Chaieb and Faouzi Ghorbel
Abstract:
Planar shape recognition is an important problem in computer vision and pattern recognition. We deal with planar shape contour views that differ by a general projective transformation. One method for solving such a problem is to use projective invariants. In this work, we propose a framework, based on harmonic analysis theory, for generating invariants to projective transformations and to parametrization. Invariance to reparameterization is obtained by a projective arc length curve reparameterization process. Then, a complete and stable set of projective harmonic invariants is constructed from the Fourier coefficients computed on the reparameterized contours. We test this set of descriptors on analytic images in order to recognize projectively similar contours.

Paper Nr.:
402
Title:
FACIAL EXPRESSION RECOGNITION USING ACTIVE APPEARANCE MODELS
Author(s):
Pedro Martins, Joana Sampaio and Jorge Batista
Abstract:
A framework for automatic facial expression recognition combining an Active Appearance Model (AAM) and Linear Discriminant Analysis (LDA) is proposed. Seven different expressions of several subjects, representing the neutral face and the facial emotions of happiness, sadness, surprise, anger, fear and disgust, were analysed. The proposed solution starts by describing the human face by an AAM, then projects the appearance results to a Fisherspace using LDA to emphasize the different expression categories. Finally, the classification is based on the Mahalanobis distance.

Paper Nr.:
442
Title:
MPEG-7 DESCRIPTORS BASED CLASSIFIER FOR FACE/NON FACE DETECTION
Author(s):
Malek Nadil, Abdenour Labed and Feryel Souami
Abstract:
In this paper we present a high-level Face/Non-face classifier which can be integrated into a content-based image retrieval system. It helps to extract semantics from images prior to their retrieval. This two-step retrieval reduces the effects of semantic gaps on the performance of existing systems. To construct our classifier, we exploit a standardized MPEG-7 low-level descriptor. Experiments performed on images taken from two databases showed that our technique outperforms, in many cases, others presented in the literature.

Area 4 - Motion, Tracking and Stereo Vision

Paper Nr.:
12
Title:
TRAFFIC SURVEILLANCE USING GABOR FILTER BANK AND KALMAN PREDICTOR
Author(s):
Mehmet Celenk, James Graham and Santosh Singh
Abstract:
This paper describes a non-linear scene prediction method for use with traffic surveillance video. A Gabor-filter bank is selected as a primary detector for any changes in a given image sequence. The detected ROI (region of interest) in arbitrary motion is fed to a non-linear Kalman filter for predicting the next scene in time-varying video, which is subject to prediction error correction. Potential applications of this research are mainly in the areas of traffic control and monitoring, accident detection, traffic flow surveillance, and MPEG video-compression. Experimental results reported herein show that non-linear Kalman filtering based scene prediction is quite effective in the estimation of future frames in visual-band intensity driven sensing. The least mean square error (LMSE) in predicting future frames is relatively low, about 2 to 3% on average, proving the effectiveness of the approach for traffic-motion control and management.

Paper Nr.:
19
Title:
AUTOMATIC INITIALIZATION FOR BODY TRACKING - Using Appearance to Learn a Model for Tracking Human Upper Body Motions
Author(s):
Joachim Schmidt and Modesto Castrillón-Santana
Abstract:
Social robots require the ability to communicate and recognize the intention of a human interaction partner. Humans commonly make use of gestures during everyday life for interactive purposes. For a social robot, recognition of gestures is therefore a necessary skill. As a common intermediate step, the pose of an individual is tracked over time making use of a body model. For a system based on such a communication scenario, self-starting tracking is a favored characteristic. The acquisition of a suitable body model, however, is a complex task. This paper presents an approach to facilitate the acquisition of the body model during interaction. Taking advantage of a robust face detection algorithm provides the opportunity for automatic and markerless acquisition of a 3D body model using a monocular color camera. For the given human robot interaction scenario, a prototype has been developed for a single user configuration. It provides automatic initialization and failure recovery of a 3D body tracker based on head and hand detection information, delivering promising results.

Paper Nr.:
28
Title:
POSE ESTIMATION FROM LINES BASED ON THE DUAL-NUMBER METHODS
Author(s):
Caixia Zhang, Zhanyi Hu and Fengmei Sun
Abstract:
It is a classical problem in computer vision, photogrammetry and even mathematics to estimate the camera pose from a calibrated image of 3D entities (points or lines). Although lines provide a more stable image feature to match, and point features are often missing from consecutive images when carrying out a series of camera pose determinations, most papers use only point features, and line features have appeared only occasionally in the literature. In this paper, based on dual-number methods, we present a new method for pose estimation from lines and introduce a formula similar to that of a general rigid transformation, thereby setting up a unified framework of coordinate transformations for lines and points. Then, according to the coplanarity of the corresponding image line and space line, a new group of constraints is introduced. Although they are not independent of each other, redundant constraints may be used to improve the estimation precision in all practical applications where noise in the data cannot be avoided. Unlike the existing line-based methods, we do not use an isolated point on either the space line or the image line, but the whole line data. Thus, corner detection, along with its propagated error, is avoided. Simulations and tests on real images confirm the validity and usefulness of our method.

Paper Nr.:
29
Title:
MODEL-FREE MARKERLESS TRACKING FOR REMOTE SUPPORT IN UNKNOWN ENVIRONMENTS
Author(s):
Alexander Ladikos, Selim Benhimane, Nassir Navab and Mirko Appel
Abstract:
We propose a complete system that performs real-time markerless tracking for Augmented Reality-based remote user support in a priori unknown environments. In contrast to existing systems, which require a prior setup and/or knowledge about the scene, our system can be used without preparation. This is due to our tracking algorithm which does not need a 3D-model of the scene or a learning-phase for the initialization. This allows us to perform fast and robust markerless tracking of the objects which are to be augmented. The proposed solution does not require artificial markers or special lighting conditions. The only requirement is the presence of locally planar objects in the scene, which is true for almost every man-made structure and in particular technical installations. The augmentations are chosen by a remote expert who is connected to the user over a network and receives a live stream of the scene.

Paper Nr.:
35
Title:
IMAGE SEQUENCE STABILIZATION USING FUZZY KALMAN FILTERING AND LOG-POLAR TRANSFORMATION
Author(s):
Nikolaos Kyriakoulis, Antonios Gasteratos and Angelos Amanatiadis
Abstract:
Digital image stabilization (DIS) is the process that compensates for the undesired fluctuations of a frame’s position in an image sequence by means of digital image processing techniques. DIS techniques usually comprise two successive units. The first one estimates the motion and the second one compensates for it. In this paper, a novel digital image stabilization technique is proposed, which features a fuzzy Kalman estimation of the global motion vector in the log-polar plane. The global motion vector is extracted using four local motion vectors computed on respective sub-images in the log-polar plane. The proposed technique exploits both the advantages of the fuzzy Kalman system and the log-polar plane. The compensation is based on the motion estimation in the log-polar domain, filtered by the fuzzy Kalman system. The described technique outperforms in terms of response times, the output quality and the level of compensation.

Paper Nr.:
38
Title:
OPTICAL-FLOW FOR 3D ATMOSPHERIC MOTION ESTIMATION
Author(s):
Patrick Héas and Etienne Mémin
Abstract:
In this paper, we address the problem of estimating three-dimensional motions of a stratified atmosphere from satellite image sequences. The complexity of three-dimensional atmospheric fluid flows, combined with the incomplete observation of atmospheric layers due to the sparsity of cloud systems, makes the estimation of a dense atmospheric motion field from satellite image sequences very difficult. The recovery of the vertical component of fluid motion from a monocular sequence of image observations is a very challenging problem for which no solution exists in the literature. Based on a physically sound vertical decomposition of the atmosphere into layers of different altitudes, we propose here a dense motion estimator dedicated to the extraction of three-dimensional wind fields characterizing the dynamics of a layered atmosphere. Wind estimation is performed over the complete three-dimensional space using a multi-layer model describing a stack of dynamic horizontal layers of evolving thickness, interacting at their boundaries via vertical winds. The efficiency of our approach is demonstrated on synthetic and real sequences.

Paper Nr.:
40
Title:
GLOBAL DEPTH ESTIMATION FOR MULTI-VIEW VIDEO CODING USING CAMERA PARAMETERS
Author(s):
Xiaoyun Zhang, Weile Zhu and George Yang
Abstract:
JVT has decided to focus on the multi-view video plus depth (MVD) data format for Multi-view Video Coding (MVC), in order to support rendering a wide continuum of views at the decoder for advanced 3DV and FVV systems. Thus, it is important to study global depth to minimize the rate for depth side information and to improve depth search efficiency. In this paper, we propose a global depth estimation algorithm from multi-view images using camera parameters. First, an initial depth value is obtained from the convergent point of the camera system by solving a set of linear equations. Then, the global depth is searched to minimize the absolute difference between the synthesized view and the practical view. Because the initial depth can provide an appropriate depth search range and step size, the global depth can be estimated efficiently and quickly with less computation. Experimental results verify the algorithm performance.

Paper Nr.:
51
Title:
TOWARDS EUCLIDEAN RECONSTRUCTION FROM VIDEO SEQUENCES
Author(s):
Dimitri Bulatov
Abstract:
This paper presents two algorithms needed to perform a dense 3D-reconstruction from video streams recorded with uncalibrated cameras. Our algorithm for camera self-calibration makes extensive use of the constant focal length. Furthermore, a fast dense reconstruction can be performed by fusion of tessellations obtained from different sub-sequences (LIFT). Moreover, we will present our system for performing the reconstruction in a projective coordinate system. Since critical motions are common in the majority of practical situations, care has been taken to recognize and deal with them.

Paper Nr.:
57
Title:
BACKGROUND SUBTRACTION WITH ADAPTIVE SPATIO-TEMPORAL NEIGHBORHOOD ANALYSIS
Author(s):
Marco Cristani and Vittorio Murino
Abstract:
In the literature, visual surveillance methods based on joint pixel and region analysis for background subtraction are proven to be effective in discovering foreground objects in cluttered scenes. Typically, per-pixel foreground detection is contextualized in a local neighborhood region in order to limit false alarms. However, such methods have a heavy computational cost, depending on the size of the surrounding region considered for each pixel. In this paper, we propose an original and efficient joint pixel-region analysis technique able to automatically select the sampling rate with which pixels in different areas are checked, while adapting the size of the neighborhood region considered. The algorithm has been validated on standard videos with benchmark tests, proving the goodness of the approach, especially in terms of quality of the detection with respect to the frame rate achieved.

Paper Nr.:
60
Title:
SIMPLE BUT EFFECTIVE TREE STRUCTURES FOR DYNAMIC PROGRAMMING-BASED STEREO MATCHING
Author(s):
Michael Bleyer and Margrit Gelautz
Abstract:
This work describes a fast method for computing dense stereo correspondences that is capable of generating results close to the state-of-the-art. We propose running a separate disparity computation process in each image pixel. The idea is to root a tree graph on the pixel whose disparity needs to be reconstructed. The tree thereby forms an individual approximation of the standard four-connected grid for this specific pixel. An exact optimum of a predefined energy function on the applied tree structure is determined via dynamic programming (DP), and the root pixel is assigned to the disparity of optimal costs. We present two simple tree structures that allow for the efficient calculation of all trees' optima with only four scanline-based DP passes. These simple trees are designed to capture all pixels of the reference frame and incorporate horizontal and vertical smoothness edges in order to weaken the scanline streaking problem inherent in DP-based approaches. We evaluate our results using the Middlebury test set. Our algorithm currently ranks at the eighth position of approximately 30 algorithms in the Middlebury database. More importantly, it is the currently best-performing method that does not use image segmentation and is significantly faster than most competing algorithms. Our method needs less than a second to determine the disparity map for typical stereo pairs.

Paper Nr.:
66
Title:
A FAST POST-PROCESSING TECHNIQUE FOR REAL-TIME STEREO CORRESPONDENCE
Author(s):
Georgios - Tsampikos Michailidis, Leonidas Kotoulas and Ioannis Andreadis
Abstract:
In computer vision, the extraction of dense and accurate disparity maps is a computationally expensive and challenging problem, and high quality results typically require from several seconds to several minutes to be obtained. In this paper, we present a new post-processing technique, which detects the incorrectly reconstructed pixels after the initial matching process and replaces them with correct disparity values. Experimental results with Middlebury data sets show that our approach can process images of up to 3MPixels in less than 3.3 msec, producing at the same time semi-dense (up to 99%) and accurate (up to 94%) disparity maps. We also propose a way to adaptively change, in real time, the density and the accuracy of the extracted disparity maps. In addition, the matching and post-processing procedures are calculated without using any multiplication, which makes the algorithm very fast, while its reduced complexity simplifies its implementation. Finally, we present the hardware implementation of the proposed algorithm.

Paper Nr.:
71
Title:
TOUCH-LESS PALM PRINT BIOMETRIC SYSTEM
Author(s):
Michael Goh Kah Ong, Connie Tee and Andrew Teoh Beng Jin
Abstract:
In this research, we propose an innovative touch-less palm print recognition system. This project is motivated by the public’s demand for non-invasive and hygienic biometric technology. For various reasons, users are concerned about touching the biometric scanners. Therefore, we propose to use a low-resolution web camera to capture the user’s hand at a distance for recognition. The users do not need to touch any device for their palm print to be extracted for analysis. A novel hand tracking and palm print region of interest (ROI) extraction technique is used to track and capture the user’s palm in real time video streams. The discriminative palm print features are extracted using a new method that applies a local binary pattern (LBP) texture descriptor to the palm print directional gradient responses. Experiments show promising results using the proposed method. Performance can be further improved when a modified probabilistic neural network (PNN) is used for feature matching.
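For readers unfamiliar with the local binary pattern descriptor named above, the basic 8-neighbour operator can be sketched as below. This is a generic textbook form, not the authors' variant: the neighbour ordering, the `>=` comparison and the function names are assumptions of this example, and the input grid stands in for whatever response image (for instance a directional gradient response) the descriptor is applied to.

```python
def lbp_code(img, r, c):
    """Return the 8-bit LBP code of the pixel at row r, column c."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels of a 2D grid."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Hypothetical 3x3 response patch with bright corners and dark edges.
patch = [[9, 1, 9],
         [1, 5, 1],
         [9, 1, 9]]
histogram = lbp_histogram(patch)
```

The histogram of codes, rather than the individual codes, is what is typically used as the texture feature for matching.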

Paper Nr.:
76
Title:
USING LOW-LEVEL MOTION TO ESTIMATE GAIT PHASE
Author(s):
Ben Daubney, David Gibson and Neill Campbell
Abstract:
This paper presents a method that is capable of robustly estimating gait phase of a human walking using the motion of a sparse cloud of feature points extracted using a standard feature tracker. We first learn statistical motion models of the trajectories we would expect to observe for each of the main limbs. By comparing the motion of the tracked features to our models and integrating over all features we create a state probability matrix that represents the likelihood of being at a particular phase as a function of time. By using dynamic programming and allowing only likely phase transitions to occur between consecutive frames an optimal solution can be found that estimates the gait phase for each frame. This work demonstrates that despite the sparsity and noise contained in the tracking data the information encapsulated in the motion of these points is sufficient to extract gait phase to a high level of accuracy. Presented results demonstrate our system is robust to changes in height of the walker, gait frequency and individual gait characteristics.
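The dynamic programming step described above can be illustrated with a small Viterbi-style decoder. This is a sketch under stated assumptions rather than the authors' implementation: the phases are taken to be cyclic, a frame may either keep its phase or advance by at most `max_step` phases, all allowed transitions are weighted equally, and the function name and the toy matrix are hypothetical.

```python
import math


def decode_gait_phase(likelihood, max_step=1):
    """Most likely phase sequence given likelihood[t][p] for frame t, phase p."""
    n_phases = len(likelihood[0])

    def log(x):
        return math.log(x) if x > 0 else float("-inf")

    score = [log(v) for v in likelihood[0]]
    back = []
    for frame in likelihood[1:]:
        new_score, pointers = [], []
        for p in range(n_phases):
            # Allowed predecessors: same phase, or up to max_step phases earlier.
            candidates = [(score[(p - s) % n_phases], (p - s) % n_phases)
                          for s in range(max_step + 1)]
            best, prev = max(candidates)
            new_score.append(best + log(frame[p]))
            pointers.append(prev)
        score = new_score
        back.append(pointers)

    # Backtrack from the best final phase.
    phase = max(range(n_phases), key=lambda p: score[p])
    path = [phase]
    for pointers in reversed(back):
        phase = pointers[phase]
        path.append(phase)
    return path[::-1]


# Toy state probability matrix: 4 frames, 3 phases.
matrix = [[0.90, 0.05, 0.05],
          [0.60, 0.30, 0.10],
          [0.10, 0.80, 0.10],
          [0.10, 0.20, 0.70]]
phases = decode_gait_phase(matrix)
```

Restricting the predecessors is what suppresses implausible jumps: a frame whose most likely phase would require skipping a phase is explained by a feasible sequence instead.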

Paper Nr.:
77
Title:
EXPERIMENTAL EVALUATION OF RELATIVE POSE ESTIMATION ALGORITHMS
Author(s):
Marcel Brückner, Ferid Bajramovic and Joachim Denzler
Abstract:
We give an extensive experimental comparison of four popular relative pose (epipolar geometry) estimation algorithms: the eight, seven, six and five point algorithms. We focus on the practically important case that only a single solution may be returned by automatically selecting one of the solution candidates, and investigate the choice of error measure for the selection. We show that the five point algorithm gives very good results with automatic selection. As sometimes the eight point algorithm is better, we propose a combination algorithm which selects from the solutions of both algorithms and thus combines their strengths. We further investigate the behavior in the presence of outliers by using adaptive RANSAC, and give practical recommendations for the choice of the RANSAC parameters. Finally, we verify the simulation results on real data.
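One reason the minimal sample size matters in a RANSAC setting can be seen from the standard iteration-count formula, N = log(1 - p) / log(1 - w^s), where p is the desired confidence, w the inlier ratio and s the number of points per sample. The sketch below implements this textbook formula only; it is not taken from the paper, and the function name and default confidence are assumptions of this example.

```python
import math


def ransac_iterations(inlier_ratio, sample_size, confidence=0.99):
    """Iterations needed to draw at least one all-inlier sample with the given confidence."""
    if inlier_ratio <= 0.0:
        return math.inf          # no inliers: no number of samples suffices
    p_good = inlier_ratio ** sample_size   # chance that one sample is outlier-free
    if p_good >= 1.0:
        return 1                 # every sample is outlier-free
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good))


# With half of the correspondences being outliers, a smaller minimal
# sample needs far fewer iterations than a larger one.
five_point = ransac_iterations(0.5, 5)
eight_point = ransac_iterations(0.5, 8)
```

In adaptive RANSAC the inlier ratio is not known in advance; it is re-estimated from the best hypothesis found so far, and the iteration bound is recomputed with the same formula.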

Paper Nr.:
87
Title:
EXPERIMENTAL EVALUATION OF RELATIVE POSE ESTIMATION ALGORITHMS
Author(s):
Marcel Brückner, Ferid Bajramovic and Joachim Denzler
Abstract:
Recently, much work has been devoted to multiple object tracking on the one hand and to appearance model adaptation for a single object tracker on the other. In this paper, we do both: tracking of multiple objects (faces of people) in a meeting scenario and on-line learning to incrementally update the models of the tracked objects to account for appearance changes during tracking. Additionally, we automatically initialize and terminate tracking of individual objects based on low-level features, i.e. face color, face size, and object movement. For tracking, a particle filter is incorporated to propagate sample distributions over time. Numerous experiments on meeting data demonstrate the capabilities of our tracking approach. Additionally, we provide an empirical verification of appearance model learning during tracking on an indoor and an outdoor scene, which supports more robust tracking.

Paper Nr.:
89
Title:
A SLAG TEMPERATURE AND FLOW MONITORING SYSTEM
Author(s):
Jean-Philippe Andreu
Abstract:
Quality assessment of steel processing essentially relies on the continuous monitoring and control of the steel temperature and the flow patterns of the molten material. Among the various sensors developed to control that process, CCD sensors emerge as a good alternative to more classical measuring devices like thermocouple probes and pyrometers. While thermographic infrared cameras are often discarded as an option because of their high cost, multi-spectral imaging systems based on cameras working in the visible spectrum offer a viable alternative. This paper presents a slag monitoring system based on dual wavelength thermographic cameras. The system allows a real-time and contactless monitoring of the slag temperature and, as an added-value from the continuous video monitoring, it provides the flow patterns of the ingot slag topping in order to assess the quality of the steel processing.

Paper Nr.:
90
Title:
THE ACCURACY OF SCENE RECONSTRUCTION FROM IR IMAGES BASED ON KNOWN CAMERA POSITIONS - An Evaluation with the Aid of LiDAR Data
Author(s):
Stefan Lang, Marcus Hebel and Michael Kirchhof
Abstract:
In this work a system for 3D scene reconstruction from aerial infrared imagery by means of known pose and position information of the sensor is presented. Detected 2D image features are tracked and triangulated afterwards. Each estimated 3D point is assessed by means of its covariance matrix which is associated with the respective uncertainty. Finally a non-linear optimization (Gauss-Newton iteration) of 3D points yields the resulting point cloud. The obtained results are evaluated with the aid of LiDAR data. For that purpose we present a novel approach which quantifies the error of a reconstructed scene by means of a 3D point cloud acquired by a laser scanner. The evaluation procedure takes into account that the main uncertainty of a Structure from Motion (SfM) system is in direction of the line of sight. Results of both the SfM system and the evaluation are presented.

Paper Nr.:
101
Title:
IMPLEMENTATION OF REAL-TIME VISUAL TRACKING SYSTEM FOR AIRBORNE TARGETS
Author(s):
Muhammad Asif Memon, Furqan Muhammad Khan, Farrukh H. Khan, Rana Muhammad Anees and Omair Abdul Rahman
Abstract:
A real-time visual tracking system is presented for tracking airborne targets. The algorithm is based on intensity difference between background and the target in a gray-scale frame. As the background is uniform for aerial videos, decision is made on contrast between tracking gate boundary and the target inside that gate. The algorithm is embedded on DSP Starter Kit (DSK) 6713 and a 586 embedded controller is used for servo control and processing. A personal computer (PC) provides the user interface for the system. The performance of the system is verified with different airborne targets from birds to helicopters and its reliability and constraints are observed.

Paper Nr.:
110
Title:
REAL-TIME OBJECT DETECTION AND TRACKING FOR INDUSTRIAL APPLICATIONS
Author(s):
Selim Benhimane, Hesam Najafi, Matthias Grundmann, Yakup Genc, Nassir Navab and Ezio Malis
Abstract:
This paper deals with the fundamental problem of simultaneously tracking complex objects and accurately estimating the 3D displacement of the camera. It targets applications in industrial environments. We adapted recently proposed methods in order to overcome most of the limitations that the community is facing in markerless applications. The proposed algorithm makes it possible to detect and track complex industrial machines that have poor textures in order to provide the user with information virtually overlaid on the acquired images. It is tailored such that the most recent advances in computer vision and in pattern recognition are combined to realize a solution able to cope with real industrial environments.

Paper Nr.:
120
Title:
3D ARTICULATED HAND TRACKING BY NONPARAMETRIC BELIEF PROPAGATION ON FEASIBLE CONFIGURATION SPACE
Author(s):
Tangli Liu, Wei Liang and Yunde Jia
Abstract:
An efficient articulated hand tracking method based on a 3D graphical model from monocular image sequences is proposed in this paper. Because inaccurate dependences among the components of the human hand led to distorted estimates in previous work, we design a pertinence graphical model combined with domain-specific heuristics among the components of the human hand, describing the hand's 3D structure, kinematics, and dynamics. The proposed model decomposes multivariate joint distributions into a set of local interactions among small subsets. The modular structure provides an intuitive language for expressing domain-specific knowledge about the variable relationships, and facilitates tracking each hand component independently. We then provide a novel belief propagation algorithm for inference in the hand graphical model. The algorithm can accommodate an extremely broad class of potential functions besides the potentials appropriate for our model. The experimental results show the robustness and efficiency of tracking each hand component.

Paper Nr.:
122
Title:
A NOVEL EVOLUTIONARY FRAMEWORK FOR FEATURE MATCHING
Author(s):
Biao Wang and Chaoying Tang
Abstract:
The paper presents a new feature matching scheme based on Queen-bee Evolution for two uncalibrated images. Matching features requires an exhaustive search in a vast space, for which evolutionary algorithms have recently been recommended. This paper proposes a simple and effective algorithm. We intuitively encode a string of integer numbers assigned to the features as chromosomes and develop a corresponding novel crossover operator which can preserve the position information without any disruption. We also tailor the swap mutation operator to prevent premature convergence and invalid solutions. As a result, in cooperation with linear ranking selection and elitist replacement, the proposed algorithm can quickly reach the global or near-global optimal solution. Moreover, it is a more general framework for matching various types of features. The experimental results illustrate the performance of the proposed approach.

Paper Nr.:
123
Title:
TRACK AND CUT: SIMULTANEOUS TRACKING AND SEGMENTATION OF MULTIPLE OBJECTS WITH GRAPH CUTS
Author(s):
Aurelie Bugeau and Patrick Pérez
Abstract:
This paper presents a new method to both track and segment multiple objects in videos using min-cut/max-flow optimizations. We introduce objective functions that combine low-level pixel-wise measures (color, motion), high-level observations obtained via an independent detection module (connected components of foreground detection masks in the experiments), motion prediction and contrast-sensitive contextual regularization. One novelty is that external observations are used without adding any association step. The minimization of these cost functions simultaneously allows "detection-before-track" tracking (track-to-observation assignment and automatic initialization of new tracks) and segmentation of tracked objects. When several tracked objects get mixed up by the detection module (e.g., single foreground detection mask for objects close to each other), a second stage of minimization allows the proper tracking and segmentation of these individual entities despite the observation confusion. Experiments on sequences from PETS 2006 corpus demonstrate the ability of the method to detect, track and precisely segment persons as they enter and traverse the field of view, even in cases of occlusions (partial or total), temporary grouping and frame dropping.

Paper Nr.:
153
Title:
DETECTING, TRACKING AND COUNTING FISH IN LOW QUALITY UNCONSTRAINED UNDERWATER VIDEOS
Author(s):
Yun-Heh Chen-Burger, Gayathri Nadarajan and Robert B. Fisher
Abstract:
In this work a machine vision system capable of analysing underwater videos for detecting, tracking and counting fish is presented. The real-time videos, collected near the Ken-Ding sub-tropical coral reef waters, are managed by EcoGrid, Taiwan and are barely analysed by marine biologists. The video processing system consists of three subsystems: the video texture analysis, fish detection and tracking modules. The fish detection is based on two algorithms computed independently, whose results are combined in order to obtain a more accurate outcome. The tracking is carried out by applying the CamShift algorithm, which enables the tracking of objects whose numbers may vary over time. Unlike existing fish-counting methods, our approach provides a reliable method in which the fish number is computed in unconstrained environments and under several scenarios (murky water, algae on camera lens, moving plants, low contrast, etc.). The proposed approach was tested with 20 underwater videos, achieving an overall accuracy as high as 85%.

Paper Nr.:
164
Title:
MULTI-LANE VISUAL PERCEPTION FOR LANE DEPARTURE WARNING SYSTEMS
Author(s):
Juan M. Collado, Cristina Hilario, Arturo de la Escalera and Jose M. Armingol
Abstract:
This paper presents a Road Detection and Tracking algorithm for Lane Departure Warning Systems. An inverse perspective transformation gives a bird's-eye view of the road, where longitudinal road markings are detected by exploring the horizontal gradient, looking for a road marking model. Next, a parabolic lane model is fitted to the road markings and tracked through a particle filter. The right and left lane boundaries are classified into three types (solid, broken or merge lane boundaries) through a Fourier analysis, and adjacent lanes are searched when broken or merge lines are detected. This gives the system the ability to automatically detect the number and type of road lanes. This ability makes it possible to tell the difference between allowed and forbidden manoeuvres, such as crossing a solid line, and it is used by the lane departure warning system. Despite its importance, lane boundary classification has seldom been considered in previous works. A Lane Departure Warning System launches an acoustic signal when a lane departure is detected. Warnings are suppressed when the blinkers are enabled, or when the vehicle is crossing a solid line regardless of the state of the blinkers.
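The parabolic lane model mentioned above can be illustrated with an ordinary least-squares fit. This is a minimal sketch under assumptions: the abstract does not give the parameterisation, so the form x = a*y^2 + b*y + c in bird's-eye-view coordinates and the function name `fit_parabolic_lane` are hypothetical.

```python
import numpy as np

def fit_parabolic_lane(points):
    """Least-squares fit of x = a*y**2 + b*y + c to road-marking points.

    points: iterable of (x, y) pairs in the bird's-eye view, where y is the
            longitudinal distance and x the lateral offset (assumed layout).
    Returns the coefficients (a, b, c).
    """
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    # np.polyfit returns the highest-order coefficient first.
    a, b, c = np.polyfit(y, x, deg=2)
    return a, b, c
```

Modelling x as a function of y keeps near-vertical markings well conditioned, which is the usual reason for this choice in a top-down view.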

Paper Nr.:
175
Title:
A FEATURE GUIDED PARTICLE FILTER FOR ROBUST HAND TRACKING
Author(s):
Matti-Antero Okkonen, Janne Heikkilä and Matti Pietikäinen
Abstract:
Particle filtering offers an interesting framework for visual tracking. Unlike the Kalman filter, particle filters can deal with non-linear and non-Gaussian problems, which makes them suitable for visual tracking in presence of real-life disturbance factors, such as background clutter and movement, fast and unpredictable object movement and unideal illumination conditions. This paper presents a robust hand tracking particle filter algorithm which exploits the principle of importance sampling with a novel proposal distribution. The proposal distribution is based on effectively calculated color blob features, propagating the particles robustly through time even in unideal conditions. In addition, a novel method for conditional color model adaptation is proposed. The experiments show that using these methods in the particle filtering framework enables hand tracking with fast movements under real world conditions.
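The importance-sampling principle referred to above can be sketched generically: particles drawn from a proposal distribution are weighted by likelihood times transition density divided by proposal density. The code below shows only this textbook weighting, not the colour-blob proposal of the paper; the function name and argument layout are assumptions.

```python
import numpy as np

def importance_weights(likelihood, transition, proposal):
    """Normalised importance weights for particles drawn from a proposal.

    likelihood: p(z_t | x_t) evaluated at each particle.
    transition: p(x_t | x_{t-1}) evaluated at each particle.
    proposal:   q(x_t | x_{t-1}, z_t) evaluated at each particle (must be > 0).
    """
    likelihood = np.asarray(likelihood, dtype=float)
    transition = np.asarray(transition, dtype=float)
    proposal = np.asarray(proposal, dtype=float)
    raw = likelihood * transition / proposal
    total = raw.sum()
    if total <= 0.0:
        # Degenerate case: no particle explains the data; use uniform weights.
        return np.full(raw.shape, 1.0 / raw.size)
    return raw / total
```

When the proposal equals the transition density the two terms cancel and the weights reduce to the normalised likelihood, which is the plain bootstrap filter.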

Paper Nr.:
181
Title:
BIOLOGICALLY INSPIRED ATTENTIVE MOTION ANALYSIS FOR VIDEO SURVEILLANCE
Author(s):
Florian Raudies and Heiko Neumann
Abstract:
Recently proposed algorithms in the field of vision-based video surveillance are built upon directionally consistent flow, or statistics of foreground and background. Here, we present a novel approach which utilizes an attention mechanism to focus processing on (highly) suspicious image regions. The attention signal is generated from temporal integration of localized image features from monocular image sequences. This model uses biologically inspired mechanisms, like feature extraction and grouping, to analyze spatio-temporal patterns with the aim of defining scene signatures. The main parts of the model are the construction of a motion streak image, the estimation of image flow, and the incorporation of information from both parts for the computation of an attention signal. This incorporation of information is motivated by feature binding, assumed to exist at various stages in biologically plausible systems. We compare our model with an existing approach for the task of video surveillance with a receiver operating characteristic (ROC) analysis. In conclusion, our model is shown to yield results which are comparable with existing approaches.

Paper Nr.:
188
Title:
DEPTH PREDICTION AT HOMOGENEOUS IMAGE STRUCTURES
Author(s):
Sinan Kalkan, Florentin Wörgötter and Norbert Krüger
Abstract:
This paper proposes a voting-based model that predicts depth at weakly-structured image areas from the depth that is extracted using a feature-based stereo method. We provide results, on both real and artificial scenes, that show the accuracy and robustness of our approach. Moreover, we compare our method to different dense stereo algorithms to investigate the effect of texture on performance of the two different approaches. The results confirm the expectation that dense stereo methods are suited better for textured image areas and our method for weakly-textured image areas.

Paper Nr.:
193
Title:
CORRELATION ICP ALGORITHM FOR POSE ESTIMATION BASED ON LOCAL AND GLOBAL FEATURES
Author(s):
Marco A. Chavarria and Gerald Sommer
Abstract:
In this paper we present a new variant of the ICP (iterative closest point) algorithm based on local feature correlation. Our approach combines global and local feature information to find better correspondence sets and to use them to compute the 3D pose of the object model, even for the case of large displacements between model and image data. For such cases, we propose a 2D alignment in the image plane (rotation plus translation) before the feature extraction process. This has some advantages over the classical methods, such as better convergence and robustness. Furthermore, it avoids the need for a normal pre-alignment step in 3D. Our approach was tested on synthetic and real-world data to compare the convergence behavior and performance against other versions of the ICP algorithm combined with a classical pre-alignment approach.

Paper Nr.:
212
Title:
A NEW SET OF FEATURES FOR ROBUST CHANGE DETECTION
Author(s):
José Sigut, Sid-Ahmed Ould Sidha, Juan Díaz and Carina González
Abstract:
A new set of features for robust change detection is proposed. These features are obtained from a transformation of the thresholded intensity difference image. Their performance is tested on two video sequences acquired in a human-machine interaction scenario under very different illumination conditions. Several performance measures are computed and a comparison with other well known classical change detection methods is done. The performed experiments show the effectiveness and robustness of our proposal.
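The thresholded intensity difference image that the proposed features start from can be sketched as below. This shows only that starting point, under the assumption of 8-bit grey-level frames; the transformation that yields the actual features is not described in the abstract and is not reproduced here. The function name is hypothetical.

```python
import numpy as np

def thresholded_difference(frame, reference, threshold):
    """Binary change mask: True where |frame - reference| exceeds threshold."""
    # Cast to a signed type first so that uint8 subtraction cannot wrap around.
    diff = np.abs(np.asarray(frame).astype(np.int32)
                  - np.asarray(reference).astype(np.int32))
    return diff > threshold
```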

Paper Nr.:
215
Title:
HAND GESTURE TRACKING FOR WEARABLE COMPUTING SYSTEMS
Author(s):
Xiujuan Chai, Kongqiao Wang, Luosi Wei and Hao Wang
Abstract:
Wearable computing has been a hot research field in recent years. Because of its important role in wearable computing systems, hand gesture tracking attracts the interest of many researchers. This paper proposes a simple but effective temporal differencing based hand motion tracking scheme which is used to build an augmented drumming system. In our method, accurate motion information is obtained by a fine-coarse-fine strategy. Once the motion region candidates are obtained, a skin detector based on a skin colour histogram is used to determine which region is the hand of interest. In the tracking procedure, a motion direction constraint is also adopted in order to get a robust result. Unlike traditional skin detection over the whole image frame, our hand detection is combined with motion region detection and is therefore no longer affected by skin-like background. Experimental results show that the presented hand gesture tracking is robust and fast. We also integrate it into an augmented drumming system to show the good performance and potential of our method in wearable computing systems.

Paper Nr.:
217
Title:
EXACT VISUAL HULL FROM MARCHING CUBES
Author(s):
Chen Liang and Kwan-Yee K. Wong
Abstract:
The marching cubes algorithm has been widely adopted for extracting a surface mesh from a volumetric description of the visual hull reconstructed from silhouettes. However, typical volumetric descriptions, such as an octree, provide only a binary description of the visual hull. The lack of interpolation information along each voxel edge, which is required by the marching cubes algorithm, usually results in an inaccurate and bumpy surface mesh. In this paper, we propose a novel method to efficiently estimate the exact intersections between voxel edges and the visual hull boundary, which replace the missing interpolation information. The method improves both the visual quality and accuracy of the estimated visual hull mesh, while retaining the simplicity and robustness of the volumetric approach. To verify this claim, we present both synthetic and real-world experiments, as well as comparisons with existing volumetric approaches and other approaches targeting an exact visual hull reconstruction.

Paper Nr.:
218
Title:
ROBUST MULTI-TARGET TRACKING USING MEAN SHIFT AND PARTICLE FILTER WITH TARGET MODEL UPDATE
Author(s):
Hong Liu, Jintao Li, Yueliang Qian and Qun Liu
Abstract:
We propose a novel multiple target tracking algorithm combining Mean Shift and Particle Filter, and enhance the performance with a target model update process. Mean Shift has low complexity, but is weak in dealing with multi-modal probability density functions (pdfs). Particle Filter is robust to partial occlusion and can deal with multi-modal pdfs. In real applications, illumination conditions, the visual angle and object occlusion can change target appearance, and thus influence the quality of Particle Filter. For the multi-target tracking task, the mutual occlusion of targets and computational complexity are important problems for a tracking system. In this paper, the Mean Shift algorithm is embedded into the Particle Filter framework to get stable tracking and reduce computational load. To overcome the target appearance changes caused by illumination changes and object occlusion, target models are updated adaptively during tracking. Experimental results show that our tracking system can robustly track multiple targets with mutual occlusion and correctly maintain their identities with a smaller number of particles than Particle Filter.

Paper Nr.:
222
Title:
A PDES METHOD PRESERVING BOUNDARIES ON DENSE DISPARITY MAP RECONSTRUCTION
Author(s):
Ji Liu, Junjian Peng, Yuechao Wang and Yandong Tang
Abstract:
Over-smoothing restricts the application of PDEs in the field of dense disparity map reconstruction, because disparity map reconstruction usually requires preserving discontinuities in some areas such as the boundaries of objects. To preserve disparity discontinuities, this paper adopts two strategies. Firstly, ground control points (GCPs) are introduced as a soft constraint. Secondly, this paper designs a structure for the smoothness part of the energy functional which can preserve discontinuities effectively. Moreover, the adjustable parameters in the smoothness part improve its robustness. In experiments, we compare the proposed method with the graph cuts method and show that PDEs are also a useful solution for disparity map reconstruction and have the advantage of dealing with smooth images.

Paper Nr.:
228
Title:
PRINCIPLED DETECTION-BY-CLASSIFICATION FROM MULTIPLE VIEWS
Author(s):
Jérôme Berclaz, François Fleuret and Pascal Fua
Abstract:
Machine-learning based classification techniques have been shown to be effective at detecting objects in complex scenes. However, the final results are often obtained from the alarms produced by the classifiers through a post-processing which typically relies on ad hoc heuristics. Spatially close alarms are assumed to be triggered by the same target and grouped together. Here we replace those heuristics by a principled Bayesian approach, which uses knowledge about both the classifier response model and the scene geometry to combine multiple classification answers. We demonstrate its effectiveness for multi-view pedestrian detection. We estimate the marginal probabilities of presence of people at any location in a scene, given the responses of classifiers evaluated in each view. Our approach naturally takes into account both the occlusions and the very low metric accuracy of the classifiers due to their invariance to translation and scale. Results show our method produces one order of magnitude fewer false positives than a method that is representative of typical state-of-the-art approaches. Moreover, the framework we propose is generic and could be applied to any detection-by-classification task.

Paper Nr.:
229
Title:
STRUCTURE FROM OMNIDIRECTIONAL STEREO RIG MOTION FOR CITY MODELING
Author(s):
Michal Havlena, Tomáš Pajdla and Kurt Cornelis
Abstract:
This paper deals with a step towards a 3D reconstruction system for city modeling from omnidirectional video sequences using structure from motion together with stereo constraints. We concentrate on two issues. First, we show how the tracking and reconstruction paradigm was adapted to use omnidirectional images taken by lenses with a 180 degree field of view. This mainly concerns camera calibration, transforming the pixel locations into rays, and solving the minimal problem for 3D-to-2D matches using RANSAC. Secondly, we compare the results of the reconstruction using additional stereo constraints to the results when these constraints are not used and show that they are needed to make the reconstruction stable. Performance of the system is demonstrated on a sequence of 870 images acquired while driving in a city.

Paper Nr.:
230
Title:
3D HUMAN FACE MODELLING FROM UNCALIBRATED IMAGES USING SPLINE BASED DEFORMATION
Author(s):
Nikos Barbalios, Nikos Nikolaidis and Ioannis Pitas
Abstract:
Accurate and plausible 3D face reconstruction remains a difficult problem up to this day, despite the tremendous advances in computer technology and the continuous growth of the applications utilizing 3D face models (e.g. biometrics, movies, gaming). In this paper, a two-step technique for efficient 3D face reconstruction from a set of face images acquired using an uncalibrated camera is presented. Initially, a robust structure from motion (SfM) algorithm is applied over a set of manually selected salient image features to retrieve an estimate of their 3D coordinates. These estimates are further utilized to deform a generic 3D face model, using smoothing splines, and adapt it to the characteristics of a human face.

Paper Nr.:
266
Title:
KLT TRACKING USING INTRINSIC AND EXTRINSIC CAMERA PARAMETERS IN CONSIDERATION OF UNCERTAINTY
Author(s):
Michael Trummer, Joachim Denzler and Christoph Munkelt
Abstract:
Feature tracking is an important task in computer vision, especially for 3D reconstruction applications. Such procedures can be run in environments with a controlled sensor, e.g. a robot arm with camera. This yields the camera parameters as special knowledge that should be used during all steps of the application to improve the results. As a first step, KLT (Kanade-Lucas-Tomasi) tracking (and its variants) is an approach widely accepted and used to track image point features. So, it is straightforward to adapt KLT tracking in a way that camera parameters are used to improve the feature tracking results. The contribution of this work is an explicit formulation of the KLT tracking procedure incorporating known camera parameters. Since practical applications do not run without noise, the uncertainty of the camera parameters is taken into account and modeled within the procedure. Comparative practical experiments have been performed and the results are presented.

Paper Nr.:
265
Title:
FEATURE SETS FOR PEOPLE AND LUGGAGE RECOGNITION IN AIRPORT SURVEILLANCE UNDER REAL-TIME CONSTRAINTS
Author(s):
J. Rosell-Ortega, G. Andreu-García, A. Rodas-Jordà, V. Atienza-Vanacloig and J. Valiente-González
Abstract:
We study two different sets of features with the aim of classifying objects from videos taken in the halls and corridors of an airport. Objects are classified as being one of three different classes: single person, group of people, and luggage. We have used two different feature sets, one set based on classical geometric features, and another based on dividing the blob into several cells and calculating the density of foreground pictures in each cell. In both cases, easily computed features were selected because our system must run under real-time constraints. During the development of the algorithms, we also studied if shadows affect the classification rate of objects. We achieved this by applying two shadow removal algorithms to estimate the usefulness of such techniques under real-time constraints.

Paper Nr.:
269
Title:
CALIBRATION-FREE EYE GAZE DIRECTION DETECTION WITH GAUSSIAN PROCESSES
Author(s):
Basilio Noris, Karim Benmachiche and Aude G. Billard
Abstract:
In this paper we present a solution for eye gaze detection from a wireless head mounted camera designed for children aged between 6 months and 18 months. Due to the constraints of working with very young children, the system does not seek to be as accurate as other state-of-the-art eye trackers, however it requires no calibration process from the wearer. Gaussian Process Regression and Support Vector Machines are used to analyse the raw pixel data from the video input and return an estimate of the child's gaze direction. A confidence map is used to determine the accuracy the system can expect for each coordinate on the image.

Paper Nr.:
281
Title:
A MAXIMUM LIKELIHOOD SURFACE NORMAL ESTIMATION ALGORITHM FOR HELMHOLTZ STEREOPSIS
Author(s):
Jean-Yves Guillemaut, Ondřej Drbohlav, John Illingworth and Radim Šára
Abstract:
Helmholtz stereopsis is a relatively recent reconstruction technique which is able to reconstruct scenes with arbitrary and unknown surface reflectance properties. Conventional implementations of the method estimate surface normal direction at each surface point via an eigenanalysis, thereby optimising an algebraic distance. We develop a more physically meaningful radiometric distance whose minimisation is shown to yield a Maximum Likelihood surface normal estimate. The proposed method produces more accurate results than algebraic methods on synthetic imagery and yields excellent reconstruction results on real data. Our analysis explains why, for some imaging configurations, a sub-optimal algebraic distance can yield good results.

Paper Nr.:
283
Title:
LUCAS-KANADE INVERSE COMPOSITIONAL USING MULTIPLE BRIGHTNESS AND GRADIENT CONSTRAINTS
Author(s):
Ahmed Fahad and Tim Morris
Abstract:
A recently proposed fast image alignment algorithm is the inverse compositional algorithm based on Lucas-Kanade. In this paper, we present an overview of different brightness and gradient constraints used with the inverse compositional algorithm. We also propose an efficient and robust data constraint for the estimation of global motion from image sequences. The constraint combines brightness and gradient constraints under multiple quadratic errors. The method can accommodate various motion models. We concentrate on the global efficiency of the constraint in capturing the global motion for image alignment. We have applied the algorithm to various test sequences with ground truth. From the experimental results we conclude that the new constraint provides reduced motion error at the expense of extra computations.

Paper Nr.:
287
Title:
VIEW-BASED ROBOT LOCALIZATION USING ILLUMINATION-INVARIANT SPHERICAL HARMONICS DESCRIPTORS
Author(s):
Holger Friedrich, David Dederscheck, Martin Mutz and Rudolf Mester
Abstract:
In this work we present a view-based approach for robot self-localization using a hemispherical camera system. We use view descriptors that are based upon Spherical Harmonics as orthonormal basis functions on the sphere. The resulting compact representation of the image signal enables us to efficiently compare the views taken at different locations. With the view descriptors stored in a database, we compute a similarity map for the current view by means of a suitable distance metric. Advanced statistical models based upon PCA, introduced into that distance metric, also make it possible to deal with even severe illumination changes, which extends our method to real-world applications.

Paper Nr.:
289
Title:
MEASUREMENT NOISE IN PHOTOMETRIC STEREO BASED SURFACE RECONSTRUCTION
Author(s):
Toni Kuparinen, Ville Kyrki and Pekka Toivanen
Abstract:
In this paper, surface reconstruction techniques for surfaces with high frequency height variation are studied. Such surfaces are important for many industrial settings, for example, in paper and textile manufacturing. Traditionally, photometric stereo methods have been developed and evaluated on large objects with strong additive Gaussian noise. The paper presents the derivation of the effect of white image noise to gradient fields and proposes a denoising approach of the gradient fields using Wiener filter. Several known surface reconstruction methods are evaluated experimentally, with respect to the effect of the noise, and the boundary conditions of the reconstruction. The experimental results validate that the proposed approach improves the surface reconstruction on surfaces with high frequency height variation.

Paper Nr.:
293
Title:
AN EFFICIENT SENSOR FOR TRAFFIC MONITORING AND TRACKING APPLICATIONS
Author(s):
Nikolaos Zournis-Karouzos, Alexandra Koutsia, Kosmas Dimitropoulos and Nikos Grammalidis
Abstract:
We propose a novel video sensor for real-time motion detection at specific user-defined regions of interest, designed primarily for traffic monitoring, surveillance and tracking applications. The ultimate goal is to extend the capabilities and to alleviate shortcomings of embedded motion detection video sensors (like Autoscope®) for target tracking and surveillance applications, including road traffic monitoring or Advanced Surface Movement, Guidance and Control Systems (A-SMGCS) at airports. Specifically, the new sensor a) supports virtual detectors with a generalized (polygonal) shape, thus providing additional flexibility in the design of detector configurations, b) is based on fast implementations of recent state-of-the art background extraction and update techniques and c) constitutes a generic, inexpensive software solution, which can be used with any video camera. First experimental results confirm that the new video sensor meets the expectations in terms of real-time performance and demonstrates the additional functionalities, according to which it was designed. The final goal is to use this new sensor as an alternative, improved version of the Autoscope video sensors for the targeted applications.

Paper Nr.:
304
Title:
OMNIDIRECTIONAL CAMERA MOTION ESTIMATION
Author(s):
Akihiko Torii and Tomáš Pajdla
Abstract:
We present an automatic technique for computing relative camera motion and simultaneous omnidirectional image matching. Our technique works for small as well as large motions, and tolerates multiple moving objects and very large occlusions in the scene. We combine three principles and obtain a practical algorithm which improves the state of the art. First, we show that the correct motion is found much sooner if the tentative matches are sampled after ordering them by the similarity of their descriptors. Secondly, we show that the correct camera motion can be better found by soft voting for the direction of the motion than by selecting the motion that is supported by the largest set of matches. Finally, we show that it is useful to filter out the epipolar geometries which are not generated by points reconstructed in front of the cameras. We demonstrate the performance of the technique in an experiment with 189 image pairs acquired in a city and in a park. All camera motions were recovered with an error in the motion direction smaller than 8 degrees, which is 4% of the 183 degree field of view, w.r.t. the ground truth.

Paper Nr.:
317
Title:
PROBABILISTIC APPEARANCE-BASED NAVIGATION OF A MOBILE ROBOT
Author(s):
Luis Payá, Oscar Reinoso, Arturo Gil, M. Asuncion Vicente and Jose L. Aznar
Abstract:
This work presents an appearance-based approach to route following in multi-robot systems, using the information captured by a conventional forward-looking camera. In the teaching phase, the most relevant information along the route is stored using incremental Principal Components Analysis (PCA). Thanks to this approach, the follower robot can begin the route while the leader is still recording it and follow it at a distance, either in time or in space. The follower robot carries out a self-localization process, comparing the current view with the information stored in the database, using a probabilistic approach that takes into account the current sensory input and the previous position. Then, a fuzzy controller is in charge of calculating the speed and turning needed to follow the route. The inputs of this controller are also obtained from the visual information. Experimental results have shown the robustness of the algorithms in an office environment.

Paper Nr.:
322
Title:
AUTONOMOUS MODEL-BASED OBJECT IDENTIFICATION & CAMERA POSITION ESTIMATION WITH APPLICATION TO AIRPORT LIGHTING QUALITY CONTROL
Author(s):
James H. Niblock, Jian-Xun Peng, Karen R. McMenemy and George W. Irwin
Abstract:
The development of an autonomous system for the accurate measurement of the quality of aerodrome ground lighting (AGL) in accordance with current standards and recommendations is presented. The system is composed of an imager which is placed inside the cockpit of an aircraft to record images of the AGL during a normal descent to an aerodrome. Before the performance of the AGL is assessed, it is first necessary to uniquely identify each luminaire within the image and track it through the complete image sequence. A model-based (MB) methodology is used to ascertain the optimum match between a template of the AGL and the actual image data. Projective geometry, in addition to the image and real world location of the extracted luminaires, is then used to calculate the position of the camera at the instant the image was acquired. Algorithms are also presented which model the distortion apparent within the sensor's optical system and average the camera's intrinsic parameters over multiple frames, so as to minimise the effects of noise on the acquired image data and hence make the camera's estimated position and orientation more accurate. The positional information is validated using actual approach image data.

Paper Nr.:
328
Title:
MULTI-CAMERA DETECTION AND MULTI-TARGET TRACKING - Traffic Surveillance Applications
Author(s):
R. Reulke, S. Bauer, T. Döring and R. Spangenberg
Abstract:
Non-intrusive video-detection for traffic flow observation and surveillance is the primary alternative to conventional inductive loop detectors. Video Image Detection Systems (VIDS) can derive traffic parameters by means of image processing and pattern recognition methods. Existing VIDS emulate the inductive loops. We propose a trajectory based recognition algorithm to expand the common approach and to obtain new types of information (e.g. queue length or erratic movements). Different views of the same area by more than one camera sensor are necessary, because of the typical limitations of single camera systems, resulting from occlusions by other cars, trees and traffic signs. A distributed cooperative multi-camera system enables a significant enlargement of the observation area. The trajectories are derived from multi-target tracking. The fusion of object data from different cameras will be done by a tracking approach. This approach opens up opportunities to identify and specify traffic objects, their location, speed and other characteristic object information. The system creates new derived and consolidated information of traffic participants. Thus, also descriptions of individual traffic participants are possible.

Paper Nr.:
347
Title:
RANDOM FOREST CLASSIFIERS FOR REAL-TIME OPTICAL MARKERLESS TRACKING
Author(s):
Iñigo Barandiaran, Charlotte Cottez, Céline Paloc and Manuel Graña
Abstract:
Augmented reality (AR) is a very promising technology that can be applied in many areas such as healthcare, broadcasting or manufacturing industries. One of the bottlenecks of such applications is a robust real-time optical markerless tracking strategy. In this paper we focus on the development of tracking by detection for plane homography estimation. Feature or keypoint matching is a critical task in such an approach. We propose to apply machine learning techniques to solve this problem. We present an evaluation of an optical tracking implementation based on a Random Forest classifier. The implementation has been successfully applied to indoor and outdoor augmented reality design review applications.

Paper Nr.:
359
Title:
CAMERA MOTION ESTIMATION USING PARTICLE FILTERS
Author(s):
Symeon Nikitidis, Stefanos Zafeiriou and Ioannis Pitas
Abstract:
In this paper a novel algorithm for estimating the parametric form of the camera motion is proposed. In particular, a novel stochastic vector field model is proposed which can handle smooth motion patterns derived from long periods of stable camera movement and can also cope with rapid motion changes and periods where the camera remains still. A set of rules for robust and online updating of the model parameters is also proposed, based on the Expectation Maximization algorithm. Finally, we fit this model in a particle filter framework, in order to predict the future camera motion based on current and prior knowledge. Extensive experimental results verify the usefulness of the proposed scheme in camera motion pattern classification and in accurate estimation of the camera 2D affine transform parameters.

Paper Nr.:
364
Title:
ITERATIVE RIGID BODY TRANSFORMATION ESTIMATION FOR VISUAL 3-D OBJECT TRACKING
Author(s):
Micha Hersch, Thomas Reichert and Aude Billard
Abstract:
We present a novel yet simple 3D stereo vision tracking algorithm which computes the position and orientation of an object from the location of markers attached to the object. The novelty of this algorithm is that it does not assume that the markers are tracked synchronously. This provides a higher robustness to noise in the data, missing points and outliers. The principle of the algorithm is to perform a simple gradient descent on the rigid body transformation describing the object position and orientation. This is proved to converge to the correct solution and is illustrated in a simple experimental setup involving two USB cameras.
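
As a rough illustration of the principle described in this abstract, the sketch below runs a gradient descent on a rigid body transformation, updating the pose from one marker observation at a time. It is not the authors' algorithm: it is simplified to 2D (one rotation angle and a translation), and the function names, the marker layout, the learning rate and the noise-free observations are all assumptions made for the example.

```python
import math

def apply_pose(theta, tx, ty, marker):
    """Map a marker from object coordinates to world coordinates (2D)."""
    mx, my = marker
    c, s = math.cos(theta), math.sin(theta)
    return (c * mx - s * my + tx, s * mx + c * my + ty)

def update_pose(pose, marker, observed, lr=0.1):
    """One gradient step on 0.5 * ||R(theta) m + t - p||^2 for a single marker."""
    theta, tx, ty = pose
    mx, my = marker
    px, py = apply_pose(theta, tx, ty, marker)
    ex, ey = px - observed[0], py - observed[1]   # residual of this marker
    c, s = math.cos(theta), math.sin(theta)
    # Derivative of R(theta) m with respect to theta.
    dx, dy = -s * mx - c * my, c * mx - s * my
    g_theta = ex * dx + ey * dy
    return (theta - lr * g_theta, tx - lr * ex, ty - lr * ey)

# Hypothetical marker layout in the object frame and a hypothetical true pose.
markers = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
true_pose = (0.5, 0.3, -0.2)

pose = (0.0, 0.0, 0.0)
for step in range(3000):
    # Markers arrive one at a time, never as a synchronized set.
    m = markers[step % len(markers)]
    observed = apply_pose(*true_pose, m)
    pose = update_pose(pose, m, observed)
```

Because each update uses a single marker, a missing observation simply means that no update happens at that step, which is the property the abstract attributes to asynchronous tracking.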

Paper Nr.:
374
Title:
ANOMALY DETECTION WITH LOW-LEVEL PROCESSES IN VIDEOS
Author(s):
Ákos Utasi and László Czúni
Abstract:
In our paper we deal with the problem of low-level motion modeling and unusual event detection in urban surveillance videos. We model the direction of optical flow vectors at image pixels. We implemented and tested probability based approaches such as probability estimation, Mixture of Gaussians modeling, and spatial averaging (with Mean-shift segmentation). We propose a Markovian prior to get reliable spatio-temporal support. We tested the techniques on synthetic and real video sequences.

Paper Nr.:
380
Title:
ESTIMATING VEHICLE VELOCITY USING RECTIFIED IMAGES
Author(s):
Cristina Maduro, Katherine Batista, Paulo Peixoto and Jorge Batista
Abstract:
In this paper we propose a technique to estimate vehicle velocity, using rectified images that represent a top view of the highway. To rectify image sequences captured by uncalibrated cameras, this method automatically estimates two vanishing points using lines from the image plane. This approach requires two known lengths on the ground plane and can be applied to highways that are fairly straight near the surveillance camera. Once the background image is rectified it is possible to locate the stripes and boundaries of the highway lanes. This process may also be used to count vehicles, estimate their velocities and the mean velocity associated with each of the previously identified highway lanes.
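
To make the last step concrete, the following sketch shows how a speed could be read off a rectified (top view) image once a ground-plane length is known. It is only an illustration under stated assumptions: the rectified image is assumed to have a single uniform scale, and the function names, pixel coordinates, known length and frame rate are all hypothetical rather than taken from the paper.

```python
import math

def metres_per_pixel(known_length_m, pixel_a, pixel_b):
    """Scale of the rectified image, from one known ground-plane length."""
    return known_length_m / math.dist(pixel_a, pixel_b)

def speed_kmh(pos_prev, pos_curr, scale_m_per_px, fps, frames_elapsed=1):
    """Speed of a vehicle between two positions in the rectified image."""
    dist_m = math.dist(pos_prev, pos_curr) * scale_m_per_px
    dt_s = frames_elapsed / fps
    return dist_m / dt_s * 3.6   # m/s to km/h

# Hypothetical values: a 3 m lane stripe spans 60 px in the rectified image.
scale = metres_per_pixel(3.0, (100.0, 200.0), (100.0, 260.0))

# A vehicle moves 20 px between two consecutive frames at 25 frames per second.
v = speed_kmh((120.0, 400.0), (120.0, 380.0), scale, fps=25.0)
```

With these numbers the vehicle covers 1 m in 0.04 s, that is 25 m/s or 90 km/h.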

Paper Nr.:
401
Title:
LONG-TERM VS. GREEDY ACTION PLANNING FOR COLOR LEARNING ON A MOBILE ROBOT
Author(s):
Mohan Sridharan and Peter Stone
Abstract:
A major challenge in the path of widespread use of mobile robots is the ability to function autonomously, learning useful models for environmental features, and adapting these models in accordance with environmental changes. In this paper, we address an important subtask of robot vision, namely color modeling/learning. We present and analyze the performance of two algorithms that enable a mobile robot to plan an action sequence to facilitate color learning: local heuristic planning, and global action selection. We show that global planning, which maximizes color learning opportunities while minimizing localization errors, provides better performance. Our approach is fully implemented and tested on the Sony AIBO robots.

Paper Nr.:
433
Title:
AN ARTICULATED MODEL WITH A KALMAN FILTER FOR REAL TIME VISUAL TRACKING - Application to the Tracking of Pedestrians with a Monocular Camera
Author(s):
Youssef Rouchdy
Abstract:
This work presents a method for the visual tracking of articulated targets in image sequences in real time. Each part of the target object is considered as a region of interest and tracked by a parametric transformation. Prior geometric and dynamic information about the target is introduced with a Kalman filter to guide the evolution of the region tracking process. An articulated model with two areas is proposed and applied to track pedestrians in urban image sequences.

Bayesian Approach for Inverse Problems in Computer Vision

Paper Nr.:
486
Title:
USING LOGARITHMIC OPINION POOLING TECHNIQUES IN BAYESIAN BLIND MULTI-CHANNEL RESTORATION
Author(s):
Bruno Amizic, Aggelos K. Katsaggelos and Rafael Molina
Abstract:
In this paper we examine the use of logarithmic opinion pooling techniques to combine two observation models that are normally used in multi-channel image restoration techniques. The combined observation model is used together with simultaneous autoregression prior models for the image and blurs to define the joint distribution of image, blurs and observations. Assuming that all the unknown parameters are previously estimated, we use variational techniques to approximate the posterior distribution of the real underlying image and the unknown blurs. We will examine the use of two approximations of the posterior distribution. Experimental results are used to validate the proposed approach.
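
For readers unfamiliar with the term, a logarithmic opinion pool combines two distributions as a normalized weighted geometric mean, p(x) ∝ p1(x)^α · p2(x)^(1−α). The sketch below applies this general rule to two small discrete distributions. It is not the observation model of the paper: the function name, the example distributions and the weight α = 0.5 are assumptions chosen only to show the mechanics.

```python
def log_opinion_pool(p1, p2, alpha):
    """Pool two discrete distributions: p(x) proportional to p1(x)**alpha * p2(x)**(1 - alpha)."""
    unnormalized = [(a ** alpha) * (b ** (1.0 - alpha)) for a, b in zip(p1, p2)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Two hypothetical "opinions" over the same three outcomes.
p1 = [0.7, 0.2, 0.1]
p2 = [0.1, 0.3, 0.6]

pooled = log_opinion_pool(p1, p2, alpha=0.5)      # equal weight on both opinions
only_first = log_opinion_pool(p1, p2, alpha=1.0)  # reduces to p1
```

The weight α controls how much each source is trusted; setting it to 0 or 1 recovers one of the two input distributions unchanged.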

Paper Nr.:
499
Title:
VARIATIONAL BAYES WITH GAUSS-MARKOV-POTTS PRIOR MODELS FOR JOINT IMAGE RESTORATION AND SEGMENTATION
Author(s):
Hacheme Ayasso and Ali Mohammad-Djafari
Abstract:
In this paper, we propose a family of non-homogeneous Gauss-Markov fields with a Potts region label model for images, to be used in a Bayesian estimation framework in order to jointly restore and segment images degraded by a known point spread function and additive noise. The joint posterior law of all the unknowns (the unknown image, its segmentation hidden variable and all the hyperparameters) is approximated by separable probability laws via the variational Bayes technique. This approximation makes it possible to obtain a practically implementable joint restoration and segmentation algorithm. We will present some preliminary results and a comparison with an MCMC Gibbs sampling based algorithm.

Paper Nr.:
511
Title:
A MINIMUM ENTROPY IMAGE DENOISING ALGORITHM - Minimizing Conditional Entropy in a New Adaptive Weighted K-th Nearest Neighbor Framework for Image Denoising
Author(s):
Cesario Vincenzo Angelino, Eric Debreuve and Michel Barlaud
Abstract:
In this paper we address the image restoration problem in the variational framework. The focus is on denoising applications. Natural image statistics are consistent with a Markov random field (MRF) model for the image structure, thus in a restoration process attention must be paid to the spatial correlation between adjacent pixels. The proposed approach minimizes the conditional entropy of a pixel knowing its neighborhood. The estimation procedure of statistical properties of the image is carried out in a new adaptive weighted k-th nearest neighbor (AWkNN) framework. Experimental results show the interest of such an approach. Image quality is evaluated by means of the RMSE measure and the SSIM index, the latter being better adapted to the human visual system.

Online Pattern Recognition and Machine Learning Techniques for Computer-Vision Applications

Paper Nr.:
192
Title:
MULTITASK LEARNING - An Application to Incremental Face Recognition
Author(s):
David Masip, Àgata Lapedriza and Jordi Vitrià
Abstract:
Usually face classification applications suffer from two important problems: the number of training samples from each class is reduced, and the final system usually must be extended to incorporate new people to recognize. In this paper we introduce a face recognition method that extends a previous boosting-based classifier, adding new classes and avoiding the need to retrain the system each time a new person joins the system. The classifier is trained using the multitask learning principle, and multiple verification tasks are trained together sharing the same feature space. The new classes are added taking advantage of the previously learned structure, so that adding new classes is not computationally demanding. Our experiments with two different data sets show that the performance does not decrease drastically even when the number of classes of the base problem is multiplied by a factor of 8.

Paper Nr.:
418
Title:
AN ONLINE SELF-BALANCING BINARY SEARCH TREE FOR HIERARCHICAL SHAPE MATCHING
Author(s):
N. Tsapanos, A. Tefas and I. Pitas
Abstract:
In this paper we propose a self-balanced binary search tree data structure for shape matching. This was originally developed as a fast method of silhouette matching in videos recorded from IR cameras by firemen during rescue operations. We introduce a similarity measure with which we can make decisions on how to traverse the tree and backtrack to find more possible matches. Then we describe every basic operation a binary search tree can perform, adapted to a tree of shapes. Note that, as in a binary search tree, all operations can be performed in O(log n) time and are very fast and efficient. Finally we present experimental data evaluating the performance of our proposed data structure.

Paper Nr.:
444
Title:
CONTINUOUS LEARNING OF SIMPLE VISUAL CONCEPTS USING INCREMENTAL KERNEL DENSITY ESTIMATION
Author(s):
Danijel Skočaj, Matej Kristan and Aleš Leonardis
Abstract:
In this paper we propose a method for continuous learning of simple visual concepts. The method continuously associates words describing observed scenes with automatically extracted visual features. Since in our setting every sample is labelled with multiple concept labels, and there are no negative examples, reconstructive representations of the incoming data are used. The associated features are modelled with kernel density probability distribution estimates, which are built incrementally. The proposed approach is applied to the learning of object properties and spatial relations.

Paper Nr.:
447
Title:
ONLINE LEARNING OF GAUSSIAN MIXTURE MODELS - A Two-Level Approach
Author(s):
Arnaud Declercq and Justus H. Piater
Abstract:
We present a method for incrementally learning mixture models that avoids the necessity to keep all data points around. It contains a single user-settable parameter that controls via a novel statistical criterion the trade-off between the number of mixture components and the accuracy of representing the data. A key idea is that each component of the (non-overfitting) mixture is in turn represented by an underlying mixture that represents the data very precisely (without regards to overfitting); this allows the model to be refined without sacrificing accuracy.

Paper Nr.:
454
Title:
TIME DEPENDENT ON-LINE BOOSTING FOR ROBUST BACKGROUND MODELING
Author(s):
Helmut Grabner, Christian Leistner and Horst Bischof
Abstract:
In modern video surveillance systems change and outlier detection is of highest interest. Most of these systems are based on standard pixel-by-pixel background modeling approaches. In this paper, we propose a novel robust block-based background model suitable for outlier detection, using an extension to on-line boosting for feature selection. In order to be robust and still easy to operate, our system incorporates several novelties over both previously proposed on-line boosting algorithms and classifier-based background modeling systems. We introduce time-dependency and control into on-line boosting. Our system allows for automatically adjusting its temporal behavior to the underlying scene by using a control system which regulates the model parameters. The benefits of our approach are illustrated in several experiments on challenging standard datasets.

VISAPP International Workshop on Robotic Perception (VISAPP-RoboPerc08)

Paper Nr.:
412
Title:
Comparing Two Action Planning Approaches for Color Learning on a Mobile Robot
Author(s):
Mohan Sridharan and Peter Stone
Abstract:
A major challenge to the deployment of mobile robots in a wide range of tasks is the ability to function autonomously, learning appropriate models for environmental features and adapting those models, over time, in accordance with environmental changes. Such autonomous operation is feasible iff the robot is able to autonomously select/plan an action sequence that facilitates learning and adaptation. In this paper, we focus on the task of color modeling/learning, and present and analyze two algorithms that enable a mobile robot to plan action sequences that facilitate color learning. We propose a long-term action selection approach that maximizes color learning opportunities while minimizing localization errors over an entire action sequence, and compare it with a greedy/heuristic action selection approach that plans incrementally, one step at a time, to maximize the benefits based on the current state of the world. We show (experimentally) that long-term action selection results in a more principled solution that requires minimal human supervision, and that better failure recovery can be achieved by incorporating some features of the greedy planning approach as well. All algorithms are fully implemented and tested on the Sony AIBO robots.

Paper Nr.:
505
Title:
Implementation of an Intentional Vision System to support Cognitive Architectures
Author(s):
Ignazio Infantino, Carmelo Lodato, Salvatore Lopes and Filippo Vella
Abstract:
An effective cognitive architecture has to be able to model, recognize and interpret user intentions. The aim of the proposed framework is the development of an intentional vision system oriented to man-machine interaction. Such a system will be able to recognize user faces, and to recognize and track human postures by video cameras. It could be integrated in a cognitive software architecture, and could be tested in several demonstrative scenarios such as domotics, entertainment robotics, and so on. The described framework is organized in two modules, mapped to the corresponding outputs to obtain: intentional perception of faces, and intentional perception of human body movements. Moreover, a possible integration of the intentional vision module in a complete cognitive architecture is proposed.

Paper Nr.:
512
Title:
Data Fusion by Uncertain Projective Geometry in 6DoF Visual SLAM
Author(s):
Daniele Marzorati, Matteo Matteucci, Davide Migliore and Domenico G. Sorrenti
Abstract:
In this paper we face the issue of fusing 3D data from different sensors in a seamless way, using the unifying framework of uncertain projective geometry. Within this framework it is possible to describe, combine, and estimate various types of geometric elements (2D and 3D points, 2D and 3D lines, and 3D planes) taking their uncertainty into account. Because of the size of the data involved in this process, the integration process and thus the SLAM algorithm turns out to be very slow. For this reason, in this work, we propose the use of an R*-Tree data structure to speed up the whole process, managing in an efficient way both the estimated map and the 3D point clouds coming from the stereo camera. The experimental section shows that the use of uncertain projective geometry and the R*-Tree data structure improves the mapping and the pose estimation.

Paper Nr.:
513
Title:
Mutual Calibration of a Camera and a Laser Rangefinder
Author(s):
Vincenzo Caglioti, Alessandro Giusti and Davide Migliore
Abstract:
We present a novel geometrical method for mutually calibrating a camera and a laser rangefinder by exploiting the image of the laser dot in relation to the rangefinder reading. Our method simultaneously estimates all intrinsic parameters of a pinhole natural camera, its position and orientation w.r.t. the rangefinder axis, and four parameters of a very generic rangefinder model with one rotational degree of freedom. The calibration technique uses data from at least 5 different rangefinder rotations: for each rotation, at least 3 different observations of the laser dot and the respective rangefinder reading are needed. Data collection is simply performed by generically moving the rangefinder-camera system, and does not require any calibration target, nor any knowledge of the environment or motion. We investigate the theoretical limits of the technique as well as its practical application; we also show extensions for using more data than strictly necessary or exploiting a priori knowledge of some parameters.

Paper Nr.:
514
Title:
Integration of Tracked and Recognized Features for Locally and Globally Robust Structure from Motion
Author(s):
Chris Engels, Friedrich Fraundorfer and David Nistér
Abstract:
We present a novel approach to structure from motion that integrates wide baseline local features with tracked features to rapidly and robustly reconstruct scenes from image sequences. Rather than assume we can create and maintain a consistent and drift-free reconstructed map over an arbitrarily long sequence, we instead create small, independent submaps generated over short periods of time and attempt to link the submaps together via recognized features. The tracked features provide accurate pose estimates frame to frame, while the recognizable local features stabilize the estimate over larger baselines and provide a context for linking submaps together. As each frame in the submap is inserted, we apply real-time bundle adjustment to maintain a high accuracy for the submaps. Recent advances in feature-based object recognition enable us to efficiently localize and link new submaps into a reconstructed map within a localization and mapping context. Because our recognition system can operate efficiently on many more features than previous systems, our approach easily scales to larger maps. We provide results that show that accurate structure and motion estimates can be produced from a handheld camera under shaky camera motion.

Paper Nr.:
515
Title:
Pose Clustering From Stereo Data
Author(s):
Ulrich Hillenbrand
Abstract:
This article describes an algorithm for pose or motion estimation based on clustering of parameters in the 6-dimensional pose space. The parameter samples are computed from data samples randomly drawn from stereo data points. The estimator is global and robust, performing matches to parts of a scene without prior pose information. It is general, in that it does not require any particular object features. Empirical object models can be built largely automatically. An implemented application from the service robotic domain and a quantitative performance study on real data are presented.

The First International Workshop on Metadata Mining for Image Understanding (MMIU 2008)

Paper Nr.:
411
Title:
Combining Visual and Text Features for Learning in Multimedia Direct Marketing Domain
Author(s):
Sebastiano Battiato, Giovanni Maria Farinella, Giovanni Giuffrida, Catarina Sismeiro and Giuseppe Tribulato
Abstract:
Direct marketing companies systematically dispatch the offers under consideration to a limited sample of potential buyers, rank them with respect to their performance and, based on this ranking, decide which offers to send to the wider population. Though this pre-testing process is simple and widely used, recently the direct marketing industry has been under increased pressure to further optimize learning, in particular when facing severe time and space constraints. Taking into account the multimedia nature of offers, which typically comprise both a visual and text component, we propose a two-phase learning strategy based on a cascade of regression methods. This proposed approach takes advantage of visual and text features to improve and accelerate the learning process. Experiments in the domain of a commercial Multimedia Messaging Service (MMS) show the effectiveness of the proposed methods that improve on classical learning techniques.

Paper Nr.:
417
Title:
Automatic Image Annotation using Visual Content and Folksonomies
Author(s):
Roland Mörzinger, Robert Sorschag, Georg Thallinger and Stefanie Lindstaedt
Abstract:
Automatic image annotation is an important and challenging task in content-based image retrieval. This paper describes techniques for automatic image annotation by taking advantage of collaboratively annotated image databases, so-called visual folksonomies. Our approach includes a classification and tag propagation system using content-based image analysis. Classification annotates images with a controlled vocabulary, while tag propagation uses user generated, folksonomic annotations and is therefore capable of dealing with an unlimited vocabulary. Experiments with a pool of Flickr images demonstrate the high accuracy and efficiency of the proposed methods in the task of automatic image annotation.

Paper Nr.:
422
Title:
Computational Linguistics for Metadata Building (CLiMB) Text Mining for the Automatic Extraction of Subject Terms for Image Metadata
Author(s):
Judith L. Klavans, Tandeep Sidhu, Carolyn Sheffield, Dagobert Soergel, Jimmy Lin, Eileen Abels and Rebecca Passonneau
Abstract:
In this paper, we present a fully-implemented system using computational linguistic techniques to apply automatic text mining for the extraction of metadata for image access. We describe the implementation of a workbench created for, and evaluated by, image catalogers. We discuss the current functionality and future goals for this image catalogers’ toolkit, developed under the Computational Linguistics for Metadata Building (CLiMB) research project. Our primary user group for initial phases of the project is the cataloger expert; in future work we address applications for end users.

Paper Nr.:
425
Title:
Travel Blog Assistant System (TBAS) - An Example Scenario of how to Enrich Text with Images and Images with Text using Online Multimedia Repositories
Author(s):
Marco Bressan, Gabriela Csurka, Yves Hoppenot and Jean-Michel Renders
Abstract:
In this paper we present a Travel Blog Assistant System that facilitates travel blog writing by automatically selecting, for each blog paragraph written by the user, the most relevant images from an uploaded image set. In order to do this, the system first automatically adds metadata to the traveler's photos based both on a Generic Visual Categorizer (visual keywords) and by exploiting cross-content web repositories (textual keywords). For a given paragraph, the system ranks the uploaded images according to the similarity between the extracted metadata and the paragraph. The technology developed and presented here has potential beyond travel blogs, which served just as an illustrative example. Clearly, the same methodology can be used by professional users in the fields of multimedia document generation and automatic illustration and captioning.

Paper Nr.:
437
Title:
Describing the Where – Improving Image Annotation and Search through Geography
Author(s):
Ross S. Purves, Alistair Edwardes and Mark Sanderson
Abstract:
Image retrieval, using either content or text-based techniques, does not match up to the current quality of standard text retrieval. One possible reason for this mismatch is the semantic gap – the terms by which images are indexed do not accord with those imagined by users querying image databases. In this paper we set out to describe how geography might help to index the where facet of the Panofsky-Shatford matrix, which has previously been shown to accord well with the types of queries users make. We illustrate these ideas with existing (e.g. identifying place names associated with a set of coordinates) and novel (e.g. describing images using land cover data) techniques to describe images and contend that such methods will become central as increasing numbers of images become georeferenced.

Paper Nr.:
440
Title:
Which Strategy to combine Face Identification Tools with Clothing Similarity - Contesting or Reinforcing?
Author(s):
Saïd Kharbouche and Michel Plu
Abstract:
This paper describes a novel and efficient approach that integrates clothing similarity into the face identification process in personal photos. The information extracted from people's clothes is helpful if the clothes are dissimilar; however, this information can introduce errors and noise if some people wear similar clothes. To resolve this problem, we propose a new methodology that exploits clothing similarity. The main idea is summarized as follows: if a person is well identified in a detected face, instead of reinforcing this person in every face (in other photos) with similar clothes, we contest her/him in every face with dissimilar clothes. The weight and the influence of the information extracted from a face in one photo on a face in another depend on the spatiotemporal distance between the photos, the degree of similarity between the clothes and the level of uncertainty about the real identities. We use belief function theory in order to manage the imprecision and the uncertainty efficiently. The results obtained demonstrate the usefulness of our approach.

Paper Nr.:
450
Title:
Improved Image Retrieval using Visual Sorting and Semi-Automatic Semantic Categorization of Images
Author(s):
Kai Uwe Barthel, Sebastian Richter, Anuj Goyal and Andreas Follmann
Abstract:
The increasing use of digital images has led to the growing problem of how to organize these images efficiently for search and retrieval. Interpretation of what we see in images is hard to characterize, and even more so to teach a machine such that any automated organization can be possible. Due to this, both keyword-based Internet image search systems and content-based image retrieval systems are not capable of searching images according to the human high-level semantics of images. In this paper we propose a new image search system using keyword annotations, low-level visual metadata and semantic inter-image relationships. The semantic relationships are learned exclusively from the human users’ interaction with the image search system. Our system can be used to search huge (web-based) image sets more efficiently. However, the most important advantage of the new system is that it can be used to generate semi-automatically semantic relationships between the images.

Paper Nr.:
453
Title:
Can Feature Information Interaction help for Information Fusion in Multimedia Problems?
Author(s):
Jana Kludas, Eric Bruno and Stephane Marchand-Maillet
Abstract:
...

Paper Nr.:
455
Title:
Extracting Semantic Meaning from Photographic Annotations using a Hybrid Approach
Author(s):
Rodrigo Carvalho, Sam Chapman and Fabio Ciravegna
Abstract:
This paper evaluates singular and then hybrid methodologies for extracting semantics considered to be relevant to users in cataloguing and searching personal photographs. This work concentrates on the extraction of meaningful concepts from textual annotations, focusing on geographical identification together with references to people and objects in each image. Extraction considers a number of approaches to achieve this goal: machine learning, rule-based approaches, and a novel hybrid approach combining both previous techniques. This evaluation identifies the strengths of the singular approaches and defines rules best suited to differing extractions, providing a higher-performing hybrid method.

Paper Nr.:
488
Title:
Functional Semantic Categories for Art History Text - Human Labeling and Preliminary Machine Learning
Author(s):
Rebecca J. Passonneau, Tae Yano, Tom Lippincott and Judith Klavans
Abstract:
Descriptive metadata for indexing images of works of art can be classified into a variety of functions, such as descriptions of the depicted work versus discussions of the art-historical impact of the work ([7]). Similarly, illustrated art history survey texts address multiple topics pertaining to a given work. We report on an effort to develop a set of functional semantic categories to classify text extracts from art history survey texts, for use in locating specific classes of descriptive metadata. Each category specifies a distinct relation between the depicted work and the text, one that indicates the expository purpose the text serves. In a series of pilot studies, we found that the ability of humans to label text consistently using our categories varied widely, depending on factors such as the labeler’s area of expertise, the image-text pair under consideration, the constraints placed on the labeling task, and the method used to introduce labelers to the categories. Based on these studies, we implemented a labeling interface which we have used to collect the first 10% to 20% of a large dataset of text that will be used in training and testing a machine learner. Initial machine learning results on our pilot data indicate that the three most relevant categories are machine learnable.

The First International Workshop on Image Mining. Theory and Applications (IMTA 2008)

Paper Nr.:
435
Title:
Text-Dependent Speaker Identification using Spectrograms based on Conditional Quantization
Author(s):
Tridibesh Dutta
Abstract:
The goal of this paper is to study a new approach to text-dependent speaker identification using spectrograms. The approach mainly revolves around capturing, through spectrogram segmentation, the complex patterns of variation in frequency and amplitude over time while an individual utters a given word. These optimally segmented spectrograms are used as a database to identify an unknown individual from his/her voice. The identification methodology relies on classification of spectrograms (of speech signals), based on clustering of the quantized frequency-time domain features of the database spectrogram samples and the unknown speech sample. The performance of this novel approach on a sample collected from 40 speakers shows that the methodology can be used effectively to produce a desirable success rate.

Paper Nr.:
439
Title:
Fast Multi-View Evaluation of Data Represented by Symmetric Clusters
Author(s):
Alexander Vinogradov
Abstract:
A new framework is proposed for the fast calculation of linear scalings posed on structured data. Several widely used types of data representation based on clusters with intrinsic features of local symmetry are taken into account. The paper presents some Image Mining technologies that are used to improve abstract data multi-view evaluation procedures.

Paper Nr.:
441
Title:
Search Algorithm and the Distortion Analysis of Fine Details of Real Images
Author(s):
Sai S.V. and Sorokin N.Yu.
Abstract:
This work describes a search algorithm and a method for analyzing the distortion of fine details of real images based on objective criteria.

Paper Nr.:
448
Title:
Elements of a Gestalt Algebra: Steps towards understanding Images and Scenes
Author(s):
Eckart Michaelsen, Michael Arens and Leo Doktorski
Abstract:
A mathematical structure is sketched that is meant to capture the regularities and hierarchies in the structure of images. The approach is motivated by difficulties arising from aerial image analysis of urban terrain. It is not feasible to list and model all possibilities for things such as buildings that occur in such data. Starting from the Gestalt theory of perception, an abstract algebra of operations on image objects is defined and its formal properties are discussed. The intention is to build future software systems on such formalisms that will realize only those gestalt models that are evident from the data and that can build and recognize previously unseen and unexpected structures.

Paper Nr.:
449
Title:
A Proposal for Automatic Inference of Pressure Ulcers Grade Based on Wound Images and Patient Data
Author(s):
Rinaldo de S. Neves, Simônia F. Silva, Edvar F. Rocha Jr., Levy A. Santana, Renato Guadagnin and Edílson Ferneda
Abstract:
Pressure ulcers (PU) occur in a significant number of patients who cannot move for long periods. Data concerning both the patients' individual features and the wound origin are collected. PU images and the medical diagnosis of PU grade can be stored. Such sets of information can be submitted to data mining procedures in order to detect relations between the data. It also seems possible to generate a PU grade inference computationally that will help medical experts to carry out therapeutic procedures. The present proposal thus aims to support the PU diagnosis process and to accelerate the healing process, with important benefits for patients' quality of life and lower medical assistance costs.

Paper Nr.:
484
Title:
Descriptive Approach to Medical Image Analysis - Substantiation and Interpretation
Author(s):
I. Gurevich, V. Yashina, H. Niemann and O. Salvetti
Abstract:
The paper is devoted to the development and formal representation of a descriptive model of information technology for automating the morphologic analysis of cytological specimens (lymphatic system tumors). The main contributions are a detailed description of the algebraic constructions used to create a mathematical model of the information technology, and its specification in the form of an algorithmic scheme based on Descriptive Image Algebras. The descriptive model of an image recognition task and the stage of reducing an image to a recognizable form are specified. The theoretical base of the model is the Descriptive Approach to Image Analysis and its main mathematical tools. A practical application of the algebraic tools of the Descriptive Approach to Image Analysis is demonstrated, and an algorithmic scheme of a technology implementing the apparatus of Descriptive Image Algebras is presented.

Paper Nr.:
485
Title:
Descriptive Analysis of Image Data: Basic Models
Author(s):
I. Gurevich and V. Yashina
Abstract:
The paper is devoted to the foundations, general methodology, and the axiomatic and formal structures of the Descriptive Theory for Image Analysis (DTIA), which provides a methodology and mathematical and computational techniques for the automation of image analysis and estimation (IAE). The main purpose of the theoretical apparatus of the DTIA is to structure the variety of methods, operations and representations used in IAE. The final goal of the DTIA is automated image mining: a) automated selection of techniques and algorithms for image recognition, estimation, and understanding; b) automated testing of the raw data quality and its suitability for solving the image recognition problem. The DTIA provides mathematical fundamentals for image mining. The axiomatics and formal structures of the DTIA provide the ways and means to represent and describe images for their analysis and estimation. The main contributions of the axiomatics are Descriptive Image Models: their definitions, classification, properties, interrelations, and conditions of generation.

Paper Nr.:
495
Title:
Geo-located Image Categorization and Location Recognition
Author(s):
Marco Cristani, Alessandro Perina, Umberto Castellani and Vittorio Murino
Abstract:
Image categorization is undoubtedly one of the most recent and challenging problems faced in Computer Vision. The scientific literature is full of methods, more or less efficient and each dedicated to a specific class of images; further, commercial systems are also beginning to be advertised on the market. Nowadays, additional data can also be attached to images, enriching their semantic interpretation beyond pure appearance. This is the case of geo-location data, which contain information about the geographical place where an image has been acquired. These data allow, if not require, a different management of the images, for instance, for the purpose of easy retrieval from a repository, or of identifying the geographical place of an unknown picture, given a geo-referenced image repository. This paper constitutes a first step in this direction, presenting a method for geo-referenced image categorization, and for the recognition of the geographical location of an image for which such information is not available. The solutions presented are based on robust pattern recognition techniques, such as probabilistic Latent Semantic Analysis, Mean Shift clustering and Support Vector Machines. Experiments have been carried out on a couple of geographical image databases: the results are very promising, opening new interesting challenges and applications in this research field.

Paper Nr.:
500
Title:
Pearling: Stroke segmentation with crusted pearl strings
Author(s):
B. Whited, J. Rossignac, G. Slabaugh, T. Fang and G. Unal
Abstract:
We introduce a novel segmentation technique, called Pearling, for the semi-automatic extraction of idealized models of networks of strokes (variable width curves) in images. These networks may for example represent roads in an aerial photograph, vessels in a medical scan, or strokes in a drawing. The operator seeds the process by selecting representative areas of good (stroke interior) and bad colors. Then, the operator may either provide a rough trace through a particular path in the stroke graph or simply pick a starting point (seed) on a stroke and a direction of growth. Pearling computes in realtime the centerlines of the strokes, the bifurcations, and the thickness function along each stroke, hence producing a purified medial axis transform of a desired portion of the stroke graph. No prior segmentation or thresholding is required. Simple gestures may be used to trim or extend the selection or to add branches. The realtime performance and reliability of Pearling result from a novel disk-sampling approach, which traces the strokes by optimizing the positions and radii of a discrete series of disks (pearls) along the stroke. A continuous model is defined through subdivision. By design, the idealized pearl string model is slightly wider than necessary to ensure that it contains the stroke boundary. A narrower core model that fits inside the stroke is computed simultaneously. The difference between the pearl string and its core contains the boundary of the stroke and may be used to capture, compress, visualize, or analyze the raw image data along the stroke boundary.

Paper Nr.:
503
Title:
Automatic Target Retrieval in a Video-Surveillance Task
Author(s):
Davide Moroni and Gabriele Pieri
Abstract:
In this paper we address the automatic target search problem. While performing an object tracking task, we consider the problem of identifying a previously selected target when it is lost due to masking, occlusions, or quick and unexpected movements. First, a candidate target is identified in the scene through motion detection techniques; subsequently, semantic categorization and content-based image retrieval techniques are used to determine whether the candidate target is the correct one (i.e. the previously lost target) or not. Content-Based Image Retrieval serves as support to the search problem and is performed using a reference database which was populated a priori.

Paper Nr.:
504
Title:
Learning Probabilistic Models for Recognizing Faces under Pose Variations
Author(s):
M. Saquib Sarfraz and Olaf Hellwich
Abstract:
Recognizing a face from a novel viewpoint poses major challenges for automatic face recognition. Recent methods address this problem by trying to model the subject-specific appearance change across pose. For this, however, almost all of the existing methods require a perfect alignment between a gallery and a probe image. In this paper we present a pose-invariant face recognition method centered on modeling the joint appearance of gallery and probe images across pose in a probabilistic framework. We propose novel extensions in this direction by using a more robust feature description as opposed to pixel-based appearances. Using such features, we propose to synthesize frontal views from non-frontal ones. Furthermore, we propose using local kernel density estimation, instead of the commonly used normal density assumption, to derive the prior models. Our method does not require any strict alignment between gallery and probe images, which makes it particularly attractive compared to the existing state-of-the-art methods. Improved recognition across a wide range of poses has been achieved using these extensions.

Paper Nr.:
506
Title:
Shape Modeling for the Analysis of Heart Deformation Patterns
Author(s):
Davide Moroni, Sara Colantonio, Ovidio Salvetti and Mario Salvetti
Abstract:
In this paper, we present an approach to the description of time-varying anatomical structures. The main goal is to describe the whole heart cycle compactly but faithfully, in such a way as to allow for deformation pattern characterization and assessment. Using such an encoding, a reference database can be built, thus permitting similarity searches or data mining procedures.

Paper Nr.:
508
Title:
Media Analysis and the Algorithm Ontology
Author(s):
Patrizia Asirelli, Sara Colantonio, Suzanne Little, Massimo Martinelli and Ovidio Salvetti
Abstract:
Media analysis algorithms are used for a variety of purposes. They may improve media facets such as contrast or signal-to-noise ratio, or extract low-level details such as MPEG-7 features to be used in data mining and other higher-level processing. However, algorithms are difficult to manage, understand and apply, in particular for non-expert users. Therefore we are developing an algorithm ontology to support the identification, aggregation and recording of algorithms for media analysis. This is especially useful for domains with high volumes of complex media objects to investigate and integrate. Algorithms for media analysis may be applied at multiple points within a typical multimedia lifecycle. This article discusses a proposed algorithm ontology to support the identification, retrieval and application of multimedia analysis processes, and its application to metadata management and multimedia interoperability.

Paper Nr.:
510
Title:
An Image Mining Medical Warehouse
Author(s):
Sara Colantonio, Igor B. Gurevich, Ovidio Salvetti and Yulia Trusova
Abstract:
Advances in medical imaging technologies have assured the availability of more and more precise and detailed images, whose analysis has become a necessary step in the diagnostic, prognostic and monitoring processes of major pathologies. Such development has stressed the need for advanced systems that are not limited to storage and management but include intelligent representation and retrieval of images. In this paper, we report current results of a medical warehouse we are developing for mining medical images, thus offering medical experts and researchers the possibility of storing, retrieving, analyzing and investigating biomedical images to discover novel knowledge relevant to diagnostic processes.