VILSS: 2D Pairwise Geometry for Robust and Scalable Place Recognition

Edward Jones, Dyson Research Lab, Imperial College London

In this talk, I will present an overview of my PhD research on extending recent trends in visual place recognition to offer robustness and scalability. The underlying theme of my work is the exploitation of 2D geometry between pairs of local image features, which is often overlooked in favour of stronger 3D constraints. I will show how 2D geometry can be applied effectively to compensate for the limitations of 3D geometry, or even to replace it at a fraction of the computational cost. The talk is divided into three sections, each discussing one example of such a method. First, I will show how 2D pairwise geometry can help to eliminate false positive feature correspondences which survive RANSAC-based 3D geometric constraints. Then, an inverted index consisting of pairwise geometries will be introduced, which makes scalable recognition with geometry possible. Finally, I will introduce a topological robot localisation system which encodes probability into place recognition attempts, making it suitable for visual SLAM frameworks.
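
To make the pairwise idea concrete, the sketch below shows one generic way to score tentative feature correspondences by their mutual 2D consistency: each correspondence votes for every other correspondence with which it is geometrically consistent, so false positives accumulate few votes. This is a hypothetical illustration, not the formulation from the talk; the tolerance and voting thresholds are placeholder values.

    # Hypothetical sketch: filter tentative correspondences by 2D pairwise
    # consistency. True matches should preserve relative distances between
    # feature pairs (up to a roughly constant scale, assumed ~1 here).
    import numpy as np

    def pairwise_consistency_filter(pts_a, pts_b, ratio_tol=0.2, min_votes=3):
        """pts_a, pts_b: (N, 2) arrays of matched keypoint locations."""
        n = len(pts_a)
        votes = np.zeros(n, dtype=int)
        for i in range(n):
            for j in range(i + 1, n):
                da = np.linalg.norm(pts_a[i] - pts_a[j])
                db = np.linalg.norm(pts_b[i] - pts_b[j])
                if da == 0 or db == 0:
                    continue
                # A pair of true correspondences keeps its inter-feature
                # distance ratio close to 1; compare on a log scale.
                if abs(np.log(da / db)) < ratio_tol:
                    votes[i] += 1
                    votes[j] += 1
        return votes >= min_votes  # boolean inlier mask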

 

VILSS: Transductive Transfer Learning for Computer Vision

Teo de Campos, University of Surrey

One of the ultimate goals of open-ended learning systems is to take advantage of previous experience when dealing with future problems. We focus on classification problems where labelled samples are available in a known problem (the source domain), but where the distribution of samples differs once the system is deployed on the target dataset. Although the number of classes and the feature extraction method remain the same, a change of domain happens because the typical distribution of source samples differs from that of target samples. This is a very common situation in computer vision applications, e.g., when a synthetic dataset is used for training but the system is applied to images “in the wild”. We assume that a set of unlabelled samples is available in the target domain. This constitutes a Transductive Transfer Learning problem, also known as Unsupervised Domain Adaptation. We propose to tackle this problem by adapting the feature space of the source domain samples so that their distribution becomes more similar to that of the target domain samples. A classifier re-trained on the updated source space can therefore give better results on the target samples. Our pipeline consists of three main components: (i) a method for global adaptation of the marginal distribution of the data using Maximum Mean Discrepancy; (ii) a sample-based adaptation method, which translates each source sample towards the distribution of the target samples; (iii) a class-based conditional distribution adaptation method. We conducted experiments on a range of image classification and action recognition datasets and showed that our method gives state-of-the-art results.
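
Component (i) relies on Maximum Mean Discrepancy. As a point of reference, here is the standard (biased) kernel estimate of squared MMD between source and target feature sets; the RBF kernel and its bandwidth are generic choices, not necessarily those used in the talk.

    import numpy as np

    def rbf_kernel(x, y, gamma=1.0):
        # Pairwise RBF kernel values between rows of x (n, d) and y (m, d).
        d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    def mmd2(source, target, gamma=1.0):
        """Biased estimate of squared Maximum Mean Discrepancy: distance
        between the mean kernel embeddings of the two sample sets. A value
        near zero means the two feature distributions are well aligned."""
        k_ss = rbf_kernel(source, source, gamma).mean()
        k_tt = rbf_kernel(target, target, gamma).mean()
        k_st = rbf_kernel(source, target, gamma).mean()
        return k_ss + k_tt - 2.0 * k_st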

 

VILSS: Ortho-diffusion decompositions of graph-based representations of images

Adrian Bors, University of York

In this presentation I introduce the ortho-diffusion operator. I consider graph-based data representations where full data interconnectivity is modelled using probability transition matrices. Dimensionality reduction at multiple scales is used in order to extract meaningful data representations. The QR orthonormal decomposition algorithm, alternating with diffusion and data reduction stages, is applied recursively at each scale level to the given data representation. Columns in the ortho-diffusion representation matrix represent characteristic features of the data. Those columns that are not considered essential for the data representation are removed at each scale. The proposed methodology is used to model features extracted from images, which are then used for image matching and face recognition. Image matching is applied to optical flow estimation from image sequences. For the face recognition application I consider both global appearance models, based on either the correlation or the covariance of training sets, and semantic representations of biometric features. The proposed methodology is shown to be robust in face classification applications when considering image corruption by various noise statistics.
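
The abstract's alternation of diffusion, QR orthonormalisation and column pruning might look roughly like the following sketch; the pruning criterion (magnitude of the diagonal of R) and all parameter values are assumptions for illustration only, not details from the talk.

    import numpy as np

    def ortho_diffusion_step(affinity, keep=0.5, diffusion_steps=2):
        """One hypothetical ortho-diffusion stage on a square affinity
        matrix: row-normalise into a probability transition matrix,
        diffuse it, orthonormalise with QR, and drop the weakest columns."""
        p = affinity / affinity.sum(axis=1, keepdims=True)  # transition matrix
        p = np.linalg.matrix_power(p, diffusion_steps)      # diffusion stage
        q, r = np.linalg.qr(p)                              # orthonormalisation
        # Heuristic pruning: rank columns by |R_ii| and keep the strongest
        # fraction as the reduced representation for the next scale.
        strength = np.abs(np.diag(r))
        n_keep = max(1, int(keep * len(strength)))
        idx = np.sort(np.argsort(strength)[::-1][:n_keep])
        return q[:, idx]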

 

VILSS: Ultrasound imaging and inverse problems

Denis Kouame, Universite Paul Sabatier Toulouse

Among all the medical imaging modalities, ultrasound imaging is the most widely used, due to its safety, cost-effectiveness, flexibility and real-time nature. However, compared to other medical imaging modalities such as Magnetic Resonance Imaging (MRI) or Computed Tomography (CT), ultrasound images suffer from the presence of speckle and have low resolution in most standard applications. Although most manufacturers of ultrasound scanners have developed many device-based routines to overcome these issues, many challenges in terms of signal and image processing remain. In this tutorial, we will review basic and advanced ultrasound imaging, then focus on the current signal and image processing challenges and show some recent results.
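
A recurring inverse problem in this area is deconvolution: the observed image is commonly modelled as the tissue reflectivity convolved with the system's point spread function plus noise, y = h * x + n. The sketch below shows a generic Tikhonov-regularised frequency-domain solution of that model; it is a textbook formulation, not a method from the talk.

    import numpy as np

    def tikhonov_deconvolve(observed, psf, reg=1e-2):
        """Generic frequency-domain deconvolution of y = h * x + n.
        `psf` is assumed registered at the array origin; `reg` trades
        resolution recovery against noise amplification."""
        H = np.fft.fft2(psf, s=observed.shape)
        Y = np.fft.fft2(observed)
        X = np.conj(H) * Y / (np.abs(H) ** 2 + reg)
        return np.real(np.fft.ifft2(X))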

 

VILSS: Global description of images. Application to robot mapping and localisation

Luis Payá, Miguel Hernández University, Spain

Nowadays, the design of fully autonomous mobile robots is a key discipline. Building a robust model of the unknown environment is an important ability the robot must develop. Using this model, the robot must be able to estimate its current position and to navigate to target points. Omnidirectional vision sensors are commonly used to solve these tasks. With this source of information, the robot must extract relevant information from the scenes, both to build the model and to estimate its position. The possible frameworks include the classical approach of extracting and describing local features, and working with the global appearance of the scenes, which has emerged as a conceptually simple and robust solution. In this talk, the role of global-appearance techniques in robot mapping and localisation is analysed.
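
One classical global-appearance descriptor for panoramic images is the Fourier signature: keep the magnitude of the first few Fourier coefficients of each image row, which is invariant to the robot's heading because a rotation of the robot shifts each row circularly. The sketch below, with placeholder parameter values, pairs it with nearest-neighbour localisation against a stored map; it illustrates the general approach rather than the exact system from the talk.

    import numpy as np

    def fourier_signature(panorama, k=16):
        """Global-appearance descriptor of a panoramic image: magnitude of
        the first k Fourier coefficients of each row (heading-invariant)."""
        spectrum = np.fft.fft(panorama.astype(float), axis=1)
        return np.abs(spectrum[:, :k])

    def localise(query_desc, map_descs):
        """Return the index of the stored map image whose descriptor is
        closest to the query descriptor (nearest-neighbour localisation)."""
        dists = [np.linalg.norm(query_desc - d) for d in map_descs]
        return int(np.argmin(dists))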

 

VILSS: Joint Tracking and Event Analysis for Carried Object Detection

Aryana Tavanai, University of Leeds

Tracking and Event Analysis are areas of video analysis with great importance in robotics applications and automated surveillance. Although they have been studied extensively in isolation, there has been little work on performing them jointly, where they mutually influence and improve each other. In this talk I will present our novel approach for jointly estimating the track of a moving object and recognising the events in which it participates. First, I will introduce our geometric carried object detector. Then I will present our tracklet building approach, which enforces spatial consistency between the carried objects and other pre-tracked entities in the scene. Finally, I will present our joint tracking and event analysis framework, posed as maximisation of a posterior probability defined over event sequences and temporally-disjoint subsets of tracklets. We evaluate our approach using tracklets from three state-of-the-art trackers and demonstrate improved tracking performance in each case as a result of jointly incorporating events, while also improving event recognition.
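
The joint maximisation can be pictured with the naive sketch below: exhaustively score every temporally-disjoint tracklet subset against every candidate event sequence. The three scoring functions stand in for the learned models of the talk, and the exhaustive search is for illustration only; the actual framework would require a far more efficient inference scheme.

    import itertools
    import numpy as np

    def temporally_disjoint(subset):
        """Tracklets are (start_frame, end_frame) spans; check no overlap."""
        spans = sorted(subset)
        return all(a[1] < b[0] for a, b in zip(spans, spans[1:]))

    def joint_map_estimate(tracklets, candidate_events,
                           track_score, event_score, compat_score):
        """Hypothetical brute-force joint MAP: pick the tracklet subset and
        event sequence with the highest combined posterior-style score."""
        best, best_val = None, -np.inf
        for r in range(1, len(tracklets) + 1):
            for subset in itertools.combinations(tracklets, r):
                if not temporally_disjoint(subset):
                    continue
                for events in candidate_events:
                    val = (sum(track_score(t) for t in subset)
                           + event_score(events)
                           + compat_score(subset, events))
                    if val > best_val:
                        best, best_val = (subset, events), val
        return best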