Penguin Identification

Tilo Burghardt, Neill Campbell, Peter Barham, Richard Sherley

This early research was conducted between 2006 and 2009. It explored some of the first non-invasive identification solutions for problems in field biology, with the aim of better understanding and helping to conserve endangered species. Specifically, we developed approaches to monitor individuals in uniquely patterned animal populations using techniques that originated in computer vision and human biometrics. The work centred on the African penguin (Spheniscus demersus).

During the project we provided a proof of concept for an autonomously operating prototype system capable of monitoring and recognising a group of individual African penguins in their natural environment without tagging or otherwise disturbing the animals. The prototype was limited to very good acquisition and environmental conditions, and operated on animals with sufficiently complex natural patterns.

Research was conducted together with the Animal Demography Unit at the University of Cape Town, South Africa. The project was funded by the Leverhulme Trust, with long-term support in the field from the Earthwatch Institute, and with pilot tests run in collaboration with Bristol Zoo Gardens.

Whilst today's deep learning approaches have replaced most of the traditional identification techniques of the 2000s, the practical and applied insights gained in this project helped inform some of our current work on animal biometrics.

Analysis of moth camouflage

David Gibson, Neill Campbell

This project is a half-million-pound BBSRC collaboration with Biological Sciences and Experimental Psychology. Its aim is to develop a computational theory of animal camouflage, with models specific to the visual systems of birds and humans. Moths were chosen for this study because they demonstrate a particularly wide range of cryptic and disruptive camouflage in nature. Using psychophysically plausible low-level image features, learning algorithms are used to determine the effectiveness of camouflage examples. The ability to generate and process large numbers of camouflage examples enables predictive computational models to be created and compared with the performance of human and bird subjects. Such comparisons will give insights into which aspects of moth camouflage are important for avoiding detection and recognition by birds and humans, and thereby into the mechanisms employed by bird and human visual systems.
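
As a rough illustration of the kind of pipeline this involves, the sketch below pools rectified Gabor filter responses – a standard example of a psychophysically plausible low-level feature – over a moth patch and trains a classifier to predict whether observers detect it. This is a minimal sketch under our own assumptions, not the project's actual models; the data and names are hypothetical.

    # Illustrative sketch: predict detectability from Gabor responses.
    # Hypothetical data and feature choices; not the project's models.
    import numpy as np
    from scipy.ndimage import convolve
    from sklearn.linear_model import LogisticRegression

    def gabor_kernel(theta, freq, sigma=4.0, size=21):
        """A Gabor filter: a psychophysically motivated low-level feature."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        xr = x * np.cos(theta) + y * np.sin(theta)
        envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
        return envelope * np.cos(2 * np.pi * freq * xr)

    def features(patch):
        """Pool rectified Gabor responses over orientations and frequencies."""
        return np.array([np.abs(convolve(patch, gabor_kernel(t, f))).mean()
                         for t in np.linspace(0, np.pi, 4, endpoint=False)
                         for f in (0.1, 0.2)])

    # Hypothetical training set: patches containing a moth, labelled 1
    # if an observer (human or bird) detected the moth, else 0.
    rng = np.random.default_rng(0)
    patches = rng.random((100, 64, 64))
    detected = rng.integers(0, 2, 100)

    model = LogisticRegression().fit(
        np.stack([features(p) for p in patches]), detected)

A model of this general form can then be compared with observed human and bird detection rates to ask which features matter.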

Weakly Supervised Learning of Visual Semantic Attributes

David Hanwell

Many of the adjectives we use in everyday language denote visual attributes of objects: for example, colours ('red', 'ruby', 'vermilion'), patterns ('stripy', 'chequered'), and others ('leopard skin', 'dark'). We refer to these as "visual semantic attributes" – they define associations between concepts of thought and language (semantic) and the visual appearance of objects in our physical world (visual).

Automating the recognition of such attributes in photos and video would allow rich textual descriptions to be generated, especially when combined with object detection. This would be useful for various applications: image archiving, generating descriptions for products from their photos (for example on eBay), and perhaps even surveillance (for example "find the man in the blue stripy shirt in these CCTV feeds"). Typically, models of such attributes are trained using image data which has been specifically chosen or acquired for the task, and which has been manually labelled, annotated and segmented.

There are already billions of images available on the internet, with many more being added every day. Although these images have not been manually annotated or segmented, many of them have text associated with them; for example, items for sale on eBay have both an image and a text description. We call such images 'weakly labelled' – each has a corresponding set of words, some of which may describe some, all or none of the image content.
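
Concretely, each weakly labelled training item is no more than an image paired with a bag of words; a minimal sketch of the data structure (the class name is our own, purely illustrative):

    # A weakly labelled image: pixels plus associated words, with no
    # annotation of which pixels (if any) each word describes.
    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class WeaklyLabelledImage:
        pixels: np.ndarray  # H x W x 3 RGB image
        words: set[str]     # associated text, e.g. an eBay description

    item = WeaklyLabelledImage(
        pixels=np.zeros((480, 640, 3), dtype=np.uint8),
        words={'blue', 'stripy', 'shirt', 'cotton'},
    )
    # Some words describe visual content ('blue', 'stripy'); others
    # ('cotton') may correspond to nothing visible at all.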

The question we are trying to answer is: Can we use weakly labelled images to train computer models to recognise visual semantic attributes?

There would be several potential benefits to this approach:

  • Without the need to acquire bespoke images, or to manually label, annotate or segment them, the process of learning an attribute would be faster, easier and cheaper.
  • Since the training data would be acquired from many different sources, and the corresponding text written by many different people, the resulting models would be unbiased by any one person’s notion of an attribute.
  • Due to the huge range of images available, with different objects, lighting, and variations on each attribute, each model may better capture the possible variation of an attribute, and thus be better able to recognise it in novel images.

Weakly Supervised Learning of Semantic Colour Terms

In order to take advantage of such easily obtainable data, a number of challenges must be overcome. If we acquire a set of images from a web search using an attribute term, not all of the returned images will contain the attribute, and even those which do will contain regions which do not. Here are some examples of images returned from a search for 'Blue':

Training data for the word 'blue'

We have proposed a method of learning colour terms from weakly labelled data such as this. Below are some examples of images from the test set of [1], which have been segmented into regions of semantic colour terms using the proposed method. This work has been accepted for publication in IET Computer Vision [2].
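
The published method [2] is more involved; as a rough illustration of the weakly supervised idea, the baseline below models each colour term as a Gaussian over RGB values pooled from images tagged with that term, then labels each pixel of a new image with its most likely term. All names and data here are hypothetical.

    # Illustrative baseline, not the published method: fit a Gaussian
    # per colour term from weakly labelled pixels, then segment a new
    # image by per-pixel maximum likelihood.
    import numpy as np

    def fit_gaussian(pixels):
        """Mean and covariance of an N x 3 array of RGB values."""
        return pixels.mean(axis=0), np.cov(pixels.T)

    def log_likelihood(pixels, mean, cov):
        diff = pixels - mean
        maha = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(cov), diff)
        return -0.5 * (maha + np.log(np.linalg.det(cov)))

    def segment(image, models):
        """Label each pixel with the most likely colour term."""
        flat = image.reshape(-1, 3).astype(float)
        scores = np.stack([log_likelihood(flat, m, c)
                           for m, c in models.values()])
        labels = np.array(list(models))[scores.argmax(axis=0)]
        return labels.reshape(image.shape[:2])

    # Hypothetical training pixels pooled from web-search results for
    # each term (noisy: not every pooled pixel is actually that colour).
    rng = np.random.default_rng(1)
    tagged = {'blue': rng.normal((40, 60, 180), 30.0, (5000, 3)),
              'red': rng.normal((180, 40, 40), 30.0, (5000, 3))}
    models = {term: fit_gaussian(px) for term, px in tagged.items()}
    print(segment(rng.integers(0, 256, (32, 32, 3)), models)[0, :5])

A genuinely weakly supervised learner must additionally discount pixels that merely co-occur with a term, which is where images lacking the label come in.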

Images from dataset segmented into regions of colour by the system

QUAC: Quick Unsupervised Anisotropic Clustering

We would like to extend the above work on the weakly supervised learning of colour terms to more complex attributes using other features. To do this, we propose to first cluster the features extracted from each image, before performing weakly supervised learning using parameterisations of the clusters. This way, the volume of data per image will be greatly reduced, allowing many more images to be used for learning.

To this end, we propose QUAC – Quick Unsupervised Anisotropic Clustering. The algorithm works by finding elliptical (or hyper-elliptical) clusters one at a time, removing the data corresponding to each cluster after it is found. It has several advantages over other clustering algorithms:

  • It does not assume isotropy, and is capable of finding highly elongated clusters.
  • It is unsupervised, having only a single parameter, and not requiring the number of clusters to be set.
  • Its complexity is linear in the number of data points, resulting in good performance on larger data sets.
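
As an illustration of this one-at-a-time strategy, here is a heavily simplified sketch (our own simplification, not the published algorithm [3]): seed from a dense region, iteratively refit an ellipse as a mean and full covariance, claim the points within a Mahalanobis threshold, and remove them before searching for the next cluster.

    # Simplified one-cluster-at-a-time anisotropic clustering sketch;
    # not the published QUAC algorithm.
    import numpy as np

    def find_elliptical_clusters(data, maha_threshold=3.0, min_points=20):
        remaining = data.copy()
        clusters = []
        while len(remaining) >= min_points:
            # Seed: the min_points nearest neighbours of the point
            # closest to the median (a crude stand-in for density).
            centre = np.median(remaining, axis=0)
            order = np.argsort(np.linalg.norm(remaining - centre, axis=1))
            inside = np.zeros(len(remaining), dtype=bool)
            inside[order[:min_points]] = True
            # Refit an ellipse (mean + full covariance) and re-assign by
            # Mahalanobis distance; because the covariance is not assumed
            # isotropic, highly elongated clusters can be captured.
            for _ in range(10):
                members = remaining[inside]
                mean = members.mean(axis=0)
                cov = np.cov(members.T) + 1e-9 * np.eye(data.shape[1])
                diff = remaining - mean
                maha = np.einsum('ij,jk,ik->i', diff,
                                 np.linalg.inv(cov), diff)
                new_inside = maha < maha_threshold ** 2
                if new_inside.sum() < min_points or \
                        (new_inside == inside).all():
                    break
                inside = new_inside
            clusters.append(remaining[inside])
            remaining = remaining[~inside]  # remove claimed data, repeat
        return clusters

Each pass removes at least min_points points, so the loop terminates; min_points is an artefact of this simplification, whereas the published algorithm needs only a single parameter.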

Below is an image of five pairs of differently coloured trousers against a white background. We have taken the Green and Blue channels of this image to create a 2D data set, also shown below. The five elongated clusters corresponding to the five pairs of trousers are clearly visible, as is the less elongated cluster in the bottom right corner of the feature space, corresponding to the white background. An animation shows how QUAC finds the clusters. This work has been accepted for publication in Pattern Recognition [3]. Source code is available below.

Trouser test data Trouser clustering result
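
For reference, building such a 2D feature set from an RGB image takes only a couple of lines (the filename here is a placeholder):

    # One (green, blue) point per pixel of the trouser image.
    import numpy as np
    from PIL import Image

    image = np.asarray(Image.open('trousers.png').convert('RGB'))
    data = image[..., 1:3].reshape(-1, 2).astype(float)  # (G, B) pairs
    print(data.shape)  # one 2D point per pixel, ready for clustering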

Weakly Supervised Learning of Semantic Pattern Terms

To extend the colour learning work above to more complex attributes, there are several more challenges we must overcome. As well as not knowing which parts of the training images (if any) contain the attribute to be learned, the scale and orientation of the attribute are also unknown, and may vary throughout an image. This is the focus of our current research.
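
Purely as an illustration of one generic way to cope with this (not the method under development), the sketch below pools filter responses over an image pyramid and a bank of rotated filters, so the resulting descriptor tolerates unknown scale and orientation; all parameter choices here are our own assumptions.

    # Generic scale- and orientation-tolerant pooling sketch; all
    # parameter choices are illustrative assumptions.
    import numpy as np
    from scipy.ndimage import convolve, rotate, zoom

    def oriented_kernels(base, n_orientations=8):
        """Rotated copies of a base filter, sampling orientation."""
        return [rotate(base, 180.0 * i / n_orientations, reshape=False)
                for i in range(n_orientations)]

    def pooled_descriptor(gray, base, scales=(1.0, 0.5, 0.25)):
        """Mean |response| per orientation, max-pooled over scales;
        sorting then discards absolute orientation, keeping only the
        profile of responses."""
        per_scale = [[np.abs(convolve(zoom(gray, s), k)).mean()
                      for k in oriented_kernels(base)] for s in scales]
        return np.sort(np.max(per_scale, axis=0))[::-1]

    base = np.outer([1.0, 2.0, 1.0], [-1.0, 0.0, 1.0])  # edge filter
    gray = np.random.default_rng(2).random((128, 128))
    print(pooled_descriptor(gray, base))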

References

  1. J. van de Weijer, C. Schmid, and J. Verbeek. Learning Color Names from Real-World Images. In CVPR, Minneapolis, MN, 2007.
  2. D. Hanwell and M. Mirmehdi. Weakly Supervised Learning of Semantic Colour Terms. Accepted by IET Computer Vision, 22nd February 2013.
  3. D. Hanwell and M. Mirmehdi. QUAC: Quick Unsupervised Anisotropic Clustering. Accepted by Pattern Recognition, 7th May 2013.