Optimal presentation duration for video quality assessment

Video content distributors, codec developers and researchers in related fields often rely on subjective assessments to ensure that their video processing procedures deliver satisfactory quality. The current 10-second recommendation for the length of test sequences in subjective video quality assessment studies, however, has recently been questioned. Not only do sequences of this length depart from modern cinematic shooting styles, but shorter sequences would also enable substantial efficiency improvements to the data collection process. This project therefore explores the impact that video sequences of different lengths have upon viewer rating behaviour, and the consequent savings that could be made in time, labour and money.

Publications:

 

Felix Mercer Moss, Ke Wang, Fan Zhang, Roland Baddeley and David R. Bull, On the optimal presentation duration for subjective video quality assessment, IEEE Transactions on Circuits and Systems for Video Technology, Volume PP, Issue 99, July 2015.

Felix Mercer Moss, Chun-Ting Yeh, Fan Zhang, Roland Baddeley and David R. Bull, Support for reduced presentation durations in subjective video quality assessment, Signal Processing: Image Communication, Volume 48, October 2016, Pages 38-49.

 

Undecimated 2D Dual Tree Complex Wavelet Transforms

Dr Paul Hill, Dr Alin Achim and Professor Dave Bull

This work introduces two undecimated forms of the 2D Dual Tree Complex Wavelet Transform (DT-CWT) which combine the benefits of the Undecimated Discrete Wavelet Transform (exact translational invariance, a one-to-one relationship between all co-located coefficients at all scales) and the DT-CWT (improved directional selectivity and complex subbands).

The Discrete Wavelet Transform (DWT) is a spatial frequency transform that has been used extensively for analysis, denoising and fusion within image processing applications. Although the DWT gives excellent combined spatial and frequency resolution, it suffers from shift variance. Various adaptations of the DWT have been developed to produce a shift invariant form. Exact shift invariance was first achieved with the Undecimated Discrete Wavelet Transform (UDWT); the UDWT, however, suffers from a considerably overcomplete representation together with a lack of directional selectivity. More recently, the Dual Tree Complex Wavelet Transform (DT-CWT) has given a more compact representation whilst offering near shift invariance. The DT-CWT also offers improved directional selectivity (six directional subbands per scale) and complex valued coefficients that are useful for magnitude/phase analysis within the transform domain. The two undecimated forms of the DT-CWT introduced in this work combine the benefits of both transforms.
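To make the complex-subband structure concrete, the following minimal Matlab sketch computes coefficient magnitude and phase using the MEX interface and filter generators documented in the code-usage section below (the test image and the choice of scale and orientation are arbitrary assumptions):

  % Minimal sketch: magnitude/phase analysis of one DT-CWT subband.
  % Assumes the NDxWav2DMEX build and the filter generators (AntonB,
  % dualfilt1) from the download below are on the Matlab path.
  x = double(imread('cameraman.tif'));   % any 256x256 greyscale image
  J = 3;                                 % number of decomposition levels
  [Faf, Fsf] = AntonB;                   % first stage filters
  [af, sf]   = dualfilt1;                % remaining stage filters
  w  = NDxWav2DMEX(x, J, Faf, af, 0);    % original decimated DT-CWT
  re = w{1}{1}{1}{1};                    % real part: scale 1, one orientation
  im = w{1}{2}{1}{1};                    % imaginary part of the same subband
  mag = sqrt(re.^2 + im.^2);             % coefficient magnitude
  ph  = atan2(im, re);                   % coefficient phase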

The image below illustrates the three different 2D Dual Tree Complex Wavelet Transforms.

 

Matlab code download 

Implementations of the three complex wavelet transforms can be downloaded below as Matlab MEX files. They have been compiled for 32-bit and 64-bit Windows and for 64-bit Linux. If you need an alternative format, please email me at paul.hill@bristol.ac.uk. Code updated 16/7/2014.

Please reference the following paper if you use this software:

Hill, P. R., N. Anantrasirichai, A. Achim, M. E. Al-Mualla, and D. R. Bull. “Undecimated Dual-Tree Complex Wavelet Transforms.” Signal Processing: Image Communication 35 (2015): 61-70.

The paper is available here: http://www.sciencedirect.com/science/article/pii/S0923596515000715

A previous paper is here:

Hill, P., A. Achim, and D. R. Bull. "The Undecimated Dual Tree Complex Wavelet Transform and its application to bivariate image denoising using a Cauchy model." 19th IEEE International Conference on Image Processing (ICIP), 2012, pp. 1205-1208.

 

Matlab Code Usage

Forward Transform: NDxWav2DMEX
Backward Transform: NDixWav2DMEX
Usage:   w = NDxWav2DMEX(x, J, Faf, af, nondecimate);
         y = NDixWav2DMEX(w, J, Fsf, sf, nondecimate);
x,y - 2D arrays
J - number of decomposition levels
Faf{i}: tree i first stage analysis filters 
af{i}:  tree i filters for remaining analysis stages
Fsf{i}: tree i first stage synthesis filters 
sf{i}:  tree i filters for remaining synthesis stages
nondecimate: 0 (default) for the original decimated version, 1 for the completely undecimated version, 2 for decimation of just the first level.
w - wavelet coefficients, indexed as follows:
w{a}{b}{c}{d} - wavelet subband coefficients
        a = 1:J (scales)
        b = 1 (real part); b = 2 (imaginary part)
        c = 1,2; d = 1,2,3 (orientations)
w{J+1}{a}{b} - lowpass coefficients
        a = 1,2; b = 1,2 
 
 
Example of Usage: 
 
  % Original Decimated Version
  x = rand(256,256);
  J = 4;
  [Faf, Fsf] = AntonB;
  [af, sf] = dualfilt1;
  w = NDxWav2DMEX(x, J, Faf, af,0);
  y = NDixWav2DMEX(w, J, Fsf, sf,0);
  err = x - y;
  max(max(abs(err)))
 
  % Undecimated Version 1 (no decimation)
  x = rand(256,256);
  J = 4;
  [Faf, Fsf] = NDAntonB2; %(Must use ND filters for both)
  [af, sf] = NDdualfilt1;
  w = NDxWav2DMEX(x, J, Faf, af, 1);
  y = NDixWav2DMEX(w, J, Fsf, sf, 1);
  err = x - y;
  max(max(abs(err)))
 
  % Undecimated Version 2 (decimation on only the first level)
  x = rand(256,256);
  J = 4;
  [Faf, Fsf] = AntonB; 
  [af, sf] = NDdualfilt1; %(Must use ND filters for just these)
  w = NDxWav2DMEX(x, J, Faf, af, 2);
  y = NDixWav2DMEX(w, J, Fsf, sf, 2);
  err = x - y;
  max(max(abs(err)))
 
% SIZE LIMITS
% (s/J^2) must be bigger than 5 (where s is both the height and the width)
% Height and width must be divisible by 2^J for the fully decimated version
% Height and width must be divisible by 2 for nondecimated version 2
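The limits above can be checked before calling the forward transform; this small sketch simply restates them as assertions (x, J and nondecimate are as documented in the usage section):

  % Sketch: validate the input size against the limits above before
  % calling NDxWav2DMEX.
  [h, wd] = size(x);
  assert(h/J^2 > 5 && wd/J^2 > 5, 'image too small for this many levels');
  if nondecimate == 0              % fully decimated version
      assert(mod(h, 2^J) == 0 && mod(wd, 2^J) == 0, ...
          'height and width must be divisible by 2^J');
  elseif nondecimate == 2          % nondecimated version 2
      assert(mod(h, 2) == 0 && mod(wd, 2) == 0, ...
          'height and width must be divisible by 2');
  end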


Home Activity Monitoring Study using Visual and Inertial Sensors

Lili Tao, Tilo Burghardt, Sion Hannuna, Massimo Camplani, Adeline Paiement, Dima Damen, Majid Mirmehdi, Ian Craddock. A Comparative Home Activity Monitoring Study using Visual and Inertial Sensors, 17th International Conference on E-Health Networking, Application and Services (IEEE HealthCom), 2015

Monitoring actions at home can provide essential information for rehabilitation management. This paper presents a comparative study, and a dataset, for the fully automated, sample-accurate recognition of common home actions in the living room environment using commercial-grade, inexpensive inertial and visual sensors. We investigate the practical home use of body-worn mobile phone inertial sensors together with an Asus Xmotion RGB-Depth camera to achieve monitoring of daily living scenarios. To test this setup against realistic data, we introduce the challenging SPHERE-H130 action dataset containing 130 sequences of 13 household actions recorded in a home environment. We report automatic recognition results at maximal temporal resolution, which indicate that a vision-based approach outperforms the accelerometry provided by two phone-based inertial sensors by an average of 14.85% in accuracy for home actions. Further, we report improved accuracy of the vision-based approach over accelerometry on particularly challenging actions, as well as when generalising across subjects.

Data collection and processing

For visual data, we simultaneously collect RGB and depth images using the commercial Asus Xmotion sensor. Motion information is best recovered from the RGB data, as it contains rich texture and colour information; depth information, on the other hand, reveals details of the 3D configuration of the scene. We extract and encode both motion and depth features over the area of a bounding box returned by the human detector and tracker provided by the OpenNI SDK [1], as sketched below. For the inertial data, subjects wear two accelerometers mounted at the centre of the waist and on the dominant wrist.
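As a loose, hypothetical illustration of this pooling step (not the descriptors used in the paper), the Matlab sketch below crops both modalities to a tracker bounding box and forms a toy motion/depth feature; all inputs are stand-in data:

  % Hypothetical sketch: pool simple motion/depth features over a
  % tracker bounding box. All inputs below are stand-in data.
  rgbFrame   = uint8(randi(255, 480, 640, 3));  % current RGB frame
  prevGray   = rand(480, 640);                  % previous greyscale frame
  depthFrame = rand(480, 640);                  % registered depth map
  bbox       = [200 100 120 240];               % [x y w h] from the tracker
  r = bbox(2):bbox(2)+bbox(4)-1;
  c = bbox(1):bbox(1)+bbox(3)-1;
  motion = abs(double(rgb2gray(rgbFrame(r,c,:)))/255 - prevGray(r,c));
  feat   = [mean(motion(:)), histcounts(depthFrame(r,c), 16)];  % toy descriptor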
The figure below gives an overview of the system.


 Figure 1. Experimental setup.

SPHERE-H130 Action Dataset

We introduce the SPHERE-H130 action dataset for human action recognition from RGB-Depth and inertial sensor data captured in a real living environment. The dataset was generated over 10 sessions by 5 subjects, with each session containing 13 action categories: sit still, stand still, sitting down, standing up, walking, wiping table, dusting, vacuuming, sweeping floor, cleaning stain, picking up, squatting, upper body stretching. Overall, the recordings correspond to about 70 minutes of total captured time. Actions were simultaneously captured by the Asus Xmotion RGB-Depth camera and the two wireless accelerometers. Colour and depth images were acquired at a rate of 30 Hz. The accelerometer data was captured at about 100 Hz and downsampled to 30 Hz, a frequency recognised as optimal for human action recognition.
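A minimal sketch of that rate conversion for a single stand-in accelerometer channel (resample is from the Matlab Signal Processing Toolbox):

  % Sketch: downsample a ~100 Hz accelerometer channel to 30 Hz.
  acc100 = randn(1000, 1);             % stand-in accelerometer samples
  acc30  = resample(acc100, 30, 100);  % anti-aliased 100 Hz -> 30 Hz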

Figure 2. Sample videos from the dataset: sitting, standing, sitting down, standing up, walking, wiping, dusting, vacuuming, sweeping, cleaning stain, picking up, squatting, stretching.

Results

Vision can be more accurate than accelerometers. Figure 3 depicts the recognition confusion matrices obtained with the visual and inertial sensors, respectively.

Figure 3. Confusion matrices for (left) visual data and (right) accelerometer data.
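For reference, a row-normalised confusion matrix like those in Figure 3 can be computed from per-frame labels as in this minimal sketch (confusionmat is from the Statistics and Machine Learning Toolbox; the labels below are random stand-ins):

  % Sketch: row-normalised confusion matrix from per-frame labels.
  yTrue = randi(13, 1000, 1);          % stand-in ground-truth labels (13 actions)
  yPred = randi(13, 1000, 1);          % stand-in classifier predictions
  C = confusionmat(yTrue, yPred);      % raw counts
  C = C ./ sum(C, 2);                  % per-class recognition rates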

Publication and Dataset

The dataset and the proposed method are presented in the following paper:

  • Lili Tao, Tilo Burghardt, Sion Hannuna, Massimo Camplani, Adeline Paiement, Dima Damen, Majid Mirmehdi, Ian Craddock. A Comparative Home Activity Monitoring Study using Visual and Inertial Sensors, 17th International Conference on E-Health Networking, Application and Services (IEEE HealthCom), 2015

The SPHERE-H130 action dataset can be downloaded here.

References

[1] OpenNI User Guide, OpenNI organization, November 2010. Available: http://www.openni.org/documentation