Session Index

TU-L1: Best Student Paper Award Session

TU-L2: Signal Processing Applications I

TU-L3: Audio and Acoustic Signal Processing I

TU-L4: Signal processing for Massive MIMO and Large Antenna Array 1 (Special Session)

TU-L5: Signal Processing for Multimodal Data (Special Session)

TU-L6: Design and Implementation of Signal Processing Systems

TU-L7: Image and Video Coding I

TU-L8: Audio and Acoustic Signal Processing II

TU-L9: Signal processing for Massive MIMO and Large Antenna Array 2 (Special Session)

TU-L10: Multi-Modal Sensing and Analysis of Human Motion for Sporting and Leisure Applications (Special Session)

TU-P1: Signal Processing for Communications I

TU-P2: Image and Video Coding and Processing

TU-P3: Signal Processing for Communications II

TU-P4: Signal Estimation and Detection I

WE-L1: Machine Learning : Signal Processing Applications

WE-L2: Image and Video Coding II

WE-L3: Cooperation and Cognition in Wireless Networks

WE-L4: Recent Advances on CR/SDR Circuits, Systems and Signal-Processing Techniques (Special Session)

WE-L5: Advances in Music and Audio Recognition/Analysis (Special Session)

WE-L6: Machine Learning : Sparsity

WE-L7: Image and Video Security

WE-L8: Signal Processing for Communications

WE-L9: Signal Processing for Cognitive Radio Networks (Special Session)

WE-L10: Inference and Estimation of Physical Fields Using Sensor Networks (Special Session)

WE-P1: Audio and Acoustic Signal Processing I

WE-P2: Design and Implementation of Signal Processing Systems

WE-P3: Audio and Acoustic Signal Processing II

WE-P4: Signal Estimation and Detection II

TH-L1: DOA Estimation

TH-L2: Image and Video Analysis I

TH-L3: Estimation and Detection

TH-L4: Digital Audio Processing for Loudspeakers and Headphones 1 (Special Session)

TH-L5: Physical Layer Network Coding (Special Session)

TH-L6: Sensor Array and Multichannel Signal Processing

TH-L7: Image and Video Analysis II

TH-L8: Bayesian Inference

TH-L9: Digital Audio Processing for Loudspeakers and Headphones 2 (Special Session)

TH-L10: Biometric Technologies for Security and Forensics Applications (Special Session)

TH-L11: Location and Positioning

TH-L12: Signal Processing Applications II

TH-L13: Audio and Acoustic Signal Processing III

TH-L14: Statistical Methods for Inverse Problems in Image Processing (Special Session)

TH-L15: High Dynamic Range imaging: Providing a step change in imaging technology (Special Session)

TH-P1: Machine Learning I

TH-P2: Speech Processing I

TH-P3: Machine Learning II

TH-P4: Speech Processing II

TH-P5: Image and Video Analysis I

TH-P6: Signal Estimation and Detection III

FR-L1: Speech Processing I

FR-L2: Image and Video Applications

FR-L3: Compressive Sensing and Sparsity

FR-L4: Capacity Enhancing Techniques (Special Session)

FR-L5: Advanced Signal Processing for Optical Communications (Special Session)

FR-L6: Speech Processing II

FR-L7: Signal Processing Applications III

FR-L8: Signal Modelling and Estimation

FR-L9: The Many Faces of Signal Processing in Multimedia QoE Assessment (Special Session)

FR-L10: Random Matrix Theory Methods for Multi-Antenna Signal Processing (Special Session)

FR-L11: Speech Processing III

FR-L12: Signal Processing Applications IV

FR-L13: Graphs, Networks and Distributed Estimation

FR-L14: Acoustic Scene Analysis in Domestic Environments (Special Session)

FR-L15: Non-Linear Signal Processing

FR-P1: Sensor Array and Multichannel Signal Processing I

FR-P2: Signal Processing Applications I

FR-P3: Image and Video Analysis II

FR-P4: Sensor Array and Multichannel Signal Processing II

FR-P5: Image and Video Applications

FR-P6: Signal Processing Applications II



Session TU-L01: Best Student Paper Award Session


TU-L01-1: Sparse Linear Parametric Modeling of Room Acoustics with Orthonormal Basis Functions

Giacomo Vairetti (KU Leuven, Belgium); Toon van Waterschoot (KU Leuven, Belgium); Marc Moonen (KU Leuven, Belgium); Michael Catrysse (Televic N. V., Belgium); Søren Holdt Jensen (Aalborg University, Denmark)

Abstract: Orthonormal Basis Function (OBF) models provide a stable and well-conditioned representation of a linear system. When used for the modeling of room acoustics, useful information about the true dynamics of the system can be introduced by a proper selection of a set of poles, which however appear non-linearly in the model. A novel method for selecting the poles is proposed, which bypasses the non-linear problem by exploiting the concept of sparsity and by using convex optimization. The model obtained has a longer impulse response compared to the all-zero model with the same number of parameters, without introducing substantial error in the early response. The method also allows the resolution to be increased in a specified frequency region, while still being able to approximate the spectral envelope in other regions.


TU-L01-2: A Dynamic Screening Principle for the Lasso

Antoine Bonnefoy (Aix-Marseille Université, France); Valentin Emiya (Aix-Marseille Université, France); Liva Ralaivola (LIF, Aix-Marseille Universités, France); Rémi Gribonval (INRIA, France)

Abstract: The Lasso is an optimization problem devoted to finding a sparse representation of some signal with respect to a predefined dictionary. An original and computationally-efficient method is proposed here to solve this problem, based on a dynamic screening principle. It makes it possible to accelerate a large class of optimization algorithms by iteratively reducing the size of the dictionary during the optimization process, discarding elements that are provably known not to belong to the solution of the Lasso. The iterative reduction of the dictionary is what we call dynamic screening. As this screening step is inexpensive, the computational cost of the algorithm using our dynamic screening strategy is lower than that of the base algorithm. Numerical experiments on synthetic and real data support the relevance of this approach.


TU-L01-3: Locally Linear Embedding-Based Prediction for 3D Holoscopic Image Coding Using HEVC

Luis Lucas (Universidade Federal do Rio de Janeiro, Portugal); Caroline Conti (Instituto de Telecomunicacoes, Portugal); Paulo Nunes (ISCTE-IUL / Instituto de Telecomunicações, Portugal); Luis Ducla Soares (I.S.C.T.E. / I.T. - Lisbon, Portugal); Nuno Rodrigues (Instituto de Telecomunicações - ESTG IPLeiria, Portugal); Carla Pagliari (Instituto Militar de Engenharia, Brazil); Eduardo A. B. da Silva (Universidade Federal do Rio de Janeiro, Brazil); Sérgio M. M. Faria (Instituto de Telecomunicacoes, Portugal)

Abstract: Holoscopic imaging is a prospective acquisition and display solution for providing true 3D content and fatigue-free 3D visualization. However, efficient coding schemes for this particular type of content are needed to enable proper storage and delivery of the large amount of data involved in these systems. Therefore, this paper proposes an alternative HEVC-based coding scheme for efficient representation of holoscopic images. In this scheme, some directional intra prediction modes of the HEVC are replaced by a more efficient prediction framework based on locally linear embedding techniques. Experimental results show the advantage of the proposed prediction for 3D holoscopic image coding, compared to the reference HEVC standard as well as previously presented approaches in this field.


TU-L01-4: Robust Linear Regression Analysis - The Greedy Way

George Papageorgiou (University of Athens, Greece); Pantelis Bouboulis (University of Athens, Greece); Sergios Theodoridis (University of Athens, Greece); Konstantinos E. Themelis (National Observatory of Athens, Greece)

Abstract: In this paper, the task of robust estimation in the presence of outliers is presented. Outliers are explicitly modeled by employing sparsity arguments. A novel efficient algorithm, based on the greedy Orthogonal Matching Pursuit (OMP) scheme, is derived. Theoretical results concerning the recovery of the solution as well as simulation experiments, which verify the comparative advantages of the new technique, are discussed.
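The idea of modeling outliers as a sparse vector and recovering them greedily can be illustrated with a minimal sketch. It assumes the model y = X theta + u, where u is a sparse outlier vector, and repeatedly adds the sample with the largest residual to the outlier support before refitting by least squares. The function name, stopping rule, and test data are hypothetical and chosen for illustration only; this is not presented as the authors' algorithm.

```python
import numpy as np

def greedy_robust_regression(X, y, tol=1e-8, max_outliers=None):
    """Greedy fit of y = X @ theta + u with a sparse outlier vector u (illustrative)."""
    n, p = X.shape
    if max_outliers is None:
        max_outliers = n - p - 1  # keep the augmented system overdetermined
    support = []  # indices currently treated as outliers
    while True:
        # Augment X with one identity column per suspected outlier.
        A = np.hstack([X, np.eye(n)[:, support]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        residual = y - A @ coef
        if np.linalg.norm(residual) <= tol or len(support) >= max_outliers:
            break
        # Greedy step: the sample with the largest residual joins the support.
        support.append(int(np.argmax(np.abs(residual))))
    return coef[:p], sorted(support)

# Synthetic, noise-free data with three gross outliers (assumed values).
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.standard_normal((n, p))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true
outlier_idx = [4, 17, 33]
y[outlier_idx] += np.array([50.0, -40.0, 60.0])

theta_hat, support = greedy_robust_regression(X, y)
```

In this noise-free setting the loop stops once the residual vanishes, which happens only after every corrupted sample has been absorbed into the support. With noisy inliers the tolerance would have to be set relative to the noise level.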


TU-L01-5: Feature Enhancement for Robust Speech Recognition on Smartphones with Dual-Microphone

Iván López-Espejo (University of Granada, Spain); Angel Manuel Gomez Garcia (University of Granada, Spain); Jose Andres Gonzalez Lopez (University of Granada, Spain); Antonio M. Peinado (University of Granada, Spain)

Abstract: The latest smartphones often have more than one microphone in order to perform noise reduction. Although research on speech enhancement is already exploiting this new feature, robust speech recognition is still not benefiting from it. In this paper we propose two feature enhancement methods especially developed for the case of a smartphone with a dual-microphone operating in an adverse acoustic environment. In order to test these proposals, we have developed a new experimental framework which includes a noisy speech database (based on AURORA2) which emulates the acquisition of dual-microphone data. Our experimental results show a clear improvement in terms of word accuracy in comparison with both using a power level difference-based speech enhancement algorithm and a single channel feature compensation approach.



Session TU-L02: Signal Processing Applications I


TU-L02-1: A Spatially Constrained Low-Rank Matrix Factorisation for the Functional Parcellation of the Brain

Alexis Benichoux (University of Southampton, United Kingdom); Thomas Blumensath (University of Southampton, United Kingdom)

Abstract: We propose a new matrix recovery framework to partition brain activity using time series of resting-state functional Magnetic Resonance Imaging (fMRI). Spatial clusters are obtained with a new low-rank factorization algorithm that offers the ability to add different types of constraints. As an example we add a total variation type cost function in order to exploit neighborhood constraints. We first validate the performance of our algorithm on simulated data, which allows us to show that the neighborhood constraint improves the recovery in noisy or undersampled set-ups. Then we conduct experiments on real-world data, where we simulated an accelerated acquisition by randomly undersampling the time series. The obtained parcellations are reproducible when analysing data from different sets of individuals, and the estimation is robust to undersampling.


TU-L02-2: Model Selection for Hemodynamic Brain Parcellation in fMRI

Mohanad Albughdadi (Institut National Polytechnique de Toulouse, France); Lotfi Chaari (University of Toulouse, IRIT - INP-ENSEEIHT, France); Florence Forbes (INRIA, France); Jean-Yves Tourneret (University of Toulouse, France); Philippe Ciuciu (CEA/NeuroSpin, France)

Abstract: Brain parcellation into a number of hemodynamically homogeneous regions (parcels) is a challenging issue in fMRI analyses. This task has been recently integrated in the joint detection-estimation (JDE) [1] resulting in the so-called joint detection-parcellation-estimation (JPDE) model [2]. JPDE automatically estimates the parcels from the fMRI data but requires the desired number of parcels to be fixed. This is potentially critical in that the chosen number of parcels may influence detection-estimation performance. In this paper, we propose a model selection procedure to automatically fix the number of parcels from the data. The selection procedure relies on the calculation of the free energy corresponding to each concurrent model, within the variational expectation maximization framework. Experiments on synthetic and real fMRI data demonstrate the ability of the proposed procedure to select an adequate number of parcels.


TU-L02-3: Dynamical Analysis of Brain Seizure Activity From EEG Signals

Ladan Amini (GIPSA-LAB, Grenoble INP, France); Christian Jutten (GIPSA-Lab, France); Benoît Pouyatos (SynapCell S.A.S, La Tronche, France); Antoine Depaulis (Université Joseph Fourier, Grenoble, France); Corinne Roucard (SynapCell, France)

Abstract: A sudden emergence of seizure activity on a normal background EEG can be seen from visual inspection of the intracranial EEG (iEEG) recordings of Genetic Absence Epilepsy Rat from Strasbourg (GAERS). We observe that most of the recording channels from different brain regions display seizure activity. We ask whether the brain behavior changes within a given seizure. Using source separation methods on temporal sliding windows, we develop a map of dynamic behavior to study this dynamicity. The map is built by computing the correlation between the main sources extracted in different time windows. The proposed method is applied to the iEEG of four GAERS. We see that the behavior of the brain changes about 0.5-1.5 s after onset, when the relevant temporal sources become very similar. The corresponding spatial maps for each time window show that the seizure activity starts from a focus and propagates quickly.


TU-L02-4: Fast, Variation-Based Methods for the Analysis of Extended Brain Sources

Hanna Becker (I3S-CNRS-University of Nice Sophia Antipolis, France); Laurent Albera (Université de Rennes1, France); Pierre Comon (CNRS UMR5216, France); Rémi Gribonval (INRIA, France); Isabelle Merlet (University of Rennes 1, France)

Abstract: Identifying the location and spatial extent of several highly correlated and simultaneously active brain sources from electroencephalographic (EEG) recordings and extracting the corresponding brain signals is a challenging problem. In a recent comparison of source imaging techniques, the VB-SCCD algorithm, which exploits the sparsity of the variational map of the sources, proved to be a promising approach. In this paper, we propose several ways to improve this method. In order to adjust the size of the estimated sources, we add a regularization term that imposes sparsity in the original source domain. Furthermore, we demonstrate the application of ADMM, which permits an efficient solution of the optimization problem. Finally, we also consider the exploitation of the temporal structure of the data by employing L12-norm regularization. The performance of the resulting algorithm, called L12-SVB-SCCD, is evaluated based on realistic simulations in comparison to VB-SCCD and several state-of-the-art techniques for extended source localization.


TU-L02-5: Singular Spectrum Analysis as a Preprocessing Filtering Step for fNIRS Brain Computer Interfaces

Loukianos Spyrou (Radboud University, The Netherlands); Yvonne Blokland (Radboud University, The Netherlands); Jason Farquhar (Radboud University, The Netherlands); Jorgen Bruhn (Radboud University, The Netherlands)

Abstract: Near Infrared Spectroscopy is a method that measures the brain's haemodynamic response. It is of interest in brain-computer interfaces where haemodynamic patterns in motor tasks are exploited to detect movement. However, the NIRS signal is usually corrupted with background biological processes, some of which are periodic or quasi-periodic in nature. Singular spectrum analysis (SSA) is a time-series decomposition method which separates a signal into a trend, oscillatory components and noise with minimal prior assumptions about their nature. Due to the frequency spectrum overlap of the movement response and of background processes such as Mayer waves, spectral filters are usually suboptimal. In this study, we perform SSA both in an online and a block fashion resulting in the removal of periodic components and in increased classification performance. Our study indicates that SSA is a practical method that can replace spectral filtering and is evaluated on healthy participants and patients with tetraplegia.
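The decomposition step of singular spectrum analysis described above can be sketched in a few lines. This is a minimal illustration of basic block SSA (embedding, SVD, diagonal averaging) on a synthetic signal. The function name, window length, and test signal are assumptions made for the example, and it does not reproduce the online variant or the grouping strategy used in the study.

```python
import numpy as np

def ssa_components(x, window):
    """Return the elementary SSA components of x, one per singular value."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window + 1
    # Embedding: column i of the trajectory matrix holds x[i : i + window].
    traj = np.column_stack([x[i:i + window] for i in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = []
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: average each anti-diagonal to get a time series.
        comp = [np.mean(elem[::-1].diagonal(j)) for j in range(-window + 1, k)]
        components.append(comp)
    return np.array(components)

# Synthetic signal: slow trend + oscillation + noise (assumed for illustration).
rng = np.random.default_rng(0)
t = np.arange(200)
x = 0.01 * t + np.sin(2 * np.pi * t / 20) + 0.1 * rng.standard_normal(t.size)

components = ssa_components(x, window=40)
# Keeping only the leading components gives a smoothed reconstruction.
smooth = components[:3].sum(axis=0)
```

Summing all components recovers the original signal exactly, so removing a periodic artifact amounts to leaving the corresponding components out of the sum. Which components to drop depends on the data and is not settled by this sketch.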



Session TU-L03: Audio and Acoustic Signal Processing I


TU-L03-1: Representation of Spectral Envelope with Warped Frequency Resolution for Audio Coder

Ryosuke Sugiura (The University of Tokyo, Japan); Yutaka Kamamoto (NTT Communication Science Labs., Japan); Noboru Harada (NTT Communication Science Labs., Japan); Hirokazu Kameoka (The University of Tokyo, Japan); Takehiro Moriya (NTT, Japan)

Abstract: We have devised a method for representing frequency spectral envelopes with warped frequency resolution based on sparse non-negative matrices aiming at its use for frequency domain audio coding. With optimally prepared matrices, we can selectively control the resolution of spectral envelopes and enhance the coding efficiency. We show that the devised method can enhance the subjective quality of the state-of-the-art wide-band coder at 16 kbit/s at a cost of minor additional complexity. The method is therefore expected to be useful for low-bit-rate and low-delay audio coders for mobile communications.


TU-L03-2: Direct Linear Conversion of LSP Parameters for Perceptual Control in Speech and Audio Coding

Ryosuke Sugiura (The University of Tokyo, Japan); Yutaka Kamamoto (NTT Communication Science Labs., Japan); Noboru Harada (NTT Communication Science Labs., Japan); Hirokazu Kameoka (The University of Tokyo, Japan); Takehiro Moriya (NTT, Japan)

Abstract: We have devised a direct and simple scheme for linear conversion of line spectrum pairs (LSP) with low computational complexity aiming at weighting or inverse weighting spectral envelopes for noise control in speech and audio coders. Using optimally prepared coefficients, we can perform the conversion directly in the LSP domain, which ensures low computational costs and also simplifies the check or the modification of unstable parameters. We show that this method performs the same as the weighting in the linear prediction coding domain but with lower complexity in a low-bit-rate situation. The devised method is therefore expected to be useful for low-bit-rate speech and audio coders for mobile communications.


TU-L03-3: Maximum Likelihood Based Multi-Channel Isotropic Reverberation Reduction for Hearing Aids

Adam Kuklasiński (Oticon A/S, Denmark); Simon Doclo (University of Oldenburg, Germany); Søren Holdt Jensen (Aalborg University, Denmark); Jesper Jensen (Oticon A/S and Aalborg University, Denmark)

Abstract: We propose a multi-channel Wiener filter for speech dereverberation in hearing aids. The proposed algorithm uses joint maximum likelihood estimation of the speech and late reverberation spectral variances, under the assumption that the late reverberant sound field is cylindrically isotropic. The dereverberation performance of the algorithm is evaluated using computer simulations with realistic hearing aid microphone signals including head-related effects. The algorithm is shown to work well with signals reverberated both by synthetic and by measured room impulse responses, achieving improvements in the order of 0.5 PESQ points and 5 dB frequency-weighted segmental SNR.


TU-L03-4: Elimination of Impulsive Disturbances From Stereo Audio Recordings

Maciej Niedźwiecki (Gdansk University of Technology, Poland); Marcin Ciołek (Gdansk University of Technology, Poland)

Abstract: This paper presents a new approach to elimination of impulsive disturbances from stereo audio recordings. The proposed solution is based on vector autoregressive modeling of audio signals. On-line tracking of signal model parameters is performed using the stability-preserving Whittle-Wiggins-Robinson algorithm with exponential data weighting. Detection of noise pulses and model-based interpolation of the irrevocably distorted samples is realized using an adaptive, variable-order Kalman filter. The proposed approach is evaluated on a set of clean audio signals contaminated with real click waveforms extracted from silent parts of old gramophone recordings.


TU-L03-5: Implementation and Evaluation of the Vandermonde Transform

Tom Bäckström (Friedrich-Alexander University Erlangen-Nürnberg, Germany); Johannes Fischer (Friedrich-Alexander University Erlangen-Nürnberg, Germany); Daniel Boley (University of Minnesota, USA)

Abstract: The Vandermonde transform was recently presented as a time-frequency transform which, in contrast to the discrete Fourier transform, also decorrelates the signal. Although the approximate or asymptotic decorrelation provided by Fourier is good enough in many cases, its performance is inadequate in applications which employ short windows. The Vandermonde transform will therefore be useful in speech and audio processing applications, which have to use short analysis windows because the input signal varies rapidly over time. Such applications are often used on mobile devices with limited computational capacity, whereby efficient computations are of paramount importance. Implementation of the Vandermonde transform has, however, turned out to be a considerable effort: it requires advanced numerical tools whose performance is optimized for complexity and accuracy. This contribution provides a baseline solution to this task including a performance evaluation.



Session TU-L04: Signal processing for Massive MIMO and Large Antenna Array 1 (Special Session)


TU-L04-1: Aspects of Favorable Propagation in Massive MIMO

Hien Quoc Ngo (Linköping University, Sweden); Erik G. Larsson (Linköping University, Sweden); Tom Marzetta (Bell Labs, USA)

Abstract: Favorable propagation (FP), defined as mutual orthogonality among the vector-valued channels to the terminals, is one of the key properties of the radio channel that is exploited in Massive MIMO. We first show that FP offers the most desirable scenario in terms of maximizing the sum-capacity. One useful proxy for whether propagation is favorable or not is the channel condition number. However, this proxy is not good for the case where the norms of the channel vectors may not be equal. For this case, to evaluate how favorable the channel is, we propose a "distance from FP" measure. Secondly, we examine how favorable the channels can be for two scenarios: Rayleigh fading and uniform random line-of-sight (UR-LoS). Both environments offer (nearly) FP. To analyze the UR-LoS model, we propose an urns-and-balls model. This model is simple and explains the singular value spread characteristic of the UR-LoS model well.
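The statement that Rayleigh fading offers nearly favorable propagation can be checked numerically with a small sketch. It draws a channel matrix with i.i.d. CN(0, 1) entries and inspects the normalized Gram matrix and the condition number mentioned above as a proxy. The array size, user count, and function name are assumptions chosen for illustration, and the "distance from FP" measure proposed in the paper is not reproduced here.

```python
import numpy as np

def rayleigh_channel(num_antennas, num_users, rng):
    """Channel matrix with i.i.d. CN(0, 1) entries (one column per terminal)."""
    real = rng.standard_normal((num_antennas, num_users))
    imag = rng.standard_normal((num_antennas, num_users))
    return (real + 1j * imag) / np.sqrt(2)

rng = np.random.default_rng(0)
num_antennas, num_users = 4000, 4  # assumed sizes for the illustration
h = rayleigh_channel(num_antennas, num_users, rng)

# Under favorable propagation the channel vectors are mutually orthogonal,
# so the normalized Gram matrix is close to diagonal.
gram = h.conj().T @ h / num_antennas
off_diag = gram - np.diag(np.diag(gram))
max_off_diag = np.abs(off_diag).max()

# Condition number of the channel matrix; values near 1 indicate
# nearly orthogonal columns with similar norms.
cond_number = np.linalg.cond(h)
```

With many more antennas than users the off-diagonal entries shrink roughly as one over the square root of the antenna count. As the abstract notes, the condition number is a less reliable proxy when the channel vectors have unequal norms.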


TU-L04-2: Channel Estimation for Millimeter-Wave Very Large MIMO

Daniel Araújo (Federal University of Ceará, Brazil); André Almeida (Federal University of Ceará, Brazil); Johan Axnäs (Ericsson Research, Sweden); João Cesar Mota (Wireless Telecom Research Group - Federal University of Ceará, Brazil)

Abstract: We present an efficient pilot-assisted technique for downlink channel estimation in VL-MIMO systems operating in a 60 GHz indoor channel. Our estimator exploits the inherent sparsity of the channel and requires quite low pilot overhead. It is based on a coarse estimation stage that capitalizes on compressed sensing, followed by a refinement stage to find the transmit/receive spatial frequencies. Considering a ray-tracing channel model, the system throughput is evaluated from computer simulations by considering different beamforming schemes designed from the estimated channel. Our results show that the proposed channel estimator performs quite well with very low pilot overhead.


TU-L04-3: CHEMP Receiver for Large-scale Multiuser MIMO Systems Using Spatial Modulation

T. Lakshmi Narasimhan (Indian Institute of Science, Bangalore, India); A. Chockalingam (Indian Institute of Science, India)

Abstract: In spatial modulation (SM), information bits are conveyed through the index of the active transmit antenna in addition to the information bits conveyed through conventional modulation symbols. In this paper, we propose a receiver for large-scale multiuser spatial modulation MIMO (SM-MIMO) systems. The proposed receiver exploits the channel hardening phenomenon observed in large-dimensional MIMO channels. It works with a matched filtered system model. On this system model, it obtains an estimate of the matched filtered channel matrix (rather than the channel matrix itself) and uses this estimate for detecting the data. The data detection is done using an approximate message passing algorithm. The proposed receiver, referred to as the channel hardening-exploiting message passing receiver for SM (CHEMP-SM), is shown to achieve very good performance at low complexity.


TU-L04-4: Hardware Realizable Lattice-Reduction-Aided Detectors for Large-Scale MIMO Systems

Qi Zhou (Georgia Institute of Technology, USA); Xiaoli Ma (Georgia Institute of Technology, USA)

Abstract: Because of their lower complexity and better error performance over K-best detectors, lattice-reduction (LR)-aided K-best detectors have recently been proposed for large-scale multi-input multi-output (MIMO) detection. Among existing LR-aided K-best detectors, the complex LR-aided K-best detector is more attractive than its real counterpart due to its potentially lower latency and resource usage. However, one main difficulty in hardware implementation of complex LR-aided K-best is to efficiently find the top K children of each layer in the complex domain. In this paper, we propose and implement an LR-aided K-best algorithm that efficiently finds the top K children in each layer when K is relatively small. Our implementation results on Xilinx FPGA show that, with the aid of LR, the proposed LR-aided K-best implementation can support 3 Gbps transmissions for 16x16 MIMO systems with 1024-QAM with about 2.7 dB loss to the maximum likelihood detector at bit-error rate 10^{-4}.


TU-L04-5: Iterative Detection and Decoding in 3GPP LTE-based Massive MIMO Systems

Michael Wu (Rice University, USA); Chris Dick (Xilinx, USA); Joseph R. Cavallaro (Rice University, USA); Christoph Studer (Cornell University, USA)

Abstract: Massive multiple-input multiple-output (MIMO) is expected to be a key technology in next-generation multi-user cellular systems for achieving higher throughput and better link reliability than existing (small-scale) MIMO systems. In this work, we develop a novel, low-complexity iterative detection and decoding algorithm for single carrier frequency division multiple access (SC-FDMA)-based massive MIMO systems, such as future 3GPP LTE-based systems. The proposed algorithm combines a novel frequency-domain minimum mean-square error (FD-MMSE) equalization method with parallel interference cancellation (PIC), requires low computational complexity, and achieves near-optimal error-rate performance in realistic 3GPP-LTE-based massive MIMO systems having roughly 2x more base-station antennas than users.



Session TU-L05: Signal Processing for Multimodal Data (Special Session)


TU-L05-1: Challenges in Multimodal Data Fusion

Dana Lahat (Gipsa-Lab, France); Tulay Adali (University of Maryland, Baltimore County, USA); Christian Jutten (GIPSA-Lab, France)

Abstract: In various disciplines, information about the same phenomenon can be acquired from different types of detectors, under different conditions, at different observation times, in multiple experiments or subjects, etc. We use the term "modality" to denote each such type of acquisition framework. Due to the rich characteristics of natural phenomena, as well as of the environments in which they occur, it is rare that a single modality can provide complete knowledge of the phenomenon of interest. The increasing availability of several modalities at once introduces new degrees of freedom, which raise questions beyond those related to exploiting each modality separately. It is the aim of this paper to evoke and promote various challenges in multimodal data fusion at the conceptual level, without focusing on any specific model, method or application.


TU-L05-2: Challenges and Opportunities of Multimodality and Data Fusion in Remote Sensing

Mauro Dalla Mura (Grenoble Institute of Technology, France); Saurabh Prasad (University of Houston, USA); Fabio Pacifici (DigitalGlobe Inc., USA); Paolo Gamba (Università degli Studi di Pavia, Italy); Jocelyn Chanussot (Grenoble Institute of Technology, France)

Abstract: Remote sensing is one of the most common ways to extract relevant information about the Earth through observations. Remote sensing acquisitions can be done by both active (SAR, LiDAR) and passive (optical and thermal range, multispectral and hyperspectral) devices. According to the sensor, diverse information of Earth's surface can be obtained. These devices provide information about the structure (optical, SAR), elevation (LiDAR) and material content (multi- and hyperspectral). Together they can provide information about land use (urban, climatic changes), natural disasters (floods, hurricanes, earthquakes), and potential exploitation (oil fields, minerals). In addition, images taken at different times can provide information about damages from floods, fires, seasonal changes etc. In this paper, we sketch the current opportunities and challenges related to the exploitation of multimodal data for Earth observation.


TU-L05-3: A Flexible Modeling Framework for Coupled Matrix and Tensor Factorizations

Evrim Acar (University of Copenhagen, Denmark); Mathias Nilsson (University of Copenhagen, Denmark); Michael Saunders (Stanford University, USA)

Abstract: Joint analysis of data from multiple sources has proved useful in many disciplines including metabolomics and social network analysis. However, data fusion remains a challenging task in need of data mining tools that can capture the underlying structures from multi-relational and heterogeneous data sources. In order to address this challenge, data fusion has been formulated as a coupled matrix and tensor factorization (CMTF) problem. Coupled factorization problems have commonly been solved using alternating methods and, recently, unconstrained all-at-once optimization algorithms. In this paper, unlike previous studies, in order to have a flexible modeling framework, we use a general-purpose optimization solver that solves for all factor matrices simultaneously and is also capable of handling linear/nonlinear constraints with a nonlinear objective function. We formulate CMTF as a constrained optimization problem and develop accurate models more robust to overfactoring. The effectiveness of the proposed modeling/algorithmic framework is demonstrated on simulated and real data.


TU-L05-4: Geometry Calibration of Distributed Microphone Arrays Exploiting Audio-Visual Correspondences

Axel Plinge (TU Dortmund University, Germany); Gernot Fink (TU Dortmund University, Germany)

Abstract: Smart rooms are used for a growing number of practical applications. They are often equipped with microphones and cameras allowing acoustic and visual tracking of persons. For that, the geometry of the sensors has to be calibrated. In this paper, a method is introduced that calibrates the microphone arrays by using the visual localization of a speaker at a small number of fixed positions. By matching the positions to the direction of arrival (DoA) estimates of the microphone arrays, their absolute position and orientation are derived. Data from a reverberant smart room is used to show that the proposed method can estimate the absolute geometry with about 0.1 m and 2 degrees precision. The calibration is good enough for acoustic and multimodal tracking applications and eliminates the need for dedicated calibration measures by using the tracking data itself.


TU-L05-5: Incorporating Higher Dimensionality in Joint Decomposition of EEG and FMRI

Wout Swinnen (KU Leuven, Belgium); Borbála Hunyadi (KU Leuven, Belgium); Evrim Acar (University of Copenhagen, Denmark); Sabine Van Huffel (Katholieke Universiteit Leuven, Belgium); Maarten De Vos (University of Oldenburg, Germany)

Abstract: EEG-fMRI research to study brain function became popular because of the complementarity of the modalities. Through the use of data-driven approaches such as jointICA, sources extracted from EEG can be linked to regions in fMRI. JointICA in its standard formulation, however, does not allow for the inclusion of multiple EEG electrodes, so it is a rather arbitrary choice which electrode is used in the analysis. In this study, we explore several ways to include the higher dimensionality of the EEG during a joint decomposition of EEG and fMRI. Our results show that incorporation of multiple channels in the jointICA can reveal new relations between fMRI activation maps and ERP features.



Session TU-L06: Design and Implementation of Signal Processing Systems


TU-L06-1: Minimum Energy Control of Fractional Positive Electrical Circuits

Tadeusz Kaczorek (Bialystok University of Technology, Poland)

Abstract: The minimum energy control problem for fractional positive electrical circuits is formulated and solved. Sufficient conditions for the existence of a solution to the problem are established. A procedure for solving the problem is proposed and illustrated by an example of a fractional positive electrical circuit.


TU-L06-2: Compressive Sensing Spectrum Recovery From Quantized Measurements in 28 nm SOI CMOS

David Bellasi (ETH Zurich, Switzerland); Luca Bettini (ETH Zurich, Switzerland); Thomas Burger (Swiss Federal Institute of Technology (ETH) Zurich, Switzerland); Christian Benkeser (RUAG Space, Switzerland); Qiuting Huang (ETH Zurich, Switzerland); Christoph Studer (Cornell University, USA)

Abstract: Spectral activity detection of wideband radio-frequency signals for cognitive radios typically requires expensive and energy-inefficient analog-to-digital converters. Fortunately, the RF spectrum is in many practical situations sparsely populated, which enables the design of so-called analog-to-information (A2I) converters. A2I converters are capable of acquiring and extracting the spectral activity information at low cost and low power by means of compressive sensing (CS). In this paper, we present a high-throughput spectrum recovery stage for CS-based wideband A2I converters. The recovery stage is designed for a CS-based signal acquisition front-end that performs pseudo-random subsampling in combination with coarse quantization. High-throughput spectrum activity detection from such coarsely quantized and compressive measurements is achieved through a massively-parallel VLSI design of a novel accelerated sparse signal dequantization (ASSD) algorithm. The resulting design is implemented in 28 nm SOI CMOS and is able to reconstruct 2^15-point frequency-sparse RF spectra at a rate of more than 7.6k reconstructions/second.


TU-L06-3: Unbiased RLS Identification of Errors-in-Variables Models in the Presence of Correlated Noise

Reza Arablouei (University of South Australia, Australia); Kutluyıl Doğançay (University of South Australia, Australia); Tulay Adali (University of Maryland, Baltimore County, USA)

Abstract: We propose an unbiased recursive-least-squares(RLS)-type algorithm for errors-in-variables system identification when the input noise is colored and correlated with the output noise. To derive the proposed algorithm, which we call unbiased RLS (URLS), we formulate an exponentially-weighted least-squares problem that yields an unbiased estimate. Then, we solve the associated normal equations utilizing the dichotomous coordinate-descent iterations. Simulation results show that the estimation performance of the proposed URLS algorithm is similar to that of a previously proposed bias-compensated RLS (BCRLS) algorithm. However, the URLS algorithm has appreciably lower computational complexity as well as improved numerical stability compared with the BCRLS algorithm.


TU-L06-4: A 128~2048/1536 Point FFT Hardware Implementation with Output Pruning

Tuba Ayhan (Katholieke Universiteit Leuven, Belgium); Wim Dehaene (Katholieke Universiteit Leuven, Belgium); Marian Verhelst (Katholieke Universiteit Leuven, Belgium)

Abstract: In this work, an FFT architecture supporting variable FFT sizes, 128~2048/1536, is proposed. This implementation is a combination of a 2^p point Common Factor FFT and a 3 point DFT. Various FFT output pruning techniques for this architecture are discussed in terms of memory and control logic overhead. It is shown that using the Prime Factor FFT in the 1536 point FFT increases throughput by exploiting single tone pruning with low control logic overhead. The proposed FFT processor is implemented on a Xilinx Virtex 5 FPGA. It occupies only 3148 LUTs and 612 kb of memory in the FPGA and calculates a 1536 point FFT in less than 3092 clock cycles with output pruned settings.
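As a rough illustration of how a 3 point DFT and a 2^p point transform can be combined without twiddle factors, the following sketch applies the Good-Thomas prime factor index mapping to a length N = N1 * N2 signal with coprime N1 and N2 (for 1536 points, N1 = 3 and N2 = 512). It is not the authors' hardware architecture and includes no output pruning; the function names `dft` and `pfa_dft` are hypothetical, and a naive DFT stands in for the sub-transforms.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, used here as a stand-in for the sub-transforms."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


def pfa_dft(x, n1, n2):
    """Prime factor (Good-Thomas) DFT of length n1 * n2, with gcd(n1, n2) == 1."""
    n = n1 * n2
    assert len(x) == n
    # Modular inverses for the CRT output map (pow(a, -1, m) needs Python 3.8+).
    inv_n2 = pow(n2, -1, n1)
    inv_n1 = pow(n1, -1, n2)
    # Input map: n = (n2 * i + n1 * j) mod N arranges x on an n1-by-n2 grid.
    grid = [[x[(n2 * i + n1 * j) % n] for j in range(n2)] for i in range(n1)]
    # Length-n2 DFT along each row.
    rows = [dft(row) for row in grid]
    out = [0j] * n
    for k2 in range(n2):
        # Length-n1 DFT along each column; no twiddle factors in between.
        col = dft([rows[i][k2] for i in range(n1)])
        for k1 in range(n1):
            # CRT output map: k = k1 (mod n1) and k = k2 (mod n2).
            k = (k1 * n2 * inv_n2 + k2 * n1 * inv_n1) % n
            out[k] = col[k1]
    return out
```

Calling `pfa_dft(x, 3, 512)` on a 1536-sample list would follow the same path; in practice the row transforms would be radix-2 FFTs rather than naive DFTs.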


TU-L06-5: GPU Parallel Implementation of the Approximate K-SVD Algorithm Using OpenCL

Paul Irofti (University Politehnica Bucharest, Romania); Bogdan Dumitrescu (Tampere University of Technology, Finland)

Abstract: Training dictionaries for sparse representations is a time consuming task, due to the large size of the data involved and to the complexity of the training algorithms. We investigate a parallel version of the approximate K-SVD algorithm, where multiple atoms are updated simultaneously, and implement it using OpenCL, for execution on graphics processing units (GPU). This not only allows reducing the execution time with respect to the standard sequential version, but also gives dictionaries with which the training data are better approximated. We present numerical evidence supporting this somewhat surprising conclusion and discuss in detail several implementation choices and difficulties.



Session TU-L07: Image and Video Coding I


TU-L07-1: A Method for Early-Splitting of HEVC Inter Blocks Based on Decision Trees

Guilherme Correa (University of Coimbra, Portugal); Pedro A. Amado Assuncao (Instituto de Telecomunicacoes / Polytechnic Institute of Leiria, Portugal); Luciano Agostini (Federal University of Pelotas, Brazil); Luís Cruz (Instituto de Telecomunicacoes / University of Coimbra, Portugal)

Abstract: The High Efficiency Video Coding (HEVC) standard provides a large improvement in terms of compression efficiency in comparison to its predecessors, mainly due to the introduction of new coding tools and more flexible data structures. However, since many more options are tested in a Rate-Distortion (R-D) optimization scheme, such improvement is accompanied by a significant increase in the encoding computational complexity. We propose in this paper a novel method for efficient early-splitting decision of inter-predicted Coding Blocks (CB). The method employs a set of decision trees which are trained using information from unconstrained HEVC encoding runs. The resulting early-splitting decision process has an accuracy of 86% with a negligible computational overhead and an average computational complexity decrease of 42% at the cost of a very small Bjontegaard Delta (BD)-rate increase (0.3%).


TU-L07-2: Dynamic Motion Vector Refreshing for Enhanced Error Resilience in HEVC

Joao F. M. Carreira (University of Surrey / Instituto de Telecomunicações, Portugal); Varuna De Silva (Apical Ltd, United Kingdom); Erhan Ekmekcioglu (University of Surrey, United Kingdom); Ahmet Kondoz (University of Surrey, United Kingdom); Sérgio M. M. Faria (Instituto de Telecomunicacoes, Portugal); Pedro A. Amado Assuncao (Instituto de Telecomunicacoes / Polytechnic Institute of Leiria, Portugal)

Abstract: The high level of compression efficiency achieved by HEVC coding techniques decreases the error resilience performance under error prone conditions. This paper addresses the error resiliency of the HEVC standard, focusing on the new motion estimation tools. It is shown that the temporal dependency of motion information is comparatively higher than that in the H.264/AVC standard, causing an increase in the error propagation. Based on this evidence, this paper proposes a method to make intelligent use of temporal motion vector (MV) candidates during the motion estimation process, in order to decrease the temporal dependency, and improve the error resiliency without penalising the rate-distortion performance. The simulation results show that the proposed method improves the error resilience under tested conditions by increasing the video quality by up to 1.7 dB on average, compared to the reference method that always enables temporal MV candidates.


TU-L07-4: Joint Disparity and Motion Estimation Using Optical Flow for Multiview Distributed Video Coding

Matteo Salmistraro (Technical University of Denmark, Denmark); Lars Lau Rakêt (University of Copenhagen, Denmark); Catarina Brites (IST - IT, Portugal); João Ascenso (Instituto Superior de Engenharia de Lisboa, Portugal); Soren Forchhammer (Technical University of Denmark, Denmark)

Abstract: Distributed Video Coding (DVC) is a video coding paradigm where the source statistics are exploited at the decoder based on the availability of Side Information (SI). In a monoview video codec, the SI is generated by exploiting the temporal redundancy of the video, through motion estimation and compensation techniques. In a multiview scenario, the correlation between views can also be exploited to further enhance the overall Rate-Distortion (RD) performance. Thus, to generate SI in a multiview distributed coding scenario, a joint disparity and motion estimation technique is proposed, based on optical flow. The proposed SI generation algorithm allows for RD improvements up to 10% (Bjontegaard) in bit-rate savings, when compared with block-based SI generation algorithms leveraging temporal and inter-view redundancies.


TU-L07-5: Two-Dimensional Non-Separable Block-Lifting-based M-Channel Biorthogonal Filter Banks

Taizo Suzuki (University of Tsukuba, Japan); Hiroyuki Kudo (University of Tsukuba, Japan)

Abstract: We propose a two-dimensional non-separable block-lifting structure (NSBL) that is easily formulated from the one-dimensional separable block-lifting structure (SBL) and 2D non-separable lifting structure (NSL). The NSBL can be regarded as an extension of the NSL because a two-channel NSBL is completely equivalent to an NSL. We apply the NSBL to M-channel (M=2^{n}, where n is an integer) biorthogonal filter banks (BOFBs). The NSBL-based BOFBs (NSBL-BOFBs) outperform SBL-based BOFBs (SBL-BOFBs) at lossy-to-lossless coding, whose image quality is scalable from lossless data to highly compressed lossy data, because their rounding error is reduced by merging many rounding operations, i.e., the number of rounding operations in the NSBL is almost half that in the SBL.


TU-L07-6: Efficient Quantization Parameter Estimation in HEVC Based on Rho-Domain

Thibaud Biatek (BCom, France); Mickaël Raulet (IETR/INSA Rennes, France); Jean-Francois Travers (TDF, France); Olivier Deforges (IETR, Rennes, France)

Abstract: This paper proposes a quantization parameter estimation algorithm for HEVC CTU rate control. Several methods were proposed, mostly based on Lagrangian optimization combined with Laplacian distribution for transformed coefficients. These methods are accurate but increase the encoder complexity. This paper provides an innovative reduced complexity algorithm based on a rho-domain rate model. Indeed, for each CTU, the algorithm predicts encoding parameters based on co-located CTU. By combining it with Laplacian distribution for transformed coefficients, we obtain the dead-zone boundary for quantization and the related quantization parameter. Experiments in the HEVC HM Reference Software show a good accuracy with only a 3% average bitrate error and no PSNR deterioration for random-access configuration.
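To make the rho-domain idea concrete, the sketch below uses the commonly cited linear rho-domain rate model R = theta * (1 - rho), where rho is the fraction of quantized coefficients equal to zero, together with a zero-mean Laplacian coefficient model for which P(|c| < d) = 1 - exp(-lam * d). The symbols `theta` and `lam`, the function names, and the numbers are illustrative assumptions rather than values from the paper, and the final mapping from dead-zone boundary to quantization parameter is deliberately left out.

```python
import math


def rho_from_rate(target_bits, theta):
    """Invert the linear rho-domain model R = theta * (1 - rho)."""
    rho = 1.0 - target_bits / theta
    # Clamp to the valid range of a fraction of zero coefficients.
    return min(max(rho, 0.0), 1.0)


def dead_zone_from_rho(rho, lam):
    """Dead-zone boundary d with P(|c| < d) = rho for a Laplacian of rate lam."""
    if rho >= 1.0:
        return math.inf  # every coefficient is quantized to zero
    return -math.log(1.0 - rho) / lam


# Hypothetical CTU: theta predicted from a co-located CTU, lam fitted to its coefficients.
rho = rho_from_rate(target_bits=500.0, theta=1000.0)
boundary = dead_zone_from_rho(rho, lam=2.0)
```

With these hypothetical inputs, half of the coefficients must fall inside the dead zone, which places its boundary at ln(2) / 2.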



Session TU-L08: Audio and Acoustic Signal Processing II


TU-L08-1: Source Localization and Signal Reconstruction in a Reverberant Field Using the FDTD Method

Niccolò Antonello (KU Leuven, Belgium); Toon van Waterschoot (KU Leuven, Belgium); Marc Moonen (KU Leuven, Belgium); Patrick A Naylor (Imperial College London, United Kingdom)

Abstract: Numerical methods applied to room acoustics are usually employed to predict the sound pressure at certain positions generated by a known source. In this paper the inverse problem is studied: given a number of microphones placed in a room, the sound pressure is known at these positions and this information may be used to perform a localization and signal reconstruction of the sound source. The source is assumed to be spatially sparse meaning it can be modeled as a point source. The finite difference time domain method is used to model the acoustics of a simple two dimensional square room and its matrix formulation is presented. A two step method is proposed. First a convex optimization problem is solved to localize the source while exploiting its spatial sparsity. Once its position is known the source signal can be reconstructed by solving an overdetermined system of linear equations.


TU-L08-2: Real-Time Localization of Multiple Audio Sources in A Wireless Acoustic Sensor Network

Anthony Griffin (FORTH/University of Crete, Greece); Anastasios Alexandridis (FORTH/University of Crete, Greece); Despoina Pavlidi (University of Crete, Greece); Athanasios Mouchtaris (Foundation for Research and Technology-Hellas, Greece)

Abstract: In this work we propose a grid-based method to estimate the location of multiple sources in a wireless acoustic sensor network, where each sensor node contains a microphone array and only transmits direction-of-arrival estimates in each time interval, minimizing the transmissions to the central processing node. We present new work on modeling the DOA estimation error in such a scenario. Through extensive, realistic simulations, we show our method outperforms other state-of-the-art methods, in both accuracy and complexity. We present localization results of real recordings in an outdoor cell of a sensor network.


TU-L08-3: Audio Source Separation Using Multiple Deformed References

Nathan Souviraà-Labastie (Université Rennes 1, France); Anaik Olivero (Inria, Centre Rennes - Bretagne-Atlantique, France); Emmanuel Vincent (Inria Nancy - Grand Est, France); Frédéric Bimbot (IRISA (CNRS & INRIA), France)

Abstract: This paper deals with audio source separation guided by multiple audio references. We present a general framework where additional audio references for one or more sources of a given mixture are available. Each audio reference is another mixture which is supposed to contain at least one source similar to one of the target sources. Deformations between the sources of interest and their references are modeled in a general manner. A nonnegative matrix co-factorization algorithm is used which allows sharing of information between the considered mixtures. We run our algorithm on music plus voice mixtures with music and/or voice references. Applied on movies and TV series data, our algorithm improves the signal-to-distortion ratio (SDR) of the sources with the lowest intensity by 9 to 12 decibels with respect to the original mixture.


TU-L08-4: NMF with Spectral and Temporal Continuity Criteria for Monaural Sound Source Separation

Julian M. Becker (RWTH Aachen University, Germany); Christian Sohn (RWTH Aachen University, Germany); Christian Rohlfing (RWTH Aachen University, Germany)

Abstract: Nonnegative Matrix Factorization (NMF) is a well suited and widely used method for monaural sound source separation. It has been shown that an additional cost term supporting temporal continuity can improve the separation quality. We extend this model by adding a cost term that penalizes large variations in the spectral dimension. We propose two different cost terms for this purpose and also propose a new cost term for temporal continuity. We evaluate these cost terms on different mixtures of samples of pitched instruments, drum sounds and other acoustical signals. Our results show that penalizing large spectral variations can improve separation quality. The results also show that our alternative temporal continuity cost term leads to better separation results than existing temporal continuity cost terms.
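The abstract does not state the cost terms themselves, so the following is only a generic sketch of how continuity penalties can be attached to an NMF objective V ≈ W H. It assumes simple squared-difference penalties between adjacent frames of the gains H (temporal) and between adjacent frequency bins of the basis spectra W (spectral); these are placeholders, not the terms proposed by the authors, and the names `alpha` and `beta` are hypothetical weights.

```python
def temporal_continuity(H):
    """Sum of squared differences between adjacent frames in each gain row of H."""
    return sum((row[t] - row[t - 1]) ** 2 for row in H for t in range(1, len(row)))


def spectral_variation(W):
    """Sum of squared differences between adjacent frequency bins in each column of W."""
    bins, comps = len(W), len(W[0])
    return sum((W[f][k] - W[f - 1][k]) ** 2 for f in range(1, bins) for k in range(comps))


def reconstruction_error(V, W, H):
    """Squared Euclidean distance between V and the product W H."""
    err = 0.0
    for f in range(len(V)):
        for t in range(len(V[0])):
            approx = sum(W[f][k] * H[k][t] for k in range(len(H)))
            err += (V[f][t] - approx) ** 2
    return err


def total_cost(V, W, H, alpha, beta):
    """Reconstruction error plus weighted temporal and spectral penalties."""
    return (reconstruction_error(V, W, H)
            + alpha * temporal_continuity(H)
            + beta * spectral_variation(W))
```

An NMF update rule would then minimize `total_cost` over nonnegative W and H; the weights trade separation fidelity against smoothness.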


TU-L08-5: A Binaural Hearing Aid Speech Enhancement Method Maintaining Spatial Awareness for the User

Joachim Thiemann (Carl-von-Ossietzky Universität Oldenburg, Germany); Menno Müller (Carl-von-Ossietzky Universität Oldenburg, Germany); Steven van de Par (University of Oldenburg, Germany)

Abstract: Multi-channel hearing aids can use directional algorithms to enhance speech signals based on their spatial location. In the case where a hearing aid user is fitted with a binaural hearing aid, it is important that the binaural cues are kept intact, such that the user does not lose spatial awareness, the ability to localize sounds, or the benefits of spatial unmasking. Typically algorithms focus on rendering the source of interest in the correct spatial location, but degrade all other source positions in the auditory scene. In this paper, we present an algorithm that uses a binary mask such that the target signal is enhanced but the background noise remains unmodified except for an attenuation. We also present two variations of the algorithm, and in initial evaluations find that this type of mask-based processing has promising performance.



Session TU-L09: Signal processing for Massive MIMO and Large Antenna Array 2 (Special Session)


TU-L09-1: Multi-polarized Multi-user Massive MIMO: Precoder Design and Performance Analysis

Jaehyun Park (Imperial College London, United Kingdom); Bruno Clerckx (Imperial College London, United Kingdom)

Abstract: The space limitation and the channel acquisition of a large number of antennas prevent the massive MIMO system from operating in a practical setup. Multi-polarized antennas can be one solution to alleviate the first obstacle. Furthermore, the dual structured precoding, in which a preprocessing based on the spatial correlation and a subsequent linear precoding based on the short-term CSIT are concatenated, can reduce the feedback overhead efficiently. To reduce the feedback overhead further, we propose a dual structured multi-user linear precoding, in which the subgrouping method based on co-polarization is additionally applied to the spatially grouped MSs in the preprocessing stage. By investigating the behavior of the asymptotic performance over the cross-polar discrimination (XPD) parameter, we also propose a new dual structured precoding, in which the XPD, spatial correlation, and CSIT quality are jointly utilized in the precoding/feedback for the multi-polarized multi-user massive MIMO system.


TU-L09-2: Reduced-Rank Widely Linear Precoding in Massive MIMO Systems with I/Q Imbalance

Wence Zhang (Southeast University, P.R. China); Rodrigo C. de Lamare (University of York, United Kingdom); Ming Chen (Southeast University, P.R. China)

Abstract: We present reduced-rank widely linear precoding algorithms for Massive MIMO systems with I/Q imbalance (IQI). With a large number of transmit antennas, the imperfect I/Q branches at the transmitter have a significant impact on the downlink performance. We develop linear precoding techniques using an equivalent real-valued model to mitigate IQI and multiuser interference. In order to reduce the computational complexity required by the matrix inverse, a widely linear reduced-rank precoding strategy based on the Krylov subspace (KS) is devised. Simulation results show that the proposed methods work well under IQI, and the KS precoding algorithm performs almost as well as the full-rank precoder while requiring much lower complexity.


TU-L09-3: Flexible Coordinated Beamforming with Lattice Reduction for Multi-User Massive MIMO Systems

Keke Zu (University of York, United Kingdom); Bin Song (Ilmenau University of Technology, Germany); Martin Haardt (Ilmenau University of Technology, Germany); Rodrigo C. de Lamare (University of York, United Kingdom)

Abstract: The application of precoding algorithms in multi-user Massive multiple-input multiple-output (MU-Massive-MIMO) systems is restricted by the dimensionality constraint that the number of transmit antennas has to be greater than or equal to the total number of receive antennas. In this paper, a lattice reduction (LR)-aided flexible coordinated beamforming (LR-FlexCoBF) algorithm is proposed to overcome the dimensionality constraint and to equip overloaded MU-Massive-MIMO systems. A random user selection scheme is also integrated with the proposed LR-FlexCoBF to extend its application to MU-Massive-MIMO systems with different overloading situations. Simulation results show that remarkable improvements in terms of bit error rate (BER) and sum-rate performances can be achieved by the proposed LR-FlexCoBF precoding algorithm.


TU-L09-4: Decentralized Multi-cell Beamforming Via Large System Analysis in Correlated Channels

Hossein Asgharimoghaddam (University of Oulu, Finland); Antti Tölli (University of Oulu, Finland); Premanandana Rajatheva (University of Oulu, Finland)

Abstract: A decentralized multi-cell minimum power beamforming problem is considered. The optimal decentralization requires exchange of terms related to instantaneous inter-cell interference (ICI) values or channel state information (CSI) via a backhaul link. This limits the achievable performance in limited backhaul capacity scenarios, especially when dealing with a fast fading scenario or a large number of users and antennas. In this work, we utilize results from random matrix theory for developing two algorithms based on uplink-downlink duality and optimization decomposition relying on limited cooperation between nodes to share knowledge about channel statistics. As a result, approximately optimal beamformers are achieved with greatly reduced backhaul information exchange rate. The simulations show that the performance gap due to approximations is small even when the problem dimensions are relatively small. Moreover, the results are very general as the channel model considered allows different correlation matrices for distinct users which can model various propagation environments.


TU-L09-5: Generalised Spatial Modulation for Large-Scale MIMO

Abdelhamid Younis (The University of Edinburgh, United Kingdom); Raed Mesleh (University of Tabuk, Saudi Arabia); Marco Di Renzo (French National Center for Scientific Research (CNRS), France); Harald Haas (The University of Edinburgh, United Kingdom)

Abstract: In this paper, the performance of GSM and SM systems is studied assuming channel estimation errors (CSEs) and correlated Rayleigh and Rician fading channels. A new, simple, accurate and general analytical closed-form upper bound for the ABER performance of both systems is derived. The analytical bound is shown to be applicable to correlated and uncorrelated channels, as well as to small and large scale MIMO systems. The results demonstrate that GSM is more suitable for large-scale MIMO systems than SM. The performance gain of GSM over SM is about 5 dB. The results also show that SM is very robust to CSEs. Specifically, the performance degradation of SM in the presence of CSEs is 0.7 dB and 0.3 dB for Rayleigh and Rician fading channels, respectively. Lastly, the findings in this paper showcase that both GSM and SM are very promising candidates for future large-scale MIMO systems.



Session TU-L10: Multi-Modal Sensing and Analysis of Human Motion for Sporting and Leisure Applications (Special Session)


TU-L10-1: Interactive Games for Preservation and Promotion of Sporting Movements

Noel. E. O'Connor (Dublin City University, Ireland); Yvain Tisserand (MIRALab, University of Geneva, Switzerland); Argyris Chatzitofis (Centre for Research and Technology Hellas, Information Technologies Institute, Greece); Francois Destelle (Dublin City University, Ireland); Jon Goenetxea (Vicomtech-IK4, Spain); Luis Unzueta (Vicomtech-IK4, Spain); Dimitrios Zarpalas (Informatics and Telematics Institute, Greece); Petros Daras (Centre for Research and Technology Hellas, Greece); Mariate Linaza (Vicomtech-IK4, Spain); Kieran Moran (Dublin City University, Ireland); Nadia Magnenat-Thalmann (Miralab, Switzerland)

Abstract: In this paper we describe two interactive applications for capturing the motion signatures associated with key skills of traditional sports and games. We first present the case for sport as an important example of intangible cultural heritage. We then explain that sport requires special consideration in terms of digitization for preservation as the key aspects to be digitized are the characteristic movement signatures of such sports. We explain that, given the nature of traditional sporting agencies, this requires low-cost motion capture technology. Furthermore we argue that in order to ensure ongoing preservation, this should be provided via fun interactive gaming scenarios that promote uptake of the sports, particularly among children. We then present two such games that we have developed and illustrate their performance.


TU-L10-2: 3DLIVE: A Multi-Modal Sensing Platform Allowing Tele-Immersive Sports Applications

Benjamin Poussard (Arts et Métiers Paristech, France); Simon Richir (Arts et Métiers Paristech, LAMPA, France); Dimitrios Zarpalas (Informatics and Telematics Institute, Greece); Stylianos Asteriadis (Information Technologies Institute, Greece); Petros Daras (Centre for Research and Technology Hellas, Greece)

Abstract: The 3DLive project is developing a user-driven mixed reality platform, intended for augmented sports. Using the latest sensing techniques, 3DLive will allow remote users to share a three-dimensional sports experience, interacting with each other in a mixed reality space. This paper presents the multi-modal sensing technologies used in the platform. 3DLive aims at delivering a high sense of tele-immersion among remote users, regardless of whether they are indoors or outdoors, in the context of augmented sports. In this paper, functional and technical details of the first prototype of the jogging scenario are presented, while a clear separation between indoor and outdoor users is given, since different technologies need to be employed for each case.


TU-L10-3: Viewpoint-dependent 3D Human Body Posing for Sports Legacy Recovery From Images and Video

Luis Unzueta (Vicomtech-IK4, Spain); Jon Goenetxea (Vicomtech-IK4, Spain); Mikel Rodriguez (Vicomtech-IK4, Spain); Mariate Linaza (Vicomtech-IK4, Spain)

Abstract: In this paper we present a method for 3D human body pose reconstruction from images and video, in the context of sports legacy recovery. The video and image legacy content can include camera motion, several players, considerable partial occlusions, motion blur and image noise, recorded with non-calibrated cameras, which further increases the difficulty of solving the problem of 3D reconstruction from 2D data. Therefore, we propose a semi-automatic approach in which a set of 2D key-points are manually marked in key-frames and then an automatic process estimates the camera calibration parameters, the positions and poses of the players and their body part dimensions. In-between frames are automatically estimated taking into account constraints related to human kinematics and collisions with the environment. Experimental results show that this approach obtains reconstructions that can help to analyze playing techniques and the evolution of sports through time.


TU-L10-4: Articulated Human Motion Tracking with Foreground Learning

Aichun Zhu (University of Technology of Troyes, France); Hichem Snoussi (University of Technology of Troyes, France); Abel Cherouat (University of Technology of Troyes, France)

Abstract: Tracking the articulated human body is a challenging computer vision problem because of changes in body poses and their appearance. Pictorial structure (PS) models are widely used in 2D human pose estimation. In this work, we extend the PS models for robust 3D pose estimation, which includes two stages: multi-view human body parts detection by foreground learning and pose states updating by annealed particle filter (APF) and detection. Moreover, the image dataset F-PARSE was built for foreground training and a flexible mixture of parts (FMP) model was used for foreground learning. Experimental results demonstrate the effectiveness of our foreground learning-based method.


TU-L10-5: Low-cost Accurate Skeleton Tracking Based on Fusion of Kinect and Wearable Inertial Sensors

Francois Destelle (Dublin City University, Ireland); Argyris Chatzitofis (Centre for Research and Technology Hellas, Information Technologies Institute, Greece); Amin Ahmadi (Dublin City University, Ireland); Dimitrios Zarpalas (Informatics and Telematics Institute, Greece); Petros Daras (Centre for Research and Technology Hellas, Greece); Noel. E. O'Connor (Dublin City University, Ireland); Kieran Moran (Dublin City University, Ireland)

Abstract: In this paper, we present a novel multi-sensor fusion method to build a human skeleton. We propose to fuse the joint position information obtained from the popular Kinect sensor with more precise estimation of body segment orientations provided by a small number of wearable inertial sensors. The use of inertial sensors can help to address many of the well known limitations of the Kinect sensor. The precise calculation of joint angles potentially allows the quantification of movement errors in technique training, thus facilitating the use of the low-cost Kinect sensor for accurate biomechanical purposes; for example, the improved human skeleton could be used in visual feedback-guided motor learning. We compare our system to the gold standard Vicon optical motion capture system, proving that the fused skeleton achieves a very high level of accuracy.



Session TU-P1: Signal Processing for Communications I


TU-P1-1: Secrecy Rate Optimization for a MIMO Secrecy Channel Based on Stackelberg Game

Zheng Chu (Newcastle University, United Kingdom); Kanapathippillai Cumanan (Newcastle University, United Kingdom); Zhiguo Ding (Newcastle University, United Kingdom); Martin Johnston (Newcastle University, United Kingdom); Stephane Y. Le Goff (University of Newcastle upon Tyne, United Kingdom)

Abstract: In this paper, we consider a multi-input-multi-output (MIMO) wiretap channel with a multiantenna eavesdropper, where a private cooperative jammer is employed to improve the achievable secrecy rate. The legitimate user pays the legitimate transmitter for its secured communication based on the achieved secrecy rate. We first approximate the legitimate transmitter covariance matrix by employing Taylor series expansion; this secrecy rate problem can then be formulated as a Stackelberg game based on a fixed covariance matrix of the transmitter, where the transmitter and the jammer try to maximize their revenues. A Stackelberg equilibrium solution can be obtained where both the transmitter and the cooperative jammer come to an agreement on the interference requirement at the eavesdropper and the interference price. Simulation results are provided to show that the revenue functions of the legitimate user and the jammer are concave functions, and the Stackelberg equilibrium point can be obtained.


TU-P1-2: Fast Average Gossiping Under Asymmetric Links in WSNs

César Asensio-Marco (Universidad de Valencia, Spain); Baltasar Beferull-Lozano (Universidad de Valencia, Spain)

Abstract: Average consensus algorithms are an essential tool in wireless sensor networks for multiple estimation tasks, and the convergence time and energy consumption of these algorithms are critical for their usability. Most existing work in the related literature focuses on improving these two parameters, assuming generally unicast communications, which are neither realistic nor efficient given the wireless nature of these networks. Instead, broadcast communications allow a greater instantaneous exchange of information between the network nodes, accelerating the consensus and saving energy in communications. In this work, we propose two methods that optimize the network topology to simultaneously improve the total power consumption per iteration, the maximum power consumption per node and the convergence time in a broadcast scenario. The first method is applied to continuous systems, while the second one is more suitable for discrete systems. Numerical results are presented to show the validity and efficiency of the proposed methods.
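For readers unfamiliar with average consensus, the sketch below shows the basic synchronous iteration x <- W x on a small network, using Metropolis weights so that W is symmetric and doubly stochastic. It assumes symmetric links and does not model broadcast gossiping, asymmetric links, or the topology optimization proposed in the paper; the function names and the example graph are hypothetical.

```python
def metropolis_weights(adj):
    """Build a symmetric, doubly stochastic weight matrix from a 0/1 adjacency matrix."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and adj[i][j]:
                W[i][j] = 1.0 / (1 + max(deg[i], deg[j]))
        # Self-weight makes each row sum to one (W[i][i] is still 0.0 here).
        W[i][i] = 1.0 - sum(W[i])
    return W


def consensus(x, W, iters):
    """Run `iters` synchronous averaging steps x <- W x."""
    n = len(x)
    for _ in range(iters):
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


# Hypothetical 4-node path network 0-1-2-3 with initial sensor readings.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
readings = [1.0, 2.0, 3.0, 10.0]
estimates = consensus(readings, metropolis_weights(adj), iters=200)
```

Each entry of `estimates` approaches the network-wide mean of the readings; the number of iterations needed depends on the topology, which is the quantity such optimization methods target.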


TU-P1-3: Blind Separation of Sources with Finite Rate of Innovation

Richard Porter (University of Bristol, United Kingdom); Vladislav B Tadic (University of Bristol, United Kingdom); Alin M Achim (University of Bristol, United Kingdom)

Abstract: We propose a method for recovering the parameters of periodic signals with finite rate of innovation sampled using a raised cosine pulse. We show that the proposed method exhibits the same numerical stability as existing methods of its type, and we investigate the effect of oversampling on the performance of our method in the presence of noise. Our method can also be applied to non-periodic signals and we assess the efficacy of signal recovery in this case. Finally, we show that the problem of cochannel QPSK signal separation can be converted into a general finite rate of innovation framework, and we test the effectiveness of this approach.


TU-P1-4: Optimal Quantization and Power Allocation for Energy-Based Distributed Sensor Detection

Edmond Nurellari (University of Leeds, United Kingdom); Desmond McLernon (The University of Leeds, United Kingdom); Mounir Ghogho (University of Leeds, United Kingdom); Sami A Aldalahmeh (Al-Zaytoonah University of Jordan, Jordan)

Abstract: We consider the decentralized detection of an unknown deterministic signal in a spatially uncorrelated distributed wireless sensor network. N samples from the signal of interest are gathered by each of the M spatially distributed sensors, and the energy is estimated by each sensor. The sensors send their quantized information over orthogonal channels to the fusion center (FC) which linearly combines them and makes a final decision. We show how by maximizing the modified deflection coefficient we can calculate the optimal transmit power allocation for each sensor and the optimal number of quantization bits to match the channel capacity.


TU-P1-5: A Maxmin Model for Solving Channel Assignment Problem in IEEE 802.11 Networks

Mohamed Elwekeil (Egypt-Japan University of Science and Technology, Egypt); Masoud Alghoniemy (University of Alexandria, Egypt); Osamu Muta (Kyushu University, Japan); Adel Abdel Rahman (Egypt-Japan University of Science & Technology, Egypt); Hiroshi Furukawa (Kyushu University, Japan); Haris Gacanin (Alcatel-Lucent Bell N.V., Belgium)

Abstract: In this paper, an optimization model for solving the channel assignment problem in multi-cell WLANs is proposed. This model is based on maximizing the minimum distance between access points (APs) that work on the same channel. The proposed model is formulated in the form of a mixed integer linear program (MILP). The main advantage of the proposed algorithm is that it ensures non-overlapping channel assignment with no overhead power measurements. The proposed channel assignment algorithm can be implemented within practical time frames for different topology sizes. Simulation results indicate that the proposed algorithm exhibits better performance than that of the pick-first greedy algorithm and the single channel assignment method.
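
The max-min objective described above can be illustrated with a small brute-force sketch. This is an exhaustive search over tiny instances, not the MILP formulation of the paper; the AP coordinates, the function name and the use of Euclidean distance are assumptions made for illustration.

```python
from itertools import product
from math import dist, inf


def maxmin_assignment(positions, num_channels):
    """Exhaustively find the channel assignment that maximizes the minimum
    distance between any two access points sharing a channel."""
    n = len(positions)
    best_score, best = -1.0, None
    for assign in product(range(num_channels), repeat=n):
        # Smallest distance among co-channel pairs; infinite if no pair shares a channel.
        score = min(
            (dist(positions[i], positions[j])
             for i in range(n) for j in range(i + 1, n)
             if assign[i] == assign[j]),
            default=inf,
        )
        if score > best_score:
            best_score, best = score, assign
    return best, best_score


if __name__ == "__main__":
    aps = [(0, 0), (1, 0), (10, 0), (11, 0)]  # hypothetical AP coordinates
    print(maxmin_assignment(aps, 2))
```

The search cost grows as the number of channels raised to the number of APs, which is why a dedicated optimization model is needed for realistic topology sizes.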


TU-P1-6: Energy Efficiency Improvements in HetNets by Exploiting Device-to-Device Communication

Yusuf Sambo (University of Surrey, United Kingdom); Muhammad Zeeshan Shakir (Texas A&M University at Qatar (TAMUQ), Qatar); Khalid A. Qaraqe (Texas A&M University at Qatar, USA); Erchin Serpedin (Texas A&M University, USA); Muhammad Ali Imran (University of Surrey, United Kingdom); Beena Ahmed (Texas A&M University at Qatar, Qatar)

Abstract: Device-to-device (D2D) communication is considered an integral part of future heterogeneous networks (HetNets). In this paper, we consider a two-tier HetNet which promises energy savings by exploiting D2D communication with the macro-cellular network. D2D communication has been deployed within the HetNet to reduce the end-to-end power consumption of the network where the mobile users are transmitting with adaptive power while maintaining the desired link quality. In this context, the downlink/uplink and backhaul power consumptions of each link have been investigated. We then define a new performance metric, the maximum number of bits transmitted per Joule of energy consumed by the backhaul network, referred to as the backhaul energy efficiency (BEE) of the HetNet. Simulation results show that D2D communication deployment outperforms the HetNet with small-cell deployment in terms of backhaul power consumption, BEE and the total power consumption of the HetNet, thus providing a greener alternative to small-cell deployment.


TU-P1-7: Numerical Characterization for Optimal Designed Waveform to Multicarrier Systems in 5G

Zeineb Hraiech (SUP'COM, Tunisia); Mohamed Siala (Sup'Com, Tunisia); Fatma Abdelkefi (High School of Communications of Tunis (SUPCOM), Tunisia)

Abstract: High mobility of terminals constitutes a hot topic that is commonly envisaged for the 5G mobile communication systems. The wireless propagation channel is time-frequency variant. This aspect can dramatically damage the waveform orthogonality that is induced in the OFDM signal. Consequently, this results in severe ICI and ISI, which leads to performance degradation in OFDM systems. In this paper, we go further by investigating the performance evaluation of our algorithm that maximizes the received SINR by systematically optimizing the OFDM waveforms. We start by testing its robustness against time and frequency synchronization errors. Then, as this algorithm relies on an iterative approach to find the optimal waveforms, we study the impact of the waveform initialization on its convergence. The obtained simulation results confirm the efficiency of this algorithm and its robustness compared to the conventional OFDM schemes, which makes it a good candidate for 5G systems.


TU-P1-8: Optimum Relay Selection for Cooperative Spectrum Sensing and Transmission in Cognitive Networks

Hasan Kartlak (Istanbul University, Turkey); Niyazi Odabasioglu (Istanbul University, Turkey); Aydin Akan (Istanbul University, Turkey)

Abstract: In this paper, cyclostationarity-based cooperative spectrum sensing is presented to detect the idle bands and then place the secondary users in these bands. To this end, an optimum relay is selected to perform both cooperative communication and cyclostationarity-based spectrum sensing. Transmission performance, probability of detection, and probability of missed detection are presented via computer simulations. Results show that the proposed jointly optimized relay selection scheme provides sufficient performance for both transmission and spectrum sensing.


TU-P1-9: Performance Analysis of the Opportunistic Multi-relay Network with Co-channel Interference

Jamal A Hussein (Newcastle University, United Kingdom); Salama Said Ikki (Lakehead University, Canada); Said Boussakta (Newcastle University, United Kingdom); Charalampos C. Tsimenidis (Newcastle University, United Kingdom)

Abstract: A study of the effect of co-channel interference (CCI) on the performance of an opportunistic scheduling multi-relay amplify-and-forward cooperative communication network is presented. Specifically, we consider the presence of CCI at both the relays and the destination node. The exact equivalent end-to-end signal-to-interference-plus-noise ratio (SINR) is derived. Then, closed-form expressions for both the cumulative distribution function (CDF) and the probability density function (PDF) of the received SINR are obtained. The derived expressions are used to measure the asymptotic outage probability of the system. Numerical results and Matlab simulations are also provided to confirm the correctness of the analytical calculations.


TU-P1-10: A Unifying View on Energy-Efficiency Metrics in Cognitive Radio Channels

Raouia Masmoudi (ETIS / ENSEA - University of Cergy-Pontoise - CNRS, France); Elena Veronica Belmega (ETIS / ENSEA-UCP-CNRS, France); Inbar Fijalkow (ETIS, CNRS, ENSEA, University Cergy-Pontoise, France); Noura Sellami (Ecole Nationale d'Ingénieurs de Sfax, Tunisia)

Abstract: The objective of this paper is to provide a unifying framework for the most popular energy-efficiency metrics proposed in the wireless communications literature. The target application is a cognitive radio system composed of a secondary user whose goal is to transmit in an optimal energy-efficient manner over several available bands under the interference constraints imposed by the presence of the primary network. It turns out that the optimal allocation policies maximizing these energy-efficiency metrics can be interpreted as Pareto-optimal points lying on the optimal tradeoff curve of the joint bi-criteria problem of rate maximization and power minimization. Using this unifying framework, we provide several interesting parallels and differences between these metrics and the implications w.r.t. the optimal tradeoffs between achievable rates and consumed power.


TU-P1-11: Relevance of Dirichlet Process Mixtures for Modeling Interferences in Underlay Cognitive Radio

Vincent Pereira (Université Bordeaux 1, France); Guillaume Ferré (University of Bordeaux, France); Audrey Giremus (Université Bordeaux1, France); Eric J. Grivel (Université de Bordeaux, France)

Abstract: In the field of underlay cognitive radio communications, the signal transmitted by the secondary user is disturbed by incoming signals from primary users. Thus, it is necessary to compensate for this secondary-link degradation at the receiver level. In this paper, we use Dirichlet process mixtures (DPM) to relax a priori assumptions on the characteristics of the primary user-induced interference. DPM allow us to model the probability density function of the interference. The latter is estimated jointly with the symbols and the channel of the secondary link by using marginalized particle filtering. Our approach makes it possible to improve the symbol error rate compared with an algorithm that simply models the interference as Gaussian noise.


TU-P1-12: Channel Adaptive Pulse Shaping for OQAM-OFDM Systems

Martin Fuhrwerk (Leibniz Universität Hannover, Germany); Jürgen Peissig (Leibniz Universität Hannover, Germany); Malte Schellmann (Huawei Technologies Duesseldorf GmbH, Germany)

Abstract: Theory predicts a gain in transmission performance when adapting pulse shapes of Offset Quadrature Amplitude Modulation (OQAM) Orthogonal Frequency Division Multiplexing (OFDM) systems to delay and Doppler spread in doubly-dispersive channels. Here we investigate the quantitative gains in reconstruction quality and bit error rate (BER) with respect to subcarrier spacing and channel properties. It is shown that it is possible to reduce the uncoded BER by a factor of more than two and the coded BER by a factor of at least four, utilizing only two different pulse shapes. The simulation results show that channel adaptive pulse shaping for OQAM-OFDM systems is a promising concept for future mobile communication systems.


TU-P1-13: Augmentation and Integrity Monitoring Network and Egnos Performance Comparison for Train Positioning

Pietro Salvatori (University of Roma TRE, Italy); Alessandro Neri (University of ROMA TRE, Italy); Cosimo Stallo (University of Rome Tor Vergata, Italy); Veronica Palma (RadioLabs, Italy); Andrea Coluccia (University of Rome Tor Vergata, Italy); Francesco Rispoli (Ansaldo, Italy)

Abstract: The paper describes the performance comparison between EGNOS (European Geostationary Navigation Overlay Service) and an Augmentation & Integrity Monitoring Network (AIMN) Location Determination System (LDS) designed for train positioning in terms of position and velocity accuracy and integrity information. The proposed work is set in the context of introducing and applying space technologies based on the ERTMS (European Rail Traffic Management System) architecture. It envisages including the EGNOS-Galileo infrastructures in the train control system, with the aim of improving performance, enhancing safety and reducing the investments in the railway circuitry and its maintenance. The performance results will be shown, based on a test campaign carried out on a ring-shaped highway (named GRA) around Rome (Italy) to simulate the movement of a train on a generic track.



Session TU-P2: Image and Video Coding and Processing


TU-P2-1: Parallel Performance and Energy Efficiency of Modern Video Encoders on Multithreaded Architectures

Rafael Rodríguez-Sánchez (Universitat Jaume I, Spain); Francisco Igual (Universidad Complutense de Madrid, Spain); Jose Luis Martinez (University of Castilla-La Mancha, Spain); Rafael Mayo Gual (Universitat Jaume I de Castelló, Spain); Enrique S. Quintana-Ortí (Universidad de Jaume I, Spain)

Abstract: In this paper we evaluate four mainstream video encoders: H.264/MPEG-4 Advanced Video Coding, Google's VP8, High Efficiency Video Coding, and Google's VP9, studying conventional figures-of-merit such as performance in terms of encoded frames per second, and encoding efficiency in both PSNR and bit-rate of the encoded video sequences. Additionally, two platforms equipped with a large number of cores, representative of current multicore architectures for high-end servers, and equipped with a wattmeter allow us to assess the quality of these video encoders in terms of parallel scalability and energy consumption, which is well-founded given the significant levels of thread concurrency and the impact of the power wall in today's multicore processors.


TU-P2-2: Coefficient-Wise Intra Prediction for DCT-Based Image Coding

Ichiro Matsuda (Tokyo University of Science, Japan); Yusuke Kameda (Tokyo University of Science, Japan); Susumu Itoh (Tokyo University of Science, Japan)

Abstract: This paper proposes an adaptive intra prediction method for DCT-based image coding. In this method, predicted values in each block are generated in the spatial domain, as in conventional intra prediction methods. On the other hand, prediction residuals to be encoded are separately calculated in the DCT domain, i.e. differences between the original and predicted values are calculated after performing the DCT. Such a prediction framework allows us to change the coding process from block-wise order to coefficient-wise order. When the coefficient-wise order is adopted, a block to be predicted is almost always surrounded by partially reconstructed image signals, and therefore, efficient interpolative prediction can be performed. Simulation results indicate that the proposed method is beneficial for removing inter-block correlations of high-frequency components.


TU-P2-3: Fast Motion Estimation Discarding Low-Impact Fractional Blocks

Saverio G. Blasi (Queen Mary, University of London, United Kingdom); Ivan Zupancic (Queen Mary University of London, United Kingdom); Ebroul Izquierdo (Queen Mary, University of London, United Kingdom)

Abstract: Sub-pixel motion estimation is used in most modern video coding schemes to improve the outcomes of motion estimation. The reference frame is interpolated and motion vectors are refined with fractional components to reduce the prediction error. Due to the high complexity of these steps, sub-pixel motion estimation can be very demanding in terms of encoding time and resources. A method based on adaptive precision is proposed in this paper to reduce the complexity of motion estimation schemes. A parameter is computed to geometrically characterise each block and decide whether fractional refinements are likely to improve coding efficiency or not. The decision is based on an estimate of the actual impact of fractional refinements on the coding performance. The method was implemented within the H.264/AVC standard and is shown to achieve considerable time savings with respect to conventional schemes, while ensuring that the performance losses are kept within acceptable limits.


TU-P2-4: Cost Function Optimization and Its Hardware Design for the Sample Adaptive Offset of HEVC Standard

Fabiane Rediess (Federal University of Pelotas, Brazil); Ruhan Conceição (Federal University of Pelotas, Brazil); Bruno Zatt (Federal University of Pelotas, Brazil); Marcelo Porto (Federal University of Pelotas, Brazil); Luciano Agostini (Federal University of Pelotas, Brazil)

Abstract: This work presents a cost function optimization for the internal decision of the HEVC Sample Adaptive Offset (SAO) filter. The optimization approach is focused on an efficient hardware design implementation, and explores two critical points. The first focuses on the use of fixed-point data instead of floating-point data, and the second on reducing the number of full multipliers and dividers. The simulation results show that these proposals do not have a significant impact on BD-rate measurements. Based on these two hardware-friendly optimizations, we propose a hardware design for this cost function module. The FPGA synthesis results show that the proposed architecture achieves 521 MHz and is able to process UHD 8K@120 fps when operating at 47 MHz.


TU-P2-5: Clustering-based Methods for Fast Epitome Generation

Martin Alain (Technicolor/INRIA, France); Christine Guillemot (INRIA, France); Dominique Thoreau (Technicolor, France); Philippe Guillotel (Technicolor, France)

Abstract: This paper deals with epitome generation, mainly dedicated here to image coding applications. Existing approaches are known to be memory and time consuming due to the exhaustive self-similarity search within the image for each non-overlapping block. We propose here a novel approach for epitome construction that first groups close patches together. In a second step, the self-similarity search is performed for each group. By limiting the number of exhaustive searches, we limit the memory occupation and the processing time. Results show that a notable complexity reduction can be achieved while keeping a good epitome quality (down to 18.08 % of the original memory occupation and 41.39 % of the original processing time).


TU-P2-6: Quality Assessment of Chromatic Variations: A Study of Full-Reference and No-Reference Metrics

Marco V. Bernardo (University of Beira Interior - Remote Sensing Unit and Instituto de Telecomunicações, Portugal); Antonio M. G. Pinheiro (University of Beira Interior, Portugal); Paulo Fiadeiro (Universidade da Beira Interior, Portugal); Manuela Pereira (University of Beira Interior, Portugal)

Abstract: This work describes a comparative study on the ability of Full-Reference vs No-Reference quality metrics to measure the Quality of Experience created by images that suffer chromatic errors. To this end, some well-known Full-Reference (PSNR, UQI, MSSIM) and No-Reference (GM, FTM, RTBM) metrics are compared with the MOS results. Although the quality metrics considered are usually applied to the luminance component, in this study they are applied to the chromatic components individually and to the average of the three components, because only the image chromatic components have been changed, resulting in similar values of luminance. The correlation estimates show that the Full-Reference metrics MSSIM and UQI provide a good representation of the subjective results. Moreover, the studied No-Reference metrics also provide an acceptable representation, although they are less reliable.


TU-P2-7: Chromatic Variations on 3D Video and QoE

Daniel Piedade (University of Beira Interior, Portugal); Marco V. Bernardo (University of Beira Interior - Remote Sensing Unit and Instituto de Telecomunicações, Portugal); Paulo Fiadeiro (Universidade da Beira Interior, Portugal); Antonio M. G. Pinheiro (University of Beira Interior, Portugal); Manuela Pereira (University of Beira Interior, Portugal)

Abstract: In this paper a study on the perceived quality of chromatic variations in 3D video is reported. The colors of each test video, represented in the CIE 1976 (L*a*b*) color space, were initially divided into clusters based on their similarity. Predefined chromatic errors were applied to these color clusters. These videos were shown to individuals who were asked to rank their quality based on the naturalness of the colors. The Mean Opinion Scores were computed and the sensitivity to chromatic changes in 3D video was quantified. Moreover, attention maps were obtained and a short study on the changes in visual saliency in the presence of these chromatic variations is also reported.


TU-P2-8: Color Information in a Model of Saliency

Shahrbanoo Hamel (Universite de Grenoble, France); Nathalie Guyader (GIPSA-lab, France); Denis Pellerin (GIPSA-lab, France); Dominique Houzet (University of Grenoble, France)

Abstract: Bottom-up saliency models have been developed to predict the location of gaze according to the low-level features of visual scenes. So far, color, alongside brightness, contrast and motion, has been considered one of the primary features in computing bottom-up saliency. However, its contribution in guiding eye movements when viewing natural scenes has been debated. We investigated the contribution of color information in a bottom-up visual saliency model. The model efficiency was tested using the experimental data obtained from 45 observers who were eye tracked while freely exploring a large data set of color and grayscale videos. The two data sets of recorded eye positions, for grayscale and color videos, were compared with a luminance-based saliency model (Marat et al.). We incorporated chrominance information into the model. Results show that color information improves the performance of the saliency model in predicting eye positions.


TU-P2-9: Total Variation Reconstruction for Compressive Sensing Using Nonlocal Lagrangian Multiplier

Chien Van Trinh (Sungkyunkwan University, Korea); Khanh Q. Dinh (Sungkyunkwan University, Korea); Viet Anh Nguyen (Sungkyunkwan University, Korea); Byeungwoo Jeon (Sungkyunkwan University, Korea)

Abstract: Total variation has proved its effectiveness in solving inverse problems for compressive sensing. In addition, the nonlocal means filter, used as regularization, preserves texture better in recovered images, but it is quite complex to implement. In this paper, we propose a simpler method called nonlocal Lagrangian multiplier (NLLM), which uses the nonlocal means filter to update the Lagrangian multiplier. Although it is much simpler to implement than using the nonlocal means filter as a regularization term, experimental results show that the proposed NLLM is superior to other well-known recovery algorithms in both the subjective and objective quality of the recovered image.


TU-P2-10: Noise Robust Local Phase Coherence Based Method for Image Sharpness Assessment

Damir Seršić (University of Zagreb Faculty of Electrical Engineering and Computing, Croatia); Ana Sović (University of Zagreb, Faculty of Electrical Engineering and Computing, Croatia)

Abstract: Image sharpness assessment is a very important issue in image acquisition and processing. Novel approaches in no-reference image sharpness assessment methods are based on local phase coherence (LPC), rather than edge or frequency content analysis. It has been shown that the LPC-based methods are closer to human observer assessments. In this paper, we propose complex wavelets, carefully designed in the time domain, that provide a good tool for local phase estimation. Moreover, we take special care of noise. We apply thresholding in the wavelet domain and merge several estimates to achieve statistical robustness in the presence of noise. A novel, more intuitive sharpness index is proposed as well.


TU-P2-11: Entropy-constrained Dense Disparity Map Estimation Algorithm for Stereoscopic Images

Aysha Kadaikar (Université Paris13, Sorbonne Paris Cité, France); Anissa Mokraoui (Université Paris 13, Sorbonne Paris Cité, France); Gabriel Dauphin (University of Paris, France)

Abstract: This paper deals with the stereo matching problem to estimate a dense disparity map. Traditionally, a matching metric such as mean square error distortion is adopted to select the best matches associated with disparities. However, several disparities related to a given pixel may satisfy the distortion criterion, and quite often the choice that is made does not necessarily meet the coding objective. An entropy-constrained disparity optimization approach is developed where the traditional matching metric is replaced by a joint entropy-distortion metric so that the selected disparities reduce not only the disparity entropy but also the reconstructed image distortion. The algorithm sequentially builds a tree, avoiding a full search and ensuring good rate-distortion performance. At each tree depth, the M best retained paths are extended to build new paths, which are assigned entropy-distortion metrics. Simulations show that our algorithm provides better results than the dynamic programming algorithm.


TU-P2-12: Motion Estimation for Super-resolution Based on Recognition of Error Artifacts

Ana Stojkovic (Ss. Cyril and Methodius University, Macedonia, the former Yugoslav Republic of); Zoran Ivanovski (Ss. Cyril and Methodius University, Macedonia, the former Yugoslav Republic of)

Abstract: The work presents an effective approach for subpixel motion estimation for super-resolution (SR). The objective is to improve the quality of the estimated SR image by increasing the accuracy of the motion vectors used in the SR procedure. The correction of the motion vectors is based on the appearance of error artifacts in the SR image, introduced due to registration errors. First, SR is performed using full-pixel-accuracy motion vectors obtained using a full-search block matching algorithm (FS-BMA). Then, a machine-learning-based method is applied to the resulting images in order to detect and classify artifacts introduced due to missing subpixel components of the motion vectors. The outcome of the classification is a subpixel component of the motion vector. In the final step, the SR process is repeated using the corrected (subpixel accuracy) motion vectors.


TU-P2-13: Single Pass Dependent Bit Allocation for Spatial Scalability Coding of H.264/SVC

Randa Atta (Port Said University, Egypt); Rehab Abdel-Kader (Port Said University, Egypt); Amera Abd-AlRahem (Port Said University, Egypt)

Abstract: This paper investigates the problem of bit allocation for spatial scalability coding of H.264/SVC. Little prior work deals with the H.264/SVC bit allocation problem considering the correlation between the enhancement and base layers. Moreover, most existing bit allocation algorithms suffer from high computational complexity, which grows significantly with the number of layers. In this paper, a single-pass spatial layer bit allocation algorithm, based on dependent Rate-Distortion modeling, is proposed. In this algorithm, the R-D model parameters are adaptively updated during the coding process. Experimental results demonstrate that the proposed algorithm achieves a significant improvement in the coding gain as compared to the multi-pass model-based algorithm and the Joint Scalable Video Model reference software algorithm.



Session TU-P3: Signal Processing for Communications II


TU-P3-1: Normalized Recursive Least Moduli Algorithm with p-Modulus of Error and q-Norm of Filter Input

Shin'ichi Koike (Consultant, Japan)

Abstract: This paper proposes a new adaptation algorithm named the Normalized Recursive Least Moduli (NRLM) algorithm, which employs the "p-modulus" of the error and the "q-norm" of the filter input. The p-modulus and q-norm are generalizations of the modulus and norm used in complex-domain adaptive filters. The NRLM algorithm with p-modulus and q-norm makes adaptive filters fast convergent and robust against two types of impulse noise: one found in the observation noise and the other at the filter input. We develop a theoretical analysis of the algorithm for calculating filter convergence. Through simulation experiments and theoretical calculations, the effectiveness of the proposed algorithm is demonstrated. We also find that the filter convergence does not critically depend on the value of p or q, allowing the use of p = q = infinity, which makes it easiest to calculate the p-modulus and q-norm. The theoretical convergence is in good agreement with simulation results, which validates the analysis.


TU-P3-2: Dual-Layer Network Representation Exploiting Information Characterization

Virginia De Bernardinis (Universita degli Studi Roma TRE, Italy); Rui Fa (Brunel University, United Kingdom); Marco Carli (University of Roma TRE, Italy); Asoke Nandi (Brunel University, United Kingdom)

Abstract: In this paper, a logical dual-layer representation approach is proposed to facilitate the analysis of directed and weighted complex networks. Unlike the single logical layer structure, which has been widely used for directed and weighted flow graphs, the proposed approach replaces the single layer with a dual-layer structure, which introduces a provider layer and a requester layer. The new structure characterizes the nodes by the information they provide to and request from the network. Its features are explained and its implementation and visualization are also detailed. We design two clustering methods with different strategies, which provide analysis from different points of view. The effectiveness of the proposed approach is demonstrated using a simplified example. Comparing the graph layout with the conventional directed graph shows that the new dual-layer representation reveals deeper insight into complex networks and provides more opportunities for versatile clustering analysis.


TU-P3-3: Improving Scalar Quantization for Correlated Processes Using Adaptive Codebooks Only At the Receiver

Sai Han (Technische Universität Braunschweig, Germany); Tim Fingscheidt (Technische Universität Braunschweig, Germany)

Abstract: Lloyd-Max quantization (LMQ) is a widely used scalar non-uniform quantization approach targeting the minimum mean squared error (MMSE). Once designed, the quantizer codebook is fixed over time and does not take advantage of possible correlations in the input signals. Exploiting correlation in scalar quantization could be achieved by predictive quantization, but at the price of a higher bit error sensitivity. In order to improve the Lloyd-Max quantizer performance for correlated processes without encoder-sided prediction, a novel scalar decoding approach utilizing the correlation of input signals is proposed in this paper. Based on previously received samples, the current sample can be predicted a priori. Thereafter, a quantization codebook adapted over time is generated according to the prediction error probability density function. Compared to the standard LMQ, a distinct improvement is achieved with our receiver in error-free and error-prone transmission conditions, with both hard-decision and soft-decision decoding.
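
As background for the fixed-codebook baseline mentioned above, the following is a minimal sketch of the standard Lloyd iteration run on empirical samples. It does not implement the receiver-side adaptive codebook of the paper; the function name, the uniform initialization and the iteration count are assumptions made for illustration.

```python
def lloyd_max(samples, levels, iters=50):
    """Design a scalar codebook by alternating nearest-neighbour partitioning
    and centroid updates (empirical Lloyd-Max iteration)."""
    xs = sorted(samples)
    lo, hi = xs[0], xs[-1]
    # Assumed initialization: reproduction values spread uniformly over the data range.
    codebook = [lo + (k + 0.5) * (hi - lo) / levels for k in range(levels)]
    for _ in range(iters):
        cells = [[] for _ in range(levels)]
        for x in xs:
            # Nearest-neighbour rule: assign x to the closest reproduction value.
            k = min(range(levels), key=lambda i: (x - codebook[i]) ** 2)
            cells[k].append(x)
        # Centroid rule: each reproduction value becomes the mean of its cell;
        # an empty cell keeps its previous value.
        codebook = [sum(c) / len(c) if c else codebook[k]
                    for k, c in enumerate(cells)]
    return codebook


if __name__ == "__main__":
    print(lloyd_max([0, 0, 0, 10, 10, 10], levels=2))
```

Once designed this way, the codebook stays fixed, which is the limitation the abstract refers to for correlated inputs.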


TU-P3-4: Filter Design with Hard Spectral Constraints

Johan Karlsson (KTH Royal Institute of Technology, Sweden); Jian Li (University of Florida, USA); Petre Stoica (Uppsala University, Sweden)

Abstract: Filter design is a fundamental problem in signal processing and important in many applications. In this paper we consider a communication application with spectral constraints, using filter designs that can be solved globally via convex optimization. In particular, this leads to FIR and IIR filters with maximal frequency response magnitude in the carrier frequencies subject to spectral constraints. Tradeoffs are discussed in order to determine which design is the most appropriate, and for these applications, finite impulse response filters appear to be more suitable than infinite impulse response filters since they allow for more flexible objective functions, shorter transients, and faster filter implementations.


TU-P3-5: Chromatic Dispersion Compensation Using Complex-Valued All-Pass Filter

Jawad Munir (Technische Universitaet Muenchen, Germany)

Abstract: We propose a new optimization framework to compensate chromatic dispersion (CD) using a complex-valued infinite impulse response (IIR) all-pass filter. The design of the IIR all-pass filter is based on minimizing the mean square error (MSE) in group delay and phase cost functions. Necessary conditions are derived and incorporated in a multi-step optimization framework to ensure the stability of the resulting IIR filter. It is shown that the IIR filter achieves similar or slightly better performance compared to its finite impulse response (FIR) counterpart. Moreover, IIR filtering requires significantly fewer taps than FIR filtering to compensate the same CD channel.


TU-P3-6: A Control Theoretic Approach to Solve a Constrained Uplink Power Dynamic Game

Santiago Zazo (Universidad Politecnica Madrid, Spain); Javier Zazo (Universidad Politécnica de Madrid, Spain); Matilde Sánchez-Fernández (Universidad Carlos III de Madrid, Spain)

Abstract: This paper addresses an uplink power control dynamic game where we assume that each user battery represents the system state that changes with time following a discrete-time version of a differential game. To overcome the complexity of the analysis of a dynamic game approach, we focus on the concept of Dynamic Potential Games, showing that the game can be solved as an equivalent Multivariate Optimum Control Problem. The solution of this problem is of particular interest because different users split the activity in time, avoiding higher interference and providing long-term fairness.


TU-P3-7: Noise Filtering in Bandlimited Digital Chaos-based Communication Systems

Rodrigo T. Fontes (Polytechnic School of the University of São Paulo, Brazil); Marcio Eisencraft (Escola Politécnica da Universidade de São Paulo, Brazil)

Abstract: In recent years, many chaos-based communication schemes were proposed. However, their performance in non-ideal scenarios must be further investigated. In this work, the performance of a bandlimited binary communication system based on chaotic synchronization is evaluated considering a white Gaussian noise channel. As a way to improve the signal to noise ratio in the receiver, and thus the bit error rate, we propose to filter the out-of-band noise in the receiver. Numerical simulations show the advantages of using such a scheme.


TU-P3-8: A Simply-Differential Low-Complexity Primary Synchronization Scheme for 3GPP LTE Systems

Leila Nasraoui (Sup'com, Tunisia); Leila Najjar (Sup'Com, Tunisia); Mohamed Siala (Sup'Com, Tunisia)

Abstract: In this paper, downlink primary synchronization for LTE systems is investigated, including time synchronization and sector identification. The proposed scheme exploits the Primary Synchronization Signal, which is generated from known Zadoff-Chu sequences. Unlike the conventional schemes, in which time synchronization is processed first and the demodulated OFDM symbols are then cross-correlated with the known Zadoff-Chu sequences for sector identification, the proposed scheme achieves both tasks simultaneously. To this end, the received signal is differentially auto-correlated and compensated with a frequency offset whose value depends on the Zadoff-Chu sequence used. The same metric allows detecting both the symbol timing and the sector identifier. Simulation results, carried out in additive white Gaussian noise and Rayleigh multipath channels, show the efficiency and reliability of the proposed primary synchronization scheme. We also note that, compared to former methods, the proposed one not only leads to performance enhancement but also realizes a considerable complexity reduction.
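
The conventional sector identification step that the abstract contrasts against can be sketched as follows. This is not the proposed differential scheme: the sketch assumes perfect timing and no frequency offset, and the function names are hypothetical. The root indices 25, 29 and 34 and the length-62 punctured Zadoff-Chu construction follow the LTE Primary Synchronization Signal definition.

```python
import cmath

PSS_ROOTS = (25, 29, 34)  # Zadoff-Chu roots for sector identities 0, 1, 2

def pss_sequence(root):
    """Length-62 frequency-domain PSS: a length-63 Zadoff-Chu sequence
    with its centre element punctured."""
    seq = []
    for n in range(62):
        k = n if n < 31 else n + 1  # skip the punctured centre element
        seq.append(cmath.exp(-1j * cmath.pi * root * k * (k + 1) / 63))
    return seq

def identify_sector(received):
    """Cross-correlate the received symbols with each candidate PSS and
    return the index of the best match (assumes ideal synchronization)."""
    scores = []
    for root in PSS_ROOTS:
        ref = pss_sequence(root)
        corr = sum(r * c.conjugate() for r, c in zip(received, ref))
        scores.append(abs(corr))
    return max(range(len(PSS_ROOTS)), key=scores.__getitem__)

# Hypothetical usage: a noiseless PSS with an unknown common phase rotation.
rx = [cmath.exp(0.7j) * d for d in pss_sequence(29)]
print(identify_sector(rx))
```

Taking the magnitude of the correlation makes the decision insensitive to a common phase rotation, but each candidate root still needs its own full cross-correlation, which is the cost a single shared metric would avoid.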


TU-P3-9: Optimal Energy Allocation Scheme for Throughput Enhancement in Cooperative Cognitive Network

Imen Sahnoun (SupCom, Tunisia); Inès Kammoun (ENIS, Tunisia); Mohamed Siala (Sup'Com, Tunisia)

Abstract: In this paper, a cognitive radio scenario is proposed, where secondary users are allowed to communicate concurrently with primary users provided that they do not create harmful interference to the licensed users. Here, we aim to improve the throughput of the unlicensed system. To this end, we propose to use a cooperative relay to assist the secondary transmission. Moreover, adaptive modulation is used in order to compensate for the throughput loss due to the relaying. The main contribution of this work is a new energy allocation scheme for source and relay nodes that maximizes the achievable throughput under the system constraints. A variety of simulation results reveals that our proposed energy allocation method combined with adaptive modulation offers better performance compared with the classical cooperation scheme where energy resources are equally distributed over all nodes.


TU-P3-10: Multi-Radio Network Optimisation Using Bayesian Belief Propagation

Colin McGuire (University of Strathclyde, United Kingdom); Stephan Weiss (University of Strathclyde, United Kingdom)

Abstract: In this paper we show how 5 GHz and "TV White Space" wireless networks can be combined to provide fixed access for a rural community. Using multiple technologies allows the advantages of each to be combined to overcome individual limitations when assigning stations between networks. Specifically, we want to maximise throughput under the constraint of satisfying both the desired individual station data rate and the transmit power within regulatory limits. For this optimisation, we employ Pearl's algorithm, a Bayesian belief propagation implementation, which is informed by statistics drawn from network trials on the Isle of Tiree with 100 households. The method confirms results obtained with an earlier deterministic approach.


TU-P3-11: Characterizing Changes in the Noise Statistics of GNSS Space Clocks with the Dynamic Allan Variance

Lorenzo Galleani (Politecnico di Torino, Italy)

Abstract: The dynamic Allan variance (DAVAR) is a tool for the characterization of precise clocks. Monitoring anomalies of precise clocks is essential, especially when they are employed onboard the satellites of a global navigation satellite system (GNSS). When an anomaly occurs, the DAVAR changes with time, its shape depending on the type of anomaly that occurred. We obtain the analytic DAVAR for a change of variance in the clock noise, an anomaly with critical effects on the clock performance. This result is helpful when the clock health is monitored by observing the DAVAR.
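
A numerical counterpart to the variance-change anomaly described above can be sketched by evaluating the ordinary Allan variance on a sliding window of the data. The sketch assumes the clock is described by fractional-frequency samples with white noise, uses the simple non-overlapping estimator, and picks window and step sizes arbitrarily; none of these choices are taken from the paper.

```python
import random

def allan_variance(y, m=1):
    """Non-overlapping Allan variance of fractional-frequency samples y
    at averaging factor m (observation interval tau = m * tau0)."""
    blocks = [sum(y[i:i + m]) / m for i in range(0, len(y) - m + 1, m)]
    if len(blocks) < 2:
        raise ValueError("need at least two averaging blocks")
    diffs = [(b - a) ** 2 for a, b in zip(blocks, blocks[1:])]
    return sum(diffs) / (2 * len(diffs))

def dynamic_allan_variance(y, window, step, m=1):
    """Allan variance evaluated on successive windows of the data."""
    return [allan_variance(y[s:s + window], m)
            for s in range(0, len(y) - window + 1, step)]

# Hypothetical anomaly: the noise standard deviation jumps from 1 to 3.
rng = random.Random(0)
y = ([rng.gauss(0.0, 1.0) for _ in range(2000)]
     + [rng.gauss(0.0, 3.0) for _ in range(2000)])
print(dynamic_allan_variance(y, window=500, step=500))
```

For white frequency noise the Allan variance at the shortest interval equals the noise variance in expectation, so the windowed values step up once the window has passed the change point.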



Session TU-P4: Signal Estimation and Detection I


TU-P4-1: Automatic Optimization of Adaptive Notch Filter's Frequency Tracking

Michał Meller (Gdansk University of Technology, Poland)

Abstract: Estimation of instantaneous frequency of narrowband complex sinusoids is often performed using lightweight algorithms called adaptive notch filters. However, to reach high performance, these algorithms require careful tuning. The paper proposes a novel self-tuning layer for a recently introduced adaptive notch filtering algorithm. Analysis shows that, under Gaussian random-walk type assumptions, the resulting solution converges in mean to the optimal frequency estimator. A simplified one degree of freedom version of the filter, recommended for practical applications, is also proposed. Finally, a comparison of performance with six other state of the art schemes is performed. It confirms the improved tracking accuracy of the proposed scheme.


TU-P4-2: An Approach to Nonlinear State Estimation Using Extended FIR Filtering

Shunyi Zhao (Jiangnan University, P.R. China); Juan Pomarico-Franquiz (University of Guanajuato, Mexico); Yuriy S. Shmaliy (Universidad de Guanajuato, Mexico)

Abstract: A new technique called extended finite impulse response (EFIR) filtering is developed for nonlinear state estimation in discrete-time state space. The EFIR filter belongs to a family of unbiased FIR filters which completely ignore the noise statistics. An optimal averaging horizon of Nopt points required by the EFIR filter can be determined via measurements with much less effort and cost than the noise statistics. These properties of EFIR filtering are distinctive advantages over the extended Kalman filter (EKF). The price for this is an Nopt − 1 times longer operation which, however, can be reduced to that of the EKF by using parallel computing. Based on extensive simulations of diverse nonlinear models, we show that EFIR filtering is more accurate and more robust than the EKF under unknown noise statistics and model uncertainties.


TU-P4-3: Minimum Variance Unbiased FIR State Estimation of Discrete Time-Variant Models

Shunyi Zhao (Jiangnan University, P.R. China); Yuriy S. Shmaliy (Universidad de Guanajuato, Mexico); Fei Liu (Jiangnan University, P.R. China)

Abstract: State estimation and tracking often require optimal or unbiased estimators. In this paper, we propose a new minimum variance unbiased (MVU) finite impulse response (FIR) filter which minimizes the estimation error variance in the unbiased FIR (UFIR) filter. The relationship between the filter gains of the MVU FIR, UFIR and optimal FIR (OFIR) filters is found analytically. Simulations using a polynomial state-space model show that errors in the MVU FIR filter are intermediate between those of the UFIR and OFIR filters, and that the MVU FIR filter exhibits a better denoising effect than the UFIR filter. It is also shown that the performance of the MVU FIR filter strongly depends on the averaging interval of N points: for small N, the MVU FIR filter approaches the UFIR filter and, if N is large, it becomes optimal.


TU-P4-4: Online Instantaneous Frequency Estimation Utilizing Empirical Mode Decomposition and Hermite Splines

Franz Holzinger (Virtual Vehicle Research Center, Austria); Martin Benedikt (Virtual Vehicle Research Center, Austria)

Abstract: Most of the available frequency estimation methods are restricted to signals without a bias or a carrier signal. For offline analysis of a signal, an existing carrier may be determined and eliminated. However, in the case of online computations, the carrier is in general not accurately known a priori. An error in the carrier signal directly affects the accuracy of the subsequent instantaneous frequency estimation. This article focusses on the online instantaneous frequency estimation of non-stationary signals based on the empirical mode decomposition scheme. In particular, Hermite spline interpolation of samples for empirical mode decomposition is addressed. Hermite splines enable the definition of enhanced boundary conditions and lead to an effective online instantaneous frequency estimation approach. Throughout the article, algorithmic details are examined by means of a theoretical example.


TU-P4-5: Recovery of Correlated Sparse Signals From Under-Sampled Measurements

Zhaofu Chen (Northwestern University, USA); Rafael Molina (Universidad de Granada, Spain); Aggelos K Katsaggelos (Northwestern University, USA)

Abstract: In this paper we consider the problem of recovering temporally smooth or correlated sparse signals from a set of under-sampled measurements. We propose two algorithmic solutions that exploit the signal temporal properties to improve the reconstruction accuracy. The effectiveness of the proposed algorithms is corroborated with experimental results.


TU-P4-6: Iterative Approach to Estimate the Parameters of a TVAR Process Corrupted by a MA Noise

Hiroshi Ijima (Wakayama University, Japan); Roberto Diversi (DEIS – University of Bologna, Italy); Eric J. Grivel (Université de Bordeaux, France)

Abstract: A great deal of attention has been paid to time-varying autoregressive (TVAR) parameter tracking, but few papers deal with this issue when only noisy observations are available. Recently, this problem was addressed for a TVAR process disturbed by an additive zero-mean white noise, by using deterministic regression methods. In this paper, we focus our attention on the case of an additive colored measurement noise modeled by a moving average process. More particularly, we propose to estimate the TVAR parameters by using a variant of the improved least-squares (ILS) methods, initially introduced by Zheng to estimate the AR parameters from a signal embedded in white noise. Simulation studies illustrate the advantages and the limits of the approach.


TU-P4-7: Sparse Blind Deconvolution Based on Scale Invariant Smoothed l0-Norm

Kenji Nose Filho (University of Campinas, Brazil); Christian Jutten (GIPSA-Lab, France); João Romano (State University of Campinas, Brazil)

Abstract: In this work, we explore the problem of blind deconvolution in the context of sparse signals. We show that the l0-norm works as a contrast function if the length of the impulse response of the system is smaller than the shortest distance between two spikes of the input signal. Demonstrating this sufficient condition is our basic theoretical result. However, one of the problems of dealing with the l0-norm in optimization problems is the requirement of exhaustive or combinatorial search methods, since it is a non-continuous function. As an alternative, Mohimani et al. (2009) proposed a smoothed and continuous version of the l0-norm. Here, we propose a modification of this criterion in order to make it scale-invariant and, finally, we derive a gradient-based algorithm for the modified criterion. Results with synthetic data suggest that the imposed conditions are sufficient but not strictly necessary.
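
For orientation, the Gaussian-smoothed l0 approximation of Mohimani et al. referred to above can be sketched as follows. The sketch shows only that original criterion, not the scale-invariant modification proposed in the paper; the function name and the test vector are assumptions made for the example.

```python
import math

def smoothed_l0(s, sigma):
    """Gaussian-smoothed approximation of the number of non-zero entries:
    len(s) - sum_i exp(-s_i^2 / (2 sigma^2)). It tends to the l0-norm as
    sigma tends to 0."""
    return len(s) - sum(math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in s)

# Hypothetical sparse signal with two spikes.
spikes = [0.0, 0.0, 3.0, 0.0, -2.0]
print(smoothed_l0(spikes, sigma=0.01))

# The same signal scaled by 1e-3: the true l0-norm is unchanged, but the
# smoothed value changes because sigma is fixed.
print(smoothed_l0([1e-3 * x for x in spikes], sigma=0.01))
```

The second call illustrates why the criterion depends on the signal scale for a fixed sigma, which is the property a scale-invariant variant is meant to remove.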


TU-P4-8: Performances Theoretical Model-Based Optimization for Incipient Fault Detection with KL Divergence

Abdulrahman Youssef (L2S (CNRS - Supelec - Université Paris Sud), France); Claude Delpha (Universite Paris Sud - L2S, France); Demba Diallo (LGEP (CNRS - Supelec - Université Paris Sud - UPMC), France)

Abstract: Sensitive and reliable incipient fault detection methods are a major concern in industrial processes. The Kullback-Leibler Divergence (KLD) has proven to be particularly efficient. However, the performance of the technique is highly dependent on the detection threshold and the Signal to Noise Ratio (SNR). In this paper, we develop an analytical model of the fault detection performance (False Alarm Probability and Missed Detection Probability) based on the KLD, including the noisy environment characteristics. Thanks to this model, an optimization procedure is applied to set the optimal fault detection threshold depending on the SNR and the fault severity.
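
To make the role of the detection threshold concrete, the sketch below evaluates the closed-form Kullback-Leibler divergence between two univariate Gaussians and compares it with a threshold. The Gaussian model, the function names and the threshold values are assumptions for illustration; the analytical performance model of the paper is not reproduced.

```python
import math

def kl_gaussian(mu_p, var_p, mu_q, var_q):
    """KL divergence D(P || Q) between univariate Gaussians
    P = N(mu_p, var_p) and Q = N(mu_q, var_q)."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)

def detect_fault(mu_ref, var_ref, mu_obs, var_obs, threshold):
    """Flag a fault when the observed distribution diverges from the
    fault-free reference by more than the (hypothetical) threshold."""
    return kl_gaussian(mu_obs, var_obs, mu_ref, var_ref) > threshold

# A small mean shift of 0.1 under unit variance gives a divergence of 0.005,
# so the decision depends strongly on where the threshold is placed.
print(kl_gaussian(0.1, 1.0, 0.0, 1.0))
print(detect_fault(0.0, 1.0, 0.1, 1.0, threshold=0.001))
print(detect_fault(0.0, 1.0, 0.1, 1.0, threshold=0.01))
```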


TU-P4-9: Instantaneous Frequency Estimation by Group Delay Attractors and Instantaneous Frequency Attractors

Soo-Chang Pei (National Taiwan University, Taiwan); Shih-Gu Huang (National Taiwan University, Taiwan)

Abstract: Instantaneous frequency attractors (IFAs), obtained from the phase of a time-frequency representation, have been introduced for instantaneous frequency (IF) estimation. In this paper, another kind of attractor, the group delay attractor (GDA), is proposed to improve the IFA-based method. The GDAs can reveal IFs which cannot be estimated from the IFAs. Simulation results show that the IF estimation method based on both the GDAs and IFAs outperforms the well-known ridge detection method. Also, it is shown that the proposed method creates far fewer spurious IFs than the IFA-based method in noisy environments.


TU-P4-10: A New Approach to Spectral Estimation From Irregular Sampling

David Bonacci (TeSA lab, France); Bernard Lacaze (TESA Lab, France)

Abstract: This article addresses the problem of signal reconstruction, spectral estimation and linear filtering directly from irregularly spaced samples of a continuous signal (or autocorrelation function in the case of random signals) when the signal spectrum is assumed to be bounded. The number 2L of samples is assumed to be large enough so that the variation of the spectrum on intervals of length proportional to 1/L is small. Reconstruction formulas are based on PNS (Periodic Nonuniform Sampling) schemes. They allow for reconstruction schemes that do not require regular resampling and eliminate two stages of the classical computations. The presented method can also be easily generalized to spectra in symmetric frequency bands (bandpass signals).


TU-P4-11: Uncovering Harmonic Content Via Skewness Maximization - A Fourier Analysis

Aziz Kubilay Ovacıklı (Rubico Vibration Analysis AB, Sweden); Patrik Pääjärvi (Rubico Vibration Analysis AB, Sweden); James LeBlanc (Luleå University of Technology, Sweden); Johan Carlson (Luleå University of Technology, Sweden)

Abstract: Blind adaptation with an appropriate objective function results in enhancement of the signal of interest. Skewness is chosen as a measure of impulsiveness for blind adaptation to enhance impacting sources arising from rolling bearings. Such impacting sources can be modelled with harmonically related sinusoids, which leads to the discovery of harmonic content with unknown fundamental frequency by skewness maximization. Interfering components that do not possess a harmonic relation are simultaneously suppressed by the proposed method. An experimental example on rolling bearing fault detection is given to illustrate the ability of skewness maximization in uncovering harmonic content.


TU-P4-12: Evaluation of Non-Linear Combinations of Rescaled Reassigned Spectrograms

Maria Sandsten (Lund University, Sweden)

Abstract: The reassignment technique is used to increase the resolution for signals that have closely located time-frequency components. For Gaussian components, the reassignment based on an optimal (matched) window spectrogram will result in a single point where all mass is located. For non-optimal windows, the reassignment procedure can be optimally rescaled to fulfill the single point mass location. Non-linear combinations of spectrograms for different window lengths have previously been suggested [1], and in this paper the performance of different non-linear combinations of optimally rescaled reassigned spectrograms is evaluated.


TU-P4-13: High Resolution Sparse Estimation of Exponentially Decaying Two-Dimensional Signals

Stefan I Adalbjörnsson (Lund University, Sweden); Johan Svaerd (Lund University, Sweden); Andreas Jakobsson (Lund University, Sweden)

Abstract: In this work, we consider the problem of high-resolution estimation of the parameters detailing a two-dimensional (2-D) signal consisting of an unknown number of exponentially decaying sinusoidal components. Interpreting the estimation problem as a block (or group) sparse representation problem allows the decoupling of the 2-D data structure into a sum of outer-products of 1-D damped sinusoidal signals with unknown damping and frequency. The resulting non-zero blocks will represent each of the 1-D damped sinusoids, which may then be used as non-parametric estimates of the corresponding 1-D signals; this implies that the sought 2-D modes may be estimated using a sequence of 1-D optimization problems. The resulting sparse representation problem is solved using an iterative ADMM-based algorithm, after which the damping and frequency parameters can be estimated by a sequence of simple 1-D optimization problems.


TU-P4-14: Smooth 2-D Frequency Estimation Using Covariance Fitting

Johan Svaerd (Lund University, Sweden); Johan Brynolfsson (Lund University, Sweden); Andreas Jakobsson (Lund University, Sweden); Maria Sandsten (Lund University, Sweden)

Abstract: In this paper, we introduce a non-parametric 2-D spectral estimator for smooth spectra, allowing for irregularly sampled measurements. The estimate is formed by assuming that the spectrum is smooth and will vary slowly over the frequency grids, such that the spectral density inside any given rectangle in the spectral grid may be approximated well as a plane. Using this framework, the 2-D spectrum is estimated by finding the solution to a convex covariance fitting problem, solved using the generalized alternating direction method of multipliers. Numerical simulations indicate the achievable performance gain as compared to the Blackman-Tukey estimator.


TU-P4-15: Wiener Filtering in the Windowed DFT Domain

Siouar Bensaid (EURECOM, France); Dirk Slock (EURECOM, France)

Abstract: We focus on the use of windows in the frequency domain processing of data for the purpose of Wiener filtering. Classical frequency domain asymptotics replace linear convolution by circulant convolution, leading to approximation errors. The introduction of windows can lead to slightly more complex frequency domain techniques, replacing diagonal matrices by banded matrices, but with controlled approximation error. Other work observed this recently, proposing general banded matrices in the frequency domain for filtering. Here, we emphasize the design of a window to optimize the banded approximation, and more importantly, we show that the whole banded matrix is in fact still parametrized by a diagonal matrix, which facilitates estimation. We propose here both some non-parametric and parametric approaches for estimating the diagonal spectral parts and revisit in particular the effect of the window on frequency domain Recursive Least-Squares (RLS) adaptive filtering.



Session WE-L01: Machine Learning : Signal Processing Applications


WE-L01-1: Recognition of Acoustic Events Using Deep Neural Networks

Oguzhan Gencoglu (Tampere University of Technology, Finland); Tuomas Virtanen (Tampere University of Technology, Finland); Heikki Huttunen (Tampere University of Technology, Finland)

Abstract: This paper proposes the use of a deep neural network for the recognition of isolated acoustic events such as footsteps, baby crying, motorcycle, rain, etc. For an acoustic event classification task containing 61 distinct classes, the classification accuracy of the neural network classifier (60.3%) exceeds that of the conventional Gaussian mixture model based hidden Markov model classifier (54.8%). In addition, unsupervised layer-wise pretraining followed by standard backpropagation training of a deep network (known as a deep belief network) results in a further increase of 2-4% in classification accuracy. The effects of implementation parameters, such as the types of features and the number of adjacent frames used as additional features, are found to be significant for classification accuracy.


WE-L01-2: ECG Analysis Using Consensus Clustering

Andre R. Lourenço (Instituto Superior de Engenharia de Lisboa, Portugal); Samuel Rota Bulò (Fondazione Bruno Kessler, Italy); Carlos Carreiras (Instituto de Telecomunicações, Portugal); Ana Fred (I.S.T. - Technical U. Lisbon / I.T. Lisbon, Portugal)

Abstract: Biosignals analysis has become widespread, upstaging their typical use in clinical settings. Electrocardiography (ECG) plays a central role in patient monitoring as a diagnosis tool in today's medicine and as an emerging biometric trait. In this paper we adopt a consensus clustering approach for the unsupervised analysis of ECG-based biometric records. This type of analysis highlights natural groups within the population under investigation, which can be correlated with ground truth information in order to gain more insights about the data. Preliminary results are promising, for meaningful clusters are extracted from the population under analysis.


WE-L01-3: On the Information-theoretic Limits of Graphical Model Selection for Gaussian Time Series

Gabor Hannak (Vienna University of Technology, Austria); Alexander Jung (Vienna University of Technology, Austria); Norbert Goertz (Vienna University of Technology, Austria)

Abstract: We consider the problem of inferring the conditional independence graph (CIG) of a multivariate stationary Gaussian random process based on a finite length observation. Using information-theoretic methods, we derive a lower bound on the error probability of any learning scheme for the underlying process CIG of an i.i.d. process. This bound, in turn, yields a minimum required sample-size which is necessary for any algorithm regardless of its complexity to reliably select the true underlying graphical model. Furthermore, by analysis of a simple selection scheme, we show that the information-theoretic limits can be achieved for a subclass of processes. We do not assume a parametric model for the observed process, but expect it to have a sufficiently smooth spectral density matrix (SDM).


WE-L01-4: A Frequency Method for Blind Separation of an Anechoic Mixture

Wendyam S. B. Ouedraogo (GIPSA-lab, France); Barbara Nicolas (INPG, France); Benoit Oudompheng (MicrodB, France); Jerome I. Mars (Grenoble Institute of Technology, France); Christian Jutten (GIPSA-Lab, France)

Abstract: This paper presents a new frequency method for blind separation of mixtures of scaled and delayed versions of sources. This kind of problem can occur in air and underwater acoustics. By assuming the mutual independence of the sources, we make use of the power spectral densities and the cross power spectral densities of mixed data to estimate the sources, the mixing coefficients, and the relative delays between a reference sensor and the other sensors. Simulations on synthetic data of sound radiated by a ship show the effectiveness of the proposed method.


WE-L01-5: Sparse Representation and Least Squares-based Classification in Face Recognition

Michael Iliadis (Northwestern University, USA); Leonidas Spinoulas (Northwestern University, USA); Albert S. Berahas (Northwestern University, USA); Haohong Wang (TCL Research America, USA); Aggelos K Katsaggelos (Northwestern University, USA)

Abstract: In this paper we present a novel approach to face recognition. We propose an adaptation and extension to the state-of-the-art methods in face recognition, such as sparse representation-based classification and its extensions. Effectively, our method combines the sparsity-based approaches with additional least-squares steps that utilize more information, in order to achieve significant performance improvements with little additional cost. This approach also mitigates the need for a large number of training images since it proves robust to a varying number of training samples.



Session WE-L02: Image and Video Coding II


WE-L02-1: Compression of Microarray Images Using a Binary Tree Decomposition

Luís M. O. Matos (University of Aveiro, Portugal); António J. R. Neves (University of Aveiro, Portugal); Armando J Pinho (University of Aveiro, Portugal)

Abstract: This paper proposes a lossless compression method for microarray images, based on a hierarchical organization of the intensity levels followed by finite-context modeling. A similar approach was recently applied to medical images with success. The goal of this work was to further extend, adapt and evaluate this approach for the special case of microarray images. We performed simulations on seven different data sets (total of 254 images). On average, the proposed method attained ~9% better results when compared to the best compression standard (JPEG-LS).


WE-L02-2: Efficient Joint Multiscale Decomposition for Color Stereo Image Coding

Oussama Dhifallah (SUP'COM, Tunisia); Mounir Kaaniche (Université Paris 13, France); Amel Benazza (SUP'COM, Tunisia)

Abstract: With the recent advances in stereoscopic display technologies, the demand for developing efficient stereo image coding techniques has increased. While most of the existing approaches have been proposed and studied in the case of monochrome stereo images, in this paper we are interested in encoding color stereo data. More precisely, we design a multiscale decomposition, based on the concept of the vector lifting scheme, that jointly exploits the inter-view and inter-color-channel redundancies. Moreover, our decomposition is well adapted to the contents of these data. Experimental results obtained on natural stereo images show the benefits which can be drawn from the proposed coding method.


WE-L02-3: Coding Mode Decision Algorithm for Binary Descriptor Coding

Pedro Monteiro (Instituto Superior de Engenharia de Lisboa, Portugal); João Ascenso (Instituto Superior de Engenharia de Lisboa, Portugal)

Abstract: In visual sensor networks, local feature descriptors can be computed at the sensing nodes, which work collaboratively on the data obtained to make an efficient visual analysis. In fact, with a minimal amount of computational effort, the detection and extraction of local features, such as binary descriptors, can provide a reliable and compact image representation. In this paper, it is proposed to extract and code binary descriptors to meet the energy and bandwidth constraints at each sensing node. The major contribution is a binary descriptor coding technique that exploits the correlation using two different coding modes: Intra, which exploits the correlation between the elements that compose a descriptor; and Inter, which exploits the correlation between descriptors of the same image. The experimental results show bitrate savings of up to 35% without any impact on the performance of the image retrieval task.


WE-L02-4: Design of Optimized Contourlet Filters for Improved Coding Gain

Tobias Gehrke (Pforzheim University, Germany); Thomas Greiner (Pforzheim University, Germany)

Abstract: The separable wavelet transform has limited directional sensitivity and is suboptimal for compression of textured images. A finer directional resolution and better coding results can be achieved by the contourlet transform. So far, directional filters based on design criteria that are unspecific to image compression were used for the contourlet transform. We propose directional filters that are optimized specifically for image coding. To this end, a filter design method based on maximization of coding gain was developed. Directional filters were designed for all images of two standard test image databases and compared experimentally to standard filters. In most cases the newly designed filters performed better than standard filters.


WE-L02-5: Image Quality Assessment Based on Detail Differences

Elio D. Di Claudio (University of Rome "La Sapienza", Italy); Giovanni Jacovitti (INFOCOM Dpt. University of Rome, Italy)

Abstract: This paper presents a novel Full Reference method for image quality assessment based on two indices measuring respectively detail loss and spurious detail addition. These indices define a two dimensional (2D) state in a Virtual Cognitive State (VCS) space. The quality estimation is obtained as a 2D function of the VCS, empirically determined via polynomial fitting of DMOS values of training images. The method provides at the same time highly accurate DMOS estimates, and a quantitative account of the causes of quality degradation.



Session WE-L03: Cooperation and Cognition in Wireless Networks


WE-L03-1: Joint SIC and Multi-relay Selection Algorithms for Cooperative DS-CDMA Systems

Jiaqi Gu (University of York, United Kingdom); Rodrigo C. de Lamare (University of York, United Kingdom)

Abstract: In this work, we propose a cross-layer design strategy based on a joint successive interference cancellation (SIC) detection technique and a multi-relay selection algorithm for the uplink of cooperative direct-sequence code-division multiple access (DS-CDMA) systems. We devise a low-cost greedy list-based SIC (GL-SIC) strategy with RAKE receivers as the front-end that can approach the maximum likelihood detector performance. We also present a low-complexity multi-relay selection algorithm based on greedy techniques that can approach the performance of an exhaustive search. Simulations show an excellent bit error rate performance of the proposed detection and relay selection algorithms as compared to existing techniques.


WE-L03-2: Enhancing the MIMO Channel Capacity in Manhattan-Like Scenarios

Ivo Sousa (Instituto de Telecomunicações/IST, University of Lisbon, Portugal); Maria Paula Queluz (Instituto Superior Técnico, Portugal); António J. Rodrigues (IT / Instituto Superior Técnico, Portugal)

Abstract: In this paper the channel capacity of Multiple Input Multiple Output (MIMO) wireless systems within a Manhattan-like scenario is studied. Three Base Station (BS) placement models are proposed in this work, so as to enhance the channel capacity of the wireless system. The evaluation of the proposed BS arrangements is performed using a simulator with realistic underlying models (test scenario, radio propagation and mobility models). Simulation results show that all the proposed placement models have a superior performance when compared with the traditional BS placement model. In particular, one of the proposed BS dispositions requires fewer BSs, which means greener communications and lower hardware costs.


WE-L03-3: Channel Simulation with Large-Scale Time Evolution in Irregular Cellular Networks

Levent Kayili (University of Toronto, Canada); Elvino Silveira Sousa (University of Toronto, Canada)

Abstract: We consider a cellular network where base stations with widely different power capabilities (power subclasses) are deployed in a highly inhomogeneous or irregular pattern, referred to in this work as an irregular cellular network. A simulation framework with slow scale time variation appropriate for irregular networks is proposed. A relevant resource allocation framework as well as shadowing and path loss models are discussed. Finally, the time evolution methodology is detailed. It is believed that the proposed simulation framework will be important in the evaluation of slowly adaptive algorithms such as those studied as part of 3GPP LTE Self Organizing Networks (SON).


WE-L03-4: Advanced Interference Reduction in NC-OFDM Based Cognitive Radio with Cancellation Carriers

Pawel Kryszkiewicz (Poznan University of Technology, Poland); Hanna Bogucka (Poznan University of Technology, Poland)

Abstract: Reduction of the out-of-band (OOB) emission is essential for Cognitive Radio (CR) systems to enable coexistence with licensed (primary) systems operating in the adjacent frequency bands. This paper proposes an algorithm for Non-Contiguous Orthogonal Frequency Division Multiplexing (NC-OFDM)-based CR that reduces the interference caused both by OOB radiation and by the non-ideal frequency selectivity of a primary user (PU) receiver. It is based on the use of a set of subcarriers called Cancellation Carriers (CCs). By being aware of the PU's carrier frequency, the observed interference power can be decreased by about 10 dB in comparison with the standard OOB-power minimizing algorithms.


WE-L03-5: A Multi-threshold Feedback Scheme for Cognitive Radio Networks Based on Opportunistic Beamforming

Ayman Massaoudi (MEDIATRON Lab., Sup'Com - University of Carthage, Tunisia); Noura Sellami (Ecole Nationale d'Ingénieurs de Sfax, Tunisia); Mohamed Siala (Sup'Com, Tunisia)

Abstract: Cognitive radio is a promising technique for efficient spectrum utilization in wireless systems. In a multi-user Multiple-Input Multiple-Output (MIMO) system, a large amount of feedback information has to be used to achieve multi-user diversity. In this paper, in order to reduce the amount of feedback and hence the wasted energy, we propose a novel scheduling scheme of secondary users (SUs) for an underlay cognitive radio network. Our scheme is based on opportunistic beamforming and employs multiple feedback thresholds. The lowest threshold is chosen to ensure a predefined allowed scheduling outage probability. The other thresholds are chosen in order to reduce the number of SUs feeding back their maximum signal to interference plus noise ratio (SINR) (and hence the wasted energy) and the delay due to the number of attempts. We show via simulations that a significant gain in terms of energy is obtained at the price of a reasonable delay.



Session WE-L04: Recent Advances on CR/SDR Circuits, Systems and Signal-Processing Techniques (Special Session)


WE-L04-1: Low-power Active Interference Cancellation for OFDM Spectrum Sculpting

Jorge F. Schmidt (University of Vigo, Spain); Daniel Romero (University of Vigo, Spain); Roberto López-Valcarce (Universidad de Vigo, Spain)

Abstract: We present a low-power design for Active Interference Cancellation (AIC) sculpting of the OFDM spectrum, based on sparse design concepts. Optimal AIC designs compute cancellation weights based on contributions from all data subcarriers. Thus, as the number of subcarriers grows, power consumption becomes a concern, and suboptimal solutions that avoid involving all subcarriers are of interest. In this context, we present novel sparse AIC designs based on a zero-norm minimization of the matrix defining the cancellation weights. These designs drastically reduce the number of operations per symbol, and thus the power consumption, while allowing the loss with respect to the optimal design to be tuned. They can be efficiently obtained and significantly outperform usual thresholding or sparsity-inducing l1-norm minimization approaches.


WE-L04-2: Characterization of SDR/CR Front-Ends for Improved Digital Signal Processing Algorithms

Diogo Ribeiro (Instituto de Telecomunicações - Universidade de Aveiro, Portugal); Pedro Cruz (Instituto de Telecomunicações - Universidade de Aveiro, Portugal); Nuno Borges Carvalho (University of Aveiro/IT Aveiro, Portugal)

Abstract: This paper demonstrates the importance of performing joint analog and digital analysis and characterization of current radio-frequency components and devices. This is mostly because today's circuits and systems are moving toward integration into a single module, which makes separate analog and digital characterization unfeasible per se. Some details about mixed-signal instrumentation are introduced by showing representative laboratory measurement arrangements, which are necessary to obtain information on the mixed-signal analog/digital portions, commonly named transfer functions. This information will make it possible to produce better designs of the entire radio front-end, as well as to implement optimized digital signal processing algorithms that compensate for the analog impairments.


WE-L04-3: DSP-Based Suppression of Spurious Emissions At RX Band in Carrier Aggregation FDD Transceivers

Adnan Kiayani (Tampere University of Technology, Finland); Mahmoud Abdelaziz (Tampere University of Technology, Finland); Lauri Anttila (Tampere University of Technology, Finland); Vesa K Lehtinen (Tampere University of Technology, Finland); Mikko Valkama (Tampere University of Technology, Finland)

Abstract: In frequency division duplex transceivers employing non-contiguous carrier aggregation (CA) transmission, achieving sufficient isolation between transmit and receive chains using radio frequency filtering alone is increasingly difficult. A particularly challenging problem in this context is the spurious intermodulation (IM) components due to a nonlinear power amplifier (PA), which may easily overlap the receiver band. With realistic duplex filters, the residual spurious IM at the RX band can be several dB stronger than the thermal noise floor, leading to desensitization of the transceiver's own receiver. In this paper, we carry out detailed signal modeling of spurious emissions due to wideband PAs on the third-order IM band. Stemming from this modeling, and using the known transmit data, we present an efficient nonlinear digital identification and cancellation technique to suppress the unwanted IM components in the RX band. The proposed technique is verified with computer simulations, showing excellent calibration properties, hence relaxing filtering and duplexing distance requirements in spectrally-agile CA transceivers.


WE-L04-4: Wideband Spectrum Sensing for Cognitive Radio

Jose Vieira (Universidade de Aveiro, Portugal); Ana Tomé (Universidade de Aveiro, Portugal); Daniel Malafaia (Universidade de Aveiro, Portugal)

Abstract: In this work we propose a wideband spectrum sensing system based on hybrid filter banks. The polyphase implementation of the digital counterpart of the filter bank can be modified to include a parallelized version of the fast Fourier transform (FFT) algorithm, thus avoiding any sampling rate expanders. We show how to incorporate the FFT block in the structure in order to estimate the wideband frequency content of the signal. The proposed structure is particularly suitable for FPGA-based implementations.


WE-L04-5: Recent Advances in Software-Defined Radars: Chirped Impulses

José-María Muñoz-Ferreras (University of Alcalá, Spain); Israel Arnedo (Public University of Navarre, Spain); Aintzane Lujambio (Public University of Navarre, Spain); Magdalena Chudzik (Public University of Navarre, Spain); Miguel Laso (Public University of Navarre, Spain); Roberto Gómez-García (University of Alcalá, Spain); Arjuna Madanayake (University of Akron, USA)

Abstract: The software-defined radio (SDR) paradigm can be applied to radars. Novel radio-frequency (RF) chains and architectures can lead to enhanced radar schemes. After a brief review of SDR-based schemes, this work concentrates on the relevant topic of impulse-radio ultra-wideband (IR-UWB) radars. By emitting extremely narrow impulses in the time domain, these systems can achieve a great range resolution. However, one drawback is the difficulty of controlling their narrow waveform. On the other hand, because of its many advantages, the chirped waveform has been extensively used in radars and has become the standard employed signal. Here, for the first time, the chirped waveform is exploited in SDR-inspired IR-UWB radars, thus bringing together the benefits of both worlds. The key element in this radar architecture is a passive device shaped by smoothly-chirped coupled lines (SCCL) to produce the chirped signal. Through a developed circuit, very narrow chirped pulses have been generated and measured.



Session WE-L05: Advances in Music and Audio Recognition/Analysis (Special Session)


WE-L05-1: Audio Concept Classification with Hierarchical Deep Neural Networks

Mirco Ravanelli (Fondazione Bruno Kessler (FBK), Italy); Benjamin Elizalde (ICSI Berkeley, USA); Karl Ni (Lawrence Livermore National Laboratory, USA); Gerald Friedland (International Computer Science Institute, USA)

Abstract: Audio-based multimedia retrieval tasks may identify semantic information in audio streams, i.e., audio concepts (such as music, laughter, or a revving engine). Conventional Gaussian mixture models (GMMs) have had some success in classifying a reduced set of audio concepts. However, multi-class classification can benefit from context window analysis and the discriminating power of deeper architectures. Although deep learning has shown promise in various applications such as speech and object recognition, it has not yet met the expectations for other fields such as audio concept classification. This paper explores, for the first time, the potential of deep learning in classifying audio concepts on User-Generated Content videos. The proposed system is comprised of two cascaded neural networks in a hierarchical configuration to analyze the short- and long-term context information. Our system outperforms a GMM approach by a relative 54%, a Neural Network by 33%, and a Deep Neural Network by 12% on the TRECVID-MED database.


WE-L05-2: Unsupervised Learning and Refinement of Rhythmic Patterns for Beat and Downbeat Tracking

Florian Krebs (Johannes Kepler University, Linz, Austria); Filip Korzeniowski (Johannes Kepler University Linz, Austria); Maarten Grachten (Austrian Research Institute for Artificial Intelligence, Austria); Gerhard Widmer (Johannes Kepler University Linz, Austria)

Abstract: In this paper, we propose a method of extracting rhythmic patterns from audio recordings to be used for training a probabilistic model for beat and downbeat extraction. The method comprises two stages: clustering and refinement. It is able to take advantage of any available annotations that are related to the metrical structure (e.g., beats, tempo, downbeats, dance style). Our evaluation on the Ballroom dataset showed that our unsupervised method achieves results comparable to those of a supervised model. On another dataset, the proposed method performs as well as one of two reference systems in the beat tracking task, and achieves better results in downbeat tracking.


WE-L05-3: Speech-Music Discrimination: a Deep Learning Perspective

Aggelos Pikrakis (University of Piraeus, Greece); Sergios Theodoridis (University of Athens, Greece)

Abstract: This paper is a study of the problem of speech-music discrimination from a deep learning perspective. We experiment with two feature extraction schemes and investigate how network depth and RBM size affect the classification performance on publicly available datasets and on large amounts of audio data from video-sharing sites. The main building block of our deep networks is the Restricted Boltzmann Machine (RBM). The stack of RBMs is pre-trained in a layer-wise mode and, subsequently, a fine-tuning stage trains the deep network as a whole with back-propagation. The proposed approach indicates that deep architectures can serve as strong classifiers for the binary problem of speech vs music, with satisfactory generalization performance.


WE-L05-4: Blind Audio Source Separation of Stereo Mixtures Using Bayesian Non-negative Matrix Factorization

Sayeh Mirzaei (KU Leuven, Belgium); Hugo Van hamme (KU Leuven, Belgium); Yaser Norouzi (Amirkabir University of Technology, Iran)

Abstract: In this paper, a novel approach is proposed for estimating the number of sources and for source separation in convolutive audio stereo mixtures. First, an angular spectrum-based method is applied to count and locate the sources. A non-linear GCC-PHAT metric is exploited for this purpose. The estimated channel coefficients are then utilized to obtain a primary estimate of the source spectrograms through binary masking. Afterwards, the individual spectrograms are decomposed using a Bayesian NMF approach. This way, the number of components required for modeling each source is inferred based on data. These factors are then utilized as initial values for the EM algorithm which maximizes the joint likelihood of the 2-channel data to extract the individual source signals. It is shown that this initialization scheme can greatly improve the performance of the source separation over random initialization. The experiments are performed on synthetic mixtures of speech and music signals.


WE-L05-5: Controlling the Convergence Rate to Help Parameter Estimation in a PLCA-based Model

Benoit Fuentes (Institut Mines Telecom, Telecom ParisTech, CNRS LTCI, France); Roland Badeau (Institut Mines Telecom, Telecom ParisTech, CNRS LTCI, France); Gaël Richard (Institut Mines-Télécom, Télécom ParisTech, CNRS-LTCI, France)

Abstract: The Probabilistic Latent Component Analysis (PLCA) is a tool used to model non-negative data such as non-negative time-frequency representations of audio. In this paper, we put forward a trick to help the corresponding parameter estimation algorithm to converge toward more meaningful solutions, based on the new concept of brakes. The idea is to control the convergence rate of the parameters of a PLCA-based model within the estimation algorithm: the parameters which are known to be properly initialized are braked in order to stay close to their initial values. This is an effective way to better account for a relevant initialization. In this paper, these brakes are implemented in the framework of PLCA, and they are tested in an application of multipitch estimation. Results show that the use of brakes can significantly influence the decomposition and thus the performance, making them a powerful tool to boost any kind of PLCA-based algorithm.



Session WE-L06: Machine Learning : Sparsity


WE-L06-1: Performance Limits of Dictionary Learning for Sparse Coding

Alexander Jung (Vienna University of Technology, Austria); Yonina C. Eldar (Technion-Israel Institute of Technology, Israel); Norbert Goertz (Vienna University of Technology, Austria)

Abstract: We consider the problem of dictionary learning under the assumption that the observed signals can be represented as sparse linear combinations of the columns of a single large dictionary matrix. In particular, we analyze the minimax risk of the dictionary learning problem, which governs the mean squared error (MSE) performance of any learning scheme, regardless of its computational complexity. By following an established information-theoretic method based on Fano's inequality, we derive a lower bound on the minimax risk for a given dictionary learning problem. This lower bound yields a characterization of the sample complexity, i.e., a lower bound on the required number of observations such that consistent dictionary learning schemes exist. Our bounds may be compared with the performance of a given learning scheme, making it possible to characterize how far the method is from optimal performance.
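
As background, the generic form of a Fano-type minimax bound can be written as follows. This is the textbook argument, not the specific bound derived in the paper. The notation is an assumption made here for illustration: $D_1, \dots, D_M$ (with $M \ge 2$) are dictionaries in the considered class with pairwise Frobenius distance at least $2\delta$, $U$ is uniform on $\{1, \dots, M\}$, and $Y$ denotes the observations generated from $D_U$.

```latex
\[
  \inf_{\hat{D}} \, \sup_{D} \; \mathbb{E}\, \bigl\| \hat{D}(Y) - D \bigr\|_F^2
  \;\ge\;
  \delta^2 \left( 1 - \frac{I(U; Y) + \log 2}{\log M} \right)
\]
```

The bound is informative only when the mutual information $I(U; Y)$ is small relative to $\log M$, which is how a requirement on the number of observations typically emerges from this kind of argument.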


WE-L06-2: Separable Cosparse Analysis Operator Learning

Matthias Seibert (Technische Universität München, Germany); Julian Wörmann (Technische Universität München, Germany); Rémi Gribonval (INRIA, France); Martin Kleinsteuber (Technische Universität München, Germany)

Abstract: The ability to obtain a sparse representation for a certain class of signals has many applications in data analysis, image processing, and other research fields. Among sparse representations, the cosparse analysis model has recently gained increasing interest. Many signals exhibit a multidimensional structure, e.g. images or three-dimensional MRI scans. Most data analysis and learning algorithms use vectorized signals and thereby do not account for this underlying structure. The drawback of not taking the inherent structure into account is a dramatic increase in computational cost. We propose an algorithm for learning a cosparse analysis operator that adheres to the preexisting structure of the data, and thus allows for a very efficient implementation. This is achieved by enforcing a separable structure on the learned operator. Our learning algorithm is able to deal with multidimensional data of arbitrary order. We evaluate our method on volumetric data, using three-dimensional MRI scans as an example.


WE-L06-3: K-LDA: An Algorithm for Learning Jointly Overcomplete and Discriminative Dictionaries

Mohsen Joneidi (Sharif University of Technology, Iran); Jamal Golmohammady (Sharif University of Technology, Iran); Mostafa Sadeghi (Sharif University of Technology, Iran); Massoud Babaie-Zadeh (Sharif University of Technology, Iran); Christian Jutten (GIPSA-Lab, France)

Abstract: A new algorithm for learning jointly reconstructive and discriminative dictionaries for sparse representation (SR) is presented. While a usual dictionary learning algorithm like K-SVD considers only the reconstructive aspect of the sparse representations when learning a dictionary, our proposed algorithm, which we call K-LDA, also addresses their discriminative aspect. In fact, K-LDA is an extension of K-SVD to the case where the class information (the labels) of the training data is also known. K-LDA takes this information into account in order to make the sparse representations more discriminative. It makes a trade-off between the amount of reconstruction error, sparsity, and discrimination of the sparse representations. Simulation results on synthetic and hand-written data demonstrate the promising performance of our proposed algorithm.


WE-L06-4: The Atomic Norm Formulation of OSCAR Regularization with Application to the Frank-Wolfe Algorithm

Xiangrong Zeng (Instituto de Telecomunicações, Instituto Superior Técnico, Portugal); Mario A. T. Figueiredo (Instituto Superior Técnico, Portugal)

Abstract: This paper proposes an atomic norm formulation of octagonal shrinkage and clustering algorithm for regression (OSCAR) regularization. The OSCAR regularizer can be reformulated using a decreasing weighted sorted l1 (DWSL1) norm (which is shown to be convex). We also show how, by exploiting an atomic norm formulation, the Ivanov regularization scheme involving the OSCAR regularizer can be handled using the conditional gradient (also known as Frank-Wolfe) method.
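
The reformulation mentioned in the abstract can be checked numerically with a small sketch. It assumes the usual OSCAR definition, an l1 term weighted by `lam1` plus a pairwise max term weighted by `lam2`. The function and parameter names are illustrative and not taken from the paper.

```python
def oscar_pairwise(x, lam1, lam2):
    """OSCAR regularizer in its pairwise form:
    lam1 * sum_i |x_i| + lam2 * sum_{i<j} max(|x_i|, |x_j|)."""
    a = [abs(v) for v in x]
    n = len(a)
    pair_term = sum(max(a[i], a[j])
                    for i in range(n) for j in range(i + 1, n))
    return lam1 * sum(a) + lam2 * pair_term


def oscar_sorted(x, lam1, lam2):
    """Same value as a weighted sorted l1 norm: magnitudes sorted in
    decreasing order, with non-increasing weights lam1 + lam2 * (n - 1 - i)."""
    a = sorted((abs(v) for v in x), reverse=True)
    n = len(a)
    return sum((lam1 + lam2 * (n - 1 - i)) * a[i] for i in range(n))


x = [3.0, -1.0, 2.0]
print(oscar_pairwise(x, 1.0, 0.5))  # 10.0
print(oscar_sorted(x, 1.0, 0.5))    # 10.0
```

The two forms agree because the i-th largest magnitude is the maximum in exactly n - i of the pairs it takes part in (counting i from 1).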


WE-L06-5: Cardinal Sparse Partial Least Square Feature Selection and Its Application in Face Recognition

Honglei Zhang (Tampere University of Technology, Finland); Moncef Gabbouj (Tampere University of Technology, Finland); Serkan Kiranyaz (Tampere University of Technology, Finland)

Abstract: Many modern computer vision systems combine high dimensional features and linear classifiers to achieve better classification accuracy. However, the excessively long features are often highly redundant and thus dramatically increase the system storage and computational load. This paper presents a novel feature selection algorithm, namely the cardinal sparse partial least square algorithm, to address this deficiency in an effective way. The proposed algorithm is based on the sparse solution of partial least square regression. It aims to select a sufficiently large number of features, which can achieve good accuracy when used with linear classifiers. We applied the algorithm to a face recognition system and achieved state-of-the-art results with significantly shorter feature vectors.



Session WE-L07: Image and Video Security


WE-L07-1: Fighting Against Forged Documents by Using Textured Image

Iuliia Tkachenko (University Montpellier 2, France); William Puech (LIRMM, France); Olivier C. Strauss (Laboratory LIRMM, UMR CNRS 5506, University of Montpellier II, France); Jean-Marc Gaudin (Authentication Industries, France); Christophe Destruel (Authentication Industries, France); Christian Guichard (Authentication Industries, France)

Abstract: Verifying the legitimacy of a document is an important current problem. In this paper we propose to use a textured image containing a visual message, which can be used to identify differences between a printed legitimate document and a printed fake document. The suggested textured image consists of specific patterns which should satisfy some conditions in order to give good recognition results after the Print-and-Scan (P&S) process. The identification of a legitimate document is possible by correlating the patterns of the textured image with either the original patterns or representative P&S process patterns. Several experimental results validate the proposed verification method.


WE-L07-2: Color Laser Printer Identification Using Photographed Halftone Images

Doguk Kim (Korea Advanced Institute of Science and Technology, Korea); Heung Kyu Lee (Korea Advanced Institute of Science and Technology, Korea)

Abstract: Due to the spread of color laser printers to the general public, numerous forgeries are made by color laser printers. Printer identification is essential to preventing damage caused by color laser printed forgeries. This paper presents a new method to identify a color laser printer using photographed halftone images. First, we preprocess the photographed images to extract the halftone pattern regardless of the variation of the illumination conditions. Then, 15 halftone texture features are extracted from the preprocessed images. A support vector machine is trained on the extracted features and used to classify them. Experiments are performed on seven color laser printers. The experimental results show that the proposed method is suitable for identifying the source color laser printer using photographed images.


WE-L07-3: Authentication Using Graphical Codes: Optimisation of the Print and Scan Channels

Anh Thu Phan Ho (Institut TELECOM, France); Bao An Mai Hoang (Institut TELECOM, TELECOM Lille, France); Wadih Sawaya (Institut TELECOM, TELECOM Lille, France); Patrick Bas (Centre National de la Recherche Scientifique - LAGIS Laboratory, France)

Abstract: This paper analyses the performance of an authentication system based on graphical printed codes. The authentication system relies on the non-invertible degradation due to the stochastic nature of the printing and scanning processes. Considering the print and scan channels as lognormal or generalized Gaussian additive processes, we maximize the authentication performance for two different security scenarios. The first one considers the opponent as passive and assumes that his print-and-scan channel is the same as the legitimate channel, while the second scenario devises a minimax game in order to take into account an active opponent trying to maximize his probability of non-detection by choosing the appropriate channel. Our first conclusions are that (i) the authentication performance is better for dense noises than for sparse noises in both scenarios, and (ii) for both families of distributions, the opponent's optimal parameters are very close to the legitimate source parameters.


WE-L07-4: Optimized HOG for On-Road Video Based Vehicle Verification

Gonzalo Ballesteros (Universidad Autónoma de Madrid, Spain); Luis Salgado (Universidad Politécnica de Madrid, Spain)

Abstract: Vision-based object detection from a moving platform becomes particularly challenging in the field of advanced driver assistance systems (ADAS). In this context, on-board vision-based vehicle verification strategies become critical, facing challenges derived from the variability of vehicle appearance, illumination, and vehicle speed. In this paper, an optimized HOG configuration for on-board vehicle verification is proposed which considers not only its spatial and orientation resolution, but also descriptor processing strategies and classification. An in-depth analysis of the optimal settings for HOG for on-board vehicle verification is presented, in the context of SVM classification with different kernels. In contrast to many existing approaches, the evaluation is carried out on a public and heterogeneous database of vehicle and non-vehicle images in different areas of the road, yielding excellent verification rates that outperform other similar approaches in the literature.


WE-L07-5: Particle Swarm Optimization for Blurred Contour Retrieval

Julien Marot (Institut Fresnel, France); Salah Bourennane (Ecole Centrale Marseille, France)

Abstract: This paper concentrates on blurred contour estimation in an image. To solve this problem, we start from recently investigated signal models, derived through the association of an array of virtual sensors and the image. The array is linear when linear blurred contours are expected, and circular when circular blurred contours are expected. For the first time in this paper, we propose a common array processing model for both types of contours, which makes their retrieval closer to each other. We propose a common criterion to minimize for the estimation of the contour parameters, and justify the usage of particle swarm optimization for its minimization. An application to fire characterization exemplifies our method.
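
To illustrate the optimizer named in the abstract, here is a minimal particle swarm sketch applied to a stand-in criterion: a least-squares fit of a circle (center and radius) to a set of points. The criterion, the parameter values and all function names are assumptions for illustration, and the paper's array-processing criterion is not reproduced here.

```python
import math
import random


def circle_cost(params, points):
    """Stand-in criterion: squared radial misfit of points to a circle."""
    a, b, r = params
    return sum((math.hypot(x - a, y - b) - r) ** 2 for x, y in points)


def pso_minimize(cost, start, n_particles=30, n_iters=200, spread=2.0,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Basic global-best particle swarm optimization.
    The first particle sits at `start`; the others are scattered around it."""
    rng = random.Random(seed)
    dim = len(start)
    pos = [list(start)] + [[s + rng.uniform(-spread, spread) for s in start]
                           for _ in range(n_particles - 1)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = cost(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


# Hypothetical contour: 12 points on a circle centered at (1, -2), radius 3.
points = [(1 + 3 * math.cos(2 * math.pi * k / 12),
           -2 + 3 * math.sin(2 * math.pi * k / 12)) for k in range(12)]
start = (0.0, 0.0, 1.0)
estimate, final_cost = pso_minimize(lambda p: circle_cost(p, points), start)
print(estimate, final_cost)
```

Because the swarm tracks the best position seen so far, the returned cost can never exceed the cost at the initial guess. How close it gets to the true parameters depends on the swarm settings and the criterion.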



Session WE-L08: Signal Processing for Communications


WE-L08-1: Power Minimization in the Multiuser MIMO-OFDM Broadcast Channel with Imperfect CSI

José P González-Coma (University of A Coruña, Spain); Michael Joham (Technische Universität München, Germany); Paula M. Castro (University of A Coruña, Spain); Luis Castedo (University of A Coruña, Spain)

Abstract: This work addresses the design of linear precoders and receivers in multiuser Multiple-Input Multiple-Output (MIMO) downlink channels using Orthogonal Frequency Division Multiplexing (OFDM) modulation when only partial Channel State Information (CSI) is available at the transmitter. Our aim is to minimize the total transmit power subject to per-user Quality-of-Service (QoS) constraints expressed as per-user rates. We propose a gradient-projection algorithm to optimally distribute the per-user rates among the OFDM subcarriers. Then, another algorithm is used to obtain the per-subcarrier precoders and receivers that minimize the overall transmit power. Based on the Minimum Mean Square Error (MMSE) duality between the MIMO Broadcast Channel (BC) and the MIMO Multiple Access Channel (MAC), both algorithms perform an Alternating Optimization (AO).


WE-L08-2: Low Complexity Multiuser MIMO Scheduling for Weighted Sum Rate Maximization

Ganesh Venkatraman (University of Oulu, Finland); Antti Tölli (University of Oulu, Finland); Janne Janhunen (University of Oulu, Finland); Markku Juntti (University of Oulu, Finland)

Abstract: The paper addresses user scheduling schemes for multi-user multiple-input multiple-output (MU-MIMO) transmission with the objective of sum rate maximization (SRM) and its weighted counterpart in a single cell scenario. We propose a low-complexity product of independent projection differences (PIPD) scheduling scheme, which performs the user selection for the MU-MIMO system with significantly lower complexity in comparison with the existing successive projections (SP) based designs. The PIPD scheme uses a series of independent vector projections to evaluate the decision metrics. In addition, we also propose a heuristic algorithm for weighted scheduling, addressing the weighted sum rate maximization (WSRM) objective, which can be used with any scheduling algorithm. The performance of the weighted scheduling schemes is studied with the objective of minimizing the queues.


WE-L08-3: On the Use of Zero Padding with Discrete Cosine Transform Type-II in Multicarrier Communications

Elena Domínguez-Jiménez (Universidad Politecnica de Madrid, Spain); Gabriela Sansigre (Politechnical University of Madrid, Spain); Fernando Cruz-Roldán (Universidad Alcalá, Spain)

Abstract: In this work, the problem of applying Zero Padding (ZP) as redundancy in multicarrier communications is addressed. To this end, a general matrix formulation to recover the transmitted symbol when ZP is used is provided for any kind of discrete transform employed at both the transmitter and the receiver. The obtained result not only generalizes some previously reported techniques, such as discrete Fourier transform-based transceivers, but can also be extended to other kinds of transforms (e.g., discrete trigonometric transforms). As a particular case study, the use of the discrete cosine transform Type-II even (DCT2e) is analyzed. In this case, a simple structure that recovers the transmitted symbol at the receiver is also shown. Additionally, the expressions of the one-tap per subcarrier coefficients, also using the DCT2e, are derived.


WE-L08-4: Rate-adaptive Secure HARQ Protocol for Block-Fading Channels

Zeina Mheich (LSS Supélec, France); Mael Le Treust (ETIS, CNRS, ENSEA, University Cergy-Pontoise, France); Florence Alberge (University Paris-Sud, France); Pierre Duhamel (Lss Supelec, France); Leszek Szczecinski (INRS-EMT, Canada)

Abstract: This paper analyzes the achievable secrecy throughput in incremental redundancy secure HARQ protocols for communication over block-fading wiretap channels (WTC). The transmitter has no instantaneous channel state information (CSI) but can receive an outdated version of CSI from both the legitimate receiver and the eavesdropper through reliable multi-bit feedback channels. Using outdated CSI, the transmitter can adapt the coding rates. Since the transmitter cannot adapt the coding rates to the instantaneous channel conditions, we consider the outage performance of secure HARQ protocols. We show how to find the optimal rate-adaptation policies to maximize the secrecy throughput under constraints on outage probabilities. Numerical results for a Rayleigh-fading WTC show that rate adaptation using multilevel feedback provides significant gains in secrecy throughput compared to the non-adaptive model. The fact that the eavesdropper also feeds back information may seem unrealistic, but the obtained results can be understood as an upper limit on the possible secrecy throughput improvements.


WE-L08-5: Embedded Cross-Decoding Scheme for Multiple Description Based Distributed Source Coding

Beerend Ceulemans (Vrije Universiteit Brussel, Belgium); Shahid Satti (Vrije Universiteit Brussel - IBBT, Belgium); Nikos Deligiannis (University College London, United Kingdom); Frederik Verbist (Vrije Universiteit Brussel – IBBT, Belgium); Adrian Munteanu (Vrije Universiteit Brussel, Belgium)

Abstract: Using multiple description (MD) coding mechanisms, this paper proposes a novel coding framework for error-resilience in distributed source coding (DSC) in sensor networks. In particular, scalable source descriptions are first generated using a symmetric scalable MD scalar quantizer. These descriptions are then layered Wyner-Ziv (WZ) coded using low-density parity-check accumulate (LDPCA)-based syndrome binning. The decoder consists of two side decoders which attempt to iteratively decode their respective descriptions at various LDPCA puncturing rates in the presence of correlated side information. A central decoder exploits the inter-description correlation to further enhance the WZ rate-distortion performance when both descriptions are partially or fully received. In contrast to earlier work, our proposed decoding scheme also exploits the correlation that exists between bit-planes. Experimental simulations reveal that, for a Gaussian source, the proposed system yields a performance improvement of roughly 0.66 dB when compared to not exploiting inter-description correlations.



Session WE-L09: Signal Processing for Cognitive Radio Networks (Special Session)


WE-L09-1: Signal Processing Applications for Cognitive Networks: State of the Art

Fabricio Braga Soares de Carvalho (Federal University of Paraiba - UFPB, Brazil); Marcelo Portela Sousa (Federal Institute of Campina Grande (IFPB-CG), Brazil); José Valentim (CETEC/UFRB, Brazil); Jerônimo Silva Rocha (Federal Institute of Paraiba, Brazil); Waslon Terllizzie Araujo Lopes (UFCG - Federal University of Campina Grande, Brazil); Marcelo S. Alencar (Federal University of Campina Grande, Brazil)

Abstract: Cognitive radio is one of the most promising developments of wireless communications, due to its many applications. Cognitive networks have the capability to congregate different cognitive users via cooperative spectrum sensing. Examples of cognitive networks can be found in important and different applications, such as digital television and wireless sensor networks. The objective of this paper is to analyze how signal processing techniques are used to provide reliable performance in such networks. Several applications of signal processing in cognitive networks are presented and detailed.


WE-L09-2: Cognitive Radio System with a Two-User Non-Binary Network-Coded Cooperative Secondary Network

Samuel Mafra (UTFPR, Brazil); Ohara K. Rayel (Federal University of Technology - Parana, Brazil); João Luiz Rebelatto (Federal University of Technology - Parana, Brazil); Richard Demo Souza (Federal University of Technology - Paraná (UTFPR), Brazil)

Abstract: We investigate the performance of a network coding based secondary network in a cognitive radio system under spectrum sharing constraints. The secondary network is composed of two users that cooperate to transmit their information to a common secondary destination. The outage probability is analyzed under a given maximum interference constraint set by the primary network as well as the maximum transmit power limit of the secondary users. Theoretical and numerical results show that the adequate use of network coding by the secondary network can provide significant gains in terms of outage probability and diversity order when compared to non-cooperative or traditional cooperative techniques.


WE-L09-3: A Spectrum Sensing Algorithm Based on Statistic Tests for Cognitive Networks Subject to Fading

Fabricio Braga Soares de Carvalho (Federal University of Paraiba - UFPB, Brazil); Jerônimo Silva Rocha (Federal Institute of Paraiba, Brazil); Waslon Terllizzie Araujo Lopes (UFCG - Federal University of Campina Grande, Brazil); Marcelo S. Alencar (Federal University of Campina Grande, Brazil)

Abstract: Cognitive radio is a viable technology for the next generation of wireless communications. The ability to sense the electromagnetic spectrum and to make vacant bands available to other users has been investigated in the past years. One important issue is the use of an efficient spectrum sensing algorithm to monitor the frequency band occupancy. Usually, the effects of fading are overlooked in the analysis of those algorithms. This paper aims to evaluate the performance of a spectrum sensing algorithm based on the Jarque-Bera test. Rayleigh fading is considered in this paper. Preliminary simulation results are provided to demonstrate the potential of the proposed strategy.
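
As a rough illustration of the kind of normality test named in this abstract, the following Python sketch computes the Jarque-Bera statistic on a block of real-valued samples and flags the band as occupied when the samples deviate from Gaussianity. It is not the authors' algorithm: the function names, the real-valued sample model, the asymptotic chi-square threshold, and the absence of any fading model are all assumptions made for illustration.

```python
import math


def jarque_bera(samples):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(samples)
    mean = sum(samples) / n
    # Central moments of order 2, 3 and 4 (biased estimators).
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def band_occupied(samples, false_alarm=0.05):
    """Hypothetical decision rule: reject 'noise only' if JB is large.

    Assumption: under Gaussian noise only, JB is asymptotically
    chi-square with 2 degrees of freedom, whose tail probability is
    exp(-x / 2), so the threshold is -2 * ln(false_alarm).
    """
    threshold = -2.0 * math.log(false_alarm)
    return jarque_bera(samples) > threshold


# Toy input: a noiseless two-level (BPSK-like) sequence, clearly non-Gaussian.
bpsk_like = [1.0, -1.0] * 300
print(jarque_bera(bpsk_like), band_occupied(bpsk_like))
```

The chi-square approximation is only asymptotic, so for short sensing windows a threshold calibrated by simulation would probably be more appropriate.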


WE-L09-4: Distributed Cognitive Radio Systems with Temperature-Interference Constraints and Overlay Scheme

Javier Zazo (Universidad Politécnica de Madrid, Spain); Santiago Zazo (Universidad Politecnica Madrid, Spain); Sergio Valcarcel Macua (Universidad Politecnica de Madrid (UPM), Spain)

Abstract: Cognitive radio represents a promising paradigm to further increase transmission rates in wireless networks, as well as to facilitate the deployment of self-organized networks such as femtocells. Within this framework, secondary users (SU) may exploit the channel provided that they maintain the quality of service (QoS) of primary users (PU) above a certain level. To achieve this goal, we present a noncooperative game where SU maximize their transmission rates, and may also act as relays of the PU in order to hold their perceived QoS above the given threshold. In the paper, we analyze the properties of the game within the theory of variational inequalities, and provide an algorithm that converges to one Nash Equilibrium of the game. Finally, we present some simulations and compare the algorithm with another method that does not consider SU acting as relays.


WE-L09-5: Voice Segmentation System Based on Energy Estimation

Raissa Rocha (Universidade Federal de Campina Grande, Brazil); Marcelo S. Alencar (Federal University of Campina Grande, Brazil); Virginio Freire (Universidade Federal de Campina Grande, Brazil)

Abstract: Voice segmentation is used in speech recognition and system synthesis, as well as in phonetic voice encoders. This paper describes an implicit speech segmentation system, which aims to estimate the boundaries between phonemes in a locution. To find the segmentation marks, the proposed method initially locates reference borders between silent periods and phonemes, and vice versa, by measuring energy in short duration periods. The phonetic boundaries are found by means of energy encoding in the region delimited by the reference marks, which were initially detected. To evaluate the performance of the proposed system, an objective evaluation using 50 locutions was performed. The system detected 72.41% of the segmentation marks, of which 77.6% were detected with an error less than or equal to 10 ms and 22.4% were found with an error between 10 and 20 ms.
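
The first stage described in this abstract, locating reference borders between silence and speech from short-time energy, can be sketched in Python as follows. This is a minimal illustration and not the authors' system: the non-overlapping framing, the fixed energy threshold, and the function names are assumptions, and the later energy-encoding stage for phonetic boundaries is not covered.

```python
def short_time_energy(signal, frame_len):
    """Energy of consecutive non-overlapping frames of frame_len samples."""
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def reference_borders(signal, frame_len, threshold):
    """Sample indices where frames switch between silent and active.

    Assumption: a frame is 'active' when its energy exceeds a fixed,
    hand-picked threshold.
    """
    energies = short_time_energy(signal, frame_len)
    active = [e > threshold for e in energies]
    borders = []
    for k in range(1, len(active)):
        if active[k] != active[k - 1]:
            borders.append(k * frame_len)
    return borders


# Toy locution: silence, a burst of activity, silence again.
signal = [0.0] * 100 + [0.5, -0.5] * 50 + [0.0] * 100
print(reference_borders(signal, frame_len=10, threshold=0.1))
```

With real speech, the frame length and threshold would need tuning to the sampling rate and noise floor.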



Session WE-L10: Inference and Estimation of Physical Fields Using Sensor Networks (Special Session)


WE-L10-1: Distributed Parameter Estimation with Exponential Family Statistics: Asymptotic Efficiency

Soummya Kar (Carnegie Mellon University, USA); Jose Moura (Carnegie Mellon University, USA)

Abstract: This paper studies the problem of distributed parameter estimation in multi-agent networks with exponential family observation statistics. Conforming to a given inter-agent communication topology, a distributed recursive estimator of the consensus-plus-innovations type is presented in which at every observation sampling epoch the network agents exchange a single round of messages with their communication neighbors and recursively update their local parameter estimates by simultaneously processing the received neighborhood data and the new information (innovation) embedded in the observation sample. Under global observability of the networked sensing model and mean connectivity of the inter-agent communication network, the proposed estimator is shown to yield consistent parameter estimates at each network agent. Furthermore, it is shown that the distributed estimator is asymptotically efficient, in that, the asymptotic covariances of the agent estimates coincide with that of the optimal centralized estimator, i.e., the inverse of the centralized Fisher information rate.


WE-L10-2: Nearest-Neighbor Estimation in Sensor Networks

Stefano Marano (University of Salerno, Italy); Vincenzo Matta (University of Salerno, Italy); Peter Willett (University of Connecticut, USA)

Abstract: This contribution reviews some recent advances in the field of distributed nonparametric regression in sensor networks, with focus on nearest-neighbor (NN) estimation. A network made of spatially distributed sensors and a common fusion center (FC) is deployed for inference purposes. As soon as a new observation variable becomes available at the fusion center, it is broadcast to all the sensors. The sensors, relying upon locally available training samples and upon the received observation variable, each send a message to the FC, from which the final estimate is constructed. The analysis is asymptotic in the limit of large network size and we show that, by means of a suitable ordered transmission policy, only a vanishing fraction of NN messages can be selected, without inter-sensor coordination, yet preserving the consistency of the estimation for both continuous and quantized messages, even in the presence of noisy channels.


WE-L10-3: On-line Detection and Estimation of Gaseous Point Sources Using Sensor Networks

Sérgio Agostinho (ISR - Instituto Superior Técnico, Portugal); Joao Gomes (ISR - Instituto Superior Tecnico, Portugal)

Abstract: The current work tackles the detection and localization of a diffusive point source, based on spatially distributed concentration measurements acquired through a sensor network. A model-based strategy is used, where the concentration field is modeled as a diffusive and advective-diffusive semi-infinite environment. We rely on hypothesis testing for source detection and maximum likelihood estimation for inference of the unknown parameters, providing Cramér-Rao Lower Bounds as benchmark. The (non-convex and multimodal) likelihood function is maximized through a Newton-Conjugate Gradient method, with an applied convex relaxation under steady-state assumptions to provide a suitable source position initialization. Detection is carried out resorting to a Generalized Likelihood Ratio Test. The framework's robustness is validated against a numerically simulated environment generated by the Toolbox of Level Set Methods, which provides data (loosely) consistent with the model.


WE-L10-4: Near-optimal Sensor Placement for Signals Lying in a Union of Subspaces

Dalia El Badawy (EPFL, Switzerland); Juri Ranieri (EPFL, Switzerland); Martin Vetterli (EPFL, Switzerland)

Abstract: Sensor networks are commonly deployed to measure data from the environment and accurately estimate certain parameters. However, the number of deployed sensors is often limited by several constraints, such as their cost. Therefore, their locations must be opportunely optimized to enhance the estimation of the parameters.

In a previous work, we considered a low-dimensional linear model for the measured data and proposed a near-optimal algorithm to optimize the sensor placement. In this paper, we propose to model the data as a union of subspaces to further reduce the number of sensors without degrading the quality of the estimation. Moreover, we introduce a greedy algorithm for the sensor placement for such a model and show the near-optimality of its solution. Finally, we verify with numerical experiments the advantage of the proposed model in reducing the number of sensors while keeping the estimation performance intact.


WE-L10-5: Reconstructing Diffusion Fields Sampled with a Network of Arbitrarily Distributed Sensors

John Murray-Bruce (Imperial College London, United Kingdom); Pier Luigi Dragotti (Imperial College London, United Kingdom)

Abstract: Sensor networks are becoming increasingly prevalent for monitoring physical phenomena of interest. For such wireless sensor network applications, knowledge of node location is important. Although a uniform sensor distribution is common in the literature, it is normally difficult to achieve in reality. Thus we propose a robust algorithm for reconstructing two-dimensional diffusion fields, sampled with a network of arbitrarily placed sensors. The two-step method proposed here is based on source parameter estimation: in the first step, by properly combining the field sensed through well-chosen test functions, we show how Prony's method can reveal locations and intensities of the sources inducing the field. The second step then uses a modification of the Cauchy-Schwarz inequality to estimate the activation time in the single source field. We combine these steps to give a multi-source field estimation algorithm and carry out extensive numerical simulations to evaluate its performance.



Session WE-P1: Audio and Acoustic Signal Processing I


WE-P1-1: Exploring Superframe Co-occurrence for Acoustic Event Recognition

Huy Phan (University of Lübeck, Germany); Alfred Mertins (Institute for Signal and Image Processing, University of Luebeck, Germany)

Abstract: We introduce in this paper a novel concept of using acoustic superframes, a mid-level representation which can overcome the drawbacks of both global and simple frame-level representations for acoustic events. Through superframe-level recognition, we explore the phenomenon of superframe co-occurrence across different event categories and propose an efficient classification scheme that takes advantage of this feature sharing to improve the event-wise recognition power. We empirically show that our recognition system results in 2.4% classification error rate on the RBK-Irst database. This state-of-the-art performance demonstrates the efficiency of the proposed approach. Furthermore, we argue that this representation can greatly facilitate the event detection task compared to its counterparts, e.g. global and simple frame-level representations.


WE-P1-2: ILD Preservation in the Multichannel Wiener Filter for Binaural Hearing Aid Applications

Marcio H Costa (Federal University of Santa Catarina, Brazil); Patrick A Naylor (Imperial College London, United Kingdom)

Abstract: This work presents a new method for noise reduction in binaural hearing aid applications that preserves the interaural level difference. A bounded symmetrical approximation of the logarithm is employed to estimate the interaural level difference, resulting in identical values for symmetrical (left/right) frontal angles. The work proposes a new cost function to be used in association with the multichannel Wiener filter technique to provide a trade-off between noise reduction and distortion of the localization cues. Simulations of a binaural setup and comparisons with a previously developed technique show that the new method improves the signal-to-noise ratio by up to 9.6 dB over the baseline technique, for the same maximum-tolerated binaural-cue distortion.


WE-P1-3: Merging Extremum Seeking and Self-Optimizing Narrowband Interference Canceller - Overdetermined Case

Michał Meller (Gdansk University of Technology, Poland)

Abstract: Active cancellation systems rely on destructive interference to achieve rejection of unwanted disturbances entering the system of interest. Typical practical applications of this method employ a simple single input, single output arrangement. However, when a spatial wavefield (e.g. acoustic noise or vibration) needs to be controlled, multichannel active cancellation systems arise naturally. Among these, the so-called overdetermined control configuration, which employs more measurement outputs than control inputs, is often found to provide superior performance. The paper proposes an extension of the recently introduced control scheme, called self-optimizing narrowband interference canceller (SONIC), to the overdetermined case. The extension employs a novel variant of the extremum-seeking adaptation loop which uses random, rather than sinusoidal, probing signals. This modification simplifies design of the controller and improves its convergence. Simulations, performed using a realistic model of the plant, demonstrate improved properties of the new controller.


WE-P1-4: A Psychoacoustic Model with Partial Spectral Flatness Measure for Tonality Estimation

Armin Taghipour (International Audio Laboratories Erlangen, Germany); Maneesh Jaikumar (Fraunhofer Institute for Integrated Circuits IIS, Germany); Bernd Edler (International Audio Laboratories Erlangen, Germany)

Abstract: Psychoacoustic studies show that the strength of masking is, among others, dependent on the tonality of the masker: the effect of noise maskers is stronger than that of tone maskers. Recently, a Partial Spectral Flatness Measure (PSFM) was introduced for tonality estimation in a psychoacoustic model for perceptual audio coding. The model consists of an Infinite Impulse Response (IIR) filterbank which considers the spreading effect of individual local maskers in simultaneous masking. An optimized (with respect to audio quality and computational efficiency) PSFM is now compared to a similar psychoacoustic model with prediction based tonality estimation in medium (48 kbit/s) and low (32 kbit/s) bit rate conditions (mono) via subjective quality tests. 15 expert listeners participated in the subjective tests. The results are depicted and discussed. Additionally, we conducted the subjective tests with 15 non-expert consumers whose results are also shown and compared to those of the experts.
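
For readers unfamiliar with spectral flatness, the following Python sketch shows the basic measure (geometric mean over arithmetic mean of power values) evaluated on a sub-band of a power spectrum. It is only an illustration: treating "partial" as "restricted to a range of bins", the bin-index interface, and the small `eps` regularizer are assumptions, and the PSFM used in the paper may be defined differently.

```python
import math


def spectral_flatness(power_bins, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the power values.

    Values close to 1 indicate a noise-like (flat) spectrum; values
    close to 0 indicate a tonal (peaky) spectrum. eps avoids log(0).
    """
    n = len(power_bins)
    log_mean = sum(math.log(p + eps) for p in power_bins) / n
    arithmetic_mean = sum(power_bins) / n
    return math.exp(log_mean) / (arithmetic_mean + eps)


def partial_sfm(power_spectrum, lo, hi):
    """Flatness over bins lo..hi-1 only (hypothetical band limits)."""
    return spectral_flatness(power_spectrum[lo:hi])


noise_like = [1.0] * 16              # flat power spectrum
tonal = [1e-6] * 15 + [1.0]          # one dominant bin
print(partial_sfm(noise_like, 0, 16), partial_sfm(tonal, 0, 16))
```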


WE-P1-5: A Novel Decorrelation Approach for an Advanced Multichannel Acoustic Echo Cancellation System

Laura Romoli (Università Politecnica delle Marche, Italy); Stefania Cecchi (Università Politecnica delle Marche, Italy); Danilo Comminiello (Sapienza University of Rome, Italy); Francesco Piazza (Università Politecnica delle Marche, Italy); Aurelio Uncini (University of Rome "La Sapienza", Italy)

Abstract: A multichannel sound reproduction system aims at offering an immersive experience exploiting multiple microphones and loudspeakers. In the case of multichannel acoustic echo cancellation, a suitable solution for overcoming the well-known non-uniqueness problem and an appropriate choice of the adaptive algorithm become essential to improve the audio reproduction quality. In this paper, an advanced system is proposed based on the introduction of a multichannel decorrelation solution exploiting the missing-fundamental phenomenon and a combined multiple-input multiple-output architecture updated by using the multichannel affine projection algorithm. Experimental results proved the effectiveness of the presented framework in terms of objective and subjective measures, providing a suitable solution for echo cancellation.


WE-P1-6: A Novel Method for Selecting the Number of Clusters in a Speaker Diarization System

Paula López Otero (AtlantTIC Research Centre, Multimedia Technologies Group, Universidade de Vigo, Spain); Laura Docio-Fernandez (University of Vigo, Spain); Carmen García Mateo (University of Vigo, Spain)

Abstract: This paper introduces the cluster score (C-score) as a measure for determining a suitable number of clusters when performing speaker clustering in a speaker diarization system. C-score finds a trade-off between intra-cluster and extra-cluster similarities, selecting a number of clusters with cluster elements that are similar to one another but different from the elements in other clusters. Speech utterances are represented by Gaussian mixture model mean supervectors, and the projection of the supervectors into a low-dimensional discriminative subspace by linear discriminant analysis is also assessed. This technique shows robustness to segmentation errors and, compared with the widely used Bayesian information criterion (BIC)-based stopping criterion, results in a lower speaker clustering error and dramatically reduces computation time. Experiments were run using the broadcast news database used for the Albayzin 2010 Speaker Diarization Evaluation.


WE-P1-7: Monitoring Sleep with 40-Hz ASSR

Sahar Javaher Haghighi (University of Toronto, Canada); Dimitrios Hatzinakos (University of Toronto, Canada)

Abstract: The 40-Hz auditory steady state response (ASSR) signals recorded from human subjects during sleep and wakefulness are investigated in this study. The ASSR signals extracted from the stimulated electroencephalogram (EEG) are explored in search of discriminative, noise-robust features. With appropriate features chosen in the time and frequency domains, the performance of linear and quadratic discriminant analysis in classifying signals in different scenarios is studied. While the developed method itself is novel in sleep monitoring, due to similarities between the N3 stage of sleep and anesthesia, the method will pave the way for later analysis on monitoring consciousness with 40-Hz ASSR. The 40-Hz ASSR extraction and noise cancellation methods presented in this paper can also be used for extracting 40-Hz ASSR from its background EEG signal in general.


WE-P1-8: A Broadband Beamformer Using Controllable Constraints and Minimum Variance

Sam Karimian-Azari (Aalborg University, Denmark); Jacob Benesty (INRS-EMT, University of Quebec, Canada); Jesper Rindom Jensen (Aalborg University, Denmark); Mads Græsbøll Christensen (Aalborg University, Denmark)

Abstract: The minimum variance distortionless response (MVDR) and the linearly constrained minimum variance (LCMV) beamformers are two optimal approaches in the sense of minimum output power. The LCMV beamformer can also null out interfering sources using linear constraints at the expense of reducing the degrees of freedom in a limited number of microphones. At the same time, it may magnify the background noise and provide a lower output signal-to-noise ratio (SNR) than the MVDR beamformer. Conversely, the MVDR beamformer has poor performance when interfering sources have spectral correlation with a signal of interest. In this paper, we propose a controllable LCMV (C-LCMV) beamformer to combine the principles of both optimal beamformers with a variable number of linear constraints to achieve a higher output SNR than the LCMV. Simulation results show that the C-LCMV beamformer outperforms the MVDR beamformer in interference rejection, and the LCMV beamformer in background noise reduction.


WE-P1-9: Adaptive Waveforms for Flow Velocity Estimation Using Acoustic Signals

Ion Candel (Grenoble INP, France); Angela Digulescu (Military Technical Academy, Romania); Cornel Ioana (Institute National Polytechnique de Grenoble, France); Gabriel Vasile (French National Council for Scientific Research (CNRS), France)

Abstract: In this paper, we introduce a general framework for waveform design and signal processing, dedicated to the study of turbulent flow phenomena. In a bi-static configuration, by transmitting a specific waveform with a predefined instantaneous frequency law (IFL), within the bounds of the Kolmogorov spectrum, the turbulent media will modify the IFL at the receiving side. We propose a new methodology to estimate this change and to exploit it for velocity estimation using acoustic signals. In this way, the amplitude based velocity estimation techniques can be substituted by non-stationary time-frequency signal processing. This technique proves to be more robust to interference and can provide a more detailed representation of any turbulent environment.


WE-P1-10: Algorithms and Evaluation on Blind Estimation of Reverberation Time

Jens-Alrik Adrian (Jade University of Applied Sciences Oldenburg, Germany); Joerg Bitzer (University of Applied Science Oldenburg, Germany)

Abstract: In this contribution, we propose an algorithm to analyze early and late reverberation in monaural recordings in an off-line processing framework with emphasis on live recordings. Furthermore, the algorithm is evaluated against known state-of-the-art solutions. Our baseline algorithm uses the cepstral mean along signal blocks, an estimation of the reverberation's impulse response and an analysis with respect to its decay characteristics. Further improvements are a cepstral lifter to increase the method's performance by removing nonrelevant cepstral coefficients and a polynomial of third order to map the results onto final estimates. Results indicate larger deviations in the estimated decay times of late reverberations, while estimates for the early decay times are within the JND and deviate only slightly from the true values. State-of-the-art algorithms show small correlation with the true reverberation times.


WE-P1-11: A Restricted Impact Noise Suppressor in Zero Phase Domain

Arata Kawamura (Osaka University, Japan)

Abstract: This paper proposes an impact noise suppression method in the zero phase (ZP) domain. The signal in the ZP domain (ZP signal) is obtained by taking the IDFT of the p-th power of a spectral amplitude. We previously proposed an impact noise suppressor in the ZP domain. This method performs noise reduction in all the segments. Since an impact noise exists only for a short duration, we restrict the noise suppression procedure so that it is not applied to the non-impact noise segments. The restriction is achieved by using the ratio of the first to the second peak values of the ZP signal. In non-impact noise segments, this ratio becomes much larger than one. Thus, we can improve speech quality of the extracted signal when the restriction works well. Simulation results show that the proposed method improves the SNR by about 15 dB for a speech signal mixed with clap noise at an SNR of 0 dB.


WE-P1-12: A Multi-Channel Postfilter Based on the Diffuse Noise Sound Field

Lukas Pfeifenberger (Telecom Industry, Austria); Franz Pernkopf (Technical University Graz, Austria)

Abstract: In this paper, we present the multi-channel Directional-to-Diffuse Postfilter (DD-PF), relying on the assumption of a near-field speech signal embedded in diffuse noise. Our postfilter uses the output of a superdirective beamformer like the Generalized Sidelobe Canceller (GSC), which is projected back to the microphone inputs to estimate the Power Spectral Density (PSD) ratio of the directional-to-diffuse portions of the sound field. This ratio is then used to calculate a Directional-to-Diffuse SNR (DD-SNR), which is used for a noise canceling Wiener filter. In our experiments, we outperform two recent postfilters based on the Transient Beam to Reference Ratio (TBRR) and the Multi-Channel Speech Presence Probability (MCSSP).



Session WE-P2: Design and Implementation of Signal Processing Systems


WE-P2-1: LMS Algorithmic Variants in Active Noise and Vibration Control

Markus Rupp (Vienna University of Technology, Austria); Fabian Hausberg (Audi AG, Germany)

Abstract: In this article we provide analyses of two low-complexity LMS algorithmic variants as they typically appear in the context of Filtered X-Least Mean Square (FXLMS) for active noise or vibration control in which the reference signal is not obtained by sensors but internally generated by the known engine speed. In particular we show that the algorithm with real-valued error is robust and exhibits the same steady state quality as the original complex LMS algorithm but at the expense of only achieving half the learning speed, while its counterpart with real-valued regression vector is equivalent only in the statistical sense.


WE-P2-2: ASR Systems in Noisy Environment: Auditory Features Based on Gammachirp Filter

Hajer Rahali (ENIT, Tunisia); Zied Hajaiej (ENIT, Tunisia); Noureddine Ellouze (ENIT, Tunisia)

Abstract: This paper deals with the analysis of Automatic Speech Recognition (ASR) suitable for use in noisy environments under various conditions. Recent research has shown that auditory features based on the gammachirp filterbank (GF) are promising for improving the robustness of ASR systems against noise. The behavior of parameterization techniques was analyzed from the viewpoint of robustness against noise. This was done for Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), Gammachirp Filterbank Cepstral Coefficient (GFCC) and Gammachirp Filterbank Perceptual Linear Prediction (GF-PLP). GFCC features have shown the best recognition efficiency for clean as well as for noisy databases. GFCC and GF-PLP features are calculated using Matlab and saved in HTK format. Training and testing for speech recognition is done using HTK. The above-mentioned techniques were tested with impulsive signals within AURORA databases.


WE-P2-3: FIR Band-Pass Digital Differentiators with Flat Passband and Equiripple Stopband Characteristics

Takashi Yoshida (Tokyo University of Science, Japan); Yosuke Sugiura (Tokyo University of Science, Japan); Naoyuki Aikawa (Tokyo University of Science, Japan)

Abstract: Maximally flat digital differentiators are widely used as narrow-band digital differentiators because of their high accuracy around their center frequency of flat property. To obtain highly accurate differentiation over narrow-band, it is important to avoid the undesirable amplification of noise. In this paper, we introduce a design method of linear phase FIR band-pass differentiators with flat passband and equiripple stopband characteristics. The center frequency at the passband of the designed differentiators can be adjusted arbitrarily. Moreover, the proposed transfer function consists of two functions, i.e. the passband function and the stopband one. The weighting coefficients of the passband function are derived using a closed-form formula based on Jacobi Polynomial. The weighting coefficients of the stopband function are achieved using Remez alg


WE-P2-4: Conjugate Symmetric Sequency Ordered Walsh Fourier Transform

Soo-Chang Pei (National Taiwan University, Taiwan); Chia Chang Wen (National Taiwan University, Taiwan)

Abstract: A new family of transforms, which is called the conjugate symmetric sequency-ordered generalized Walsh-Fourier transform (CS-SGWFT), is proposed in this paper. The CS-SGWFT generalizes existing transforms, including the conjugate symmetric sequency ordered complex Hadamard transform (CS-SCHT) and the discrete Fourier transform (DFT), which are special cases of the CS-SGWFT. Like the CS-SCHT and the DFT, the spectra of the CS-SGWFT for real input signals are conjugate symmetric so that we need only half the memory to store the transform results. The properties of the CS-SGWFT are similar to those of the CS-SCHT and DFT, including orthogonality, sequency ordering, and conjugate symmetry. Meanwhile, the proposed CS-SGWFT has a radix-2 fast algorithm. Finally, applications of the CS-SGWFT for image noise removal are proposed.


WE-P2-5: Switching Extensible FIR Filter Bank for Adaptive Horizon Size in FIR Filtering

Jung Pak (Korea University, Korea); Choon Ahn (Korea University, Korea); Myo Lim (Korea University, Korea); Yuriy S. Shmaliy (Universidad de Guanajuato, Mexico)

Abstract: In this paper, we propose a novel approach to handle the horizon size of the FIR filter. The proposed approach is an adaptation technique using the FIR filter bank called the switching extensible FIR filter bank (SEFFB). In the SEFFB, the horizon size is constantly adjusted by means of the maximum likelihood strategy. We verify that the SEFFB achieves significant improvement in the performance compared with the single FIR filter which uses the best constant horizon size. Applications are given for the harmonic state space model with and without the uncertainties. A better performance of the proposed SEFFB is demonstrated in a comparison with the minimum variance FIR filter.


WE-P2-6: Compressed Sensing Under Strong Noise. Application to Imaging Through Multiply Scattering Media

Antoine Liutkus (INRIA, Nancy Grand-Est, France); David Martina (Institut Langevin, ESPCI ParisTech, France); Sylvain Gigan (ESPCI, France); Laurent Daudet (Université Paris Diderot, France)

Abstract: Compressive sensing exploits the structure of signals to acquire them with fewer measurements than required by the Nyquist-Shannon theory. However, the design of practical compressive sensing hardware raises several issues. First, one has to elicit a measurement mechanism that exhibits adequate incoherence properties. Second, the system should be robust to noise, whether it be measurement noise, or calibration noise, i.e. discrepancies between theoretical and actual measurement matrices. Third, to improve performance in the case of strong noise, it is not clear whether one should increase the number of sensors, or rather take several measurements, thus settling in the multiple measurement vector scenario (MMV). Here, we first show how measurement matrices may be estimated by calibration instead of being assumed perfectly known, and second that if the noise level reaches a few percent of the signal level, MMV is the only way to sample sparse signals at sub-Nyquist sampling rates.


WE-P2-7: Adaptive Randomized Coordinate Descent for Solving Sparse Systems

Alexandru Onose (Tampere University of Technology, Finland); Bogdan Dumitrescu (Tampere University of Technology, Finland)

Abstract: Randomized coordinate descent (RCD), attractive for its robustness and ability to cope with large scale problems, is here investigated for the first time in an adaptive context. We present an RCD adaptive algorithm for finding sparse least squares solutions to linear systems, in particular for FIR channel identification. The algorithm has low and tunable complexity and, as a special feature, adapts the probabilities with which the coordinates are chosen at each time moment. We show through simulation that the algorithm has tracking properties near those of the best current methods and investigate the trade-offs in the choices of the parameters.
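As a rough illustration of the underlying idea only, the sketch below applies plain randomized coordinate descent to a least squares problem. It is not the adaptive algorithm of the paper: coordinates are drawn uniformly rather than with adapted probabilities, no sparsity is enforced, and the function name `rcd_least_squares` and its parameters are hypothetical.

```python
import random

def rcd_least_squares(A, b, iters=2000, seed=0):
    """Minimize ||A x - b||^2 by exact minimization along one
    uniformly chosen coordinate per iteration (batch, non-adaptive)."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    x = [0.0] * n
    r = list(b)  # residual b - A x, kept up to date incrementally
    col_norm2 = [sum(A[i][j] ** 2 for i in range(m)) for j in range(n)]
    for _ in range(iters):
        j = rng.randrange(n)          # uniform choice (assumption)
        if col_norm2[j] == 0.0:
            continue                  # an all-zero column cannot reduce the residual
        # Optimal step along coordinate j: <a_j, r> / ||a_j||^2
        step = sum(A[i][j] * r[i] for i in range(m)) / col_norm2[j]
        x[j] += step
        for i in range(m):
            r[i] -= step * A[i][j]
    return x

# Small consistent system with a sparse solution [2, 0, -1]
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
b = [2.0, 0.0, -1.0, 1.0]
x_hat = rcd_least_squares(A, b)
```

Each iteration touches a single column of A, which is what keeps the per-step cost low; an adaptive variant would additionally update the selection probabilities over time.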


WE-P2-8: RLS Sparse System Identification Using LAR-based Situational Awareness

Catia Valdman (Universidade Federal do Rio de Janeiro, Brazil); Marcello Campos (Federal University of Rio de Janeiro, Brazil); José Antonio Apolinário Jr. (Military Institute of Engineering (IME), Brazil)

Abstract: In this paper we propose the combination of the recursive least squares (RLS) and the least angle regression (LAR) algorithms for nonlinear system identification. In the application of interest, the model possesses a large number of coefficients, of which only a few are different from zero. We use the LAR algorithm together with a geometrical stopping criterion to establish the number and position of the coefficients to be estimated by the RLS algorithm. The output error is used to indicate model inadequacy and thereby trigger the LAR algorithm. The proposed scheme is capable of modeling intrinsically sparse systems with better accuracy than the RLS algorithm alone, and also with better performance in terms of energy consumption.


WE-P2-9: Low-Power Simplex Ultrasound Communication for Indoor Localization

Alexander Ens (University of Freiburg, Germany); Thomas Janson (University of Freiburg, Germany); Christian Schindelhauer (University of Freiburg, Germany); Leonhard Reindl (IMTEK - Institute for Microsystem Technology, Germany)

Abstract: We propose an ultrasound communication system designed for time difference of arrival (TDOA) based indoor localization. The concept involves an infrastructure of stationary and independent senders tracking mobile receivers. The main goal is pure line-of-sight communication for correct localization. When the reception energy of multipath components is ignored, the transmission range is reduced to 20 meters and more devices are needed to cover the same area (0.03 devices/m^2). Thus, for cost-effectiveness and easy installation, the sender design focuses on low power consumption for long battery life or even energy-independent operation. Moreover, we use the energy-efficient pi/4-DQPSK modulation technique to send 8 data bits in 3.5 ms. An identifier in each message along with the reception time can be used for TDOA localization. The frame synchronization error for a distance of 20 m at 3 dB SNR is 11.2 ns. Thus, at the speed of sound, the distance measurement error is 3.7 micrometers.
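The final figure follows from multiplying the synchronization error by the speed of sound. The abstract does not state which speed of sound was used; the sketch below assumes 343 m/s (air at about 20 °C), which gives roughly 3.8 micrometers, while the quoted 3.7 micrometers corresponds to a value near 330 m/s. The helper name is hypothetical.

```python
def timing_error_to_distance(timing_error_s, speed_of_sound_m_s):
    """Convert a time-of-arrival error into a range error: d = c * t."""
    return timing_error_s * speed_of_sound_m_s

sync_error_s = 11.2e-9           # frame synchronization error from the abstract
c_assumed = 343.0                # m/s, assumed; not given in the abstract
range_error_m = timing_error_to_distance(sync_error_s, c_assumed)

# Speed of sound implied by the quoted 3.7 micrometer error
c_implied = 3.7e-6 / sync_error_s
```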


WE-P2-10: Sub-Nyquist 1 Bit Sampling System for Sparse Multiband Signals

Ning Fu (Harbin Institute of Technology, P.R. China); Liu Yang (Harbin Institute of Technology, P.R. China); Jingchao Zhang (Harbin Institute of Technology, P.R. China)

Abstract: Efficient sampling of wideband analog signals is a hard problem because their Nyquist rates may exceed the specifications of analog-to-digital converters by orders of magnitude. The Modulated Wideband Converter (MWC) is a known method for sampling sparse multiband signals below the Nyquist rate, but precise recovery relies on high-precision quantization of the samples, which may require a large bit-budget. This paper proposes an alternative system that optimizes space utilization by applying a comparator in a sub-Nyquist sampling system. The system first multiplies the signal by a bank of periodic waveforms, and then performs lowpass filtering, sampling and quantization through the comparator, which keeps only the sign information. We also introduce a corresponding algorithm for perfect recovery. The primary design goals are efficient hardware implementation and a low bit-budget. We compare our system with the MWC to demonstrate its advantages under a fixed bit-budget, particularly at low input signal-to-noise ratios.


WE-P2-11: Fast Reconstruction of Nonuniformly Sampled Bandlimited Signal Using Slepian Functions

Dominik Rzepka (AGH University of Science and Technology, Poland); Marek Miskowicz (AGH University of Science and Technology, Poland)

Abstract: In this paper, an algorithm is presented for fast reconstruction of a bandlimited signal from nonuniform samples using a shift-invariant space with a Slepian function as a generator. The motivation for using Slepian functions is that they are bandlimited and most of their energy is concentrated in a finite time interval [-τ,τ]. This allows their truncation in time with controllable error, and reduces the computational complexity of the reconstruction process to O(NL^2), where N is the number of samples and L≈τ. As decreasing τ increases the truncation error, the algorithm offers a tradeoff between speed and accuracy. A simulation example of signal reconstruction is provided.


WE-P2-12: Compressive Blind Source Recovery with Random Demodulation

Ning Fu (Harbin Institute of Technology, P.R. China); Tingting Yao (Harbin Institute of Technology, P.R. China); Xu Hongwei (Harbin Institute of Technology, P.R. China)

Abstract: Distributed Compressive Sensing (DCS) theory effectively reduces the number of measurements of each signal by exploiting both intra- and inter-signal correlation structures. In many fields, only the mixtures of source signals are available for compressive sampling, without prior information on either the source signals or the mixing process. However, the source signals, rather than the mixed signals, are still the quantities of interest. A novel method is proposed in this paper, which directly separates the mixed compressive measurements by first estimating the mixing matrix and then reconstructing the source signals of interest. In addition, since in most situations the source signals are analog, a Random Demodulation (RD) system is introduced to compressively sample the analog signals. We also verify the independence and non-Gaussian property of the compressive measurements. The experimental results show that the proposed method is feasible and that, compared to the underlying method, the estimation accuracy is improved.


WE-P2-13: On the Steady-state and Tracking Analysis of the Complex SRLMS Algorithm

Mohammed Mujahid Ulla Faiz (King Fahd University of Petroleum and Minerals (KFUPM), Saudi Arabia); Azzedine Zerguine (KFUPM, Saudi Arabia)

Abstract: In this paper, the steady-state and tracking behavior of the complex signed regressor least mean square (SRLMS) algorithm are analyzed in stationary and nonstationary environments, respectively. Here, the SRLMS algorithm is analyzed in the presence of complex-valued white and correlated Gaussian input data. Moreover, a comparison between the convergence performance of the complex SRLMS algorithm and the complex least mean square (LMS) algorithm is also presented. Finally, simulation results are presented to support our analytical findings.
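For orientation, the sketch below shows one common form of a complex signed regressor LMS update in a noise-free system identification setting. The conventions are assumptions and may differ from those of the paper: the filter output is taken as the sum of w_k u_k without conjugation, the sign is applied separately to the real and imaginary parts of the regressor, and the names `csgn` and `srlms_identify` are hypothetical.

```python
import random

def csgn(z):
    """Component-wise sign of a complex number: sign(Re z) + j*sign(Im z)."""
    s = lambda v: (v > 0) - (v < 0)
    return complex(s(z.real), s(z.imag))

def srlms_identify(w_true, mu=0.01, iters=5000, seed=1):
    """Identify an FIR system from white complex Gaussian input (no noise)."""
    rng = random.Random(seed)
    n = len(w_true)
    w = [0j] * n          # adaptive filter weights
    u = [0j] * n          # tapped delay line (regressor)
    for _ in range(iters):
        # Shift a new white complex Gaussian sample into the delay line
        u = [complex(rng.gauss(0, 1), rng.gauss(0, 1))] + u[:-1]
        d = sum(wt * uk for wt, uk in zip(w_true, u))      # desired signal
        e = d - sum(wk * uk for wk, uk in zip(w, u))       # a priori error
        # LMS would use conj(u_k); SRLMS replaces it by conj(csgn(u_k))
        w = [wk + mu * e * csgn(uk).conjugate() for wk, uk in zip(w, u)]
    return w

w_hat = srlms_identify([1 + 1j, -0.5j])
```

Replacing the regressor by its sign removes the multiplications by the input samples in the update, which is the usual motivation for this family of algorithms.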


WE-P2-14: OpenCL Parallelization of the HEVC De-Quantization and Inverse Transform for Heterogeneous Platforms

Diego F. de Souza (INESC-ID / IST, Technical University of Lisbon, Portugal); Nuno Roma (INESC-ID, IST, University of Lisbon, Portugal); Leonel A Sousa (INESC-ID / IST, Technical University of Lisbon, Portugal)

Abstract: To tackle the growing demand for highly efficient implementations of video decoders on a vast set of heterogeneous platforms, a high-performance implementation of the HEVC de-quantization and inverse Discrete Cosine Transform (IDCT) modules is proposed. To efficiently take advantage of the several different GPU architectures that are currently available on these platforms, the proposed modules consist of unified OpenCL implementations, allowing their migration and acceleration on any of the available devices of current heterogeneous platforms. To achieve this objective and attain the maximum performance, the memory accesses were highly optimized and no synchronization points were required. The experimental results evaluate the proposed implementation on three different GPUs, achieving processing times as low as 6.39 ms and 6.51 ms for Ultra HD 4K I-type and B-type frames, respectively, corresponding to speedup factors as high as 18.9x and 16.5x over the HEVC Test Model (HM) version 11.0.


WE-P2-15: Analytical Design of Zero-Phase Circular 2D FIR Filters

Radu Matei ("Gh. Asachi" Technical University of Iasi, Romania)

Abstract: This paper proposes a simple and efficient analytical design method for 2D non-recursive filters with circularly-symmetric and zero-phase frequency response. The design is carried out in the frequency domain and is based on prototype filters of two types: maximally-flat and Gaussian-shaped. The 2D FIR filter transfer function results directly in factorized form. Two types of 2D circular filters are considered, namely low-pass filters with a specified bandwidth, flat top and steep transition region, and narrow band-pass filters with a specified peak frequency. Simulation results for the filtering of a biomedical image are also provided, showing the usefulness of these filters in image processing.


WE-P2-16: Adaptive Identification of Sparse Systems Using the SLIM Approach

George-Othon Glentis (University of Peloponnese, Greece)

Abstract: In this paper, a novel time recursive implementation of the Sparse Learning via Iterative Minimization (SLIM) algorithm is proposed, in the context of adaptive system identification. The proposed scheme exhibits fast convergence and tracking ability at an affordable computational cost. Numerical simulations illustrate the achieved performance gain in comparison to other existing adaptive sparse system identification techniques.



Session WE-P3: Audio and Acoustic Signal Processing II


WE-P3-1: Dynamic Range Reduction of Audio Signals Using Multiple Allpass Filters on a GPU Accelerator

Jose A. Belloch (Universidad Politecnica de Valencia, Spain); Julian Parker (Aalto University, Finland); Lauri Savioja (Aalto University, Finland); Alberto Gonzalez (Universidad Politecnica de Valencia, Spain); Vesa Valimaki (Aalto University, Finland)

Abstract: Maximising the loudness of audio signals by restricting their dynamic range has become an important issue in audio signal processing. Previous works indicate that an allpass filter chain can reduce the peak amplitude of an audio signal without introducing the distortion associated with traditional non-linear techniques. Because of the large search space and the resulting computational demand, the previous work selected the delay-line lengths randomly and fixed the filter coefficient values. In this work, we run multiple allpass filter chains in parallel on a GPU accelerator, covering all relevant delay-line lengths and performing a wide search over possible coefficient values in order to get closer to the optimal choice. Our most exhaustive method, which tests about 29 million parameter combinations, reduced the amplitude of test signals by 23% to 31%, whereas the previous work could only achieve a reduction of 23% at best.


WE-P3-2: Near-field Localization of Audio: A Maximum Likelihood Approach

Jesper Rindom Jensen (Aalborg University, Denmark); Mads Græsbøll Christensen (Aalborg University, Denmark)

Abstract: Localization of audio sources using microphone arrays has been an important research problem for more than two decades. Many traditional methods for solving the problem are based on a two-stage procedure: first, information about the audio source, such as time differences-of-arrival (TDOAs) and gain ratios-of-arrival (GROAs) between microphones is estimated, and, second, this knowledge is used to localize the audio source. These methods often have a low computational complexity, but this comes at the cost of a limited estimation accuracy. Therefore, we propose a new localization approach, where the desired signal is modeled using TDOAs and GROAs, which are determined by the source location. This facilitates the derivation of one-stage, maximum likelihood methods under a white Gaussian noise assumption that is applicable in both near- and far-field scenarios. Simulations show that the proposed method is statistically efficient and outperforms state-of-the-art estimators in most scenarios, involving both synthetic and real data.


WE-P3-3: DOA and Pitch Estimation of Audio Sources Using IAA-based Filtering

Jesper Rindom Jensen (Aalborg University, Denmark); Mads Græsbøll Christensen (Aalborg University, Denmark)

Abstract: For decades, it has been investigated how to separately solve the problems of both direction-of-arrival (DOA) and pitch estimation. Recently, it was found that estimating these parameters jointly from multichannel recordings of audio can be extremely beneficial. Many joint estimators are based on knowledge of the inverse sample covariance matrix. Typically, this covariance is estimated using the sample covariance matrix, but for this estimate to be full rank, many temporal samples are needed. In cases with non-stationary signals, this is a serious limitation. We therefore investigate how a recent joint DOA and pitch filtering-based estimator can be combined with the iterative adaptive approach to circumvent this limitation in joint DOA and pitch estimation of audio sources. Simulations show a clear improvement compared to when using the sample covariance matrix and the considered approach also outperforms other state-of-the-art methods. Finally, the applicability of the considered approach is verified on real data.


WE-P3-4: Detecting Sound Objects in Audio Recordings

Anurag Kumar (Carnegie Mellon University, USA); Rita Singh (Carnegie Mellon University, USA); Bhiksha Raj (Carnegie Mellon University, USA)

Abstract: In this paper we explore the idea of defining sound objects and how they may be detected. We try to define sound objects and demonstrate by our experiments the existence of these objects. The major reason for proposing the idea of sound objects is to work with a generic sound concept instead of working with a small set of acoustic events for detection, as is the norm. Our definition tries to conform to notions present in human auditory perception. Our experimental results are promising, and show that the idea of sound objects is worth pursuing and that it could give a new direction to semi-supervised or unsupervised learning of acoustic event detection mechanisms.


WE-P3-5: Towards Fully Uncalibrated Room Reconstruction with Sound

Marco Crocco (Istituto Italiano di Tecnologia (IIT), Italy); Andrea Trucco (University of Genoa, Italy); Vittorio Murino (University of Verona, Italy); Alessio Del Bue (Istituto Italiano di Tecnologia (IIT), Italy)

Abstract: This paper presents a novel approach for room reconstruction using unknown sound signals generated in different locations of the environment. The approach is very general in that it is fully uncalibrated, i.e. the locations of microphones, sound events and room reflectors are not known a priori. We show that, even if this problem implies a highly non-linear cost function, it is still possible to provide a solution close to the global minimum. Synthetic experiments show that the proposed optimization framework can achieve reasonable solutions even in the presence of signal noise.


WE-P3-6: Efficient Representation of Head-Related Transfer Functions in Subbands

Damian Marelli (Newcastle University, Australia); Robert Baumgartner (Austrian Academy of Sciences, Austria); Piotr Majdak (Austrian Academy of Sciences, Austria)

Abstract: Head-related transfer functions (HRTFs) describe the acoustic filtering of incoming sounds by the human morphology. We propose three algorithms for representing HRTFs in subbands, i.e., as an analysis filterbank (FB) followed by a transfer matrix and a synthesis FB. These algorithms can be combined to achieve different design objectives. In the first algorithm, the choice of FBs is fixed, and a sparse approximation procedure minimizes the complexity of the transfer matrix associated to each HRTF. The other two algorithms jointly optimize the FBs and transfer matrices. The first variant aims at minimizing the complexity of the transfer matrices, while the second one does it for the FBs. Numerical experiments show that the proposed methods offer significant computational savings when compared with other available approaches.


WE-P3-7: An Improved Patchwork-Based Digital Audio Watermarking in CQT Domain

Peng Hu (Hohai University, P.R. China); Qin Yan (Hohai University, P.R. China); Luan Dong (Hohai University, P.R. China); Meng Liu (Hohai University, P.R. China)

Abstract: Digital audio watermarking remains one of the hot research topics in multimedia copyright protection. In this paper an improved patchwork-based audio watermarking algorithm in the Constant-Q Transform (CQT) domain is proposed. The advantage of the CQT in music analysis lies in its nonlinear frequency spacing. However, the absence of an exactly invertible transform still prevents the CQT from wide application. In this paper this is overcome by frame pair selection, which is carefully performed according to frame pair energy ratios of the middle frequency range to avoid disturbance of the watermark embedding and subsequent degradation in signal quality. Watermarks are then embedded by modifying the energy of the selected frame pairs. The experimental results indicate that the proposed method outperforms the latest patchwork-based audio watermarking algorithm in the Discrete Cosine Transform (DCT) domain, with better signal quality of the embedded signals and greater robustness to conventional attacks.


WE-P3-8: A New Hybrid Infinite GMM-SVM System for Speaker Verification

Souad Friha (University of Constantine, Algeria); Noura Mansouri (University of Mentouri, Constantine, Algeria); Abdelmalik Taleb Ahmed (University of Valenciennes, Algeria)

Abstract: A new method for text-independent speaker verification that combines Infinite Gaussian Mixture Models (IGMM) with Support Vector Machines (SVM) is described. Infinite GMM supervectors are constructed by stacking the means of the adapted mixture and are then used to train an SVM classifier. This overcomes the problem of fixing the number of Gaussians a priori. Experiments showed a relative gain of about 12% in terms of the Equal Error Rate (EER) and about 59% in terms of the minimum Detection Cost Function (min DCF). Moreover, further improvement in terms of both EER and min DCF can be noticed when time increases, with a lower number of components for performance comparable to GMMs.


WE-P3-9: An Analysis of the Effect of Larynx-Synchronous Averaging on Dereverberation of Voiced Speech

Alastair H Moore (Imperial College London, United Kingdom); Patrick A Naylor (Imperial College London, United Kingdom); Jan Skoglund (Google, Inc., USA)

Abstract: The SMERSH algorithm is a physiologically-motivated approach to low-complexity speech dereverberation. It employs multichannel linear prediction to obtain a reverberant residual signal and subsequent larynx-synchronous temporal averaging to attenuate the reverberation during voiced speech. Experimental results suggest the method is successful but, to date, no detailed analysis of the theoretical basis of the larynx-synchronous averaging has been undertaken. In this paper the SMERSH algorithm is reviewed before focussing on the theoretical basis of its approach. We show that the amount of dereverberation that can be achieved depends on the coherence of reverberation between frames. Simulations show that the extent of dereverberation increases with reverberation time and give an insight into the tradeoff between dereverberation and speech distortion.


WE-P3-10: Speech Enhancement with a Low-Complexity Online Source Number Estimator Using Distributed Arrays

Maja Taseska (International Audio Laboratories Erlangen, Germany); Affan Hasan Khan (International Audio Laboratories Erlangen, Germany); Emanuël Habets (International Audio Laboratories Erlangen, Germany)

Abstract: Enhancement of a desired speech signal in the presence of background noise and interferers is required in various modern communication systems. Existing multichannel techniques often require that the number of sources and their locations are known in advance, which makes them inapplicable in many practical situations. We propose a framework which uses the microphones of distributed arrays to enhance a desired speech signal by reducing background noise and an initially unknown number of interferers. The desired signal is extracted by a minimum variance distortionless response filter in dynamic scenarios where the number of active interferers is time-varying. An efficient, geometry-based approach that estimates the number of active interferers and their locations online is proposed. The overall performance is compared to that of a geometry-based probabilistic framework for source extraction, recently proposed by the authors.


WE-P3-11: Spatio-Temporal Audio Enhancement Based on IAA Noise Covariance Matrix Estimates

Sidsel Marie Nørholm (Aalborg University, Denmark); Jesper Rindom Jensen (Aalborg University, Denmark); Mads Græsbøll Christensen (Aalborg University, Denmark)

Abstract: A method for estimating the noise covariance matrix in a multichannel setup is proposed. The method is based on the iterative adaptive approach (IAA), which only needs short segments of data to estimate the covariance matrix. Therefore, the method can be used for fast varying signals. The method is based on an assumption of the desired signal being harmonic, which is used for estimating the noise covariance matrix from the covariance matrix of the observed signal. The noise covariance estimate is used in the linearly constrained minimum variance (LCMV) filter and compared to an amplitude and phase estimation (APES) based filter. For a fixed number of samples, the performance in terms of signal-to-noise ratio can be increased by using the IAA method, whereas if the filter size is fixed and the number of samples in the APES based filter is increased, the APES based filter performs better.


WE-P3-12: A Montage Approach to Sound Texture Synthesis

Sean O'Leary (IRCAM, France); Axel Roebel (IRCAM, France)

Abstract: In this paper a novel algorithm for sound texture synthesis is presented. The goal of this algorithm is to produce new examples of a given sampled texture, the synthesized textures being of any desired duration. The algorithm is based on a montage approach to synthesis in that the synthesized texture is made up of pieces of the original sample concatenated together in a new sequence, preserving certain structures of the original texture.


WE-P3-13: Robust Feature Extractors for Continuous Speech Recognition

Md. Jahangir Alam (Computer Research Institute of Montreal (CRIM), Canada); Patrick Kenny (CRIM, Canada); Pierre Dumouchel (Ecole de technologie superieure, Canada); Douglas O'Shaughnessy (INRS-Énergie-Matériaux-Télécommunications, Canada)

Abstract: This paper presents robust feature extractors for a continuous speech recognition task in matched and mismatched environments. In the conventional Mel-frequency cepstral coefficient (MFCC) feature extraction framework, a subband spectrum enhancement technique is incorporated to improve its robustness. We denote this front-end as robust MFCCs (RMFCC). Based on the gammatone and compressive gammachirp filterbanks, robust gammatone filterbank cepstral coefficients (RGFCC) and robust compressive gammachirp filterbank cepstral coefficients (RCGCC) are also presented for comparison. We also employ low-variance spectrum estimators, such as multitaper and regularized minimum variance distortionless response (RMVDR), instead of a discrete Fourier transform-based spectrum estimator to improve robustness against mismatched environments. The speech recognition performance of the robust feature extractors is evaluated in clean and multistyle training conditions of the AURORA-4 LVCSR task. Experimental results show that the RMFCC and low-variance spectrum estimator-based robust feature extractors outperform the MFCC, PNCC, and ETSI-AFE features in both clean and multistyle training conditions.


WE-P3-14: Topic Dependent Language Modelling for Spoken Term Detection

Shahram Kalantari (Queensland University of Technology, Australia); David B Dean (Queensland University of Technology, Australia); Sridha Sridharan (Queensland University of Technology, Australia); Roy Wallace (Queensland University of Technology, Australia)

Abstract: This paper investigates the effect of topic dependent language models (TDLM) on phonetic spoken term detection (STD) using dynamic match lattice spotting (DMLS). Phonetic STD consists of two steps: indexing and search. The accuracy of indexing audio segments into phone sequences using phone recognition methods directly affects the accuracy of the final STD system. If the topic of a document is known, recognizing the spoken words and indexing them to an intermediate representation is an easier task and, consequently, detecting a search word in it will be more accurate and robust. In this paper, we propose the use of TDLMs in the indexing stage to improve the accuracy of STD in situations where the topic of the audio document is known in advance. It is shown that using TDLMs instead of the traditional general language model (GLM) improves STD performance according to the figure of merit (FOM) criterion.


WE-P3-15: An Automatic System for Microphone Self-Localization Using Ambient Sound

Simayijiang Zhayida (Lund University, Sweden); Fredrik Andersson (Lund University, Sweden); Yubin Kuang (Lund University, Sweden); Kalle Åström (Lund University, Sweden)

Abstract: In this paper, we describe a system for microphone self-localization based on ambient sound. It is assumed that the microphones and the sound sources can be in unknown general 3D positions. We assume that the microphones are synchronized, although this assumption could be relaxed. There can be multiple, possibly moving, sound sources. The system is based on first detecting and matching features. This produces TDOA data, possibly with missing data and with outliers. Then we use a robust and stratified approach for the parameter estimation. In the first step we use robust techniques to calculate initial estimates of the offset parameters, followed by non-linear optimization based on a rank criterion. Then we use robust methods for calculating initial estimates of the sound source positions and microphone positions, followed by non-linear Maximum Likelihood estimation of all parameters. The methods are tested with both simulated and real data, with promising results.


WE-P3-16: Perceptual Coding-Based Informed Source Separation

Serap Kirbiz (GIPSA-Lab, Grenoble-INP, France); Alexey Ozerov (Technicolor Research & Innovation, France); Antoine Liutkus (INRIA, Nancy Grand-Est, France); Laurent Girin (GIPSA-Lab, Grenoble-INP, France)

Abstract: Informed Source Separation (ISS) techniques enable manipulation of the source signals that compose an audio mixture, based on a coder-decoder configuration. Provided the source signals are known at the encoder, low-bitrate side information is sent to the decoder and enables efficient source separation. Recent research has focused on a Coding-based ISS framework, which has the advantage of encoding the desired audio objects while exploiting their mixture in an information-theoretic framework. Here, we show how the perceptual quality of the separated sources can be significantly improved by inserting perceptual source coding techniques in this framework, achieving a continuum of optimal bitrate-perceptual distortion trade-offs.



Session WE-P4: Signal Estimation and Detection II


WE-P4-1: On Estimation Error Outage for Scalar Gauss-Markov Processes Sent Over Fading Channels

Reza Parseh (Norwegian University of Science and Technology, Norway); Kimmo Kansanen (Norwegian University of Science and Technology, Norway)

Abstract: Measurements of a complex scalar linear Gauss-Markov process are sent over a fading channel. The fading channel is modeled as independent and identically distributed complex normal random variables with known realization at the decoder. The optimal estimator at the decoder is the Kalman filter with random instantaneous gain and error variance. To evaluate the quality of estimation at the receiver, the probability distribution function of the instantaneous estimation error variance and its outage probability are of interest. For the special case of the Rayleigh fading channels, upper and lower bounds for the outage probability are derived which provide insight and simple means for design purposes.


WE-P4-2: Numerical Investigations on the Quasi-Stationary Response of Antennas to Wideband LFMCW Excitation

Markus Gardill (University of Erlangen-Nuremberg, Germany); Dennis Kay (Friedrich-Alexander University of Erlangen-Nuremberg, Germany); Robert Weigel (University of Erlangen-Nuremberg, Germany); Alexander Koelpin (University of Erlangen-Nuremberg, Germany)

Abstract: In this contribution, we numerically investigate whether the quasi-stationary response is a valid approximation for the response of antennas excited by wideband linear-frequency modulated continuous waveforms. We give results for two idealized example systems, showing how the validity of the quasi-stationary response depends on the system's resonant behavior. It is shown that the error between the exact output and the quasi-stationary response is approximately linearly dependent on the sweep rate of the linear-frequency modulated excitation. We then conduct our simulations for a realistic wideband radar system operating from 5 GHz to 8 GHz, using impulse responses extracted from electromagnetic simulations of a dipole and a biconical antenna.


WE-P4-3: Multidimensional Cramér-Rao Lower Bound for Non-uniformly Sampled NMR Signals

Anders Månsson (Lund University, Sweden); Andreas Jakobsson (Lund University, Sweden); Mikael Akke (Lund University, Sweden)

Abstract: In this work, we extend recent results on the Cramér-Rao lower bound for multidimensional non-uniformly sampled Nuclear Magnetic Resonance (NMR) signals. The used signal model is more general than earlier models, allowing for the typically present variance differences between the direct and the different indirect sampling dimensions. The presented bound is verified with earlier presented 1- and R-dimensional bounds as well as with the obtainable estimation accuracy using the statistically efficient non-linear least squares estimator. Finally, the usability of the presented bound is illustrated as a measure of the obtainable accuracy using three different sampling schemes for a real ^{15}N-HSQC NMR experiment.


WE-P4-4: A Low Complexity Coherent CPM Receiver with Modulation Index Estimation

Malek Messai (Télécom Bretagne, France); Frederic Guilloud (Institut Telecom - Telecom Bretagne, France); Karine Amis (Institut TELECOM ; TELECOM Bretagne, France)

Abstract: In this paper we address the problem of low-complexity coherent detection of continuous phase modulation (CPM) signals. We exploit the per-survivor-processing technique to build a reduced-state trellis and apply a Viterbi algorithm with modified metrics. In the case where the modulation index can vary, we propose a maximum-likelihood (ML) estimation of the modulation index and compare the performance of the resulting structure with a state-of-the-art non-coherent receiver structure. Simulations on an additive white Gaussian noise (AWGN) channel for both binary and M-ary CPM show the efficiency of the proposed receiver.


WE-P4-5: Cramer-Rao Bound for Finite Streams of an Arbitrary Number of Pulses

Stephanie Bernhardt (Universite Paris Sud - L2S, France); Rémy Boyer (CNRS, Université Paris-Sud (UPS), Supelec, France); Sylvie Marcos (Laboratoire des Signaux et Systems, Supélec, CNRS UMR, France); Yonina C. Eldar (Technion-Israel Institute of Technology, Israel); Pascal Larzabal (ENS-Cachan, PARIS, France)

Abstract: Sampling a finite stream of filtered pulses violates the bandlimited assumption of the Nyquist-Shannon sampling theory. However, recent low rate sampling schemes have shown that these sparse signals can be sampled with perfect reconstruction at their rate of innovation, which is smaller than the Nyquist rate. To reach this goal in the presence of noise, an estimation procedure is needed to estimate the time-delay and the amplitude of each pulse. To assess the quality of any estimator, it is standard to use the Cramer-Rao Bound (CRB), which provides a lower bound on the Mean Squared Error (MSE) of any unbiased estimator. In this work, analytic expressions of the Cramer-Rao Bound are proposed for an arbitrary number of filtered pulses. Using orthogonality properties on the filtering kernels, an approximate compact expression of the CRB is provided. The choice of kernel is discussed from the point of view of estimation accuracy.


WE-P4-6: Pilot Symbol Assisted TCM Coded System with Transmit Diversity

Emna Ben Slimane (National Engineering School of Tunis, Tunisia); Slaheddine Jarboui (National School of Engineers of Tunis, Tunisia); Ammar Bouallegue (National School of Engineers of Tunis, Tunisia)

Abstract: In this paper, a novel non-coherent multiple-input multiple-output (MIMO) system based on a concatenated inner space-time block code (STBC) and outer multidimensional trellis coded modulation (TCM) encoder is designed for slow time-varying Rayleigh fading channels. We develop a novel MIMO channel estimation algorithm that adopts pilot symbol assisted modulation (PSAM), which has been proven to be effective for fading channels. The proposed concatenated scheme takes advantage of both the high coding gain of its outer coder and the ease of channel estimation offered by the PSAM technique. Simulation results demonstrate the good performance of the proposed non-coherent scheme against its coherent counterpart for the same spatial diversity.


WE-P4-7: Instantaneous Parameters Estimation Algorithm for Noisy AM-FM Oscillatory Signals

Elias Azarov (Belarusian State University of Informatics and Radioelectronics, Belarus); Maxim Vashkevich (Belarusian State University of Informatics and Radioelectronics, Belarus); Alexander Petrovsky (Bialystok Technical University, Poland)

Abstract: The paper addresses the problem of estimating the amplitude envelope and instantaneous frequency of an amplitude and frequency modulated (AM-FM) signal in noisy conditions. The algorithm proposed in the paper utilizes derivatives of the signal and is analogous to the well-known Energy Separation Algorithms (ESA) based on the Teager-Kaiser energy operator (TEO). The formulation of the algorithm is based on Prony's method, which provides estimates of the phase and damping factor as well. Compared to ESA, the proposed algorithm has very close performance for pure oscillatory signals and better performance for signals with additive white noise.


WE-P4-8: Analysis of Coloured Noise in Received Signal Strength Using the Allan Variance

Chunbo Luo (University of the West of Scotland, United Kingdom); Pablo Casaseca-de-la-Higuera (University of the West of Scotland, United Kingdom); Sally I McClean (University of Ulster, Coleraine, United Kingdom); Gerard P. Parr (University of Ulster, United Kingdom); Christos Grecos (Independent Consultant, United Kingdom)

Abstract: The received signal strength (RSS) of wireless signals has been widely used in communications, localization and tracking. Theoretical modelling and practical applications often make a white noise assumption when dealing with RSS measurements. However, as we will show in this paper, the noise present in RSS measurements has time-dependency properties. In order to study these properties and provide noise characterisation, we propose to use the Allan Variance (AVAR) and show its better performance in comparison with direct analysis in the frequency domain using the periodogram. We further study the contribution of each noise component by testing real RSS data. Our results confirm that the noise associated with RSS signals is actually coloured and demonstrate the appropriateness of AVAR for the identification and characterisation of these components.
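
As a rough illustration of the quantity named in this abstract, the following is a minimal sketch of the textbook non-overlapping Allan variance, applied directly to a series of samples. It is not taken from the paper: the function name `allan_variance` and the cluster-size parameter `m` are hypothetical, and the authors' exact estimator and preprocessing of RSS data may differ.

```python
def allan_variance(samples, m):
    """Non-overlapping Allan variance of `samples` for cluster size `m`.

    Assumption: `samples` is an evenly spaced series (for example RSS
    readings); this is the generic textbook estimator, not the paper's code.
    """
    if m < 1:
        raise ValueError("cluster size m must be at least 1")
    n_clusters = len(samples) // m
    if n_clusters < 2:
        raise ValueError("need at least two full clusters")
    # Average each block of m consecutive samples.
    means = [sum(samples[i * m:(i + 1) * m]) / m for i in range(n_clusters)]
    # Half the mean squared difference between adjacent block averages.
    squared_diffs = sum((b - a) ** 2 for a, b in zip(means, means[1:]))
    return squared_diffs / (2.0 * (n_clusters - 1))


if __name__ == "__main__":
    alternating = [0.0, 1.0] * 8
    for m in (1, 2, 4):
        print(m, allan_variance(alternating, m))
```

Evaluating the function over a range of cluster sizes and plotting the result on log-log axes is the usual way to distinguish white noise from coloured components, since each noise type produces a different slope.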


WE-P4-9: Covariance Estimation From Compressive Measurements Using Alternating Minimization

José Bioucas Dias (Technical University Lisbon / Instituto de Telecomunicacoes Lisbon, Portugal); Deborah Cohen (Technion - Israel Institute of Technology, Israel); Yonina C. Eldar (Technion-Israel Institute of Technology, Israel)

Abstract: The estimation of covariance matrices from compressive measurements has recently attracted considerable research efforts in various fields of science and engineering. Owing to the small number of observations, the estimation of the covariance matrices is a severely ill-posed problem. This can be overcome by exploiting prior information about the structure of the covariance matrix. This paper presents a class of convex formulations and respective solutions to the high-dimensional covariance matrix estimation problem under compressive measurements, imposing either Toeplitz, sparseness, null-pattern, low rank, or low permuted rank structure on the solution, in addition to positive semi-definiteness. To solve the optimization problems, we introduce the CoVariance by Augmented Lagrangian Shrinkage Algorithm (CoVALSA), which is an instance of the Split Augmented Lagrangian Shrinkage Algorithm (SALSA). We illustrate the effectiveness of our approach in comparison with state-of-the-art algorithms.


WE-P4-10: Computational Cost of Chirp Z-transform and Generalized Goertzel Algorithm

Pavel Rajmic (Brno University of Technology, Czech Republic); Zdenek Prusa (Austrian Academy of Sciences, Austria); Christoph Wiesmeyr (NuHAG - University of Vienna, Austria)

Abstract: Two natural competitors in the area of narrow-band spectrum analysis, namely the Chirp Z-transform (CZT) and the Generalized Goertzel algorithm (GGA), are taken and compared, with the focus on the computational cost. We present results showing that for real-input data, the GGA is preferable over the CZT in a range of practical situations. This is shown both in theory and in practice.
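
To make the comparison in this abstract more concrete, the following is a minimal sketch of a Goertzel-type recursion that evaluates a single spectral value at a possibly non-integer frequency index `k`, checked against a direct summation. It is an illustration under stated assumptions, not the authors' implementation: the function names `generalized_goertzel` and `direct_dtft` are hypothetical, and no cost measurements from the paper are reproduced.

```python
import cmath
import math


def generalized_goertzel(x, k):
    """Return sum_n x[n] * exp(-2j*pi*k*n/N) for a real-valued index k.

    Assumption: second-order Goertzel recursion with a final phase
    correction, which is what permits non-integer k.
    """
    n_samples = len(x)
    if n_samples == 0:
        raise ValueError("x must be non-empty")
    omega = 2.0 * math.pi * k / n_samples
    coeff = 2.0 * math.cos(omega)
    s1 = 0.0
    s2 = 0.0
    # Main loop over all samples except the last; for real input this
    # uses real arithmetic only.
    for sample in x[:-1]:
        s0 = sample + coeff * s1 - s2
        s2 = s1
        s1 = s0
    # Final step uses the last sample, followed by the complex finalization.
    s0 = x[-1] + coeff * s1 - s2
    y = s0 - s1 * cmath.exp(-1j * omega)
    # Phase correction; it equals 1 only when k is an integer.
    return y * cmath.exp(-1j * omega * (n_samples - 1))


def direct_dtft(x, k):
    """Reference value computed by direct summation."""
    n_samples = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * n / n_samples)
               for n, v in enumerate(x))


if __name__ == "__main__":
    data = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 0.0, 3.0]
    for k in (1, 2.37):
        print(k, generalized_goertzel(data, k), direct_dtft(data, k))
```

The recursion costs one real multiplication per sample for each frequency of interest, so it is attractive when only a few spectral values are needed; the abstract's claims about when it beats the CZT rest on the paper's own analysis.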


WE-P4-11: High Resolution Stacking of Seismic Data

Marcos Covre (Universidade Estadual de Campinas, Brazil); Tiago Barros (University of Campinas, Brazil); Renato R Lopes (University of Campinas, Brazil)

Abstract: The stacking procedure is a key part of seismic processing. Historically, this part of the processing is done using seismic acquisition data (traces) with common features such as the common midpoint between source and receiver. These traces are combined to construct an ideal trace where the source and receiver are virtually placed at the same location. Traditional stacking only performs a simple sum of the traces. This work proposes a different way to perform the stacking, which uses the singular value decomposition of a data matrix to create an eigenimage where the noise and interferences are attenuated. The proposed technique is called Eigenstacking. Results of stacking and Eigenstacking are compared using synthetic and real data.


WE-P4-12: Canceling Stationary Interference Signals Exploiting Secondary Data

Johan Svaerd (Lund University, Sweden); Andreas Jakobsson (Lund University, Sweden)

Abstract: In this paper, we propose a novel interference cancellation method that exploits secondary data to estimate stationary interference components present in both the primary and the secondary data sets, thereby allowing for the removal of such interference from the data sets, even when these components share frequencies with the signal of interest. The algorithm estimates the interference components one frequency at a time, which yields a computationally efficient algorithm that requires only a limited amount of secondary data. Numerical examples using both simulated and measured data show that the proposed method offers a notable gain in performance as compared to other interference cancellation methods.


WE-P4-13: Multitaper Estimation of the Coherence Spectrum in Low SNR

Johan Brynolfsson (Lund University, Sweden); Maria Sandsten (Lund University, Sweden)

Abstract: A pseudo coherence estimate using multitapers is presented. The estimate has better localization for sinusoids and is shown to have lower variance for disturbances compared to the usual coherence estimator. This makes it superior in terms of finding coherent frequencies between two sinusoidal signals, even when observed in low SNR. Different sets of multitapers are investigated and the weights of the final coherence estimate are adjusted for a low-biased estimate of a single sinusoid. The proposed method is more computationally efficient than parametric methods and still gives comparable results.


WE-P4-14: Novel Radar Signal Models Using Nonlinear Frequency Modulation

Sebastian Alphonse (Illinois Institute of Technology, Chicago, USA); Geoffrey A Williamson (Illinois Institute of Technology, USA)

Abstract: Two new radar signal models using nonlinear frequency modulation are proposed and investigated with respect to enhancing the target's range estimation and reducing the sidelobe level. The performance of the proposed signal models is compared to the currently popular linear frequency modulation signal model. The Cramer Rao Lower Bound along with main lobe width and the peak to sidelobe ratio are used for comparing the signal models to show that better range accuracy and smaller sidelobes can be achieved with the proposed signal models.


WE-P4-15: Enhanced Joint Data Detection and Turbo MAP Channel Estimation Using Randomly Rotated Constellations

Nejah Missaoui (ENIS, Tunisia); Inès Kammoun (ENIS, Tunisia); Mohamed Siala (Sup'Com, Tunisia)

Abstract: In this paper, we propose joint data detection and turbo maximum-a-posteriori (MAP) time-varying channel estimation in Slotted ALOHA MIMO systems using rotated constellation diversity. Our main idea is to use a rotated and an unrotated constellation, together with coding and interleaving, for each user in order to increase the diversity order and to improve the collision resolution at the receiver side. The proposed burst-by-burst turbo-MAP channel estimator is based on the Space-Alternating Generalized Expectation-Maximization (SAGE) algorithm. Our proposed approach allows an efficient separation of colliding packets even if they are received with equal powers. Simulation results are given to support our claims.



Session TH-L01: DOA Estimation


TH-L01-1: Direction-of-Arrival Estimation Using Multi-frequency Co-prime Arrays

Elie Bou Daher (Villanova University, USA); Yong Jia (University of Electronic Science and Technology of China, P.R. China); Fauzia Ahmad (Villanova University, USA); Moeness G. Amin (Villanova University, USA)

Abstract: In this paper, we present a new method for increasing the number of resolvable sources in direction-of-arrival estimation using co-prime arrays. This is achieved by utilizing multiple frequencies to fill in the missing elements in the difference coarray of the co-prime array corresponding to the reference frequency. For high signal-to-noise ratio, the multi-frequency approach effectively utilizes all of the degrees-of-freedom offered by the coarray, provided that the sources have proportional spectra. The performance of the proposed method is evaluated through numerical simulations.


TH-L01-2: DOA Estimation and Signal Separation Using Antenna with Time Varying Response

Gregory Dvorkind (Rafael, Israel); Eran Greenberg (RAFAEL, Israel)

Abstract: In this paper we suggest a new algorithm for DOA estimation and signal separation using a new antenna element with a time-variant radiation pattern. With the suggested approach, signals arriving from various spatial directions are acquired in this sensor with different time-varying signatures, due to the continuously changing radiation pattern of the antenna. We show that if the radiation pattern is varied periodically and sufficiently fast compared to the bandwidth of the received signals, then multiple sources of radiation can be estimated and their directions of arrival detected. The suggested approach is a novel alternative to classical array-based signal processing, which allows spatial processing tasks to be performed without exploiting multiple elements.


TH-L01-3: Three CS-based Beamformers for Single Snapshot DOA Estimation

Stefano Fortunati (University of Pisa, Italy); Raffaele Grasso (CMRE, Italy); Fulvio Gini (University of Pisa, Italy); Maria S. Greco (University of Pisa, Italy); Kevin LePage (CMRE, Italy)

Abstract: In this work, the estimation of the Directions of Arrival (DOAs) of multiple source signals from a single observation vector is considered. In particular, the estimation, detection and super-resolution performance of three algorithms based on the theory of Compressed Sensing (the classical l1-minimization or LASSO, the fast smooth l0-minimization and the SPICE algorithm) are analyzed and compared with the classical Fourier beamformer. This comparison is carried out using both simulated data and real sonar data.


TH-L01-4: Robust High-Resolution DOA Estimation with Array Pre-Calibration

Christian Weiss (Darmstadt University of Technology, Germany); Abdelhak M Zoubir (Darmstadt University of Technology, Germany)

Abstract: A robust high-resolution technique for DOA estimation in the presence of array imperfections such as sensor position errors and non-uniform sensor gain is presented. When the basis matrix of a sparse DOA estimation framework is derived from an ideal model, array errors cannot be handled, which causes performance deterioration. Array pre-calibration via robust steering vector estimation yields an improved overcomplete basis matrix. It alleviates the delicate problem of selecting the regularization parameter of the optimization problem and improves the performance significantly. Thus, closely spaced sources can be resolved in the presence of severe array imperfections, even at low SNRs.


TH-L01-5: Joint DOA and Multi-Pitch Estimation Via Block Sparse Dictionary Learning

Ted Kronvall (Lund University, Sweden); Stefan I Adalbjörnsson (Lund University, Sweden); Andreas Jakobsson (Lund University, Sweden)

Abstract: In this paper, we introduce a novel sparse method for joint estimation of the directions of arrival (DOAs) and pitches of a set of multi-pitch signals impinging on a sensor array. Extending earlier approaches, we formulate a novel dictionary learning framework from which an estimate is formed without making assumptions on the model orders. The proposed method alternates between a block sparse approach to estimate the pitches, using an alternating direction method of multipliers framework, and a nonlinear least squares approach to estimate the DOAs. The preferable performance of the proposed algorithm, as compared to earlier methods, is shown using numerical examples.



Session TH-L02: Image and Video Analysis I


TH-L02-1: Per-Pixel Mirror-Based Acquisition Method for Video Compressive Sensing

Jonathan Lima (University of Brasília, Brazil); Cristiano Miosso (University of Brasília, Brazil); Mylene Q Farias (University of Brasilia (UnB), Brazil)

Abstract: In this paper, we propose a new theoretical method for acquiring linear measures for compressive sensing reconstruction of high-speed videos from low-speed cameras: the Per-Pixel Mirror-Based (PPM) acquisition method. The proposed technique, unlike other techniques in literature, is light efficient and generates time-independent samples. In simulated tests, we compare the reconstruction results of PPM with other techniques available in the literature, using natural and synthetic videos. The PPM method shows better quantitative and qualitative results.


TH-L02-2: Accurate Image Registration Using Approximate Strang-Fix and an Application in Super-Resolution

Adam Scholefield (Imperial College London, United Kingdom); Pier Luigi Dragotti (Imperial College London, United Kingdom)

Abstract: Accurate registration is critical to most multi-channel signal processing setups, including image super-resolution. In this paper we use modern sampling theory to propose a new robust registration algorithm that works with arbitrary sampling kernels. The algorithm accurately approximates continuous-time Fourier coefficients from discrete-time samples. These Fourier coefficients can be used to construct an over-complete system, which can be solved to approximate translational motion to an accuracy of around one hundredth of a pixel. The over-completeness of the system provides robustness to noise and other modelling errors. For example, we show an image registration result for images that have slightly different backgrounds, due to a viewpoint translation. Our previous registration techniques, based on similar sampling theory, can provide a similar accuracy but not under these more general conditions. Simulation results demonstrate the accuracy and robustness of the approach and its potential applications in image super-resolution.


TH-L02-3: Numerically Stable Estimation of Scene Flow Independent of Brightness and Regularizer Weights

Yusuke Kameda (Tokyo University of Science, Japan); Ichiro Matsuda (Tokyo University of Science, Japan); Susumu Itoh (Tokyo University of Science, Japan)

Abstract: In video images, apparent motions can be computed using optical flow estimation. However, estimation of the depth directional velocity is difficult using only a single viewpoint. Scene flows (SF) are three-dimensional (3D) vector fields with apparent motion and a depth directional velocity field, which are computed from stereo video. The 3D motion of objects and a camera can be estimated using SF, so SF is used for obstacle detection and self-localization. SF estimation methods require the numerical computation of nonlinear equations to prevent over-smoothing due to the regularization of SF. Since the numerical stability depends on the image and regularizer weights, it is impossible to determine appropriate values for the weights. Thus, we propose a method that is independent of the images and weights, which simplifies previous methods and derives the numerical stability conditions, thereby facilitating the estimation of suitable weights. We also evaluate the performance of the proposed method.


TH-L02-4: Online Learning Partial Least Squares Regression Model for Univariate Response Data

Lei Qin (Université de Technologie de Troyes, France); Hichem Snoussi (University of Technology of Troyes, France); Fahed Abdallah (Heudiasyc UMR CNRS 6599, University of Technology of Compiegne, France)

Abstract: The partial least squares (PLS) analysis has attracted increasing attention in image and video processing. However, most PLS methods currently used are of batch form, which requires maintaining previous training data and re-training the model when new samples are available. In this work, we propose a novel approach that is able to incrementally update the PLS model using new data. The incremental approach has the appealing property of constant computational and space complexities. Two extensions of the model updating method are proposed as well. First, we extend the method to be able to decrementally update the model when some training samples are removed. Second, we develop a weighted extension, where different weights can be assigned to the training data blocks when updating the model. Experiments on real image data confirmed the effectiveness and the advantages of the proposed methods.


TH-L02-5: Smoke Detection Using Spatio-Temporal Analysis, Motion Modeling and Dynamic Texture Recognition

Panagiotis Barmpoutis (Centre for Research and Technology Hellas, Greece); Kosmas Dimitropoulos (Centre for Research and Technology Hellas, Informatics and Telematics Institute, Greece); Nikos Grammalidis (Centre for Research and Technology Hellas, Greece)

Abstract: In this paper, we propose a novel method for video-based smoke detection, which aims to discriminate smoke from smoke-colored moving objects by applying spatio-temporal analysis, smoke motion modeling and dynamic texture recognition. Initially, candidate smoke regions in a frame are identified using background subtraction and color analysis based on the HSV model. Subsequently, spatio-temporal smoke modeling consisting of spatial energy analysis and spatio-temporal energy analysis is applied in the candidate regions. In addition, histograms of oriented gradients and optical flows (HOGHOFs) are computed to take into account both appearance and motion information, while dynamic texture recognition is applied in each candidate region using linear dynamical systems and a bag of systems approach. Dynamic score combination by mean value is finally used to determine whether there is smoke or not in each candidate image region. Experimental results presented in the paper show the great potential of the proposed approach.



Session TH-L03: Estimation and Detection


TH-L03-1: Robust Hypothesis Testing with Squared Hellinger Distance

Gökhan Gül (Darmstadt University of Technology, Germany); Abdelhak M Zoubir (Darmstadt University of Technology, Germany)

Abstract: We extend an earlier work of the same authors, which proposes a minimax robust hypothesis testing strategy between two composite hypotheses based on a squared Hellinger distance. We show that, without any further restrictions, the former four non-linear equations in four parameters that have to be solved to design the robust test can be reduced to two equations in two parameters. Additionally, we show that the same equations can be combined into a single equation if the nominal probability density functions satisfy the symmetry condition. The parameters controlling the degree of robustness are bounded from above depending on the nominal distributions and are shown to be determined by solving a polynomial equation of degree two. Experiments justify the benefits of the proposed contributions.


TH-L03-2: Exploiting Time and Frequency Information for Delay/Doppler Altimetry

Abderrahim Halimi (University of Toulouse, France); Corinne Mailhes (University of Toulouse, France); Jean-Yves Tourneret (University of Toulouse, France); Thomas Moreau (Collecte Localisation Satellite (CLS), France); Francois Boy (Centre National d'Etudes Spatiales (CNES), France)

Abstract: Delay/Doppler radar altimetry is a new technology that has been receiving increasing interest, especially since the launch of the first altimeter in 2010. It aims at reducing the measurement noise and increasing the along-track resolution in comparison with conventional pulse limited altimetry. A new semi-analytical model with five parameters has been recently introduced for this new technology. However, two of these parameters are highly correlated, resulting in poor estimation performance when estimating all parameters. This paper proposes a new strategy improving estimation performance for delay/Doppler altimetry. The proposed strategy exploits all the information contained in the delay/Doppler domain. A comparison with other classical algorithms (using the temporal samples only) shows the gain in estimation performance obtained when using both temporal and Doppler (frequency) data.


TH-L03-3: Identification of Power Line Outages

Shay Maymon (Technion, Israel); Yonina C. Eldar (Technion-Israel Institute of Technology, Israel)

Abstract: This paper considers the problem of identifying power line outages throughout an electric interconnection based on changes in phasor angles observed at a limited number of buses. In existing approaches for solving the line outage identification problem, the unobserved phasor angle data is ignored and identification is based on the observed phasor angles extracted from the data. We propose, instead, a least-squares approach for estimating the unobserved phasor angles, which is shown to yield a solution to the line outage identification problem that is equivalent to the solution obtained with existing approaches. This equivalence suggests an implementation of the solution to the line outage identification problem that is computationally more efficient than previous methods. A natural extension of the least-squares formulation leads to a generalization of the line outage identification problem in which the grid parameters are unknown.


TH-L03-4: Nonparametric Density Estimation with Region-Censored Data

Youssef Bennani (Laboratoire I3S, France); Luc Pronzato (Laboratoire I3S, France); Maria Joao Rendas (Centre Nacional Recherche Scientifique, France)

Abstract: The paper claims that nonparametric estimation from region-censored observations, for which use of the Maximum Likelihood criterion may lead to counter-intuitive and non-unique solutions, is best solved in the framework of constrained Maximum Entropy estimation, in particular in the context of population studies, as is the case for the problem addressed here. However, the link between constrained Maximum Entropy and Maximum Likelihood estimation for the exponential family, which holds under the assumptions considered by previous studies, is necessarily lost for censored observations. We present estimators enabling the determination of density estimates that are at the same time physically plausible and have a good fit to the observed data, and illustrate their application to real data (hyperbaric diving).


TH-L03-5: Uniformly Most Powerful Detection for Integrate and Fire Time Encoding

Lionel Fillatre (Université de Nice Sophia Antipolis, France); Igor Nikiforov (Université de Technologie de Troyes, UTT/STMR/LM2S, France); Marc Antonini (Université de Nice Sophia Antipolis, France); Abdourrahmane M Atto (LISTIC, University of Savoie, Polytech Annecy-Chambéry, France)

Abstract: A time encoding of a random signal is a representation of this signal as a random sequence of strictly increasing times. The goal of this paper is to derive a rule for testing the mean value of a Gaussian signal from asynchronous samples given by the Integrate and Fire (IF) time encoding. The optimal likelihood ratio test is calculated and its statistical performance is compared with that of a synchronous test based on regular samples of the Gaussian signal. Since the detector based on IF samples takes a decision at a random time, the test based on regular samples exploits a random number of samples. The time encoding significantly reduces the number of samples needed to satisfy a prescribed probability of detection.



Session TH-L04: Digital Audio Processing for Loudspeakers and Headphones 1 (Special Session)


TH-L04-1: A Geometrical Approach to Room Compensation for Sound Field Rendering Applications

Antonio Canclini (Politecnico di Milano, Italy); Dejan Marković (Politecnico di Milano, Italy); Lucio Bianchi (Politecnico di Milano, Italy); Fabio Antonacci (Politecnico di Milano, Italy); Augusto Sarti (Politecnico di Milano, Italy); Stefano Tubaro (Politecnico di Milano, Italy)

Abstract: In this paper we propose a method for reducing the impact of room reflections in sound field rendering applications. Our method is based on the modeling of the acoustic paths (direct and reflected) from each of the loudspeakers of the rendering system to a set of control points in the listening area. From such models we derive a propagation matrix and compute its least-squares inversion. Due to its relevant impact on the spatial impression, we focus on the early reflections part of the Room Impulse Response, which is conveniently estimated using the fast beam tracing modeling engine. A least squares problem is formulated in order to derive the compensation filter. We also demonstrate the robustness of the proposed solution against errors in geometric measurement of the hosting environment.


TH-L04-2: Adaptive Stabilization of Electro-dynamical Transducers

Wolfgang Klippel (Klippel GmbH, Germany)

Abstract: A new control technique for electro-dynamical transducers is presented which stabilizes the voice coil position, compensates for nonlinear distortion and generates a desired transfer response by preprocessing the electrical input signal. The control law is derived from transducer modeling using lumped elements and identifies all free parameters of the model by monitoring the electrical signals at the transducer terminals. The control system stays operative for any stimulus including music and other audio signals. The active stabilization is important for small loudspeakers generating the acoustical output at maximum efficiency. The adaptive control algorithm presented in this paper has been illustrated on a moving coil transducer but the same approach can also be applied to other transduction principles such as the balanced armature transducer used in hearing aids and in-ear phones.


TH-L04-3: Breaking Down the Cocktail Party: Capturing and Isolating Sources in a Soundscape

Anastasios Alexandridis (FORTH/University of Crete, Greece); Anthony Griffin (FORTH/University of Crete, Greece); Athanasios Mouchtaris (Foundation for Research and Technology-Hellas, Greece)

Abstract: Spatial scene capture and reproduction requires extracting directional information from captured signals. Our previous work focused on directional coding of a sound scene using a single microphone array. In this paper, we investigate the benefits of using multiple microphone arrays, and extend our previous method by allowing arrays to cooperate during spatial feature extraction. We can thus render the sound scene using both direction and distance information and selectively reproduce specific "spots" of the captured sound scene.


TH-L04-4: An Allpass Hear-Through Headset

Jussi Ramo (Aalto University, Finland); Vesa Valimaki (Aalto University, Finland)

Abstract: In augmented reality audio applications, a headset must allow the simultaneous hearing of ambient and reproduced sounds. In order to create a natural hear-through experience when wearing the headset, the acoustic attenuation of the headset itself must be cancelled. This is obtained by processing the ambient sound signals captured by external microphones. The sound perceived by the headset user will then be a mixture of the ambient sound that leaks through the headset and the processed ambient sound that is reproduced with the headset. We propose a new equalization method for designing such a hear-through system based on an allpass design principle. The proposed method takes the frequency-dependent isolation transfer function of the headset as the input and completes it with an engineered transfer function so that the outcome will be an allpass transfer function with a flat magnitude response.


TH-L04-5: An Advanced Spatial Sound Reproduction System with Listener Position Tracking

Stefania Cecchi (Università Politecnica delle Marche, Italy); Andrea Primavera (Università Politecnica delle Marche, Italy); Marco Virgulti (Università Politecnica delle Marche, Italy); Ferruccio Bettarelli (Leaff Engineering, Italy); Francesco Piazza (Università Politecnica delle Marche, Italy)

Abstract: The paper deals with the development of a real time system for the reproduction of an immersive audio field considering the listeners' position. The system is composed of two parts: a sound rendering system based on a crosstalk canceller that is required in order to have a spatialized reproduction and a listener position tracking system in order to model the crosstalk canceller parameters. Therefore, starting from the free-field model, a new model is considered introducing a directivity function for the loudspeakers and considering a three-dimensional environment. A real time application is proposed introducing a Kinect control, capable of accurately tracking the listener position and changing the crosstalk parameters. Several results are presented comparing the proposed approach with the state of the art in order to confirm its validity.



Session TH-L05: Physical Layer Network Coding (Special Session)


TH-L05-1: Physical Layer Network Coding: An Outage Analysis in Cellular Network

Hironori Fukui (OKI Electric Industry Co., Ltd., Japan); Petar Popovski (Aalborg University, Denmark); Hiroyuki Yomo (Kansai University, Japan)

Abstract: Physical layer network coding (PLNC) has been proposed to improve the throughput of the two-way relay channel, in which two nodes communicate with each other with the assistance of a relay node. Most of the works related to PLNC focus on a simple three-node model where all the nodes are placed deterministically, and they do not take into account the impact of interference. Unlike these conventional studies, in this paper, we apply PLNC to a large-scale cellular network in the presence of intercell interference (ICI). In cellular networks, a terminal and a Base Station (BS) have different transmission powers, so ICI affects the downlink (DL) and uplink (UL) phases differently. We theoretically derive the outage probability with a tractable approach. Using the numerical results obtained, we discuss how the interference and the difference in transmission power affect the outage probability achieved by PLNC.


TH-L05-2: A Full Cooperative Diversity Beamforming Scheme in Two-way Amplify-and-Forward Relay Systems

Zhongyuan Zhao (Beijing University of Posts and Telecommunications, P.R. China); Zhiguo Ding (Newcastle University, United Kingdom); Mugen Peng (Beijing University of Posts and Telecommunications, P.R. China); H. Vincent Poor (Princeton University, USA)

Abstract: This paper considers a simple two-way relaying channel in which two single-antenna sources exchange information via a multiple-antenna relay. For such a scenario, all the existing approaches that can achieve full cooperative diversity order are based on antenna/relay selection, and the difficulty in designing the relay beamformer lies in the fact that a single beamformer needs to serve two destinations simultaneously. In this paper, a new full cooperative diversity beamforming scheme is proposed that ensures that the relay signals are coherently combined at both destinations. Both analytical and numerical results are provided to demonstrate the performance gains achieved by the proposed scheme.


TH-L05-3: Adaptive Broadcast Transmission in Distributed Two-Way Relaying Networks

Dirk Wübben (University of Bremen, Germany); Meng Wu (University of Bremen, Germany); Armin Dekorsy (University of Bremen, Germany)

Abstract: In this paper we consider adaptive, distributed two-way relaying networks using physical-layer network coding (PLNC). In the multiple-access (MA) phase, two sources transmit simultaneously to multiple relays. Depending on the decoding success at the relays, adaptive transmission schemes are investigated to avoid error propagation in the broadcast (BC) phase, where distributed orthogonal space-time block codes (D-OSTBCs) are employed. Recently, adaptive schemes have been proposed in which only relays with correct estimates of the network coded message participate in the BC transmission. In this work we extend the analysis to also cover the case in which some relays are able to detect only one source message, and we propose a correspondingly modified adaptive transmission scheme. For performance evaluation we resort to a semi-analytical method to examine the outage behavior of the presented schemes. As demonstrated by link-level simulations, the proposed adaptive scheme significantly outperforms the traditional scheme, especially for asymmetric network topologies.


TH-L05-4: Lattice Network Coding Over Euclidean Domains

Angeles Vazquez-Castro (Universidad Autónoma de Barcelona, Spain); Frederique Oggier (Nanyang Technological University, Singapore)

Abstract: We propose a novel approach to the design and analysis of lattice-based network coding. The underlying alphabets are carved from (quadratic imaginary) Euclidean domains with a known Euclidean division algorithm, due to their inherent algorithmic ability to capture analog network coding computations. These alphabets are used to embed linear p-ary codes of length n, p a prime, into n-dimensional Euclidean ambient spaces, via a variation of the so-called Construction A of lattices from linear codes. A case study over one such Euclidean domain is presented and the nominal coding gain of lattices obtained from p-ary Hamming codes is computed for any prime p such that p ≡ 1 (mod 4).
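
To make the congruence condition concrete: a prime p that leaves remainder 1 on division by 4 is a sum of two squares, p = a^2 + b^2, so it splits in the Gaussian integers Z[i] and the quotient Z[i]/(a + bi) has exactly p elements. The sketch below assumes the Gaussian integers as the Euclidean domain (the abstract does not say which domain the case study uses) and reduces a Gaussian integer to its residue in F_p. The function names are illustrative only.

```python
def two_squares(p):
    """Find (a, b) with a*a + b*b == p, for a prime p with p % 4 == 1."""
    a = 1
    while a * a < p:
        b = int(round((p - a * a) ** 0.5))
        if a * a + b * b == p:
            return a, b
        a += 1
    raise ValueError("p is not a sum of two squares")

def residue_map(p):
    """Return f(x, y) mapping x + y*i in Z[i] to its class in Z[i]/(a + b*i)."""
    a, b = two_squares(p)
    # a + b*i is zero in the quotient, so i maps to r = -a / b (mod p), and r*r = -1.
    r = (-a * pow(b, -1, p)) % p
    return lambda x, y: (x + y * r) % p

to_f5 = residue_map(5)  # 5 = 1^2 + 2^2
```

Applying such a map coordinate-wise is what lets a p-ary linear code be lifted to a lattice in a Construction A style embedding.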


TH-L05-5: Linear Physical Layer Network Coding for Multihop Wireless Networks

Alister G. Burr (University of York, United Kingdom); Dong Fang (University of York, United Kingdom)

Abstract: We consider linear network coding functions that can be employed at the relays in wireless physical layer network coding, applied to a general multi-hop network topology. We introduce a general model of such a network, and discuss the algebraic basis of linear functions, deriving conditions for unambiguous decodability of the source data at the destination. We consider the use of integer rings, integer fields, binary extension fields and the ring of binary matrices as potential algebraic constructs, and show that the ring constructs provide more flexibility. We use the two-way relay channel and a network containing two sources and two relays to illustrate the concept and to demonstrate the effect of fading of the wireless channels. We show the capacity benefits of the more flexible rings.
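
One standard condition of this kind can be sketched for the two-source, two-relay example: if each relay forwards a linear combination of the source symbols over the integer ring Z_q, the destination can recover both symbols exactly when the 2x2 coefficient matrix is invertible modulo q, that is, when its determinant is coprime to q. The code below is an illustration under that assumption, not the paper's derivation, and the names are hypothetical.

```python
from math import gcd

def decodable(coeffs, q):
    """True if the 2x2 coefficient matrix is invertible over Z_q."""
    (a, b), (c, d) = coeffs
    return gcd((a * d - b * c) % q, q) == 1

def decode(coeffs, received, q):
    """Recover (s1, s2) from relay outputs r = A s (mod q); requires decodable()."""
    (a, b), (c, d) = coeffs
    det_inv = pow((a * d - b * c) % q, -1, q)
    r1, r2 = received
    # Multiply by the adjugate matrix and the inverse determinant, modulo q.
    s1 = det_inv * (d * r1 - b * r2) % q
    s2 = det_inv * (a * r2 - c * r1) % q
    return s1, s2
```

Over Z_4, for instance, the coefficients [[1, 1], [1, 2]] are decodable, whereas [[1, 1], [1, 3]] are not, because the determinant 2 shares a factor with 4.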



Session TH-L06: Sensor Array and Multichannel Signal Processing


TH-L06-1: Causality-Constrained Multiple Shift Sequential Matrix Diagonalisation for Parahermitian Matrices

Jamie Corr (University of Strathclyde, United Kingdom); Keith Thompson (University of Strathclyde, United Kingdom); Stephan Weiss (University of Strathclyde, United Kingdom); John G McWhirter (Cardiff University, United Kingdom); Ian Proudler (Loughborough University, United Kingdom)

Abstract: This paper introduces a causality constrained sequential matrix diagonalisation (SMD) algorithm, which generates a causal paraunitary transformation for any parahermitian matrix, and can be used to determine a polynomial eigenvalue decomposition. In addition to producing a causal paraunitary transformation, this algorithm builds on a multiple shift technique which speeds up diagonalisation by bringing additional energy onto the diagonal during each iteration. The results presented in this paper show the performance in comparison to existing algorithms, in particular the non-causal algorithm on which our new algorithm is based.


TH-L06-2: Optimum Discrete Phase-Only Transmit Beamforming with Antenna Selection

Özlem Tuğfe Demir (Middle East Technical University, Turkey); T. Engin Tuncer (Middle East Technical University, Turkey)

Abstract: Phase-only beamforming is used in radar and communication systems due to certain advantages it offers. Antenna selection becomes an important problem as the number of antennas becomes larger than the number of transmit-receive chains. In this paper, discrete single group multicast transmit phase-only beamformer design with antenna subset selection is considered. The problem is converted into linear form and solved efficiently by using mixed integer linear programming to find the optimum subset of antennas and beamformer coefficients. Several simulations show that the proposed approach is an effective and efficient method of subarray transmit beamformer design.


TH-L06-3: Adaptive Re-Weighting Homotopy for Sparse Beamforming

Fernando Almeida Neto (University of São Paulo, Brazil); Vitor H Nascimento (USP, Brazil); Yuriy Zakharov (University of York, United Kingdom); Rodrigo C. de Lamare (University of York, United Kingdom)

Abstract: In this paper, a complex adaptive re-weighting algorithm based on the homotopy technique is developed and used for beamforming. A multi-candidate scheme is also proposed and incorporated into the adaptive re-weighting homotopy algorithm to choose the regularization factor and improve the signal-to-interference-plus-noise ratio (SINR) performance. The proposed algorithm is used to minimize the degradation caused by sparsity in arrays with faulty sensors, or when the number of degrees of freedom required to suppress interference is significantly smaller than the number of sensors. Simulations illustrate the algorithm's performance.


TH-L06-4: Distributed GEVD-based Signal Subspace Estimation in a Fully-Connected Wireless Sensor Network

Amin Hassani (KU Leuven, Belgium); Alexander Bertrand (KU Leuven, Belgium); Marc Moonen (KU Leuven, Belgium)

Abstract: In this paper, we present a distributed algorithm for network-wide signal subspace estimation in a fully-connected wireless sensor network with multi-sensor nodes. We consider scenarios where the noise field is spatially correlated between the nodes. Therefore, rather than an eigenvalue decomposition (EVD-) based approach, we apply a generalized EVD (GEVD-) based approach, which allows the (estimated) noise covariance to be incorporated directly. Furthermore, the GEVD is also immune to unknown per-channel scalings. We first use a distributed algorithm to estimate the principal generalized eigenvectors (GEVCs) of a pair of network-wide sensor signal covariance matrices, without explicitly constructing these matrices, as this would inherently require data centralization. We then apply a transformation at each node to extract the actual signal subspace estimate from the principal GEVCs. The resulting distributed algorithm can reduce the per-node communication and computational cost. We demonstrate the effectiveness of the algorithm by means of numerical simulations.


TH-L06-5: Design of Piecewise Linear Polyphase Sequences with Good Correlation Properties

Mojtaba Soltanalian (Uppsala University, Sweden); Petre Stoica (Uppsala University, Sweden); Mohammad Mahdi Naghsh (Isfahan University of Technology, Iran); Antonio De Maio (University of Naples "Federico II", Italy)

Abstract: In this paper, we devise a computational approach for designing polyphase sequences with two key properties: (i) a phase argument which is piecewise linear, and (ii) an impulse-like autocorrelation. The proposed approach relies on fast Fourier transform (FFT) operations and thus can be used efficiently to design sequences with a large length or alphabet size. Moreover, using the suggested method, one can construct many new such polyphase sequences which were not known and/or could not be constructed by the previous formulations in the literature. Several numerical examples are provided to show the performance of the proposed design framework in different scenarios.
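
A classical sequence with both properties is the Frank sequence, whose phase is linear within each block of L samples and whose periodic autocorrelation vanishes at every nonzero lag. It is shown here only as a known reference example, not as an output of the proposed design method, and a direct O(N^2) correlation is used in place of the FFT for brevity.

```python
import cmath
import math

def frank_sequence(L):
    """Length L*L Frank sequence: x[n*L + k] = exp(j*2*pi*n*k/L)."""
    return [cmath.exp(2j * math.pi * n * k / L)
            for n in range(L) for k in range(L)]

def periodic_autocorrelation(x):
    """r[lag] = sum over m of x[m] * conj(x[(m + lag) mod N])."""
    N = len(x)
    return [sum(x[m] * x[(m + lag) % N].conjugate() for m in range(N))
            for lag in range(N)]

x = frank_sequence(4)
r = periodic_autocorrelation(x)
```

Here `r[0]` equals the sequence length and every other entry is zero up to rounding error, which is the impulse-like behaviour referred to in property (ii), taken in the periodic sense.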



Session TH-L07: Image and Video Analysis II


TH-L07-1: A Probabilistic Interpretation of Geometric Active Contour Segmentation

Jonas De Vylder (Ghent University, Belgium); Dirk Van Haerenborgh (Ghent University, Belgium); Jan Aelterman (Ghent University, Belgium); Wilfried Philips (Ghent University, Belgium)

Abstract: Active contours or snakes are widely used for segmentation and tracking. These techniques require the minimization of an energy function, which is typically a linear combination of a data-fit term and regularization terms. This energy function can be tailored to the intrinsic object and image features. This can be done by either modifying the actual terms or by changing the weighting parameters of the terms. There is, however, no surefire way to set these terms and weighting parameters optimally for a given application. Although heuristic techniques exist for parameter estimation, often trial and error is used. In this paper, we propose a probabilistic interpretation of segmentation. This approach results in a generalization of state of the art active contour segmentation. In the proposed framework all parameters have a statistical interpretation, thus avoiding ad hoc parameter settings.


TH-L07-2: Retina Enhanced Bag of Words Descriptors for Video Classification

Sabin Tiberius Strat (University of Savoie, France); Alexandre Benoit (University of Savoie, France); Patrick Lambert (University of Savoie, France)

Abstract: This paper addresses the task of detecting diverse semantic concepts in videos. Within this context, the Bag Of Visual Words (BoW) model, inherited from static image analysis, is among the most popular methods. However, in the case of videos, this model faces new difficulties such as the added motion information, the extra computational cost and the increased variability of content and concepts to handle. Considering this spatio-temporal context, we propose to extend the BoW model by introducing video preprocessing strategies with the help of a retina model, before extracting BoW descriptors. This preprocessing increases the robustness of local features to disturbances such as noise and lighting variations. Additionally, the retina model is used to detect potentially salient areas and to construct spatio-temporal descriptors. We experiment with three state of the art local features, SIFT, SURF and FREAK, and we evaluate our results on the TRECVid 2012 Semantic Indexing (SIN) challenge.


TH-L07-3: Rotation-Invariant Object Detection Using Complex Matched Filters and Second Order Vector Fields

Mihails Pudzs (Institute of Electronics and Computer Science, Latvia); Modris Greitans (Institute of Electronics and Computer Science, Latvia)

Abstract: In this paper we introduce two concepts: second order vector fields that describe line-like objects in images, and rotation-invariant Complex Matched Filter kernels that can be used to detect objects of almost any complexity. We present the theoretical grounds for kernel derivation, object matching using sets of subresponses, and determination of the object's rotation angle and active area. The operation of the proposed algorithms is demonstrated on images of an occluded and rotated object.


TH-L07-4: Human Action Recognition in Stereoscopic Videos Based on Bag of Features and Disparity Pyramids

Alexandros Iosifidis (Aristotle University of Thessaloniki, Greece); Anastasios Tefas (Aristotle University of Thessaloniki, Greece); Nikos Nikolaidis (Aristotle University of Thessaloniki, Greece); Ioannis Pitas (Aristotle University of Thessaloniki, Greece)

Abstract: In this paper, we propose a method for human action recognition in unconstrained environments based on stereoscopic videos. We describe a video representation scheme that exploits the enriched visual and disparity information that is available for such data. Each stereoscopic video is represented by multiple vectors, evaluated on video locations corresponding to different disparity zones. By using these vectors, multiple action descriptions can be determined that either correspond to specific disparity zones, or combine information appearing in different disparity zones in the classification phase. Experimental results indicate that the proposed approach enhances action classification performance when compared to the standard approach and achieves state-of-the-art performance on the Hollywood 3D database designed for the recognition of complex actions in unconstrained environments.


TH-L07-5: Object Tracking Extensions for Accurate Recovery of Rainfall Maps Using Microwave Sensor Network

Yoav Liberman (Tel Aviv University, Israel)

Abstract: Recently, diverse methods have been proposed for faithful reconstruction of instantaneous rainfall maps by using received signal level (RSL) measurements from commercial microwave networks (CMNs), especially in dense networks. The main shortcoming of these methods is that the temporal properties of the rain field are not considered, hence their accuracy might be limited. This paper presents a novel method for accurate spatial-temporal reconstruction of rainfall maps, derived from CMN, by using an extension to object tracking algorithms. An efficient coherency algorithm is used, which relates sequential instantaneous rainfall maps to one another. Then, by using a Kalman filter, the observed rain maps are predicted and corrected. When comparing the estimates to actual rain measurements, the performance improvement of the rainfall mapping is evident, even when dealing with a rather sparse network and low temporal resolution of the measurements. The method proposed here is not restricted to the application of accurate rainfall mapping.
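
The predict-and-correct cycle of a Kalman filter can be sketched for a single map cell as follows. The scalar random-walk state model and the known noise variances are assumptions chosen for brevity; the abstract does not specify the state model used.

```python
def kalman_step(x, p, z, q, r):
    """One predict/correct cycle for a scalar random-walk state.

    x, p: previous estimate and its variance; z: new measurement;
    q, r: process and measurement noise variances (assumed known).
    """
    # Predict: the state is assumed to persist; uncertainty grows by q.
    x_pred, p_pred = x, p + q
    # Correct: blend prediction and measurement using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new
```

Calling `kalman_step` once per time step, with the latest observed rain intensity as `z`, gives a temporally smoothed estimate for that cell.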



Session TH-L08: Bayesian Inference


TH-L08-1: A Reversible Jump MCMC Algorithm for Particle Size Inversion in Multiangle Dynamic Light Scattering

Abdelbassit Boualem (Université d'Orléans, France); Meryem Jabloun (Université d'Orléans, France); Philippe Ravier (Université d'Orléans, France); Marie Naiim (CILAS, France); Alain Jalocha (CILAS, France)

Abstract: The inverse problem of estimating the Particle Size Distribution (PSD) from Multiangle Dynamic Light Scattering (MDLS) measurements is considered using a Bayesian inference approach. We propose to model the multimodal PSD as a normal mixture with an unknown number of components (modes or peaks). In order to estimate these variable-dimension parameters, a Bayesian inference approach is used and solved by the Reversible Jump Markov Chain Monte Carlo (RJMCMC) sampler. The efficiency and robustness of the proposed method are demonstrated using simulated and experimental data. Estimated PSDs are close to the original distributions for synthetic data. Moreover, an improvement in resolution is observed compared to the Clementi method.


TH-L08-2: Majorize-Minimize Adapted Metropolis-Hastings Algorithm. Application to Multichannel Image Recovery

Yosra Marnissi (Université Paris-Est Marne-la-Vallée, France); Amel Benazza (SUP'COM, Tunisia); Emilie Chouzenoux (Université Paris-Est Marne-la-Vallée, France); Jean-Christophe Pesquet (Université Paris-Est, France)

Abstract: One challenging task in MCMC methods is the choice of the proposal density. It should ideally provide an accurate approximation of the target density with a low computational cost. In this paper, we are interested in Langevin diffusion where the proposal accounts for a directional component. We propose a novel method for tuning the related drift term. This term is preconditioned by an adaptive matrix based on a Majorize-Minimize strategy. This new procedure is shown to exhibit a good performance in a multispectral image restoration example.
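
For orientation, a plain Metropolis-adjusted Langevin sampler with a fixed scalar preconditioner is sketched below. The adaptive Majorize-Minimize matrix is the paper's contribution and is not reproduced; the fixed `precond` value, the function names, and the toy Gaussian target are assumptions.

```python
import math
import random

def mala(log_density, grad_log_density, x0, step, n_samples, precond=1.0, seed=0):
    """Metropolis-adjusted Langevin sampler for a scalar target.

    precond is a fixed scalar preconditioner (an assumption, not an adaptive matrix).
    """
    rng = random.Random(seed)
    var = step * step * precond  # proposal variance

    def drift_mean(x):
        # Proposal mean: current point plus the preconditioned gradient drift.
        return x + 0.5 * var * grad_log_density(x)

    def log_q(x_to, x_from):
        # Log proposal density, up to a constant that cancels in the ratio.
        return -((x_to - drift_mean(x_from)) ** 2) / (2.0 * var)

    x, samples = x0, []
    for _ in range(n_samples):
        y = drift_mean(x) + math.sqrt(var) * rng.gauss(0.0, 1.0)
        log_alpha = (log_density(y) + log_q(x, y)
                     - log_density(x) - log_q(y, x))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = y
        samples.append(x)
    return samples

# Standard normal target, used only as a toy example.
samples = mala(lambda x: -0.5 * x * x, lambda x: -x,
               x0=3.0, step=0.9, n_samples=5000)
```

The drift term inside `drift_mean` is the quantity that a preconditioning strategy would tune; here it is simply scaled by a constant.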


TH-L08-3: Rank-based Multiple Change-Point Detection in Multivariate Time Series

Flore Harlé (CEA, LIST, France); Florent Chatelain (GIPSA-lab, France); Cédric Gouy-Pailler (CEA, LIST, France); Sophie Achard (GIPSA-lab, CNRS, France)

Abstract: In this paper, we propose a Bayesian approach for multivariate time series segmentation. A robust non-parametric test, based on rank statistics, is derived in a Bayesian framework in order to be robust to unknown distributions of piecewise constant multivariate time series for which mutual dependencies are unknown. By modelling rank-test p-values, a pseudo-likelihood is proposed to favour change-point detection for significant p-values. A vague prior is chosen for the dependency structure between time series, and an MCMC method is applied to the resulting posterior distribution. The partially collapsed Gibbs sampling strategy makes the method computationally efficient. The algorithm is illustrated on simulated and real signals in two practical settings. It is demonstrated that change-points are robustly detected and localized, through implicit dependency structure learning or explicit structural prior introduction.


TH-L08-4: Group-sparse Adaptive Variational Bayes Estimation

Konstantinos E. Themelis (National Observatory of Athens, Greece); Athanasios A. Rontogiannis (National Observatory of Athens, Greece); Konstantinos Koutroumbas (National Observatory of Athens, Greece)

Abstract: This paper presents a new variational Bayes algorithm for the adaptive estimation of signals possessing group structured sparsity. The proposed algorithm can be considered as an extension of a recently proposed variational Bayes framework of adaptive algorithms that utilize heavy tailed priors (such as the Student-t distribution) to impose sparsity. Variational inference is efficiently implemented via appropriate time recursive equations for all model parameters. Experimental results are provided that demonstrate the improved estimation performance of the proposed adaptive group sparse variational Bayes method, when compared to state-of-the-art sparse adaptive algorithms.


TH-L08-5: Bayesian Optimal Compressed Sensing Without Priors: Parametric SURE Approximate Message Passing

Chunli Guo (University of Edinburgh, United Kingdom); Mike Davies (University of Edinburgh, United Kingdom)

Abstract: It has been shown that the Bayesian optimal approximate message passing (AMP) technique achieves the optimal compressed sensing (CS) recovery. However, its requirement for a known signal prior often makes it impractical. To address this dilemma, we propose the parametric SURE-AMP algorithm. Its key feature is that it uses a Stein's unbiased risk estimate (SURE) based parametric family of MMSE estimators for the CS denoising. Given that the optimization of the estimator and the calculation of its mean squared error depend purely on the noisy data, there is no need for the signal prior. A weighted sum of piecewise kernel functions is used to form the parametric estimator. Numerical experiments on both Bernoulli-Gaussian and k-dense signals justify our proposal.
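
The idea of tuning a denoiser from noisy data alone can be illustrated with the well-known SURE expression for soft thresholding under Gaussian noise. The soft-threshold denoiser is swapped in here for simplicity; it is not the kernel-based parametric family described in the abstract, and the function names are illustrative.

```python
def sure_soft_threshold(y, t, sigma):
    """SURE of the soft-threshold denoiser applied to y = x + N(0, sigma^2) noise.

    Returns an unbiased estimate of the total squared error, computed from y alone.
    """
    n = len(y)
    below = sum(1 for v in y if abs(v) <= t)
    clipped = sum(min(abs(v), t) ** 2 for v in y)
    return n * sigma ** 2 - 2.0 * sigma ** 2 * below + clipped

def best_threshold(y, sigma):
    """Pick the threshold minimising SURE among the candidates 0 and |y_i|."""
    candidates = [0.0] + sorted(abs(v) for v in y)
    return min(candidates, key=lambda t: sure_soft_threshold(y, t, sigma))
```

Neither function needs the clean signal or its prior, only the noisy samples and the noise level, which is the property the abstract relies on.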



Session TH-L09: Digital Audio Processing for Loudspeakers and Headphones 2 (Special Session)


TH-L09-1: Room Reflections Assisted Spatial Soundfield Reproduction

Prasanga Samarasinghe (Australian National University, Australia); Thushara D. Abhayapala (Australian National University, Australia); Mark Poletti (Callaghan Innovation, New Zealand)

Abstract: With recent advances in surround sound technology, increased interest has been shown in the problem of virtual sound reproduction. However, the performance of existing surround sound systems is degraded by factors like room reverberation and listener movements. In this paper, we develop a novel approach to spatial sound reproduction in reverberant environments, where room reverberation is constructively incorporated with the direct source signals to recreate a virtual reality. We also show that the array of monopole loudspeakers required for reproduction can be clustered together in a small spatial region away from the listening area, which in turn enables the array's practical implementation via a single loudspeaker unit with multiple drivers.


TH-L09-2: Nonlinear Distortion Reduction for Electrodynamic Loudspeaker Using Nonlinear Filtering

Kenta Iwai (Kansai University, Japan); Yoshinobu Kajikawa (Kansai University, Japan)

Abstract: In this paper, we compare the efficiency of compensating nonlinear distortions in an electrodynamic loudspeaker system using two different types of mirror filter, namely the 2nd- and 3rd-order nonlinear IIR filters. These filters need the nonlinear parameters of the loudspeaker system, and we used estimated nonlinear parameters to evaluate their efficiency in compensating nonlinear distortions. Therefore, those evaluation results include the effect of the parameter estimation method. In this paper, we measure the nonlinear parameters using Klippel's measurement equipment and evaluate the compensation amount of both filters. Experimental results demonstrate that the 3rd-order nonlinear IIR filter reduces nonlinear distortion at high frequencies by 4 dB more than the 2nd-order nonlinear IIR filter.


TH-L09-3: Perceptually Optimized Room-in-room Sound Reproduction with Spatially Distributed Loudspeakers

Julian Grosse (Carl von Ossietzky University Oldenburg, Germany); Steven van de Par (University of Oldenburg, Germany)

Abstract: In sound reproduction it is desired to reproduce a recording of an instrument made in a specific room (e.g. a church or concert hall) in a playback room such that the listener has a plausible and authentic impression of the instrument including the room acoustical properties of the recording room. For this purpose a new method is presented that separately optimizes the direct sound field and recreates a reverberant sound field in the playback room that matches that of the recording room. This approach optimizes monaural cues related to coloration and the interaural cross correlation (IACC), responsible for listener envelopment, in both rooms based on an artificial head placed at the listener's positions. The cues are adjusted using an auditorily motivated gammatone analysis-synthesis filterbank. A MUSHRA listening test revealed that the proposed method is able to recreate the perceived room acoustics of the recording room in an accurate way.


TH-L09-4: Least-Mean-Square Weighted Parallel IIR Filters in Active-Noise-Control Headphones

Markus Guldenschuh (Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz, Austria)

Abstract: Adaptive filters in noise control applications have to approximate the plant and compensate for the secondary path. This work shows that the plant and secondary-path variations of noise control headphones depend above all on the direction of incident noise and the tightness of the ear-cups. Both kinds of variation are investigated by preliminary measurements, and it is further shown that the measured variations can be approximated with a linear combination of only a few prototype filters. Thus, a parallel adaptive linear combiner is suggested instead of the typical adaptive transversal filter. Theoretical considerations and experimental results reveal that the parallel structure performs equally well, converges even faster, and requires fewer adaptation weights.
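
The structure of an adaptive linear combiner can be sketched as follows: the outputs of a few fixed prototype filters are weighted and summed, and only the weights are adapted by LMS. This is a simplified illustration with plain LMS and synthetic signals; secondary-path filtering, the actual prototype filters, and the step size are not taken from the paper.

```python
import random

def lms_combiner(prototype_outputs, desired, mu):
    """Adapt weights w so that sum_k w[k] * u[k][n] tracks desired[n].

    prototype_outputs: list of K sequences, each the output of one fixed filter.
    """
    K = len(prototype_outputs)
    w = [0.0] * K
    for n, d in enumerate(desired):
        u = [prototype_outputs[k][n] for k in range(K)]
        y = sum(wk * uk for wk, uk in zip(w, u))
        e = d - y  # instantaneous error
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]
    return w

# Toy data: two hypothetical prototype outputs mixed with weights 0.7 and -0.2.
rng = random.Random(1)
u1 = [rng.gauss(0, 1) for _ in range(5000)]
u2 = [rng.gauss(0, 1) for _ in range(5000)]
target = [0.7 * a - 0.2 * b for a, b in zip(u1, u2)]
weights = lms_combiner([u1, u2], target, mu=0.01)
```

With K prototype filters only K weights are adapted, compared with one weight per tap in a transversal filter, which is where the reduction in adaptation weights comes from.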


TH-L09-5: A Unified Approach to Numerical Auditory Scene Synthesis Using Loudspeaker Arrays

Joshua Atkins (Beats Electronics, USA); Ismael Nawfal (Beats Electronics LLC, USA); Daniele Giacobello (Beats Electronics, USA)

Abstract: In this work we address the problem of simulating the spatial and timbral cues of a given sound event, or auditory scene, using an array of loudspeakers. We define the problem with a general framework that encompasses many known techniques from physical acoustics, crosstalk cancellation, and acoustic control. In contrast to many previous approaches, the system described in this work is inherently broadband as it jointly designs a set of spatio-temporal filters while allowing for constraints in other domains. With this framework we show similarities and differences between known techniques and suggest some new, unexplored methods. These methods are then compared by implementing the systems on a linear array of loudspeakers and evaluating the timbral and spatial qualities of the system using objective metrics.



Session TH-L10: Biometric Technologies for Security and Forensics Applications (Special Session)


TH-L10-1: Anti-Forensic Resistant Likelihood Ratio Computation

Norman Poh (University of Surrey, United Kingdom); Nik Suki (University of Surrey, United Kingdom); Aamo Iorliam (University of Surrey, United Kingdom); Anthony T S Ho (University of Surrey, United Kingdom)

Abstract: One of the major uses of biometrics in the context of crime scene investigation is to identify people. However, in the most sophisticated cases, criminals may introduce the biometric samples of innocent individuals in order to conceal their own identities as well as to incriminate those individuals. To date, even a minute suspicion of an anti-forensic threat can jeopardize a forensic investigation to the point that a potentially vital piece of evidence suddenly becomes powerless in a court of law. In order to remedy this situation, we propose an anti-forensic resistant likelihood ratio computation. This approach renders the strength of evidence to a level that is proportional to the trustworthiness of the trace, such that highly credible evidence bears its full strength whilst a highly suspicious trace can have its strength of evidence reduced to naught.


TH-L10-2: Biometric Source Weighting in Multi-Biometric Fusion: Towards a Generalized and Robust Solution

Naser Damer (Fraunhofer Institute for Computer Graphics Research (IGD), Germany); Alexander Opel (Fraunhofer Institute for Computer Graphics Research (IGD), Germany); Alexander Nouak (Fraunhofer Institute for Computer Graphics Research IGD, Germany)

Abstract: This work presents a new weighting algorithm for biometric sources within a score-level multi-biometric system. These weights are used in the effective and widely used weighted sum fusion rule to produce multi-biometric decisions. The presented solution is mainly based on the characteristics of the overlap region between the genuine and impostor score distributions. It also integrates the performance of the biometric source, represented by its equal error rate. This solution aims at avoiding the shortcomings of previously proposed solutions, such as low generalization ability and sensitivity to outliers. The proposed solution is evaluated along with state of the art and best practice techniques. The evaluation was performed on two databases, the Biometric Scores Set BSSR1 and the Extended Multi Modal Verification for Teleservices and Security applications database, and satisfactory and stable performance was achieved.


TH-L10-3: Presentation Attack Detection Algorithm for Face and Iris Biometrics

R Raghavendra (Gjøvik University College, Norway); Christoph Busch (Gjøvik University College, Norway)

Abstract: Biometric systems are vulnerable to diverse attacks, which have emerged as a challenge to the reliable adoption of these systems in real-life scenarios. In this work, we propose a novel solution to detect presentation attacks based on exploring both statistical and cepstral features. The proposed Presentation Attack Detection (PAD) algorithm extracts statistical features that capture micro-texture variation using Binarized Statistical Image Features (BSIF), and cepstral features that reflect micro changes in frequency using 2D cepstrum analysis. We then fuse these features to form a single feature vector before deciding whether the presented biometric is real or an artefact using a linear Support Vector Machine (SVM). Extensive experiments carried out on publicly available face and iris spoof databases show the efficacy of the proposed PAD algorithm, with an HTER = 10.21% on face and HTER = 0% on iris.
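
The half total error rate (HTER) quoted above is conventionally the mean of the false acceptance rate and the false rejection rate at a chosen decision threshold. A minimal sketch follows, assuming that higher scores indicate a genuine presentation; the function name and score convention are illustrative and not taken from the paper.

```python
def hter(genuine_scores, attack_scores, threshold):
    """Half total error rate: mean of false acceptance and false rejection rates.

    Scores at or above the threshold are accepted as genuine (assumed convention).
    """
    far = sum(s >= threshold for s in attack_scores) / len(attack_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return 0.5 * (far + frr)
```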


TH-L10-4: On Identification From Periocular Region Utilizing SIFT and SURF

Samil Karahan (Gebze Institute Of Technology, Turkey); Adil Karaoz (Gebze Institute of Technology, Turkey); Omer Ozdemir (Gebze Institute of Technology, Turkey); Ahmet Gul (Gebze Institute of Technology, Turkey); Umut Uludag (TUBITAK BILGEM, Turkey)

Abstract: We concentrate on the utilization of the facial periocular region for biometric identification. Although this region has superior discriminative characteristics compared to the mouth and nose, it has not been frequently used as an independent modality for personal identification. We employ a feature-based representation, where the associated periocular image is divided into left and right sides, and descriptor vectors are extracted from these using the popular feature extraction algorithms SIFT, SURF, BRISK, ORB, and LBP. We also concatenate descriptor vectors. Utilizing FLANN and Brute Force matchers, we report recognition rates and ROC. For the periocular region image data, obtained from the widely used FERET database consisting of 865 subjects, we obtain a Rank-1 recognition rate of 96.8% for full frontal and different facial expressions in same-session cases. We include a summary of existing methods, and show that the proposed method produces lower/comparable error rates with respect to the current state of the art.


TH-L10-5: A Multivariate Singular Spectrum Analysis Approach to Clinically-Motivated Movement Biometrics

Tracey Lee (Singapore Polytechnic, Singapore); Sharon Gan (Singapore Polytechnic, Singapore); Joo Ghee Lim (Singapore Polytechnic, Singapore); Saeid Sanei (University of Surrey, United Kingdom)

Abstract: Biometrics are quantities obtained from analyses of biological measurements. For human based biometrics, the two main types are clinical and authentication. This paper presents a brief comparison between the two, showing that on many occasions clinical biometrics can motivate their use in authentication applications. Since several clinical biometrics deal with temporal data and also involve several dimensions of movement, we also present a new application of Singular Spectrum Analysis, in particular its multivariate version, to obtain significant frequency information across these dimensions. We use the most significant frequency component as a biometric to distinguish between various types of human movements. The signals were collected from triaxial accelerometers mounted in an object that is handled by a user. Although this biometric was obtained in a clinical setting, it shows promise for authentication.



Session TH-L11: Location and Positioning


TH-L11-1: Interference Detection in GNSS Signals Using the Gaussianity Criterion

Fernando Nunes (Instituto Superior Tecnico, Portugal); Fernando Sousa (Instituto Superior de Engenharia de Lisboa, Portugal)

Abstract: This paper aims at analyzing the performance of several Gaussianity tests as a blind method to detect narrow (sinusoidal) and wideband (chirp) interference in GNSS signals. The tests under analysis can be classified into two classes: the ones that resort to the computation of moments or cumulants (Anscombe-Glynn and Giannakis-Tsatsanis tests), and those that rely on the divergence of the empirical distribution function relative to the theoretical Gaussian distribution (Lilliefors and Cramer-von Mises tests). Simulations have shown that the Giannakis-Tsatsanis test produces the best results, although at the cost of higher computational burden. In general, this test is more sensitive to narrowband interference, thus meaning that more processing effort is required to detect chirp interference. The test can be used as a benchmark for comparison with other interference detection techniques, proposed elsewhere.


TH-L11-2: Array-broadband Effects on Direct Geolocation Algorithm

Cyrile Delestre (ENS Cachan, France); Anne Ferréol (Thales Communications, France); Pascal Larzabal (ENS-Cachan, PARIS, France)

Abstract: Recent works have introduced powerful 1-step geolocation methods in comparison with traditional, and suboptimal, 2-step methods. As these 1-step methods directly and simultaneously work on the observations of the whole array, there is now an important issue concerning the possible array-broadband effect. To counteract that effect, the recent methods introduce an imperfect narrowband decomposition, by way of a filter bank or, equivalently, by a structured multidimensional model. The purpose of this work is to study the residual array-broadband effect on the performance of the 1-step algorithms. The study compares two 1-step methods in terms of the bias and the ambiguity problem, giving some tools for operational design.


TH-L11-3: Iterative Grid Search for RSS-Based Emitter Localization

Suzan Üreten (University of Ottawa, Canada); Abbas Yongaçoğlu (University of Ottawa, Canada); Emil M. Petriu (University of Ottawa, Canada)

Abstract: This paper presents a reduced complexity iterative grid-search algorithm for RSS-based localization of non-cooperating primary emitters in cognitive radio networks. The proposed algorithm is initialized with a small number of candidate locations selected uniformly within the region of interest and then the search space is reduced at each iteration around the candidate that maximizes the likelihood function. We evaluate the performance of the proposed algorithm in independent shadowing scenarios and show that the performance closely approaches that of the full search, particularly at small shadowing spread values, with significantly reduced computational complexity.
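
The coarse-to-fine search described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the log-distance path-loss model, the reference power p0, the path-loss exponent gamma, the grid size n, the shrink factor and all function names are assumptions made for illustration. Under i.i.d. Gaussian shadowing in dB, maximizing the likelihood reduces to minimizing the sum of squared dB residuals, which is what the sketch does.

```python
import math

def rss_model(emitter, sensor, p0=-30.0, gamma=3.0):
    """Expected RSS in dB at `sensor` (assumed log-distance path-loss model)."""
    d = max(math.dist(emitter, sensor), 1e-3)  # guard against log10(0)
    return p0 - 10.0 * gamma * math.log10(d)

def log_likelihood(candidate, sensors, rss):
    """Log-likelihood up to constants, assuming i.i.d. Gaussian shadowing in dB."""
    return -sum((r - rss_model(candidate, s)) ** 2 for s, r in zip(sensors, rss))

def iterative_grid_search(sensors, rss, center, half_width,
                          n=5, iters=10, shrink=0.6):
    """Evaluate an n-by-n grid, then re-centre a smaller grid on the best candidate."""
    best = center
    for _ in range(iters):
        step = 2.0 * half_width / (n - 1)
        grid = [(center[0] - half_width + i * step,
                 center[1] - half_width + j * step)
                for i in range(n) for j in range(n)]
        best = max(grid, key=lambda c: log_likelihood(c, sensors, rss))
        # Reduce the search space around the current maximizer.
        center, half_width = best, half_width * shrink
    return best

if __name__ == "__main__":
    # Hypothetical noiseless scenario: four sensors, one emitter.
    sensors = [(-40.0, -40.0), (40.0, -40.0), (-40.0, 40.0), (40.0, 40.0)]
    true_pos = (12.0, -7.0)
    rss = [rss_model(true_pos, s) for s in sensors]
    print(iterative_grid_search(sensors, rss, center=(0.0, 0.0), half_width=50.0))
```

Each iteration costs n*n likelihood evaluations, so the total cost grows with the number of iterations rather than with the resolution of a single fine grid over the whole region.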


TH-L11-4: Distance-based Tuning of the EKF for Indoor Positioning in WSNs

Alejandro Correa (Universitat Autònoma de Barcelona, Spain); Marc Barceló (Universitat Autònoma de Barcelona, Spain); Antoni Morell (Universitat Autonoma de Barcelona (UAB), Spain); José López Vicario (Universitat Autonoma de Barcelona, Spain)

Abstract: This work proposes a filtering method for indoor positioning and tracking applications which combines position, speed and heading measurements with the aim of achieving more accurate position estimates both in the short and the long term. We combine all this data using the well-known Extended Kalman Filter (EKF). The particularity of our proposal is that the EKF is configured using the designed statistical covariance matrix tuning method (SCMT), which is based on the statistical characteristics of the position measurements. Thanks to the SCMT, the EKF is able to efficiently cope with measurements that have different degrees of uncertainty and, therefore, it achieves high accuracy also in the long term. The system has been validated in a real environment and the results show a reduction in the positioning error of more than 48% when compared to a regular EKF in the tested scenarios.


TH-L11-5: Improved Pseudolite Navigation Using C/N0 Measurements

Daniele Borio (EC Joint Research Centre, Italy); Ciro Gioia (Joint Research Centre of the European Commission, Italy)

Abstract: The problem of indoor navigation using pseudolites is investigated and two different approaches, employing synchronous and asynchronous technologies, are considered. It is shown that synchronous pseudolite systems, commonly considered more accurate, seem to be unsuitable for deep indoor operations: in complex propagation environments, the synchronization required for metre level navigation is difficult to achieve and a different solution should be adopted. The potential of asynchronous pseudolite systems is thus demonstrated and indoor navigation with metre level accuracy is obtained using C/N0 measurements. In particular, the spectral characteristics of C/N0 measurements are investigated and used to design a pre-filtering stage which, in turn, is employed to remove high-frequency noise. The filter designed significantly improves the navigation performance in harsh indoor environments.



Session TH-L12: Signal Processing Applications II


TH-L12-1: A Simplified QRS Decision Stage Based on the DFT Coefficients

Juan Manuel Górriz Sáez (University of Granada, Spain); Javier Ramirez (University of Granada, Spain); Puntonet Carlos (University of Granada, Spain); Pablo Padilla (University of Granada, Spain); Ignacio Illán (University of Granada, Spain); Diego Salas-González (University of Granada, Spain)

Abstract: This paper shows an adaptive statistical test for QRS detection of ECG signals. The method is based on an M-ary generalized likelihood ratio test (LRT) defined over a multiple observation window in the Fourier domain. Previous algorithms based on maximum a posteriori (MAP) estimation result in high signal model complexity, which exhibits the most profound effect on performance, i.e. parameter selection; and are computationally infeasible or not intended for real-time applications, e.g. intensive care monitoring. A simplified model is proposed based on the independent Gaussian properties of the DFT coefficients, which allows a simplified MAP probability function to be defined. The approach defines an adaptive MAP statistical test in which a global hypothesis is defined on particular hypotheses of the multiple observation window. Moreover, the observation interval is modeled as a discontinuous transmission discrete-time stochastic process, avoiding the inclusion of parameters that constrain the morphology of the QRS complexes.


TH-L12-2: Heart Failure Discrimination Using Matching Pursuit Decomposition

Fausto Lucena (UFMA, Brazil); Yoshinori Takeuchi (Nagoya University, Japan); Allan Barros (UFMA, Brazil); Noboru Ohnishi (Nagoya University, Japan)

Abstract: Congestive heart failure (CHF) is a cardiac disease associated with decreases in cardiac output. As a prophylactic measure against sudden death, we propose a framework for discriminating CHF subjects from normal sinus rhythm (NSR). This framework relies on matching pursuit decomposition to derive a set of features, which are tested in a hybrid genetic algorithm and k-nearest neighbor classifier to select the best feature subset. The performance of the proposed framework is analyzed using both the Fantasia and CHF databases from the Physionet archives which are, respectively, composed of 40 NSR volunteers and 29 CHF subjects. The proposed methodology reaches an overall accuracy of 100% when the features are normalized and the feature subset selection strategy is applied. We believe that our method can be extremely useful to the clinician in primary health care as a support tool to discriminate healthy from heart disease subjects.


TH-L12-3: Online Seizure Detection in Adults with Temporal Lobe Epilepsy Using Single-Lead ECG

Thomas De Cooman (KU Leuven, Belgium); Evelien Carrette (University Hospital Ghent, Belgium); Paul Boon (University Hospital Ghent, Belgium); Alfred Meurs (University Hospital Ghent, Belgium); Sabine Van Huffel (Katholieke Universiteit Leuven, Belgium)

Abstract: In this paper, a patient-independent algorithm for online epileptic seizure detection using only single-lead ECG is proposed. It is tested on 300h of data from adults with temporal lobe epilepsy. The features are extracted from a period of linear increase of the heart rate, which typically occurs in this kind of patient. These features are classified by two different classifiers: linear support vector machine (LSVM) and linear discriminant analysis (LDA). The best performance is found for LDA with a sensitivity of 80.0%, a PPV of 40.5% and an average detection delay of 31.5s, which are satisfactory results for online usage in monitoring or warning systems.


TH-L12-4: Grouped Sparsity Algorithm for Multichannel Intracardiac ECG Synchronization

Thomas Trigano (Shamoon College of Engineering, Israel); Vladimir Kolesnikov (Shamoon College of Engineering, Israel); David Luengo (Universidad Politecnica de Madrid (UPM), Spain); Antonio Artés-Rodríguez (Universidad Carlos III de Madrid, Spain)

Abstract: In this paper, a new method is presented to ensure automatic synchronization of intracardiac ECG data, yielding a three-stage algorithm. We first compute a robust estimate of the derivative of the data to remove low-frequency perturbations. Then we provide a grouped-sparse representation of the data, by means of the Group LASSO, to ensure that all the electrical spikes are simultaneously detected. Finally, a post-processing step, based on a variance analysis, is performed to discard false alarms. Preliminary results on real data for sinus rhythm and atrial fibrillation show the potential of this approach.


TH-L12-5: Automatic Classification of Heartbeats

Tony Basil (PayPal India, India); Choudur Lakshminarayan (HP Labs, USA)

Abstract: We report improvement in the detection of a class of heart arrhythmias based on electrocardiogram (ECG) signals. The detection is performed using a 4-dimensional feature vector obtained by applying an iterative feature selection method used in conjunction with artificial neural networks. The feature set includes the pre-RR interval, which is a primary measure that cardiologists use in a clinical setting. A transformation applied to the pre-RR interval reduced the false positive rate. Our solution, as opposed to existing literature, does not rely on high-dimensional features such as wavelets or signal amplitudes, which have no direct relationship to heart function and are difficult to interpret. Also, we avoid obtaining patient-specific labeled recordings. Furthermore, we propose semi-parametric classifiers as opposed to restrictive parametric linear discriminant analysis and its variants, which are a mainstay in ECG classification. Extensive experiments on the MIT-BIH databases demonstrate superior performance by our methods.



Session TH-L13: Audio and Acoustic Signal Processing III


TH-L13-1: Comparison of Different Representations Based on Nonlinear Features for Music Genre Classification

Athanasia Zlatintsi (National Technical University of Athens, Greece); Petros Maragos (National Technical University of Athens, Greece)

Abstract: In this paper, we examine the descriptiveness and recognition properties of different feature representations for the analysis of musical signals, aiming at the exploration of their micro- and macro-structures, for the task of music genre classification. We explore nonlinear methods, such as the AM-FM model and ideas from fractal theory, so as to model the time-varying harmonic structure of musical signals and the geometrical complexity of the music waveform. The efficacy of the different feature representations is compared with regard to their recognition properties for the specific task. The proposed features are evaluated against and in combination with Mel frequency cepstral coefficients (MFCC), using both static and dynamic classifiers, accomplishing an error reduction of 28% and illustrating that they can capture important aspects of music.


TH-L13-2: Emotion Classification of Speech Using Modulation Features

Theodora Chaspari (University of Southern California, USA); Dimitrios B Dimitriadis (AT&T Labs - Research, USA); Petros Maragos (National Technical University of Athens, Greece)

Abstract: Automatic classification of a speaker's affective state is one of the major challenges in the signal processing community, since it can improve Human-Computer interaction and give insights into the nature of emotions from a psychology perspective. The amplitude and frequency control of sound production strongly influences the affective voice content. In this paper, we take advantage of the inherent speech modulations and propose the use of instant amplitude- and frequency-derived features for efficient emotion recognition. Our results indicate that these features can further increase the performance of the widely-used spectral-prosodic information, achieving improvements on two emotional databases, the Berlin Database of Emotional Speech and the recently collected Athens Emotional States Inventory.


TH-L13-3: Robust Pitch Estimation Using an Optimal Filter on Frequency Estimates

Sam Karimian-Azari (Aalborg University, Denmark); Jesper Rindom Jensen (Aalborg University, Denmark); Mads Græsbøll Christensen (Aalborg University, Denmark)

Abstract: In many scenarios, a periodic signal of interest is often contaminated by different types of noise, that may render many existing pitch estimation methods suboptimal, e.g., due to an incorrect white Gaussian noise assumption. In this paper, a method is established to estimate the pitch of such signals from unconstrained frequency estimates (UFEs). A minimum variance distortionless response (MVDR) method is proposed as an optimal solution to minimize the variance of UFEs considering the constraint of integer harmonics. The MVDR filter is designed based on noise statistics making it robust against different noise situations. The simulation results confirm that the proposed MVDR method outperforms the state-of-the-art weighted least squares (WLS) pitch estimator in colored noise and has robust pitch estimates against missing harmonics in some time-frames.


TH-L13-4: On the Modeling of Natural Vocal Emotion Expressions Through Binary Key

Jordi Luque (Telefonica Research, Spain); Xavier Anguera (Telefonica Research, Spain)

Abstract: This work presents a novel method to estimate naturally expressed emotions in speech through binary acoustic modeling. Standard acoustic features are mapped to a binary value representation and a support vector regression model is used to correlate them with the three continuous emotional dimensions. Three different sets of speech features, two based on spectral parameters and one prosody-based, are compared on the VAM corpus, a set of spontaneous dialogues from a German TV talk-show. The regression analysis, in terms of correlation coefficient and mean absolute error, shows that the binary key modeling is able to capture speaker emotion characteristics. The proposed algorithm obtains results comparable to those reported in the literature while relying on a much smaller set of acoustic descriptors. Furthermore, we also report on preliminary results based on the combination of the binary models, which brings further performance improvements.


TH-L13-5: Fast Music Information Retrieval with Indirect Matching

Takahiro Hayashi (Niigata University, Japan); Nobuaki Ishii (Niigata University, Japan); Masato Yamaguchi (Niigata University, Japan)

Abstract: This paper proposes a fast content-based music information retrieval method called indirect matching. As a preparation for retrieval, using a small number of pre-selected music clips called representative queries, the proposed method calculates the similarities of each music clip in the database to the representative queries in advance. In the online retrieval phase, the proposed method first calculates the similarities between the user-inputted query and the representative queries. By evaluating, with the L1 norm, the difference between the similarities of the query to the representative queries and the pre-calculated similarities of each music clip in the database to the representative queries, the proposed method indirectly estimates the actual similarity between the query and each music clip in the database. Experimental results have shown that the retrieval time can be greatly reduced by indirect matching without much deterioration of retrieval accuracy.
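
The two phases described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: clips are represented here by hypothetical fixed-length feature vectors, cosine similarity stands in for whatever direct similarity measure the paper uses, and the function names are invented for the example.

```python
import math

def cosine(u, v):
    """Stand-in direct similarity measure (an assumption, not from the abstract)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_index(database, representatives, sim=cosine):
    """Offline phase: similarity of every clip to each representative query."""
    return [[sim(clip, rep) for rep in representatives] for clip in database]

def indirect_search(query, representatives, index, sim=cosine, top_k=3):
    """Online phase: rank clips by the L1 distance between similarity profiles."""
    q_profile = [sim(query, rep) for rep in representatives]
    scored = [(sum(abs(q - c) for q, c in zip(q_profile, profile)), i)
              for i, profile in enumerate(index)]
    return [i for _, i in sorted(scored)[:top_k]]

if __name__ == "__main__":
    # Hypothetical toy data: four clips, two representative queries.
    database = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
    representatives = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    index = build_index(database, representatives)
    print(indirect_search([0.6, 0.8, 0.0], representatives, index))
```

At query time only one direct similarity per representative query is computed; every database clip is then compared through its short precomputed profile, which is where the reduction in retrieval time comes from.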



Session TH-L14: Statistical Methods for Inverse Problems in Image Processing (Special Session)


TH-L14-1: Hyper-spectral Image Analysis with Partially-Latent Regression

Antoine Deleforge (University of Erlangen-Nuremberg, Germany); Florence Forbes (INRIA Rhône-Alpes, France); Radu P. Horaud (INRIA Grenoble Rhône-Alpes, France)

Abstract: The analysis of hyper-spectral images is often needed to recover physical properties of planets. To address this inverse problem, the use of learning methods has been considered with the advantage that, once a relationship between physical parameters and spectra has been established through training, the learnt relationship can be used to estimate parameters from new images underpinned by the same physical model. Within this framework, we propose a partially-latent regression method which maps high-dimensional inputs (spectral images) onto low-dimensional responses (physical parameters). We introduce a novel regression method that combines a Gaussian mixture of locally-linear mappings with a partially-latent variable model. While the former makes high-dimensional regression tractable, the latter makes it possible to deal with physical parameters that cannot be observed or, more generally, with data contaminated by experimental artifacts that cannot be explained with noise models. The method is illustrated on images collected from the planet Mars.


TH-L14-2: Fusion of Multispectral and Hyperspectral Images Based on Sparse Representation

Qi Wei (University of Toulouse, France); José Bioucas-Dias (Instituto Superior Técnico, Portugal); Nicolas Dobigeon (University of Toulouse, France); Jean-Yves Tourneret (University of Toulouse, France)

Abstract: This paper studies a new dictionary learning based optimization algorithm for fusing hyperspectral and multispectral images. The dictionaries and the supports are learned from observed images using an online dictionary learning method and the orthogonal matching pursuit algorithm. Conditional on the dictionaries and supports, the optimization problem can be addressed by alternately optimizing with respect to the target image using the alternating direction method of multipliers and with respect to the code using a least squares method. Simulation results demonstrate the efficiency of the proposed fusion method when compared with several state-of-the-art fusion techniques.


TH-L14-3: Robust Minimum Volume Simplex Analysis for Hyperspectral Unmixing

Alexander Agathos (West University of Timisoara, Romania); Jun Li (University of Extremadura, Spain); José Bioucas Dias (Technical University Lisbon / Instituto de Telecomunicacoes Lisbon, Portugal); Antonio Plaza (University of Extremadura, Spain)

Abstract: Most blind hyperspectral unmixing methods exploit convex geometry properties of hyperspectral data. The minimum volume simplex analysis (MVSA) is one such method which, like many others, estimates the minimum volume (MV) simplex where the measured vectors live. MVSA was conceived to circumvent the matrix factorization step often implemented by MV based algorithms and also to cope with outliers, which compromise the results produced by MV algorithms. Inspired by the recently proposed robust minimum volume estimation (RMVES) algorithm, we herein introduce the robust MVSA (RMVSA), which is a version of MVSA robust to noise. As in RMVES, the robustness is achieved by employing chance constraints, which control the volume of the resulting simplex. RMVSA differs, however, substantially from RMVES in the way optimization is carried out. The effectiveness of RMVSA is illustrated by comparing its performance on simulated data with the state-of-the-art.


TH-L14-4: A Stochastic 3MG Algorithm with Application to 2D Filter Identification

Emilie Chouzenoux (Université Paris-Est Marne-la-Vallée, France); Anisia Florescu (Dunarea de Jos University, Galati, Romania); Jean-Christophe Pesquet (Université Paris-Est, France)

Abstract: Stochastic optimization plays an important role in solving many problems encountered in machine learning or adaptive processing. In this context, the second-order statistics of the data are often unknown a priori or their direct computation is too intensive, and they have to be estimated on-line from the related signals. In the context of batch optimization of an objective function being the sum of a data fidelity term and a penalization (e.g. a sparsity promoting function), Majorize-Minimize (MM) subspace methods have recently attracted much interest since they are fast, highly flexible and effective in ensuring convergence. The goal of this paper is to show how these methods can be successfully extended to the case when the cost function is replaced by a sequence of stochastic approximations of it. Simulation results illustrate the good practical performance of the proposed MM memory gradient algorithm when applied to 2D filter identification.


TH-L14-5: Total Variation Denoising Using Iterated Conditional Expectation

Cécile Louchet (Université d'Orléans, France); Lionel Moisan (Université Paris Descartes, France)

Abstract: We propose a new variant of the celebrated Total Variation image denoising model of Rudin, Osher and Fatemi, that provides results very similar to the Bayesian posterior mean variant (TV-LSE) while showing a much better computational efficiency. This variant is based on an iterative procedure which is proved to converge linearly to a fixed point satisfying a marginal conditional mean property. The implementation is simple, provided numerical precision issues are correctly handled. Experiments show that the proposed variant yields results that are very close to those obtained with TV-LSE and avoids as well the so-called staircasing artifact observed with classical Total Variation denoising.


TH-L14-6: Small-variance Asymptotics of Hidden Potts-MRFs: Application to Fast Bayesian Image Segmentation

Marcelo Pereyra (University of Bristol, United Kingdom); Steve McLaughlin (Heriot Watt University, United Kingdom)

Abstract: This paper presents a new approximate Bayesian estimator for hidden Potts-Markov random fields, with application to fast K-class image segmentation. The estimator is derived by conducting a small-variance-asymptotic analysis of an augmented Bayesian model in which the spatial regularisation and the integer-constrained terms of the Potts model are decoupled. This leads to a new image segmentation methodology that can be efficiently implemented in large 2D and 3D scenarios by using modern convex optimisation techniques. Experimental results on synthetic and real images as well as comparisons with state-of-the-art algorithms confirm that the proposed methodology converges extremely fast and produces accurate segmentation results in only a few iterations.



Session TH-L15: High Dynamic Range imaging: Providing a step change in imaging technology (Special Session)


TH-L15-1: CG-TMO: A Local Tone Mapping for Computer Graphics Generated Content

Francesco Banterle (ISTI-CNR, Pisa, Italy); Alessandro Artusi (Universitat de Girona, Spain); Paolo Francesco Banterle (Universita' di Siena, Italy); Roberto Scopigno (ISTI-CNR, Pisa, Italy)

Abstract: Physically based renderers produce high quality images with physical real-world luminance values. Because of this, these images need to be correctly displayed on a screen using tone mapping operators. A typical approach is to blindly apply tone mapping operators without exploiting extra information that comes for free from the modeling process when generating 3D scenes. In this paper, we propose a novel framework for tone mapping high dynamic range (HDR) images which are generated using physically based renderers. Our framework exploits 3D scene information from the renderer such as depth, normal, albedo, luminaires, etc. This makes it possible to reduce some assumptions typically made during the tone mapping process, to improve its quality and to reduce its complexity.


TH-L15-2: Improved Tone Mapping Operator for HDR Coding Optimizing the Distortion/Spatial Complexity Trade-Off

Paul Lauga (Télécom ParisTech, France); Giuseppe Valenzise (Institut Mines-Télécom, Télécom ParisTech, CNRS LTCI, France); Giovanni Chierchia (Institut Mines-Télécom, Télécom ParisTech, CNRS LTCI, France); Frederic Dufaux (Télécom ParisTech, France)

Abstract: A common paradigm to code high dynamic range (HDR) image/video content is based on tone-mapping HDR pictures to low dynamic range (LDR), in order to obtain backward compatibility and use existing coding tools, and then use inverse tone mapping at the decoder to predict the original HDR signal. Clearly, the choice of a proper tone mapping is essential in order to achieve good coding performance. The state-of-the-art approach to designing the optimal tone mapping operator (TMO) minimizes the mean-square-error distortion between the original and the predicted HDR image. In this paper, we argue that this is suboptimal in a rate-distortion sense, and we propose a more effective TMO design strategy that also takes into account the spatial complexity (which is a proxy for the bitrate) of the coded LDR image. Our results show that the proposed optimization approach yields a substantial coding gain with respect to the minimum-MSE TMO.


TH-L15-3: Rate Distortion Optimized Tone Curve for High Dynamic Range Compression

Mikaël Le Pendu (INRIA, Université de Rennes 1, France); Christine Guillemot (INRIA, France); Dominique Thoreau (Technicolor, France)

Abstract: In this paper, we define a reversible tone mapping operator (TMO) for efficient compression of High Dynamic Range (HDR) images using a Low Dynamic Range (LDR) encoder. In our compression scheme, the HDR image is tone mapped and encoded. The inverse tone curve is also encoded, so that the decoder can reconstruct the HDR image from the LDR version. Based on a statistical model of the encoder error and assumptions on the rate of the encoded LDR image, we find a closed form solution for the optimal tone curve with respect to the rate and the mean square error (MSE) of the reconstructed HDR image. It is shown that the proposed method gives superior compression performance compared to existing tone mapping operators.


TH-L15-4: Evaluation of LDR, Tone Mapped and HDR Stereo Matching Using Cost-volume Filtering Approach

Tara Akhavan (Technology University of Vienna, Austria); Hyunjin Yoo (Technology University of Vienna, Austria); Margrit Gelautz (Vienna University of Technology, Austria)

Abstract: We present stereo matching solutions based on a fast cost volume filtering approach for High Dynamic Range (HDR) scenes. Multi-exposed stereo images are captured and used to generate HDR and Tone Mapped (TM) images of the left and right views. We perform stereo matching on conventional, Low Dynamic Range (LDR) images, original HDR, as well as TM images by customizing the matching algorithm for each of them. An evaluation on the disparity maps computed from the different approaches demonstrates that stereo matching on HDR images outperforms conventional LDR stereo matching and TM stereo matching, with the most discriminative disparity maps achieved by using HDR radiance information and log-luminance gradient values for matching cost calculation.


TH-L15-5: Analysis of the Consequences of Data Quality and Calibration on 3D HDR Image Generation

Jennifer Bonnard (University of Reims, France); Gilles Valette (University of Reims, France); Jean-Michel Nourrit (University of Reims, France); Celine Loscos (University of Reims, France)

Abstract: We propose to analyze the consequences of input data quality on 3D HDR image generation. Input data are images from different viewpoints and different exposures. The ease and precision of 3D HDR image merging depend on how input data are created or acquired. We study the benefits and drawbacks of using an inbuilt multiview camera against a single camera with a simulation on computer generated images. This work builds on a previously published 3D HDR method based on disparity to guide HDR matching. In this paper, we outline the errors that occur when too little precaution is taken, coming on the one hand from poor pixel quality and on the other hand from poor geometrical setup.


TH-L15-6: Real-time Video Based Lighting Using GPU Raytracing

Joel Kronander (Linköping University, Sweden); Johan Dahlin (Linköping University, Sweden); Daniel Jönsson (Linköping University, Sweden); Manon Kok (Linköping University, Sweden); Thomas B. Schön (Uppsala University, Sweden); Jonas Unger (Linköping University, Sweden)

Abstract: The recent introduction of HDR video cameras has enabled the development of image based lighting techniques for rendering virtual objects illuminated with temporally varying real world illumination. A key challenge in this context is that rendering realistic objects illuminated with video environment maps is computationally demanding. In this work we present a GPU based rendering system based on the NVIDIA OptiX framework enabling real time raytracing of scenes illuminated with video environment maps. For this purpose we explore and compare several Monte Carlo sampling approaches, including bidirectional importance sampling, multiple importance sampling and sequential Monte Carlo samplers. While previous work has focused on synthetic data and overly simple environment map sequences, we have collected a set of real world dynamic environment map sequences using a state-of-the-art HDR video camera for evaluation and comparisons.



Session TH-P1: Machine Learning I


TH-P1-1: Comparing Initialisation Methods for the Heuristic Memetic Clustering Algorithm

Bart Craenen (Brunel University, United Kingdom); Tapani Ristaniemi (University of Jyväskylä, Finland); Asoke Nandi (Brunel University, United Kingdom)

Abstract: This study investigates the effect of applying different initialisation methods to the Heuristic Memetic Clustering Algorithm (HMCA). Five initialisation methods commonly used by other types of clustering algorithms are examined. The effect on performance is demonstrated in an extensive experimental comparison on three benchmark datasets between the HMCA and k-Medoids. Analysis of the resulting effectiveness and efficiency metrics shows the HMCA substantially outperforming k-Medoids, with the HMCA capable of finding better clusterings using substantially less computational effort. The Sample and Cluster initialisation methods were found to be the most suitable for the HMCA, with our results suggesting this to be the case for other algorithms as well.


TH-P1-2: Sparse Matrix Decompositions for Clustering

Thomas Blumensath (University of Southampton, United Kingdom)

Abstract: Clustering can be understood as a matrix decomposition problem, where a feature vector matrix is represented as a product of two matrices, a matrix of cluster centres and a matrix with sparse columns, where each column assigns individual features to one of the cluster centres. This matrix factorisation is the basis of classical clustering methods, such as those based on non-negative matrix factorisation, but can also be derived for other methods, such as k-means clustering. In this paper we derive a new method that combines some aspects of both non-negative matrix factorisation and k-means clustering. We demonstrate empirically that the new approach outperforms other methods on a host of examples.
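
A minimal sketch of the factorisation view described in this abstract, assuming the k-means case: the data matrix X (one feature vector per column) is approximated by C S, where C holds the cluster centres and each column of S has a single nonzero entry selecting one centre. The function names and the toy data are hypothetical, and this is not the new method derived in the paper.

```python
# Illustrative sketch only: k-means written as a matrix factorisation X ≈ C S.
# X is d x n (one feature vector per column), C is d x k (cluster centres),
# and S is k x n with exactly one nonzero entry (a 1) per column.

def assign(X, C):
    """Return S: column j is the one-hot indicator of the centre nearest to column j of X."""
    d, n, k = len(X), len(X[0]), len(C[0])
    S = [[0.0] * n for _ in range(k)]
    for j in range(n):
        dists = [sum((X[i][j] - C[i][c]) ** 2 for i in range(d)) for c in range(k)]
        S[dists.index(min(dists))][j] = 1.0
    return S

def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def frobenius_error(X, C, S):
    """Squared Frobenius norm of X - C S, which equals the k-means objective."""
    R = matmul(C, S)
    return sum((X[i][j] - R[i][j]) ** 2
               for i in range(len(X)) for j in range(len(X[0])))

# Hypothetical toy data: four 2-D points forming two obvious groups.
X = [[0.0, 1.0, 10.0, 11.0],
     [0.0, 0.0, 0.0, 0.0]]
C = [[0.5, 10.5],
     [0.0, 0.0]]
S = assign(X, C)
err = frobenius_error(X, C, S)
```

Here the sparsity of S is the hard one-hot constraint of k-means; relaxing it to non-negative columns with several nonzero entries moves the same factorisation towards the NMF setting mentioned in the abstract.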


TH-P1-3: Boosting the Weights of Positive Words in Image Retrieval

Emmanouil Giouvanakis (Aristotle University of Thessaloniki, Greece); Constantine Kotropoulos (Aristotle University of Thessaloniki, Greece)

Abstract: In this paper, an image retrieval system based on the bag-of-words model is developed, which contains a novel query expansion technique. SIFT image features are computed using the Hessian-Affine keypoint detector. All feature descriptors are taken into account for the bag-of-words representation by dividing the full set of descriptors into a number of subsets. For each subset, a partial vocabulary is created and the final vocabulary is obtained by the union of the partial vocabularies. In the query expansion technique proposed, an SVM classifier is trained in order to obtain a decision boundary between the top ranked and the bottom ranked images. Treating this boundary as a new query, words appearing exclusively in top-ranked images are further boosted by rewarding them with larger weights. The images are re-ranked with respect to their distance from the new boosted query. It is proved that this strategy improves image retrieval performance.


TH-P1-4: Gait Feature Selection in Walker-Assisted Gait Using NSGA-II and SVM Hybrid Algorithm

Maria Martins (Minho University, Portugal); Cristina dos Santos (University of Minho, Guimarães, Portugal); Lino Costa (Minho University, Portugal); Anselmo Frizera (Federal University of Espirito Santo, Brazil)

Abstract: Nowadays, walkers are prescribed based on subjective standards, which leads to incorrect indication of such devices to patients. This increases dissatisfaction and the occurrence of discomfort and fall events. Therefore, it is necessary to objectively evaluate the effects that a walker can have on the gait patterns of its users, compared to non-assisted gait. A gait analysis focusing on spatiotemporal and kinematic parameters is carried out for this purpose. However, gait analysis yields redundant information, and this study addresses this problem by selecting the most relevant gait features required to differentiate between assisted and non-assisted gait. To do this, an approach is proposed that combines multi-objective genetic and support vector machine algorithms to discriminate differences. Results with healthy subjects have shown that the main differences are characterized by balance and joint excursion. Thus, one can conclude that this technique is an efficient feature selection approach.


TH-P1-5: Multiclass Ridge-adjusted Slack Variable Optimization Using Selected Basis for Fast Classification

Yinan Yu (Chalmers University of Technology, Sweden); Konstantinos Diamantaras (TEI of Thessaloniki, Greece); Tomas McKelvey (Chalmers University of Technology, Sweden); S. Y. Kung (Princeton University, USA)

Abstract: Ridge-adjusted Slack Variable Optimization (RiSVO) is a recently proposed classification algorithm for large-scale data. Its advantages, including fast computation and convergence, have been presented. In this paper, the multiclass RiSVO is presented. The contributions of this multiclass technique are as follows. (1) Active training set selection for improving robustness and performance. (2) Using the inclusion property to reduce computations. The inclusion property means that once a pattern is excluded, it will no longer return to the active training set and therefore can be permanently removed from the training procedure. (3) A new algorithm using a partial RKHS basis for representing solution vectors is presented to further reduce the complexity. Moreover, the label information encoding scheme allows the computational complexity to remain the same as that of its binary counterpart. The proposed techniques are evaluated on the standard multiclass datasets MNIST, USPS, pendigits and letter, so the results can be easily compared with existing ones.


TH-P1-6: Bayesian Classification and Active Learning Using Lp-Priors. Application to Image Segmentation

Pablo Ruiz (Universidad de Granada, Spain); Nicolás Pérez de la Blanca (University of Granada, Spain); Rafael Molina (Universidad de Granada, Spain); Aggelos K. Katsaggelos (Northwestern University, USA)

Abstract: In this paper we utilize Bayesian modeling and inference to learn a softmax classification model which performs Supervised Classification and Active Learning. For p < 1, lp-priors are used to impose sparsity on the adaptive parameters. Using variational inference, all model parameters are estimated and the posterior probabilities of the classes given the samples are calculated. A relationship between the prior model used and the independent Gaussian prior model is provided. The posterior probabilities are used to classify new samples and to define two Active Learning methods to improve classifier performance: Minimum Probability and Maximum Entropy. In the experimental section the proposed Bayesian framework is applied to Image Segmentation problems on both synthetic and real datasets, showing higher accuracy than state-of-the-art approaches.
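
As a rough illustration of how class posteriors can drive the two Active Learning criteria named in this abstract, the sketch below scores a pool of unlabeled samples with a plain softmax and picks the next sample to label. The definitions used here (largest posterior entropy, and smallest maximum class probability) are common textbook readings of these names and are assumptions; the paper's variational Bayesian inference and lp-priors are not reproduced, and all names and scores are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def select_max_entropy(posteriors):
    """Index of the pool sample whose class posterior has the largest entropy."""
    return max(range(len(posteriors)), key=lambda i: entropy(posteriors[i]))

def select_min_probability(posteriors):
    """Index of the pool sample whose most likely class has the smallest probability."""
    return min(range(len(posteriors)), key=lambda i: max(posteriors[i]))

# Hypothetical class scores for three unlabeled samples and three classes.
pool_scores = [[4.0, 0.0, 0.0], [1.0, 0.9, 0.0], [0.2, 0.1, 0.0]]
posteriors = [softmax(s) for s in pool_scores]
chosen_by_entropy = select_max_entropy(posteriors)
chosen_by_min_prob = select_min_probability(posteriors)
```

Both rules favour the sample the classifier is least certain about; they can disagree when a posterior is split between two classes while the remaining classes have negligible mass.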


TH-P1-7: Piecewise Nonlinear Regression Via Decision Adaptive Trees

Nuri Denizcan Vanli (Bilkent University, Turkey); Muhammed O Sayin (Bilkent University, Turkey); Salih Ergut (AveaLabs, Turkey); Suleyman Serdar Kozat (Bilkent University, Turkey)

Abstract: We investigate the problem of adaptive nonlinear regression and introduce tree based piecewise linear regression algorithms that are highly efficient and provide significantly improved performance with guaranteed upper bounds in an individual sequence manner. We partition the regressor space using hyperplanes in a nested structure according to the notion of a tree. In this manner, we introduce an adaptive nonlinear regression algorithm that not only adapts the regressor of each partition but also learns the complete tree structure with a computational complexity only polynomial in the number of nodes of the tree. Our algorithm is constructed to directly minimize the final regression error without introducing any ad-hoc parameters. Moreover, our method can be readily incorporated with any tree construction method as demonstrated in the paper.


TH-P1-8: Comprehensive Lower Bounds on Sequential Prediction

Nuri Denizcan Vanli (Bilkent University, Turkey); Muhammed O Sayin (Bilkent University, Turkey); Salih Ergut (AveaLabs, Turkey); Suleyman Serdar Kozat (Bilkent University, Turkey)

Abstract: We study the problem of sequential prediction of real-valued sequences under the squared error loss function. While refraining from any statistical and structural assumptions on the underlying sequence, we introduce a competitive approach to this problem and compare the performance of a sequential algorithm with respect to the large and continuous class of parametric predictors. We define the performance difference between a sequential algorithm and the best parametric predictor as "regret", and introduce guaranteed worst-case lower bounds on this relative performance measure. In particular, we prove that for any sequential algorithm, there always exists a sequence for which this regret is lower bounded by zero. We then extend this result by showing that the prediction problem can be transformed into a parameter estimation problem if the class of parametric predictors satisfies a certain property, and provide a comprehensive lower bound for this case.


TH-P1-9: On the Segmentation of Switching Autoregressive Processes by Nonparametric Bayesian Methods

Shishir Dash (Stony Brook University, USA); Petar M. Djurić (Stony Brook University, USA)

Abstract: We demonstrate the use of a variant of the nonparametric Bayesian (NPB) forward-backward (FB) method for sampling state sequences of hidden Markov models (HMMs), when the continuous-valued observations follow autoregressive (AR) processes. The goal is to get an accurate representation of the posterior probability of the state-sequence configuration. The advantage of using NPB samplers towards this end is well-known; one need not specify (or heuristically estimate) the number of states present in the model. Instead one uses hierarchical Dirichlet processes (HDPs) as priors for the state-transition probabilities to account for a potentially infinite number of states. The FB algorithm is known to increase the mixing rate of such samplers (compared to direct Gibbs), but can still yield significant spread in segmentation error. We show that by approximately integrating out some parameters of the model, one can alleviate this problem considerably.


TH-P1-10: Joint Low-Rank Representation and Matrix Completion Under a Singular Value Thresholding Framework

Christos Tzagkarakis (FORTH-ICS and University of Crete, Greece); Stephen Becker (IBM T. J. Watson Research Center, Yorktown Heights, New York, USA); Athanasios Mouchtaris (Foundation for Research and Technology-Hellas, Greece)

Abstract: Matrix completion is the process of estimating missing entries from a matrix using some prior knowledge. Typically, the prior knowledge is that the matrix is low-rank. In this paper, we present an extension of standard matrix completion that leverages prior knowledge that the matrix is low-rank and that the data samples can be efficiently represented by a fixed known dictionary. Specifically, we compute a low-rank representation of a data matrix with respect to a given dictionary using only a few observed entries. A novel modified version of the singular value thresholding (SVT) algorithm named joint low-rank representation and matrix completion SVT (J-SVT) is proposed. Experiments on simulated data show that the proposed J-SVT algorithm provides better reconstruction results compared to standard matrix completion.


TH-P1-11: Weight Moment Conditions for L^4 Convergence of Particle Filters for Unbounded Test Functions

Isambi Mbalawata (Lappeenranta University of Technology, Finland); Simo Särkkä (Aalto University, Finland)

Abstract: Particle filters are important approximation methods for solving probabilistic optimal filtering problems on nonlinear non-Gaussian dynamical systems. In this paper, we derive novel moment conditions for importance weights of sequential Monte Carlo based particle filters, which ensure the L^4 convergence of particle filter approximations of unbounded test functions. This paper extends the particle filter convergence results of Hu, Schön and Ljung (2008) and Mbalawata and Särkkä (2014) by allowing for a general class of potentially unbounded importance weights and hence more general importance distributions. The result shows that, provided the seventh order moment is finite, a particle filter for unbounded test functions with unbounded importance weights is ensured to converge.


TH-P1-12: Detrended Fluctuation Analysis for Empirical Mode Decomposition Based Denoising

Ahmet Mert (Piri Reis University, Turkey); Aydin Akan (Istanbul University, Turkey)

Abstract: Empirical mode decomposition (EMD) is a recently proposed method to analyze non-linear and non-stationary time series by decomposing them into intrinsic mode functions (IMFs). One of the most popular applications of such a method is noise elimination. EMD based denoising methods require a robust threshold to determine which IMFs are noise related components. Hence, detrended fluctuation analysis (DFA) is suggested to define such a threshold. The scaling exponent obtained from the root mean squared fluctuation is capable of distinguishing uncorrelated white Gaussian noise and anticorrelated signals. Therefore, in our method the slope of the scaling exponent is used as the threshold for EMD based denoising. IMFs with lower slope than the threshold are assumed to be noisy oscillations and excluded in the reconstruction phase. The proposed method is tested on various signal to noise ratios (SNR) to show its denoising performance and reliability compared with wavelet denoising.
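
The scaling exponent referred to in this abstract can be estimated with standard first-order DFA, sketched below under stated assumptions: the signal is integrated, split into non-overlapping windows, linearly detrended per window, and the slope of log fluctuation against log window size is returned. The function names, window sizes and test signal are hypothetical; the EMD step and the paper's thresholding rule are not reproduced.

```python
import math
import random

def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def dfa_exponent(signal, scales):
    """First-order DFA: slope of log F(n) versus log n over the given window sizes."""
    mean = sum(signal) / len(signal)
    # Integrated (cumulative sum) profile of the mean-removed signal.
    profile, total = [], 0.0
    for v in signal:
        total += v - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in scales:
        t = list(range(n))
        sq_sum, count = 0.0, 0
        # Non-overlapping windows; a trailing partial window is ignored.
        for start in range(0, len(profile) - n + 1, n):
            seg = profile[start:start + n]
            a, b = linear_fit(t, seg)
            sq_sum += sum((y - (a * x + b)) ** 2 for x, y in zip(t, seg))
            count += n
        log_n.append(math.log(n))
        # log of the root mean squared fluctuation F(n).
        log_f.append(0.5 * math.log(sq_sum / count))
    return linear_fit(log_n, log_f)[0]

# Hypothetical test signal: seeded white Gaussian noise.
rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(4096)]
alpha_white = dfa_exponent(white, [16, 32, 64, 128, 256])
```

For uncorrelated white noise the exponent is expected to lie near 0.5, which is the kind of reference value a per-IMF threshold could be compared against.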


TH-P1-13: Nonlinear System Identification Using Constellation Based Multiple Model Adaptive Estimators

João C Martins (Instituto Politécnico de Beja, Portugal); José Caeiro (Grupo de Sistemas de Processamento de Sinal – SIPS/INESC-ID, Portugal); Leonel A Sousa (INESC-ID / IST, Technical University of Lisbon, Portugal)

Abstract: This paper describes the application of the constellation based multiple model adaptive estimation (CBMMAE) algorithm to the identification and parameter estimation of nonlinear systems. This method was successfully applied to the identification of linear systems, both stationary and nonstationary, being able to fine tune its parameters. The method starts by establishing a minimum set of models that are geometrically arranged in the space spanned by the unknown parameters, and adopts a strategy to adaptively update the constellation's models in the parameter space in order to find the model resembling the system under identification. By downscaling the models' parameters the constellation is shrunk, reducing the uncertainty of the parameter estimation. Simulations are presented to exhibit the application of the framework and the performance of the algorithm in the identification and parameter estimation of nonlinear systems.


TH-P1-14: Iterative Label Propagation on Facial Images

Olga Zoidi (Aristotle University of Thessaloniki, Greece); Anastasios Tefas (Aristotle University of Thessaloniki, Greece); Nikos Nikolaidis (Aristotle University of Thessaloniki, Greece); Ioannis Pitas (Aristotle University of Thessaloniki, Greece)

Abstract: In this paper a novel method is introduced for propagating person identity labels on facial images in an iterative manner. The proposed method takes into account information about the data structure, obtained through clustering. This information is exploited in two ways: to regulate the similarity strength between the data and to indicate which samples should be selected for label propagation initialization. The proposed method can also find application in label propagation on multiple graphs. The performance of the proposed Iterative Label Propagation (ILP) method was evaluated on facial images extracted from stereo movies. Experimental results showed that the proposed method outperforms state of the art methods either when only one or both video channels are used for label propagation.



Session TH-P2: Speech Processing I


TH-P2-1: On the Use of Artificial Neural Network to Predict Denoised Speech Quality

Anis Ben Aicha (SUPCOM, Tunisia)

Abstract: Existing objective criteria for denoised speech assessment output a single score indicating the quality of the processed speech. Although such scores are useful for comparing denoising techniques with each other, they fail to indicate with sufficient accuracy the corresponding Mean Opinion Score (MOS). In this paper, we propose a new methodology to estimate the MOS of denoised speech. Firstly, a statistical study of existing criteria based on boxplots and Principal Component Analysis (PCA) is used to select the most relevant criteria. Then, an Artificial Neural Network (ANN) trained on the selected objective criteria applied to the denoised speech is used. Unlike traditional criteria, the proposed method gives a meaningful objective score that can be directly interpreted as an estimate of the real MOS. Experimental results show that the proposed method leads to more accurate estimation of the MOS of the denoised speech.


TH-P2-2: Automatic Recognition of Wideband Telephone Speech with Limited Amount of Matched Training Data

Patrick Bauer (Technische Universität Braunschweig, Germany); Johannes Abel (Technische Universität Braunschweig, Germany); Volker Fischer (European Media Laboratory GmbH, Germany); Tim Fingscheidt (Technische Universität Braunschweig, Germany)

Abstract: Automatic speech recognition (ASR) for wideband (WB) telephone speech services must cope with a lack of matching speech databases for acoustic model training. This paper investigates the impact of mixing insufficient WB and additional narrowband (NB) speech training data. It turns out that decimation and interpolation techniques, reducing the bandwidth mismatch between the NB speech material in training and the WB speech data to be recognized, do not succeed in outperforming the pure NB ASR baseline. However, true WB ASR training supported by artificial bandwidth extension (ABE) reveals a performance gain. A new ABE approach that makes use of robust dynamic features and a Viterbi path decoder exploiting phonetic a priori knowledge proves to be superior. It yields a reduction of 1.9% word error rate relative to the NB ASR baseline and 9.3% relative to a WB ASR experiment trained on only a limited amount of WB speech data.


TH-P2-3: Effect of MPEG Audio Compression on Vocoders Used in Statistical Parametric Speech Synthesis

Bajibabu Bollepalli (KTH Royal Institute of Technology, Sweden); Tuomo Raitio (Aalto University, Finland)

Abstract: This paper investigates the effect of MPEG audio compression on HMM-based speech synthesis using two state-of-the-art vocoders. Speech signals are first encoded with various compression rates and analyzed using the GlottHMM and STRAIGHT vocoders. Objective evaluation results show that the parameters of both vocoders gradually degrade with increasing compression rates, but with a clear increase in degradation at bit-rates of 32 kbit/s or less. Experiments with HMM-based synthesis with the two vocoders show that the degradation in quality is already perceptible at bit-rates of 32 kbit/s and both vocoders show a similar trend in degradation with respect to compression ratio. The most perceptible artefacts induced by the compression are spectral distortion and reduced bandwidth, while prosody is better preserved.


TH-P2-4: Source-based Error Mitigation for Speech Transmissions Over Erasure Channels

Domingo López-Oller (University of Granada, Spain); Angel Manuel Gomez Garcia (University of Granada, Spain); Jose L. Perez-Cordoba (University of Granada, Spain)

Abstract: In this paper we present a new mitigation technique for lost speech frames transmitted over loss-prone packet networks. It is based on an MMSE estimation from the last received frame, which provides replacements not only for the LPC coefficients (envelope) but also for the residual signal (excitation). Although the method is codec-independent, it requires a VQ-quantization of the LPC coefficients and the residual. Thus, in this paper we also propose a novel VQ quantization scheme for the residual signal based on the minimization of the squared synthesis error. The performance of our proposal is evaluated over the iLBC codec in terms of speech quality using PESQ and MUSHRA tests. This new mitigation technique achieves a noticeable improvement over the legacy codec under adverse channel conditions with no increase of bitrate and without any delay in the decoding process.


TH-P2-5: Modified Sphere Decoding Algorithms and Their Applications to Some Sparse Approximation Problems

Przemyslaw Dymarski (Warsaw University of Technology, Poland); Rafał Romaniuk (Warsaw University of Technology, Poland)

Abstract: This work presents modified Sphere Decoding (MSD) algorithms for optimal solution of some sparse signal modeling problems. These problems include, e.g., multi-pulse excitation signal calculation for MPE, ACELP and MP-MLQ speech coders, multistage Vector Quantization and MIMO communications in fading channels. Using the proposed MSD and sparse MSD (SMSD) algorithms, the optimal solution of these problems may be obtained at substantially lower computational cost compared with a full search algorithm. The SMSD algorithms are compared with a series of suboptimal approaches (like zero-forcing, global replacement, M-best search and Optimized Orthogonal Matching Pursuit) in sparse approximation of correlated Gaussian signals and low delay speech coding tasks.


TH-P2-6: Speech Rate Determination by Vowel Detection on the Modulated Energy Envelope

Tomas Dekens (Vrije Universiteit Brussel, Belgium); Heidi Martens (Antwerp University Hospital, Belgium); Gwen Van Nuffelen (Antwerp University Hospital, Belgium); Marc De Bodt (Antwerp University Hospital, Belgium); Werner Verhelst (Vrije Universiteit Brussel, Belgium)

Abstract: In this paper we propose a new algorithm to detect vowels in a speech utterance and infer the rate at which speech was produced. To achieve this we determine a smooth trajectory that corresponds to a high frequency energy envelope, modulated by the low frequency energy content. Peak picking performed on this trajectory gives an estimate of the number of vowels in the utterance. To dispose of falsely detected vowels, a peak pruning post processing step is incorporated. Experimental results show that the proposed algorithm is more accurate than the two speech rate determination algorithms by which it was inspired.


TH-P2-7: Watermarking of Speech Signals Based on Formant Enhancement

Shengbei Wang (Japan Advanced Institute of Science and Technology, Japan); Unoki Masashi (Japan Advanced Institute of Science and Technology, Japan)

Abstract: This paper proposes a speech watermarking method based on formant enhancement. The line spectral frequencies (LSFs) which can stably represent the formants were first derived from the host speech signal by linear prediction (LP) analysis. A pair of LSFs were then symmetrically controlled to enhance formants for watermark embedding. Two kinds of objective experiments regarding inaudibility and robustness were carried out to evaluate the proposed method in comparison with three other typical methods. The results indicated that the proposed method could not only satisfy inaudibility but also provide good robustness against different speech codecs and general processing, while the other methods encountered problems.


TH-P2-8: A Low Distortion Noise Canceller with a Novel Stepsize Control and Conditional Cancellation

Akihiko K. Sugiyama (NEC Corporation, Japan); Ryoji Miyahara (NEC Engineering Ltd., Japan)

Abstract: This paper proposes a low-distortion noise canceller with a novel stepsize control and conditional cancellation. The coefficient adaptation stepsize is controlled by two factors: an estimated signal-to-noise ratio (SNR) at the primary input and a relative coefficient magnitude normalized by the reference power. The SNR is estimated based on the noise replica and the output, and converted to a stepsize by an exponential function. This stepsize provides robustness to interference by the desired speech to the error. Conditional cancellation guarantees that the noisy signal power is reduced by noise-replica subtraction. Comparison of the proposed noise canceller with five popular state-of-the-art commercial smartphones demonstrates good enhanced-signal quality with as much as 0.6 PESQ improvement.


TH-P2-9: LBP Based Recursive Averaging for Babble Noise Reduction Applied to Automatic Speech Recognition

Qiming Zhu (University of Strathclyde, Glasgow, United Kingdom); John J Soraghan (University of Strathclyde, United Kingdom)

Abstract: Improved automatic speech recognition (ASR) in babble noise conditions continues to pose major challenges. In this paper, we propose a new local binary pattern (LBP) based speech presence indicator (SPI) to distinguish speech and non-speech components. Babble noise is subsequently estimated using recursive averaging. In the speech enhancement system optimally-modified log-spectral amplitude (OMLSA) uses the estimated noise spectrum obtained from the LBP based recursive averaging (LRA). The performance of the LRA speech enhancement system is compared to the conventional improved minima controlled recursive averaging (IMCRA). Segmental SNR improvements and perceptual evaluations of speech quality (PESQ) scores show that LRA offers superior babble noise reduction compared to the IMCRA system. Hidden Markov model (HMM) based word recognition results show a corresponding improvement.


TH-P2-10: A Speaker Rediarization Scheme for Improving Diarization in Large Two-Speaker Telephone Datasets

Houman Ghaemmaghami (Queensland University of Technology, Australia); David Dean (Queensland University of Technology, Australia); Sridha Sridharan (Queensland University of Technology, Australia)

Abstract: In this paper we propose a novel scheme for carrying out speaker diarization in an iterative manner. We aim to show that the information obtained through the first pass of speaker diarization can be reused to refine and improve the original diarization results. We call this technique speaker rediarization and demonstrate the practical application of our rediarization algorithm using a large archive of two-speaker telephone conversation recordings. We use the NIST 2008 SRE summed telephone corpus for evaluating our speaker rediarization system. This corpus contains recurring speaker identities across independent recording sessions that need to be linked across the entire corpus. We show that our speaker rediarization scheme can take advantage of inter-session speaker information, linked in the initial diarization pass, to achieve a 30% relative improvement over the original diarization error rate (DER) after only two iterations of rediarization.



Session TH-P3: Machine Learning II


TH-P3-1: A State-Space Approach to Modeling Functional Time Series: Application to Rail Supervision

Allou Samé (IFSTTAR, France); Hani El Assaad (IFSTTAR, France)

Abstract: This article introduces a state-space model for the dynamic modeling of curve sequences within the framework of railway switches online monitoring. In this context, each curve has the peculiarity of being subject to multiple changes in regime. The proposed model consists of a specific latent variable regression model whose coefficients are supposed to evolve dynamically in the course of time. The model parameters are identified online using a recursive variant of the Expectation-Maximization (EM) algorithm whose M-step involves Kalman filtering recursions. The experimental study conducted on two real power consumption curve sequences from the French high speed network has shown encouraging results.


TH-P3-2: Non-Redundant Gradient Semantic Local Binary Patterns for Pedestrian Detection

Jiu Xu (Waseda University, Japan); Ning Jiang (Waseda University, Japan); Satoshi Goto (Waseda University, Japan)

Abstract: In this paper, a feature named Non-Redundant Gradient Semantic Local Binary Patterns (NRGSLBP) is proposed for pedestrian detection as a modified version of conventional Semantic Local Binary Patterns (SLBP). Calculations of this feature are carried out for both the intensity and the gradient magnitude image so that texture and gradient information are combined. Moreover, non-redundant patterns are adopted on SLBP for the first time, allowing better discrimination. Compared with SLBP, NRGSLBP incurs no additional cost in feature dimensions, and its calculation complexity is considerably smaller than that of other features. Experimental results on several datasets show that the detection rate of our proposed feature outperforms those of other features such as Histogram of Oriented Gradients (HOG), Histogram of Templates (HOT), Bidirectional Local Template Patterns (BLTP), Gradient Local Binary Patterns (GLBP), SLBP and Covariance matrix (COV).


TH-P3-3: Modelling Temporal Variations by Polynomial Regression for Classification of Radar Tracks

Lars Jochumsen (Aalborg University, Denmark); Jan Østergaard (Aalborg University, Denmark); Søren Holdt Jensen (Aalborg University, Denmark); Morten Pedersen (Terma A/S, Denmark)

Abstract: The sampling rate of a radar is often too low to reliably capture the acceleration of moving targets such as birds. Moreover, the sampling rate depends upon the target's speed and heading and will therefore generally be time varying. When classifying radar tracks using temporal features, too low or highly varying sampling rates therefore deteriorate the classifier's performance. In this work, we propose to model the temporal variations of the target's speed by low-order polynomial regression and use this to obtain the conditional statistics of the target's speed at some future time given its speed at the current time. When used in a classifier based on Gaussian mixture models and with real radar data, it is shown that the inclusion of conditional statistics describing the target's temporal variations leads to a substantial improvement in the overall classification performance.


TH-P3-4: Joint Blind Source Separation of Multidimensional Components: Model and Algorithm

Dana Lahat (Gipsa-Lab, France); Christian Jutten (GIPSA-Lab, France)

Abstract: This paper deals with joint blind source separation (JBSS) of multidimensional components. JBSS extends classical BSS to simultaneously resolve several BSS problems by assuming statistical dependence between latent sources across mixtures. JBSS offers some significant advantages over BSS, such as identifying more than one Gaussian white stationary source within a mixture. Multidimensional BSS extends classical BSS to deal with a more general and more flexible model within each mixture: the sources can be partitioned into groups exhibiting dependence within a given group but independence between two different groups. Motivated by various applications, we present a model that is inspired by both extensions. We derive an algorithm that achieves asymptotically the minimal mean square error (MMSE) in the joint separation of Gaussian multidimensional data. We demonstrate the superior performance of this model over a two-step approach, in which JBSS, which ignores the multidimensional structure, is followed by a clustering step.


TH-P3-5: Balance Learning to Rank in Big Data

Guanqun Cao (Tampere University of Technology, Finland); Iftikhar Ahmad (Tampere University of Technology, Finland); Honglei Zhang (Tampere University of Technology, Finland); Weiyi Xie (Tampere University of Technology, Finland); Moncef Gabbouj (Tampere University of Technology, Finland)

Abstract: We propose a distributed learning to rank method, and demonstrate its effectiveness in web-scale image retrieval. With the increasing amount of data, it is not feasible to train a centralized ranking model for large scale learning problems. In distributed learning, the discrepancy between the training subsets and the whole data set when building the models is non-trivial but has been overlooked in previous work. In this paper, we first add a cost factor to boosting algorithms to balance the individual models toward the whole data. Then, we propose to decompose the original algorithm into multiple layers, whose aggregation forms a superior ranker which can be easily scaled up to billions of images. Extensive experiments show that the proposed method outperforms the straightforward aggregation of boosting algorithms.


TH-P3-6: On the Need for Metrics in Dictionary Learning Assessment

Sylvain Chevallier (University of Versailles-Saint Quentin, France); Quentin Barthélemy (Mensia Technologies, France); Jamal Atif (Universite Paris Sud Orsay - LRI, TAO, INRIA, France)

Abstract: Dictionary-based approaches are the focus of growing attention in the signal processing community, often achieving state-of-the-art results in several application fields. Despite their success, the criteria introduced so far for assessing their performance suffer from several shortcomings. The scope of this paper is to conduct a thorough analysis of these criteria and to highlight the need for principled criteria that enjoy the properties of metrics. Hence we introduce new criteria based on transportation-like metrics and discuss their behavior w.r.t. the literature.


TH-P3-7: A Latent Variable-Based Bayesian Regression to Address Recording Replications in Parkinson's Disease

Carlos Pérez (University of Extremadura, Spain); Lizbeth Naranjo (University of Extremadura, Spain); Jacinto Martín (University of Extremadura, Spain); Yolanda Campos-Roca (University of Extremadura, Spain)

Abstract: Subject-based approaches are proposed to automatically discriminate healthy people from those with Parkinson's Disease (PD) by using speech recordings. These approaches have been applied to one of the most used PD datasets, which contains repeated measurements in an imbalanced design. Most of the published methodologies applied to perform classification from this dataset fail to account for the dependent nature of the data. This artificially increases the sample size and leads to a diffuse criterion to define which subject is suffering from PD. The first proposed approach is based on data aggregation. This reduces the sample size, but defines a clear criterion to discriminate subjects. The second one handles repeated measurements by introducing latent variables in a Bayesian logistic regression framework. The proposed approaches are conceptually simple, computationally low-cost and easy-to-implement.


TH-P3-8: A Family of Hierarchical Clustering Algorithms Based on High-Order Dissimilarities

Helena Aidos (Instituto de Telecomunicações, Instituto Superior Técnico, Portugal); Ana Fred (I.S.T. - Technical U. Lisbon / I.T. Lisbon, Portugal)

Abstract: Traditional hierarchical techniques are used in many areas of research. However, they require the user to set the number of clusters or use some external criterion to find them. Also, they are unable to identify varying internal structures in classes, i.e. classes can be represented as unions of clusters. To overcome these issues, we propose a family of agglomerative hierarchical methods, which integrates a high-order dissimilarity measure, called dissimilarity increments, in traditional linkage algorithms. Dissimilarity increments are a measure over triplets of nearest neighbors. This family of algorithms is able to automatically find the number of clusters using a minimum description length criterion based on the dissimilarity increments distribution. Moreover, each algorithm of the proposed family is able to find classes as unions of clusters, leading to the identification of internal structures of classes. Experimental results show that any algorithm from the proposed family outperforms the traditional ones.


TH-P3-9: Segmentation and Time Frequency Analysis of Pathological Heart Sound Signals Using the EMD Method

Daoud Boutana (University of Jijel, Algeria); Messaoud Benidir (University of Paris 11, France); Braham Barkat (The Petroleum Institute, UAE)

Abstract: The Phonocardiogram (PCG) is the graphical representation of the acoustic energy due to mechanical cardiac activity. Cardiac diseases sometimes produce pathological murmurs mixed with the main components of the Heart Sound Signals (HSs). The Empirical Mode Decomposition (EMD) decomposes a multicomponent signal into a set of monocomponent signals, called Intrinsic Mode Functions (IMFs). Each IMF represents an oscillatory mode with one instantaneous frequency. The goal of this paper is to segment some pathological HSs by selecting the most appropriate IMFs using the correlation coefficient. Then we extract some time-frequency characteristics considered as useful parameters to distinguish different cases of heart diseases. Experimental results on real-life pathological HSs, namely Mitral Regurgitation (MR), Aortic Regurgitation (AR) and the Opening Snap (OS) case, demonstrate the performance of the proposed method.
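
Editorial sketch (not from the paper): the abstract's idea of keeping IMFs according to their correlation coefficient with the recorded signal can be illustrated as below. The IMFs are assumed to come from an existing EMD routine; the function names, the 0.5 threshold, and the toy sinusoids standing in for IMFs are all hypothetical choices made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # a constant sequence carries no correlation information
    return cov / (sx * sy)


def select_imfs(signal, imfs, threshold=0.5):
    """Indices of IMFs whose |correlation| with the signal reaches the threshold.

    The threshold value is an assumption, not a value taken from the paper.
    """
    return [i for i, imf in enumerate(imfs)
            if abs(pearson(signal, imf)) >= threshold]


# Toy stand-ins for IMFs: a dominant slow component and a weak fast one.
t = [k / 100.0 for k in range(200)]
slow = [math.sin(2 * math.pi * 2 * tk) for tk in t]
fast = [0.1 * math.sin(2 * math.pi * 40 * tk) for tk in t]
signal = [a + b for a, b in zip(slow, fast)]

selected = select_imfs(signal, [fast, slow])
```

Here only the dominant component passes the threshold; with real heart sounds the threshold would need tuning on data.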


TH-P3-10: An Efficient, Approximate Path-Following Algorithm for Elastic Net-Based Nonlinear Spike Enhancement

Max Little (MIT, USA)

Abstract: Unwanted 'spike noise' in a digital signal is a common problem in digital filtering. However, sometimes the spikes are wanted and other, superimposed signals are unwanted, and linear time-invariant (LTI) filtering is ineffective because the spikes are wideband, overlapping with independent noise in the frequency domain. No LTI filter can therefore separate them, necessitating nonlinear filtering. However, there are applications in which the 'noise' includes drift or smooth signals for which LTI filters are ideal. We describe a nonlinear filter formulated as the solution to an elastic net regularization problem, which attenuates band-limited signals and independent noise while enhancing superimposed spikes. Making use of known analytic solutions, a novel approximate path-following algorithm is given that provides a good filtered output with reduced computational effort compared with standard convex optimization methods. Accurate performance is shown on real, noisy electrophysiological recordings of neural spikes.



Session TH-P4: Speech Processing II


TH-P4-1: An Automotive Wideband Stereo Acoustic Echo Canceler Using Frequency-Domain Adaptive Filtering

Marc-André Jung (Technische Universität Braunschweig, Germany); Samy Elshamy (Technische Universität Braunschweig, Germany); Tim Fingscheidt (Technische Universität Braunschweig, Germany)

Abstract: We present an improved state-space frequency-domain acoustic echo canceler (AEC), which makes use of Kalman filtering theory to achieve very good convergence performance, particularly in double talk. Our contribution is threefold. First, the proposed approach is designed to suit an automotive wideband overlap-save (OLS) setup, to operate best in this distinctive use case. Second, we provide a temporal smoothing and overestimation approach for two particular noise covariance matrices to improve echo return loss enhancement (ERLE) performance. Third, we integrate an adapted perceptually transparent decorrelation preprocessor, which makes use of human insensitivity to appropriately chosen frequency-selective phase modulation, to improve robustness against far-end impulse response changes.


TH-P4-2: A Strategy for LF-based Glottal Source & Vocal-Tract Estimation on Stationary Modal Singing

Fernando Villavicencio (Yamaha Corporation, Japan)

Abstract: This paper presents a methodology for extraction and modeling of the glottal source and vocal-tract information. The strategy, focused on stationary modal voice, allows joint source and filter estimation by selection of glottal pulse model candidates driven by a single shape parameter. The vocal-tract information is modelled in terms of the True-Envelope All-Pole model, allowing efficient extraction of the observed filter information on the spectrum after cancellation of the glottal source contribution. According to experimental studies on synthetic and real signals the complete methodology shows competitive performance for estimation of the source and filter elements, allowing adequate resynthesis quality with synthetic glottal excitation after simple optimization of the estimated parameters.


TH-P4-3: Zero Phase Speech Representation for Robust Formant Tracking

Dayana Ribas Gonzalez (Advanced Technologies Application Center, Cuba); Eduardo Lleida (University of Zaragoza, Spain); Jose Ramon Calvo de Lara (Advanced Technologies Application Center, Cuba)

Abstract: In this paper we present a speech representation based on the Linear Predictive Coding of the Zero Phase version of the signal (ZP-LPC) and study its robustness to additive noise for formant estimation. Two representations are proposed for use in the frequency candidate proposition stage of the formant tracking algorithm: 1) the roots of ZP-LPC and 2) the peaks of its group delay function (GDF). Both are studied and evaluated in noisy environments with a synthetic dataset to demonstrate their robustness. The proposed representations are then used in a formant tracking experiment with a speech database. A beam search algorithm is used to select the best candidates as formants. Results show that our method outperforms related techniques in noisy test configurations and is a good fit for applications that have to work in noisy environments.


TH-P4-4: Gaussian Power Flow Orientation Coefficients for Noise-Robust Speech Recognition

Branislav Gerazov (Faculty of Electrical Engineering and Information Technologies, Ss. Cyril and Methodius University, Macedonia, the former Yugoslav Republic of); Zoran Ivanovski (Ss. Cyril and Methodius University, Macedonia, the former Yugoslav Republic of)

Abstract: Spectro-temporal features have shown great promise with respect to improving the noise-robustness of Automatic Speech Recognition (ASR) systems. The common approach uses a bank of 2D Gabor filters to process the speech signal spectrogram and generate the output feature vector. This approach suffers from generating a large number of coefficients, thus necessitating the use of feature dimensionality reduction. The proposed Gaussian Power flow Orientation Coefficients (GPOCs) use an alternative approach in which only the largest coefficients output from a bank of 2D Gaussian kernels are used to describe the spectro-temporal patterns of power flow in the auditory spectrogram. Whilst reducing the size of the feature vectors, the algorithm was shown to outperform traditional feature extraction methods, and even a reference spectro-temporal approach, for low SNRs. For high SNRs its performance is comparable to, though below, that of traditional ASR frontends, and it falls behind state-of-the-art algorithms in all noise scenarios.


TH-P4-5: Wake-Up-Word Spotting for Mobile Systems

Andreas Zehetner (Graz University of Technology, Austria); Martin Hagmüller (Graz University of Technology, Austria); Franz Pernkopf (Technical University Graz, Austria)

Abstract: Wake-up-word (WUW) spotting for mobile devices has attracted much attention recently. The aim is to detect the occurrence of very few or only one personalized keyword in a continuous potentially noisy audio signal. The application in personal mobile devices is to activate the device or to trigger an alarm in hazardous situations by voice. In this paper, we present a low-resource approach and results for WUW spotting based on template matching using dynamic time warping and other measures. The recognition of the WUW is performed by a combination of distance measures based on a simple background noise level classification. For evaluation we recorded a WUW spotting database with three different background noise levels, four different speaker distances to the microphone, and ten different speakers. It consists of 480 keywords embedded in continuous audio data.
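
Editorial sketch (not from the paper): template matching with dynamic time warping, which the abstract names as one of its distance measures, can be illustrated on scalar feature sequences as below. The absolute-difference local cost, the length normalization, and the decision threshold are assumptions; the paper's combination of several measures driven by a noise-level classifier is not reproduced.

```python
def dtw_distance(template, query):
    """Classic DTW alignment cost between two scalar sequences.

    Local cost is |a - b|; allowed steps are match, insertion and deletion.
    """
    n, m = len(template), len(query)
    inf = float("inf")
    # cost[i][j] = best cost of aligning template[:i] with query[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(template[i - 1] - query[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # deletion
                                 cost[i][j - 1],      # insertion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]


def is_wake_word(template, query, threshold):
    """Accept the query when the length-normalized DTW cost is small.

    Both the normalization and the threshold are hypothetical choices.
    """
    score = dtw_distance(template, query) / (len(template) + len(query))
    return score <= threshold
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated sample is absorbed by the warping path, which is what makes DTW tolerant to differences in speaking rate.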


TH-P4-6: Efficient Rule Scoring for Improved Grapheme-Based Lexicons

William Hartmann, III (LIMSI-CNRS, France); Lori Lamel (CNRS Limsi, France); Jean-Luc Gauvain (LIMSI, France)

Abstract: For many languages, an expert-defined phonetic lexicon may not exist. One popular alternative is the use of a grapheme-based lexicon. However, there may be a significant difference between the orthography and the pronunciation of the language. In previous work, we proposed a statistical machine translation based approach to improving grapheme-based pronunciations. Without knowledge of true target pronunciations, a phrase table was created where each individual rule improved the likelihood of the training data when applied. The approach improved recognition accuracy, but required significant computational cost. In this work, we propose an improvement that increases the speed of the process by more than 80 times without decreasing recognition accuracy.


TH-P4-7: Missing Feature Reconstruction Methods for Robust Speaker Identification

XueLiang Zhang (Inner Mongolia University, P.R. China); Hui Zhang (Inner Mongolia University, P.R. China); Guanglai Gao (Inner Mongolia University, P.R. China)

Abstract: Speaker identification systems perform poorly under noisy conditions. Missing feature techniques have recently been applied to compensate for the influence of noise. These techniques classify the time-frequency points as unreliable or reliable and perform recognition by marginalizing or reconstructing unreliable components. In this study, we present reconstruction methods based on a hybrid generative model comprising a deep belief network (DBN) and a restricted Boltzmann machine (RBM). Specifically, the ideal binary mask (IBM) is first computed to label time-frequency points as unreliable or reliable. Then we iteratively reconstruct the unreliable ones with the proposed model. Finally, the reconstructed features are passed to the speaker identification system. Experiments demonstrate that the proposed method achieves significant performance improvements over the conventional missing feature method under a wide range of signal-to-noise ratios.
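
Editorial sketch (not from the paper): an ideal binary mask is commonly defined by comparing the local SNR of each time-frequency unit with a local criterion, which requires separate access to the clean speech and the noise. The snippet below follows that common definition; the 0 dB criterion, the function name, and the handling of zero-energy units are assumptions.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label each time-frequency unit as reliable (1) or unreliable (0).

    speech_energy and noise_energy are equally shaped lists of rows holding
    non-negative energies. A unit is reliable when its local SNR in dB
    exceeds lc_db (0 dB here, an assumed value).
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                row.append(0)  # no speech energy: unreliable
            elif n <= 0.0:
                row.append(1)  # speech without noise: reliable
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


mask = ideal_binary_mask([[4.0, 1.0], [0.5, 0.0]],
                         [[1.0, 1.0], [1.0, 2.0]])
```

In this toy case only the first unit, whose speech energy is four times the noise energy, is marked reliable.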


TH-P4-8: Combining Temporal and Spectral Information for Query-By-Example Spoken Term Detection

Ciro Gracia Pons (University Pompeu Fabra, Spain); Xavier Anguera (Telefonica Research, Spain); Xavier Binefa (Universitat Pompeu Fabra, Spain)

Abstract: We present a system for Query-by-Example Spoken Term Detection on zero-resource languages. The system compares speech patterns by representing the signal using two different acoustic models: a Spectral Acoustic (SA) model covering the spectral characteristics of the signal, and a Temporal Acoustic (TA) model covering the temporal evolution of the speech signal. Given a query and an utterance to be compared, we first compute their posterior probabilities according to each of the two models, then compute similarity matrices for each model and combine these into a single enhanced matrix. The Subsequence-Dynamic Time Warping (S-DTW) algorithm is used to find optimal subsequence alignment paths on this final matrix. Our experiments on data from the 2013 Spoken Web Search (SWS) task at the MediaEval benchmark evaluation show that this approach provides state-of-the-art results and significantly improves on both the single-model strategies and the standard metric baselines.


TH-P4-9: Analysis of Emotional Speech Using an Adaptive Sinusoidal Model

George P. Kafentzis (University of Crete, Greece); Theodora Yakoumaki (University of Crete, Greece); Athanasios Mouchtaris (Foundation for Research and Technology-Hellas, Greece); Yannis Stylianou (University of Crete, Greece)

Abstract: Processing of emotional (or expressive) speech has gained attention over recent years in the speech community due to its numerous applications. In this paper, an adaptive sinusoidal model (aSM), dubbed extended adaptive Quasi-Harmonic Model - eaQHM, is employed to analyze emotional speech in accurate, robust, continuous, time-varying parameters (amplitude, frequency, and phase). It is shown that these parameters can adequately and accurately represent emotional speech content. Using a well known database of narrow-band expressive speech (SUSAS) we show that very high Signal-to-Reconstruction-Error Ratio (SRER) values can be obtained, compared to the standard sinusoidal model (SM). Formal listening tests on a smaller wideband speech database show that the eaQHM outperforms SM from a perceptual resynthesis quality point of view. Finally, preliminary emotion classification tests show that the parameters obtained from the adaptive model lead to a higher classification score, compared to the standard SM parameters.



Session TH-P5: Image and Video Analysis I


TH-P5-1: Parameter Estimation in Bayesian Blind Deconvolution with Super Gaussian Image Priors

Miguel Vega (University of Granada, Spain); Rafael Molina (Universidad de Granada, Spain); Aggelos K. Katsaggelos (Northwestern University, USA)

Abstract: Super Gaussian (SG) distributions have proven to be very powerful prior models to induce sparsity in Bayesian Blind Deconvolution (BD) problems. Their conjugate-based representations make them especially attractive when Variational Bayes (VB) inference is used, since their variational parameters can be calculated in closed form with the sole knowledge of the energy function of the prior model. In this work we show how introducing a global strength (not necessarily scale) parameter in the SG distribution can be used to improve the quality of the obtained restorations as well as to introduce additional information on the global weight of the prior. A model to estimate the new unknown parameter within the Bayesian framework is provided. Experimental results, on both synthetic and real images, demonstrate the effectiveness of the proposed approach.


TH-P5-2: Restoration of Images Corrupted by Mixed Gaussian-Impulse Noise by Iterative Soft-Hard Thresholding

Marko Filipović (Rudjer Boskovic Institute, Croatia); Ante Jukić (University of Oldenburg, Germany)

Abstract: We address the problem of restoring images which have been affected by impulse noise or a combination of impulse and Gaussian noise. We propose a patch-based approach that exploits the approximate sparse representation of image patches in a learned dictionary. For every patch, sparse representation in the learned dictionary is enforced by an l1-norm penalty, and sparsity of the residual is enforced by an l0-quasi-norm penalty. The resulting non-convex problem is solved iteratively by a combination of soft and hard thresholding, and a proof of convergence to a local minimum is given. Experimental evaluation suggests that the proposed approach can produce state-of-the-art results for some types of images, especially in terms of the structural similarity (SSIM) measure.
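
Editorial sketch (not from the paper): the two elementwise operators that the abstract combines are standard. Soft thresholding is the shrinkage step associated with an l1 penalty, and hard thresholding is the keep-or-kill step associated with an l0 penalty. The function names and threshold values below are illustrative, and the paper's full iterative scheme with a learned dictionary is not reproduced.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become zero."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def hard_threshold(x, lam):
    """Keep x unchanged when |x| exceeds lam, otherwise set it to zero."""
    return x if abs(x) > lam else 0.0


coeffs = [3.0, -0.5, -2.0, 0.2]
soft = [soft_threshold(c, 1.0) for c in coeffs]  # shrinks the survivors
hard = [hard_threshold(c, 1.0) for c in coeffs]  # leaves survivors intact
```

The difference matters for impulse noise: hard thresholding leaves large residual entries (candidate impulses) at full amplitude, whereas soft thresholding biases every surviving coefficient toward zero.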


TH-P5-3: Sparse Reconstruction of Facial Expressions with Localized Gabor Moments

André Mourão (Universidade Nova Lisbon, Portugal); Pedro Borges (Universidade Nova de Lisboa, Portugal); Nuno Correia (Computer Science, Portugal); Joao Magalhaes (Universidade Nova Lisboa, Portugal)

Abstract: Facial expression recognition relies on the accurate detection of a few subtle face traces. Facial expressions are decomposed into a set of small Action Units (AU) corresponding to the movement and position of different face muscles [1]. We propose to (1) decompose facial expressions into regions grouping AUs by their proximity and contour direction; (2) recognize facial expressions with sparse reconstruction methods. We aim at finding the minimal set of facial AU regions that can represent a given expression. Regression with l1 regularization computes the deviation from the average face as an additive model of facial micro-expressions (the AUs). We compared the proposed approach to existing methods on the CK+ [2] and JAFFE datasets [3]. Results indicate that sparse reconstruction with l1 penalty outperforms SVM and k-NN baselines. On the CK+ dataset, the best accuracy (97%) was obtained using sparse reconstruction.


TH-P5-4: A Compressible Template Protection Scheme for Face Recognition Based on Sparse Representation

Yuichi Muraki (Tokyo Metropolitan University, Japan); Masakazu Furukawa (Tokyo Metropolitan University, Japan); Masaaki Fujiyoshi (Tokyo Metropolitan University, Japan); Yoshihide Tonomura (NTT, Japan); Hitoshi Kiya (Tokyo Metropolitan University, Japan)

Abstract: In applications using face recognition, facial images called templates should be securely managed for privacy protection and security. This paper studies a sparse representation-based face recognition system with a new template protection scheme. The proposed scheme uses two transformations for template protection: random pixel permutation and downsampling. Thanks to these transformations, protected templates can be efficiently compressed, whereas conventional schemes do not offer such functionality. Experimental results demonstrate that the system does not degrade face recognition performance even when facial templates are protected. Thus, the proposed scheme can reduce the storage size required to keep templates in practical face recognition systems.


TH-P5-5: Near Light Source Location Estimation Using Illumination of a Diffused Planar Background

Nopporn Chotikakamthorn (King Mongkut's Institute of Technology Ladkrabang, Bangkok, Thailand)

Abstract: The problem of light source location estimation is considered. It is shown that the location of a near light source can be estimated from an optical-depth image pair using information available from an illuminated Lambertian planar background. The method estimates the projected source location on the planar background from the surface illumination gradient. The distance of the light source from the planar background, which is equivalent to its elevation angle, is estimated by fitting the radiance of the background surface as observed in an optical image to the radiances synthesized at different light distances. The fitting equation is formulated such that the possible existence of ambient and environment lights can be taken into account. Experimental results with real images are provided.


TH-P5-6: A Study on Clustering-based Image Denoising: From Global Clustering to Local Grouping

Mohsen Joneidi (Sharif University of Technology, Iran); Mostafa Sadeghi (Sharif University of Technology, Iran); Mojtaba Sahraee-Ardakan (Sharif University of Technology, Iran); Massoud Babaie-Zadeh (Sharif University of Technology, Iran); Christian Jutten (GIPSA-Lab, France)

Abstract: This paper studies de-noising of images contaminated with additive white Gaussian noise (AWGN). In recent years, clustering-based methods have shown promising performances. In this paper we show that low-rank subspace clustering provides a suitable clustering problem that minimizes the lower bound on the MSE of the de-noising, which is optimum for Gaussian noise. Solving the corresponding clustering problem is not easy. We study some global and local sub-optimal solutions already presented in the literature and show that those that solve a better approximation of our problem result in better performances. A simple image de-noising method based on dictionary learning using the idea of gain-shaped K-means is also proposed as another global suboptimal solution for clustering.


TH-P5-7: Image Warmness - A New Perceptual Feature for Images and Videos

Michail Dimopoulos (Telemotive AG, Germany); Thomas Winkler (IAIS Fraunhofer, Germany)

Abstract: Many basic but very useful features for characterizing an image or calculating the similarity between two images are based on color information. Beyond the tone of color, psychological studies show that different colors are also associated with different emotions. Thus, two colors that trigger the same impression are most likely considered to be more similar than two colors that trigger opposite impressions. We introduce a new feature called image warmness, which is based on the cold or warm impression a single color triggers in the brain of the beholder. Image warmness provides a measure of how cold or warm an entire image is perceived to be by human beings based on the colors it contains. In a survey and evaluation with 90 images and 101 participants we show that the values for image warmness calculated by the proposed formula are close to the average rating of the survey participants.


TH-P5-8: An Epipolar-Constrained Prior for Efficient Search in Multi-View Scenarios

Ignacio Bosch (Technicolor, Germany); Jordi Salvador (Technicolor, Germany); Eduardo Pérez-Pellitero (Technicolor, Germany); Javier Ruiz-Hidalgo (Universitat Politecnica de Catalunya, Spain)

Abstract: In this paper we propose a novel framework for fast exploitation of multi-view cues with applicability in different image processing problems. In order to bring our proposed framework into practice, an epipolar-constrained prior is presented, onto which a random search algorithm is proposed to find good matches among the different views of the same scene. This algorithm includes a generalization of the local coherency in 2D images for multi-view wide-baseline cases. Experimental results show that the geometrical constraint allows a faster initial convergence when finding good matches. We present some applications of the proposed framework on classical image processing problems.


TH-P5-9: Optimized Size-adaptive Feature Extraction Based on Content-matched Rational Wavelet Filters

Tan-Toan Le (Pforzheim University, Germany); Mathias Ziebarth (Karlsruhe Institute of Technology, Germany); Thomas Greiner (Pforzheim University, Germany); Michael Heizmann (Fraunhofer IOSB, Germany)

Abstract: One of the challenges of feature extraction in image processing is that objects belonging to a feature class do not always appear in a single size, so feature sizes are diverse. Hence, a multiresolution analysis using wavelets should be suitable. Because of their integer scaling factors, classical dyadic or M-channel wavelet filter banks often do not match the feature sizes occurring within the image very well. This paper presents a new method to optimally extract features of different sizes by designing a rational biorthogonal wavelet filter bank that matches both the features' characteristics and the sizes of the most dominant features. This is achieved by matching the rational downsampling factor to the different feature sizes and matching the filter coefficients to the feature characteristics. The presented method is evaluated on the detection of defects on specular surfaces and of contaminations on manufactured metal surfaces.


TH-P5-10: Multiscale Keypoint Analysis with Triangular Biorthogonal Wavelets Via Redundant Lifting

Kensuke Fujinoki (Tokai University, Japan)

Abstract: This paper presents an efficient approach for multiscale keypoint detection based on triangular biorthogonal wavelets. The detection scheme is simple and thus fast as only three isotropic directional components of an image obtained by multiscale decomposition with the triangular biorthogonal wavelets are used for keypoint localization at each scale. Redundant lifting is also considered and can be applied directly to calculate cumulative local energy distribution that is derived from the correction of the three directional components at each scale. This gives the efficient and accurate localization of keypoints including scale information. An experimental result shows that our method is better in the sense of the uniform distribution of keypoints compared with the conventional wavelet-based approach.


TH-P5-11: Pornography Detection Using BossaNova Video Descriptor

Carlos Caetano (Federal University of Minas Gerais, Brazil); Sandra Avila (University of Campinas, Brazil); Silvio Guimarães (PUC Minas, Brazil); Arnaldo Araújo (Federal University of Minas Gerais, Brazil)

Abstract: In certain environments or for certain audiences, pornographic content may be considered inappropriate, generating the need for it to be detected and filtered. Most works regarding pornography detection are based on the detection of human skin. However, a shortcoming of this kind of approach is the high false positive rate in contexts like beach shots or sports. Considering the development of low-level local features and the emergence of mid-level representations, we introduce a new video descriptor, which employs local binary descriptors in conjunction with BossaNova, a recent mid-level representation. Our proposed method outperforms the state-of-the-art on the Pornography dataset.


TH-P5-12: A Texton for Fast and Flexible Gaussian Texture Synthesis

Bruno Galerne (Université Paris Descartes, France); Arthur Leclaire (Université Paris Descartes, France); Lionel Moisan (Université Paris Descartes, France)

Abstract: Gaussian textures can be easily simulated by convolving an initial image sample with a conveniently normalized white noise. However, this procedure is not very flexible (it does not allow for non-uniform grids in particular), and can become computationally heavy for large domains. We here propose an algorithm that summarizes a texture sample into a synthesis-oriented texton, that is, a small image for which the discrete spot noise simulation (summed and normalized randomly-shifted copies of the texton) is more efficient than the classical convolution algorithm. Using this synthesis-oriented texture summary, Gaussian textures can be generated in a faster, simpler, and more flexible way.
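
Editorial sketch (not from the paper): the discrete spot noise simulation described in the abstract, summed and normalized randomly-shifted copies of a texton, can be written directly as below. The sketch assumes a zero-mean texton, periodic boundary conditions, uniformly random integer shifts, and a 1/sqrt(n) normalization; the paper's procedure for computing the synthesis-oriented texton itself is not reproduced.

```python
import math
import random


def spot_noise(texton, out_h, out_w, n_spots, seed=0):
    """Sum n_spots randomly shifted copies of texton on a periodic grid.

    texton is a list of rows of floats, assumed zero-mean (an assumption of
    this sketch). The sum is scaled by 1/sqrt(n_spots) so that the variance
    of the result stays bounded as the number of spots grows.
    """
    rng = random.Random(seed)
    th, tw = len(texton), len(texton[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for _ in range(n_spots):
        dy, dx = rng.randrange(out_h), rng.randrange(out_w)
        for i in range(th):
            for j in range(tw):
                # Wrap around the borders (periodic boundary assumption).
                out[(dy + i) % out_h][(dx + j) % out_w] += texton[i][j]
    scale = 1.0 / math.sqrt(n_spots)
    return [[v * scale for v in row] for row in out]


texture = spot_noise([[1.0, -1.0], [-1.0, 1.0]], out_h=8, out_w=8, n_spots=16)
```

The cost grows with the number of spots times the texton size rather than with a full-domain convolution, which is why a small texton makes this route attractive on large domains.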


TH-P5-13: Determination of Retinal Network Skeleton Through Mathematical Morphology

Sandra Morales (Universitat Politècnica de València, Spain); Valery Naranjo (Universidad Politecnica de Valencia, Spain); Jesus Angulo (MINES ParisTech, France); Fernando López-Mir (Universidad Politécnica de Valencia, Spain); Mariano Alcañiz (Universidad Politécnica de Valencia, Spain)

Abstract: This paper describes a new approach to determine vascular skeleton in retinal images. This approach is based on mathematical morphology along with curvature evaluation. In particular, a variant of the watershed transformation, the stochastic watershed, is applied to extract the vessel centerline. Its goal is to obtain directly the skeleton of the retinal tree avoiding a previous stage of vessel segmentation in order to reduce the dependence between stages and the computational cost. Experimental results show qualitative improvements if the proposed method is compared with other state-of-the-art algorithms, above all on pathological images. Therefore, the result of this work is an efficient and effective vessel centerline extraction algorithm and can be useful for further applications and image-aided diagnosis systems.



Session TH-P6: Signal Estimation and Detection III


TH-P6-1: Moving Target Detection in Airborne MIMO Radar for Fluctuating Target RCS Model

Shabnam Ghotbi (K. N. Toosi University of Technology, Iran); Moein Ahmadi (K. N. Toosi University of Technology, Iran); Mohammad ali Sebt (K. N. Toosi University of Technology, Tehran, Iran)

Abstract: This paper considers the problem of target detection for multiple-input multiple-output (MIMO) radar with colocated antennas on a moving airborne platform. The target's radar cross section fluctuations degrade the detection performance of the radar. In this paper, we first introduce a spatiotemporal signal model for airborne colocated MIMO radar which handles arbitrary transmit and receive antenna placement. Then, we employ the likelihood ratio test to derive the decision rules for fluctuating and non-fluctuating targets. In the case of full knowledge of the target and interference statistical characteristics, we propose two detectors for fluctuating and non-fluctuating targets. The proposed detectors can be used to evaluate adaptive detectors such as the Kelly detector, where the interference covariance matrix is estimated using training data. Simulation results are provided to evaluate the detection performance of the proposed detectors.


TH-P6-2: Automatic WH-based Edge Detector in Weibull Clutter

Souad Chabbi (Université Constantine 1, Laboratoire Signaux et Systèmes de Communications, Algeria); Toufik Laroussi (Université Constantine 1, Laboratoire Signaux et Systèmes de Communications, Algeria); Amar Mezache (Département Electronique, Laboratoire Signaux et Systèmes de Communications, Algeria)

Abstract: Assuming a non-stationary Weibull background with no prior knowledge about the presence or absence of a clutter edge, we propose and analyze the censoring and detection performance of the automatic censoring Weber-Haykin constant false censoring and alarm rates (ACWH-CFCAR) detector in homogeneous clutter and in the presence of a clutter edge within the reference window. The CFCAR property is ensured by the Weber-Haykin (WH) adaptive thresholding, which avoids estimating the distribution parameters. The censoring algorithm starts by considering the two leftmost ranked cells and proceeds forward. The selected homogeneous set is used to estimate the unknown background level. Extensive Monte Carlo simulations show that the performance of the proposed detector is similar to that of the corresponding fixed-point censoring WH-CFAR detector.


TH-P6-3: Source Number Estimation in Non-Gaussian Noise

Gargeshwari V. Anand (PES University, India); P. Nagesh (PES University, India)

Abstract: In this paper a new method of source number estimation in non-Gaussian noise is presented. The proposed signal sub-space identification (SSI) method involves estimation of the array signal correlation matrix and determining the number of positive eigenvalues of the estimated correlation matrix. The SSI method is closely related to minimum residual mean square error estimation of the array signal vector. The method is applied to the problem of estimating the number of plane wave sinusoidal signals impinging on a uniform linear array. It is shown that the performance of the SSI method in non-Gaussian heavy-tailed noise is significantly better than that of the Akaike information criterion (AIC) and Gershgorin disk estimator (GDE) methods.
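
For orientation, the AIC baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the classical eigenvalue-based AIC order estimate, not the SSI method proposed in the paper. The function name `aic_source_count` and its arguments are hypothetical, and the eigenvalues of the sample correlation matrix are assumed to be precomputed and strictly positive.

```python
import math


def aic_source_count(eigenvalues, num_snapshots):
    """Estimate the number of sources from sample-covariance eigenvalues.

    Classical AIC order selection (an assumed baseline, not the SSI method):
    for each candidate count k, compare the geometric and arithmetic means
    of the (m - k) smallest eigenvalues and add a complexity penalty.
    """
    lam = sorted(eigenvalues, reverse=True)
    m = len(lam)
    best_k, best_score = 0, float("inf")
    for k in range(m):  # candidate source counts 0 .. m-1
        noise = lam[k:]  # eigenvalues attributed to the noise subspace
        arith = sum(noise) / len(noise)
        geo = math.exp(sum(math.log(v) for v in noise) / len(noise))
        log_lik = num_snapshots * len(noise) * math.log(geo / arith)
        score = -2.0 * log_lik + 2.0 * k * (2 * m - k)
        if score < best_score:
            best_k, best_score = k, score
    return best_k
```

When the noise eigenvalues are nearly equal, the likelihood term vanishes and the penalty alone decides, which is why this criterion relies on a well-behaved noise model.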


TH-P6-4: Hybrid Bayesian Variational Scheme to Handle Parameter Selection in Total Variation Signal Denoising

Jordan Frecon (ENS Lyon, France); Nelly Pustelnik (ENS Lyon, France); Nicolas Dobigeon (University of Toulouse, France); Herwig Wendt (IRIT - ENSEEIHT, CNRS, France); Patrice Abry (Ecole Normale Superieure, Lyon, France)

Abstract: Change-point detection problems can be solved either by variational approaches based on total variation or by Bayesian procedures. The former class leads to small computational time but requires the choice of a regularization parameter that significantly impacts the achieved solution and whose automated selection remains a challenging problem. Bayesian strategies avoid this regularization parameter selection, at the price of high computational costs. In this contribution, we propose a hybrid Bayesian variational procedure that relies on the use of a hierarchical Bayesian model while preserving the computational efficiency of total variation optimization procedures. Behavior and performance of the proposed method compare favorably against those of a fully Bayesian approach, both in terms of accuracy and of computational time. Additionally, estimation performance is compared to that of the Stein unbiased risk estimate, for which the knowledge of the noise variance is needed.


TH-P6-5: Compressed Spectrum Sensing in the Presence of Interference: Comparison of Sparse Recovery Strategies

Eva Lagunas (Universitat Politècnica de Catalunya, Spain); Montse Nájar (Universitat Politècnica de Catalunya, Spain)

Abstract: Existing approaches to Compressive Sensing (CS) of sparse spectrum have thus far assumed models contaminated with noise (either bounded noise or Gaussian with known power). In practical Cognitive Radio (CR) networks, primary users must be detected even in the presence of low-regulated transmissions from unlicensed systems, which cannot be taken into account in the CS model because of their non-regulated nature. In [1], the authors proposed an overcomplete dictionary that contains tuned spectral shapes of the primary user to sparsely represent the primary users' spectral support, thus allowing all frequency location hypotheses to be jointly evaluated in a global unified optimization framework. Extraction of the primary user frequency locations is then performed based on sparse signal recovery algorithms. Here, we compare different sparse reconstruction strategies and we show through simulation results the link between the interference rejection capabilities and the positive semidefinite character of the residual autocorrelation matrix.


TH-P6-6: Compressed Sensing K-Best Detection for Sparse Multi-User Communications

Benjamin Knoop (University of Bremen, Germany); Fabian Monsees (University of Bremen, Germany); Carsten Bockelmann (University of Bremen, Germany); Dagmar Peters-Drolshagen (University of Bremen, Germany); Steffen Paul (University Bremen, Germany); Armin Dekorsy (University of Bremen, Germany)

Abstract: Machine-type communications often have very low data rates and a sporadic nature, and are therefore not well suited to today's high-data-rate cellular communication systems. Since signaling overhead must be reasonable in relation to message size, research towards joint activity and data estimation was initiated. When the detection of sporadic multi-user signals is modeled as a sparse vector recovery problem, signaling concerning node activity can be avoided, as was demonstrated in previous work. In this paper we show how well-known K-Best detection can be modified to approximately solve this finite alphabet Compressed Sensing problem. We also demonstrate that this approach is robust against parameter variations and even works in cases where fewer measurements than unknown sources are available.


TH-P6-7: CFAR Detection of Spatially Distributed Targets in K-Distributed Clutter with Unknown Parameters

Nabila Nouar (Laboratoire SISCOM, Algeria); Atef Farrouki (Laboratoire SISCOM, Algeria)

Abstract: The paper deals with Constant False Alarm Rate (CFAR) detection of spatially distributed targets embedded in K-distributed clutter with correlated texture and unknown parameters. The proposed Cell Averaging-based detector automatically selects the suitable pre-computed threshold factor in order to maintain a prescribed Probability of False Alarm (Pfa). The threshold factors should be computed off-line through Monte Carlo simulations for different clutter parameters and correlation degrees. The online estimation procedure of clutter parameters has been implemented using the Maximum Likelihood Moments approach. Performance analysis of the proposed detector assumes unknown shape and scale parameters and a Multiple Dominant Scattering centers (MDS) model for spatially distributed targets.
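
The cell-averaging principle behind such detectors can be sketched as follows. This is a minimal illustration of textbook CA-CFAR for a point target in exponentially distributed clutter power, where the threshold factor has a closed form; it is not the K-distributed, distributed-target detector of the paper, whose threshold factors come from off-line Monte Carlo tables. The function name `ca_cfar` and its parameters are hypothetical.

```python
def ca_cfar(power, num_ref, num_guard, pfa):
    """Cell-averaging CFAR over a 1-D list of square-law detector outputs.

    Assumes exponentially distributed clutter power (an assumption made for
    this sketch only), for which Pfa = (1 + alpha / N) ** (-N).
    num_ref is the total number of reference cells and is assumed even.
    """
    half_ref = num_ref // 2
    n = 2 * half_ref
    # Closed-form threshold factor for the exponential clutter model
    alpha = n * (pfa ** (-1.0 / n) - 1.0)
    margin = half_ref + num_guard
    detections = []
    # Only test cells whose leading and lagging windows fit in the data
    for i in range(margin, len(power) - margin):
        lead = power[i - num_guard - half_ref: i - num_guard]
        lag = power[i + num_guard + 1: i + num_guard + 1 + half_ref]
        noise_level = (sum(lead) + sum(lag)) / n  # cell-averaged background
        if power[i] > alpha * noise_level:
            detections.append(i)
    return detections
```

For clutter models without a closed-form threshold, `alpha` would instead be looked up from a pre-computed table indexed by the estimated clutter parameters, as the abstract describes.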


TH-P6-8: Distribution Mixtures, A Reduced-Bias Estimation Algorithm

Nicolas Paul (EDF R&D, France); Alexandre Girard (Electricité de France, France); Michel Terré (CNAM, France)

Abstract: We focus on the definition of a new optimization criterion for the estimation of mixtures of distributions, based on an evolution of the K-Product criterion [5]. For the case of monovariate observations we show that the new proposed criterion does not have any local non-global minimizer. This property is also observed for multivariate observations. The relevance of the new K-Product criterion is theoretically studied and analyzed through simulations (in some monovariate cases). We show that for a mixture of three separate uniform components, the distance between the criterion's unique minimizer and the true component expectations is less than half the components' standard deviation.


TH-P6-9: Support Agnostic Bayesian Recovery of Jointly Sparse Signals

Mudassir Masood (King Abdullah University of Science and Technology (KAUST), Saudi Arabia); Tareq Y. Al-Naffouri (King Abdullah University of Science and Technology, USA)

Abstract: A matching pursuit method using a Bayesian approach is introduced for recovering a set of sparse signals with common support from a set of their measurements. This method performs Bayesian estimates of joint-sparse signals even when the distribution of active elements is not known. It utilizes only the a priori statistics of noise and the sparsity rate of the signal, which are estimated without user intervention. The method utilizes a greedy approach to determine the approximate MMSE estimate of the joint-sparse signals. Simulation results demonstrate the superiority of the proposed estimator.


TH-P6-10: Efficient Spectral Analysis in the Missing Data Case Using Sparse ML Methods

George-Othon Glentis (University of Peloponnese, Greece); Johan Karlsson (KTH Royal Institute of Technology, Sweden); Andreas Jakobsson (Lund University, Sweden); Jian Li (University of Florida, USA)

Abstract: Given their wide applicability, several sparse high-resolution spectral estimation techniques and their implementation have been examined in the recent literature. In this work, we further the topic by examining a computationally efficient implementation of the recent SMLA algorithms in the missing data case. The work is an extension of our implementation for the uniformly sampled case, and offers a notable computational gain as compared to the alternative implementations in the missing data case.


TH-P6-11: Enhanced Radar Imaging Via Sparsity Regularized 2D Linear Prediction

Isin Erer (Istanbul Technical University, Turkey); Koray Sarıkaya (M. Sc., Turkey); Haldun Bozkurt (Istanbul Technical University, Turkey)

Abstract: ISAR imaging based on the 2D linear prediction uses the l2 norm minimization of the prediction error to obtain 2D autoregressive (AR) model coefficients. However, this approach causes many spurious peaks in the resulting image. In this study, a new ISAR imaging method based on the 2D sparse AR modeling of backscattered data is proposed. The 2D model coefficients are obtained by the l2 norm minimization of the prediction error penalized by the l1 norm of the prediction coefficient vector. The resulting 2D prediction coefficient vector is sparse, and its use yields radar images with reduced sidelobes compared to the classical l2 norm minimization.
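
The penalized criterion described above can be written schematically as follows. The symbols are assumptions introduced here for illustration and are not taken from the paper: x stacks the backscattered samples to be predicted, X collects the neighboring samples used as predictors, a is the vectorized 2D prediction coefficient vector, and λ > 0 is a regularization weight.

```latex
\hat{\mathbf{a}} \;=\; \arg\min_{\mathbf{a}} \;
  \left\| \mathbf{x} - \mathbf{X}\mathbf{a} \right\|_2^2
  \;+\; \lambda \left\| \mathbf{a} \right\|_1
```

Setting λ = 0 recovers the classical l2 (least-squares) linear prediction; larger λ drives more coefficients to zero.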


TH-P6-12: A Fast and Accurate Adaptive Notch Filter Using A Monotonically Increasing Gradient

Yosuke Sugiura (Tokyo University of Science, Japan)

Abstract: In order to remove narrow-band noise embedded in a wide-band signal quickly and accurately, we propose a new adaptive algorithm for an adaptive notch filter. The proposed algorithm achieves fast and accurate estimation of the noise frequency by introducing a monotonically increasing gradient. The enhancement function, which is additionally introduced into the gradient, enables flexible adjustment of the convergence speed and the estimation accuracy. Several computational simulations show that the proposed algorithm can simultaneously provide fast convergence and highly accurate estimation compared with the conventional NLMS algorithm.


TH-P6-13: An Empirical Eigenvalue-Threshold Test for Sparsity Level Estimation From Compressed Measurements

Anastasia Lavrenko (Ilmenau University of Technology, Germany); Florian Roemer (Ilmenau University of Technology, Germany); Giovanni Del Galdo (Fraunhofer Institute for Integrated Circuits IIS, Germany); Reiner S. Thomä (Ilmenau University of Technology, Germany); Orhan Arikan (Bilkent University, Turkey)

Abstract: Compressed sensing allows for a significant reduction of the number of measurements when the signal of interest is of a sparse nature. Most computationally efficient algorithms for signal recovery rely on some knowledge of the sparsity level, i.e., the number of non-zero elements. However, the sparsity level is often not known a priori and can even vary with time.

In this contribution we show that it is possible to estimate the sparsity level directly in the compressed domain, provided that multiple independent observations are available. In fact, one can use classical model order selection algorithms for this purpose. Nevertheless, due to the influence of the measurement process they may not perform satisfactorily in the compressed sensing setup. To overcome this drawback, we propose an approach which exploits the empirical distributions of the noise eigenvalues. We demonstrate its superior performance compared to state-of-the-art model order estimation algorithms numerically.


TH-P6-14: Compressive Sensing with an Overcomplete Dictionary for High-Resolution DFT Analysis

Guglielmo Frigo (University of Padova, Italy); Claudio Narduzzi (Universita' di Padova, Italy)

Abstract: The problem of resolving frequency components close to the Rayleigh threshold, while using time-domain sample sequences of length not greater than N, is relevant to several waveform monitoring applications where acquisition time is upper-bounded. The paper presents a compressive sensing (CS) algorithm that enhances frequency resolution by introducing a dictionary that explicitly accounts for spectral leakage on a fine frequency grid. The proposed algorithm achieves good estimation accuracy without significantly extending total measurement time.


TH-P6-15: Reconstruction of Locally Frequency Sparse Nonstationary Signals From Random Samples

Moeness G. Amin (Villanova University, USA); Branka Jokanovic (Villanova University, USA); Traian Dogaru (US Army Research Lab, USA)

Abstract: The local sparsity property of frequency modulated (FM) signals stems from their instantaneous narrowband characteristics. This enables their reconstruction from few random signal observations over a short time window. It is shown that for linear FM signals, the sparsity of the local frequencies is equal to the window length, thus adding another specification to the window selection requirements, besides the conventional temporal and spectral resolutions. Stable signal reconstruction within a sliding window depends on the underlying probability distribution of the random sampling intervals. Both simulations and computational EM modeling data are used to demonstrate the effectiveness of local reconstructions. We consider both mono-component FM signals and multi-component signals, corresponding to maneuvering targets and human gait Doppler signatures, respectively.


TH-P6-16: Robust SPArsity and Clustering Regularization for Regression

Xiangrong Zeng (Instituto de Telecomunicações, Instituto Superior Técnico, Portugal); Mario A. T. Figueiredo (Instituto Superior Técnico, Portugal)

Abstract: Based on our previously proposed SPARsity and Clustering (SPARC) regularization, we propose a robust variant of SPARC (RSPARC), which is able to detect observations corrupted by sparse outliers. The proposed RSPARC inherits the ability of SPARC to promote group-sparsity, and combines that ability with robustness to outliers. We propose algorithms of the alternating direction method of multipliers (ADMM) family to solve several regularization formula



Session FR-L01: Speech Processing I


FR-L01-1: Speech Recognition of Multiple Accented English Data Using Acoustic Model Interpolation

Thiago Fraga da Silva (Université Paris Sud, France); Jean-Luc Gauvain (LIMSI, France); Lori Lamel (CNRS Limsi, France)

Abstract: In previous work [1], we have shown that model interpolation can be applied for acoustic model adaptation for a specific show. Compared to other approaches, this method has the advantage of being highly flexible, allowing rapid adaptation by simply reassigning the interpolation coefficients. In this work this approach is used for the recognition of multi-accented English broadcast news data, which can be considered an arduous task due to the impact of accent variability on the recognition performance. The work described in [1] is extended in two ways. First, in order to reduce the number of parameters of the interpolated model, a theoretically motivated EM-like mixture reduction algorithm is proposed. Second, beyond supervised adaptation, model interpolation is used as an unsupervised adaptation framework, where the interpolation coefficients are estimated on-the-fly for each test segment.


FR-L01-2: Acoustic Model Selection Using Limited Data for Accent Robust Speech Recognition

Maryam Najafian (University of Birmingham, United Kingdom)

Abstract: This paper investigates techniques to compensate for the effects of regional accents of British English on automatic speech recognition (ASR) performance. Given a small amount of speech from a new speaker, is it better to apply speaker adaptation, or to use accent identification (AID) to identify the speaker's accent followed by accent-dependent ASR? Three approaches to accent-dependent modelling are investigated: using the 'correct' accent model, choosing a model using supervised (ACCDIST-based) accent identification (AID), and building a model using data from neighbouring speakers in 'AID space'. All of the methods outperform the accent-independent model, with relative reductions in ASR error rate of up to 44%. Using on average 43 s of speech to identify an appropriate accent-dependent model outperforms using it for supervised speaker adaptation, by 7%.


FR-L01-3: Robust Speech Recognition Using Warped DFT-Based Cepstral Features in Clean and Multistyle Training

Md. Jahangir Alam (Computer Research Institute of Montreal (CRIM), Canada); Patrick Kenny (CRIM, Canada); Pierre Dumouchel (Ecole de technologie superieure, Canada); Douglas O'Shaughnessy (INRS-Énergie-Matériaux-Télécommunications, Canada)

Abstract: This paper investigates the robustness of the warped discrete Fourier transform (WDFT)-based cepstral features for continuous speech recognition. In the MFCC and PLP front-ends, to approximate the nonlinear characteristics of the human auditory system in frequency, the speech spectrum is warped using the Mel-scale filterbank. It is well known that such nonlinear frequency transformation based features provide better speech recognition accuracy than linear frequency scale features. It has been found that warping the DFT spectrum directly provides a more precise approximation to the perceptual scales. WDFT provides non-uniform resolution filter-banks whereas DFT provides uniform resolution filter-banks. Here, we provide a performance evaluation of the following variants of the warped cepstral features: WDFT, and WDFT-linear prediction (WDFT-LP)-based MFCC. Experiments are carried out on the AURORA-4 task and demonstrate that the WDFT-based cepstral features outperform MFCC and PLP features in terms of recognition error rate under both clean and multistyle training conditions.


FR-L01-4: Class-Dependent Two-Dimensional Linear Discriminant Analysis Using Two-Pass Recognition Strategy

Peter Viszlay (Technical University of Košice, Slovakia); Martin Lojka (Technical University of Kosice, Slovakia); Jozef Juhár (Technical University of Kosice, Slovakia)

Abstract: In this paper, we introduce a novel class-dependent extension of two-dimensional linear discriminant analysis (2DLDA) named CD-2DLDA, applied in automatic speech recognition using a two-pass recognition strategy. In the first pass, the class labels of the test samples are obtained using baseline recognition. The labels are then used in the CD transformation of the test features. In the second pass, recognition of the previously transformed test samples is performed using the CD-2DLDA acoustic model. The novelty of the paper lies in improving the present 2DLDA algorithm by modifying it to use more precise, class-dependent estimations repeated separately for each class. The proposed approach is evaluated in several scenarios using the TIMIT corpus in a phoneme-based continuous speech recognition task. CD-2DLDA features are compared to state-of-the-art MFCCs, conventional LDA and 2DLDA features. The experimental results show that our method performs better than MFCCs and LDA. Furthermore, the results confirm that CD-2DLDA markedly outperforms the 2DLDA method.


FR-L01-5: Evaluation of Speech Enhancement Based on Pre-Image Iterations Using Automatic Speech Recognition

Christina Leitner (JOANNEUM RESEARCH Forschungsgesellschaft mbH, Austria); Juan A. Morales-Cordovilla (Graz University of Technology, Austria); Franz Pernkopf (Technical University Graz, Austria)

Abstract: Recently, we developed pre-image iteration methods for single-channel speech enhancement. We used objective quality measures for evaluation. In this paper, we evaluate the de-noising capabilities of pre-image iterations using an automatic speech recognizer trained on clean speech data. In particular, we provide the word recognition accuracy of the de-noised utterances using white and car noise at 0, 5, 10, and 15 dB signal-to-noise ratio (SNR). Empirical results show that the utterances processed by pre-image iterations achieve a consistently better word recognition accuracy for both noise types and all SNR levels compared to the noisy data and the utterances processed by the generalized subspace speech enhancement method.



Session FR-L02: Image and Video Applications


FR-L02-1: Semi-local Total Variation for Regularization of Inverse Problems

Laurent Condat (University of Grenoble-Alpes, France)

Abstract: We propose the discrete semi-local total variation (SLTV) as a new regularization functional for inverse problems in imaging. The SLTV favors piecewise linear images; so the main drawback of the total variation (TV), its clustering effect, is avoided. Recently proposed primal-dual methods allow the corresponding optimization problems to be solved as easily and efficiently as with the classical TV.


FR-L02-2: A Hybrid Alternating Proximal Method for Blind Video Restoration

Feriel Abboud (Université Paris-Est Marne-la-Vallée, France); Emilie Chouzenoux (Université Paris-Est Marne-la-Vallée, France); Jean-Christophe Pesquet (Université Paris-Est, France); Jean-Hugues Chenot (INA, Institut National de l'Audiovisuel, France); Louis Laborelli (INA, France)

Abstract: Old analog video sequences suffer from a number of degradations. Some of them can be modeled through convolution with a kernel and an additive noise term.

In this work, we propose a new blind deconvolution algorithm for the restoration of such sequences based on a variational formulation of the problem. Our method accounts for motion between frames, while enforcing some level of temporal continuity through the use of a novel penalty function involving optical flow operators, in addition to an edge-preserving regularization. The optimization process is performed by a proximal alternating minimization scheme benefiting from theoretical convergence guarantees. Simulation results on synthetic and real video sequences confirm the effectiveness of our method.


FR-L02-3: Voting Based Automatic Exudate Detection in Color Fundus Photographs

Pavle Prentašić (University of Zagreb, Faculty of Electrical Engineering and Computing, Croatia); Sven Lončarić (University of Zagreb, Croatia)

Abstract: Diabetic retinopathy is one of the leading causes of preventable blindness. Screening programs using color fundus photographs enable early diagnosis of diabetic retinopathy, which enables timely treatment of the disease. Exudate detection algorithms are important for development of automatic screening systems and in this paper we present a method for detection of exudate regions in color fundus photographs. The method combines different preprocessing and candidate extraction algorithms to increase the exudate detection accuracy. First, we form an ensemble of different candidate extraction algorithms, which are used to increase the accuracy. After extracting the potential exudate regions we apply machine learning based classification for detection of exudate regions. For experimental validation we use the DRiDB color fundus image set where the presented method achieves higher accuracy in comparison to other state-of-the-art methods.


FR-L02-4: Retinal Blood Vessel Extraction Using Curvelet Transform and Conditional Fuzzy Entropy

Sudeshna Sil Kar (Indian Institute of Engineering Science and Technology, Shibpur, India); Santi Prasad Maity (Bengal Engineering & Science University, Shibpur, India); Claude Delpha (Universite Paris Sud - L2S, France)

Abstract: This work employs multiple thresholds on the matched filter response for automatic extraction of blood vessels, especially from a low-contrast and non-uniformly illuminated retinal background. The curvelet transform is used first to enhance the finest details along the vessels, followed by matched filtering to intensify the blood vessels' response. Maximization of the conditional fuzzy entropy is considered to find the optimal thresholds to extract the different types of vessel silhouettes from the background. The Differential Evolution algorithm is used to specify the optimal combination of the fuzzy parameters. Next, the enhanced image is classified into thin, medium and thick vessels, which are then logically OR-ed to obtain the entire vascular tree. Performance is evaluated on the publicly available DRIVE database and is compared with the existing blood vessel extraction methods. Simulation results demonstrate that the proposed method outperforms the existing methods in detecting the various types of vessels.


FR-L02-5: Bone Microstructure Reconstructions From Few Projections with Stochastic Nonlinear Diffusion

Lin Wang (INSA de Lyon, France); Bruno Sixou (INSA Lyon, France); Francoise Peyrin (Universite de Lyon INSA Lyon, France)

Abstract: In this work, we use a stochastic diffusion equation for the reconstruction of binary tomography cross-sections obtained from a small number of projections. The aim of this new method is to escape from local minima by changing the shape of the boundaries of the image. First, an initial binary image is reconstructed with a deterministic Total Variation regularization method, and then this binary reconstructed image is refined by a stochastic partial differential equation with singular diffusivity and a gradient dependent noise. This method is tested on a 256 × 256 experimental micro-CT trabecular bone image with different additive Gaussian noises. The reconstruction images are clearly improved.



Session FR-L03: Compressive Sensing and Sparsity


FR-L03-1: Gridless Compressive-sensing Methods for Frequency Estimation

Petre Stoica (Uppsala University, Sweden); Gongguo Tang (Colorado School of Mines, USA); Zai Yang (Nanyang Technological University, Singapore); Dave Zachariah (Uppsala University, Sweden)

Abstract: The gridless compressive-sensing methods form the most recent class of approaches that have been proposed for estimating the frequencies of sinusoidal signals from noisy measurements. In this paper we review these methods with the main goal of providing new insights into the relationships between them and their links to the basic approach of nonlinear least squares (NLS). We show that a convex relaxation of penalized NLS leads to the atomic-norm minimization method. This method in turn can be approximated by a gridless version of the SPICE method, for which the dual problem is shown to be equivalent to the global matched filter method.


FR-L03-2: Piecewise Toeplitz Matrices-based Sensing for Rank Minimization

Kezhi Li (University of Cambridge, United Kingdom); Cristian Rojas (Royal Institute of Technology (KTH), Sweden); Saikat Chatterjee (KTH - Royal Institute of Technology, Sweden); Håkan Hjalmarsson (KTH-Royal Institute of TEchnology, Sweden)

Abstract: This paper proposes a set of piecewise Toeplitz matrices as the linear mapping/sensing operator $\mathcal{A}: \mathbf{R}^{n_1 \times n_2} \rightarrow \mathbf{R}^M$ for recovering low rank matrices from few measurements. We prove that such operators efficiently encode the information so there exists a unique reconstruction matrix under mild assumptions. This work provides a significant extension of the compressed sensing and rank minimization theory, and it reduces the memory required for storing the sampling operator from $\mathcal{O}(n_1 n_2 M)$ to $\mathcal{O}(\max(n_1,n_2)M)$, at the expense of increasing the number of measurements by $r$. Simulation results show that the proposed operator can recover low rank matrices efficiently with a reconstruction performance close to the cases of using random unstructured operators.


FR-L03-3: Combined Modeling of Sparse and Dense Noise Improves Bayesian RVM

Martin Sundin (Royal Institute of Technology - KTH, Sweden); Saikat Chatterjee (KTH - Royal Institute of Technology, Sweden); Magnus Jansson (KTH Royal Institute of Technology, Sweden)

Abstract: Using a Bayesian approach, we consider the problem of recovering sparse signals under additive sparse and dense noise. Typically, sparse noise models outliers, impulse bursts or data loss. To handle sparse noise, existing methods simultaneously estimate sparse noise and sparse signal of interest. For estimating the sparse signal, without estimating the sparse noise, we construct a Relevance Vector Machine (RVM). In the RVM, sparse noise and ever present dense noise are treated through a combined noise model. Through simulations, we show the efficiency of new RVM for three applications: kernel regression, housing price prediction and compressed sensing.


FR-L03-4: Sparsity-Aware Learning in the Context of Echo Cancelation: A Set Theoretic Estimation Approach

Yannis Kopsinis (University of Athens, Greece); Symeon Chouvardas (University of Athens, Greece); Sergios Theodoridis (University of Athens, Greece)

Abstract: In this paper, the set-theoretic based adaptive filtering task is studied for the case where the input signal is nonstationary and may assume relatively small values. Such a scenario is often faced in practice, with a notable application that of echo cancellation. It turns out that very small input values can trigger undesirable behaviour of the algorithm leading to severe performance fluctuations. The source of this malfunction is geometrically investigated and a solution complying with the set-theoretic philosophy is proposed. The new algorithm is evaluated in realistic echo-cancellation scenarios and compared with state-of-the-art methods for echo cancellation such as the IPNLMS and IPAPA algorithms.


FR-L03-5: Greedy Methods for Simultaneous Sparse Approximation

Leila Belmerhnia (CRAN, Université de Lorraine, CNRS, France); El-Hadi Djermoune (CRAN, Nancy-Universite, CNRS, France); David Brie (CRAN, Nancy Université, CNRS, France)

Abstract: This paper extends greedy methods to simultaneous sparse approximation. This problem consists in finding good estimates of several input signals at once, using different linear combinations of a few elementary signals drawn from a fixed collection. Simultaneous versions are proposed for the CoSaMP, OLS and SBR sparse algorithms. These approaches are compared to Tropp's S-OMP algorithm using simulated signals. We show that in the case of signals exhibiting correlated components, the simultaneous versions of SBR and CoSaMP perform better than S-OMP and S-OLS.



Session FR-L04: Capacity Enhancing Techniques (Special Session)


FR-L04-1: Faster-than-Nyquist Signaling for Next Generation Communication Architectures

Andrea Modenini (University of Parma, Italy); Fredrik Rusek (Lund University, Sweden); Giulio Colavolpe (University of Parma, Italy)

Abstract: We discuss a few promising applications of the faster-than-Nyquist (FTN) signaling technique. Although proposed in the mid 70s, thanks to recent extensions this technique is taking on a new lease of life. In particular, we will discuss its applications to satellite systems for broadcasting transmissions, optical long-haul transmissions, and next-generation cellular systems, possibly equipped with a large scale antenna system (LSAS, also known as massive MIMO) at the base stations (BSs).


FR-L04-2: Fractionally Spaced Non-linear Equalization of Faster Than Nyquist Signals

Stefano Tomasin (University of Padova, Italy); Nevio Benvenuto (University of Padova, Italy)

Abstract: Faster than Nyquist transmission provides the opportunity of increasing the data rate at the expense of additional self-interference, or inter-symbol interference. The optimum receiver requires a maximum likelihood sequence estimator, whose complexity grows exponentially with the number of filter taps and with the number of bits per symbol used at the transmitter. In this paper we consider a suboptimal approach based on the non-linear equalization of the received signal. In order to further reduce the receiver complexity we consider an implementation of equalization filters in the frequency domain. The contributions of the paper are a) an efficient design of the equalization filters in the frequency domain, b) the receiver architecture and corresponding design for a fractionally spaced non-linear equalization, and c) the derivation of optimal (in the mean square error sense) filters that outperform existing suboptimal approaches in the literature.
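
The self-interference mentioned above can be illustrated with a short sketch. It assumes an ideal sinc pulse of symbol period T (whose matched-filter autocorrelation is again a sinc) and a hypothetical time-compression factor `tau`, so that symbols are sent every tau*T seconds; neither the pulse shape nor the names used here are taken from the paper.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def ftn_isi_taps(tau, span):
    """Symbol-rate samples of the sinc pulse when symbols are spaced tau*T apart.

    tau = 1 is Nyquist signaling (all off-center taps vanish);
    tau < 1 is faster-than-Nyquist and leaves non-zero ISI taps.
    Returns taps for offsets -span .. +span, center tap at index `span`.
    """
    return [sinc(k * tau) for k in range(-span, span + 1)]
```

With `tau = 1` every off-center tap is zero up to rounding, while `tau = 0.8` already gives adjacent-symbol taps of roughly 0.23, which is the interference an equalizer or sequence estimator has to handle.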


FR-L04-3: IB-DFE SIC Based Receiver Structure for IA-Precoded MC-CDMA Systems

Adão Silva (Instituto de Telecomunicações (IT)/University of Aveiro, Portugal); Rui Dinis (Faculdade de Ciências e Tecnologia, University Nova de Lisboa, Portugal); José Assunção (Instituto de Telecomunicações, University of Aveiro, Portugal); Atílio Gameiro (Instituto de Telecomunicações / Universidade de Aveiro, Portugal); Alicia João (University of Aveiro/Instituto de Telecomunicações, Portugal)

Abstract: Interference alignment (IA) is a promising technique that allows high capacity gains in interfering channels. On the other hand, iterative frequency-domain detection receivers based on the IB-DFE concept (Iterative Block Decision Feedback Equalization) can efficiently exploit the inherent space-frequency diversity of the MIMO MC-CDMA systems. In this paper we design a joint iterative IA precoding at the transmitter with IB-DFE successive interference cancellation (SIC) based receiver structure for MC-CDMA systems. The receiver is designed in two steps: first a linear filter is used to mitigate the inter-user aligned interference, and then an iterative frequency-domain receiver is designed to efficiently separate the spatial streams in the presence of residual inter-user aligned interference at the output of the filter. Our scheme achieves the maximum degrees of freedom provided by the IA precoding, while allowing an almost optimum space-diversity gain, with performance close to the matched filter bound (MFB).


FR-L04-4: Improved Interference Aware Precoding for Cellular Network-MIMO Systems

Juan José García Fernández (Universidad Carlos III de Madrid, Spain); Maximo Morales (Universidad Carlos III de Madrid, Spain); Matilde Sánchez-Fernández (Universidad Carlos III de Madrid, Spain); Ana Garcia Armada (Universidad Carlos III de Madrid, Spain)

Abstract: An interference aware precoding scheme based on limited channel state information at the transmitter (CSIT) is considered for use in the downlink of a cellular system. The transmitter precoder is based on an MMSE-ZF criterion in order to maximize the user rate while the interference to other users is reduced. The proposed scheme also exploits the network topology, so that each BS can categorize the users into two groups, according to the level of interference that the BS introduces to those users. On the receiver end, each user makes use of the whole channel state information at the receiver (CSIR) by employing an MMSE filter. This approach enables a reduction in the complexity of the system, while improving the performance of the whole network.


FR-L04-5: Enhancing Spectral Efficiency in Advanced Multicarrier Techniques: A Challenge

Leonardo Gomes Baltar (Technische Universität München, Germany); Tobias Laas (Technische Universität München, Germany); Michael Newinger (Technische Universität München, Germany); Amine Mezghani (TU Munich, Germany); Josef A. Nossek (TU Munich, Germany)

Abstract: Advanced multicarrier systems, such as OQAM filter bank based multicarrier (OQAM-FBMC), are gaining importance as candidates for the physical layer of the 5th generation of wireless communications. One of the main advantages of FBMC when compared to traditional cyclic prefix based OFDM is the higher spectral efficiency. However, this gain can be lost again if the problem of training based channel estimation is not tackled correctly. This is due to the memory inserted by the longer pulse shaping and the loss of orthogonality of overlapping subcarriers. In this paper we approach the problem of training based channel estimation for FBMC systems. We propose an iterative algorithm based on expectation maximization (EM) maximum likelihood (ML) estimation that reduces the overhead and consequently improves the spectral efficiency.



Session FR-L05: Advanced Signal Processing for Optical Communications (Special Session)


FR-L05-1: Digital Signal Processing Techniques for Multi-core Fiber Transmission Using Self-homodyne Detection

Jose Manuel Delgado Mendinueta (National Institute of Information and Communications Technology, Japan); Ruben S Luís (National Institute of Information and Communications Technology, Japan); Benjamin Puttnam (National Institute of Information and Communications Technology, Japan); Jun Sakaguchi (National Institute of Information and Communications Technology, Japan); Werner Klaus (National Institute of Information and Communications Technology, Japan); Yoshinari Awaji (National Institute of Information and Communications Technology (NICT), Japan); Naoya Wada (NICT, Japan); Atsushi Kanno (National Institute of Information and Communications Technology, Japan); Tetsuya Kawanishi (National Institute of Information and Communications Technology, Japan)

Abstract: We discuss digital signal processing (DSP) techniques for self-homodyne (SH), multi-core fiber (MCF) transmission links. We focus on exploiting the reduced phase noise of SH-MCF systems to enable DSP resource savings and describe a digital receiver architecture that mixes signal and local oscillator in the digital domain.


FR-L05-2: Realtime Digital Signal Processing in Coherent Optical PDM-QPSK and PDM-16-QAM Transmission

Reinhold Noé (University of Paderborn, Germany); Muhammad Fawad Panhwar (University of Paderborn, Germany); Christian Wördehoff (University Paderborn, Germany); David Sandel (University Paderborn, Germany)

Abstract: Coherent fiberoptic transmission with synchronous detection of 4, 8 or more bit/symbol enhances spectral efficiency. It relies on polarization-division multiplexing (PDM) and quadrature phase shift keying (QPSK) or higher-order quadrature amplitude modulation (QAM). A coherent polarization diversity, in-phase and quadrature receiver detects the optical field information. Its most important tasks are to recover the carrier in a laser phase noise tolerant manner and to control optical polarization electronically. We present suitable digital signal processing designs and their usage in realtime coherent transmission. We likewise discuss chromatic dispersion (CD) and polarization mode dispersion (PMD) equalization, needed for longer fibers.


FR-L05-3: Applications of Expectation Maximization Algorithm for Coherent Optical Communication

Darko Zibar (DTU Fotonik, Department of Photonic Engineering, Technical University of Denmark, Denmark); Ole Winther (Technical University of Denmark, Denmark); Robert H Borkowski (DTU Fotonik, Denmark); Idelfonso Tafur Monroy (Technical University of Denmark, Denmark); Luis de Carvalho (CPqD, Brazil); Julio Cesar Oliveira (CPqD, Brazil)

Abstract: In this invited paper, we present powerful statistical signal processing methods used by the machine learning community and link them to current problems in optical communication. In particular, we look into iterative maximum likelihood parameter estimation based on the expectation maximization algorithm and its application in coherent optical communication systems for linear and nonlinear impairment mitigation. Furthermore, the estimated parameters are used to build a probabilistic model of the system for synthetic impairment generation.


FR-L05-4: Audio and Video Service Provision in Deep-Access Integrated Optical-Wireless Networks

Roberto Llorente (Universidad Politecnica de Valencia, Spain); Maria Morant (Universidad Politecnica de Valencia, Spain)

Abstract: Audio and video streaming can be provided in deep-access optical-wireless networks in a cost-effective way integrating the optical access, the optical in-building network and the wireless link at customer premises. Orthogonal frequency division multiplexing (OFDM) modulation is an interesting candidate for the integrated optical-wireless provision of this service and has been selected by most of the wireless communication standards due to the high spectral efficiency and bit rate capabilities combined with its robustness to transmission channel impairments and inter-symbol interference (ISI). In this paper, the successful transmission of commercially available OFDM-based signals following different wireless standards is demonstrated and the performance of the different digital signal processing algorithms implemented in their communication stacks is analyzed. Different optical transmission media and different OFDM transmission frequency bands are evaluated experimentally including the 60 GHz band. The wireless range coverage after the integrated optical-wireless transmission is also reported from the experimental work.


FR-L05-5: OSSB-OFDM Transmission Performance Using a Dual Electroabsorption Modulated Laser in NG-PON Context

Christelle Aupetit-Berthelemot (University of Limoges, France); Thomas Anfray (XLIM, France); Mohamed Chaibi (TELECOM ParisTech, France); Didier Erasme (TELECOM ParisTech, France); Guy Aubin (LPN-CNRS, France); Christophe Kazmierski (Alcatel-Thales III-V Lab, France); Philippe Chanclou (Orange Labs, France)

Abstract: We report system simulation and experimental results on enhanced transmission distance over standard single mode fiber thanks to a novel dual modulation technique that generates a wideband optical single side band orthogonal frequency division multiplexing (OSSB-OFDM) signal using a low-cost, integrated, dual RF access electro-absorption modulated laser. We obtained, both experimentally and by simulation, a bit error rate (BER) lower than 10^-3 for 11 Gb/s up to 200 km in an amplified point-to-point configuration for an optical single side band discrete multi-tone (OSSB-DMT) signal. We also evaluate conventional OFDM at 25 Gb/s in simulation in a point-to-multipoint architecture and show that the transmission reach can be extended to 55 km for a BER of 10^-3 thanks to the new technique we have developed and implemented.



Session FR-L06: Speech Processing II


FR-L06-1: A Computationally Efficient Single-Channel Speech Enhancement Algorithm for Monaural Hearing Aids

David Ayllón (University of Alcalá, Spain); Roberto Gil-Pita (University of Alcalá, Spain); Manuel Utrilla (University of Alcalá, Spain); Manuel Rosa (University of Alcalá, Spain)

Abstract: A computationally efficient single-channel speech enhancement algorithm to improve intelligibility in monaural hearing aids is presented in this paper. The algorithm combines a novel set of features with a simple supervised machine learning technique to estimate a time-frequency soft mask for noise reduction, using extremely low computational resources. Results show a noticeable intelligibility improvement in terms of PESQ score and SNR_ESI, even for low input SNR, using only 7% of the computational resources available in a state-of-the-art commercial hearing aid. The performance of the algorithm is comparable to the performance of current algorithms that use more computationally complex features and learning schemas.


FR-L06-2: Binaural Localization of Speech Sources in the Median Plane Using Cepstral HRTF Extraction

Dumidu S. Talagala (University of Surrey, United Kingdom); Xiang Wu (The Australian National University, Australia); Wen Zhang (Australian National University, Australia); Thushara D. Abhayapala (Australian National University, Australia)

Abstract: In binaural systems, source localization in the median plane is challenging due to the difficulty of exploring the spectral cues of the head-related transfer function (HRTF) independently of the source spectra. This paper presents a method of extracting the HRTF spectral cues using cepstral analysis for speech source localization in the median plane. Binaural signals are preprocessed in the cepstral domain so that the fine spectral structure of speech and the HRTF spectral envelope can be easily separated. We introduce (i) a truncated cepstral transformation to extract the relevant localization cues, and (ii) a mechanism to normalize the effects of the time varying speech spectra. The proposed method is evaluated and compared with a convolution based localization method using a speech corpus of multiple speakers. The results suggest that the proposed method fully exploits the available spectral cues for robust speaker independent binaural source localization in the median plane.


FR-L06-3: Time-frequency Reassigned Cepstral Coefficients for Phone-Level Speech Segmentation

Georgina Tryfou (Fondazione Bruno Kessler - irst, Italy); Marco Pellin (Fondazione Bruno Kessler - irst, Italy); Maurizio Omologo (Fondazione Bruno Kessler - irst, Italy)

Abstract: This paper studies feature extraction within the context of automatic speech segmentation at the phonetic level. Current state-of-the-art solutions widely use cepstral features as a front-end for HMM based frameworks. Although the automatic segmentation results have reached the inter-annotator agreement within a tolerance equal to or higher than 20 ms, the same is not true when a lower tolerance is considered. We propose a new set of cepstral features that derive from the time-frequency reassigned spectrogram and offer a sharper representation of the speech signal in the cepstral domain. The features are evaluated through a series of forced alignment experiments which demonstrate better performance, compared to the traditional MFCC features, in aligning phone boundaries within a small distance from their true position.


FR-L06-4: Cluster-Based Adaptation Using Density Forest for HMM Phone Recognition

Mohamed Abou-Zleikha (Aalborg University, Denmark); Zheng-Hua Tan (Aalborg University, Denmark); Mads Græsbøll Christensen (Aalborg University, Denmark); Søren Holdt Jensen (Aalborg University, Denmark)

Abstract: The dissimilarity between the training and test data in speech recognition systems is known to have a considerable effect on the recognition accuracy. To solve this problem, we use a density forest to cluster the data and use the maximum a posteriori (MAP) method to build cluster-based adapted Gaussian mixture models (GMMs) in HMM speech recognition. Specifically, a set of bagged versions of the training data for each state in the HMM is generated, and each of these versions is used to generate one GMM and one tree in the density forest. Thereafter, an acoustic model forest is built by replacing the data of each leaf (cluster) in each tree with the corresponding GMM adapted by the leaf data using the MAP method. The results show that the proposed approach achieves 3.8% (absolute) lower phone error rate (PER) compared with the standard HMM/GMM and 0.8% (absolute) lower PER compared with bagged HMM/GMM.


FR-L06-5: Privacy-Preserving Speaker Verification Using Garbled GMMs

José Portêlo (INESC-ID Lisboa, Portugal); Bhiksha Raj (Carnegie Mellon University, USA); Alberto Abad (INESC-ID/IST, Portugal); Isabel Trancoso (I.S.T. - Technical U. Lisbon / I.N.E.S.C. - I.D., Portugal)

Abstract: In this paper we present a privacy-preserving speaker verification system using a UBM-GMM technique. Remote speaker verification services rely on the system having access to the user's recordings, or features derived from them, and a model representing the user's voice. Preserving privacy in our context means that neither the system observes voice samples or speech models from the user nor the user observes the universal model owned by the system. Our approach uses Garbled Circuits for obtaining an implementation that simultaneously is secure, has high accuracy and is efficient. To the best of our knowledge this is the first privacy-preserving speaker verification system that accomplishes all these three goals.



Session FR-L07: Signal Processing Applications III


FR-L07-1: fMRI Unmixing Via Properly Adjusted Dictionary Learning

Yannis Kopsinis (University of Athens, Greece); Harris V Georgiou (University of Athens, Greece); Sergios Theodoridis (University of Athens, Greece)

Abstract: The mapping of the functional networks within the brain is a major step towards a deeper understanding of brain function. It involves the blind source separation of obtained fMRI data, usually performed via independent component analysis (ICA). Recently, there has been increased interest in alternatives to ICA for data-driven fMRI unmixing, and notably good results have been attained with Dictionary Learning (DL)-based analysis. In this paper, the K-SVD DL method is appropriately adjusted in order to cope with the special properties characterizing the fMRI data.


FR-L07-2: Exploiting Correlation in Neural Signals for Data Compression

Sebastian Schmale (University of Bremen, Germany); Janpeter Hoeffmann (University of Bremen, Germany); Benjamin Knoop (University of Bremen, Germany); Gernot Kreiselmeyer (University Hospital of Erlangen, Germany); Hajo Hamer (University Hospital of Erlangen, Germany); Dagmar Peters-Drolshagen (University of Bremen, Germany); Steffen Paul (University Bremen, Germany)

Abstract: Progress in invasive brain research relies on signal acquisition at high temporal and spatial resolutions, resulting in a data deluge at the (wireless) interface to the external world. Hence, data compression at the implant site is necessary in order to comply with the neurophysiological restrictions, especially when it comes to recording and transmission of neural raw data. This work investigates spatial correlations of neural signals, leading to a significant increase in data compression with a suitable sparse signal representation before the wireless data transmission at the implant site. We use the correlation-aware two-dimensional DCT known from image processing to exploit spatial correlation of the data set. In order to guarantee a certain sparsity in the signal representation, two paradigms of zero forcing are evaluated and applied: significant-coefficients zero forcing and block-sparsity zero forcing.


FR-L07-3: Cooperative Use of Parallel Processing with Time or Frequency-domain Filtering for Shape Recognition

Carlos Graca (Instituto de Telecomunicações, University of Coimbra, Portugal); Gabriel Falcao (Instituto de Telecomunicações, University of Coimbra, Portugal); Sunil Kumar (CMUC, Portugal); Isabel N Figueiredo (University of Coimbra, Portugal)

Abstract: For many computer vision applications, detection of blobs and/or tubular structures in images is of great importance. In this paper, we have developed a parallel signal processing framework for speeding up the detection of blob and tubular objects in images. We identified the filtering procedure as being responsible for up to 98% of the global processing time in the blob or tubular detector functions used. We show that beyond a certain filter dimension it is beneficial to combine frequency-domain techniques with parallel processing to develop faster signal processing algorithms. The proposed framework is applied to medical wireless capsule endoscopy (WCE) images, where blob and/or tubular detectors are useful in distinguishing between abnormal and normal images.


FR-L07-4: Automated Detection of Sleep Apnea and Hypopnea Events Based on Robust Airflow Envelope Tracking

Marcin Ciołek (Gdansk University of Technology, Poland); Maciej Niedźwiecki (Gdansk University of Technology, Poland); Stefan Sieklicki (Gdansk University of Technology, Poland); Jacek Drozdowski (Medical University of Gdańsk, Poland); Janusz Siebert (Medical University of Gdansk, Poland)

Abstract: The paper presents a new approach to detection of apnea and hypopnea events, in the presence of artifacts and breathing irregularities, from a single channel airflow record. The proposed algorithm identifies segments of signal affected by a high amplitude modulation corresponding to apnea/hypopnea events. It is shown that a robust airflow envelope, free of breathing artifacts, improves effectiveness of the diagnostic process and allows one to localize the beginning and the end of each episode more accurately. The performance of the proposed approach, evaluated on 15 overnight polysomnographic recordings, was assessed using diagnostic measures such as accuracy, sensitivity and specificity, achieving 95%, 91% and 96%, respectively.


FR-L07-5: Bayesian Spatiotemporal Segmentation of Combined PET-CT Data Using a Bivariate Poisson Mixture Model

Zacharie Irace (University of Toulouse - IRIT/ENSEEIHT, France); Hadj Batatia (University of Toulouse, France)

Abstract: This paper presents an unsupervised algorithm for the joint segmentation of 4-D PET-CT images. The proposed method is based on a bivariate-Poisson mixture model to represent the bimodal data. A Bayesian framework is developed to label the voxels as well as jointly estimate the parameters of the mixture model. A generalized four-dimensional Potts-Markov Random Field (MRF) has been incorporated into the method to represent the spatio-temporal coherence of the mixture components. The method is successfully applied to 4-D registered PET-CT data of a patient with lung cancer. Results show that the proposed model fits the data accurately and allows the segmentation of different tissues and the identification of tumors in temporal series.



Session FR-L08: Signal Modelling and Estimation


FR-L08-1: Time-Frequency Kernel Design for Sparse Joint-Variable Signal Representations

Branka Jokanovic (Villanova University, USA); Moeness G. Amin (Villanova University, USA); Yimin D. Zhang (Villanova University, USA); Fauzia Ahmad (Villanova University, USA)

Abstract: Highly localized quadratic time-frequency distributions cast nonstationary signals as sparse in the joint-variable representations. The linear model relating the ambiguity domain and time-frequency domain permits the application of sparse signal reconstruction techniques to yield high-resolution time-frequency representations. In this paper, we design signal-dependent kernels that enable the resulting time-frequency distribution to meet the two objectives of reduced cross-term interference and increased sparsity. It is shown that, for random undersampling schemes, the new adaptive kernel is superior to traditional signal-independent reduced interference distribution kernels and outperforms those which only consider each objective separately.


FR-L08-2: Autoregressive Models with Epsilon-Skew-Normal Innovations

Pascal Bondon (LSS CNRS, France)

Abstract: We consider the problem of modelling asymmetric near-Gaussian correlated signals by autoregressive models with epsilon-skew normal innovations. Moments and maximum likelihood estimators of the parameters are proposed and their limit distributions are derived. Monte Carlo simulation results are analyzed and the model is fitted to a real time series.


FR-L08-3: Parametric Estimation of Multi-Line Parameters Based on SLIDE Algorithm

Slobodan Djukanović (University of Montenegro, Montenegro); Marko Simeunović (University of Montenegro, Montenegro); Igor Djurović (University of Montenegro, Montenegro)

Abstract: The subspace-based line detection (SLIDE) algorithm enables the estimation of parameters of multiple lines within a digital image by mapping these lines to frequency modulated (FM) signals. In this paper, we consider the estimation of the FM signals thus obtained by using estimators developed for polynomial-phase signals (PPSs). For this purpose, the recently proposed PCPF-HAF has been used. Simulations show that the PCPF-HAF-based estimator is more accurate than estimators based on time-frequency representations.


FR-L08-4: Multi-Lag Phase Space Representations for Transient Signals Characterization

Cindy Bernard (GIPSA-lab, France); Teodor Petrut (Gipsa-lab, France); Gabriel Vasile (French National Council for Scientific Research (CNRS), France); Cornel Ioana (Institute National Polytechnique de Grenoble, France)

Abstract: Transient signals are very difficult to characterize due to their short duration and their wide frequency content. Various methods such as the spectrogram and wavelet decomposition have already been extensively used in the literature to detect them, but show limits when it comes to discriminating between nearly similar transients. In this paper, we propose the multi-lag phase space analysis as a way to characterize them. This data-driven method enables the comparison between features extracted from two different signals. In an example, we compare the multi-lag phase space representations of three similar transients and show that common features can be found to discriminate them. Finally, the results are compared with a wavelet decomposition.


FR-L08-5: Estimation of Large Toeplitz Covariance Matrices and Application to Source Detection

Julia Vinogradova (Telecom Paristech, France); Romain Couillet (Supélec, France); Walid Hachem (Telecom-paristech, France)

Abstract: In this paper, performance results of two types of Toeplitz covariance matrix estimators are provided. Concentration inequalities for the spectral norm for both estimators have been derived showing exponential convergence of the error to zero. It is shown that the same rates of convergence are obtained in the case where the aggregated matrix of time samples is corrupted by a rank one matrix. As an application based on this model, source detection by a large dimensional sensor array with temporally correlated noise is studied.



Session FR-L09: The Many Faces of Signal Processing in Multimedia QoE Assessment (Special Session)


FR-L09-1: A No-Reference Audio-Visual Video Quality Metric

Helard Martinez (University of Brasilia, Brazil); Mylene Q Farias (University of Brasilia (UnB), Brazil)

Abstract: Three psychophysical experiments were carried out to understand how audio and video components interact and affect the overall audio-visual quality. In the experiments, subjects independently evaluated the perceived quality of (1) video (without audio), (2) audio (without video), and (3) video with audio. Results show that content has an important influence on the perceived quality, for both video and audio. Also, video compression has a higher impact on the perceived quality than audio compression. With the help of the perceptual models obtained using subjective data, we propose three no-reference audio-visual quality metrics composed of combination functions of a video quality metric and an audio quality metric. The no-reference video quality metric consists of a blockiness metric and a blurriness metric, while the NR audio metric is a modification of the SESQA metric. The combination functions used for the perceptual models and the no-reference audio-visual metrics were linear, Minkowski, and power functions.


FR-L09-2: Robustness and Prediction Accuracy of Machine Learning for Objective Visual Quality Assessment

Andrew Hines (Trinity College Dublin, Ireland); Paul Kendrick (University of Salford, United Kingdom); Adriaan Barri (Vrije Universiteit Brussel, Belgium); Manish Narwaria (Université de Nantes, France); Judith Redi (Delft University of Technology, The Netherlands)

Abstract: Machine Learning (ML) is a powerful tool to support the development of objective visual quality assessment metrics, serving as a substitute model for the perceptual mechanisms acting in visual quality appreciation. Nevertheless, the reliability of ML-based techniques within objective quality assessment metrics is often questioned. In this study, the robustness of ML in supporting objective quality assessment is investigated, specifically when the feature set adopted for prediction is suboptimal. A Principal Component Regression based algorithm and a Feed Forward Neural Network are compared when pooling the Structural Similarity Index (SSIM) features perturbed with noise. The neural network adapts better to noise and intrinsically favours features according to their salient content.


FR-L09-3: EEG Correlates During Video Quality Perception

Eleni Kroupi (EPFL, Switzerland); Philippe Hanhart (Ecole Polytechnique Fédérale de Lausanne, Switzerland); Jong-Seok Lee (Yonsei University, Korea); Martin Rerabek (EPFL, Switzerland); Touradj Ebrahimi (Ecole Polytechnique Fédérale de Lausanne, Switzerland)

Abstract: Understanding Quality of Experience (QoE) in various multimedia contents is still challenging. In this paper, we investigate the way QoE affects brain oscillations captured by electroencephalography (EEG). In particular, sixteen subjects watched 2D and 3D videos of various quality levels while their EEG signals were recorded, and were asked to provide their self-assessed perceived quality ratings for each video. EEG signals were decomposed into six frequency bands, namely theta, alpha, beta low, beta middle, beta high and gamma bands. The results revealed frontal asymmetry patterns in the alpha band, which correspond to right frontal activation when perceived quality is low.


FR-L09-4: Single Exposure Vs Tone Mapped High Dynamic Range Images: A Study Based on Quality of Experience

Manish Narwaria (Université de Nantes, France); Matthieu Perreira Da Silva (University of Nantes, France); Patrick Le Callet (Université de Nantes, France); Romuald Pépion (IRCCyN, Université de Nantes, France)

Abstract: Tone mapping operators (TMOs), employed to fit the dynamic range of High Dynamic Range (HDR) visual signals to that of the display, are generally non-transparent and modify the visual appearance of the scene. Despite this, tone mapped content generally tends to have more visual details as compared to a single exposure scene. It is however not clear whether, and to what extent, the extra details in tone mapped HDR affect user preferences over single exposure content in terms of scene appearance. This paper aims to shed light on this issue via a comprehensive subjective study. Our results reveal that there is no statistical evidence to establish that the users preferred tone mapped content over the single exposure version as a closer representation of the corresponding HDR scene. We present those results as well as outline the possible factors contributing to this somewhat unexpected finding.


FR-L09-5: Subjective Evaluation of 3D Video Enhancement Algorithm

Federica Battisti (University of Roma TRE, Italy); Marco Carli (University of Roma TRE, Italy); Alessandro Neri (University of ROMA TRE, Italy)

Abstract: In this contribution the subjective evaluation of a 3D enhancement algorithm is presented. In the proposed scheme, perceptually significant features are enhanced or attenuated according to their saliency and to the masking effects induced by textured background. In particular, for each frame we consider the high frequency components, i.e., the edges, as relevant features in the edge complex wavelet domain computed by the first order dyadic Gauss-Laguerre Circular Harmonic Wavelet decomposition. The saliency is assessed by evaluating both the disparity map and the motion vectors extracted from the 3D videos. The effectiveness of the proposed approach has been verified by means of subjective tests.



Session FR-L10: Random Matrix Theory Methods for Multi-Antenna Signal Processing (Special Session)


FR-L10-1: Low-Complexity Linear Precoding for Multi-Cell Massive MIMO Systems

Abla Kammoun (Supelec, France); Axel Müller (Supélec, France); Emil Björnson (Linköping University, Sweden); Mérouane Debbah (Supelec, France)

Abstract: Massive MIMO has been recognized as an efficient solution to improve the spectral efficiency of future communication systems. However, increasing the number of antennas goes with increasing computational complexity. In particular, the precoding design becomes involved since near-optimal precoding, such as regularized zero-forcing (RZF), requires the inversion of a large matrix. In our previous work we proposed to solve this issue in the single-cell case by approximating the matrix inverse by a truncated polynomial expansion (TPE), where the polynomial coefficients are selected for optimal system performance. In this paper, we generalize this technique to multicell scenarios. While the optimization of the RZF precoding has, thus far, not been feasible in multicell systems, we show that the proposed TPE precoding can be optimized to maximize the weighted max-min fairness. Using simulations, we compare the proposed TPE precoding with RZF and show that our scheme achieves higher throughput using a TPE order of only 3.


FR-L10-2: Robust G-MUSIC

Romain Couillet (Supélec, France); Abla Kammoun (Supelec, France)

Abstract: An improved MUSIC algorithm for direction of arrival estimation is introduced that accounts both for array sizes N that are large compared with the number of independent observations n and for the impulsiveness of the background environment (e.g., in the presence of outliers in the observations). This method derives from the spiked G-MUSIC algorithm proposed in (Vallet, Hachem, Loubaton, Mestre, and Najim, 2011) and from the recent works by one of the authors on the random matrix analysis of robust scatter matrix estimators (Couillet, Pascal, and Silverstein, 2013). The method is shown to be asymptotically consistent where classical approaches are not. This superiority is corroborated by simulations.


FR-L10-3: Asymptotic Analysis of a GLRT for Detection with Large Sensor Arrays

Sonja Hiltunen (Thales Communications & Security, France); Philippe Loubaton (Université de Marne La Vallée, France); Pascal Chevalier (Thales Communication, France)

Abstract: This paper addresses the performance analysis of two GLRT receivers in the case where the number of sensors M is of the same order of magnitude as the sample size N. In the asymptotic regime where M and N converge towards infinity at the same rate, the corresponding asymptotic means and variances are characterized using large random matrix theory, and compared to the standard situation where N goes to infinity and M is fixed. This asymptotic analysis makes it possible to understand the behavior of the considered receivers, even for relatively small values of N and M.


FR-L10-4: Correlation Test for High Dimensional Data with Application to Signal Detection in Sensor Networks

Xavier Mestre (CTTC, Spain); Pascal Vallet (Institut Polytechnique de Bordeaux, France); Walid Hachem (Telecom-paristech, France)

Abstract: The problem of correlation detection of multivariate Gaussian observations is considered. The problem is formulated as a binary hypothesis test, where the null hypothesis corresponds to a diagonal correlation matrix with possibly different diagonal entries, whereas the alternative is associated with any other form of positive covariance. Using tools from random matrix theory, we study the asymptotic behavior of the Generalized Likelihood Ratio Test (GLRT) under both hypotheses, assuming that both the sample size and the observation dimension tend to infinity at the same rate. It is shown that the GLRT statistic always converges to a Gaussian distribution, although the asymptotic mean and variance strongly depend on the actual hypothesis. Numerical simulations demonstrate the superiority of the proposed asymptotic description in situations where the sample size is not much larger than the observation dimension.


FR-L10-5: Fluctuations for Linear Spectral Statistics of Large Random Covariance Matrices

Jamal Najim (CNRS & Université de Paris Est - Marne La Vallée, France); Jianfeng Yao (Télécom Paristech, France)

Abstract: The theory of large random matrices has proved to be an efficient tool to address many problems in wireless communication and statistical signal processing over the last two decades. We provide hereafter a central limit theorem (CLT) for linear spectral statistics of large random covariance matrices, improving Bai and Silverstein's celebrated 2004 result. This fluctuation result should be of interest for studying the fluctuations of important estimators in statistical signal processing.



Session FR-L11: Speech Processing III


FR-L11-1: Voice Source Modelling Using Deep Neural Networks for Statistical Parametric Speech Synthesis

Tuomo Raitio (Aalto University, Finland); Heng Lu (University of Edinburgh, United Kingdom); John Kane (Trinity College Dublin, Ireland); Antti Suni (University of Helsinki, Finland); Martti Vainio (University of Helsinki, Finland); Simon King (University of Edinburgh, United Kingdom); Paavo Alku (Aalto University, Finland)

Abstract: This paper presents a voice source modelling method employing a deep neural network (DNN) to map from acoustic features to the time-domain glottal flow waveform. First, acoustic features and the glottal flow signal are estimated from each frame of the speech database. Pitch-synchronous glottal flow time-domain waveforms are extracted, interpolated to a constant duration, and stored in a codebook. Then, a DNN is trained to map from acoustic features to these duration-normalised glottal waveforms. At synthesis time, acoustic features are generated from a statistical parametric model, and from these, the trained DNN predicts the glottal flow waveform. Illustrations are provided to demonstrate that the proposed method successfully synthesizes the glottal flow waveform and enables easy modification of the waveform by adjusting the input values to the DNN. In a subjective listening test, the proposed method was rated as equal to a method employing a stored glottal flow waveform.


FR-L11-2: An Improved Chirp Group Delay Based Algorithm for Estimating the Vocal Tract Response

Jayesh MK (IIT Madras, India); C S Ramalingam (IITM, India)

Abstract: We propose a method for vocal tract estimation that is better than Bozkurt's chirp group delay method and its zero-phase variant. The chirp group delay method works only for voiced speech, is critically dependent on finding the glottal closure instants (GCI), deteriorates in performance when more than two pitch cycles are included for analysis, and does not work for unvoiced speech. The zero-phase variant eliminates these drawbacks but works poorly for nasal sounds. In our proposed method all outside-unit-circle zeros are reflected inside before computing the chirp group delay. The advantages are: (a) GCI knowledge not required, (b) the vocal tract estimate is far less sensitive to the location and duration of the analysis window, (c) works for unvoiced sounds, and (d) captures the spectral valleys well for nasals, which in turn leads to better recognition accuracy.


FR-L11-3: Audiovisual to Area and Length Functions Inversion of Human Vocal Tract

Benjamin Elie (Loria, France); Yves Laprie (Loria, France)

Abstract: This paper proposes a multimodal approach to estimate the area function and the length of the vocal tract of oral vowels. The method is based on an iterative technique that deforms an initial area function so that the output acoustic vector matches a specified target. The chosen acoustic vector is the formant frequency pattern. In order to regularize the ill-posed problem, several constraints are added to the algorithm. First, the lip termination area is estimated via facial capture software. Then, the area function is constrained in such a way that it does not get too far from a neutral position, and does not change too quickly from one temporal frame to the next when dealing with dynamic inversion. The method proves to be efficient at approximating the area function and the length of the vocal tract for oral French vowels, both in static and dynamic configurations.


FR-L11-4: A Speech Presence Probability Estimator Based on Fixed Priors and a Heavy-Tailed Speech Model

Balazs Fodor (Technische Universität Braunschweig, Germany); Timo Gerkmann (University of Oldenburg, Germany)

Abstract: Speech enhancement approaches are often enhanced by speech presence probability (SPP) estimation. However, SPP estimators suffer from random fluctuations of the a posteriori signal-to-noise ratio (SNR). While there exist proposals that overcome the random fluctuations by basing the SPP framework on smoothed observations, these approaches do not take into account the super-Gaussian nature of speech signals. Thus, in this paper we define a framework that allows for modeling the likelihoods of speech presence for smoothed observations, while at the same time assuming super-Gaussian speech coefficients. The proposed approach is shown to outperform the reference approaches in terms of the amount of noise leakage and the amount of musical noise.


FR-L11-5: Novel Topic N-gram Count LM Incorporating Document-based Topic Distributions and N-gram Counts

Md. Akmal Haidar (INRS-EMT, Canada); Douglas O'Shaughnessy (INRS-Énergie-Matériaux-Télécommunications, Canada)

Abstract: In this paper, we introduce a novel topic n-gram count language model (NTNCLM) using topic probabilities of training documents and document-based n-gram counts. The topic probabilities for the documents are computed by averaging the topic probabilities of words seen in the documents. The topic probabilities of documents are multiplied by the document-based n-gram counts. The products are then summed over all the training documents. The results are used as the counts of the respective topics to create the NTNCLMs. The NTNCLMs are adapted by using the topic probabilities of a development test set that are computed as above. We compare our approach with a recently proposed TNCLM [1], where the long-range information outside of the n-gram events is not accounted for. Our approach yields significant perplexity and word error rate (WER) reductions over the other approach using the Wall Street Journal (WSJ) corpus.
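
The counting procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-word topic probabilities `word_topic_probs` are assumed to be given (the abstract does not say how they are obtained), the averaging is done over word tokens, and all function and parameter names are hypothetical.

```python
from collections import Counter, defaultdict


def document_topic_probs(doc_words, word_topic_probs, num_topics):
    """Average the topic probabilities of the words seen in one document."""
    seen = [w for w in doc_words if w in word_topic_probs]
    if not seen:
        # Assumption: fall back to a uniform distribution for unknown words.
        return [1.0 / num_topics] * num_topics
    return [sum(word_topic_probs[w][k] for w in seen) / len(seen)
            for k in range(num_topics)]


def ngrams(words, n):
    """List all n-grams (as tuples) of a word sequence."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def topic_ngram_counts(docs, word_topic_probs, num_topics, n=2):
    """Weight each document's n-gram counts by its topic probabilities
    and sum the products over all documents, one count table per topic."""
    counts = [defaultdict(float) for _ in range(num_topics)]
    for doc in docs:
        probs = document_topic_probs(doc, word_topic_probs, num_topics)
        for gram, c in Counter(ngrams(doc, n)).items():
            for k in range(num_topics):
                counts[k][gram] += probs[k] * c
    return counts
```

Because each document's topic probabilities sum to one, the fractional counts of an n-gram summed over all topics equal its raw count in the corpus.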



Session FR-L12: Signal Processing Applications IV


FR-L12-1: Human Motion Detection in Daily Activity Tasks Using Wearable Sensors

Olga Politi (University of Patras, Greece); Iosif Mporas (University of Patras, Greece); Vassilis Megalooikonomou (, Greece)

Abstract: In this article we present a human motion detection framework, based on data derived from a single tri-axial accelerometer. The framework uses a set of different pre-processing methods that produce data representations which are respectively parameterized by statistical and physical features. These features are then concatenated and classified using well-known classification algorithms for the problem of motion recognition. Experimental evaluation was carried out according to a subject-dependent scenario, meaning that the classification is performed for each subject separately using their own data and the average accuracy for all individuals is computed. The best achieved detection performance for 14 everyday human motion activities, using the USC-HAD database, was approximately 95%. The results compare favorably with the best reported performance of 93.1% for the same database.


FR-L12-2: A New Approach to Wavelet Entropy: Application to Postural Signals

Céline Franco (University of Grenoble, France); Pierre-Yves Guméry (PRETA team, TIMC-IMAG, Joseph Fourier University, La Tronche, France); Anthony Fleury (Ecole des Mines de Douai, France); Nicolas Vuillerme (University of Grenoble, France)

Abstract: This study proposes a new approach for quantifying complexity of physiological signals characterized by a spectral distribution in modes. Our approach is inspired by wavelet entropy but based on a modal representation: Synchrosqueezing transform. It is calculated for each time sample within the cone of influence of the decomposition. This index is first validated and discussed on simulated multicomponent signals. Finally, it is applied to assess postural control and ability at using all the sensory resources available. Results show significant differences in our index following an induced change in sensory conditions whereas a conventional approach fails. This index may constitute a promising tool for detection of postural troubles.


FR-L12-3: Improved Modeling and Bounds for NQR Spectroscopy Signals

Georgia Kyriakidou (King's College London, United Kingdom); Andreas Jakobsson (Lund University, Sweden); Erik Gudmundson (Lund University, Sweden); Alan Gregorovič (Jozef Stefan Institute, Slovenia); Jamie Barras (King's College London, United Kingdom); Kaspar Althoefer (King's College London, United Kingdom)

Abstract: Nuclear Quadrupole Resonance (NQR) is a method of detection and unique characterization of compounds containing quadrupolar nuclei, commonly found in many forms of explosives, narcotics, and medicines. Typically, multi-pulse sequences are used to acquire the NQR signal, allowing the resulting signal to be well modeled as a sum of exponentially damped sinusoidal echoes. In this paper, we improve upon the earlier used NQR signal model, introducing an observed amplitude modulation of the spectral lines as a function of the sample temperature. This dependency noticeably affects the achievable identification performance in the typical case when the substance temperature is not perfectly known. We further extend the recently presented Cramér-Rao lower bound to the more detailed model, allowing one to determine suitable experimental conditions to optimize the detection and identifiability of the resulting signal. The theoretical results are carefully motivated using extensive NQR measurements.


FR-L12-4: Recursive Total Least-Squares Estimation of Frequency in Three-Phase Power Systems

Reza Arablouei (University of South Australia, Australia); Kutluyıl Doğançay (University of South Australia, Australia); Stefan Werner (Aalto University, Finland)

Abstract: We propose an adaptive algorithm for estimating the frequency of a three-phase power system from its noisy voltage readings. We consider a second-order autoregressive linear predictive model for the noiseless complex-valued αβ signal of the system to relate the system frequency to the phase voltages. We use this model and the noisy voltage data to calculate a total least-squares (TLS) estimate of the system frequency by employing the inverse power method in a recursive manner. Simulation results show that the proposed algorithm, called recursive TLS (RTLS), outperforms the recursive least-squares (RLS) and the bias-compensated RLS (BCRLS) algorithms in estimating the frequency of both balanced and unbalanced three-phase power systems. Unlike BCRLS, RTLS does not require prior knowledge of the noise variance.
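
To make the second-order linear predictive relation concrete: a noiseless signal of the form v[n] = A e^{jωn} + B e^{-jωn} satisfies v[n-1] + v[n+1] = 2 cos(ω) v[n], so the frequency follows from the single coefficient h = 2 cos(ω). The sketch below fits h by ordinary batch least squares, which is an assumption made for brevity; it is not the recursive TLS algorithm of the paper, and in noisy data a plain least-squares fit of this kind is generally biased because the regressor v[n] is itself noisy. The function name and signal model parameters are hypothetical.

```python
import cmath
import math


def estimate_frequency(v, sample_rate):
    """Estimate the frequency (Hz) of a complex-valued signal from the
    relation v[n-1] + v[n+1] = h * v[n], with h = 2*cos(omega).

    The coefficient h is fitted by ordinary least squares over the
    whole record (a simplification, not recursive TLS).
    """
    num = 0j
    den = 0.0
    for n in range(1, len(v) - 1):
        num += v[n].conjugate() * (v[n - 1] + v[n + 1])
        den += abs(v[n]) ** 2
    h = num / den
    cos_omega = max(-1.0, min(1.0, h.real / 2.0))  # guard against rounding
    return math.acos(cos_omega) * sample_rate / (2.0 * math.pi)


def synth_signal(freq, sample_rate, length, a=1.0, b=0.3):
    """Noiseless test signal; b != 0 mimics an unbalanced system."""
    omega = 2.0 * math.pi * freq / sample_rate
    return [a * cmath.exp(1j * omega * n) + b * cmath.exp(-1j * omega * n)
            for n in range(length)]
```

For example, `estimate_frequency(synth_signal(50.0, 1000.0, 200), 1000.0)` recovers the 50 Hz tone in the noiseless case, for both the balanced (`b=0`) and unbalanced (`b≠0`) test signals.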


FR-L12-5: Advances in Bacteria Motility Modelling Via Diffusion Adaptation

Sadaf Monajemi (National University of Singapore, Singapore); Saeid Sanei (University of Surrey, United Kingdom); Sim Heng Ong (National University of Singapore, Singapore)

Abstract: In this paper we model the biological networks of bacteria and antibacterial agents and investigate the effects of cooperation in the corresponding self-organized networks. The cooperative foraging of the bacteria has been used to solve non-gradient optimization problems. In order to obtain a more realistic model of the process, we extend the previously introduced strategies for bacteria motility to incorporate the effects of antibacterial agents and bacterial replication as two key aspects of this network. The proposed model provides a more accurate understanding of bacterial networks. Moreover, it has applications for various regenerative networks where the agents cooperate to solve an optimization problem. The model is examined and the effects of bacterial growth, diffusion of information and interaction of antibacterial agents with bacteria are evaluated.


FR-L12-6: An Alive Electroencephalogram Analysis System to Assist the Diagnosis of Epilepsy

Anas Ahmad Malik (Lahore University of Management Sciences, Pakistan); Waqas Majeed (Lahore University of Management Sciences, Pakistan); Nadeem Khan (Lahore University of Management Sciences, Pakistan)

Abstract: Computer assisted electroencephalograph analysis tools are trained to classify the data based upon the "ground truth" provided by the clinicians. After development and delivery of these systems there is no simple mechanism for these clinicians to improve the system's classification when they encounter a false classification by the system. The improvement process of the system's classification after initial training (during development) can therefore be termed 'dead'. We consider the neurologist to be the best available benchmark for the system's learning. In this article, we propose an 'alive' system, capable of improving its performance by taking the clinician's feedback into consideration. The system is based on the DWT, which has been shown to be very effective for EEG signal analysis. PCA is applied to the statistical features extracted from the DWT coefficients before classification by an SVM classifier. After corrective marking of a few epochs, the initial average accuracy of 94.8% rose to 95.12%.



Session FR-L13: Graphs, Networks and Distributed Estimation


FR-L13-1: Network Observability and Localization of Sources of Diffusion in Tree Networks with Missing Edges

Sabina Zejnilovic (Carnegie Mellon University, USA); Joao Gomes (ISR - Instituto Superior Tecnico, Portugal); Bruno Sinopoli (Carnegie Mellon University, USA)

Abstract: In order to quickly curb infections or prevent spreading of rumors, first the source of diffusion needs to be localized. We analyze the problem of source localization, based on infection times of a subset of nodes in incompletely observed tree networks, under a simple propagation model. Our scenario reflects the assumption that having access to all the nodes and full network topology is often not feasible. We evaluate the number of possible topologies that are consistent with the observed incomplete tree. We provide a sufficient condition for the selection of observed nodes, such that correct localization is possible, i.e. the network is observable. Finally, we formulate the source localization problem under these assumptions as a binary linear integer program. We then provide a small simulation example to illustrate the effect of the number of observed nodes on the problem complexity and on the number of possible solutions for the source.


FR-L13-2: Graph Empirical Mode Decomposition

Nicolas Tremblay (ENS Lyon, France); Pierre Borgnat (ENS Lyon, CNRS, France); Patrick Flandrin (CNRS-ENS de Lyon, France)

Abstract: An extension of Empirical Mode Decomposition (EMD) is defined for graph signals. EMD is an algorithm that decomposes a signal into a sum of modes, in a local and data-driven manner. The proposed Graph EMD (GEMD) for graph signals is based on careful consideration of key points of EMD: defining the extrema, choosing the interpolation procedure, and proposing a stopping criterion for the sifting process. Examples of GEMD are shown on the 2D grid and on two examples of sensor networks. Finally, the effect of the graph's connectivity and of the precise definition of extrema on the algorithm's performance is discussed.
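
The first of the key points listed above, defining extrema on a graph, can be illustrated with one simple candidate definition: a node is a local maximum (minimum) if its value is strictly greater (smaller) than the values at all of its neighbours. This definition is an assumption made for illustration; the abstract indicates that the precise definition matters and is discussed in the paper, so the authors' choice may differ. The function name and the adjacency-dictionary representation are hypothetical.

```python
def local_extrema(adjacency, signal):
    """Find local extrema of a graph signal.

    adjacency: dict mapping each node to a list of neighbouring nodes.
    signal:    dict (or list) mapping each node to its signal value.
    Assumed definition: strict comparison against all direct neighbours;
    isolated nodes are treated as neither maxima nor minima.
    """
    maxima, minima = [], []
    for node, neighbours in adjacency.items():
        if not neighbours:
            continue
        values = [signal[m] for m in neighbours]
        if all(signal[node] > x for x in values):
            maxima.append(node)
        elif all(signal[node] < x for x in values):
            minima.append(node)
    return sorted(maxima), sorted(minima)
```

On a path graph this reduces to the usual notion of extrema of a sampled one-dimensional signal, which is a useful sanity check for any graph-based definition.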


FR-L13-3: Consensus for Continuous Belief Functions

Zhiyuan Weng (Stony Brook University, USA); Petar M. Djurić (Stony Brook University, USA)

Abstract: We study the belief consensus problem in networks of agents. Unlike previous work in the literature, where agents try to reach consensus on a scalar or vector, here we investigate how agents can reach a consensus on a probability distribution. In our setting, the agents fuse functions instead of point estimates. The objective is that every agent ends up with a belief equal to the global Bayesian posterior. We show that achieving this objective requires knowledge of the total number of agents in the network. In some circumstances, however, the number of agents is not available and therefore the global Bayesian posterior is not achievable. In such cases, we have to resort to approximation methods. We confine ourselves to Gaussian cases, formulate the problem, and propose two methods for the selection of weighting coefficients in the fusion process. Simulations have been performed to demonstrate the performance of the methods.
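
A small sketch can show why the number of agents enters in the Gaussian case. Assume each agent i holds a scalar Gaussian belief N(mean_i, var_i) and that the global posterior is proportional to the product of these beliefs (independent observations, flat prior); these modelling choices, the ring topology in the usage note, and all names are assumptions for illustration, not the authors' formulation. Average consensus delivers the network average of the precisions 1/var_i, while the product of Gaussians needs their sum, so the average must be multiplied by the number of agents N.

```python
def average_consensus(values, neighbours, step, iterations):
    """Standard linear average-consensus iteration on an undirected graph.

    neighbours[i] lists the neighbours of agent i. Assumption: step is
    small enough for convergence on the given graph.
    """
    x = list(values)
    for _ in range(iterations):
        x = [x[i] + step * sum(x[j] - x[i] for j in neighbours[i])
             for i in range(len(x))]
    return x


def fuse_gaussians(means, variances, neighbours, step=0.3, iterations=200):
    """Each agent recovers the product-of-Gaussians posterior.

    Consensus is run on the information-form parameters; the fused
    variance uses N = len(means), which is where the total number of
    agents is needed.
    """
    n = len(means)
    precisions = [1.0 / v for v in variances]
    weighted_means = [m / v for m, v in zip(means, variances)]
    avg_prec = average_consensus(precisions, neighbours, step, iterations)
    avg_wm = average_consensus(weighted_means, neighbours, step, iterations)
    # The mean is a ratio of averages, so N cancels; the variance needs N.
    return [(w / p, 1.0 / (n * p)) for w, p in zip(avg_wm, avg_prec)]
```

With four agents on a ring, `fuse_gaussians([1, 2, 3, 4], [1, 1, 1, 1], [[1, 3], [0, 2], [1, 3], [2, 0]])` gives every agent a fused mean of 2.5 and variance of 0.25. Without knowing N, each agent could still compute the fused mean but not the fused variance.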


FR-L13-4: Distributed Reduced-Rank Estimation Based on Joint Iterative Optimization in Sensor Networks

Songcen Xu (University of York, United Kingdom); Rodrigo C. de Lamare (University of York, United Kingdom); H. Vincent Poor (Princeton University, USA)

Abstract: This paper proposes a novel distributed reduced-rank scheme and a novel adaptive algorithm for distributed estimation in wireless sensor networks. The proposed distributed scheme is based on a transformation that performs dimensionality reduction at each agent of the network followed by a reduced-dimension parameter vector. A low-complexity distributed reduced-rank joint iterative estimation algorithm is developed that achieves significantly reduced communication overhead and improved performance compared with existing techniques. Simulation results illustrate the advantages of the proposed strategy in terms of convergence rate and mean square error performance when compared to existing algorithms.


FR-L13-5: On Sequential Estimation of Linear Models From Data with Correlated Noise

Yunlong Wang (Stony Brook University, USA); Petar M. Djurić (Stony Brook University, USA)

Abstract: In this paper, we consider the problem of Bayesian sequential estimation of a set of time invariant parameters. At every time instant, a new observation through a linear model is obtained where the observation is distorted by correlated noise with unknown covariance. We derive the joint posterior of the parameters of interest and the covariance, and we propose several approximations to make the Bayesian estimation tractable. Then we propose a method for forming a pseudo posterior, which is suitable for settings where consensus-based distributed estimation is applied. By computer simulations, we show that the Kullback-Leibler divergence between the pseudo posterior and a posterior obtained from a known covariance is decreasing. We also provide computer simulations that compare the proposed method with the least squares method.



Session FR-L14: Acoustic Scene Analysis in Domestic Environments (Special Session)


FR-L14-1: Model-Based Processing for Acoustic Scene Analysis

Climent Nadeu (UPC, Spain); Rupayan Chakraborty (Universitat Politecnica de Catalunya, Spain); Martin Wolf (Universitat Politècnica de Catalunya, Spain)

Abstract: The analysis of acoustic scenes requires several functionalities, of which recognition (speech, speaker, other acoustic events) and spatial localization are perhaps the two most relevant. To reduce invasiveness, the microphones are far away from the sound sources, and possibly grouped in arrays, which may be distributed, not arranged, in the room. Aiming at an increased performance, the usual model-based approach employed for sound recognition or detection can be extended to other co-occurring tasks like source localization, so both tasks can be carried out jointly, using the same formulation and processing. In this paper, we intend to illustrate that point by presenting together a few new model-based techniques that deal with the problems of overlapped-sounds recognition, multi-source localization, and channel selection. They are briefly described, and tested in a smart-room environment with a multiple microphone-array setup.


FR-L14-2: Multi-Microphone Fusion for Detection of Speech and Acoustic Events in Smart Spaces

Panagiotis Giannoulis (National Technical University of Athens, Greece); Gerasimos Potamianos (University of Thessaly, Greece); Athanasios Katsamanis (National Technical University of Athens, Greece); Petros Maragos (National Technical University of Athens, Greece)

Abstract: In this paper, we examine the challenging problem of detecting acoustic events and voice activity in smart indoor environments equipped with multiple microphones. In particular, we focus on channel combination strategies, aiming to take advantage of the multiple microphones installed in the smart space, capturing the potentially noisy acoustic scene from the far-field. We propose various such approaches that can be formulated as fusion at the signal, feature, or decision level, as well as combinations of the above, also including multi-channel training. We apply our methods to two multi-microphone databases: (a) one recorded inside a small meeting room, containing twelve classes of isolated acoustic events; and (b) a speech corpus containing interfering noise sources, simulated inside a smart home with multiple rooms. Our multi-channel approaches demonstrate significant improvements, reaching relative error reductions over a single-channel baseline of 9.3% and 44.8% in the two datasets, respectively.


FR-L14-3: Distant Speech Recognition in Reverberant Noisy Conditions Employing a Microphone Array

Juan A. Morales-Cordovilla (Graz University of Technology, Austria); Martin Hagmüller (Graz University of Technology, Austria); Hannes Pessentheiner (Graz University of Technology, Austria); Gernot Kubin (Graz University of Technology, Austria)

Abstract: This paper addresses the problem of distant speech recognition in reverberant noisy conditions employing a microphone array. We present a prototype system that can segment the utterances in real time and generate robust ASR results off-line. The segmentation is carried out by a voice activity detector based on deep belief networks, the speaker localization by a position-pitch plane, and the enhancement by a novel combination of convex optimized beamforming and vector Taylor series compensation. All of the components are compared with other similar ones and justified in terms of word accuracy on a proposed database which simulates distant speech recognition in a home environment.


FR-L14-4: Exploiting Inter-Microphone Agreement for Hypothesis Combination in Distant Speech Recognition

Cristina M. Guerrero (Fondazione Bruno Kessler, Italy); Maurizio Omologo (Fondazione Bruno Kessler - irst, Italy)

Abstract: A multi-microphone hypothesis combination approach, suitable for the distant-talking scenario, is presented in this paper. The method is based on the inter-microphone agreement of information extracted at the speech recognition level. In particular, temporal information is exploited to organize the clusters that shape the resulting confusion network, and to reduce the global hypothesis search space. As a result, a single combined confusion network is generated from multiple lattices. The approach offers a novel perspective on solutions based on confusion network combination. The method was evaluated in a simulated domestic environment equipped with widely spaced microphones. The experimental evidence suggests that results comparable to or, in some cases, better than the state of the art can be achieved under optimal configurations with the proposed method.


FR-L14-5: Experiments in Acoustic Source Localization Using Sparse Arrays in Adverse Indoors Environments

Antigoni Tsiami (National Technical University of Athens + ATHENA-RC, Greece); Athanasios Katsamanis (National Technical University of Athens, Greece); Petros Maragos (National Technical University of Athens, Greece); Gerasimos Potamianos (University of Thessaly, Greece)

Abstract: In this paper we experiment with 2-D source localization in smart homes under adverse conditions using sparse distributed microphone arrays. We propose some improvements to deal with problems due to high reverberation and use of a limited number of microphones. These consist of a pre-filtering stage for dereverberation and an iterative procedure that aims to increase accuracy. Experiments carried out in relatively large databases with both simulated and real recordings of sources in various positions indicate that the proposed method exhibits a better performance compared to others under challenging conditions while also being computationally efficient. It is demonstrated that although reverberation degrades localization performance, this degradation can be compensated by identifying the reliable microphone pairs and disposing of the outliers.



Session FR-L15: Non-Linear Signal Processing


FR-L15-1: Exploring Deep Markov Models in Genomic Data Compression Using Sequence Pre-analysis

Diogo Pratas (University of Aveiro, Portugal); Armando J Pinho (University of Aveiro, Portugal)

Abstract: The pressure to find efficient genomic compression algorithms is being felt worldwide, as proved by several prizes and competitions. In this paper, we propose a compression algorithm that relies on a pre-analysis of the data before compression, with the aim of identifying regions of low complexity. This strategy enables us to use deeper context models, supported by hash-tables, without requiring huge amounts of memory. As an example, context depths as large as 32 are attainable for alphabets of four symbols, as is the case of genomic sequences. These deeper context models show very high compression capabilities in very repetitive genomic sequences, yielding improvements over previous algorithms. Furthermore, this method is universal, in the sense that it can be used in any type of textual data (such as quality-scores).
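
The notion of a context model backed by a hash table can be sketched as follows: only the contexts that actually occur are stored, so a deep order-k model over a four-symbol alphabet does not need a table with 4^k rows. The sketch estimates the adaptive code length of a sequence in bits; the additive-smoothing estimator with parameter `alpha`, the function name, and the absence of any pre-analysis stage are assumptions for illustration and do not reproduce the authors' compressor.

```python
import math
from collections import defaultdict


def context_model_bits(sequence, order, alphabet="ACGT", alpha=1.0):
    """Estimate the cost, in bits, of coding `sequence` with an adaptive
    order-`order` context model whose counts live in a hash table.

    The first `order` symbols are used only as initial context and are
    not charged. Probabilities use additive smoothing with `alpha`
    (an assumed estimator).
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol counts
    total_bits = 0.0
    for i in range(order, len(sequence)):
        context = sequence[i - order:i]
        symbol = sequence[i]
        ctx_counts = counts[context]
        seen = sum(ctx_counts.values())
        prob = (ctx_counts[symbol] + alpha) / (seen + alpha * len(alphabet))
        total_bits += -math.log2(prob)
        ctx_counts[symbol] += 1                     # update after coding
    return total_bits
```

On a highly repetitive sequence, a deeper context gives a much smaller estimated code length than an order-0 model, which is the effect the abstract attributes to deep models on repetitive genomic data.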


FR-L15-2: Perfect Periodic Sequences for Legendre Nonlinear Filters

Alberto Carini (University of Urbino, Italy); Stefania Cecchi (Università Politecnica delle Marche, Italy); Laura Romoli (Università Politecnica delle Marche, Italy); Giovanni L. Sicuranza (University of Trieste, Italy)

Abstract: The paper shows that perfect periodic sequences can be developed and used for the identification of Legendre nonlinear filters, a sub-class of linear-in-the-parameters nonlinear filters recently introduced in the literature. A periodic sequence is perfect for the identification of a nonlinear filter if all cross-correlations between two different basis functions, estimated over a period, are zero. Using perfect periodic sequences as input signals, the unknown nonlinear system and its most relevant basis functions can be identified with the cross-correlation method. The effectiveness and efficiency of this approach are illustrated with experimental results involving a real nonlinear system.


FR-L15-3: A Multidimensional Approach to Wave Digital Filters with Multiple Nonlinearities

Tim Schwerdtfeger (Bergische Universität Wuppertal, Germany); Anton Kummert (University of Wuppertal, Germany)

Abstract: The implementation of nonlinear elements in Wave Digital Filters (WDFs) is usually restricted to just one nonlinear one-port per structure. Existing approaches that aim to circumvent this restriction have in common that they neglect the notion of modularity and thus the reusability of the original Wave Digital concept. In this paper, a new modular approach to implement an arbitrary number of nonlinearities based on Multidimensional Wave Digital Filters (MDWDFs) is presented. For this, the contractivity property of WDFs is shown. On that basis, the new approach is studied with respect to possible side-effects and an appropriate modification is proposed that counteracts these effects and significantly improves the convergence behaviour.


FR-L15-4: Fast Filter in Nonlinear Systems with Application to Stochastic Volatility Model

Stephane Derrode (LIRIS (CNRS UMR 5205), Ecole Centrale de Lyon, France); Wojciech Pieczynski (Télécom SudParis, France)

Abstract: We consider the problem of optimal statistical filtering in nonlinear and non-Gaussian systems. The novelty consists of approximating the nonlinear system by a recent switching system, in which exact fast optimal filtering is workable. The new method is applied to filtering in a stochastic volatility model, and some experiments show its efficiency.


FR-L15-5: Generalization of Campbell's Theorem to Nonstationary Noise

Leon Cohen (Hunter College of the City University of NY, USA)

Abstract: Campbell's theorem is a fundamental result of noise theory and has found application in many fields of science and engineering. It gives a simple but very powerful result for the mean and standard deviation of a stationary random pulse train. We generalize Campbell's theorem to the nonstationary case where the random process is space and time dependent. We also generalize it to a pulse train of waves, acoustic and electromagnetic, where the intensity is defined as the absolute square of the pulse train.
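
For reference, the classical stationary form of Campbell's theorem can be written as below. The notation is an assumption introduced here (pulse shape h, Poisson arrival times t_k with constant rate λ) and is not taken from the paper; the nonstationary generalization announced in the abstract is not reproduced.

```latex
% Classical (stationary) Campbell's theorem for a random pulse train.
% Assumed notation: h = pulse shape, t_k = Poisson arrival times, \lambda = rate.
X(t) = \sum_{k} h(t - t_k), \qquad
\mathrm{E}[X(t)] = \lambda \int_{-\infty}^{\infty} h(\tau)\, d\tau, \qquad
\mathrm{Var}[X(t)] = \lambda \int_{-\infty}^{\infty} h^{2}(\tau)\, d\tau .
```

The standard deviation mentioned in the abstract is the square root of the variance expression.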



Session FR-P1: Sensor Array and Multichannel Signal Processing I


FR-P1-1: 2-D Angle of Arrival Estimation Using a One-Dimensional Antenna Array

Saleh Al-Jazzar (Al-Zaytoonah University of Jordan, Jordan)

Abstract: In this paper, a two-dimensional (2-D) angle of arrival (AOA) estimator is presented for vertically polarised waves in which a one-dimensional (1-D) antenna array is used. Many 2-D AOA estimators were previously developed to estimate elevation and azimuth angles. These estimators require a 2-D antenna array setup such as the L-shaped or parallel antenna 1-D arrays. In this paper a 2-D AOA estimator is presented which requires only a 1-D antenna array. This presented method is named Estimation of 2-D Angle of arrival using Reduced antenna array dimension (EAR). The EAR estimator utilises the antenna radiation pattern factor to reduce the required antenna array dimensionality. Thus, 2-D AOA estimation is possible using antenna arrays of reduced size and with a minimum of two elements only, which is very beneficial in applications with size and space limitations. Simulation results are presented to show the performance of the presented method.


FR-P1-2: Adaptive Waveform Selection and Target Tracking by Wideband Multistatic Radar/Sonar Systems

Ngoc Hung Nguyen (University of South Australia, Australia); Kutluyıl Doğançay (University of South Australia, Australia); Linda M. Davis (University of South Australia, Australia)

Abstract: An adaptive waveform selection algorithm for target tracking by multistatic radar/sonar systems in wideband environments is presented to minimize the tracking mean squared error. The proposed selection algorithm is developed based on the minimization of the trace of the error covariance matrix for the target state estimates (i.e. the target position and target velocity). This covariance matrix can be computed using the Cramer-Rao lower bounds of the wideband radar/sonar measurements. The performance advantage of the proposed adaptive waveform selection algorithm over the conventional fixed waveforms with minimum and maximum time-bandwidth products is demonstrated by simulation examples using various FM waveform classes.


FR-P1-3: Alternating Maximization Algorithm for the Broadcast Beamforming

Özlem Tuğfe Demir (Middle East Technical University, Turkey); T. Engin Tuncer (Middle East Technical University, Turkey)

Abstract: Semidefinite relaxation (SDR) is a powerful approach to solve nonconvex optimization problems involving a rank condition. However, its performance becomes unacceptable for certain cases. In this paper, a nonconvex equivalent formulation without the rank condition is presented for the broadcast beamforming problem. This new formulation is exploited to obtain an alternating optimization method which is shown to converge to the local optimum rank one solution. The proposed method opens up new possibilities in different applications. Simulations show that the new method is very effective and can attain the global optimum, especially when the number of users is low.


FR-P1-4: Closed-form Approximations of the PAPR Distribution for Multi-Carrier Modulation Systems

Marwa Chafii (Supélec, France); Jacques Palicot (IETR/Supélec, France); Rémi Gribonval (INRIA, France)

Abstract: The theoretical analysis of the Peak-to-Average Power Ratio (PAPR) distribution for an Orthogonal Frequency Division Multiplexing (OFDM) system depends on the particular waveform considered in the modulation system. In this paper, we generalize this analysis by considering the Generalized Waveforms for Multi-Carrier (GWMC) modulation system based on any family of modulation functions, and we derive a general approximate expression for the Cumulative Distribution Function (CDF) of its continuous and discrete time PAPR. These equations allow us to directly find the expressions of the PAPR distribution for any particular family of modulation functions, and they can be applied to control the PAPR performance by choosing the appropriate functions.


FR-P1-5: Path Uncertainty Robust Beamforming

Richard Stanton (Imperial College London, United Kingdom); Mike Brookes (Imperial College London, United Kingdom)

Abstract: Conventional beamformer design assumes that the phase differences between the received sensor signals are a deterministic function of the array and source geometry. In fact however, these phase differences are subject to random variations arising both from source and sensor position uncertainties and from fluctuations in sound velocity. We present a framework for modelling these uncertainties and show that improved beamformers are obtained when they are taken into account.


FR-P1-6: Robust DOA Estimation of Harmonic Signals Using Constrained Filters on Phase Estimates

Sam Karimian-Azari (Aalborg University, Denmark); Jesper Rindom Jensen (Aalborg University, Denmark); Mads Græsbøll Christensen (Aalborg University, Denmark)

Abstract: In array signal processing, defined distances between receivers, e.g., microphones, ideally cause multi-channel time delays depending on the frequency and direction of arrival (DOA) of a signal source. We can estimate the DOA from the time-difference of arrival (TDOA) estimates. However, the conventional DOA estimators based on the TDOA estimates are limited in colored noise. In this paper, we estimate the DOA of a harmonic signal source from multi-channel phase estimates, which relate to narrowband TDOA estimates. Herein, we design filters to apply on phase estimates to obtain the DOA with a minimum variance. Using the linear array and the harmonic constraints, which can also be generalized to non-linear arrays, we design optimal filters based on noise statistics. Therefore, the proposed method is robust in different Gaussian noise, and, in colored noise, the simulation results outperform an optimal state-of-the-art weighted least-squares (WLS) DOA estimator.


FR-P1-7: A Unifying Approach to Minimal Problems in Collinear and Planar TDOA Sensor Network Self-Calibration

Erik Ask (Lund University, Sweden); Yubin Kuang (Lund University, Sweden); Kalle Åström (Lund University, Sweden)

Abstract: This work presents a study of sensor network calibration from time-difference-of-arrival (TDOA) measurements for cases when the dimensions spanned by the receivers and the transmitters differ. This occurs, for example, when receivers are restricted to a line or plane, or when the transmitting objects move linearly in space. Such calibration arises in several applications such as calibration of (acoustic or ultrasound) microphone arrays, and radio antenna networks. We propose a non-iterative algorithm based on recent stratified approaches: (i) rank constraints on a modified measurement matrix, (ii) factorization techniques that determine transmitters and receivers up to an unknown affine transformation and (iii) determining the affine stratification using the remaining nonlinear constraints. This results in a unified approach to solve almost all minimal problems. Such algorithms are important components of systems for self-localization. Experiments are shown both for simulated and real data with promising results.


FR-P1-8: Space-Time Signal Subspace Estimation for Wide-Band Acoustic Arrays

Elio D. Di Claudio (University of Rome "La Sapienza", Italy); Giovanni Jacovitti (INFOCOM Dpt. University of Rome, Italy)

Abstract: Acoustic array applications are generally characterized by very large signal bandwidth. Most existing wide-band direction of arrival (DOA) estimators are based on binning in the frequency domain, so that within each bin the signal model is considered approximately narrow-band. In this work the basic inconsistency of the commonly used binning is first shown. It is shown that the recent Space Time MUSIC (ST-MUSIC) method, which estimates a set of narrow-band signal subspaces directly from the space-time array covariance and combines them within a Weighted Subspace Fitting paradigm, can restore wide-band DOA estimation consistency in most scenarios, obtaining a large variance improvement at high signal to noise ratio (SNR). In addition, a refined ST-MUSIC subspace weighting is proposed to improve accuracy, especially at low SNR.


FR-P1-9: Informed Separation of Dependent Sources Using Joint Matrix Decomposition

Boudjellal Abdelouahab (Polytech'Orléans, France); Karim Abed-Meraim (Polytech'Orléans, France); Adel Belouchrani (Ecole Nationale Polytechnique, Algiers, Algeria); Philippe Ravier (Université d'Orléans, France)

Abstract: This paper deals with the separation problem of dependent sources. The separation is made possible thanks to side information on the dependence nature of the considered sources. In this work, we first show how this side information can be used to achieve the desired source separation using joint matrix decomposition techniques. Indeed, in the case of statistically independent sources, many BSS methods are based on joint matrix diagonalization. In our case, we replace the target diagonal structure by a set of appropriate non-diagonal matrices which reflect the dependence nature of the sources. This new concept is illustrated with two simple 2 x 2 source separation examples where second-order statistics and high-order statistics are used respectively.


FR-P1-10: Range-Doppler Radar Target Detection Using Denoising Within the Compressive Sensing Framework

Rasim Sevimli (Bilkent University, Turkey); Mohammad Tofighi (Bilkent University, Turkey); Enis Çetin (Bilkent University, Turkey)

Abstract: The compressive sensing (CS) idea enables the reconstruction of a sparse signal from a small number of measurements. The CS approach has applications in many areas, one of which is radar systems. In this article, the radar ambiguity function is denoised within the CS framework. A new CS reconstruction algorithm based on the projection onto the epigraph set of the convex function is developed for this purpose. This approach is compared to the other CS reconstruction algorithms. Experimental results are presented.


FR-P1-11: 3-D Array Configuration Using Multiple Regular Tetrahedra for High-Resolution 2-D DOA Estimation

Yuki Doi (Yokohama National University, Japan); Koichi Ichige (Yokohama National University, Japan); Hiroyuki Arai (Yokohama National University, Japan); Hiromi Matsuno (KDDI R&D Laboratories Inc., Japan); Masayuki Nakano (KDDI R&D Labs, Japan)

Abstract: This paper presents a novel 3-D array configuration using multiple regular tetrahedra which enables high resolution 2-D DOA estimation. The proposed array configuration has better DOA estimation performance than that of the conventional 3-D array configuration for uncorrelated waves, and can be rearranged into the cuboid array configuration which can estimate DOAs of correlated waves. Performance of the proposed 3-D array configuration is evaluated through computer simulation.


FR-P1-12: Optimal Adaptive Transmit Beamforming for Cognitive MIMO Sonar in a Shallow Water Waveguide

Nathan Shraga (Tel-Aviv University, Israel); Joseph Tabrikian (Ben-Gurion University of the Negev, Israel)

Abstract: This paper addresses the problem of adaptive beamforming for target localization by active cognitive multiple-input multiple-output (MIMO) sonar in a shallow water waveguide. Recently, a sequential waveform design approach for estimation of parameters of a linear system was proposed. In this approach, at each step, the transmit beampattern is determined based on previous observations. The criterion used for waveform design is the Bayesian Cramér-Rao bound (BCRB) for estimation of the unknown system parameters. In this paper, this method is used for target localization in a shallow water waveguide, and it is extended to account for environmental uncertainties which are typical of underwater acoustic environments. The simulations show the sensitivity of the localization performance of the method at different environmental prior uncertainties.


FR-P1-13: A Lower Bound for Passive Sensor Network Auto-Localization

Rémy Vincent (CEA Leti, France); Mikael Carmona (CEA-Léti, France); Olivier Michel (INPG, France); Jean-Louis Lacoume (INPG, France)

Abstract: Passive travel-time estimation is here defined as the retrieval of an inter-sensor propagation delay, using uncontrolled ambient sources in a homogeneous, non-dispersive, linear and time-invariant wave propagation medium. The temporal and spatial spectral properties of the sources are uncontrolled; an example of such sources would be cracks. Our approach relates to passive linear systems identification through the use of ambient noise correlations and Ward identities to form estimators. The goal of this article is to assess the performance of such passive auto-localization estimation by discussing the effect of non-white sources in both time and space and by deriving a lower bound for the variance of the proposed estimator.


FR-P1-14: Interval-based Localization Using Sensors Mobility and Fingerprints in Decentralized Sensor Networks

Xiaowei Lv (Université de Technologie de Troyes, France); Farah Mourad-Chehade (Université de Technologie de Troyes, France); Hichem Snoussi (University of Technology of Troyes, France)

Abstract: This paper is focused on the decentralized localization problem of mobile sensors in wireless sensor networks. Based on a combined localization technique, it uses accelerometer, gyroscope and fingerprinting information to solve the positioning issue. Using the sensors' mobility, the proposed method computes first estimates of the sensors' positions. It then proceeds to a decentralized localization scheme, where the network is divided into different zones. RSSI fingerprints are jointly used with mobility information in order to compute position estimates. Final position estimates are obtained by means of interval analysis where all uncertainties are considered throughout the estimation process.



Session FR-P2: Signal Processing Applications I


FR-P2-1: Reconstruction Technique of Fluorescent X-Ray Computed Tomography Using Sheet Beam

Tetsuya Yuasa (Yamagata University, Japan)

Abstract: We clarify the measurement process of fluorescent x-ray computed tomography (FXCT) using sheet-beam as incident beam, and show that the process leads to the attenuated Radon transform. In order to improve quantitativeness, we apply Natterer's scheme to the FXCT reconstruction. We show its efficacy by computer simulation.


FR-P2-2: ElectroMyoGram Signal Enhancement in fMRI Noise Using Spectral Subtraction

Sofia BenJebara (Ecole Superieure des Communications de Tunis, Tunisia)

Abstract: This paper deals with noise removal in ElectroMyoGram (EMG) signals acquired in the hostile noisy environment of functional Magnetic Resonance Imaging (fMRI). The noise due to magnetic fields and radio frequencies significantly corrupts the EMG signal, which renders its extraction very difficult. The proposed approach operates in the frequency domain to estimate the noise spectrum and subtract it from the noisy observation spectrum. The noise estimation is based on spectral minima tracking in each frequency bin without any distinction between muscle activity and muscle rest. However, it looks for connected time-frequency regions of muscle activity presence to estimate a bias compensation factor. The method is tested with a simulated noisy observation in order to evaluate its performance using objective criteria. It is also validated for real noisy observations where no clean signal is available.


FR-P2-3: Gunshot Signal Enhancement for DOA Estimation and Weapon Recognition

Ângelo Borzino (Military Institute of Engineering (IME), Brazil); José Antonio Apolinário Jr. (Military Institute of Engineering (IME), Brazil); Marcello Campos (Federal University of Rio de Janeiro, Brazil); Carla Pagliari (Instituto Militar de Engenharia, Brazil)

Abstract: This paper proposes a deconvolution technique for gunshot signals aiming at improving direction of arrival estimation and weapon recognition. When dealing with field recorded signals, reflections degrade the performance of these tasks and a signal enhancement technique is required. Our scheme improves a gunshot signal by delaying and summing its reflections. Conventional blind deconvolution schemes are not reliable when applied to impulsive signals. While other techniques impose restrictions on the signal in order to ensure stability, the one proposed herein can be used without such limitations. The proposed deconvolution was tested with real gunshot signals and both applications performed well.


FR-P2-4: Automatic Generation of Personalised Alert Thresholds for Patients with COPD

Carmelo Velardo (University of Oxford, United Kingdom); Syed Ahmar Shah (University of Oxford, United Kingdom); Oliver Gibson (University of Oxford, United Kingdom); Heather Rutter (University of Oxford, United Kingdom); Andrew Farmer (University of Oxford, United Kingdom); Lionel Tarassenko (University of Oxford, United Kingdom)

Abstract: Chronic Obstructive Pulmonary Disease (COPD) is a chronic disease predicted to become the third leading cause of death by 2030. Patients with COPD are at risk of exacerbations in their symptoms, which have an adverse effect on their quality of life and may require emergency hospital admission. Using the results of a pilot study of an m-Health system for COPD self-management and tele-monitoring, we demonstrate a data-driven approach for computing personalised alert thresholds to prioritise patients for clinical review. Univariate and multivariate methodologies are used to analyse and fuse daily symptom scores, heart rate, and oxygen saturation measurements. We discuss the benefits of a multivariate kernel density estimator which improves on univariate approaches.


FR-P2-5: Cochlear Implant Artifact Rejection in Electrically Evoked Auditory Steady State Responses

Hanne Deprez (KU Leuven, Belgium); Michael Hofmann (ExpORL, Dept. Of Neurosciences, KU Leuven, Belgium); Astrid van Wieringen (KU Leuven, Belgium); Jan Wouters (Katholieke Universiteit Leuven, Belgium); Marc Moonen (KU Leuven, Belgium)

Abstract: Electrically evoked auditory steady state responses (EASSRs) are EEG signals measured in response to periodic or modulated pulse trains presented through a cochlear implant (CI). EASSRs are studied for the objective fitting of CIs in infants, as electrophysiological thresholds determined with EASSRs correlate well with behavioural thresholds. Currently available techniques to remove CI artifacts from such measurements are only able to deal with artifacts for low-rate pulse trains or modulated pulse trains presented in bipolar mode, which are not used in main clinical practice. In this paper, a fully automatic EASSR CI artifact rejection technique based on independent component analysis (ICA) is presented that is suitable for clinical parameters. Artifactual independent components are automatically selected based on the spectral amplitude of the pulse rate. Electrophysiological thresholds determined based on ICA compensated signals are equal to those detected using blanked signals, but measurements at only one modulation frequency are required.


FR-P2-6: Pedaling Parameters Behavior on Healthy Subjects: Towards a Rehabilitation Indication

David Barbosa (University of Minho, Portugal); Maria Martins (Minho University, Portugal); Cristina dos Santos (University of Minho, Guimarães, Portugal); Lino Costa (Minho University, Portugal); António Pereira (University of Minho, Portugal); Eurico Seabra (University of Minho, Portugal)

Abstract: It is of utmost importance to identify the quantitative indicators that characterize the rehabilitation degree of the lower limbs of stroke patients and qualitative indicators of the "quality" of the movement. As a first step in this direction, a cycling ergometer, used in hospitals and rehabilitation clinics, was modified to provide information about the force applied to the pedals and the pedal angles. One group of non-pathological subjects performed a set of trials at different workloads and cadence values, to analyze the effect of these variables on force output. An increased workload resulted in an increase in the work performed by each leg, whereas the cadence results were inconclusive. Results suggest that the variation of the workload may be a suitable method to characterize motor impairments.


FR-P2-7: Metric Learning for Event-Related Potential Component Classification in EEG Signals

Qi Liu (Institute of Automation, Chinese Academy of Sciences, P.R. China); Xiao-Guang Zhao (Institute of Automation, Chinese Academy of Sciences, P.R. China); Zeng-Guang Hou (Institute of Automation, Chinese Academy of Sciences, P.R. China)

Abstract: We introduce in this paper a metric learning approach for the classification of the P300 cognitive component based on EEG signals. It is shown that the SVM classification accuracy is significantly improved by learning a similarity metric from training data instead of using the default Euclidean metric. The effectiveness of the algorithm has been validated through data analysis on a BCI Competition dataset (P300 speller BCI data).


FR-P2-8: A Discriminative Approach to Automatic Seizure Detection in Multichannel EEG Signals

David James (Swansea University, United Kingdom); Xianghua Xie (Swansea University, United Kingdom); Parisa Eslambolchilar (FIT Lab, United Kingdom)

Abstract: The aim of this paper is to introduce the application of Random Forests to the automated analysis of epileptic EEG data. Feature extraction is performed using a discrete wavelet transform to give time-frequency representations, from which statistical features based on the wavelet decompositions are formed and used for training and classification. We show that Random Forests can be used for the classification of ictal, inter-ictal and healthy EEG with a high level of accuracy, with 99% sensitivity and 93.5% specificity for classifying ictal and inter-ictal EEG, 90.6% sensitivity and 95.7% specificity for the windowed data and 93.9% sensitivity for seizure onset classification.


FR-P2-9: Stockwell Transform Optimization. Applied on the Detection of Split in Heart Sounds

Ali Moukadem (University of Haute Alsace, France); Zied Bouguila (University of Haute Alsace, France); Djaffar Ould Abdeslam (University of Haute Alsace, France); Alain Dieterlen (MIPS Laboratory, University of Haute Alsace, France)

Abstract: The aim of this paper is to improve the energy concentration of the Stockwell transform (S-transform) in the time-frequency domain. Several methods proposed in the literature tried to introduce novel parameters to control the width of the Gaussian window in the S-transform. In this study, a modified S-transform is proposed with four parameters to control the Gaussian window width. A genetic algorithm is applied to select the optimal parameters which maximize the energy concentration measure. The application presented in this paper consists of detecting splits in heart sounds and calculating their duration, which is valuable medical information. A comparison with other well-known time-frequency transforms such as the short-time Fourier transform (STFT) and the smoothed-pseudo Wigner-Ville distribution (SPWVD) is performed and discussed.


FR-P2-10: Vessel Centerline Detection in Retinal Images Based on a Corner Detector and Dynamic Thresholding

Ivo Soares (University of Beira Interior, Portugal); Miguel Castelo Branco (University of Beira Interior, Portugal); Antonio M. G. Pinheiro (University of Beira Interior, Portugal)

Abstract: This paper describes a new method for the calculation of the retinal vessel centerlines using a scale-space approach for increased reliability and effectiveness. The algorithm begins with a new vessel detector description method based on a modified corner detector. Then the vessel detector is filtered with a set of binary rotating filters, resulting in enhanced vessel structures. The main vessels can be selected with a dynamic thresholding approach. In order to deal with vessel bifurcations and vessel crossovers that might not be detected, the initial retinal image is processed with a set of four directional differential operators. The resulting directional images are then combined with the detected vessels, creating the final vessel centerline image. The performance of the algorithm is evaluated using two different methods.


FR-P2-11: VOG-Enhanced ICA for SSVEP Response Detection From Consumer-Grade EEG

Mohammad Reza Haji Samadi (The University of Birmingham, United Kingdom); Neil Cooke (University of Birmingham, United Kingdom)

Abstract: The steady-state visual evoked potential (SSVEP) brain-computer interface (BCI) paradigm detects when users look at flashing static and dynamic visual stimuli. Electrooculogram (EOG) artefacts in the electroencephalography (EEG) signal limit the application for dynamic stimuli because they elicit smooth pursuit eye movement. We propose 'VOG-ICA' - an EOG artefact rejection technique based on Independent Component Analysis (ICA) that uses video-oculography (VOG) information from an eye tracker. It demonstrates good performance compared to Plöchl when evaluated on matched VOG and EEG data collected with consumer grade eye tracking and wireless cap EEG apparatus. SSVEP response detection from frequential features extracted from ICA components demonstrates higher SSVEP response detection accuracy and lower between-person variation compared with extracted features from raw and post-ICA reconstructed 'clean' EEG. The work highlights the requirement for robust EEG artefact and SSVEP response detection techniques for consumer-grade multimodal apparatus.


FR-P2-12: EEG Signal Processing for Eye Tracking

Mohammad Reza Haji Samadi (The University of Birmingham, United Kingdom); Neil Cooke (University of Birmingham, United Kingdom)

Abstract: Head-mounted Video-Oculography (VOG) eye tracking is visually intrusive due to a camera in the peripheral view. Electrooculography (EOG) eye tracking is socially intrusive because of face-mounted electrodes. In this work we explore Electroencephalography (EEG) eye tracking from less intrusive wireless cap scalp based electrodes. Classification algorithms to detect eye movement and the focus of foveal attention are proposed and evaluated on data from a matched dataset of VOG and 16-channel EEG. The algorithms utilise EOG artefacts and the brain's steady state visually evoked potential (SSVEP) response while viewing a flickering stimulus. We demonstrate improved performance by extracting features from source signals estimated by Independent Component Analysis (ICA) rather than the traditional band-pass preprocessed EEG channels. The work envisages eye tracking technologies that utilise non-facially intrusive EEG brain sensing via wireless dry contact scalp based electrodes.


FR-P2-13: Processing of Laser Speckle Contrast Images to Analyze the Impact of Aging on Moving Blood Cells

Adil Khalil (University of Angers, France); Anne Humeau-Heurtier (LISA, France); Pierre Abraham (Centre Hospitalier Universitaire d'Angers, France); Guillaume Mahé (Centre Hospitalier Universitaire d'Angers, France)

Abstract: It has long been recognized that age alters microcirculation. The follow-up of such alterations can be performed by monitoring microvascular blood flow. Laser speckle contrast imaging (LSCI) has recently been commercialized to monitor microvascular blood flow. From laser speckle contrast images, the velocity of microvascular moving scatterers (mainly red blood cells) can be computed when a profile for the velocity distribution is assumed. Our goal herein is to analyze whether alterations of microcirculation with age can be determined by processing experimental LSCI data. In our work a Lorentzian velocity profile is assumed and the presence of static scatterers, like skin, is taken into account. Our results show that moving scatterer velocities computed from LSCI data vary with age: blood cell velocities increase with age. Moreover, the more static scatterers there are, the higher the moving scatterer velocity values. Our findings are a first step in the analysis of the impact of aging from the processing of laser speckle contrast images.


FR-P2-14: Low-complexity, Multi-channel, Lossless and Near-lossless EEG Compression

Ignacio Capurro (Universidad de la República, Uruguay); Federico Lecumberry (Universidad de la República, Uruguay); Álvaro Martín (Universidad de la República, Uruguay); Ignacio Ramirez (Universidad de la República, Uruguay); Eugenio Rovira (Universidad de la República, Uruguay); Gadiel Seroussi (Universidad de la República, Uruguay)

Abstract: Current EEG applications imply the need for low-latency, low-power, high-fidelity data transmission and storage algorithms. This work proposes a compression algorithm meeting these requirements through the use of modern information theory and signal processing tools (such as universal coding, universal prediction, and fast online implementations of multivariate recursive least squares), combined with simple methods to exploit spatial as well as temporal redundancies typically present in EEG signals. The resulting compression algorithm requires O(1) operations per scalar sample and surpasses the current state of the art in near-lossless and lossless EEG compression ratios.


FR-P2-15: RFID-Based Butterfly Location Sensing System

Simo Särkkä (Aalto University, Finland); Ville Viikari (Aalto University School of Electrical Engineering, Finland); Kaarle Jaakkola (VTT, Finland)

Abstract: In this paper, we describe the implementation of a passive radio frequency identification (RFID) based location sensing system, which is intended for monitoring the movement and activity of butterflies for biological research purposes. The system uses the phase of the received signal to derive a time-of-flight measurement which is used for location and speed estimation. We present the design and characteristics of the developed small RFID tags, the antenna and system configuration, as well as the non-linear Kalman filtering and smoothing based tracking algorithm. We also present experimental results obtained in an anechoic chamber as well as in a field test environment.



Session FR-P3: Image and Video Analysis II


FR-P3-1: Segmentation of 3D Dynamic Meshes Based on Reeb Graph Approach

Meha Hachani (Syscom Laboratory, Tunisia); Azza Ouled Zaid (University of Tunis El Manar, Tunisia); William Puech (LIRMM, France)

Abstract: This paper presents a new segmentation approach for 3D dynamic meshes, based upon ideas from Morse theory and Reeb graphs. The segmentation process is performed using topological analysis of smooth functions defined on the 3D mesh surface. The main idea is to detect critical nodes located on the mobile and immobile parts. In particular, we define a new continuous scalar function, used for Reeb graph construction. This function is based on heat diffusion properties. Clusters are obtained according to the values of the scalar function, followed by a refinement step. The latter is based on curvature information in order to adjust segmentation boundaries. Experimental results performed on 3D dynamic articulated meshes demonstrate high accuracy and stability under topology changes and various perturbations through time.


FR-P3-2: Human Detection and Tracking Through Temporal Feature Recognition

Fraser K Coutts (University of Strathclyde, United Kingdom); Stephen Marshall (University of Strathclyde, United Kingdom); Paul Murray (University of Strathclyde, United Kingdom)

Abstract: The ability to accurately track objects of interest - particularly humans - is of great importance in the fields of security and surveillance. In such scenarios, the application of accurate, automated human tracking offers benefits over manual supervision. In this paper, recent efforts made to investigate the improvement of automated human detection and tracking techniques through the recognition of person-specific time-varying signatures in thermal video are detailed. A robust human detection algorithm is developed to aid the initialisation stage of a state-of-the-art existing tracking algorithm. In addition, coupled with the spatial tracking methods present in this algorithm, the inclusion of temporal signature recognition in the tracking process is shown to improve human tracking results.


FR-P3-3: Estimation of the Weight Parameter with SAEM for Marked Point Processes Applied to Object Detection

Aurélie Boisbunon (INRIA Sophia-Antipolis Méditerranée – AYIN Team, France); Josiane Zerubia (INRIA, Sophia Antipolis, France)

Abstract: We consider the problem of estimating one of the parameters of a marked point process, namely the tradeoff parameter between the data and prior energy terms defining the probability of the process. In previous work, the Stochastic Expectation-Maximization (SEM) algorithm was used. However, SEM is well known for its poor convergence properties, which may also slow down the estimation. Therefore, in this work, we consider an alternative to SEM: the Stochastic Approximation EM (SAEM) algorithm, which makes efficient use of all the simulated data. We compare both approaches on high resolution satellite images where the objective is to detect boats in a harbor.


FR-P3-4: Joint Road Network Extraction From a Set of High Resolution Satellite Images

Olfa Besbes (SUPCOM, Tunisia); Amel Benazza (SUP'COM, Tunisia)

Abstract: In this paper, we develop a novel Conditional Random Field (CRF) formulation to jointly extract road networks from a set of high resolution satellite images. Our fully unsupervised method relies on a pairwise CRF model defined over a set of test images, which encodes prior assumptions about the roads such as thinness and elongation. Four competitive energy terms related to color, shape, symmetry and contrast-sensitive potentials are suitably defined to tackle the challenging problem of road network extraction. The resulting objective energy is minimized by resorting to graph-cuts tools. Promising results are obtained for developed suburban scenes of remotely sensed images. The proposed model significantly improves the segmentation quality compared with the independent CRF and two state-of-the-art methods.


FR-P3-5: Automatic Design of Aperture Filters Using Neural Networks Applied to Ocular Image Segmentation

Marco Benalcázar (Universidad Nacional de Mar del Plata, Argentina); Marcel Brun (Universidad Nacional de Mar del Plata, Argentina); Virginia Ballarin (Universidad Nacional de Mar del Plata, Argentina)

Abstract: Aperture filters are image operators which combine mathematical morphology and pattern recognition theory to design windowed classifiers. Previous works propose designing and representing such operators using large decision tables and classic linear pattern classifiers. These approaches demand an enormous computational cost in order to solve real image problems. The current work presents a new method to automatically design Aperture filters for color and grayscale image processing. This approach consists of designing a family of Aperture filters using artificial feed-forward neural networks. The resulting Aperture filters are combined into a single one using an ensemble method. The performance of the proposed approach was evaluated by segmenting blood vessels in ocular images of the DRIVE database. The results show the suitability of this approach: It outperforms window operators designed using neural networks and logistic regression as well as Aperture filters designed using logistic regression and support vector machines.


FR-P3-6: Accelerated A-contrario Detection of Smooth Trajectories

Rémy Abergel (Université Paris Descartes, France); Lionel Moisan (Université Paris Descartes, France)

Abstract: The detection of smooth trajectories in a (noisy) point set sequence can be realized optimally with the ASTRE algorithm, but the quadratic time and memory complexity of this algorithm with respect to the number of frames is prohibitive for many practical applications. We here propose a variant that cuts the input sequence into overlapping temporal chunks that are processed in a sequential (but non-independent) way, which results in a linear complexity with respect to the number of frames. Surprisingly, the performances are in general not affected by this acceleration strategy, and are sometimes even slightly above those of the original ASTRE algorithm.


FR-P3-7: Human Action Recognition in 3D Motion Sequences

Anastasios Tefas (Aristotle University of Thessaloniki, Greece); Konstantinos Kelgeorgiadis (Aristotle University of Thessaloniki, Greece); Nikos Nikolaidis (Aristotle University of Thessaloniki, Greece)

Abstract: In this paper we propose a method for learning and recognizing human actions on dynamic binary volumetric (voxel-based) or 3D mesh movement data. The orientation of the human body in each 3D posture is estimated by detecting its feet, and this information is used to orient all postures in a consistent manner. K-means is applied on the 3D posture space of the training data to discover characteristic movement patterns, namely 3D dynemes. Subsequently, fuzzy vector quantization (FVQ) is utilized to represent each 3D posture in the 3D dyneme space, and then information from all time instances is combined to represent the entire action sequence. Linear discriminant analysis (LDA) is then applied. The actual classification step utilizes support vector machines (SVM). Results on a 3D action database verified that the method can achieve good performance.


FR-P3-8: Shot-based Object Retrieval From Video with Compressed Fisher Vectors

Luca Bertinetto (Politecnico di Torino, Italy); Attilio Fiandrotti (Politecnico di Torino, Italy); Enrico Magli (Politecnico di Torino, Italy)

Abstract: This paper addresses the problem of retrieving those shots from a database of video sequences that match a query image. Existing architectures match the images using a high-level representation of local features extracted from the video database, and are mainly based on the Bag of Words model. Such architectures, however, lack the capability to scale up to very large databases. Recently, Fisher Vectors showed promising results in large scale image retrieval problems, but it is still not clear how they can be best exploited in video-related applications. In our work, we use compressed Fisher Vectors to represent the video shots and we show that the inherent correlation between video frames can be effectively exploited. Experiments show that our proposed system achieves better performance while having lower computational requirements than similar architectures.


FR-P3-9: Nonlinear Band-Pass Filtering Using the TV Transform

Guy Gilboa (Technion, Israel)

Abstract: A distinct family of nonlinear filters is presented. It is based on a new formalism, defining a nonlinear transform based on the TV-functional. Scales in this sense are related to the size of the object and its contrast. Edges are very well preserved and selected scales of the object can be either selected, removed or enhanced. We compare the behavior of the filter to other filters based on Fourier and wavelet transforms and present its unique qualities.


FR-P3-10: A Homography-Based CDVS Pipeline for Image Matching with Improved Resilience to Viewpoint Changes

Biao Zhao (Politecnico di Torino, Italy); Enrico Magli (Politecnico di Torino, Italy)

Abstract: Compact Descriptors for Visual Search (CDVS) is a proposed MPEG standard that will enable efficient and interoperable design of visual search applications using SIFT descriptors. Such local descriptors are invariant to rotation and scaling, but are not very robust to viewpoint changes. In this paper, we address this problem and propose a modified version of the CDVS pipeline that employs image back-projection to compensate for perspective distortion. It is based on the homography derived from the correspondences extracted from pairs of matching keypoints. This method is simple and easy to implement. Extensive results show that it improves the CDVS matching accuracy under viewpoint changes.


FR-P3-11: Total Variation Super-resolution for 3D Trabecular Bone Micro-Structure Segmentation

Alina Toma (University of Lyon, CREATIS, France); Loic Denis (Laboratoire Hubert Curien, France); Bruno Sixou (INSA Lyon, France); Jean-Baptiste Pialat (Inserm U1033, Universite de Lyon, Hospices Civils de Lyon, France); Francoise Peyrin (Universite de Lyon INSA Lyon, France)

Abstract: The analysis of the trabecular bone micro-structure plays an important role in studying bone fragility diseases such as osteoporosis. In this context, X-ray CT techniques are increasingly used to image bone micro-architecture. The aim of this paper is to improve the segmentation of the bone micro-structure for further bone quantification. We propose a joint super-resolution/segmentation method based on total variation with a convex constraint. The minimization is performed with the ADMM framework. The new method is compared with the bicubic interpolation method and the classical total variation regularization. All methods were tested on blurred, noisy and down-sampled 3D synchrotron micro-CT bone volumes. Improved segmentations are obtained with the proposed joint super-resolution/segmentation method.



Session FR-P4: Sensor Array and Multichannel Signal Processing II


FR-P4-1: Oversampled Graph Laplacian Matrix for Graph Signals

Akie Sakiyama (Tokyo University of Agriculture and Technology, Japan); Yuichi Tanaka (Tokyo University of Agriculture and Technology, Japan)

Abstract: In this paper, we propose oversampling of graph signals by using an oversampled graph Laplacian matrix. The conventional critically sampled graph filter banks have to decompose an original graph into bipartite subgraphs, and the transform has to be performed on each subgraph due to the spectral folding phenomenon caused by downsampling of graph signals. Therefore, they cannot always utilize all edges of the original graph for the one-stage transformation. Our proposed method is based on oversampling of the underlying graph itself, and it can append nodes and edges to the graph somewhat arbitrarily. We use this approach to make one oversampled bipartite graph that includes all edges of the original non-bipartite graph. We apply the oversampled graph with the critically sampled filter bank for decomposing graph signals, and show the performance of graph signal denoising.


FR-P4-2: Semi-deterministic Ternary Matrix for Compressed Sensing

Weizhi Lu (INSA RENNES, France); Kidiyo Kpalma (INSA de Rennes, France); Joseph Ronsin (UEB UMR 6164 IETR-INSA Rennes, France)

Abstract: The ensemble of random matrices with elements drawn from symmetric distributions, such as the Gaussian and Bernoulli distributions, has been playing a central role in the construction of compressed sensing matrices. As a sparse version of this type of matrix, the random ternary matrix, with entries in {0, ±1}, is more attractive for its low complexity and competitive performance. From the viewpoint of application, it is of practical interest to provide a suitable sparsity for the ternary matrix. Based on the study of the RIP, this paper proposes a semi-deterministic ternary matrix, which has deterministic nonzero positions but random signs. It achieves better performance than the typical random matrices under the popular decoding algorithms.


FR-P4-3: Search for Costas Arrays Via Sparse Representation

Mojtaba Soltanalian (Uppsala University, Sweden); Petre Stoica (Uppsala University, Sweden); Jian Li (University of Florida, USA)

Abstract: Costas arrays are mainly known as a certain type of optimized time-frequency coding pattern for sonar and radar. In order to fulfill the need for effective computational approaches to find Costas arrays, in this paper, we propose a sparse formulation of the Costas array search problem. The new sparse representation can pave the way for using an extensive number of methods offered by the sparse signal recovery literature. It is further shown that Costas arrays can be obtained using an equivalent quadratic program with linear constraints. A numerical approach is devised and used to illustrate the performance of the proposed formulations.
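
For readers unfamiliar with the object being searched for, the defining property of a Costas array can be checked directly. The sketch below is illustrative only: it is not the sparse or quadratic-program formulation proposed in the paper, and the function name `is_costas` is an assumption introduced here.

```python
def is_costas(perm):
    """Return True if perm (a permutation of 1..n) is a Costas array.

    perm[i] is the row of the single dot in column i. The Costas property
    requires that, for every column shift k, the row differences
    perm[i + k] - perm[i] are all distinct.
    """
    n = len(perm)
    if sorted(perm) != list(range(1, n + 1)):
        return False  # not a permutation of 1..n
    for k in range(1, n):
        diffs = [perm[i + k] - perm[i] for i in range(n - k)]
        if len(set(diffs)) != len(diffs):
            return False  # two dot pairs share the same displacement vector
    return True


print(is_costas([2, 4, 3, 1]))  # powers of 2 modulo 5 (Welch-type pattern)
print(is_costas([1, 2, 3, 4]))  # diagonal pattern with repeated displacements
```

A brute-force search would call such a check on every permutation, which is what makes more effective computational formulations attractive for larger orders.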


FR-P4-4: Recursive Blind Equalization with an Optimal Bounding Ellipsoid Algorithm

Mathieu Pouliquen (University of Caen, France); Miloud Frikel (ENSICAEN, France); Matthieu Denoual (ENSICAEN, France)

Abstract: In this paper, we present an algorithm for blind equalization, i.e., equalization without a training sequence. The proposed algorithm is based on the reformulation of the equalization problem as a set membership identification problem. Among the set membership identification methods, the chosen algorithm is of the optimal bounding ellipsoid type. This algorithm has a low computational burden, which allows it to be used easily in real time. Note that in this paper the equalizer is a finite impulse response filter. An analysis of the algorithm is provided. Simulations are performed to show the good performance of the proposed approach.


FR-P4-5: On Almost Sure Identifiability of Non Multilinear Tensor Decomposition

Jeremy Cohen (CNRS Gipsa-lab, France); Pierre Comon (CNRS UMR5216, France)

Abstract: Uniqueness of tensor decompositions is of crucial importance in numerous engineering applications. Extensive work in algebraic geometry has given various bounds involving tensor rank and dimensions to ensure generic identifiability. However, most of this work is hardly accessible to non-specialists, and does not apply to non-multilinear models. In this paper, we present another approach, using the Jacobian of the model. The latter sheds a new light on bounds and exceptions previously obtained. Finally, the method proposed is applied to a non-multilinear decomposition used in fluorescence spectrometry.


FR-P4-6: Greedy Orthogonal Matching Pursuit for Sparse Target Detection and Counting in WSN

Zakia Jellali (Higher School of Communications of Tunis, Tunisia); Leila Najjar (Sup'Com, Tunisia); Sofiane Cherif (Sup'Com, Tunisia)

Abstract: The recently emerged Compressed Sensing (CS) theory has widely addressed the problem of sparse target detection in Wireless Sensor Networks (WSN) with the aim of reducing the deployment cost and energy consumption. In this paper, we apply the CS approach to both sparse event recovery and counting. We first propose a novel Greedy version of the Orthogonal Matching Pursuit (GOMP) algorithm that accounts for the non-orthogonality of the decomposition matrix. Then, in order to reduce the GOMP computational load, we propose a two-stage version of GOMP, the 2S-GOMP, which separates the event detection and counting steps. Simulation results show that the proposed algorithms achieve a better tradeoff between performance and computational load when compared to the recently proposed GMP algorithm and its two-stage version, denoted 2S-GMP.


FR-P4-7: Sparse Vector Sensor Array Design Based on Quaternionic Formulations

Matthew B Hawes (University of Sheffield, United Kingdom); Wei Liu (University of Sheffield, United Kingdom)

Abstract: In sparse arrays, the randomness of sensor locations avoids the introduction of grating lobes, while allowing adjacent sensor spacings to be greater than half a wavelength, leading to a larger array size with a relatively small number of sensors. In this paper, for the first time, the design of sparse vector sensor arrays is studied based on a quaternionic formulation. It is a further extension of the recently proposed compressive sensing (CS) based design for traditional sparse arrays and the vector sensors being considered are crossed-dipoles. Design examples are presented to validate the effectiveness of the method.


FR-P4-8: Minimal Solutions for Dual Microphone Rig Self-calibration

Simayijiang Zhayida (Lund University, Sweden); Simon Burgess (Lund University, Sweden); Yubin Kuang (Lund University, Sweden); Kalle Åström (Lund University, Sweden)

Abstract: In this paper, we study minimal problems related to dual microphone rig self-calibration with TOA measurements. We consider the problems under varying setups: (i) whether or not the internal distances between the microphone nodes are known a priori, and (ii) whether the microphone rigs lie in an affine space of a different dimension than the sound sources. Solving these minimal problems is essential to robust estimation of microphone and sound source locations. We identify for each of these minimal problems the number of solutions in general and develop non-iterative solvers. We show that the proposed solvers are numerically stable in synthetic experiments. We also apply our method to a real indoor experiment and obtain an accurate reconstruction using TOA measurements.


FR-P4-9: Generalized MNS Method for Parallel Minor and Principal Subspace Analysis

Viet-Dung Nguyen (Polytech'Orléans, France); Karim Abed-Meraim (Polytech'Orléans, France); Nguyen Linh-Trung (Vietnam National University, Hanoi, Vietnam); Rodolphe Weber (University of Orleans, France)

Abstract: This paper introduces a generalized minimum noise subspace method for the fast estimation of the minor or principal subspaces for large dimensional multi-sensor systems. In particular, the proposed method allows parallel computation of the desired subspace when K > 1 computational units (DSPs) are available in a parallel architecture. The overall numerical cost is approximately reduced by a factor of K^2 while preserving the estimation accuracy close to optimality. Different algorithm implementations are considered and their performance is assessed through numerical simulation.


FR-P4-10: Sparsity-Aided Radar Waveform Synthesis

Heng Hu (Nanjing University of Science and Technology, P.R. China); Mojtaba Soltanalian (Uppsala University, Sweden); Petre Stoica (Uppsala University, Sweden); Xiaohua Zhu (Nanjing University of Science and Technology, P.R. China)

Abstract: Owing to the inherent sparsity of the target scene, compressed sensing (CS) has been successfully employed in radar applications. It is known that the performance of target scene recovery in CS scenarios depends highly on the coherence of the sensing matrix (CSM), which is determined by the radar transmit waveform. In this paper, we present a cyclic optimization algorithm to effectively reduce the CSM via a judicious design of the radar waveform. The proposed method provides a reduction in the size of the Gram matrix associated with the sensing matrix, and moreover, relies on the fast Fourier transform (FFT) operations to improve the computation speed. The effectiveness of the proposed algorithm is illustrated through numerical examples.
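
As background for the quantity being minimized, the sketch below computes the usual mutual coherence of a matrix, i.e. the largest normalized inner product between two distinct columns. It assumes this standard definition, which may differ in detail from the measure used in the paper, and it does not reproduce the cyclic optimization algorithm; the function name `mutual_coherence` is hypothetical.

```python
import math


def mutual_coherence(columns):
    """Largest normalized inner product between two distinct columns.

    columns: list of equal-length lists of real or complex numbers,
    one list per column of the sensing matrix.
    """
    def inner(u, v):
        # Conjugate the first argument so complex columns are handled too.
        return sum(complex(a).conjugate() * complex(b) for a, b in zip(u, v))

    norms = [math.sqrt(inner(c, c).real) for c in columns]
    worst = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            value = abs(inner(columns[i], columns[j])) / (norms[i] * norms[j])
            worst = max(worst, value)
    return worst


# Toy 2 x 3 matrix given by its columns; the third column is equally
# correlated with the first two.
print(mutual_coherence([[1, 0], [0, 1], [1, 1]]))
```

Lower values indicate columns that are closer to mutually orthogonal, which is the property that sparse recovery guarantees typically rely on.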


FR-P4-11: How to Localize Ten Microphones in One Finger Snap

Ivan Dokmanić (Ecole Polytechnique Fédérale de Lausanne, Switzerland); Laurent Daudet (Université Paris Diderot, France); Martin Vetterli (EPFL, Switzerland)

Abstract: A compelling method to calibrate the positions of microphones in an array is with sources at unknown locations. Remarkably, it is possible to reconstruct the locations of both the sources and the receivers, if their number is larger than some prescribed minimum [1, 2]. Existing methods, based on times of arrival or time differences of arrival, only exploit the direct paths between the sources and the receivers. In this proof-of-concept paper, we observe that by placing the whole setup inside a room, we can reduce the number of sources required for calibration. Moreover, our technique allows us to compute the absolute position of the microphone array in the room, as opposed to knowing it up to a rigid transformation or reflection. The key observation is that echoes correspond to virtual sources that we get "for free". This enables endeavors such as calibrating the array using only a single source.


FR-P4-12: Information-based Pool Size Control of Boolean Compressive Sensing for Adaptive Group Testing

Yohei Kawaguchi (Central Research Laboratory, Hitachi, Ltd., Japan); Tatsuhiko Osa (Tokyo University of Agriculture and Technology, Japan); Shubhranshu Barnwal (Central Research Laboratory, Hitachi, Ltd., India); Hisashi Nagano (Central Research Laboratory, Hitachi, Ltd., Japan); Masahito Togami (Central Research Laboratory, Hitachi, Ltd., Japan)

Abstract: A new method for solving the adaptive-group-testing problem is proposed. To solve the problem that the conventional method for non-adaptive group testing by Boolean compressive sensing needs a larger number of tests when the pool size is not optimized, the proposed method controls the pool size for each test. The control criterion is the expected information gain that can be calculated from the L0 norm of the estimated solution. Experimental simulation indicates that the proposed method outperforms the conventional method even when the number of defective items is varied and the number of defective items is unknown.


FR-P4-13: Leak Detection and Localization in Water Distribution System Using Time Frequency Analysis

Thaw Tar Thein Zan (Nanyang Technological University, Singapore); Kai-Juan Wong (Singapore Institute of Technology, Singapore); Hock Beng Lim (Nanyang Technological University, Singapore); Andrew Whittle (MIT, USA); Bu Sung Lee (Nanyang Technological University, Singapore)

Abstract: Water loss through burst events or leaks is a significant problem affecting water utilities worldwide and is exacerbated by deterioration of the underground infrastructure. This paper reports on our method to localize the source of a pipe burst by estimating the arrival time of the pressure transients at sensor nodes. Our proposed method uses the Short Time Fourier Transform, which has been shown to overcome the Fourier transform's lack of temporal resolution. The paper also reports results obtained from real leakage data collected on the WaterWiSe@SG test-bed, which show the superiority of our method compared to the multi-level wavelet transform.



Session FR-P5: Image and Video Applications


FR-P5-1: Multi Temporal Distance Images for Shot Detection in Soccer Games

Martin Hoernig (TU München, Germany); Michael Herrmann (TU München, Germany); Bernd M. Radig (Technische Universitaet Muenchen, Germany)

Abstract: We present a new approach for video shot detection and introduce multi temporal distance images (MTDIs), formed by chi-square based similarity measures that are calculated pairwise within a floating window of video frames. By using MTDI-based boundary detectors, various cuts and transitions in various shapes (dissolves, overlayed effects, fades, and others) can be determined. The algorithm has been developed within the special context of soccer game TV broadcasts, where a particular interest in long view shots is intrinsic. With a correct shot detection rate in camera 1 shots of 98.2% within our representative test data set, our system outperforms competing state-of-the-art systems.
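
The pairwise comparison underlying a distance image can be sketched as follows. This is a minimal illustration that assumes per-frame histograms and the common symmetric chi-square distance; the abstract does not specify the exact measure, and the function names are hypothetical rather than taken from the paper.

```python
def chi_square_distance(hist_a, hist_b):
    """Chi-square distance between two histograms of equal length."""
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:  # skip empty bins to avoid division by zero
            total += (a - b) ** 2 / (a + b)
    return total


def pairwise_distance_image(histograms):
    """Square matrix of chi-square distances between all frames in a window."""
    return [[chi_square_distance(p, q) for q in histograms] for p in histograms]


# Three toy 4-bin frame histograms: the third frame differs strongly,
# as it would after a hard cut.
window = [[4, 3, 2, 1], [4, 3, 1, 2], [0, 1, 4, 5]]
for row in pairwise_distance_image(window):
    print([round(d, 3) for d in row])
```

In such a matrix a hard cut shows up as a block boundary, whereas gradual transitions produce smoother patterns, which is the kind of structure a boundary detector can then look for.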


FR-P5-2: Steganalysis with Cover-Source Mismatch and a Small Learning Database

Jérôme Pasquet (LIRMM, France); Sandra Bringay (LIRMM, France); Marc Chaumont (LIRMM, France)

Abstract: Many different hypotheses may be chosen for modeling a steganography/steganalysis problem. In this paper, we look closer into the case in which Eve, the steganalyst, has partial or erroneous knowledge of the cover distribution. More precisely, we suppose that Eve knows the algorithms and the payload size used by Alice, the steganographer, but does not know the image distribution. In this cover-source mismatch scenario, we demonstrate that an Ensemble Classifier with Features Selection (EC-FS) allows the steganalyst to obtain the best state-of-the-art performance, while requiring a training database 100 times smaller than the previous state-of-the-art approach. Moreover, we propose the islet approach in order to increase the classification performance.


FR-P5-3: Universal Image Steganalysis Based on GARCH Model

Saeed Akhavan (School of Electrical & Computer Eng., College of Eng., University of Tehran, Iran); Mohammad Ali Akhaee (School of Electrical & Computer Eng., College of Eng., University of Tehran, Iran); Saeed Sarreshtedari (Sharif University of Technology, Iran)

Abstract: This paper introduces a new universal steganalysis framework. The required image features are extracted based on the generalized autoregressive conditional heteroskedasticity (GARCH) model and higher-order statistics of the images. The GARCH features are extracted from non-approximate wavelet coefficients. Besides, the second and third order statistics are exploited to develop features very sensitive to minor changes in natural images. The experimental results demonstrate that the proposed feature-based steganalysis framework outperforms state-of-the-art methods while running on the same order of features.


FR-P5-4: A Frontal View Gait Recognition Based on 3D Imaging Using a Time of Flight Camera

Tengku Afendi (Queen's University Belfast, United Kingdom); Fatih Kurugollu (Queen's University Belfast, United Kingdom); Danny Crookes (Queen's University of Belfast, United Kingdom); Ahmed Bouridane (Northumbria University at Newcastle, United Kingdom)

Abstract: Studies have been carried out to recognize individuals from a frontal view using their gait patterns. In previous work, gait sequences were captured using either single or stereo RGB camera systems or the Kinect camera system. In this research, we used a new frontal view gait recognition method using a laser based Time of Flight (ToF) camera. In addition to the new gait data set, other contributions include enhancement of the silhouette segmentation, gait cycle estimation and gait image representations. We propose four new gait image representations, namely Gait Depth Energy Image (GDE), Partial GDE (PGDE), Discrete Cosine Transform GDE (DGDE) and Partial DGDE (PDGDE). The experimental results show that all the proposed gait image representations produced better accuracies than the previous methods. In addition, we also developed Fusion GDEs (FGDEs), which achieved better overall accuracy and outperformed the previous methods.


FR-P5-5: Video Steganalysis of Multiplicative Spread Spectrum Steganography

Nematollah Zarmehi (University of Tehran, Iran); Mohammad Ali Akhaee (School of Electrical & Computer Eng., College of Eng., University of Tehran, Iran)

Abstract: In this paper we propose a video steganalysis method targeting multiplicative spread spectrum embedding. We use the redundancies of the video frames to estimate the cover frame, and after extracting some features from the video frames and the estimated ones, the received video is classified as suspicious or not suspicious. If the video is declared suspicious, we estimate the hidden message and the gain factor used in the embedder. We also propose a new method for estimating the gain factor in multiplicative spread spectrum embedding. Using the estimated hidden message and gain factor, we are able to reconstruct the original video. Simulation results verify the success of our steganalysis method.


FR-P5-6: On the Application of AAM-Based Systems in Face Recognition

Muhammad Aurangzeb Khan (Lancaster University, UK, Pakistan); Costas Xydeas (Lancaster University, United Kingdom); Hassan Ahmed (Lancaster University, U.K, United Kingdom)

Abstract: The presence of significant levels of signal variability in face-portrait type images, due to differences in illumination, pose and expression, is generally accepted as having an adverse effect on the overall performance of i) face modeling and synthesis (FM/S) and ii) face recognition (FR) systems. Furthermore, the dependency on such input data variability, and thus the sensitivity with respect to face synthesis performance, of Active Appearance Modeling (AAM) is also well understood. As a result, the Multi-Model Active Appearance Model (MM-AAM) technique [1] has been developed and shown to possess superior face synthesis performance to AAM. This paper considers the applicability in FR applications of both AAM and MM-AAM face modeling and synthesis approaches. Thus, an MM-AAM methodology has been devised that is tailored to operate successfully within the context of face recognition. Experimental results show FR-MM-AAM to be significantly superior to conventional FR-AAM.


FR-P5-7: Textures and Reversible Watermarking

Catalin Dragoi (Valahia University of Targoviste, Romania); Dinu Coltuc (Valahia University of Targoviste, Romania)

Abstract: This paper investigates the effectiveness of prediction-error expansion reversible watermarking on textured images. Five well performing reversible watermarking schemes are considered, namely the schemes based on the rhombus average, the adaptive rhombus predictor, the full context predictor as a weighted average between the rhombus and the four diagonal neighbors, the global least-squares predictor and its recently proposed local counterpart. The textured images are analyzed and the optimal prediction scheme for each texture type is determined. The local least-squares prediction based scheme provides the best overall results. On certain textures the global predictor can offer similar performances.
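
The basic mechanism shared by the compared schemes, prediction-error expansion with a rhombus-average predictor, can be sketched as follows. This is a simplified illustration under stated assumptions: it embeds a single bit in a single pixel and omits the overflow/underflow handling and threshold control that practical schemes need. All function names are hypothetical.

```python
def rhombus_predict(image, r, c):
    """Predict pixel (r, c) as the rounded average of its four
    horizontal and vertical neighbours (the rhombus context)."""
    total = image[r - 1][c] + image[r + 1][c] + image[r][c - 1] + image[r][c + 1]
    return (total + 2) // 4


def embed_bit(pixel, predicted, bit):
    """Expand the prediction error and append one payload bit."""
    error = pixel - predicted
    return predicted + 2 * error + bit


def extract_bit(marked, predicted):
    """Recover the payload bit and the original pixel value."""
    expanded = marked - predicted
    bit = expanded % 2
    error = expanded // 2  # floor division also handles negative errors
    return bit, predicted + error


image = [[10, 12, 11],
         [13, 15, 14],
         [12, 16, 13]]
predicted = rhombus_predict(image, 1, 1)       # uses only the four neighbours
marked = embed_bit(image[1][1], predicted, 1)  # centre pixel carries one bit
print(predicted, marked, extract_bit(marked, predicted))
```

Because the prediction uses only neighbouring pixels that the decoder also sees, the same prediction can be recomputed at extraction, which is what makes the embedding reversible. The distortion grows with the prediction error, so predictor accuracy on textured content directly affects the result.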


FR-P5-8: Three Stages Prediction-Error Expansion Reversible Watermarking

Tudor Nedelcu (Politehnica University of Bucharest, Romania); Iordache Razvan (Politehnica University of Bucharest, Romania); Dinu Coltuc (Valahia University of Targoviste, Romania)

Abstract: This paper proposes a three-stage difference expansion reversible watermarking scheme. In the first stage, a quarter of the pixels are estimated by using the median of the eight original neighbors of the 3x3 window. In the second stage, a quarter of the pixels are estimated as the average on the rhombus of the four horizontal and vertical original pixels. Finally, the remaining pixels are estimated on the rhombus context, but by using the pixels modified in the two previous stages. The experimental results show that the proposed scheme can provide slightly better results than the classical two-stage reversible watermarking based on the rhombus context.


FR-P5-9: Modified Fuzzy C-Means Clustering for Automatic Tongue Base Tumour Extraction From MRI Data

Trushali Doshi (University of Strathclyde, United Kingdom); John J Soraghan (University of Strathclyde, United Kingdom); Derek Grose (Beatson Oncology Unit, NHS Greater Glasgow and Clyde, United Kingdom); Kenneth MacKenzie (Royal Infirmary, NHS Greater Glasgow and Clyde, United Kingdom); Lykourgos Petropoulakis (University of Strathclyde, United Kingdom)

Abstract: Magnetic resonance imaging (MRI) is a widely used imaging modality to extract tumour regions to assist in radiotherapy and surgery planning. Extraction of a tongue base tumour from MRI is challenging due to variability in its shape, size, intensities and fuzzy boundaries. This paper presents a new automatic algorithm that is shown to be able to extract a tongue base tumour from gadolinium-enhanced T1-weighted (T1+Gd) MRI slices. In this algorithm, knowledge of tumour location is added to the objective function of standard fuzzy c-means (FCM) to extract the tumour region. Experimental results on 9 real MRI slices demonstrate that there is good agreement between manual and automatic extraction results, with a Dice similarity coefficient (DSC) of 0.77±0.08.
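
The agreement measure quoted above can be computed as in the following sketch. It assumes binary masks stored as nested lists and uses the standard definition DSC = 2|A ∩ B| / (|A| + |B|). The function name and the toy masks are illustrative and are not data from the paper.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks.

    Masks are equal-sized 2D lists of 0/1 values.
    """
    overlap = size_a = size_b = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            size_a += a
            size_b += b
            overlap += a * b  # counts pixels set in both masks
    if size_a + size_b == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * overlap / (size_a + size_b)


# Toy 3 x 3 masks: four pixels each, three of them shared.
manual = [[0, 1, 1],
          [0, 1, 1],
          [0, 0, 0]]
automatic = [[0, 1, 1],
             [0, 0, 1],
             [0, 0, 1]]
print(dice_coefficient(manual, automatic))
```

A value of 1 means the two masks coincide exactly and 0 means they do not overlap at all.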


FR-P5-10: Accelerated Unsupervised Filtering for the Smoothing of Road Pavement Surface Imagery

Henrique Oliveira (Instituto de Telecomunicações, Portugal); Paulo Lobato Correia (Instituto Superior Tecnico - Universidade Tecnica Lisboa, Portugal); José Caeiro (Grupo de Sistemas de Processamento de Sinal – SIPS/INESC-ID, Portugal)

Abstract: An accelerated formulation of the Unsupervised Information-theoretic Adaptive Image Filtering (UINTA) method is presented. It is based on a parallel implementation of the algorithm, using the Open Computing Language (OpenCL), while maintaining the precision and efficiency of the original method, which are briefly discussed focusing on the respective computational complexities. The experimental computational efficiency is compared with the one obtained using the standard implementation, highlighting the significant improvement of computational times achieved with the proposed one. This new implementation is tested for the smoothing of road pavement surface images, for which the original method had been previously applied, showing the clear advantage of its use.


FR-P5-11: Mapping Sounds Onto Images Using Binaural Spectrograms

Antoine Deleforge (University of Erlangen-Nuremberg, Germany); Vincent Drouard (INRIA Rhône-Alpes, France); Laurent Girin (GIPSA-Lab, Grenoble-INP, France); Radu P. Horaud (INRIA Grenoble Rhône-Alpes, France)

Abstract: We propose a novel method for mapping sound spectrograms onto images and thus enabling alignment between auditory and visual features for subsequent multimodal processing. We suggest a supervised learning approach to this audio-visual fusion problem, on the following grounds. Firstly, we use a Gaussian mixture of linear regressors to learn a probabilistic mapping from image locations to binaural spectrograms. Secondly, we derive a closed-form expression for the conditional posterior probability of an image location, given both an observed spectrogram, emitted from an unknown source direction, and the mapping parameters that were previously learnt. Prominently, the proposed method is able to deal with completely different spectrograms for training and for alignment. While fixed-length wide-spectrum sounds are used for learning, thus fully and robustly estimating the regression, variable-length sparse-spectrum sounds, e.g., speech, are used for alignment. The proposed method successfully extracts the image location of speech utterances in realistic reverberant-room scenarios.



Session FR-P6: Signal Processing Applications II


FR-P6-1: Bio-mechanical Characterization of Voice for Smoking Detection

Sofia BenJebara (Ecole Superieure des Communications de Tunis, Tunisia)

Abstract: The purpose of this work is to discriminate between smoker and non-smoker speakers by analyzing their voice. The vocal folds, the main organ responsible for producing voice, are damaged by smoke, so that their structure and vibration are altered. Bio-mechanical features describing the behavior and status of the vocal folds are used. They are based on the two-mass model, which characterizes the vocal folds by the mass, stiffness and losses of their cover and body parts. Bio-mechanical features of smokers and non-smokers are analyzed and compared to select the relevant features that discriminate between the two categories of speakers. Quadratic Discriminant Analysis is used as the classification tool and shows a relatively good detection rate for smokers.


FR-P6-2: Detection of Faulty Glucose Measurements Using Texture Analysis

Nevine Demitri (Darmstadt University of Technology, Germany); Abdelhak M Zoubir (Darmstadt University of Technology, Germany)

Abstract: Faults occurring in hand-held blood glucose measurements can be critical to patient self-monitoring, as they can lead to unnecessary changes of treatment. We propose a method to detect faulty glucose measurement frames in devices that use a camera to estimate the glucose concentration. We assert that texture, as opposed to intensity, is able to differentiate between correct and false glucose measurements, regardless of the given blood sample. The co-occurrence based textural features energy, maximum probability and correlation prove to be suitable for our detection application. We calculate kinetic feature curves and use a hypothesis testing approach to detect faulty measurements. Our method is able to detect a faulty measurement in less than one third of the time that would usually be needed. The method is validated on a real data set of blood glucose measurements obtained using different glucose concentrations and containing both correct and faulty measurements.
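
The three co-occurrence features named above can be sketched as follows. This is only an illustration under stated assumptions: a symmetric, normalised grey-level co-occurrence matrix for a single pixel offset, energy taken as the sum of squared entries (some libraries report its square root instead), and a hypothetical 3-level toy image. It does not reproduce the paper's kinetic feature curves or its hypothesis test.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalised, symmetric grey-level co-occurrence matrix for offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count both directions (symmetric matrix)
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Return (energy, maximum probability, correlation) of a normalised GLCM."""
    n = len(p)
    energy = sum(v * v for row in p for v in row)
    max_prob = max(v for row in p for v in row)
    mu_i = sum(i * p[i][j] for i in range(n) for j in range(n))
    mu_j = sum(j * p[i][j] for i in range(n) for j in range(n))
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i in range(n) for j in range(n))
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i in range(n) for j in range(n))
    if var_i == 0.0 or var_j == 0.0:
        correlation = float("nan")  # undefined for a constant image
    else:
        cov = sum((i - mu_i) * (j - mu_j) * p[i][j]
                  for i in range(n) for j in range(n))
        correlation = cov / math.sqrt(var_i * var_j)
    return energy, max_prob, correlation


# Hypothetical 3-level toy image, horizontal neighbour offset.
toy = [[0, 0, 1],
       [0, 0, 1],
       [2, 2, 2]]
energy, max_prob, correlation = texture_features(glcm(toy, levels=3))
print(energy, max_prob, correlation)
```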


FR-P6-3: Bi-CoPaM Ensemble Clustering Application to Five Escherichia Coli Bacterial Datasets

Basel Abu-Jamous (Brunel University, United Kingdom); Rui Fa (Brunel University, United Kingdom); David Roberts (The University of Oxford, United Kingdom); Asoke Nandi (Brunel University, United Kingdom)

Abstract: Bi-CoPaM ensemble clustering has the ability to mine a set of microarray datasets collectively to identify the subsets of genes consistently co-expressed in all of them. It also has the capability of considering the entire gene set without pre-filtering as it implicitly filters out less interesting genes. While it showed success in revealing new insights into the biology of yeast, it has never been applied to bacteria. In this study, we apply Bi-CoPaM to five bacterial datasets, identifying two clusters of genes as the most consistently co-expressed. Strikingly, their average profiles are consistently negatively correlated in most of the datasets. Thus, we hypothesise that they are regulated by a common biological machinery, and that their genes with unknown biological processes may be participating in the same processes in which most of their genes are known to participate. Additionally, our results demonstrate the applicability of Bi-CoPaM to a wide range of species.


FR-P6-4: Generation of Stimulus Features for Analysis of fMRI During Natural Auditory Experiences

Valeri Tsatsishvili (University of Jyväskylä, Finland); Fengyu Cong (Dalian University of Technology, P.R. China); Tapani Ristaniemi (University of Jyväskylä, Finland); Petri Toiviainen (University of Jyväskylä, Finland); Vinoo Alluri (University of Jyväskylä, Finland); Elvira Brattico (University of Helsinki, Finland); Asoke Nandi (Brunel University, United Kingdom)

Abstract: Compared with block and event-related designs for fMRI experiments, it is much more difficult to extract events of interest from a complex continuous stimulus in order to find the corresponding blood-oxygen-level dependent (BOLD) responses. Recently, in a free music listening fMRI experiment, acoustic features of the naturalistic music stimulus were first extracted, and then principal component analysis (PCA) was applied to select the features of interest acting as the stimulus sequences. For feature generation, kernel PCA has shown superiority over PCA since it can implicitly exploit nonlinear relationships among features, and such relationships seem to exist generally. Here, we applied kernel PCA to select the musical features and obtained an interesting new musical feature in contrast to PCA features. With the new feature, we found fMRI results similar to those obtained with PCA features, indicating that kernel PCA helps to capture more properties of the naturalistic music stimulus.


FR-P6-5: DSP in Heterogeneous Multicore Embedded Systems - A Laboratory Experiment

Pavel Lifshits (Technion - Israel Institute of Technology, Israel); Alon Eilam (Technion - Israel Institute of Technology, Israel); Yair Moshe (Technion - Israel Institute of Technology, Israel); Nimrod Peleg (Technion, IIT, Israel)

Abstract: Undergraduate engineering students who are learning Digital Signal Processing (DSP) are expected to have the ability to implement their theoretical knowledge in various applications soon after graduation. In this paper, we present a laboratory experiment developed for undergraduate students that addresses the challenge of getting them familiar with implementing DSP algorithms in heterogeneous multicore systems. In a top-down approach, the students first gain control of the development environment, and then implement DSP algorithms on a general purpose and on a digital signal processor core. Through the experiment, they get to appreciate the advantages of DSP core architecture in performing signal processing algorithms, and learn methods for timing and data transfer between cores while meeting real time constraints. In a limited time frame, this hands-on laboratory experiment exposes the students to state-of-the-art multicore development practices and increases their knowledge and interest in DSP and in embedded programming.


FR-P6-6: T Wave Alternans Detection in ECG Using Extended Kalman Filter and Dualrate EKF

Mahsa Akhbari (Sharif University of Technology, Iran); Mohammad Bagher Shamsollahi (Sharif University of Technology, Iran); Christian Jutten (GIPSA-Lab, France)

Abstract: T Wave Alternans (TWA) is considered an indicator of Sudden Cardiac Death (SCD). In this paper, a method for TWA detection is presented. It is based on a nonlinear dynamic model, and we use an Extended Kalman Filter (EKF) to estimate the model parameters. We propose EKF6 and dualrate EKF6 approaches. Dualrate EKF is suitable for modeling states which are not updated at all time instances. Quantitative and qualitative evaluations of the proposed method have been carried out on the TWA challenge database. We compare our method with that proposed by Sieed et al. in TWA challenge 2008. We also compare our method with our previously proposed approach (EKF25-4obs). Results show that the proposed methods can detect the peak position and amplitude of T waves in ECG precisely. The mean and standard deviation of the estimation error of our methods for finding the position of T waves do not exceed four samples (8 msec).


FR-P6-7: A New Spontaneous Expression Database & A Study of Classification-Based Expression Analysis Methods

Segun Aina (Loughborough University, United Kingdom); Mingxi Zhou (Loughborough University, United Kingdom); Jonathon A Chambers (Loughborough University, United Kingdom); Raphael Phan (Faculty of Engineering, Multimedia University, Malaysia)

Abstract: In this paper we introduce a new spontaneous expression database, which is under development as a new open resource for researchers working in expression analysis. It is particularly targeted at providing a wider range of expression classes than those contained in the small number of natural expression databases currently available, so that it can be used as a benchmark for comparative studies. We also present the first comparison between kernel-based Principal Component Analysis (PCA) and Fisher Linear Discriminant Analysis (FLDA), each in combination with a Sparse Representation Classifier (SRC), for expression analysis. We highlight the trade-off between performance and computation time, which are critical parameters in emerging systems that must capture the expression of a human, such as a consumer responding to some promotional material.


FR-P6-8: Performance Improvement of Spread Spectrum Additive Data Hiding Over Codec-Distorted Voice Channels

Mehdi Boloursaz (Author, Iran); Reza Kazemi (Sharif University of Technology, Iran); Ferydon Behnia (Sharif University of Technology, Iran); Mohammad Ali Akhaee (School of Electrical & Computer Eng., College of Eng., University of Tehran, Iran)

Abstract: This paper considers the problem of covert communication through dedicated voice channels by embedding secure data in the cover speech signal utilizing spread spectrum additive data hiding. The cover speech signal is modeled by a Generalized Gaussian (GGD) random variable, the Maximum A Posteriori (MAP) detector for extraction of the covert message is designed, and its reliable performance is verified both analytically and by simulations. The idea of adaptive estimation of detector parameters is proposed to improve detector performance and overcome voice non-stationarity. The detector's bit error rate (BER) is investigated for both blind and semi-blind cases, in which the GGD shape parameter needed for optimum detection is estimated from the stego or the cover signal, respectively. The simulation results also show that the proposed method achieves acceptable robustness against the lossy compression attack at different compression rates of the Adaptive Multi Rate (AMR) voice codec.
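
The basic idea of spread spectrum additive data hiding can be sketched in a few lines: one bit is spread over many samples with a pseudo-random ±1 sequence, scaled, and added to the cover. The sketch below is an illustration only. It uses a plain linear correlator rather than the GGD-based MAP detector of the paper, a synthetic Gaussian cover instead of speech, and hypothetical names and parameter values (`alpha`, 256 chips per bit).

```python
import random


def make_chips(n, seed):
    """Pseudo-random +/-1 spreading sequence shared by embedder and detector."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed_bit(cover, chips, bit, alpha):
    """Additive embedding: stego = cover + alpha * (+1 or -1) * chips."""
    sign = 1.0 if bit else -1.0
    return [x + alpha * sign * w for x, w in zip(cover, chips)]


def detect_bit(stego, chips):
    """Linear correlation detector: decide on the sign of <stego, chips>."""
    corr = sum(y * w for y, w in zip(stego, chips))
    return 1 if corr > 0 else 0


# Synthetic stand-in for one frame of cover signal (assumption: unit-variance Gaussian).
rng = random.Random(0)
cover = [rng.gauss(0.0, 1.0) for _ in range(256)]
chips = make_chips(256, seed=1)

recovered = [detect_bit(embed_bit(cover, chips, b, alpha=0.5), chips) for b in (0, 1)]
print(recovered)
```

With these values the embedded term contributes ±128 to the correlation, while the cover contributes a zero-mean term with a standard deviation of 16, so both bits are recovered with a wide margin.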


FR-P6-9: A Bayesian Method to Quantifying Chemical Composition Using NMR: Application to Porous Media Systems

Yuting Wu (University of Cambridge, United Kingdom); Daniel Holland (University of Cambridge, United Kingdom); Mick Mantle (University of Cambridge, United Kingdom); Andrew Wilson (University of Cambridge, United Kingdom); Sebastian Nowozin (Microsoft Research, United Kingdom); Andrew Blake (Microsoft Research, United Kingdom); Lynn Gladden (University of Cambridge, United Kingdom)

Abstract: This paper describes a Bayesian approach for inferring the chemical composition of liquids in porous media obtained using nuclear magnetic resonance (NMR). The model analyzes NMR data automatically in the time domain, eliminating the operator dependence of a conventional spectroscopy approach. The technique is demonstrated and validated experimentally on both pure liquids and liquids imbibed in porous media systems, which are of significant interest in heterogeneous catalysis research. We discuss the challenges and practical solutions of parameter estimation in both systems. The proposed Bayesian NMR approach is shown to be more accurate and robust than a conventional spectroscopy approach, particularly for signals with a low signal-to-noise ratio (SNR) and a short life time.


FR-P6-10: Stochastic Modeling of EEG Rhythms with Fractional Gaussian Noise

Mandar Karlekar (BITS Pilani - Goa Campus, India); Anubha Gupta (Indraprastha Institute of Information Technology Delhi, India)

Abstract: This paper presents a novel approach to signal modeling for EEG signal rhythms. A new 3-stage DCT-based multirate filterbank is proposed for the decomposition of EEG signals into brain rhythms: delta, theta, alpha, beta, and gamma rhythms. It is shown that theta, alpha, and gamma rhythms can be modeled as 1st order fractional Gaussian noise (fGn), while the beta rhythms can be modeled as 2nd order fGn processes. These fGn processes are stationary random processes. Further, it is shown that the delta subband absorbs all the nonstationarity of EEG signals and can be modeled as a 1st order fractional Brownian motion (fBm) process. The modeling of subbands is characterized by the Hurst exponent, estimated using the maximum likelihood (ML) estimation method. The modeling approach has been tested on two public databases.
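
For reference, standard fGn with Hurst exponent H and variance sigma^2 has autocovariance 0.5 * sigma^2 * (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H)) at lag k. The sketch below evaluates this textbook formula; it covers only standard fGn and makes no claim about the "1st order" and "2nd order" variants or the ML estimator used in the paper. The function name is an assumption.

```python
def fgn_autocovariance(lag, hurst, sigma2=1.0):
    """Autocovariance of standard fractional Gaussian noise at an integer lag."""
    if not 0.0 < hurst < 1.0:
        raise ValueError("Hurst exponent must lie strictly between 0 and 1")
    k = abs(lag)
    two_h = 2.0 * hurst
    return 0.5 * sigma2 * (abs(k + 1) ** two_h - 2.0 * k ** two_h + abs(k - 1) ** two_h)


# H = 0.5 gives white noise; H > 0.5 is positively correlated, H < 0.5 negatively.
for h in (0.3, 0.5, 0.8):
    print(h, [round(fgn_autocovariance(k, h), 4) for k in range(4)])
```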


FR-P6-11: Feasibility of Single-Arm Single-Lead ECG Biometrics

Peter Sam Raj (University of Toronto, Canada); Dimitrios Hatzinakos (University of Toronto, Canada)

Abstract: This work analyses the feasibility of electrocardiogram (ECG) biometrics using signals from a novel single-arm single-lead acquisition methodology. These new signals are used and analysed in a biometric recognition system in verification mode, for validation of the identity of a person enrolled in a system database. The algorithm used for recognition in the proposed system is the Autocorrelation/Linear Discriminant Analysis (AC/LDA), which is combined with preprocessing stages tuned to the characteristics of ECG from the single arm. The signal is collected from 23 subjects in three scenarios and the performance of the proposed scheme is evaluated. A considerably low Equal Error Rate of 4.34% is obtained using the described method, establishing the utility of these signals as viable candidates for ECG biometrics.
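
The Equal Error Rate reported above is the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how it can be approximated from verification scores is given below, assuming that a higher score means a better match; the function name and the score lists are hypothetical and unrelated to the paper's data or to AC/LDA.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping the decision threshold over all observed scores."""
    best_gap, eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine attempts rejected
        far = sum(s >= t for s in impostor) / len(impostor)  # impostor attempts accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


# Hypothetical similarity scores.
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.7, 0.5, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine_scores, impostor_scores))  # FAR = FRR = 0.2 at threshold 0.6
```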


FR-P6-12: Personalizing a Smartwatch-based Gesture Interface with Transfer Learning

Gabriele Costante (University of Perugia, Italy); Lorenzo Porzi (University of Perugia, Italy); Oswald Lanz (Fondazione Bruno Kessler, Italy); Paolo Valigi (University of Perugia, Italy); Elisa Ricci (University of Perugia, Italy)

Abstract: The widespread adoption of mobile devices has led to an increased interest in smartphone-based solutions for supporting visually impaired users. Unfortunately, the touch-based interaction paradigm commonly adopted on most devices is not convenient for these users, motivating the study of different interaction technologies. In this paper, following up on our previous work, we consider a system where a smartwatch is exploited to provide hands-free interaction through arm gestures with an assistive application running on a smartphone. In particular, we focus on the task of effortlessly customizing the gesture recognition system with new gestures specified by the user. To address this problem we propose an approach based on a novel transfer metric learning algorithm, which exploits prior knowledge about a predefined set of gestures to improve the recognition of user-defined ones, while requiring only a few novel training samples. The effectiveness of the proposed method is demonstrated through an extensive experimental evaluation.