EFFICIENT IMPLEMENTATION ON MULTIPROCESSORS : THE PROBLEM OF SIGNAL PROCESSING APPLICATIONS MODELLING Laurent Kwiatkowski, Fernand Bori, Jean-Paul Stromboni Laboratoire d'Informatique Signaux Systèmes, UNSA - URA 1376 CNRS 41, Boulevard NAPOLEON III - F06041 NICE cedex - FRANCE email : kwiatkow@alto.unice.fr In the signal processing area, applications involve a large amount of computation, suggesting the use of multiprocessors to speed up processing. However, obtaining good performance is not easy because the machine must take advantage of the potential parallelism of the application under study. That is why several parallel implementation methods using mapping and scheduling algorithms have been developed. One aim of development shells such as SynDEx [1] or Ptolemy [2] is to partition the application so that each part is processed by a different processor. These shells use graph models to exhibit both the potential parallelism of the application and the available parallelism of the multiprocessor [3], but the task granularity problem is not considered when the application is modelled. The purpose of this paper is to emphasize the problem of task granularity when the application is modelled by means of a graph and to study its impact on speedup. As a solution to this problem, this paper presents an original implementation method based on the variation of the granularity and of the regular application size.
PROCESSOR ARCHITECTURE FOR EXTENDED LAPPED TRANSFORM David Akopian and Jaakko Astola Signal Processing Laboratory, Tampere University of Technology, P.O.Box 553, FIN-33101, Tampere, Finland, e-mails: prog@cs.tut.fi, jta@cs.tut.fi ABSTRACT This paper is devoted to the implementation of the Extended Lapped Transform (ELT), which is among the most efficient factorization methods for paraunitary filterbanks. First we utilize the special form of the matrices in one part of the ELT factorization to process data at the input data rate in a pipelined structure with a minimal number of processor elements and without inserting additional delays. Next we suggest an algorithm for the DCT-IV transform, the other part of the ELT factorization, with a constant geometry structure suitable for use with a perfect-shuffle network.
Real-Time Obstacle Detection using Stereo Vision Massimo Bertozzi, Alberto Broggi, Alessandra Fascioli Dipartimento di Ingegneria dell'Informazione Universita' di Parma, I-43100 Parma, Italy Tel. +39-521-905707 Fax. +39-521-905723 e-mail: {bertozzi,broggi,fascal}@CE.UniPR.IT This work presents a low-cost stereo vision system aimed at the real-time detection of generic obstacles (without constraints on symmetry or shape) on the path of a mobile road vehicle. A geometrical transform removes the perspective effect from both the left and right stereo images. The difference between the two remapped images is used to detect the free space in front of the vehicle. The output of the processing is displayed on both an on-board monitor and a control panel to give visual feedback to the driver. The system was tested on the MOB-LAB experimental land vehicle, which was driven for more than 3000 km along extra-urban roads and freeways at speeds up to 80 km/h, and demonstrated its robustness with respect to shadows and changing illumination conditions, different road textures, and vehicle movement.
IMPLEMENTATION OF A FAST MPEG-2 COMPLIANT HUFFMAN DECODER Mikael Karlsson Rudberg (mikaelr@isy.liu.se) and Lars Wanhammar (larsw@isy.liu.se) Department of Electrical Engineering, Linköping University, S-581 83 Linköping, Sweden Tel: +46 13 284059; fax: +46 13 139282 ABSTRACT In this paper a 100 Mbit/s Huffman decoder implementation is presented. A novel approach has been used in which parallel decoding of data is combined with a serial input. The critical path has been reduced and a significant increase in throughput is achieved. The decoder is aimed at the MPEG-2 Video decoding standard and has therefore been designed to meet the required performance.
IMPLEMENTATION OF KOGBETLIANTZ'S SVD ALGORITHM USING ORTHONORMAL MICRO-ROTATIONS Jurgen Gotze (+), Peter Rieder (++) and Josef A. Nossek (++) (+) ECE, Rice University, Houston, TX 77251-1892, U.S.A. jugo@ece.rice.edu (++) TU Munich, Arcisstr. 21, 80290 Munich, Germany peri@nws.e-technik.tu-muenchen.de In this paper the implementation of Kogbetliantz's SVD algorithm using orthonormal micro-rotations is presented. An orthonormal micro-rotation is a rotation by an angle taken from a given set of micro-rotation angles, which are chosen such that the rotation can be implemented with a small number of shift-add operations. All computations (evaluation and application of the rotations) can be expressed entirely in terms of orthonormal micro-rotations. Simulations show the reduced computational complexity of Kogbetliantz's SVD algorithm based on orthonormal micro-rotations compared to the standard Kogbetliantz SVD algorithm.
A VLSI ARCHITECTURE FOR REAL TIME OBJECT DETECTION ON HIGH RESOLUTION IMAGES M. Cavadini M. Wosnitza M. Thaler G. Troester Electronic Laboratory Swiss Institute of Technology Zuerich (ETHZ) Gloriastrasse 35 CH-8092 Zuerich cavadini@ife.ee.ethz.ch ABSTRACT This paper describes a VLSI-based SIMD multiprocessor system for the implementation of a set of basic object detection algorithms. The system architecture takes advantage of modern fast EDRAM technology to support the communication requirements of 800 Mbytes/s between main memory and processors imposed by high resolution images. A specialized processing element (PE) architecture for implementation in VLSI which efficiently implements the basic set of algorithms is presented. The performance of a single PE is discussed with respect to the different algorithms. A system consisting of 4 processing elements realized in 0.6 µm CMOS technology is able to localize a 128x128 pixel template in a 1024x1024 pixel image at a rate of 10 frames/second (sustained performance 2.1 MOps/s).
Title: A SINGLE CHIP MOTION ESTIMATOR DEDICATED TO MPEG2 MP@HL Authors: Takao ONOYE, Gen FUJITA, Masamichi TAKATSU, Isao SHIRAKAWA, and Kenji MATSUMURA* Affiliations: Dept. Inf. Sys. Eng., Osaka University Yamada-Oka, Suita, Osaka, 565 Japan {onoe, fujita, taka2, sirakawa}@ise.eng.osaka-u.ac.jp *K.C.S. Co., Ltd. Naka-Kosaka, Higashi-Osaka, Osaka, 577 Japan matsu@k-c-s.k-c-s.co.jp Abstract: A single chip motion estimator dedicated to MPEG2 MP@HL is developed. Adopting a two-level hierarchical searching algorithm in detecting motion vectors, the computational labor can be reduced by 1/70. A novel mechanism is introduced into the full-search procedure, which attempts the maximum possible reuse of reference pixels in order to reduce the bandwidth of the frame memory interface. The proposed motion estimator is integrated in a 0.6um triple-metal CMOS chip with the input clock rate up to 133MHz, which enables the real time motion estimation.
VLSI DESIGN OF A PARALLEL ARCHITECTURE 2-D RANK ORDER FILTER R. Roncella, R. Saletti, G. Savoia Dipartimento di Ingegneria dell'Informazione: Elettronica, Informatica, Telecomunicazioni, Università di Pisa, Via Diotisalvi 2, 56126 Pisa (Italy) tel: +39-50-568511; fax: +39-50-568522 E-mail: roncella@iet.unipi.it A VLSI parallel architecture implementing a new algorithm for 2-D rank order filtering, based on repeated maximum finding operations, is presented in this paper, and the design of a programmable demonstrator chip realised in standard-cell 1 µm CMOS technology is described. The chip has a programmable window size and a selectable rank. It can work with unitary throughput at 25 MHz in the worst case, and its area is 7 x 5.5 sq.mm.
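As an illustration of the general idea of rank selection by repeated maximum finding, here is a minimal software sketch in Python. It is not the authors' hardware algorithm: the function name, the convention that rank 1 denotes the largest value, and the choice to process only windows that fit entirely inside the image are all assumptions made for this example.

```python
def rank_order_filter_2d(image, window, rank):
    """Return, for every window x window neighbourhood that fits inside
    `image`, the rank-th largest value (rank=1 is the maximum).

    Assumed conventions for this sketch: `image` is a list of equal-length
    rows and border pixels are not padded.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - window + 1):
        out_row = []
        for c in range(cols - window + 1):
            # Collect the samples under the current window.
            values = [image[r + i][c + j]
                      for i in range(window) for j in range(window)]
            # Repeated maximum finding: discard the current maximum
            # (rank - 1) times, then the next maximum has the wanted rank.
            for _ in range(rank - 1):
                values.remove(max(values))
            out_row.append(max(values))
        out.append(out_row)
    return out
```

With a 3 x 3 window, rank 5 gives the median filter, rank 1 the maximum filter and rank 9 the minimum filter.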
A NOVEL VLSI ARCHITECTURE FOR BLOCK MATCHING ALGORITHMS* Chen-Yi Lee Dept. of Electronics Engineering, National Chiao Tung University 1001, University Road, Hsinchu 300, Taiwan, ROC Tel: 886-35-731849; Email: cylee@cc.llctll.edu.tw This paper presents a new VLSI architecture for the full search block matching motion estimation (ME) algorithm. The proposed VLSI architecture has three specific features: (1) it has a processor element (PE) array which provides sufficient computational power and achieves 100% hardware efficiency, where the PEs work in a systolic style; (2) it contains stream memory banks which provide the scheduled data flow needed by the PEs for computing the mean absolute error (MAE); and (3) it has minimal memory access bandwidth to save I/O pin count. As a result, the proposed architecture makes a cost-effective ME hardware solution possible.
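For readers unfamiliar with the computation that such an architecture accelerates, the following Python sketch shows full-search block matching with the mean absolute error (MAE) criterion in plain software. It says nothing about the systolic array or memory banks described above; the function names and the representation of blocks as lists of rows are assumptions made for this example.

```python
def mean_absolute_error(cur_block, ref_area, top, left):
    """MAE between `cur_block` and the same-sized block of `ref_area`
    whose top-left corner is at (top, left)."""
    n = len(cur_block)
    total = 0
    for i in range(n):
        for j in range(n):
            total += abs(cur_block[i][j] - ref_area[top + i][left + j])
    return total / (n * n)


def full_search(cur_block, ref_area):
    """Try every candidate position in the search area and return
    (best_mae, top, left) for the position with the smallest MAE."""
    n = len(cur_block)
    best = None
    for top in range(len(ref_area) - n + 1):
        for left in range(len(ref_area[0]) - n + 1):
            mae = mean_absolute_error(cur_block, ref_area, top, left)
            if best is None or mae < best[0]:
                best = (mae, top, left)
    return best
```

The motion vector is the offset of the returned position relative to the position of the current block in its own frame. Every candidate is evaluated, which is why full search is regular enough to map onto a PE array but costly in software.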
SYNTHESIS OF MEMORY-BASED VLSI ARCHITECTURES FOR DISCRETE WAVELET TRANSFORMS Seonil Choi, Jongwoo Bae and Viktor K. Prasanna Integrated Media Systems Center Department of Electrical Engineering-Systems University of Southern California Los Angeles, CA 90089-2562 WWW:http://www.usc.edu/dept/ceng/prasanna/home.html {seonil, jongwoo, prasanna}@halcyon.usc.edu ABSTRACT We propose novel VLSI architectures for computing the Discrete Wavelet Transforms. The proposed architectures employ a memory-based approach. ROM look-up tables are used for the implementation of complex computational modules. Compared with known architectures that employ traditional hardware computational modules, the proposed architectures are faster and are area-efficient. The memory-based architecture is used to implement the block-based DWT with parallel I/O. The resulting architectures are area-efficient and have high throughput and low latency. These architectures are suitable for low-power single-chip implementations which are useful for DWT-based mobile/visual communication systems.
RESIDUAL SIGNAL IN SUB-BAND ACOUSTIC ECHO CANCELLERS O. Tanrikulu, B. Baykal, A. G. Constantinides, J. A. Chambers Sig. Proc. Sec., Dept. of EE. Eng., Imperial College of Sci., Tech. and Med., London SW7 2BT, UK Email: o.tanrikulu@ic.ac.uk All-pass based Power Symmetric QMF-IIR (PS-QMF-IIR) and Aliasing Cancellation QMF-FIR (AC-QMF-FIR) sub-band decomposition approaches are studied in the context of Acoustic Echo Cancellation. The properties of the residual echo signal are obtained. For both filter types, if the filters have very sharp transition-bands, the residual echo signal contains tonal components. It is shown that these can be efficiently removed by using notch filters. Experimental results indicate that PS-QMF-IIR filters are better suited for this application than FIR filter based sub-band approaches, when combined with the notch filters presented.
AN IMPROVED ECHO SHAPING ALGORITHM FOR ACOUSTIC ECHO CONTROL Rainer Martin and Stefan Gustafsson IND, Aachen University of Technology 52056 Aachen, Germany Tel: +49 241 806984; fax: +49 241 8888186 e-mail: martin@ind.rwth-aachen.de This paper describes and analyses an improved algorithm for hands-free telephony which uses an acoustic echo canceller combined with an additional FIR filter (called "echo shaping filter") in the sending path of the hands-free telephone. The algorithm controlling the filter is motivated by an approximation of an optimal least squares filter. Simulation results show that the algorithm allows the order of the echo canceller to be reduced significantly while still providing high echo attenuation and low distortion of the near end speech signal during double talk. The modulation of the background noise caused by the echo shaping filter can be reduced by adding artificially generated noise to the output signal ("comfort noise").
ACOUSTIC ECHO CANCELLATION AND NOISE REDUCTION IN THE FREQUENCY-DOMAIN: A GLOBAL OPTIMISATION F.Capman, J.Boudy, P.Lockwood MATRA COMMUNICATION, Speech Processing Department rue J.P.Timbaud, 78392 Bois d'Arcy Cedex, BP 26, FRANCE phone: (+33-1) 34-60-76-84 fax: (+33-1) 34-60-88-32 e-mail: fcapman@matra-com.fr Abstract: The growth of mobile radio and teleconference communications now calls for efficient and robust hands-free systems. The use of Frequency-Domain Adaptive Filters in the context of acoustic echo cancellation has been extensively studied in the literature. These algorithms are well suited to modelling long impulse responses and to correlated input signals such as speech. A global optimisation of a frequency-domain acoustic echo cancellation algorithm with noise reduction is presented in this paper. This optimisation leads to both reduced complexity and improved performance when compared to classical cascaded structures.
REALIZATION OF AN ACOUSTIC ECHO CANCELLER ON A SINGLE DSP Gerard Egelmeers, Piet Sommen and Jacob de Boer Eindhoven University of Technology (TUE) P.O.Box 513, 5600 MB Eindhoven, The Netherlands Tel: +31 40 2473634; fax: +31 40 2455674 e-mail: p.c.w.sommen@ele.tue.nl An Acoustic Echo Canceller (AEC) based on the Decoupled Partitioned Block Frequency Domain Adaptive Filter (DPBFDAF) [3,4] is implemented on a single Digital Signal Processor (DSP), the TMS320C30. This flexible setup makes it possible to choose the sample frequency (fs), the number of coefficients (N) of the adaptive filter and the processing delay independent of one another (only limited by the total complexity). Two implementation examples are given: one with N=2016 and fs=7 kHz with a processing delay of 1.6 msec., the other one with N=2560 and fs=13kHz with a processing delay of 6.5 msec. It is shown that the setup works both for a white noise input signal and a real speech signal.
SUBBAND ACOUSTIC ECHO CONTROL USING NON-CRITICAL FREQUENCY SAMPLING P. A. Naylor and J. E. Hart Dept. Electrical and Electronic Engineering, Imperial College, London, UK. email: p.naylor@ic.ac.uk Aliasing is often generated in critically decimated subband schemes, and this can reduce the performance of subband adaptive algorithms. This paper investigates non-critical decimation schemes in which the generation of aliasing in the subbands is avoided by down-sampling the subband signals by a smaller factor than would normally be expected, thereby allowing for analysis filters with finite transition bands. The implementations of two such non-critical schemes are presented, one using FIR and one using IIR filter banks. Simulation results for acoustic echo control using both USASI noise and male speech signals show the performance of the non-critical schemes in comparison to critically decimated filter bank approaches.
SIMULTANEOUS SCHUR DECOMPOSITION OF SEVERAL MATRICES TO ACHIEVE AUTOMATIC PAIRING IN MULTIDIMENSIONAL HARMONIC RETRIEVAL PROBLEMS Martin Haardt (1), Knut Hueper (1), John B. Moore (2), and Josef A. Nossek (1) (1) Institute of Network Theory and Circuit Design, Technical University of Munich, D-80290 Munich, Germany Phone: +49 (89) 289-28511 Fax: +49 (89) 289-68504 E-Mail: maha@nws.e-technik.tu-muenchen.de (2) Department of Systems Engineering, Australian National University, Canberra ACT 0200, Australia This paper presents a new Jacobi-type method to calculate a simultaneous Schur decomposition (SSD) of several real-valued, non-symmetric matrices by minimizing an appropriate cost function. Thereby, the SSD reveals the "average eigenstructure" of these non-symmetric matrices. This enables an R-dimensional extension of Unitary ESPRIT to estimate several undamped R-dimensional modes or frequencies along with their correct pairing in multidimensional harmonic retrieval problems. Unitary ESPRIT is an ESPRIT-type high-resolution frequency estimation technique that is formulated in terms of real-valued computations throughout. For each of the R dimensions, the corresponding frequency estimates are obtained from the real eigenvalues of a real-valued matrix. The SSD jointly estimates the eigenvalues of all R matrices and, thereby, achieves automatic pairing of the estimated R-dimensional modes via a closed-form procedure that requires neither a search nor any other heuristic pairing strategy. Finally, we show how R-dimensional harmonic retrieval problems (with R > 2) occur in array signal processing and model-based object recognition applications.
A UNIFIED APPROACH TO ROBUST ADAPTIVE BEAMFORMING IN MOVING JAMMER ENVIRONMENT Alex B. Gershman, Ulrich Nickel, Johann F. Bohme Electrical Engineering Dept., Ruhr University, Bochum, Germany Electronics Dept., FGAN-FFM, Wachtberg, Germany e-mail: gsh@sth.ruhr-uni-bochum.de The performance of adaptive beamforming algorithms is known to degrade in rapidly moving jammer environments. This degradation occurs because jammer motion may bring the jammers out of the sharp nulls of the adapted directional pattern. Below, we develop a unified approach that makes a wide class of adaptive array algorithms robust against possible jammer motion. This is achieved by artificially broadening the null width in all jammer directions. Data-dependent sidelobe derivative constraints are used which do not require any a priori information about the jammers. The robust modifications of several well-known adaptive array algorithms are formulated.
AN ADAPTIVE ESPRIT ALGORITHM BASED ON PERTURBATION OF UNSYMMETRICAL MATRICES Qing-Guang Liu and Benoit Champagne INRS-Telecommunications 16 Place du Commerce Verdun, Quebec, Canada H3E 1H6 qingliu@inrs-telecom.uquebec.ca ABSTRACT Many subspace updating algorithms based on the eigenvalue decomposition (EVD) of array covariance matrices have been proposed and used in high-resolution array processing algorithms in recent years. In some applications (e.g. ESPRIT algorithms), however, the EVD of an unsymmetrical matrix is also needed. In this paper, an EVD updating approach for an unsymmetrical matrix is presented based on its first-order perturbation analysis. By jointly using this approach and a subspace updating method in an ESPRIT algorithm, a completely adaptive ESPRIT algorithm is obtained. The complexity and the performance of this algorithm are evaluated in the paper.
Title: AN ALGORITHM FOR MULTI-SOURCE BEAMFORMING AND MULTI-TARGET TRACKING: FURTHER RESULTS Authors: Sofiene AFFES (1),(3), Saeed GAZOR (2) and Yves GRENIER (3) Affiliations: (1) INRS-Telecommunications, 16, Place du Commerce, Ile des Soeurs, Verdun, H3E 1H6, Canada e-mail: affes@inrs-telecom.uquebec.ca (2) Isfahan University of Technology, Electrical Engineering Dept, Isfahan, Iran (3) ENST, Dept Signal, 46 rue Barrault, 75634 Paris, Cedex 13, France Abstract: We herein propose an optimal beamformer for the extraction and the tracking of partially- or fully-coherent sources in colored noise. We adaptively implement it in a simple structure and combine it with a "source-subspace" tracking procedure. We finally show its effectiveness and its fast tracking capacity by simulations.
ARRAY SELF CALIBRATION: IDENTIFIABILITY ISSUES Pierre Comon (*) and Laurent Deruaz Thomson-Sintra ASM, BP157, F-06903 Sophia-Antipolis Cedex comon@asm.thomson.fr (*) also I3S-CNRS, 250 av Einstein, Sophia-Antipolis, F-06560 Valbonne http://wwwi3s.unice.fr comon@alto.unice.fr Array self calibration consists of identifying array shape distortions and deviations in the gain and phase responses of the sensors, in an unknown source field. Conditions of local identifiability of these parameters are established (small perturbations), and turn out to depend on the type of array (i.e. linear, surface, volume) and the type of field (i.e. near or far). The minimal number of sources and sensors is calculated in each case, and the nature of the remaining degrees of freedom is interpreted (e.g. translation, rotation). It is shown that, with additional knowledge that can be provided by a manoeuvre or by a perfect sensor, the latter parameters can in turn be identified.
MULTIPLE SIGNAL DETECTION AND PARAMETER ESTIMATION USING SENSOR ARRAYS WITH PHASE UNCERTAINTIES D. Maiwald and U. Nickel FGAN-FFM, Neuenahrer Str. 20, D-53343 Wachtberg email: maiwald@elserv.ffm.fgan.de In this paper a procedure is outlined for performing both sensor array calibration and signal detection/direction of arrival estimation simultaneously. The source directions are unknown. Sensor array calibration is done using a least squares technique. Signal detection and direction of arrival estimation are performed by a multiple test procedure based on F-tests. The algorithm is studied by simulations and by numerical experiments with data measured by an experimental radar array with 8 elements.
A GENERALIZED CORRELATION FUNCTION FOR MAGNIFIED/REDUCED SIGNALS Axel Busboom, Hans Dieter Schotten, and Harald Elders-Boll Institut fuer Elektrische Nachrichtentechnik RWTH Aachen, D-52056 Aachen, Germany Tel: +49 241 807678; fax: +49 241 8888196 e-mail: busboom@ient.rwth-aachen.de A generalization of the correlation function is explored which, besides a relative time shift between the signals to be correlated, also takes into account different scalings on the time axis (i.e., magnification/reduction). It is shown how the generalized correlation function for continuous signals can be sampled and computed without loss of information and thus can be described by discrete-time signals. Envisaged applications comprise coded aperture imaging, measurement, radar, and digital communications. Special attention is paid to tomographic imaging using coded apertures. It is demonstrated how individual slices of an object can be reconstructed by correlating the recorded image with suitably designed decoding filters using the generalized correlation function.
OPTIMAL TIME INVARIANT AND WIDELY LINEAR SPATIAL FILTERING FOR RADIOCOMMUNICATIONS Pascal Chevalier Thomson-CSF-Communications, 66 rue du Fossé Blanc, 92231 Gennevilliers, France Tel: 33 1 46 13 26 98 ; Fax: 33 1 46 13 25 55 The classical optimal array filtering problem assumes stationary signals and consists of implementing a complex linear and Time Invariant (TI) filter that optimizes a second order criterion at the output under some possible constraints. Although optimal for stationary signals, this approach is sub-optimal for non-stationary signals, for which the optimal complex filters are Time Variant (TV) and, under some conditions of non-circularity, Widely Linear (WL). The purpose of this paper is to show the benefit of WL spatial filtering structures over linear ones in non-stationary radiocommunications environments.
A NEW ROBUST ADAPTIVE STEP SIZE LMS ALGORITHM Dimitrios I. Pazaitis and Anthony G. Constantinides Department of Electrical and Electronic Engineering, Signal Processing Section, Imperial College, Exhibition Road, London SW7 2BT e-mail : {d.pazaitis, a.constantinides}@ic.ac.uk In this contribution a new robust technique for adjusting the step size of the Least Mean Squares (LMS) adaptive algorithm is introduced. The proposed method exhibits faster convergence, enhanced tracking ability and lower steady state excess error compared to the fixed step size LMS and other previously developed variable step size algorithms, while retaining much of the LMS computational simplicity. A theoretical behaviour analysis is conducted and equations regarding the evolution of the weight error vector correlation matrix together with convergence bounds are established. Extensive simulation results support the theoretical analysis and confirm the desirable characteristics of the proposed algorithm.
A NON STATIONARY LMS ALGORITHM FOR ADAPTIVE TRACKING OF A MARKOV TIME-VARYING SYSTEM M. TURKI, M. JAIDANE-SAIDANE L.S.Telecoms, ENIT, Campus Universitaire, Le Belvedere, Tunis, TUNISIA Telephone: (216)1514700; E-Mail: Jaidane@enit.rnrt.tn Abstract We propose in this paper a new adaptive algorithm designed to track a system represented by a filter whose time evolution is Markovian of order P. The Non Stationary LMS (NSLMS) algorithm is able to identify the unknown order and parameters of the Markov model. An analysis of the performance of the adaptive filter when the input is i.i.d. shows that the NSLMS performs better than the classical LMS. In particular, this superiority occurs when the system time evolution is so fast that tracking with LMS is harmful.
ANALYSIS OF AN LMS ADAPTIVE FEEDFORWARD CONTROLLER FOR PERIODIC DISTURBANCE REJECTION: NON-WIENER SOLUTIONS FOR THE LMS ALGORITHM WITH A NOISY REFERENCE-REVISITED Neil J. Bershad (1) and Jose Carlos M. Bermudez (2) (1) Department of Electrical and Computer Engineering, University of California, Irvine, CA, 92717, U.S.A., bershad@ece.uci.edu (2) Laboratorio de Instrumentacao Eletronica (LINSE), Departamento de Engenharia Eletrica, Universidade Federal de Santa Catarina, C.P. 476, 88.040-900, Florianopolis, SC, Brazil, bermudez@linse.ufsc.br LMS adaptive cancellation has been found to be effective in various applications of active noise control of periodic disturbances. A deterministic periodic waveform can be used for the reference when the period of the disturbance is known a priori. However, the algorithm behavior is determined by so-called Non-Wiener solutions. This paper presents a new vector subspace model for simplifying the analysis of the Non-Wiener behavior. The LMS weights are modelled as a deterministic time-varying mean plus a zero-mean fluctuating part. Each weight component is analyzed separately with the subspace model.
A DESIGN METHOD FOR OVERSAMPLED PARAUNITARY DFT FILTER BANKS USING HOUSEHOLDER FACTORIZATION K.Kajita, H.Kobayashi, S.Muramatsu, A.Yamada and H.Kiya Dept. of Elec. & Info. Eng., Tokyo Metropolitan University (e-mail: kajita@isys.eei.metro.ac.jp) In this work, we propose a design method for oversampled FIR DFT filter banks which have the paraunitary property, where the number of channels M is a multiple of the decimation ratio D and the filter length is a multiple of M. Our proposed method is based on Householder factorization, which preserves the perfect reconstruction condition and the paraunitary property of the filter banks during the optimization process. In addition, we examine the linear phase property for oversampled DFT filter banks, and the design method of oversampled linear phase DFT filter banks. In order to show the effectiveness of our method, we give some design examples.
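The reason a Householder-based parameterization can preserve a unitary (or paraunitary) structure during optimization is that every Householder reflection is orthogonal by construction, whatever vector defines it. The Python sketch below checks this for a real-valued reflection H = I - 2 v v^T / (v^T v). It is only an illustration of that building block, not the authors' filter bank factorization, and the helper names are assumptions made for this example.

```python
def householder(v):
    """Real Householder reflection H = I - 2 v v^T / (v^T v)."""
    n = len(v)
    norm_sq = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm_sq
             for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def is_orthogonal(h, tol=1e-12):
    """Check H H^T = I up to a numerical tolerance."""
    p = matmul(h, transpose(h))
    n = len(h)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))
```

Because `is_orthogonal(householder(v))` holds for any non-zero `v`, an optimizer can vary the entries of `v` freely without an explicit orthogonality constraint.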
ADAPTIVE+DARWINIAN APPROACH FOR THE ESTIMATION AND TRACKING OF TIME DELAYS Armando Malanda Trigueros Anibal R. Figueiras-Vidal Gerald Cain Universidad Publica de Navarra (Spain). malanda@upna.es Universidad Politecnica de Madrid (Spain). anibal@gtts.ssr.upm.es University of Westminster (U.K.). gerry@cmsa.westminster.ac.uk Abstract The problem of time delay estimation is tackled with three different algorithms: a gradient-like scheme, a Darwinian Algorithm (a global optimisation procedure inspired by Nature's evolution mechanisms) and a third approach that is a mixture of the previous two. While the gradient scheme easily finds an accurate estimate when well initialised, it loses track when badly initialised or when jumps occur in the delay. The Darwinian algorithm appears more robust to delay changes but is too slow and less accurate. Our combined solution outperforms the other two in convergence capabilities, without notably degrading accuracy or speed.
AN ADAPTIVE FILTER COEFFICIENTS ADJUSTMENT ALGORITHM STABLE AGAINST REFERENCE SIGNAL POWER FLUCTUATION AVAILABLE FOR ACOUSTIC ECHO CANCELLER SYSTEMS Kensaku FUJII and Juro OHGA Multimedia Systems Laboratories (L40), Fujitsu Laboratories Ltd. 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki, 211-88, Japan Tel: +88-44-777-1111, Fax: +88-44-754-2741, fujiken@flab.fujitsu.co.jp The ERLE (echo return loss enhancement) fluctuates widely if the adaptive filter coefficients are continuously adjusted without regard to the reference signal power fluctuation. This paper presents a method of always maintaining the specified ERLE, even when the adjustment is continued during voiceless noise periods. The method is based on the 'summational' NLMS (normalised least mean square) algorithm, in which the coefficients are updated after the reference signal norm and the product of the residual echo and the reference signal have been summed over consecutive iterations (a block). The SNLMS algorithm can keep the ERLE at the specified level if the coefficients are updated once the summed norm has reached a value evaluated from a given surrounding noise power.
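One possible reading of the summational update described above can be sketched in Python as follows. This is an interpretation built only from the abstract, not the authors' algorithm: the function and parameter names, the step size `mu`, and the treatment of `norm_threshold` as a fixed user-supplied number (rather than a value derived from the surrounding noise power) are all assumptions.

```python
import random


def snlms_identify(x, d, taps, mu, norm_threshold):
    """Block-accumulated NLMS sketch.

    x: reference signal, d: signal containing the echo.
    The correlation term e[n] * x_vec and the reference norm are summed
    over consecutive samples; the coefficients are updated only once the
    summed norm reaches `norm_threshold` (an assumed, fixed threshold).
    """
    h = [0.0] * taps
    acc_grad = [0.0] * taps
    acc_norm = 0.0
    for n in range(len(x)):
        # Most recent `taps` reference samples, zero before the start.
        x_vec = [x[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        e = d[n] - sum(hk * xk for hk, xk in zip(h, x_vec))  # residual echo
        for k in range(taps):
            acc_grad[k] += e * x_vec[k]
        acc_norm += sum(xk * xk for xk in x_vec)
        if acc_norm >= norm_threshold:
            # Normalised update using the summed quantities, then reset.
            for k in range(taps):
                h[k] += mu * acc_grad[k] / acc_norm
            acc_grad = [0.0] * taps
            acc_norm = 0.0
    return h


def demo():
    """Identify a hypothetical noiseless 2-tap echo path [0.5, -0.3]."""
    rng = random.Random(0)
    x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    true_h = [0.5, -0.3]
    d = [sum(true_h[k] * (x[n - k] if n - k >= 0 else 0.0) for k in range(2))
         for n in range(len(x))]
    return snlms_identify(x, d, taps=2, mu=1.0, norm_threshold=4.0)
```

In this sketch, low-power stretches of the reference signal simply lengthen the block instead of triggering updates normalised by a very small norm, which is the intuition behind waiting for the summed norm to reach a threshold.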
A Cost Function for Constant Amplitude Signals based on Statistical Reference Josep Sala-Alvarez Department of Signal Theory and Communications (GPS) Universitat Politecnica de Catalunya c/ Gran Capità s/n, Modul D5 08034 Barcelona, Spain Tel: +34-3-401 64 40; Fax: +34-3-401 64 47 E-mail: alvarez@gps.tsc.upc.es ABSTRACT The equalisation of constant amplitude signals is considered in this paper. A criterion based on the probability density function (pdf) of the signal of interest is proposed. The objective is to derive a suitable soft-decision scheme, more robust than the classical CMA algorithm, that ensures recoverability of the signal.
ON THE PROBLEM OF BLIND EQUALIZATION CONSIDERING ABRUPT CHANGES IN THE CHANNEL CHARACTERISTICS Catharina Carlemalm, Bo Wahlberg S3-Automatic Control Royal Institute of Technology (KTH) S-100 44 Stockholm SWEDEN cath@s3.kth.se, bo@s3.kth.se The problem of blind equalization in a digital communication system is considered. Unfortunately, the circuit might suffer from abrupt changes. Thus, it is critical not to ignore this phenomenon when the problem of blind equalization is analyzed. The proposed method, which is based on an Ito stochastic differential calculus approach, describes the dynamics of the output signal with an infinite impulse response (IIR) model where the involved taps are modeled as time-varying cadlag (continue à droite, limite à gauche) processes. Therefore, nonlinear and time-variant changes in the channel characteristics are included.
SOURCE INDEPENDENT BLIND EQUALIZATION WITH FRACTIONALLY-SPACED SAMPLING Joao Gomes, Victor Barroso Instituto Superior Tecnico - Instituto de Sistemas e Robotica Av. Rovisco Pais, Torre Norte 7 1096 Lisboa Codex, Portugal Tel: +351-1-8418296 Fax: +351-1-8418291 jpg@isr.ist.utl.pt, vab@isr.ist.utl.pt A generalization of the super-exponential blind equalization algorithm for fractionally-spaced sampling is presented. Taking advantage of the increased degrees of freedom in selecting higher order statistics of cyclostationary signals, two different cost functions are proposed for blind equalization. One of them allows the inverse of a bandlimited continuous channel to be identified without aliasing, and the other leads to a blind counterpart of a decision-directed fractionally-spaced equalizer (FSE). Simulation results document the performance of these algorithms.
SOFT DECISION SOLUTION TO ILL CONVERGENCE OF BLIND DECISION FEEDBACK EQUALIZERS Sofiane Cherif(1)(2), A/Meriem Jaidane(1), Sylvie Marcos(3) (1) Laboratoire des Systemes de Telecommunications, ENIT, BP 37, Le Belvedere-Tunis, TUNISIA Tel : + 216 (1) 514700; fax : + 216 (1) 510729; e-mail : jaidane@enit.rnrt.tn (2) Ecole Superieure des Postes et des Telecommunications de Tunis, 9083 Cite El Ghazala, TUNISIA Tel : + 216 (1) 762000; fax : + 216 (1) 762819; e-mail : cherif@espttn.esptt.tn (3) Laboratoire des Signaux et Systemes, CNRS-ESE, 91192 Gif/Yvette Cedex FRANCE Tel : + 33 (1) 69851729; fax : + 33 (1) 69413060; e-mail : marcos@lss.supelec.fr ABSTRACT Decision Feedback Equalisers (DFE) for blind equalization are subject to ill-convergence. In this paper we prove that the algorithms may be blind to the global minimum due to the error surface structure. The use of a soft decision in the decision device during a pseudo-training phase partially solves the problem of ill-convergence of DFE.
WIDEBAND BLIND IDENTIFICATION AND SEPARATION OF INDEPENDENT SOURCES Wang Jun DSP Division, Department of Radio Engineering Southeast University Nanjing 210096, P.R.China e-mail: cwwu@seu.edu.cn Abstract: Two higher-order spectra methods, one based on bispectra and one on trispectra, for solving the wideband blind identification and signal separation problem are presented. The methods are universal in the sense that they do not impose any restrictions on the probability distribution of the input signals, provided that they are asymmetrically distributed for the bispectra method and non-Gaussian for the trispectra one. Two criteria, which state sufficient conditions for identification and separation, have been proved. Algorithms are developed based on the criteria, and their efficiency is verified by simulations.
SUBSPACE METHOD FOR BLIND SEPARATION OF SOURCES IN CONVOLUTIVE MIXTURE. Ali MANSOUR (1,3), Christian JUTTEN (1,3,4) and Philippe LOUBATON (2,3) 1 INPG-TIRF, 46 avenue Félix Viallet, 38031 Grenoble Cedex (France) 2 Univ. de Marne la Vallée, 2 rue de la Butte Verte, 93166 Noisy-Le-Grand Cedex (France) 3 GdR Traitement du Signal et des Images, CNRS 4 Professor in Institut des Sciences et Techniques de Grenoble (ISTG) of Université Joseph Fourier. mansour@tirf.inpg.fr chris@tirf.inpg.fr loubaton@pekin.univ-mlv.fr For the convolutive mixture, a subspace method to separate the sources is proposed. It is shown that, using only second-order statistics but more sensors than sources, the convolutive mixture can be identified up to an instantaneous mixture. Furthermore, the sources can then be separated by any algorithm for instantaneous mixtures (generally based on fourth-order statistics).
BLIND SEPARATION OF WIDE-BAND SOURCES : APPLICATION TO ROTATING MACHINE SIGNALS V. Capdevielle, Ch. Serviere, J-L. Lacoume CEPHAG serviere@cephag.observ-gr.fr We propose an extension of the narrow band source separation algorithms to the case of wide band sources, which is developed in frequency domain. We mainly focus on the separation of convolutive mixtures of rotating machine noises and develop two specific points. In the first point, we study the feasibility of the separation of periodic signals, with regard to the hypothesis of random and non gaussian sources. The second point consists in the reconstruction of the spectra of the estimated sources from the signals identified at each frequency bin. Indeed, the source associated to the ith identified signal is not necessarily the same from one frequency bin to another. In this paper, we theoretically prove the feasibility of the separation of rotating machine noises and propose a solution in order to reconstruct the source spectra. The algorithm is then illustrated with experimental results, including the procedures of separation and reconstruction.
BLIND SOURCE SEPARATION BY SIMULTANEOUS THIRD-ORDER TENSOR DIAGONALIZATION Lieven De Lathauwer, Bart De Moor, Joos Vandewalle K.U.Leuven - E.E. Dept.- ESAT - SISTA Kard. Mercierlaan 94, B-3001 Leuven (Heverlee), Belgium tel: 32/16/321805 fax: 32/16/321986 e-mail: Lieven.DeLathauwer@esat.kuleuven.ac.be We develop a technique for Blind Source Separation based on simultaneous diagonalization of (linear combinations of) third-order tensor "slices" of the fourth-order cumulant. It will be shown that, in a Jacobi-type iteration scheme, the computation of an elementary rotation can be reformulated in terms of a simultaneous matrix diagonalization.
SECOND ORDER BLIND IDENTIFICATION OF CONVOLUTIVE MIXTURES WITH TEMPORALLY CORRELATED SOURCES: A SUBSPACE BASED APPROACH A. Gorokhov and P. Loubaton Telecom Paris, Dept. Signal 46 rue Barrault 75634 Paris Cedex 13 FRANCE UF SPI (EEA) Universite de Marne la Vallee 2 rue de la Butte Verte 93166 Noisy-le-Grand Cedex FRANCE This contribution addresses the blind identification of Multiple Input Multiple Output (MIMO) linear FIR systems having a number of inputs less than the number of outputs. Recent publications have proposed an efficient second order identification method in the Single Input Multiple Output (SIMO) case. Based on a subspace analysis, it allows a perfect recovery of the system parameters and excitation in a noise free environment. In this paper we indicate how to extend the original subspace based approach to the general MIMO case.
DIRECTION FINDING AFTER BLIND IDENTIFICATION OF SOURCES STEERING VECTORS: THE BLIND-MAXCOR AND BLIND-MUSIC METHODS P. Chevalier, G. Benoit and A. Ferrol Thomson-CSF-Communications, 66 rue du Fossé Blanc, 92231 Gennevilliers, France Tel: 33 1 46 13 26 98 ; Fax: 33 1 46 13 25 55 To find the direction of arrival (DOA) of P sources impinging on an array of N sensors, current second and fourth order direction finding (DF) methods try to solve a P-dimensional problem from the statistics of the data. The purpose of this paper is to present a new approach to DF, based on a first step of blind identification of the source steering vectors, which aims, for some of these methods, at reducing the problem dimension before DF. Two new methods, the Blind-MAXCOR and the Blind-MUSIC methods, are proposed and their performance is compared with that of the MUSIC method.
BLIND BEAMFORMING IN A CYCLOSTATIONARY CONTEXT USING AN OPTIMALLY WEIGHTED QUADRATIC COST FUNCTION C. VIGNAT AND P. LOUBATON Université de Marne la Vallée, Unité de Formation S.P.I. 2 rue de la Butte Verte 93166 NOISY LE GRAND CEDEX e-mail: vignat@univ-mlv.fr This paper addresses the problem of blind beamforming in a cyclostationary context. We show the equivalence between the SCORE algorithm derived by Gardner et al., and the minimization of an optimally weighted quadratic cost function. This approach allows us to justify, from a statistical point of view, the relevance of the SCORE algorithm.
VOICE CONTROLLED MOBILE PHONE FOR CAR ENVIRONMENT Ivan Bourmeyster(1), Jamil Chaoui(2), Silvio Cucchi(3), Nicola Griggio(3), Alessandro Guido(3), Giuliano Moroni(3), Antonello Riccio(3), Marco Stanzani(3), Fabio Valente(3) (1) Alcatel Mobile Phones, (2) formerly at Alcatel Mobile Phones - 32, avenue Kleber, 92707 Colombes, France Tel: +33 146521706; fax: +33 146528025 (3) Alcatel Corporate Research Centre - Via Trento, 30, 20059 Vimercate (Milano), Italy Tel: +39 39 686 4077; fax: +39 39 686 3587 ABSTRACT The development of an application of speech processing in a car environment is addressed. The main objective is to provide the user of a vehicular phone with a powerful and friendly bidirectional vocal interface. In particular, the paper focusses on the speech recogniser component of the interface, as it was specifically designed and tuned to operate in the very hostile acoustic environment of a moving car. The recogniser operates in a fully speaker dependent mode, enabling the user to store his/her personal agenda of frequently called parties. For training, three repetitions of each vocabulary word are recommended, although performance remains satisfactory with only two repetitions. Reliable performance assessment was conducted with particular attention to the robustness of the recogniser against spurious noises. Standard procedures (SAM oriented) were used to guarantee the repeatability of any test. An outlook on future improvements is also given.
A NEW ERROR CONCEALMENT TECHNIQUE FOR AUDIO TRANSMISSION WITH PACKET LOSS Alexander Stenger, Khaled Ben Younes, Richard Reng, Bernd Girod Telecommunications Institute, University of Erlangen-Nuremberg Cauerstrasse 7, 91058 Erlangen, Germany stenger@nt.e-technik.uni-erlangen.de younes@nt.e-technik.uni-erlangen.de reng@vs-ulm.dasa.de girod@nt.e-technik.uni-erlangen.de We present a new error concealment technique for audio transmission over packet networks with high packet loss rate. Unlike other techniques it modifies the time-scale of correctly received packets instead of repeating them. This is done by a time-domain algorithm, WSOLA, whose parameters are redefined so that short audio segments like lost packets can be extended. Particular attention is paid to the additional delay introduced by the new technique. For subjective hearing tests, single and double packet loss is simulated at high packet loss rates, and the new technique is compared to previous proposals by category judgment and component judgment of sound quality. Mean Opinion Score (MOS) curves show that sound distortions due to packet repetition can be reduced.
TRANSMISSION OF VARIABLE-RATE ENCODED SPEECH SAMPLES ON PACKET RADIO NETWORKS Fulvio Babich, Sergio Carrato and Francesca Vatta D.E.E.I., University of Trieste via A. Valerio, 10, 34127 Trieste, Italy Tel. +39 40 6763458 - 6767147; Fax: +39 40 6763460 e-mail: babich, vatta@univ.trieste.it e-mail: carrato@imagets.univ.trieste.it ABSTRACT This paper presents the performance evaluation of different speech coding techniques in wireless packet switching networks: the goal of our study is to increase network capacity while maintaining a smooth degradation of quality at high loads and heavy interference, in order to make it possible for different kinds of information to coexist in a single network infrastructure. In the paper we propose a variable-rate multimode and embedded encoding technique as effective for handling network congestion and channel impairments that both cause discarding or erasure of frames of information. Therefore this approach is important not only in TDMA packet switched communications with statistical multiplexing (leading to greater efficiency and flexibility than basic TDMA, that assigns a fixed portion of channel resources to each user), but also in a CDMA-based mobile system that is strictly limited by interference.
CHANNEL EQUALIZATION USING PARTIAL LIKELIHOOD ESTIMATION AND RECURRENT CANONICAL PIECEWISE LINEAR NETWORK Xiao Liu and Tulay Adali Information Technology Laboratory Department of Computer Science and Electrical Engineering University of Maryland Baltimore County Baltimore, MD 21228-5938, USA Tel: (410) 455-3521; fax: (410) 455-3969 e-mail: xliu@engr.umbc.edu adali@engr.umbc.edu A recurrent canonical piecewise linear (RCPL) network is proposed based on the canonical piecewise linear (CPL) structure and is applied to channel equalization. RCPL network provides savings in computation and implementation and has a distinct dynamic behavior completely different than that of finite duration feedforward structure. The simulations of multilevel signal equalization demonstrate the superior performance of RCPL equalizer when compared to the multilayer perceptron equalizer. For the RCPL network, it is easy to incorporate the a-priori information into the network structure. A novel blind algorithm is presented by combining partial likelihood estimation and RCPL structure for the binary communications channel. The simulation results show that RCPL blind equalizer outperforms the CMA equalizer by orders of magnitude for blind equalization of nonlinear communication channels.
CHANNEL ESTIMATION FOR TRANSFORM MODULATIONS IN MOBILE COMMUNICATIONS Meritxell Lamarca, Gregori Vazquez Department of Signal Theory and Communications Polytechnic University of Catalonia (UPC) Barcelona (SPAIN) e-mail: xell@gps.tsc.upc.es This paper deals with data-aided channel estimation in systems using OFDM modulation. We formulate a pilot symbol-based channel estimator and compare it with the pilot tone one proposed in [1]. Although this paper focuses on flat fading mobile channels, the results could be easily applied to OFDM systems operating in frequency selective channels.
DETECTION AND COMPENSATION FOR DISRUPTIVE NON-LINEAR TRAFFIC-FLOW DYNAMICS IN COMMUNICATION NETWORKS D.P.A.Greenwood and R.A.Carrasco School Of Engineering Staffordshire University Stafford ST18 0AD United Kingdom d.greenw@bss10a.staffs.ac.uk r.carras@bss10a.staffs.ac.uk Abstract: A method has been developed for the monitoring of traffic flow behavioural dynamics in distributed communication networks and the provision of results from this process to a distributed neural control mechanism which facilitates localised adaptive traffic routing in order to maintain or regain flow stability. It has been shown by simulation how the novel method improves network performance and efficiency beyond that of conventional techniques.
SIMULATION OF LAND MOBILE SATCOM LINKS USING DIFFERENT ORBITS AND MODULATION MODES Marcel Kohl Friedrich Jondral Universitaet Karlsruhe, Nachrichtensysteme D-76128 Karlsruhe, Germany Tel: +49 721 6083748; fax: +49 721 6086071 e-mail: kohl@inss1.etec.uni-karlsruhe.de The use of SATCOM systems is an essential part of today's worldwide communications. As the proportion of satellites in low-altitude orbits increases, Doppler shifts often influence the received signal. Before this effect can be removed, the exact course and amount of the Doppler shift must be known. Therefore this paper derives the equations to calculate the orbit and the Doppler shift and shows the behaviour and the effects caused by LEO and HEO satellites. Finally a method is proposed to compensate for this influence.
SCRAMBLING AND ERROR CORRECTION BY MEANS OF LINEAR TIME-VARYING FILTERS Alban Duverdier and Bernard Lacaze National Polytechnics Institute of Toulouse LEN7/GAPSE, 2 rue Camichel, 31071 Toulouse, France tel: (33) 61 58 83 67 Fax:(33) 61 58 82 37 email: duverdie@len7.enseeiht.fr In numerous communication applications, it is desirable to scramble the contents of the information. In addition, we seek to design a scrambling system which has maximum immunity to additive noise. This paper presents a method of analogue signal scrambling/unscrambling by means of linear periodic time-varying filters for any frequency selective noise. It is well known that linear periodic time-varying filters transform a stationary process into a cyclostationary signal. This thus spreads the spectral representation of the input process. The original part of the paper consists of using this property to reconstruct an initial band-limited process without error for any frequency selective noise.
A FAST LUT+CMAC DATA PREDISTORTER Francisco J. Gonzalez-Serrano (*) and Anibal R. Figueiras-Vidal (**) and Antonio Artes-Rodriguez (**) (*) Grupo de Teoria de Senal Departamento de Tecnologias de las Comunicaciones ETSI Telecomunicacion. Universidad de Vigo. 36200 VIGO-SPAIN. Tel : +(34) 86 81 2130 Fax : +(34) 86 81 2116 E-mail : frank@tsc.uvigo.es and (**) Grupo de Teoria y Tratamiento de Senal DSSR - ETSI Telecomunicacion. Universidad Politecnica de Madrid. 28040 MADRID-SPAIN Tel : +(34) 1 549 5700 Fax : +(34) 1 336 7350 E-mail : antonio@gtts.ssr.upm.es The subject of this communication is the compensation of nonlinearities in digital radio links, where the major source of nonlinearity is caused by the High Power Amplifier (HPA), typically working close to its saturation point because of energy constraints. This paper deals with the design of CMAC-based predistorters for application in digital transmission over nonlinear channels with memory. A novel hybrid structure composed of a Look-Up-Table in parallel with a CMAC network is proposed. Finally, a performance analysis for typical radio channels is presented.
DESIGN OF PULSE SHAPING FILTERS AND THEIR APPLICATIONS IN RADIO SYSTEMS Jong-Jy Shyu, Yo-Chuan Lai Department of Computer Science and Engineering Tatung Institute of Technology, Taipei, Taiwan e-mail: jshyu@cse.ttit.edu.tw Partial-response signaling is known as correlative level coding, wherein the constraint on waveforms is relaxed so as to allow a controlled amount of ISI. In this paper, the Lagrange multiplier approach, which easily incorporates both time- and frequency-domain constraints while minimizing a quadratic measure of the error in the design bands, is applied to design a large class of such digital filters for communication. Also, an iterative Lagrange multiplier approach, combining the Lagrange multiplier approach with a tree search algorithm, is proposed for designing discrete coefficient pulse shaping FIR digital filters. System experiments, such as an SSB radio system using partial response signaling, are presented to demonstrate the usefulness of the proposed algorithm.
APPROXIMATE MAXIMUM LIKELIHOOD ESTIMATION IN LASER VELOCIMETRY. Olivier Besson and Frederic Galtier. ENSICA, Department of Avionics and Systems. 1, Place Emile Blouin. 31056 Toulouse - France. besson,galtier@ensica.fr Abstract: In this paper, we study the estimation of signals of the form $s(t)=A\exp\left\{-2\alpha^2 f_d^2 t^2\right\}\cos\left(2\pi f_d t\right)$, which are encountered in the measurement of particle velocity in a flow by means of laser Doppler velocimeters. We derive an Approximate Maximum Likelihood Estimator of the parameters $A$ and $f_d$ in the model considered. The algorithm is based upon replacing the first and second-order derivatives of the log-likelihood function by approximated and easy to compute expressions. Numerical examples illustrate the performance of the proposed method and quantify the influence of the sample size, the frequency $f_d$ and the parameter $\alpha$. They show that the estimator is statistically efficient in a wide range of scenarios.
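The signal model in the abstract above can be illustrated with a short Python sketch. It only generates the Doppler burst $s(t)$ and reads a coarse estimate of $f_d$ from the peak of a zero-padded periodogram; this is a generic baseline, not the approximate maximum likelihood estimator of the paper. The function names, the sampling rate and the parameter values are assumptions chosen for illustration.

```python
import numpy as np

def doppler_burst(t, amplitude, alpha, f_d):
    """Evaluate s(t) = A exp(-2 alpha^2 f_d^2 t^2) cos(2 pi f_d t)."""
    envelope = np.exp(-2.0 * (alpha * f_d * t) ** 2)
    return amplitude * envelope * np.cos(2.0 * np.pi * f_d * t)

def periodogram_peak(x, fs, nfft=8192):
    """Coarse frequency estimate: location of the largest zero-padded FFT magnitude."""
    spectrum = np.abs(np.fft.rfft(x, n=nfft))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

# Illustrative (assumed) values: 1 kHz sampling, 50 Hz Doppler frequency.
fs = 1000.0
t = np.arange(-0.5, 0.5, 1.0 / fs)
s = doppler_burst(t, amplitude=1.0, alpha=0.1, f_d=50.0)
f_hat = periodogram_peak(s, fs)
```

A coarse estimate of this kind could serve as a starting point for a likelihood-based refinement; the paper's own initialization is not described in the abstract.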
INSTRUMENTAL VARIABLE SOLUTION TO AN EXTENDED FRISCH PROBLEM Petre Stoica, Mats Cedervall, Joakim Sorelius and Torsten Soderstrom Systems and Control Group, Uppsala University PO Box 27, S-751 03 Uppsala, Sweden; Tel: +46 18 183074; fax: +46 18 503611; e-mail: petre.stoica@syscon.uu.se In signal processing and time series analysis applications we often encounter cases in which a number of (noise-free) variables are linearly related and we want to make inferences on the number and the form of the linear relations among those variables from noisy observations of them. The Frisch problem is concerned with the aforementioned inferences under the assumption that the components of the observation noise vector are mutually uncorrelated. In this paper we extend the Frisch problem by allowing the noise vector components to be correlated in an arbitrary (and unknown) way. The EXtended FRIsch problem of this paper is called EXFRI for short. To make EXFRI solvable we basically assume that the observation noise is temporally white whereas the noise-free signals are temporally correlated. We show that, under the assumptions made, the EXFRI problem has a computationally simple and statistically elegant Instrumental Variable (IV) solution, which is essentially based on a canonical correlation decomposition procedure.
BERNOULLI-GAUSSIAN DECONVOLUTION IN NON-GAUSSIAN NOISE, CONTRIBUTION OF WAVELET DECOMPOSITION H.Rousseau and P.Duvaut E.T.I.S. - E.N.S.E.A., 6, avenue du Ponceau, 95014 CERGY Cedex e-mail : rousseau@ensea.fr We introduce a method to restore Bernoulli-Gaussian processes immersed in non-Gaussian noise. It uses wavelet decomposition to "gaussianize" the noise. The convergence, after wavelet projection, of some non-Gaussian noise to Gaussian noise quantifies the quality of the "gaussianization" effect of the wavelet. This property is used to apply a Bernoulli-Gaussian algorithm at each scale of the wavelet decomposition. We then use a fusion strategy to merge all results. We thus obtain a new deconvolution algorithm which performs very well, for all noise statistics, when the noise variance is not well estimated. When the noise variance is correctly estimated, it improves on the classical Bernoulli-Gaussian algorithm for strongly non-Gaussian noises.
MMSE EQUALIZERS FOR MULTITONE SYSTEMS WITHOUT GUARD TIME L. Vandendorpe UCL Communications and Remote Sensing Laboratory, 2, place du Levant, B 1348 Louvain-la-Neuve, Belgium. Phone: +32 10 47 23 12 - Fax: +32 10 47 20 89 - E-Mail : vandendorpe@tele.ucl.ac.be Recently the concept of multitone modulation or OFDM has received much attention. For such a modulation, the dispersiveness of the channel is classically handled by the technique of guard time. In the present paper we investigate the performance of OFDM without guard time but with MIMO equalization. Linear and decision-feedback structures are derived for an MMSE criterion and their performance is assessed by means of their steady-state behavior. Symbol rate equalizers following channel matched filters are derived and investigated. It is shown that equalized OFDM outperforms OFDM with guard time.
MSE-BASED REGULARIZATION APPROACH TO RANK DETERMINATION IN CLS AND TLS ESTIMATION H. Kagiwada, Y. Aoki, J. Xin and A. Sano Department of Electrical Engineering, Keio University 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223, Japan Tel: +81 45 563 1141; fax: +81 45 563 2773 e-mail: sano@sano.elec.keio.ac.jp The corrected least squares (CLS) approach using an overdetermined model is investigated to decide the number of sinusoids in additive white noise. Like the total least squares (TLS) approach, the CLS estimation is different from the ordinary least squares (LS) method in that the noise variance is subtracted from the diagonal elements of the correlation matrix of the noisy observed data. Therefore the inversion of the resultant matrix becomes ill-conditioned and then adequate truncation of the eigenvalue decomposition (EVD) should be done. This paper clarifies how to simultaneously estimate the noise variance and truncate the eigenvalues, since they are mutually dependent. By introducing a multiple number of regularization parameters and determining them to minimize the MSE of the model parameters, we can give an optimal scheme for the truncation of eigenvalues. Furthermore, an iterative algorithm using only observed data is also clarified.
ROBUSTNESS ANALYSIS OF MUSIC AND ESPRIT FREQUENCY ESTIMATORS FOR SINUSOIDAL SIGNALS WITH TIME-VARYING AMPLITUDE Olivier Besson and Petre Stoica ENSICA, Department of Avionics and Systems, Place Emile Blouin, 31056 Toulouse, France. besson@ensica.fr Uppsala University, Systems and Control Group, 75103 Uppsala, Sweden. Abstract: In this paper, we address the problem of estimating the frequency of a sinusoidal signal with random, lowpass amplitude. We propose to use MUSIC and ESPRIT frequency estimators as if the signal had a constant amplitude. The aim of the paper is to analyze the degradation of performance induced by the aforementioned mismodelling. Unified expressions for the bias and variances of the MUSIC and ESPRIT frequency estimators are derived under the hypothesis of small bandwidth of the signal envelope. Numerical simulations illustrate the agreement between theoretical and empirical results and study the influence of the envelope bandwidth on the frequency estimation performance.
HOS BASED DETECTORS FOR PERIODIC SIGNALS P.R. White, N. Khalili ISVR, University of Southampton, Highfield, Hants, U.K., SO17 1BJ Tel.: +44 1703 592274, Fax: +44 1703 593033 email: prw@isvr.soton.ac.uk This paper discusses algorithms for the detection of periodic pulse-like signals. Such signals exhibit phase as well as frequency coupling and are thus suitable for detection using HOS. The algorithm presented herein can be regarded as an extension to an existing second order spectral algorithm, to include third order terms. The results of simulation studies are presented which demonstrate the performance advantage offered by this new algorithm.
DETECTION OF ABRUPT CHANGES : A TIME-FREQUENCY APPROACH Helene LAURENT, Christian DONCARLI and Philippe POIGNET Laboratoire d'Automatique de Nantes, U.R.A. C.N.R.S. 823 Ecole Centrale de Nantes/Universite de Nantes 1 rue de la Noe, 44072 NANTES CEDEX, FRANCE Tel: (33) 40 37 16 00; Fax: (33) 40 37 25 22 e-mail: poignet@lan.ec-nantes.fr This paper presents a comparison between parametric and non-parametric approaches to the detection of abrupt changes in noisy signals. The goal is to propose an alternative for use when model-based methods do not work well because of an unsuitable model structure or a stepwise signal that is not strictly stationary. In this latter case, an analysis of time-frequency distributions allows the detection of abrupt spectral changes without any hypothesis and provides results as good as those of parametric methods for the studied type of signals.
DETECTION AND ESTIMATION OF CHANGES IN A POLYNOMIAL-PHASE SIGNAL USING THE DPPT C. Theys, A. Ferrari and G. Alengrin I3S Universite de Nice-Sophia Antipolis 41, Bd Napoleon III - 06041 NICE cedex - FRANCE e-mail : theys@unice.fr This paper is concerned with on-line detection and estimation of changes in the parameters of a noisy polynomial-phase signal. This problem arises in vibration monitoring, where the measured signals reflect both the nonstationarities due to the surrounding excitation, modelled by a polynomial phase, and the nonstationarities due to changes in the eigenstructure, modelled by a break in the polynomial parameters. Development of a likelihood ratio test to detect and estimate changes in a polynomial-phase signal requires accurate estimation of the parameter vector after the change, theta1. Use of the Maximum Likelihood Estimate (MLE) of theta1 is not practical, since it involves the optimization of a multi-variable cost function. We propose to estimate theta1 by using the Discrete Polynomial-Phase Transform (DPPT) in order to derive a detector having asymptotically the same properties as the GLR one at a much lower computational cost. Experimental performance, namely the mean detection delay as a function of the mean time between false alarms, will be studied.
SYMBOL DECODING BASED ON SIGNAL SUBSPACE DECODING IN MSK Rafael Ruiz Margarita Cabrera Dept. of Signal Theory and Communications, E.T.S.I. Telecomunicacion, UPC. Apdo. 30002, 08080 Barcelona. SPAIN e_mail: rafael@gps.tsc.upc.es ABSTRACT: Fast processors with architectures tailored to the computational demands of digital signal processing algorithms are widely applied to the demodulation and decoding of CPM signals in several scenarios: mobile communications, AWGN channels, etc. In this application, the number of floating point operations executed per processed symbol is a critical design parameter that must be minimized. In this paper, a method that significantly reduces the number of operations per symbol (by up to 80%) for CPM signals is presented. The decoding stage operates on the rank-reduced signal subspace obtained by means of an orthogonal decomposition of the signal.
ADAPTIVE NEURAL NETWORKS FOR ROBUST ESTIMATION OF PARAMETERS OF NOISY HARMONIC SIGNALS A. Cichocki FRP Riken - ABS Laboratory, Institute of Physical and Chemical Research, Japan Tel: +81 48 465 2645; fax: +81 48 462 4633 e-mail: cia@kamo.riken.go.jp P. Kostyla, T. Lobos, Z. Waclawek Technical University of Wroclaw pl. Grunwaldzki 13, 50-370 Wroclaw, Poland Tel: +48 71 203448; fax: +48 71 229725 e-mail: lobos@elektryk.ie.pwr.wroc.pl ABSTRACT In many applications, very fast methods are required for the estimation and measurement of parameters of harmonic signals distorted by noise. This follows from the fact that signals often have time varying amplitudes. Most of the known digital algorithms are not fully parallel, so that the speed of processing is quite limited. In this paper we propose new parallel algorithms, which can be implemented by analogue adaptive circuits employing some neural network principles. The problem of estimation is formulated as an optimization problem and solved by using the gradient descent method. Algorithms based on the least-squares (LS), the total least-squares (TLS) and the robust TLS criteria are developed and compared. The networks process samples of observed noisy signals and give as a solution the desired parameters of signal components. Extensive computer simulations confirm the validity and performance of the proposed algorithms.
MAXIMUM LIKELIHOOD ESTIMATION OF AR MODULATED SIGNALS Mounir GHOGHO National Polytechnics Institute of Toulouse, ENSEEIHT/GAPSE, France email: ghogho@len7.enseeiht.fr The desired signal is embedded in both multiplicative and additive noises. The multiplicative noise is modeled by a Gaussian AR process. Closed-form expressions are derived for the finite-sample Cramer-Rao bound and for the maximum likelihood estimator. A cyclic approach is used to initialize the maximum likelihood algorithm when the signal is a harmonic.
TIME DELAY AND MOTION ESTIMATORS BASED ON DIGITAL FAST TIME-SCALING OF RANDOM SIGNALS Gaetano Giunta INFO-COM Department, University of Rome "La Sapienza", Via Eudossiana 18, 00184 Rome, Italy tel.: + 39 6 44585838; fax: + 39 6 4873300 e-mail: giunta@infocom.ing.uniroma1.it The estimation of time-delay and time-scaling is required in many signal processing applications. A parabolic approximation was recently suggested for fine estimation of time delay from sampled signals. The method directly extends to scaling estimation by a parallel multi-rate sampling of the analog received signal. Such rescaling can be implemented by digital techniques and two efficient algorithms are here devised and analysed.
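The parabolic approximation mentioned in the abstract above is commonly applied to the peak of the sampled cross-correlation. The Python sketch below shows that generic form: it fits a parabola through the maximum and its two neighbours and returns the vertex as a sub-sample delay. It is a minimal illustration under assumed names and test signals (two Gaussian pulses), and it does not reproduce the paper's time-scaling algorithms.

```python
import numpy as np

def parabolic_delay(x, y):
    """Estimate the delay of y relative to x, in samples, with sub-sample refinement."""
    r = np.correlate(y, x, mode="full")          # r[k] = sum_n y[n + lag] * x[n]
    lags = np.arange(-(len(x) - 1), len(y))      # lag value attached to each r[k]
    k = int(np.argmax(r))
    if k == 0 or k == len(r) - 1:
        return float(lags[k])                    # peak on the boundary: nothing to fit
    num = r[k - 1] - r[k + 1]
    den = r[k - 1] - 2.0 * r[k] + r[k + 1]
    if den == 0.0:
        return float(lags[k])                    # flat top: keep the integer lag
    return float(lags[k] + 0.5 * num / den)      # vertex of the fitted parabola

# Assumed test signals: a Gaussian pulse and a copy delayed by 3.3 samples.
n = np.arange(200)
sigma = 5.0
true_delay = 3.3
x = np.exp(-0.5 * ((n - 100.0) / sigma) ** 2)
y = np.exp(-0.5 * ((n - 100.0 - true_delay) / sigma) ** 2)
d_hat = parabolic_delay(x, y)
```

The parabolic fit is slightly biased when the true correlation peak is not parabolic, so the accuracy depends on how smooth the correlation is relative to the sampling interval.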
A SUPER-RESOLUTION METHOD BASED ON THE DISCRETE COSINE TRANSFORMS Hisashi SAKANE, Kiyoshi NISHIKAWA and Hitoshi KIYA Dept. of Elec. & Info. Eng., Tokyo Metropolitan University, e-mail: kiya@eei.metro-u.ac.jp A super-resolution method based on the discrete cosine transform (DCT) is proposed for a signal with some frequency damage, using a type 1 linear-phase (LP) FIR filter as the damage model. The proposed method can be carried out with real-valued operations and is applicable to any of the four types of DCT. In addition, two magnification schemes based on the proposed method, which improve on the conventional scheme, are described.
ROBUST PARAMETER ESTIMATION FOR PERIODIC POINT PROCESS SIGNALS USING CIRCULAR STATISTICS Stephen D. Elton(1) and Benjamin J. Slocumb(2) (1) Electronics and Surveillance Research Laboratory Defence Science and Technology Organisation and Cooperative Research Centre for Robust and Adaptive Systems P.O. Box 1500, Salisbury, SA 5108, Australia e-mail: Stephen.Elton@dsto.defence.gov.au (2) Electronic Systems Laboratory, Georgia Tech Research Institute Georgia Institute of Technology, Atlanta, GA 30332-0840, U.S.A. e-mail: Ben.Slocumb@gtri.gatech.edu We discuss the application of signal parameter estimators for periodic point process signals with missing data. The proposed estimation techniques operate on the observed event arrival time sequence of a pulse train signal and have application to pulse train signal classification and signal reconstruction. The methods we describe are based on the use of circular statistics and are shown to offer considerable robustness to a pulse train time series corrupted by missing pulses.
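One way to see why circular statistics tolerate missing pulses is to fold the arrival times onto a circle for a candidate period and measure how concentrated the resulting phases are. The Python sketch below uses the mean resultant length for that purpose and searches a grid of candidate periods. It is a generic illustration, not the estimators of the paper; the function names, the 30% miss rate, the jitter level and the search grid are all assumptions.

```python
import numpy as np

def resultant_length(times, period):
    """Mean resultant length of the phases 2*pi*t/period (1 means perfectly periodic)."""
    phases = 2.0 * np.pi * np.asarray(times) / period
    return np.abs(np.mean(np.exp(1j * phases)))

def estimate_period(times, candidates):
    """Pick the candidate period whose folded phases are most concentrated."""
    scores = np.array([resultant_length(times, p) for p in candidates])
    return candidates[int(np.argmax(scores))]

# Assumed scenario: unit period, about 30% of pulses missing, small timing jitter.
rng = np.random.default_rng(0)
true_period = 1.0
index = np.arange(200)
kept = index[rng.random(index.size) > 0.3]
times = kept * true_period + 0.25 + rng.normal(0.0, 0.01, kept.size)
candidates = np.linspace(0.8, 1.2, 401)
period_hat = estimate_period(times, candidates)
```

Missing pulses only reduce the number of phases being averaged; they do not move the phases that remain, which is why the concentration measure stays high at the true period. The search grid is deliberately kept narrow, because integer fractions of the true period would score equally well.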
A METHOD FOR COMPUTING THE INFORMATION MATRIX OF STATIONARY GAUSSIAN PROCESSES Jose M. B. Dias and Jose M. N. Leitao Instituto de Telecomunicacoes and D.E.E.C., Instituto Superior Tecnico Tel: +351 1 8418464; fax: +351 1 8418472 Email: edias@beta.ist.utl.pt This paper proposes a new method for the efficient computation of the Fisher information matrix of zero-mean complex stationary Gaussian processes. Its complexity (measured by the number of floating point operations) is smaller than that of the fastest previously available procedure. The key idea exploited is that the Fisher information matrix depends only on the sum of the diagonals of the inverse covariance matrix derivative (with respect to the model parameters), rather than on the whole matrix. To obtain this sum, a new efficient technique, built upon the Trench algorithm for computing the inverse of a Toeplitz matrix, is presented.
Title: FULLY BAYESIAN ANALYSIS OF HIDDEN MARKOV MODELS Authors: Arnaud DOUCET, Patrick DUVAUT Affiliation: LETI-CEA Technologies Avancees 91191 Gif sur Yvette FRANCE ENSEA-ETIS Groupe Signal 6, avenue du Ponceau 95014 Cergy Pontoise FRANCE douceta@ensea.fr - duvaut@ensea.fr Abstract: In this paper, we present in a unified framework some applications of stochastic simulation techniques, the Markov chain Monte Carlo methods, to perform Bayesian inference for a very wide class of hidden Markov models. Efficient implementation of the Gibbs sampler based on finite dimensional optimal filters is described. An improved version of this algorithm is also presented. Two problems of great practical interest in signal processing are addressed: blind deconvolution of Bernoulli-Gauss processes and blind equalization of a channel. In simulations, we obtain very satisfactory results.
Title: PERFORMANCE ANALYSIS OF A WAVELET BASED WBCAF METHOD FOR TIME DELAY AND DOPPLER STRETCH ESTIMATION X. X. Niu P. C. Ching Dept. of Electronic Engineering, Chinese University of Hong Kong, Hong Kong Tel: (852) 2609 8275 Fax: (852) 2603 5558 Email: xxniu@ee.cuhk.edu.hk pcching@ee.cuhk.edu.hk Y. T. Chan Dept. of Electrical Engineering, Royal Military College of Canada, Canada Abstract: A wavelet based method for time delay and Doppler stretch estimation has been proposed. It makes use of the relationship between the wideband cross ambiguity function (WBCAF) and the cross wavelet transform of the received signals. This paper derives the Cramer-Rao lower bound (CRLB) and analyses the performance of the algorithm. It is found that under high SNR, the method is asymptotically unbiased, and the variances of the estimation parameters are fairly close to the CRLB. Simulation results are given to corroborate the theoretical derivation.
A NEW METHOD FOR WAVELETS GENERATION. A. Martínez-Gonzalez, L. Ortiz-Balbuena, H. Perez-Meana, E. Sanchez-Sinencio* and J. C. Sanchez-García Universidad Autonoma Metropolitana Iztapalapa, Depart. of Electrical Engineering, CBI Division. Av. Michoacán y Purísima. Col. Vicentina, Iztapalapa. C.P. 09340 Mexico, D.F. Mexico. Tel: (525) 725 46 35; Fax: (525) 725 49 02. e-mail: leob@xanum.uam.mx * Texas A & M, Department of Electrical Engineering, College Station, Texas, U.S.A. Wavelet operators are very important in most practical applications. Implementations of these operators in software and in commercial DSP hardware are popular. We present an alternative hardware implementation of wavelet operators using mixed-mode signal techniques, that is, a judicious combination of analog and digital hardware implementations. The approach is general and can be applied to a number of wavelet types.
Trieste paper 074 1-D SAMPLED DATA RECOGNITION WITH AUGMENTED PROGRAMMED GRAMMAR P.M. Grant, D.T. Lin, J.M. Hannah and R.D. Pringle Department of Electrical Engineering, University of Edinburgh, Edinburgh, EH9 3JL, Scotland Tel: +44 131 650 5569; fax: +44 131 650 6554; email pmg@ee.ed.ac.uk ABSTRACT This syntactic parser for pattern recognition uses a descriptive grammar to test whether data samples fall within an expected shape or envelope. The construction of this recogniser, which is based on an augmented programmed grammar, is described and its recognition statistics are simulated on irregularly sampled pattern waveforms. It is shown to be able to correctly recognise 1-D waveforms with a wide range of sizes or scale factors, within a single grammatical representation.
ORDER DETERMINATION OF STATE SPACE SYSTEMS Anthony G. Place and Gregory H. Allen Electrical and Computer Engineering Department James Cook University of North Queensland Queensland Australia Tel: +61 7 814299; fax: +61 7 251348 e-mail: Anthony.Place@jcu.edu.au and Gregory.Allen@jcu.edu.au Recent techniques proposed for the identification of state space models have focused on using the singular value decomposition of block Hankel input-output matrices. In these procedures the order of the system is determined by examining the singular values and identifying the separation between the "signal" and "noise" subspaces. Order determination of state space systems requires an understanding of what singular value magnitudes are expected. This paper examines how system structure and noise levels affect the magnitude of singular values. An order selection criterion formed from the AIC and MDL is also examined.
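The idea of reading the system order off the singular values can be illustrated with a much simpler setting than the block Hankel input-output matrices of the abstract. The sketch below is an assumption-laden toy: it builds a plain Hankel matrix from the impulse response of a hypothetical noise-free second-order system and counts the singular values above a relative threshold. The function name `estimate_order`, the threshold `rel_tol` and the example poles are illustrative choices, not taken from the paper.

```python
import numpy as np

def estimate_order(markov, rows, rel_tol=1e-8):
    """Count the singular values of a Hankel matrix built from an impulse
    response that exceed rel_tol times the largest singular value."""
    cols = len(markov) - rows + 1
    H = np.array([[markov[i + j] for j in range(cols)] for i in range(rows)])
    s = np.linalg.svd(H, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0])), s

# Hypothetical noise-free second-order system with poles at 0.9 and -0.5
h = [0.9 ** k + (-0.5) ** k for k in range(1, 21)]
order, sing_vals = estimate_order(h, rows=10)
```

In this noise-free case only two singular values are significant. With noisy data the gap between the two groups narrows, which is the difficulty the abstract addresses.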
THE BEST ORDER OF LONG AUTOREGRESSIVE MODELS FOR MOVING AVERAGE ESTIMATION P.M.T. Broersen Department of Applied Physics, Delft University of Technology P.O.Box 5046, 2600 GA Delft, The Netherlands phone + 31 15 278 6419, fax + 31 15 278 4263, email broersen@tn.tudelft.nl ABSTRACT Durbin's method for Moving Average (MA) estimation uses the estimated parameters of a long AutoRegressive (AR) model to compute the desired MA parameters. The theoretical order for that long AR model is infinity, but very high AR orders lead to inaccurate MA models in finite-sample practice. A new theoretical argument is presented to derive an expression for the best finite long AR order for a known MA process and a given sample size. Intermediate AR models of precisely that order produce the most accurate MA models. This new order differs from the best AR order to be used for prediction. An algorithm is presented that applies the theory for the best long AR order in known processes to data from an unknown process.
Title : A CLASS OF REAL-TIME AR IDENTIFICATION ALGORITHMS IN THE CASE OF MISSING OBSERVATIONS. Authors : Sina Mirsaidi and Jacques Oksman Affiliation : SUPELEC, Service des Mesures, Plateau de Moulon, 91192 Gif-sur-Yvette Cedex, FRANCE. Tel : (33) 1 69.85.12.12 Fax : (33) 1 69.85.12.34 E-mails : Mirsaidi@soleil.supelec.fr, Oksman@supelec.fr. Abstract : This paper deals with the problem of adaptive AR estimation from incomplete observations. The method is based on the optimization of a weighted squared error criterion. Various approximations of this criterion lead to different algorithms. Formal descriptions of these algorithms are given and their performances in stationary and non-stationary environments are compared.
UNSUPERVISED RESTORATION OF GENERALIZED MULTISENSOR HIDDEN MARKOV CHAINS Nathalie Giordana and Wojciech Pieczynski Departement Signal et Image Institut National des Telecommunications 9 rue Charles Fourier, 91000 Evry cedex France Tel: (33 1) 60764425; fax: (33 1) 60764433 e-mail: Nathalie.Giordana@int-evry.fr Wojciech.Pieczynski@int-evry.fr This work addresses the problem of generalized multisensor Hidden Markov Chain estimation with application to unsupervised restoration. A Hidden Markov Chain is said to be "generalized" when the exact nature of the noise components is not known; we assume, however, that each of them belongs to a finite known set of families of distributions. The observed process is a mixture of distributions and the problem of estimating such a "generalized" mixture thus contains a supplementary difficulty: one has to label, for each state and each sensor, the exact nature of the corresponding distribution. In this work we propose a general procedure with application to estimating generalized multisensor Hidden Markov Chains.
APPLICATION OF HIDDEN MARKOV MODELS TO BLIND CHANNEL ESTIMATION AND DATA DETECTION IN A GSM ENVIRONMENT Carles Antón-Haro, José A.R. Fonollosa and Javier R. Fonollosa. Dpt. of Signal Theory and Communications. Universitat Politècnica de Catalunya. c/ Gran Capità s/n. 08034 Barcelona (SPAIN) Tel: +34-3-4016454, Fax: +34-3-4016447, e-mail: carles@gps.tsc.upc.es In this paper, we present an algorithm based on Hidden Markov Model (HMM) theory to solve the problem of blind channel estimation and sequence detection in mobile digital communications. The environment in which the algorithm is tested is the Paneuropean Mobile Radio System, also known as GSM. In this system, a large part of each burst is devoted to a training sequence used to obtain a channel estimate. The algorithm presented does not require this sequence, which would imply an increase in system capacity. Performance, evaluated for standard test channels, is close to that of non-blind algorithms.
ESTIMATING PIECEWISE LINEAR MODELS USING COMBINATORIAL OPTIMIZATION TECHNIQUES Marco Mattavelli *, Edoardo Amaldi # * Signal Processing Laboratory, Swiss Federal Institute of Technology, CH-1015 Lausanne, Switzerland, Tel: +41 21 693 4807, E-mail: marco.mattavelli@lts.de.epfl.ch. # School of Operations Research and Center for Applied Mathematics, Cornell University, Ithaca, NY 14853, USA, E-mail amaldi@cs.cornell.edu. A wide range of image and signal processing problems have been formulated as ill-posed linear inverse problems. Due to the importance of discontinuities and non-stationarity, piecewise linear models are a natural step towards more realistic results. Although there have been some attempts to extend classical approaches to deal with discontinuities, finding at the same time the piecewise decomposition and the corresponding model parameters remains a major challenge. A new approach based on partitioning inconsistent linear systems into a minimum number of consistent subsystems MIN PCS is proposed for solving ill-posed problems whose formulation as linear inverse problems with discrete data fails to take into account discontinuities. In spite of the NP-hardness of MIN PCS, satisfactory approximate solutions can be obtained using simple but effective variants of an algorithm which has been extensively studied in the artificial neural network literature. Our approach presents various advantages compared to classical alternatives, including a wider range of applicability and a lower computational complexity.
STRUCTURED TOTAL LEAST SQUARES METHODS IN SIGNAL PROCESSING Philippe Lemmerling Sabine Van Huffel Bart De Moor Katholieke Universiteit Leuven philippe.lemmerling@esat.kuleuven.ac.be In many signal processing applications, one has to solve an overdetermined system of linear equations Ax=b. The Total Least Squares (TLS) method finds a Maximum Likelihood (ML) estimate of the parameter vector x when the noise on the entries of [A b] is i.i.d. Gaussian noise with zero mean and equal variance. In many applications, these last conditions do not hold because of the structure present in [A b]. Under those circumstances, the TLS will not yield an ML estimate of the parameter vector x since the SVD (which is the standard way to obtain the TLS solution) is not structure preserving. Therefore, several structured Total Least Squares methods have been developed in recent years: the Constrained Total Least Squares (CTLS) method, the Structured Total Least Squares (STLS) method and the Structured Total Least Norm (STLN) method. As opposed to the ordinary TLS, these methods yield an ML estimate of the parameter vector x by imposing the structure of the errors on [A b].
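As a point of reference for the structured variants, the ordinary (unstructured) TLS solution mentioned in the abstract can be sketched via the SVD of the augmented matrix [A b]. This is the textbook construction for a single right-hand side, not one of the CTLS, STLS or STLN methods; the function name `tls_solve` and the small test system are illustrative assumptions.

```python
import numpy as np

def tls_solve(A, b):
    """Classical (unstructured) TLS solution of A x ~ b via the SVD of [A b]."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    n = A.shape[1]
    # Rows of Vt are the right singular vectors of the augmented matrix [A b]
    _, _, Vt = np.linalg.svd(np.hstack([A, b]))
    v = Vt[-1]                      # vector for the smallest singular value
    if abs(v[n]) < 1e-12:
        raise ValueError("TLS solution does not exist (last component is zero)")
    return -v[:n] / v[n]

# Consistent, noise-free system: TLS recovers the true parameter vector
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
x_true = np.array([2.0, -3.0])
x_tls = tls_solve(A, A @ x_true)
```

The SVD step treats every entry of [A b] as independently perturbed, which is why it cannot preserve a Toeplitz or Hankel structure in the data.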
TITLE: BAYESIAN DECONVOLUTION OF CYCLOSTATIONARY PROCESSES BASED ON POINT PROCESSES AUTHORS: Christophe ANDRIEU - Patrick DUVAUT - Arnaud DOUCET AFFILIATION: ENSEA - ETIS Groupe Signal / 6 avenue du Ponceau 95014 Cergy Cedex France E-mail: andrieu@ensea.fr - duvaut@ensea.fr - douceta@ensea.fr ABSTRACT: In this paper we address the problem of the fully Bayesian deconvolution of a widespread class of processes, filtered point processes, whose underlying point process is a self-excited point process. To achieve this deconvolution, we use powerful stochastic algorithms, the Markov chain Monte Carlo (MCMC) methods, which despite their power have not yet been widely used in signal processing. We present in this paper an application to a particular class of weakly cyclostationary processes.
DIFFERENTIAL CEPSTRUM DEFINED ON INTERPOLATED SEQUENCES Damjan Zazula University of Maribor Faculty of Electrical Engineering and Computer Science Smetanova 17 2000 Maribor SLOVENIA Tel.: +386 62 221 112; fax: +386 62 225 013 E-mail: zazula@uni-mb.si The paper introduces a novel definition of the differential cepstrum. It is based on interpolation sequences in the frequency domain and exists also for singular signals with no spectral inverse. In addition, we show analytically and statistically that such a differential cepstrum exhibits lower cepstral aliasing when calculated with the DFT than the calculation without interpolation. On average, the improvement is 39 % in the case of interpolation to the half-intervals and 46 % in the case of the quarter-intervals.
MODULATION CLASSIFICATION -- AN UNIFIED VIEW Peter A.J. Nagy National Defence Research Establishment, Sweden P.O. Box 1165, S-581 11 Linkoping, Sweden E-mail: petna@lin.foa.se There are many research papers published in modulation classification, and most of them have a common framework. In this paper we will give an overview, and the paper contains four topics: 1) Some fundamental principles, 2) features used for classification, 3) the algorithm structure, and finally 4) a literature survey.
RECONSTRUCTION OF STRUCTURE AND TEXTURE OF PLANAR ENVIRONMENTS BY DYNAMIC VISION TECHNIQUES M. Cossi, G.M. Cortelazzo, R. Frezza D.E.I., University of Padova via Gradenigo 6/a, 35131 Padova, Italy Tel. +39 49 8277825; fax: +39 49 8277826 e-mail: frezza@dei.unipd.it ABSTRACT This work is concerned with the estimation of the structure and texture of buildings from a video sequence. The goal includes the recovery of metric information. The results could conceivably be used for many purposes ranging from photogrammetric applications to CAD models that could be applied, for example, for virtual visits of sites of artistic and historical significance. We present an original algorithm to estimate both structure and texture of environments composed of planes, like the interiors of most buildings. From a video sequence of a decorated wall the algorithm computes a plane that approximates the wall (structure estimation) and composes a mosaic of the single images to reproduce the decoration (texture estimation). The data are organized so that it is possible to observe the wall from an arbitrary point of view.
ALGORITHMS AND SYSTEMS FOR MODELING MOVING SCENES V. Michael Bove, Jr. Media Laboratory, Massachusetts Institute of Technology Room E15-324, 20 Ames Street, Cambridge MA 02139 USA vmb@media.mit.edu, http://www.media.mit.edu/~vmb/ In this paper I describe the application of machine-vision techniques to video coding in order to create what my research group calls object-oriented television, where moving scenes are represented in terms of objects (as recovered by analysis methods). Beyond data compactness, such a representation offers the ability to add new degrees of freedom to content creation and display. I discuss some of the scene analysis problems (particularly 2-D and 3-D model-fitting and object segmentation) and the algorithmic approaches my group has taken to solve them; suggest computational strategies for compact, powerful, programmable decoding hardware (particularly stream-based computing combined with automatic resource management); and demonstrate some of the applications we have developed.
REGION-BASED IMAGE ANNOTATION USING COLOR AND TEXTURE CUES Eli Saber and A. Murat Tekalp Xerox Corporation, 435 W. Commercial St., East Rochester, NY 14445, saber@roch803.mc.xerox.com We present algorithms for automatic image annotation and retrieval based on pixel-based color, and block- or region-based texture features. Region formation has been accomplished by utilizing Gibbs random fields or morphological based operations. Color, and texture indexing may be knowledge-based (using appropriate training sets) or by example. The algorithms are designed to: i) offer the user a wide range of options and flexibilities in order to enhance the outcome of the search and retrieval operations, and ii) provide a compromise between accuracy and computational complexity.
ORIENTATION RADIOGRAMS FOR INDEXING AND IDENTIFICATION IN IMAGE DATABASES S. Michel (1), B. Karoubi (2), J. Bigun (1) and S. Corsini (3) (1) Signal Processing Laboratory, Swiss Federal Institute of Technology, CH-1015 Lausanne, Switzerland. (2) CREATIS, Research Center Associated to CNRS (#1216) and Affiliated to INSERM, Lyon, France. (3) Bibliotheque Cantonale et Universitaire Lausanne, CH-1015 Lausanne/Dorigny, Switzerland. mch@es1.siemens.ch karoubi@creatis.insa-lyon.fr joseph.bigun@epfl.ch Archival of images in databases, enabling further study with respect to their contents, is the focus of our attention. The major difficulties are i) the processing of a large number of images, and ii) that the steadily growing number of images increases the complexity of the pattern recognition problems to be solved. We propose orientation radiograms, to be used as image signatures for shape based queries. These are the projections of a set of orientation decomposed images (here 6) to axes whose directions change synchronously with the orientation bands at hand. The peaks in the radiograms represent long edges or lines, which are important for humans when they recognize or compare images. We present the results of experiments based on approximately 400 images in an application concerning typographic ornament images. A comparative study comprising classical moment invariants is also presented.
DIGITAL WATERMARKS FOR AUDIO SIGNALS Laurence Boney Departement Signal ENST Paris, France 75634 email: boney@email.enst.fr Ahmed H. Tewfik and Khaled N. Hamdy Department of Electrical Engineering University of Minnesota Minneapolis, MN 55455 email: tewfik@ee.umn.edu, khamdy@ee.umn.edu In this paper, we present a novel technique for embedding "digital watermarks" into digital audio signals. Watermarking is a technique used to label digital media by hiding copyright or other information in the underlying data. The watermark must be imperceptible and should be robust to attacks and other types of distortion. In addition, the watermark should be undetectable by all users except the author of the piece. In our method, the watermark is generated by filtering a PN-sequence with a filter that approximates the frequency masking characteristics of the human auditory system (HAS). It is then weighted in the time domain to account for temporal masking. We discuss the detection of the watermark and assess the robustness of our watermarking approach to attacks and various signal manipulations.
EMBEDDING PARAMETRIC DIGITAL SIGNATURES IN IMAGES Adrian G. Bors and Ioannis Pitas Department of Informatics, University of Thessaloniki, Thessaloniki 540 06, Greece, E-mail: adrian@zeus.csd.auth.gr, pitas@zeus.csd.auth.gr A new approach to digital image signatures (watermarks) is proposed in this study. An image signature algorithm consists of two stages: signature casting and signature detection. In the first stage, small changes are embedded in the image, which are afterwards identified in the second stage. After choosing certain pixel blocks from the image, a constraint is embedded among their Discrete Cosine Transform (DCT) coefficients. Two different embedding rules are proposed. The first one employs a linear type constraint among the selected DCT coefficients and the second assigns circular detection regions, similar to vector quantization techniques. The resistance of the digital signature to JPEG compression and to filtering is analyzed.
A NEW SPEECH SCRAMBLING METHOD: COMPARATIVE ANALYSIS AND A FAST ALGORITHM V. D. Delic, V. Senk, and V. S. Milosevic University of Novi Sad, Faculty of Technical Sciences, Trg Dositeja Obradovica 6, 21000 Novi Sad, Yugoslavia Tel: (381 21) 350 244; fax: (381 21) 59 449 e-mail: tlk_delic@uns.ns.ac.yu ABSTRACT: The conventional speech scrambling concept is based on permutation of time segments and/or frequency subbands. Although this approach is regarded as an insecure speech encryption method, almost all published scramblers are of that type. We found that a linear combination based on Hadamard matrices instead of conventional permutation gives better cryptographic performance, maintaining all the good features of the scrambling concept. The new scrambling method provides a large keyspace and a simpler key selection. It attains negligible residual intelligibility and a higher degree of cryptanalytic immunity. The price of these improvements is a potential complexity increase. That is why we designed a fast algorithm for the new scrambling method.
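The abstract does not describe the authors' fast algorithm. One standard way to apply a Hadamard matrix cheaply is the fast Walsh-Hadamard transform, which needs N log2 N additions instead of N squared. The sketch below shows only that generic building block on a hypothetical block of eight samples. No key is involved, so it is not a scrambler in itself, and the function name `fwht` is an illustrative choice.

```python
def fwht(x):
    """Fast Walsh-Hadamard transform (unnormalised) of a sequence whose
    length is a power of two; returns a new list."""
    x = list(x)
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b   # butterfly
        h *= 2
    return x

segment = [1, 2, 3, 4, 5, 6, 7, 8]          # hypothetical block of samples
mixed = fwht(segment)                        # every output mixes all inputs
recovered = [v / len(segment) for v in fwht(mixed)]
```

Because the Hadamard matrix H satisfies H H = N I, applying the transform twice and dividing by N restores the block, so the inverse costs the same as the forward pass.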
LINEAR FILTERING AND IRREGULAR SAMPLING R.J.Martin GEC Hirst Research Centre, Elstree Way, Borehamwood, Herts WD6 1RX, UK R.Martin@hirst.gmmt.gecm.com We show how to suppress coloured noise by subtraction (rather than convolution). The method generalises to nonuniform sampling. It can also be used for identifying narrow-band signals in noisy backgrounds.
MULTIRESOLUTION ANALYSIS USING ORTHOGONAL POLYNOMIAL APPROXIMATION Rupendra Kumar and Pradip Sircar (Corresponding author. email: sircar@iitk.ernet.in) Department of Electrical Engineering Indian Institute of Technology Kanpur KANPUR 208 016, INDIA Multiresolution decomposition of signals has been conventionally carried out by the wavelet representation. In this paper, the orthogonal polynomial approximation has been employed for multiresolution analysis. It is demonstrated that the proposed technique based on polynomial approximation has certain distinct advantages over the conventional method employing wavelet representation.
PERFORMANCE EVALUATION OF D-ALPHA FILTERS M. TABIZA, PH. BOLON LAMII/CESALP, Université de Savoie B.P. 806 - F.74016 Annecy Cedex, France (CNRS G1047 Information-Signal-Image) e-mail: bolon@univ-savoie.fr; tabiza@esia.univ-savoie.fr We study the output variance of a class of nonlinear filters, called dα-filters. In general, it is impossible to obtain an explicit expression of the output variance because of the implicit input/output relationship, except for α=1 (median filter), α=2 (mean filter) and α=∞ (midrange filter). In this paper, we develop a new approach to the computation of the filter output variance. It is based on a linearisation of the filter output about the expected values of the order statistics. This approximation is valid for α > 1. It allows optimal α-values to be computed. Experimental results are presented. They are compared with those of L-filters and with theoretical lower bounds (Bhattacharyya system of lower bounds).
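The implicit input/output relationship can be made concrete under one assumption the abstract does not spell out: that the filter output for a window is the value y minimising the sum of |x_i - y| raised to the power alpha. This is consistent with the three special cases named above. The sketch below finds that minimiser by ternary search, which is valid because the cost is convex for alpha >= 1. The function name, the parameter `a` (standing for alpha) and the sample window are illustrative.

```python
def d_alpha_output(window, a, tol=1e-9):
    """Value y minimising sum(|x - y|**a) over the window, for a >= 1.
    Assumed definition of the filter output; solved by ternary search."""
    lo, hi = min(window), max(window)

    def cost(y):
        return sum(abs(x - y) ** a for x in window)

    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if cost(m1) <= cost(m2):
            hi = m2          # a minimiser lies in [lo, m2]
        else:
            lo = m1          # a minimiser lies in [m1, hi]
    return (lo + hi) / 2

window = [1.0, 2.0, 3.0, 4.0, 10.0]
y_median = d_alpha_output(window, 1)   # close to the median, 3
y_mean = d_alpha_output(window, 2)     # close to the mean, 4
```

For intermediate or large exponents no closed form is available, so the output has to be computed numerically in this way. That is the obstacle to an explicit variance expression.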
NONLINEAR DYNAMICS OF BANDPASS SIGMA-DELTA MODULATION Orla Feely and David Fitzgerald Department of Electronic and Electrical Engineering University College Dublin Dublin 4, Ireland tel: +353-1-706 1852 fax: +353-1-283 0921 e-mail: Orla.Feely@ucd.ie ABSTRACT Much research attention in recent years has been focussed on the subject of oversampled analogue-to-digital and digital-to-analogue conversion, based on the principle of sigma-delta modulation. Theoretical analysis of these conversion methods has been complicated by their nonlinear nature, precluding the application of standard linear circuit analysis methods. In recent years a number of researchers have undertaken a study of sigma-delta modulation based on nonlinear methods. This paper summarises the results that have been obtained by this study in the case of bandpass sigma-delta modulation, and shows how these results can be extended to handle certain circuit nonidealities.
ELIMINATION OF LIMIT CYCLES IN A DIRECT FORM DELTA OPERATOR FILTER Juha Kauraniemi Timo I. Laakso Laboratory of Signal Processing and Computer Technology Institute of Radiocommunications Helsinki University of Technology Otakaari 5 A FIN-02150 Espoo Finland Email: Juha.Kauraniemi@hut.fi School of Electronic and Manufacturing System Engineering University of Westminster 115 New Cavendish Street London W1M 8JS United Kingdom Email: laaksot@cmsa.westminster.ac.uk Delta operator realizations have been found to be robust against roundoff errors when a high sampling rate relative to the signal bandwidth is used. In this paper zero input limit cycles in the transposed direct form delta operator structure are studied. It is shown that the limit cycles of the basic delta structure are much lower in amplitude than those of the direct form delay structure for narrowband lowpass filters. Moreover, by certain modifications to the delta operator the zero input limit cycles can be completely avoided. It is also shown that narrowband lowpass filters with both low roundoff noise and absence of limit cycles can be implemented.
ILL-CONDITIONING OF NON-MINIMUM PHASE SYSTEMS S. Hashemi & J. K. Hammond Institute of Sound and Vibration Research (ISVR), University of Southampton ABSTRACT The typical inverse problem is the recovery of the input, x, given the data, y, and knowledge of the system, A. Such problems occur frequently in instrumental science. For Linear Time Invariant (LTI) systems the governing equation can be expressed in matrix form, y=Ax. In this paper the ill-conditioning of non-minimum phase systems and the relation of the phase structure of the system to the singular values of its system matrix are discussed.
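A small numerical experiment illustrates the link between phase structure and conditioning in y=Ax. The sketch below is an illustrative toy, not the analysis of the paper. It forms the lower-triangular Toeplitz (convolution) matrix A for two hypothetical two-tap FIR systems, one with its zero inside the unit circle and one with its zero outside, and compares their condition numbers.

```python
import numpy as np

def convolution_matrix(h, n):
    """n-by-n lower-triangular Toeplitz matrix A such that y = A x gives the
    first n samples of the convolution of h with x."""
    A = np.zeros((n, n))
    for k, hk in enumerate(h):
        A += hk * np.eye(n, k=-k)    # place tap k on the k-th subdiagonal
    return A

n = 20
A_min = convolution_matrix([1.0, -0.5], n)   # zero at z = 0.5 (minimum phase)
A_non = convolution_matrix([1.0, -2.0], n)   # zero at z = 2.0 (non-minimum phase)

cond_min = np.linalg.cond(A_min)   # ratio of largest to smallest singular value
cond_non = np.linalg.cond(A_non)
```

The minimum phase matrix stays well conditioned as n grows, whereas the smallest singular value of the non-minimum phase matrix shrinks roughly like 2 to the power -n, so recovering x from y amplifies noise strongly.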
FLEXIBLE NONUNIFORM FILTER BANKS USING ALLPASS TRANSFORMATION OF MULTIPLE ORDER M. Kappelan, B. Strauss, P. Vary Institute of Communication Systems and Data Processing (IND) RWTH Aachen, University of Technology D-52056 Aachen, Germany Tel: +49 (0)241 80 6959; Fax: +49 (0)241 8888 186 e-mail: kiwi@ind.rwth-aachen.de This paper deals with allpass frequency transformations of uniform filter banks to achieve nonuniform bandwidths. The known transformation with an allpass of first order is extended to an allpass transformation of order K. Thus the flexibility of the filter bank design can be increased significantly.
ELIMINATION OF CLICKS AND BACKGROUND NOISE FROM ARCHIVE GRAMOPHONE RECORDINGS USING THE "TWO TRACK MONO" APPROACH Maciej NIEDZWIECKI Faculty of Electronics Department of Automatic Control, Technical University of Gdansk ul. Narutowicza 11/12, Gdansk, Poland Tel: + 48 58 472519; fax +48 58 415821 e-mail: maciekn@sunrise.pg.gda.pl Old gramophone recordings are corrupted with wideband noise (granulation noise) and impulsive disturbances (clicks, pops, record scratches), both caused by aging and/or mishandling of the vinyl material. The paper presents an improved method of gramophone noise reduction which makes use of the two signals obtained when a mono record is played back using stereo equipment.
EFFICIENT ALLOCATION OF POWER-OF-TWO TERMS IN COMPLEX FIR FILTER DESIGN Tolga Ciloglu* and Yong Hoon Lee** *Dept. of Electrical and Electronics Eng., Middle East Tech. Univ., Ankara, 06531, Turkey e-mail: ciltolga@rorqual.cc.metu.edu.tr **Dept. of Electrical Eng., Korea Advanced Institute of Science and Technology, Taejon, Korea e-mail: yohlee@eekaist.kaist.ac.kr Abstract The design of discrete coefficient FIR filters with arbitrary magnitude and phase specifications, whose coefficients are expressed as a combination of a few signed power-of-two (SPT) terms, is considered. The total number of SPT terms is fixed and their distribution among the coefficients is not restricted. The proposed method is an improved version of those originally proposed for the design of linear phase filters [9], [10].
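To make the SPT representation concrete, the sketch below approximates a single real coefficient by a few signed power-of-two terms using a simple greedy rule. It is only an illustration of the number format. It does not implement the allocation of a fixed term budget across coefficients that the paper proposes, and the function name, the `min_exp` word-length limit and the sample coefficient are assumptions.

```python
import math

def spt_approx(c, n_terms, min_exp=-10):
    """Greedily approximate c by at most n_terms signed powers of two.
    Returns the list of (sign, exponent) terms and the value they sum to."""
    terms = []
    residual = c
    for _ in range(n_terms):
        if abs(residual) < 2.0 ** (min_exp - 1):
            break                      # residual is below the finest step
        # power of two nearest to |residual| in the log domain
        e = max(min_exp, round(math.log2(abs(residual))))
        sign = 1 if residual > 0 else -1
        terms.append((sign, e))
        residual -= sign * 2.0 ** e
    return terms, c - residual

terms, value = spt_approx(0.40625, n_terms=3)
# terms == [(1, -1), (-1, -3), (1, -5)], i.e. 2**-1 - 2**-3 + 2**-5
```

Each term costs one shift and one add or subtract in hardware, which is why the total number of terms, rather than the word length, is the quantity being budgeted.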
CHEBYSHEV DESIGN OF FIR FILTERS WITH ARBITRARY MAGNITUDE AND PHASE RESPONSES Mathias Lang INTHFT, Vienna University of Technology Gusshausstrasse 25/389, A-1040 Vienna, Austria Tel: +43 1 58801 3527; fax: +43 1 587 05 83 e-mail: mlang@neptun.nt.tuwien.ac.at This paper presents a method for the design of nonlinear phase FIR digital filters with complex or real-valued coefficients using the Chebyshev error criterion. Three different problems are considered: Complex Chebyshev approximation with additional weighting of the resulting magnitude and phase errors, simultaneous Chebyshev approximation of a given magnitude and phase response, and simultaneous Chebyshev approximation of a given magnitude and group delay response. A linearization approach leads to a problem formulation that allows the use of stable algorithms with guaranteed convergence. It is shown that for this linear approach the simultaneous Chebyshev approximation of a desired magnitude and phase response is a special case of complex Chebyshev approximation with independent weighting of the magnitude and phase errors. Two existing design methods are included in this method as special cases.
DESIGNING OF ROBUST STABLE DIGITAL FILTERS Mariusz Ziolko Institute of Electronics AGH ul.Czarnowiejska 78, 30-054 Krakow, Poland Tel: + 48 12 173048; fax: +48 12 332398 e-mail: ziolko@uci.agh.edu.pl The Ackerman-Barmish method is used to establish a family of stable Infinite Impulse Response (IIR) digital filters. Next, an optimization method is used to choose a filter which meets design specifications given in the frequency domain. The design of a third-order lowpass IIR filter is presented as an example.
CEPSTRAL SYNTHESIS OF MINIMUM-PHASE FIR AND IIR DIGITAL FILTERS P. Nagel Department of Electrical Engineering, University of Kaiserslautern A new technique for designing causal and minimum-phase FIR and IIR digital filters is presented. Here, the deviation from a desired quefrency response is minimised using the Fletcher-Powell algorithm. As a consequence, this leads to an optimisation of both log-magnitude response and phase response. Therefore, the method is of special interest for both equalisers and allpasses. It works with real parameters which represent the poles and zeros of the system.
ARMA MODEL IDENTIFICATION USING HIGHER ORDER STATISTICS AND FISHER INFORMATION CONCEPTS Eric LE CARPENTIER and Jean-Luc VUATTOUX Laboratoire d'Automatique de Nantes, URA C.N.R.S. 823, Ecole Centrale de Nantes/Universite de Nantes, 1 rue de la Noe, 44072 Nantes cedex 03, France. Tel: (33) 40 37 16 46. Fax: (33) 40 37 25 22 e-mail: lecarpentier@lan.ec-nantes.fr The problem of estimating the parameters of a non causal ARMA system, driven by an unknown input noise with unknown symmetrical probability density function (PDF) is addressed. A maximum likelihood approach is proposed in this paper. The main idea of our approach is that the assumed PDF of the input noise is the PDF minimizing the Fisher information among PDFs matching the estimated cumulants of 2nd and 4th order. This minimization problem is hard to solve, so we use an over-parameterized PDF model, which is a Gaussian mixture. We obtain two different models for the classes of sub-Gaussian and super-Gaussian PDFs. For this latter class, we get the most robust estimator in Huber's sense, among those generated by this class. A new parameter estimation method is given and its robustness and optimality properties are detailed. The performances of the resulting identification scheme are compared to those of another higher order method.
ARMA Parameter Estimation Through Enhanced Double MA Modelling Achilleas G. Stogioglou and Stephen McLaughlin Signals and Systems Group, Department of Electrical Engineering, The University of Edinburgh ABSTRACT This paper considers the application of MA cumulant enhancement to the identification of the parameters of a causal nonminimum phase ARMA(p, q) system which is excited by an unobservable independent identically distributed (IID) non-Gaussian process. The method proposed in this paper is based on the double MA method of [1]. The cumulant enhancement is used to improve the cumulants of the two intermediate MA models which result from the decomposition of the original ARMA(p, q) model. Simulation results are presented to demonstrate the effects of cumulant enhancement on the estimated ARMA parameters.
DETECTION AND CLASSIFICATION OF NOISY AR AND ARMA PROCESSES Jean-Yves TOURNERET, Karine VAREILLE and Martial COULON ENSEEIHT/GAPSE, National Polytechnics Institute of Toulouse 2 rue Camichel, 31071 Toulouse, France email: tournere@len7.enseeiht.fr The paper focuses on the detection and the classification of noisy AR and ARMA processes. These two kinds of processes cannot be distinguished by means of their second-order statistics, since they are Spectrally Equivalent (SE). Higher-order statistics are shown to be an efficient tool for their detection. A Neyman-Pearson (NP) test, based on these higher-order statistics, is then studied. The performance of the NP test provides a reference for comparing suboptimal detector performances. 
HIGHER ORDER DETECTION TEST FOR DETERMINISTIC SIGNALS Claire Chichereau, Bruno Flament, Roland Blanpain LETI (CEA-Technologies Avancees) DSYS CEA - Grenoble - 17, rue des Martyrs 38054 Grenoble Cedex 9 - France Tel: +33 76 88 95 42; fax +33 76 88 51 59 e-mail: chichereau@dsys.ceng.cea.fr In the context of electromagnetic signals, we want to detect a transient in nonstationary Gaussian noise by a higher order statistic test. In this paper, we use a new formalism (an extension of Gardner's work) that enables us to evaluate theoretically the response of a higher order statistic test for detection. We develop the theoretical grounds and we prove that a higher order statistic detection test provides a very short detection delay. We apply our methods to the simulation of a simple and typical example: the kurtosis.
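A kurtosis-based detector of the kind named at the end of the abstract can be sketched as a sliding-window statistic. The example below is a deliberately simple toy and does not reproduce the authors' formalism: it computes the sample excess kurtosis over short windows of a deterministic alternating background (chosen so the numbers are easy to check, not because it is Gaussian) into which one large sample is inserted as a stand-in transient. The function names and the window width are illustrative.

```python
def excess_kurtosis(samples):
    """Sample excess kurtosis m4 / m2**2 - 3 (zero for a Gaussian law)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / (m2 * m2) - 3.0

def sliding_kurtosis(signal, width):
    """Excess kurtosis of every length-`width` window of the signal."""
    return [excess_kurtosis(signal[i:i + width])
            for i in range(len(signal) - width + 1)]

background = [1.0, -1.0] * 10        # toy background, excess kurtosis -2
with_transient = background[:]
with_transient[10] = 12.0            # single large sample as a "transient"
stat = sliding_kurtosis(with_transient, width=8)
```

The statistic jumps from -2 to a positive value in the first window that contains the large sample, which illustrates why such a test can react with a short delay.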
DETERMINING THE FALSE-ALARM PERFORMANCE OF HOS-BASED QUADRATIC PHASE COUPLING DETECTORS J W A Fackrell and S McLaughlin Department of Electrical Engineering, University of Edinburgh, UK jwaf@ee.ed.ac.uk Quadratic Phase Coupling (QPC) can be detected using Higher Order Statistics (HOS) measures. Previously, the bispectrum, biphase and bicoherence have been used as components in two QPC-detection algorithms. In this paper it is shown that the expressions which describe these detectors reduce to the same form for the white Gaussian noise case. The performance of these detectors is discussed, and particular attention is given to false alarms, which occur when QPC is detected in signals which do not exhibit QPC. A simple expression is derived which gives the probability of false alarm (PFA) for QPC detectors. This expression shows how the PFA increases as the Signal to Noise Ratio decreases, a relationship which is also observed in a simulation example.
LINEAR TIME-VARIANT PROCESSING OF HIGHER-ORDER ALMOST-PERIODICALLY CORRELATED TIME-SERIES Luciano Izzo Antonio Napolitano Universita di Napoli Federico II, Dipartimento di Ingegneria Elettronica via Claudio 21, I-80125 Napoli, Italy; Tel: +39-81-7683156; Fax: +39-81-7683149 E-mail: izzo@nadis.dis.unina.it The characterization and linear time-variant processing of the higher-order almost-periodically correlated time-series in the fraction-of-time probability framework are considered. At first, the characterization in the temporal domain is presented by exploiting the expression of the temporal moment function as a sum of complex sinusoids whose amplitudes and frequencies are continuous functions of the lag vector. Then, the characterization in the frequency domain is considered. Finally, for both random and nonrandom linear systems, the input/output relationships in terms of generalized cyclic temporal moment functions and generalized cyclic spectral moment functions are stated. As special cases, linear almost-periodically time-variant systems as well as systems performing time-scale changing are also treated.
TITLE : A HIGHER-ORDER CUMULANT BASED DOA ESTIMATION ALGORITHM PAPER IDENTIFICATION NUMBER : 384 AUTHOR(S) : W.K.Lai and P.C.Ching AFFILIATION : Department of Electronic Engineering The Chinese University of Hong Kong, N.T., Hong Kong Tel: (852) 2609 8266; fax : (852) 2603 5558 E-MAIL : wklai@ee.cuhk.edu.hk ABSTRACT Most of the existing direction-of-arrival estimation algorithms depend on decomposition of the covariance matrix of the system, which in turn requires modeling of the contaminating noise. In this paper, a higher-order cumulant based algorithm for estimating the direction-of-arrival of m narrowband far field sources impinging on an array with n uniformly spaced sensors is proposed. Due to the unique property of higher order cumulants, the proposed method is shown to be, at least theoretically, independent of the additive Gaussian noise. The algorithm first evaluates the 2r-th order cumulant from the output of the system. By making use of these output cumulants, we obtain a new vector whose elements are the coefficients of an equation whose roots are the DOA of the sources. The validity of the algorithm is demonstrated by extensive computer simulations.
TITLE : SOME PROPERTIES AND ALGORITHMS FOR FOURTH ORDER SPECTRAL ANALYSIS OF COMPLEX SIGNALS AUTHORS : Cecile HUET and Joel LE ROUX I3S, University of Nice Sophia Antipolis - CNRS 250 rue Albert Einstein Sophia Antipolis 06560 Valbonne FRANCE Tel: +33 92.94.26.82; fax: +33 92.94.28.96 e-mail: huet@alto.unice.fr - leroux@alto.unice.fr ABSTRACT Some algorithms for linear system identification based on fourth order spectra are given. They extend algorithms developed in the case of third order statistics. We also give a method for phase unwrapping for fourth order spectra and we establish a link between algorithms based on kurtosis maximization and identification method in the frequency domain. Keywords : higher order statistics (HOS) - higher order spectra - blind identification - kurtosis maximization - phase unwrapping.
STATIONARY MOMENTS OF A POLYNOMIAL PHASE SIGNAL, APPLICATION TO PARAMETER ESTIMATION A. Ferrari, C. Theys and G. Alengrin I3S Université de Nice-Sophia Antipolis 41, Bd Napoléon III - 06041 NICE cedex - FRANCE e-mail : ferrari@unice.fr This communication addresses the problem of estimating the parameters of a polynomial phase signal using an original approach: although this signal is clearly non stationary, some of its high order moments are shift invariant. The condition verified by the delays of these "stationary" moments is derived in the noiseless and noisy case. It is demonstrated that the only identifiable phase parameter is the highest order coefficient, the estimation requiring moments of order at least double the phase degree. An algorithm relying on these high order moments is derived and its performance is presented and compared to a recent algorithm.
EXTENDED SPECTRAL SUBTRACTION Pavel Sovka & Petr Pollak & Jan Kybic Czech Technical University, Faculty of Electrical Engineering CTU FEL K331, Technicka 2, 166 27 Praha 6, Czech Republic Tel: (+42 2) 2435 2291 Fax: (+42 2) 2431 0784 E-mail: [sovka,pollak]@feld.cvut.cz This paper describes a new method for a one channel noise suppression system which overcomes the typical disadvantage of one channel noise suppression algorithms - the impossibility of noise estimation during speech sequences. Our method is a combination of Wiener filtering and spectral subtraction. The noise estimate can be successfully updated even during speech sequences, which is why there is no need for a voice activity detector.
NOISE REDUCTION OF SPEECH SIGNALS USING THE RANK-REVEALING ULLV DECOMPOSITION Peter S. K. Hansen, Per Christian Hansen(1), Steffen Duus Hansen and John Aasted Sorensen Department of Mathematical Modelling, Section for Digital Signal Processing Technical University of Denmark, DK-2800 Lyngby, Denmark E-mail: pskh@imm.dtu.dk, sdh@imm.dtu.dk and jaas@imm.dtu.dk (1)UNI-C, Technical University of Denmark, DK-2800 Lyngby, Denmark E-mail: Per.Christian.Hansen@uni-c.dk A recursive approach for nonparametric speech enhancement is developed. The underlying principle is to decompose the vector space of the noisy signal into a signal subspace and a noise subspace. Enhancement is performed by removing the noise subspace and estimating the clean signal from the remaining signal subspace. The decomposition is performed by applying the rank-revealing ULLV algorithm to the noisy signal. With this formulation, a prewhitening operation becomes an integral part of the algorithm. Linear estimation is performed using a proposed minimum variance estimator. Experiments indicate that the approximative method is able to achieve a satisfactory quality of the reconstructed speech signal comparable with eigenfilter based methods.
Speech Enhancement Using a Wiener Filtering Under Signal Presence Uncertainty A. AKBARI AZIRANI - R. LE BOUQUIN JEANNÈS - G. FAUCON Laboratoire du Traitement du Signal et de l'Image - Université de Rennes 1 Bât. 22 - Campus de Beaulieu - 35042 RENNES CEDEX - FRANCE Regine.Lebouquin@univ-rennes1.fr Abstract Noise reduction is a key point of speech enhancement systems in hands-free communications. A number of techniques have already been developed in the frequency domain, such as an optimal short-time spectral amplitude estimator proposed by Ephraim and Malah including the estimation of the a priori signal-to-noise ratio. This approach significantly reduces the disturbing noise and provides enhanced speech with colorless residual noise. In this paper, we propose a technique based on a Wiener filtering under uncertainty of signal presence in the noisy observation. Two different estimators of the a priori signal-to-noise ratio are tested and compared. The main interest of this approach comes from its low complexity.
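To make the role of the a priori signal-to-noise ratio in the abstract above concrete, the following minimal sketch shows the textbook Wiener gain and the well-known decision-directed a priori SNR estimate associated with Ephraim and Malah. It is an assumption-laden illustration only: the abstract does not say which two estimators were compared, the signal presence uncertainty weighting is not modelled here, and the function names and the smoothing constant `alpha` are hypothetical.

```python
import numpy as np

def wiener_gain(xi):
    """Wiener gain per frequency bin for an a priori SNR xi (linear scale)."""
    xi = np.asarray(xi, dtype=float)
    return xi / (1.0 + xi)

def decision_directed_snr(prev_clean_power, noisy_power, noise_power, alpha=0.98):
    """Decision-directed a priori SNR estimate for one frame.

    prev_clean_power: |estimated clean spectrum|^2 from the previous frame
    noisy_power:      |noisy spectrum|^2 of the current frame
    noise_power:      noise power estimate per bin
    alpha:            smoothing constant (assumed value)
    """
    posterior = np.asarray(noisy_power, dtype=float) / noise_power
    return (alpha * np.asarray(prev_clean_power, dtype=float) / noise_power
            + (1.0 - alpha) * np.maximum(posterior - 1.0, 0.0))

# Hypothetical single-bin example: noisy power 4, noise power 1, previous clean power 2.
xi = decision_directed_snr(prev_clean_power=2.0, noisy_power=4.0, noise_power=1.0)
gain = wiener_gain(xi)
```

The gain is applied to the noisy short-time spectrum bin by bin; a gain near 1 keeps bins dominated by speech and a gain near 0 attenuates bins dominated by noise.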
IMPROVED SPECTRAL SUBTRACTION FOR SPEECH ENHANCEMENT Y. Malca and D. Wulich Department of Electrical & Computer Engineering, Ben-Gurion University of the Negev. Beer-Sheva 84105, POB 635, Israel. Tel: ++972-7-461537, Fax: ++972-7-472949, e-mail: dov@bguee.bgu.ac.il ABSTRACT The spectral subtraction approach has become almost standard in speech enhancement because it is relatively easy to understand and implement. The major drawback of the spectral subtraction method is that it leaves residual noise with annoying, noticeable tonal characteristics, referred to as musical noise. For low SNR the perceived effect of the "musical noise" is close to that of the additive noise. In the present work we propose to reduce the musical noise by applying the output of a standard spectral subtractor to a constrained high order notch filter which suppresses the "musical noise". The filtration process distorts the speech signal. It is possible to reduce the level of distortion if the speech signal is preprocessed properly before it is contaminated by the noise. It will be demonstrated that the proposed method is superior to standard spectral subtraction, especially for low SNR. A comprehensive listening test indicated that for segmental SNR = -12 dB, 77% of the listeners strongly preferred the proposed approach over the usual spectral subtraction approach.
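For readers unfamiliar with the baseline, the sketch below shows a minimal magnitude spectral subtractor of the kind the abstract above calls standard. It is not the proposed notch-filter method. The non-overlapping rectangular frames, the spectral floor value, the noise estimate taken from leading noise-only frames, and the name `spectral_subtraction` are all simplifying assumptions made for illustration.

```python
import numpy as np

def spectral_subtraction(x, frame_len, noise_frames, floor=0.01):
    """Basic magnitude spectral subtraction on non-overlapping frames.

    The noise magnitude spectrum is averaged over the first `noise_frames`
    frames, which are assumed to contain noise only.
    """
    x = np.asarray(x, dtype=float)
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)
    noise_mag = mag[:noise_frames].mean(axis=0)
    # Subtract the noise estimate; the floor avoids negative magnitudes.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    # Resynthesise with the noisy phase.
    clean = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)
    return clean.reshape(-1)

# Hypothetical test signal: a tone that starts after 8 noise-only frames.
rng = np.random.default_rng(1)
n, frame_len = 8192, 256
t = np.arange(n)
tone = np.sin(2 * np.pi * 16 * t / frame_len)
tone[:2048] = 0.0
noisy = tone + 0.3 * rng.standard_normal(n)
enhanced = spectral_subtraction(noisy, frame_len, noise_frames=8)
```

The isolated spectral peaks that survive the subtraction in random bins from frame to frame are what is perceived as musical noise.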
A SINGLE MICROPHONE NOISE CANCELLER BASED ON ADAPTIVE KALMAN FILTER M. Gabrea, E. Mandridake and M. Najim Equipe Signal et Image, ENSERB and GDR-134, CNRS BP 99, 33 402 Talence, FRANCE email: najim@goelette.tsi.u-bordeaux.fr This paper deals with the problem of Adaptive Noise Cancellation (ANC) when only corrupted speech signal with an additive Gaussian white noise is available for processing. We propose a new method based on adaptive Kalman filtering. All the approaches based on the Kalman filter proposed in the past, in this context, operate in two steps: they first estimate the noise variance and the parameters of the signal model and secondly estimate the speech signal. The approach presented in this paper gives an alternative to these approaches since it does not require the estimation of the noise variance. The noise variance estimation is a part of the Kalman gain calculation. For optimizing the Kalman gain we have reformulated and adapted, to the single-microphone ANC problem, the approach proposed in control by R. K. Mehra.
TWO MICROPHONES SPEECH ENHANCEMENT SYSTEM BASED ON A DOUBLE FAST RECURSIVE LEAST SQUARES (DFRLS) ALGORITHM M. Gabrea*, E. Mandridake*, M. Menez+, M. Najim* and A. Vallauri++ * Equipe Signal et Image, ENSERB and GDR-134, CNRS BP 99, 33 402 Talence, France + LASSY-I3S Nice, France ++ Texas-Instruments, Villeneuve-Loubet, France email: limby@goelette.tsi.u-bordeaux.fr In this paper a symmetric feedback implementation scheme of a two-microphone speech enhancement system is presented. We consider the coupling systems modelled as linear time-invariant Finite Impulse Response (FIR) filters and propose a new recursive-based adaptive filter solution to enhance the noisy speech. The optimum filter weight adaptation is based on a Double Fast Recursive Least Squares (DFRLS) algorithm. This approach can be extended to a subclass of signal separation problems where the direct link is stronger than the interference link in both channels. A comparative study with other adaptive algorithms shows the superiority of the DFRLS in SNR performance improvement.
Signal Restoration of Broad Band Speech Using Nonlinear Processing Hiroshi Yasukawa NTT Optical Network Systems Labs. 1-2356 Take, Yokosuka, 238-03 Japan Tel: +81-468-59-3016; Fax: +81-468-55-1283 e-mail: yasukawa@exa.onlab.ntt.jp ABSTRACT This paper describes a new system that can enhance the quality of speech signals that have been severely band limited during transmission. We have already proposed a spectrum widening method that utilizes aliasing in sampling rate conversion with digital filtering for spectrum shaping. This paper proposes a quite simple method by adding spectrum in the higher band using nonlinear processing. Implementation procedures are clarified, and its performance is discussed. It is shown that the proposed method offers good performance in terms of spectrum distortion characteristics.
Adaptive Digital Filtering For Signal Reconstruction Using Spectrum Extrapolation Hiroshi Yasukawa NTT Optical Network Systems Labs. 1-2356 Take, Yokosuka, 238-03 Japan Tel: +81-468-59-3016; Fax: +81-468-55-1283 e-mail: yasukawa@exa.onlab.ntt.jp Abstract This paper describes adaptive filtering for signal reconstruction. The speech quality enhancement system by the spectrum extrapolation of the band limited signals is discussed. In telephone communication, the spectrum extrapolation which employs aliasing processing is widely known. In this paper a new implementation using adaptive methods is proposed. This method introduces frequency domain adaptive digital filtering to broaden band limited signals into wide band signals. Implementation of the system and its performance are discussed.
COMBINATION OF TWO-CHANNEL SPECTRAL SUBTRACTION AND ADAPTIVE WIENER POST-FILTERING FOR NOISE-REDUCTION AND DEREVERBERATION Matthias Doerbecker, Stefan Ernst Institute of Communication Systems and Data Processing, Aachen University of Technology, 52056 Aachen, Germany e-mail: matthias@ind.rwth-aachen.de In this contribution a novel structure for the enhancement of speech signals disturbed by acoustic noise is presented which is based on Spectral Subtraction. The Spectral Subtraction technique is combined with a novel estimator for the noise power spectrum which takes advantage of the employment of a second microphone. Due to the extension to a two-microphone system the Spectral Subtraction can be used to reduce realistic, non-stationary noise sources. Additionally, the performance of the system is further improved by the application of a post filter adapted according to Wiener filter techniques. As a result, the proposed speech enhancement system provides a significant suppression of noise in realistic situations as well as a reduction of room reverberation. 
LIP MOVEMENTS SYNTHESIS USING TIME DELAY NEURAL NETWORKS Sergio Curinga, Fabio Lavagetto, Fabio Vignoli D.I.S.T. - University of Genova Via Opera Pia 13A, 16145 GENOVA E-mail: sergio@dist.dist.unige.it Abstract A method exploiting the audio-visual correlation of speech in order to estimate the lip and mouth movements is presented. Its applications are in the field of aids and services for elderly people, in videotelephony, in cartoons and movie dubbing. Notice that lip movements synthesis does not imply speech recognition and that the mouth shape is not only specified by the phoneme currently uttered but it also depends on some past and future speech information. In order to take into account this temporal correlation, and considering the constraint of computational effectiveness, the Time Delay Neural Networks (TDNNs) seem to be the most appropriate analysis tool in comparison with methods like Markov Models, which are more resource consuming.
SPEECH SEGMENTATION USING MULTILEVEL HYBRID FILTERS Marcos Faundez, Francesc Vallverdu Department of Signal Theory and Communications UPC e-mail: marcos@gps.tsc.upc.es A novel approach for speech segmentation is proposed, based on Multilevel Hybrid Filters with the following features: - An accurate transition location - Good performance in noisy environments (gaussian and impulsive noise) The proposed method is based on spectral changes, with the goal of segmenting the voice into homogeneous acoustic segments. This algorithm is being used for phonetically segmented speech coder with successful results.
A BACKWARD-ADAPTIVE PERCEPTUAL AUDIO CODER Joao Manuel Rodrigues Ana Maria Tome Departamento de Electronica e Telecomunicacoes / INESC Universidade de Aveiro 3810 AVEIRO, PORTUGAL Tel: +351-34-370500; Fax: +351-34-370545 e-mail: jmr@inesca.pt This paper presents a new audio compression algorithm that includes a nonuniform filter bank, gain-adaptive logarithmic quantizers, arithmetic entropy coding and an explicit psychoacoustic model to adapt the quantization according to perceptual considerations. Unlike existing perceptual coders, the new system is backward-adaptive, i.e., adaptation depends exclusively on already quantized samples, not on the original signal. We discuss the advantages of backward adaptiveness and show that it can be successfully applied to perceptual coding.
Title : SAMPLE-BY-SAMPLE GAIN ADAPTIVE CELP CODING OF WIDEBAND AUDIO Authors : Man-Tak Chu and Cheung-Fat Chan Affiliation : Department of Electronic Engineering City University of Hong Kong 83, Tat Chee Avenue, Hong Kong email : eecfchan@cityu.edu.hk fax : (852) 27887791 ABSTRACT This paper presents a high quality wideband audio coder based on a low delay code excited linear predictive (LD-CELP) model where the excitation gain is adapted in a sample-by-sample manner. The proposed coder employs a backward adaptive predictor which introduces no extra delay to the system. A simple gain adaptive control is utilized to perform a sample-by-sample gain adaptive excitation model. In other words, the proposed coder exploits the advantages of the LD-CELP and ADPCM coding. This coder can provide transparent quality audio signals at a bitrate of 1.5 bits/sample.
SPLIT-BAND LD-CELP WIDEBAND SPEECH CODING AT 24 KBIT/S Andrea Santilli(*), Aurelio Uncini(**), Francesco Piazza(**) (*) AETHRA S.r.L. 60020 Palombina (AN), Italy (**) Dip. Elettronica ed Automatica, Univ. of Ancona, 60131 Ancona, Italy phone: +39 71 220 4453 fax: +39 71 220 4464 e-mail: upfm@eealab.unian.it Nowadays, 7 kHz wideband speech coding requires at least 48 kbit/s as it still depends on the ITU standard G.722. CELP coders have been developed for wideband systems, achieving high quality speech coding at rates from 16 kbit/s to 32 kbit/s, such as the wideband LD-CELP at 32 kbit/s. In this paper, a new split-band LD-CELP wideband coder at 24 kbit/s is proposed and its performance and complexity are compared with those of the already known wideband LD-CELP.
INNOVATION CODING WITH A CROSS-CORRELATED QUANTIZATION NOISE MODEL Soeren Vang Andersen, Morten Olesen, Soeren Holdt Jensen, and Egon Hansen CPK, Aalborg University, Fredrik Bajers Vej 7, DK-9220 Aalborg OEst, Denmark. E-mail: sva@cpk.auc.dk We present the use of a cross-correlated quantization noise model in the recently proposed Kalman innovation speech coding scheme. Computer simulations and informal listening tests indicate that the incorporation of a cross-correlated noise model yields an improvement in both SNR and perceptual quality when compared to an uncorrelated noise model.
MINIMUM CLASSIFICATION ERROR TRANSFORMATIONS FOR IMPROVING SPEECH RECOGNITION SYSTEMS Angel de la Torre, Antonio M. Peinado, Antonio J. Rubio, Jose C. Segura, Victoria E. Sanchez Dpto. de Electronica y Tecnologia de Computadores Universidad de Granada, 18071 GRANADA (Spain) e-mail atv@hal.ugr.es Signal representation is an important aspect to be taken into account for pattern classification. Recently, discriminative training methods have been applied to feature extraction for speech recognition. In this paper, we apply the Minimum Classification Error estimation to train the parameters of a feature extractor. This feature extractor is a linear transformation of the original representation space. The new representation of the speech signal makes the recognition task easier, and the performance of the different tested recognizers is improved, as the experimental results show.
TOWARDS SUBBAND-BASED SPEECH RECOGNITION Hervé Bourlard (1,3) Stéphane Dupont (1) Hynek Hermansky (2,3) Nelson Morgan (3) (1) Faculté Polytechnique de Mons - TCTS 31, Bld. Dolez, B-7000 Mons, Belgium Email: bourlard,dupont@tcts.fpms.ac.be (2) Oregon Graduate Institute, Portland, OR, USA (3) Intl. Computer Science Institute, Berkeley, CA, USA In the framework of hidden Markov models (HMM) or hybrid HMM/Artificial Neural Network (ANN) systems, we present a new approach towards speech recognition. The general idea is to split the whole frequency band (represented in terms of critical bands) into a few subbands on which different recognizers are independently applied and then recombined at a certain speech unit level to yield global scores and a global recognition decision. The preliminary results presented in this paper show that such an approach, even using quite simple recombination strategies, can yield at least comparable performance on clean speech while providing significantly better robustness in the case of speech corrupted by narrowband noise.
NONLINEAR DISCRIMINANT ANALYSIS WITH NEURAL NETWORKS FOR SPEECH RECOGNITION Vincent Fontaine, Christophe Ris, Henri Leich Faculte Polytechnique de Mons --- TCTS 31, Bld. Dolez, B-7000 Mons, Belgium Tel : + 32 65 374176 - Fax : + 32 65 374129 e-mail: {fontaine,ris,leich}@tcts.fpms.ac.be Linear Discriminant Analysis (LDA) has been applied successfully to speech recognition tasks, improving accuracy and robustness against some types of noise. However, it is well known that LDA suffers from some weaknesses if the distributions are not unimodal or when the means of the distributions are shared. In this paper, we propose to take advantage of the nonlinear discriminant properties of Artificial Neural Networks (ANN) in the task of reducing the dimensionality of the input space, leading to a nonlinear discriminant analysis.
ROBUST SPEECH RECOGNITION USING FUZZY MATRIX QUANTISATION, NEURAL NETWORKS AND HIDDEN MARKOV MODELS Professor C S Xydeas and Lin Cong Speech Processing Research Laboratory, Electrical Engineering Division, School of Engineering, University of Manchester, Dover Street, Manchester, M13 9PL, UK, Tel/Fax: +44[161]2754511/2754528, E-Mail: c.xydeas@man.ac.uk Abstract In this paper a new approach to robust speech recognition using Fuzzy Matrix Quantisation, Hidden Markov Models and Neural Networks is presented and tested when speech is corrupted by car noise. Two new robust isolated word speech recognition (IWSR) systems, called FMQ/HMM and FMQ/MLP, are proposed and designed optimally for operation in a variety of input SNR conditions. The schemes and associated system training methodologies result in a particularly high recognition performance at input SNR levels as low as 5 and 0 dB.
LOCALLY RECURRENT NEURAL NETWORKS FOR EFFICIENT REALIZATION OF A SPEECH RECOGNIZER Klaus Kasper, Herbert Reininger, Dietrich Wolf, and Harald Wuest wuest@apx00.physik.uni-frankfurt.de The computational complexity of speech recognizers based on fully connected recurrent neural networks, i.e. the large number of connections, prevents a hardware realization. We introduce locally connected recurrent neural networks in order to keep the properties of recurrent neural networks while reducing the connection density of the network. A special form of feature presentation and output coding is developed which reduces the computational complexity and allows learning of long-term dependencies. Applying all these methods results in a locally recurrent neural network which has only one third of the weights of a fully connected recurrent network. Thus, with this concept a speech recognition system can be realized on a single VLSI chip.
TEXT-INDEPENDENT OFF-LINE WRITER RECOGNITION USING NEURAL NETWORKS D. A. Valkaniotis, J. Sirigos, N. Fakotakis and G. Kokkinakis Wire Communications Laboratory, University of Patras, 26500 Patras, Greece Tel: +33 61 991722; fax: +33 61 991855 e-mail : valkan@wcl.ee.upatras.gr ABSTRACT In this paper we present a text-independent off-line writer recognition system based on multi-layer perceptrons (MLPs). The system can be used for both identification and verification purposes. It was tested on a population of 20 writers with non-correlated training and test specimens. The mean error for identification was 3.5% while error rates as low as 0.5% were achieved on specimens with more than 25 characters. For verification the mean error was 1.2% (2.22% false rejection, 0.18% false acceptance) considering a minimum of 15 characters per test specimen. These error rates are comparable to those achieved by classical methods while the response of the system is substantially faster.
SEGMENTAL LVQ3 TRAINING FOR PHONEME-WISE TIED MIXTURE DENSITY HMMS Mikko Kurimo Helsinki University of Technology, Neural Networks Research Centre Rakentajanaukio 2 C, FIN-02150, Espoo, FINLAND tel: +358 9 451 3266 fax: +358 9 451 3277 email: mikko.kurimo@hut.fi The system trains speaker dependent, but vocabulary independent, phoneme models for the recognition of Finnish words. The Learning Vector Quantization (LVQ) methods are applied to increase the discrimination between the phoneme models. A segmental LVQ3 training is proposed to substitute the LVQ2 based corrective tuning as a parameter estimation method. The experiments indicate that the new method can provide the corresponding recognition accuracy, but with less training and more robustness over the initial models. Experiments to up-scale the current system by introducing context vectors and larger mixture pools show up to 40 % reduction of recognition errors compared to the earlier results.
Title : THIRD-ORDER CUMULANT-BASED WIENER FILTERING ALGORITHM APPLIED TO ROBUST SPEECH RECOGNITION Authors : Josep M. SALAVEDRA, Javier HERNANDO Affiliations : Universitat Politecnica de Catalunya. c/ Gran Capita s/n. 08034-BARCELONA. SPAIN. Tel/Fax: +34-3-4017404 / 4016447 . E-mail: mia@gps.tsc.upc.es ABSTRACT : In previous works [5], [6], we studied some speech enhancement algorithms based on the iterative Wiener filtering method due to Lim-Oppenheim [2], where the AR spectral estimation of the speech is carried out using a second-order analysis. But in our algorithms we consider an AR estimation by means of cumulant analysis. This work extends some preceding papers due to the authors: a cumulant-based Wiener Filtering (AR3_IF) is applied to Robust Speech Recognition. A low complexity approach of this algorithm is tested in presence of bathroom water noise and its performance is compared to classical Spectral Subtraction method. Some results are presented when training task of the speech recognition system (HTK-MFCC) is executed under clean and noisy conditions. These results show a lower sensitivity to the presence of water noise when applying AR3_IF algorithm inside of a speech recognition task.
COMPARISON OF SEVERAL PREPROCESSING TECHNIQUES FOR ROBUST SPEECH RECOGNITION OVER BOTH PSN AND GSM NETWORKS Chafic Mokbel, Laurent Mauuary, Denis Jouvet and Jean Monné France Télécom - CNET / LAA / TSS / RCP 2 av. Pierre Marzin, 22307 Lannion cedex, France e-mail: mokbel(jouvet, monne)@lannion.cnet.fr ABSTRACT In this paper several preprocessing techniques used to improve speech recognition performance are compared over both PSN and GSM networks. Recognition experiments are conducted on a digit database in a speaker-independent isolated-word mode in order to evaluate the performances under within- and cross-network (PSN and GSM) conditions. Two classes of preprocessing techniques are distinguished depending on whether they deal with additive ambient noise or convolved perturbations. The first class preprocessing techniques are based on spectral subtraction. In the second class, the low frequencies of cepstral trajectories are eliminated in order to reduce convolved disturbances. Blind equalization adaptive filtering has been proposed to reduce channel effects. In this study, channel equalization and speech enhancement techniques are combined and compared. Different recording conditions may be integrated in order to increase robustness. This is done during the training phase using HMM models with variable parameters. Recognition results are analysed as a function of recording conditions.
CONSISTENT SUBSETS IN SPEECH RECOGNITION SYSTEMS Stefan Grocholewski Institute of Computing Science, Poznan University of Technology Piotrowo 3a, 60-965 Poznan, Poland Tel: +48 (0)61 782 373; fax: +48 (0)61 771 525 grocholew@poznlv.tup.edu.pl ABSTRACT In this paper a method for transforming the learning samples into their representatives is presented. The proposed algorithm combines the features of the neural nets approach, i.e. the representatives lie near the boundaries separating the classes, and of the cluster seeking approach, i.e. each representative corresponds to a group of elements lying close to each other. By using the consistent subset, the drawbacks of those approaches (a cluster can comprise samples from different classes; a sophisticated network is not appropriate in the regions where the classes overlap) can be avoided in some cases. Several applications in the area of speech recognition are presented.
VOCABULARY INDEPENDENT ACOUSTIC-PHONETIC MODELING FOR CONTINUOUS SPEECH RECOGNITION L. Fissore (+), P. Laface (*), G. Micca (+), F. Ravera (+) (+) CSELT - Centro Studi e Laboratori Telecomunicazioni Via G. Reiss Romoli 274 - I-10148 Torino, Italy E-Mail fissore@cselt.stet.it (*) Dipartimento di Automatica e Informatica - Politecnico di Torino Corso Duca degli Abruzzi 24 - I-10129 Torino, Italy E-Mail laface@polito.it This paper investigates the problem of defining the acoustic-phonetic unit set for flexible vocabulary continuous speech recognition systems. As an alternative to the classical modeling approach with biphones and triphones, a set of stationary/transitory state units is defined that is limited enough in number as to represent a closed set trainable once and for all. A major benefit of these units is that inter-word transitions can easily be taken into account. We show that a system employing these new units favorably compares with respect to a baseline recognizer with Continuous Density Hidden Markov Models of context-dependent biphones and triphones, selected through a minimal occurrence criterion within the training database.
ASYNCHRONOUS INTEGRATION OF AUDIO AND VISUAL SOURCES IN BI-MODAL AUTOMATIC SPEECH RECOGNITION Paul Deléglise, Alexandrina Rogozan and Mamoun Alissali LIUM, University of Maine Av. Olivier Messiaen, BP 535, 72017 Le Mans Cedex, France Tel: +33 43.83.37.70; Fax: +33 43.83.33.66 e-mail: deleglise@lium.univ-lemans.fr This paper presents our work on the integration of visual data in automatic speech recognition systems. We particularly aim at solving two problems: (1) classification differences for the modeling of acoustic information (phonemes) and visual information (visemes); (2) the phenomena of anticipation and retention of visemes on the corresponding phonemes. We developed and tested three systems, each dealing with one or both problems and proposing a different integration strategy. The comparison of system performances shows that some of the solutions we propose give satisfactory results, and suggests that further work on some others would lead to further performance improvement.
Title NEW TIME-FREQUENCY DERIVED CEPSTRAL COEFFICIENTS FOR AUTOMATIC SPEECH RECOGNITION Authors Hubert Wassner, Gerard Chollet Affiliation IDIAP (wassner@idiap.ch, chollet@idiap.ch), ENST (chollet@sig.enst.fr) Abstract The goal is to improve the recognition rate by optimisation of Mel Frequency Cepstral Coefficients (MFCCs): the modifications concern the time-frequency representation used to estimate these coefficients. There are many ways to obtain a spectrum from a signal, which differ in the method itself (Fourier, Wavelets,...) and in the normalisation. We show here that we can obtain noise resistant cepstral coefficients for speaker independent connected word recognition. The recognition system is based on a continuous whole word hidden Markov model. An error reduction rate of approximately 50% is achieved. Moreover, evaluation tests demonstrate that these results can be obtained with smaller databases: halving the training database has only a small effect on recognition rates (which is not the case with traditional MFCCs).
RECOGNITION OF PHONEMES FROM ESTIMATION ERRORS L Baghai-Ravary and S W Beet Department of Electronic and Electrical Engineering, The University of Sheffield, Mappin Street, Sheffield, S1 3JD, UK. Tel: (+44 ) 114 282 5409; Fax: (+44) 114 272 6391 Email: l.baghai-ravary@shef.ac.uk, s.beet@shef.ac.uk Speech recognition systems generally use delta and delta-delta (velocity and acceleration) coefficients to characterise the dynamics apparent in frame-based representations of speech. These coefficients can be thought of as the errors of simple predictors. This paper describes the use of error coefficients derived from more advanced (and accurate) forms of prediction and interpolation. Both overall recognition accuracy and the detailed confusions observed are compared with those of the traditional methods. The task used is speaker-independent phoneme recognition using a subset of the TIMIT database, and four different speech representations. The error coefficient performance on this task appears to be directly related to the robustness of the estimator used, with the best of the new methods out-performing delta-delta coefficients by around 10%.
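The remark in the abstract above that delta coefficients can be seen as errors of simple predictors can be illustrated with the conventional regression formula for delta features. The sketch below covers only this traditional baseline, under assumptions: the edge handling (repeating the first and last frame), the window half-width `K`, and the function name `delta` are illustrative choices, and the more advanced predictors and interpolators studied in the paper are not reproduced.

```python
import numpy as np

def delta(features, K=2):
    """Regression-based delta coefficients over +/-K frames.

    features: array of shape (T, D), one feature vector per frame.
    Edge frames are handled by repeating the first and last frame.
    """
    features = np.asarray(features, dtype=float)
    T = features.shape[0]
    padded = np.pad(features, ((K, K), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, K + 1))
    out = np.zeros_like(features)
    for k in range(1, K + 1):
        # c[t + k] - c[t - k] for every frame t, weighted by k
        out += k * (padded[K + k:K + k + T] - padded[K - k:K - k + T])
    return out / denom

# Hypothetical features: two coefficients that change linearly with time.
features = np.outer(np.arange(10), [1.0, -2.0])
d = delta(features, K=2)    # velocity
dd = delta(d, K=2)          # acceleration (delta-delta)
d1 = delta(features, K=1)   # K = 1 special case
```

With K = 1 the delta at frame t is half of c[t+1] - c[t-1], that is, half the error made when frame t+1 is predicted by simply repeating frame t-1; delta-delta applies the same operation to the delta sequence.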
Words on Lips: How to Merge Acoustic and Articulatory Informations to Automatic Speech Recognition Regine Andre-Obrecht, Bruno Jacob, Christine Senac IRIT - CNRS UMR 5055 - Universite Paul Sabatier 118, route de Narbonne, 31062-Toulouse CEDEX, France e-mail: obrecht@irit.fr Our work deals with the classical problem of merging heterogeneous and asynchronous parameters. It is well known that lip reading improves the speech recognition score, especially in noisy conditions, so we study more precisely the modeling of acoustic and articulatory parameters in order to propose new Automatic Speech Recognition systems. We use segmental pre-processing and a robust unit, the "pseudo-diphone", and we compare a global HMM with a master-slave HMM. We confirm through experiments the importance of labial features in both clean and noisy environments.
JOINT INTERPOLATION, MOTION AND PARAMETER ESTIMATION FOR IMAGE SEQUENCES WITH MISSING DATA Simon J. Godsill and Anil C. Kokaram Signal Processing and Communications Laboratory, University of Cambridge e-mail: {sjg,ack}@eng.cam.ac.uk This paper presents a new scheme for interpolation of missing data in image sequences, an important problem in many areas including archived motion picture film and digital video. A unified framework for image data modelling and motion estimation is adopted which is based on 3-dimensional autoregressive (3DAR) models with motion correction. A fully Bayesian methodology is implemented using the Gibbs Sampler, a method which allows for joint estimation with respect to all of the unknowns, including the motion field.
TITLE : DETECTION AND REMOVAL OF LINE SCRATCHES IN DEGRADED MOTION PICTURE SEQUENCES. AUTHOR : Anil Kokaram AFFILIATION : Signal Processing and Communications Group, Engineering Department, University of Cambridge, Trumpington St., Cambridge CB2 1PZ, England. Tel: +44 1223 332767; Fax: +44 1223 332662 email: ack@eng.cam.ac.uk ABSTRACT : Line scratches are a common problem in archived motion pictures. They are caused by the abrasion of the film material as it passes through the projection mechanism. This paper presents a technique for detecting and removing these line artefacts. The method employs a model of the line profile for detection and the 2D Autoregressive model (2D AR) of the image for interpolation. KEYWORDS : Image Reconstruction, Line Finding, Hough Transform, Gibbs Sampling, Autoregressive modelling, Bayesian Estimation. 
CURVED SURFACE RECONSTRUCTION USING MONOCULAR VISION William Puech and Jean-Marc Chassery TIMC-IMAG laboratory, Institut Albert Bonniot, Domaine de la Merci, 38706 LA TRONCHE Cedex France, Tel: 76549484; fax: 76549414 e-mail: William.Puech@imag.fr, Jean-Marc.Chassery@imag.fr ABSTRACT: In monocular vision, a priori knowledge is necessary to perform 3D reconstruction. This paper describes how to evaluate two out of six external parameters of a camera in order to project an image on a curved surface (generalized cylinder). The final aim consists of reconstructing the model of the surface. Afterwards, with this model we can derive a flat representation of the scene without any distortions due to the projective geometry. In this work, based on one projected view of the scene, we develop two methods to detect the projection of the revolution axis of the curved surface. With this axis, we can then extract the external parameters of the camera. The first method is based on the derivation of a polynomial function and the second on the detection of the common normal between curves.
### REI.4

IMAGE SEQUENCE RESTORATION FOR REMOVING SPACE-VARIANT MOTION BLUR Kwan Pyo Hong, Dong Wook Kim, and Joon Ki Paik Department of Electronic Engineering, Chung-Ang University 221 Hunksuk-Dong, Dongjak-Ku, Seoul, 156-756, Korea Tel:+82-2-820-5300; Fax:+82-2-825-1584 e-mail: paikj@video1.ee.cau.ac.kr An image restoration algorithm for removing motion blur, which occurs in an image sequence or moving pictures, is proposed. More specifically, the proposed iterative restoration algorithm adaptively reduces nonuniform motion blur by using motion vector information from consecutive image fields. Motion vectors are estimated with the well-known block matching algorithm, and the corresponding blur model is embodied in the point spread function, which is used to implement the iterative image restoration algorithm. A blur model modification method is also proposed to reduce artifacts in the boundary area between objects with different blur patterns.
### REI.5

COHERENT MODEL-BASED OPTICAL RESOLUTION AND SNR A.J. den Dekker Delft University of Technology, Department of Applied Physics Lorentzweg 1, 2628 CJ Delft, The Netherlands Tel: +31 15 2781823; Fax: +31 15 2784263 e-mail: dekker@tn.tudelft.nl In this paper a new parameter estimation based criterion for two-point resolution is proposed. Unlike the classical resolution criteria, the new criterion takes account of noise and systematic errors. A resolution limit in terms of the observations is derived. This limit depends on the point spread function used and the degree of coherence supposed. For statistical observations the probability of resolution as a function of the SNR is derived. This probability can be used as a performance measure in the assessment of optical instruments.
### REII.1

DISCRETE B-SPLINE FUNCTIONS Koichi ICHIGE and Masaru KAMADA (Koichi ICHIGE) Doctoral Program in Engineering, University of Tsukuba, Tsukuba, Ibaraki 305 Japan e-mail: ichi@fmslab.is.tsukuba.ac.jp (Masaru KAMADA) Department of Computer and Information Sciences, Ibaraki University, Hitachi, Ibaraki 316 Japan e-mail: kamada@cis.ibaraki.ac.jp A simple discrete version of B-splines is proposed. The proposed discrete version has different values from B-splines at the discrete points, but it is proven that the proposed discrete version tends to B-splines when the sampling interval goes to zero. They can be evaluated more quickly than the former discrete B-splines, only by RRS digital filters.
### REII.2

2-D NEURAL HYBRID FILTERS USING ADAPTIVE WINDOWS AND LAYERED MEDIAN FILTERS Mitsuji Muneyasu, Takahiro Maeda and Takao Hinamoto Faculty of Engineering, Hiroshima University 1-4-1 Kagamiyama, Higashi-Hiroshima 739, Japan e-mail: muneyasu@ecl.sys.hiroshima-u.ac.jp A new structure of 2-D neural hybrid filters, composed of the cascade connection of a layered median filter, a 2-D linear filter with adaptive windows, and a neural network, is developed. The proposed filter can be used for edge-preserving smoothing of an image in a mixed-noise environment in which both white Gaussian noise and impulsive noise are present. The layered median filter section is composed of the cascade connection of 1-D median filters which select the median value of 3 points. The window sizes of the 2-D linear filters are chosen so as to prevent edges in the output image from degrading. The parameters of the neural network are adjusted by a learning algorithm so that it adapts to the properties of the image to be processed. An experimental result is shown to illustrate the effectiveness of the proposed filter.
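The layered median filter section described above can be sketched as a cascade of 3-point 1-D medians. The abstract does not state the ordering of the layers or the border handling, so the row-then-column order, the edge replication, and the function names below are assumptions made only for illustration.

```python
def median3_1d(signal):
    """3-point 1-D median filter; borders are handled by edge replication (assumed)."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [sorted(padded[i:i + 3])[1] for i in range(len(signal))]


def layered_median_2d(image):
    """Cascade of 1-D 3-point medians: one pass along rows, then one along columns."""
    rows_filtered = [median3_1d(row) for row in image]
    columns_filtered = [median3_1d(list(col)) for col in zip(*rows_filtered)]
    # Transpose back so the result has the same row/column layout as the input.
    return [list(row) for row in zip(*columns_filtered)]


# A flat patch corrupted by a single impulse (hypothetical test data).
noisy = [
    [5, 5, 5],
    [5, 255, 5],
    [5, 5, 5],
]
print(layered_median_2d(noisy))
```

Each stage only sorts three values, which keeps the cost per pixel small compared with a full 2-D median over a larger window.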
### REII.3

Title: VECTOR MEDIAN-VECTOR DIRECTIONAL HYBRID FILTER FOR COLOR IMAGE RESTORATION Authors: Moncef Gabbouj and Faouzi Alaya Cheikh Affiliation: Signal Processing Laboratory, Tampere Univ. of Technology P. O. BOX. 553, 33101 Tampere, Finland Tel: + 358-31-365 3967; Fax: + 358-31-365 3967 moncef@cs.tut.fi; faouzi@cs.tut.fi Abstract: In this paper we propose a new approach for multichannel signal and image processing. This new scheme is similar to the VDF approach in the way it decomposes the filtering process into direction estimation and magnitude estimation of the output vector. While the VDF performs these two stages sequentially, our filtering approach may execute the two stages in parallel. This parallel structure eliminates the distorting effect of the magnitude processing stage on the direction estimated in the first step, and it reduces the overall processing time to that of the most demanding task. A further speedup is gained over the VDF approach, since our algorithm does not use sorting at any stage. Simulation results show the effectiveness of the proposed scheme in color image restoration.
### REII.4

REGULARIZED IMAGE DECONVOLUTION IN A WAVELET SCHEME. Jean-Louis Burdeau** and Rémy Prost*, Member EURASIP *CREATIS, Research Unit Associated to CNRS (#C5515) and Affiliated to INSERM, Lyon, France. INSA 502, 69621 VILLEURBANNE Cedex France. E-mail remy.prost@creatis.insa-lyon.fr ** INT, Signal and Image Dpt, 9 rue Charles Fourier 91011 EVRY Cedex France and CREATIS, Lyon, France. E-mail burdeau@int-evry.fr ABSTRACT This paper addresses the problem of deconvolution in a multiresolution scheme, which results in a deconvolution problem at each level of resolution. The Miller regularized approach is used and the normal equations are solved using a constrained iterative algorithm. Simulations show the advantages of this approach.
### REII.5

A Blind Deconvolution Algorithm for Simultaneous Image Restoration and System Characterisation. M. Razaz and D. Kampmann-Hudson School of Information Systems University of East Anglia Norwich, UK Email: mr@sys.uea.ac.uk, dmh@sys.uea.ac.uk The restoration of a blurred image in a practical imaging system is critically dependent on the system point spread function. Measurement of the point spread function is often a difficult and time-consuming process, and the measurement environment itself is somewhat artificial. Also, it is frequently the case that an observed image and the point spread function are not measured simultaneously under the same conditions. An iterative blind deconvolution algorithm is presented here which is capable of restoring an image without the need for an exact estimate of the point spread function. The ideal image and the point spread function can be estimated simultaneously by imposing appropriate a priori constraints. Typical experimental results are presented and discussed.
### REII.6

MULTIRESOLUTION IMAGE DECOMPOSITION WITH COMPLEX STEERABLE PYRAMIDS G. Jacovitti*, A. Manca*, A. Neri** * INFOCOM Dpt., University of Rome La Sapienza, Via Eudossiana 18, 00814 Rome, Italy ** Electronics Engineering Dept., University of Rome III, Via Vasca navale 84, 00146 Rome, Italy e-mail: neri@infocom.ing.uniroma1.it Abstract In this contribution we present a steerable pyramid based on complex wavelets named Circular Harmonic Wavelets (CHW), suited for multiscale feature-based representations. The Circular Harmonic Pyramid (CHP) performs a local windowed Fourier analysis in polar co-ordinates around any point of the image. After a survey on the general properties of the CHP, we illustrate the application of the CHP to the classical problem of image restoration against additive noise.
### REII.7

AN ALGORITHM FOR RECONSTRUCTING POSITIVE IMAGES FROM NOISY DATA Geoffrey de Villiers DRA Malvern, St. Andrews Road, Malvern, Worcestershire, WR14 3PS, U.K. Tel: +44 (0)1684 894750; fax: +44 (0)1684 896502 e-mail gdv@signal.dra.hmg.gb In this paper we describe a novel method for finding non-negative solutions to linear inverse problems. Such problems include image reconstruction where one is required to deconvolve a known point spread function from the image to produce a clearer image. The method described here is related to the truncated singular function expansion for solving linear inverse problems. The method consists of choosing the non-negative solution with minimum 2-norm whose singular function expansion agrees with the truncated singular function expansion solution in its first N terms. The fact that only the first N singular function coefficients, which are easily derived from the data, are used gives the method robustness with respect to noise and the method is not computationally very demanding. British Crown Copyright 1996/DERA Published with the permission of the Controller of Her Majesty's Stationery Office.
### REII.8

IDENTIFICATION OF A DEGRADED IMAGE BY A MULTIPLICATIVE OR ADDITIVE NOISE Lionel Beaurepaire, Kacem Chehdi E.N.S.S.A.T, 6 Rue de Krampont, BP 447, 22305 Lannion cedex, France Tel: 96-46-50-30; fax: (33) 96-37-01-99 e-mail: beaurepa@enssat.fr, chehdi@enssat.fr This paper deals with the problem of identifying the nature of the noise from the observed image in order to apply the processing or analysis algorithm, whichever is the most appropriate. Here, we restrict ourselves to additive and multiplicative noises. To identify these two kinds of noises, we propose a new approach consisting of characterizing each class and thus, each degraded image by a vector of five parameters. These parameters are obtained from local statistics computed on homogeneous regions of the image.
### REII.9

A MULTIRESOLUTION SPECKLE REDUCTION ALGORITHM WITH APPLICATION TO SAR IMAGES Carmela Galdi (1), John J. Soraghan (2) (1) Dipartimento di Ingegneria Elettronica, Università degli Studi di Napoli "Federico II", via Claudio 21, 80125 Napoli, Italy. Tel: +39 81 7683200; fax: +39 81 7683149 e-mail: galdi@nadis.dis.unina.it (2) Signal Processing Division, Dept. of Electronic and Electrical Engineering, University of Strathclyde 204 George Street, Glasgow, G1 1XV, Scotland, U.K. Fax: +44-141-5522487 e-mail: jjs@spd.eee.strath.ac.uk Synthetic Aperture Radar images are the representation in range and azimuth coordinates of the signal received by a radar system exploring a portion of the earth surface. The speckle reduction technique presented in this paper takes advantage of the knowledge of the statistical model of the backscattered signal to design a wavelet thresholding scheme, appropriate for this particular type of noise. Before the application to actual images, the algorithm validity has been tested by comparison with the Wiener filter, performed on random sequences generated according to the backscattering statistical model.
### REII.10

SAR IMAGES RECONSTRUCTION VIA PHASE RETRIEVAL T. Isernia(1-2), V. Pascazio(3), R. Pierri(4), G. Schirinzi(2) (1)Dipartimento di Ingegneria Elettronica - Universita di Napoli Federico II via Claudio, 21 - 80125 Napoli, Italy tel: +39-(0)81-7683512; fax: +39-(0)81-5934448; e-mail: isernia@dieO03.dis.unina.it (2)Istituto per l'Elettromagnetismo e i Componenti Elettronici - Consiglio Nazionale delle Ricerche via Diocleziano, 328 - 80124 Napoli Italy tel : +39-(0)81-5707999; fax: +39-(0)81-5705734; e-mail: schiri@irecel.irece.na.cnr.it (3)Istituto di Teoria e Tecnica delle Onde Elettromagnetiche - Istituto Universitario Navale via Acton, 38 - 80133 Napoli, Italy tel: +39-(0)81-5513976; fax: +39-(0)81-5512884; e-mail: pascazio@naval.uninav.it (4)Dipartimento di Ingegneria dell'Informazione - Seconda Universita di Napoli via Roma, 29 - 81031 Aversa (CE), Italy tel : +39-(0)81-5044035; fax: +39-(0)81-5045804; e-mail: pierri@uxing2.sunap.it ABSTRACT A new method to accurately reconstruct a Synthetic Aperture Radar complex image starting from raw received data affected by phase errors is presented. It is based on a phase retrieval algorithm, and the unknown complex reflectivity is found by minimising a proper functional, using the partial phase information carried by the phase-corrupted raw data as the initial guess of an iterative procedure. The method, which is capable of compensating for both 1-D and 2-D phase errors, has been validated on real data.
### SAS.1

WAVEFORM INTERPOLATION TECHNIQUE FOR TEXT-TO-SPEECH SYNTHESIS Mikel Larreategui and Rolando A. Carrasco School of Engineering, Staffordshire University Beaconside, PO 333, ST18 0DF, Stafford, UK. TEL: +44 1785 353366; FAX: +44 1785 353552 e-mail: mikel@staffs.ac.uk ABSTRACT The waveform interpolation (WI) technique has recently been proposed by Kleijn [5][6] for speech coding applications. However, there are no known published works in the open literature concerning the application of the WI method for high-quality text-to-speech (TTS) synthesis. The original contribution of this paper is to study and evaluate the performance of the WI technique in the context of TTS systems.
### SAS.2

IMPROVED PHONOTACTIC ANALYSIS IN AUTOMATIC LANGUAGE IDENTIFICATION Jiri Navratil Department of Communication and Measurement Technical University of Ilmenau P.O.Box 0565, 98684 Ilmenau, Germany Tel: +49 3677 69 1145; fax: +49 3677 69 1195 e-mail: jiri.navratil@e-technik.tu-ilmenau.de This paper presents a method for phone-dependent weighting within phonotactic models in automatic language identification. Based on statistical analysis of the phonetic-recognizer behaviour, a phone confidence measure is derived and used to weight the bigram probabilities during testing. The confidence corresponds to the expected decoding stability of individual phones. The proposed method was shown to improve the system performance consistently on a three-language task. The best improvement of the error rate was from 8.4% to 1.8% for the 45-second utterances.
### SAS.3

AUTOMATIC LANGUAGE IDENTIFICATION: USING INTONATION AS A DISCRIMINATING FEATURE V.F. Leavers, K. Wiehler, C.E. Burley Electrical Engineering Division, Manchester University, Dover Street, Manchester, M13 9PL, England, vfl@ipg.ph.kcl.ac.uk Current research into automatic language identification systems sees the problem as being related to speaker-independent speech recognition and speaker identification. In particular, speaker identification methods appear to outperform all other methods, and the incorporation of prosodic information has contributed only marginally to their success. This is a counterintuitive result suggesting that perhaps the brute-force application of standard available pattern recognition methods is inappropriate, not least because it ignores the linguistic cues that human beings use so easily and efficiently. It has been proposed that an attempt to rank parameter extraction with respect to a taxonomy of linguistic complexity would give results more in keeping with our own abilities to discriminate between various languages. For example, the pressure of discrimination concerning grossly different languages such as Mandarin Chinese and English would be low compared to that associated with an attempt to distinguish between two quite similar languages such as Dutch and German. The present work aims to differentiate between the two broadest groups, tone and stress, using parameters which best model the linguistic differences between those groups. In particular, the supra-segmental feature of intonation is modelled as a memory effect which can be measured using the Hurst exponent.
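The abstract does not say how the Hurst exponent is estimated. One common choice is classical rescaled-range (R/S) analysis, sketched below as an assumption rather than as the authors' procedure. The window sizes, function names, and the idea of applying it to a pitch contour are all hypothetical. The exponent is taken as the slope of log(R/S) against log(window length); values above 0.5 are usually read as persistence (long memory).

```python
import math


def rescaled_range(chunk):
    """R/S statistic of one window: range of cumulative deviations over the std."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, running = [], 0.0
    for value in chunk:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std if std > 0 else None  # constant windows carry no information


def hurst_exponent(series, window_sizes=(8, 16, 32)):
    """Least-squares slope of log(mean R/S) versus log(window size)."""
    xs, ys = [], []
    for n in window_sizes:
        ratios = []
        for start in range(0, len(series) - n + 1, n):  # non-overlapping windows
            rs = rescaled_range(series[start:start + n])
            if rs is not None:
                ratios.append(rs)
        if ratios:
            xs.append(math.log(n))
            ys.append(math.log(sum(ratios) / len(ratios)))
    if len(xs) < 2:
        raise ValueError("series too short for the requested window sizes")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# A steadily rising contour (hypothetical data) is strongly persistent.
print(hurst_exponent([float(i) for i in range(64)]))
```

Short windows bias R/S estimates, so any real use on intonation contours would need longer series and a more careful choice of window sizes than this sketch uses.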
### SAS.4

PROSODY GENERATION BY MEANS OF A SYNTACTIC APPROACH AND ITS APPLICATION IN A TEXT TO SPEECH SYSTEM Enzo Mumolo, Massimo Teia Dipartimento di Elettrotecnica, Elettronica ed Informatica Universita' di Trieste, Via Valerio 10, 34127 Trieste, Italy Tel/Fax: +39.40.676.3861/3460 e-mail: mumolo@univ.trieste.it Abstract An algorithm for modeling and generating prosody from a written text is described in this paper. Among the several speech processing areas which could benefit from this algorithm, this paper deals with text-to-speech synthesis (TTS). An experimental evaluation of the algorithm has been carried out, and it has been shown that the naturalness of the produced speech is greatly improved.
### SAS.5

A TEXT-TO-SPEECH SYSTEM FOR THE SLOVENIAN LANGUAGE Jerneja Gros, Nikola Pavesic, France Mihelic Faculty of Electrical Engineering, University of Ljubljana e-mail: jerneja.gros@fer.uni-lj.si A text-to-speech system capable of synthesising continuous Slovenian speech from an arbitrary input text is described. The TTS system is based on the concatenation of basic speech units, diphones, using the TD-PSOLA technique, and no special hardware is required. The input text is transformed into its spoken equivalent by a series of modules. These modules, which constitute the TTS system, are described in detail. Finally, the quality of the synthesised speech is assessed in terms of acceptability and intelligibility.
### SAS.6

SPEAKER RECOGNITION BASED ON A WEIGHTED ACOUSTIC DISCRIMINATION Carmen Garcia-Mateo, Leandro Rodriguez-Linares Departamento de Tecnologias de las Comunicaciones Universidad de Vigo, Spain Phone:34-86-812133, Fax:34-86-812116 e-mail:carmen@tsc.uvigo.es, leandro@tsc.uvigo.es ABSTRACT We combine multiple-mixture single-state Markov models with phonetic classification in order to improve the performance of a speaker recognition system. Three broad phonetic classes are defined: voiced frames, unvoiced frames and transitions. We design speaker templates by the parallel connection of the weighted outputs of three single-state HMMs. Each model corresponds to a distinct sound class, and the output weights take into account the perceptual influences across phonetic classes. The preliminary results show that this novel architecture outperforms its counterpart without phonetic classification.
### SAS.7

TITLE: SPEAKER RECOGNITION WITH ARTIFICIAL NEURAL NETWORKS AND MEL-FREQUENCY CEPSTRAL COEFFICIENTS CORRELATIONS AUTHORS: Roberto Amilton Bernardes Soria, Euvaldo F. Cabral Jr. AFFILIATION: University of Sao Paulo - DEE/EPUSP Laboratory of Communication and Signals - LCS CAIXA POSTAL 8174, Sao Paulo, SP, 01065-970, Brazil ABSTRACT: The problem addressed in this paper is that the classical statistical approach to speaker recognition yields satisfactory results, but at the expense of long training and test utterances. Reducing the length of speaker samples is of great importance in the field of speaker recognition, since the statistical approach, due to its limitations, is usually precluded from use in real-time applications. A novel method of text-independent speaker recognition is proposed which uses only the correlations among MFCCs, computed over selected speech segments of very short length (approximately 120 ms). Three different neural networks - the Multi-Layer Perceptron (MLP), Steinbuch's Learnmatrix (SLM) and the Self-Organizing Feature Finder (SOFF) - are evaluated in a speaker recognition task. The dimensionality reduction ability of the SOFF paradigm is also discussed.
### SAS.8

IMPROVED VOCAL TRACT MODEL FOR SPEECH SYNTHESIS Minsheng Liu, Arild Lacroix Institut für Angewandte Physik; University of Frankfurt Robert-Mayer-Str.2-4; D-60325 Frankfurt am Main, Germany e-mail: Liu@iap.uni-frankfurt.de, Lacroix@iap.uni-frankfurt.de Speech synthesis of nasal and non-nasal speech sounds is studied on the basis of an improved model in which a nasal tract is included in the vocal tract. The transfer function of the model is analysed. Because of the closure of the oral tract, the three-port adaptor at the velum is reduced to a two-port adaptor, so that the model parameters can be estimated by inverse filtering from the speech signal. Moreover, this method is applied to investigate nasalization of vowels.
### SAS.9

VOWEL-NON VOWEL CLASSIFICATION OF SPEECH USING AN MLP AND RULES John Sirigos, john@wcl.ee.upatras.gr Vassilis Darsinos, darsinos@wcl.ee.upatras.gr Nikos Fakotakis, fakotaki@wcl.ee.upatras.gr George Kokkinakis, gkokkin@wcl.ee.upatras.gr Wire Communications Laboratory, University of Patras, 26500 Patras, Greece ABSTRACT In this paper we present a high-precision speaker-independent vowel/non-vowel classifier based on a simple feed-forward MLP (Multi Layer Perceptron) and several rules. RASTA-PLP analysis of the speech signal, resulting in mel-cepstral coefficients, and a formant tracking method are used to provide the feature vectors for the MLP. To train and test the system we used a part of the TIMIT database. The results indicate that the accuracy of this classifier for speaker-independent vowel classification is approximately 97.25%, so it can be favorably used for speaker recognition or speech labeling purposes.
### SAS.10

A WAVELET REPRESENTATION EVALUATION FOR STOP-CONSONANTS CLASSIFICATION Christophe Gerard, Marc Baudry, Alexandrina Rogozan L.I.U.M., University of Le Mans Avenue O. Messiaen, B.P. 535, Le Mans 72017 Cedex, France Tel: +33 4383 32 21; Fax: +33 43 8335 65 E-mail: gerard@lium.univ-lemans.fr ABSTRACT Compared with Short Time Fourier Transform based methods, the representation of stop consonants could be improved using the wavelet transform. After presenting our framework, we describe the wavelet parameterization and the classification method. Stop consonants are represented with pseudo-cepstral wavelet-based parameters computed on a single 20 ms frame in the neighbourhood of the burst. A non-parametric nearest-neighbours method is used. Evaluation is speaker-independent; 1593 stop consonants extracted from the TIMIT database are evaluated. Results are described and discussed in comparison with MFCCs (Mel Frequency Cepstrum Coefficients). It appears that, in our field of research, wavelets give equivalent classification percentages. The first conclusion is that a more elaborate wavelet-based representation is needed to obtain significant improvements.
### SC.1

AN ATM SPEECH CODEC WITH IMPROVED RECONSTRUCTION OF LOST CELLS Kai Clüver Institut für Fernmeldetechnik, Technische Universität Berlin Einsteinufer 25, D-10587 Berlin, Germany telephone: +49 30 314-24581; fax: +49 30 314-25799 e-mail: cluever@ftsu00.ee.tu-berlin.de A speech codec for ATM networks is presented which includes ATM adaptation layer functions, a voice activity detection, and a new method for the reconstruction of lost cells. As the cell assembly already requires a relatively high buffering delay, only algorithms are applied which introduce small additional delays. The reconstruction of lost cells is based on an analysis of the LPC and pitch parameters of the speech signal. The new waveform substitution method considerably reduces the speech quality impairment caused by cell loss.
### SC.2

MULTIMODE SPECTRAL CODING OF SPEECH FOR SATELLITE COMMUNICATIONS Amitava Das* and Allen Gersho** *Qualcomm Inc., 6455 Lusk Boulevard, San Diego, CA 92121. Tel/FAX: 619-651-4006/658-1562. email: adas@qualcomm.com **Dept. of Electrical & Computer Eng. University of California, Santa Barbara, CA 93106. Tel/FAX: 805-893-2037/3262. email: gersho@ece.ucsb.edu We present a multimode spectral coding algorithm which employs the enhanced MBE (EMBE) spectral model and a new spectral quantization technique called transformed variable dimension vector quantization (TVDVQ) offering good speech quality at low rate. The EMBE model represents the short-term speech spectrum in a mode-specific way. TVDVQ encodes the variable-dimension spectral components efficiently at low complexity. The resulting 2.9 kb/s source coder offers good speech quality comparable to the 4.8 kb/s CELP 1016 and the 4.15 kb/s IMBE coder. An additional 1.1 kb/s of channel coding preserves the speech quality and intelligibility quite well with up to 2% random bit errors.
### SC.3

CELP CODING BASED ON SIGNAL CLASSIFICATION USING THE DYADIC WAVELET TRANSFORM Joachim Stegmann, Gerhard Schroeder, Kyrill A. Fischer Deutsche Telekom AG, Technologiezentrum, Am Kavalleriesand 3, 64295 Darmstadt, Germany e-mail: stegmann@fz.telekom.de This paper describes a CELP speech-coding algorithm which makes use of a specific signal classifier especially designed for this purpose. The classification method is based on the Dyadic Wavelet Transform (DyWT) and has proved to be superior to common classifiers that use the open-loop long-term prediction gain for mode selection. The classifier's output is used for the control of several coder parameters, such as the choice of the subframe length and the selection of the synthesis model and the corresponding codebooks. We designed a fully quantised coder operating at a fixed bit rate of 4 kbit/s with a 20-ms frame. The proposed coder improves the weighted segmental signal-to-noise ratio (WSegSNR) by 2.3 dB on the average in comparison with a conventional CELP coder, thereby achieving high speech quality.
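As a quick check of the bit budget implied by these figures (a derived number, not one stated in the abstract), the fixed bit rate $R$ and the frame length $T$ give the number of bits available per frame:

```latex
N_{\text{bits}} = R \cdot T = 4000\ \tfrac{\text{bit}}{\text{s}} \times 0.020\ \text{s} = 80\ \text{bits per frame}
```

All quantised parameters of a frame must share these 80 bits; how the coder divides them between parameters is not given in the abstract.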
### SC.4

AN ALGORITHM FOR THE TRAINING OF CELP EXCITATION CODEBOOKS Ulrich Balss, Herbert Reininger, Holger Schalk, Dietrich Wolf Institut fuer Angewandte Physik der J.W. Goethe-Universitaet Frankfurt a.M. Robert-Mayer-Strasse 2-4, D-60054 Frankfurt am Main, FRG Tel: +49 69 798 28163; Fax: +49 69 798 28510 e-mail: balss@apx00.physik.uni-frankfurt.de CELP schemes with a trained excitation codebook are able to reproduce more complex waveforms than stochastic CELP schemes. Here we present a new algorithm for the design of trained CELP excitation codebooks which are well adapted to the residual of speech even in transition regions. The vectors of the excitation codebook are adapted to a training speech sequence by applying an iterative algorithm. To obtain a high coding accuracy, the analysis-by-synthesis error measure used during the coding process is also used in the codebook design procedure. Due to the simultaneous occurrence of the quantized amplitude vector and the quantized gain in the error measure, both codebooks are optimized iteratively. The amplitude codebook vectors are designed as subvectors of a so-called base excitation sequence by shifting their offset. Comparative listening tests have shown that this method outperforms stochastic CELP in objective SNR as well as in subjective quality.
### SC.5

CRITICAL BAND QUANTISATION ANALYSIS FOR MASKED DISTORTION SPEECH CODING Paul M. McCourt Department of Electrical & Electronic Engineering Queen's University of Belfast Belfast BT9 5AH, UK e-mail pm.mccourt@ee.qub.ac.uk ABSTRACT This paper presents new results on critical band masked distortion controlled quantisation of a linear transform representation of speech. In particular, fixed rate split vector quantisation of a critical band gain vector is investigated. While shown to be objectively significant in meeting masked distortion criteria, near-transparent quantisation of the critical band gain spectrum is nonetheless achieved at 1.75 kbits/sec. The relevance of this result is explained by a comparative interpretation of the parametric spectral synthesis performed by current analysis-by-synthesis, multi-band excitation and sinusoidal transform coders.
### SC.6

PERCEPTUAL CODING OF SPEECH USING A FAST WAVELET PACKET TRANSFORM ALGORITHM Benito Carnero and Andrzej Drygajlo Signal Processing Laboratory Swiss Federal Institute of Technology at Lausanne CH-1015 Lausanne, SWITZERLAND e-mail: carnero@lts.de.epfl.ch This paper presents a new speech coding algorithm based on a fast wavelet packet transform algorithm and psychoacoustic modeling. The employed FFT-like overlapped block orthogonal transform allows us to approximate the auditory critical band decomposition in an efficient manner, which is a major advantage over previous approaches. Owing to such a decomposition of the original signal, we make use of the human ear masking properties to decrease the mean bit rate of the encoder.
### SC.7

SUBJECTIVE PERFORMANCE OF SPECTRAL EXCITATION CODING OF SPEECH AT 2.4 KB/S P. Lupini and V. Cuperman School of Engineering Science, Simon Fraser University, Burnaby, BC, Canada, V5A 1S6 lupini@cs.sfu.ca, vladimir@cs.sfu.ca This paper presents a low rate speech codec (2.4 kb/s) based on a sinusoidal model applied to the excitation signal. A frame classifier in combination with a phase dispersion algorithm allows the same model to be used for voiced as well as unvoiced and transitional sounds. The phase dispersion algorithm significantly improves the perceived quality for all frame classes, resulting in more "natural" reconstructed speech. Informal MOS testing indicates that the 2.4 kb/s SEC system achieves MOS scores close to the existing 4 kb/s standards (differences up to 0.2 on the MOS scale) and significantly better than the existing 2.4 kb/s LPC-10 standard (difference of 1.5 on the MOS scale).
### SC.8

ROBUST MULTIBAND EXCITATION CODING OF SPEECH BASED ON VARIABLE ANALYSIS FRAME SIZES Eric W.M. YU and Cheung-Fat CHAN Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong. Phone: (852) 2788-7758 Fax: (852) 2788-7791 Email: eewmeyu@cityu.edu.hk eecfchan@cityu.edu.hk A robust technique for the coding of multiband excitation (MBE) model parameters from a non-stationary speech segment is proposed in this paper. A non-stationary speech segment which has an abrupt increase in its signal energy over time is divided into 2 quasi-stationary speech segments. A variable analysis frame size technique is proposed to analyze the lower energy portion and the higher energy portion separately. A high quality fixed 1.6 kbps variable frame size MBE linear predictive (MBELP) speech coder was developed.
### SC.9

Title A PROTOTYPE WAVEFORM INTERPOLATION LOW BIT RATE SPEECH CODEC Authors Gloria Menegaz and Michele Mazzoleni Affiliation DE-LTS, Swiss Federal Institute of Technology CH-1015 Lausanne, Switzerland Tel: +39 2 66161267; fax: +39 2 66100448 e-mail: menegaz@mailer.cefriel.it CEFRIEL, Via Emanueli 15, I-20126 Milano Abstract Voiced speech is characterized by a high level of periodicity. In order to encode voiced speech with a good quality, the correct degree of periodicity must be preserved. The proposed coding algorithm attempts to respect such a constraint even at low bit rates. The method exploits the temporal redundancy of voiced segments in order to achieve high compression rates. Voiced speech is interpreted as a concatenation of slowly evolving pitch-cycle waveforms. The signal is synthesized by waveform interpolation from a downsampled sequence of pitch-cycles with a rate of one prototype waveform per frame (20-30ms). An original method of prototype representation, parametrization and coding based on a proper mixed time-frequency representation allows a high quality prototype reconstruction. The effectiveness of such a parametrization renders it well suited to low bit rate applications, yet maintaining a good quality of the reconstructed signal. The method can be combined with existing LP-based speech coders, such as CELP, for unvoiced segments.
### SC.10

QUANTIZATION OF THE LPC MODEL WITH THE RECONSTRUCTION ERROR DISTORTION MEASURE Jan S. Erkelens and Piet M. T. Broersen Delft University of Technology, Department of Applied Physics P.O. Box 5046, 2600 GA Delft, The Netherlands Tel +31 15 2781823 / +31 15 2786419 Fax +31 15 2784263 e-mail: erkelens@gtn.tudelft.nl / broersen@gtn.tudelft.nl ABSTRACT In Linear Predictive Coding algorithms, the coding of the speech signal consists of two separate stages: coding of the LPC model and coding of the excitation. In CELP, the LPC excitation is coded by Analysis-by-Synthesis in the reconstruction domain, not by minimization of the error in the LPC residual domain. Commonly used distortion measures for quantization of the LPC spectral model are the Spectral Distortion and the Likelihood Ratio. For small quantization errors, they belong to a class of similar distortion measures which express an error in the residual domain. A new spectral distortion measure is proposed, the Reconstruction Error Distortion measure, which expresses an error in the reconstruction domain. Preliminary results indicate that about five bits per frame can be gained with this new measure, without a loss in subjective quality.
### SE.1

PARAMETER IDENTIFICATION OF FREQUENCY-SELECTIVE NOISY FAST-FADING RAYLEIGH DIGITAL CHANNELS VIA NONLINEAR YULE-WALKER-LIKE EQUATIONS Roberto Cusani, Enzo Baccarelli INFOCOM Dpt., University of Rome "La Sapienza", Rome, Italy Tel. +39 6 4458589; fax: +39 6 4873300; email: robby@infocom.ing.uniroma1.it A new procedure is proposed for the identification of data channels affected by randomly time-variant fading. It is based on a set of nonlinear equations employing a minimum number of lags of the observed autocorrelation function (acf), and its solution gives the desired channel fading parameter estimates. Better estimation accuracy is obtained than with the classic higher-order Yule-Walker procedure (although the latter employs a linear equation system), in particular for small Doppler spreads and for signal-to-noise ratios that are not very high.
### SE.3

SUPER-RESOLUTION SPECTRUM ANALYSIS REGULARIZATION : BURG, CAPON & AGO-ANTAGONISTIC ALGORITHMS Frederic Barbaresco THOMSON-CSF AIRSYS Radar Development / Algorithms & New concepts Department (RD/RAN) 7-9, rue des Mathurins 92221 Bagneux, FRANCE Tel : 33-(1) 40.84.20.04 ; Fax : 33-(1) 40.84.36.31 e-mail : barbareso@airsys.thomson.fr ABSTRACT We propose a regularized Burg algorithm, based on a frequency domain smoothness prior constraint, which solves the model order estimation problem in the case of short data records. A second algorithm deals with a recursive eigendecomposition method from autoregressive parameters, which allows Capon spectrum analysis regularization. Finally, we have developed new regularized detectors using the log-likelihood ratio from regularized reflection coefficients.
### SE.4

SPECTRAL ANALYSIS OF RANDOMLY SAMPLED SIGNALS A. Ouahabi(1), C. Depollier(2), L. Simon(2), D. Kouame(1), J.F. Roux(1) and F.Patat(1) (1) LUSSI,GIP Ultrasons/EIT 7 Av M. Dassault BP 407 37004 TOURS Cedex France Phone:(+33) 47 71 12 26 Fax: (+33) 47 28 95 33 e-mail: ouahabi@balzac.univ-tours.fr (2) LAUM URA CNRS 1101 Av. O. Messiaen, BP 535 72017 LE MANS Cedex France Phone:(+33) 43 83 32 70 Fax: (+33) 43 83 35 20 e-mail: depol@laum.univ-lemans.fr Abstract: The power spectral density of randomly sampled signals is studied with reference to fluid velocity measured by laser Doppler velocimetry. In this paper, we propose a new method for spectral estimation of Poisson-sampled stochastic processes. Our approach is based on polygonal interpolation from the sampled process followed by resampling and usual fast Fourier transform. This study emphasizes the merit of the polygonal hold vs. the sample-and-hold.
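The processing chain named in this abstract (polygonal interpolation of the randomly sampled process, resampling on a uniform grid, then a Fourier transform) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the test signal, the rates, the function names and the naive DFT (standing in for an FFT) are all hypothetical choices made for this sketch.

```python
import cmath
import math
import random


def poisson_sample_times(rate_hz, duration_s, rng):
    """Sampling instants of a Poisson process, from t = 0 until past duration_s."""
    times = [0.0]
    while times[-1] < duration_s:
        times.append(times[-1] + rng.expovariate(rate_hz))
    return times


def resample(times, values, grid, polygonal=True):
    """Resample irregular samples onto a uniform grid.

    polygonal=True  -> straight-line (polygonal) interpolation
    polygonal=False -> sample-and-hold (keep the most recent sample)
    """
    out = []
    i = 0
    for t in grid:
        # Move to the segment [times[i], times[i + 1]) that contains t.
        while i < len(times) - 2 and times[i + 1] <= t:
            i += 1
        if polygonal:
            frac = (t - times[i]) / (times[i + 1] - times[i])
            out.append(values[i] + frac * (values[i + 1] - values[i]))
        else:
            out.append(values[i])
    return out


def power_spectrum(x):
    """One-sided periodogram via a naive DFT (an FFT would be used in practice)."""
    n = len(x)
    return [
        abs(sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))) ** 2 / n
        for k in range(n // 2 + 1)
    ]


def rms_error(estimate, reference):
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimate, reference)) / len(reference))


# Assumed test case: a 5 Hz sinusoid, Poisson-sampled at a mean rate of 100 Hz,
# then resampled at 50 Hz over 4 seconds.
rng = random.Random(0)
F0, FS, N = 5.0, 50.0, 200
times = poisson_sample_times(100.0, N / FS, rng)
values = [math.sin(2 * math.pi * F0 * t) for t in times]
grid = [k / FS for k in range(N)]
truth = [math.sin(2 * math.pi * F0 * t) for t in grid]

poly = resample(times, values, grid, polygonal=True)
hold = resample(times, values, grid, polygonal=False)

psd = power_spectrum(poly)
peak_hz = max(range(len(psd)), key=psd.__getitem__) * FS / N
print("spectral peak (Hz):", peak_hz)
print("RMS error, polygonal hold:", rms_error(poly, truth))
print("RMS error, sample-and-hold:", rms_error(hold, truth))
```

For a smooth signal sampled densely on average, the polygonal hold tracks the signal more closely than the sample-and-hold, which is the comparison the abstract emphasizes.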
### SE.5

HIGH RESOLUTION SPECTRAL ANALYSIS USING A COMBINATION OF AN ORTHOGONAL APPROACH AND A GENETIC ALGORITHM Jean-Marc Vesin Signal Processing Laboratory Swiss Federal Institute of Technology CH-1015 Lausanne, Switzerland Tel: +41 21 693 3996; fax: +41 21 693 7600 e-mail: vesin@ltssg4.epfl.ch We describe in this paper how a method for parsimonious sinusoidal representation of signals based upon an orthogonalization technique can be suitably modified by embedding it into a genetic algorithm. We first describe the orthogonalization formalism, then we present the genetic algorithms in general and the specific form, based on a floating-point parameter representation, that we have employed in this work. Experiments are presented and possible extensions are discussed.
### SE.6

AN ENHANCED METHOD FOR THE ESTIMATION OF A DOPPLER FREQUENCY J. Crestel, M. Guitton, H. Chuberre ENSSAT / LASTI, Universite de RENNES I B.P. 447 22305 Lannion (France) Tel: (33) 96 46 56 43 Fax: (33) 96 37 01 99 e-mail: crestel@merlin.enssat.fr The enhanced Doppler frequency estimation method presented here aims at real-time measurement of the movements of a vehicle, given an on-board configuration of microwave radar sensors. The key idea is that the Doppler frequency can be identified with the mean instantaneous frequency of the signal. This frequency is then estimated using the first moment of a quadratic time-frequency distribution. The enhancement consists of a specific preprocessing of the distribution, so as to capture reliable signal information, and a weighted rejection of the higher-variance components, which are likely to be meaningless. Simulations, as well as preliminary real tests, show convincing results.
### SE.7

GABOR TRANSFORM AND ZAK TRANSFORM WITH RATIONAL OVERSAMPLING Martin J. Bastiaans Technische Universiteit Eindhoven, Faculteit Elektrotechniek, EH 5.33, Postbus 513, 5600 MB Eindhoven, Netherlands, tel: +31 40 2473319, fax: +31 40 2448375, e-mail: M.J.Bastiaans@ele.tue.nl Gabor's expansion of a signal into a set of shifted and modulated versions of an elementary signal is introduced, along with the inverse operation, i.e. the Gabor transform, which uses a window function that is related to the elementary signal and with the help of which Gabor's expansion coefficients can be determined. The Zak transform, with its intimate relationship to Gabor's signal expansion, is introduced. It is shown how the Zak transform can be helpful in determining Gabor's expansion coefficients and how it can be used in finding window functions that correspond to a given elementary signal. In particular, a simple proof is presented of the fact that the window function with minimum L2 norm is identical to the window function whose difference from the elementary signal has minimum L2 norm, and thus best resembles this elementary signal, and that this window function yields the Gabor coefficients with minimum L2 norm.
### SE.8

Title: PARAMETER ESTIMATION OF EXPONENTIALLY DAMPED SINUSOIDS USING SECOND ORDER STATISTICS Authors: K. Abed-Meraim*, A. Belouchrani**, A. Mansour***, and Y. Hua* Affiliation: * Department of Electrical and Electronics Engineering, The University of Melbourne, Parkville, Victoria 3052 Australia, a.karim@ee.mu.OZ.AU ** Department of Electrical Engineering and Computer Sciences, The University of California, Berkeley CA 94720, U.S.A, adel@robotics.eecs.berkeley.edu *** LTIRF - INPG, 46 Av. Felix Viallet, 38031 Grenoble, mansour@tirf.inpg.fr Abstract: In this contribution, we present a new approach for the estimation of the parameters of exponentially damped sinusoids based on the second order statistics of the observations. The method may be seen as an extension of the minimum norm principal eigenvectors method to the cyclo-correlation statistics domain. The proposed method exploits the nullity property of the cyclo-correlation of stationary processes at non-zero cyclo-frequencies. This property makes it possible to remove stationary additive noise in a pre-processing step. This approach presents several advantages in comparison with existing approaches based on higher order statistics: (i) it deals only with second order statistics, which generally require fewer samples than higher-order methods, (ii) it deals with both Gaussian and non-Gaussian additive noise, and (iii) it deals with either white or temporally colored (with unknown autocorrelation sequence) additive noise. The effectiveness of the proposed method is illustrated by some numerical simulations.
### SE.9

SUBSPACE-BASED PARAMETER ESTIMATION OF SYMMETRIC NON-CAUSAL AUTOREGRESSIVE SIGNALS FROM NOISY MEASUREMENTS Petre Stoica and Joakim Sorelius Systems and Control Group, Uppsala University P.O. Box 27, S-751 03 Uppsala, Sweden; Tel: +46 18 183074; fax: +46 18 503611; e-mail: js@syscon.uu.se The notion of Symmetric Non-causal Auto-Regressive Signals (SNARS) arises in several, mostly spatial, signal processing applications. In this paper we introduce a subspace fitting approach for parameter estimation of SNARS from noise-corrupted measurements. We show that the subspaces associated with a Hankel matrix built from the data covariances contain enough information to determine the signal parameters in a consistent manner. Based on this result we propose a MUSIC (MUltiple SIgnal Classification)-like methodology for parameter estimation of SNARS. Compared with the methods previously proposed for SNARS parameter estimation, our SNARS-MUSIC approach is expected to possess a better trade-off between computational and statistical performances.
### SE.10

AUTOREGRESSIVE MODELLING OF IRREGULARLY-SAMPLED DATA R.J.Martin GEC Hirst Research Centre, Elstree Way, Borehamwood, Herts WD6 1RX, UK R.Martin@hirst.gmmt.gecm.com We shall discuss how to reformulate AR modelling in terms of a stochastic differential equation, and thence how to generalise the notion of prediction to irregular sampling. This gives rise to spectral estimation and FIR filtering methods for irregularly-sampled data. We also present an extension of Shannon's theorem for the missing data problem.
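As a minimal illustration of prediction under irregular sampling, the sketch below uses the simplest continuous-time counterpart of an AR(1) model, the first-order stochastic differential equation dx = -alpha * x dt + sigma dW. This first-order case, and the names `alpha`, `sigma` and the two functions, are assumptions chosen for illustration; the abstract does not state the model order or formulation actually used.

```python
import math


def predict_ar1_irregular(x_last, alpha, dt):
    """Conditional mean of x(t + dt) given x(t) = x_last.

    Assumes the first-order model dx = -alpha * x dt + sigma dW (alpha > 0).
    The prediction gap dt may differ from sample to sample.
    """
    return math.exp(-alpha * dt) * x_last


def prediction_variance(alpha, sigma, dt):
    """Variance of the prediction error after a gap dt under the same model."""
    return sigma ** 2 * (1.0 - math.exp(-2.0 * alpha * dt)) / (2.0 * alpha)


# Irregular gaps simply change the decay factor applied to the last sample.
for gap in (0.1, 0.5, 2.0):
    print(gap, predict_ar1_irregular(1.0, alpha=1.0, dt=gap),
          prediction_variance(alpha=1.0, sigma=1.0, dt=gap))
```

With regular sampling at interval T this reduces to a discrete AR(1) predictor with coefficient exp(-alpha * T), which is why the differential-equation form extends naturally to unequal gaps.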
### SP.1

Title: NONLINEAR PREDICTION OF SPEECH SIGNALS USING RADIAL BASIS FUNCTION NETWORKS Author: Martin Birgmeier Affiliation: Department of Communication and Radio Frequency Engineering Vienna University of Technology Gusshausstrasse 25/E389 A-1040 Vienna Austria Phone: (+43 1) 58801 x 3661 Fax: (+43 1) 5870583 e-mail: Martin.Birgmeier@nt.tuwien.ac.at Abstract: In this paper, we compare the capabilities of various forms of radial basis function networks as nonlinear short-term predictors for speech signals representing sustained utterances of German vowels. We use RBF and RBF-AR network architectures, trained using a standard algorithm or alternatively the extended Kalman filter (EKF) algorithm, and linear least squares predictors. We also look at cascaded forms of linear/nonlinear predictors. We evaluate both prediction gain and spectral flatness measure of the residual. The results indicate: The RBF-AR structure is the most powerful, EKF training yields better results than standard training for RBF networks, and a non-cascaded RBF-AR predictor produces results superior to cascaded predictors.
### SP.2

NONLINEAR FORMANT-PITCH PREDICTION USING RECURRENT NEURAL NETWORKS Ekrem VAROGLU Kadri HACIOGLU Department of Electrical and Electronics Engineering Eastern Mediterranean University, Gazi Magosa, Mersin-10, Turkey Tel: +90 (392) 366 65 88; Fax: +90 (392) 366 44 79; e-mail: evaroglu@salamis.emu.edu.tr ABSTRACT In this study, a parallel structure is proposed for the nonlinear formant and pitch prediction of speech signals using Recurrent Neural Networks (RNN). The well-known Real Time Recurrent Learning (RTRL) algorithm is used as the learning algorithm. Its performance is evaluated in terms of the mean-square error and sensitivity to pitch errors through extensive computer simulations and compared to the combined formant-pitch RNN predictor and to the linear predictor.
### SP.3

SPEECH ENHANCEMENT FOR HEARING AIDS Douglas R. Campbell Department of Electrical and Electronic Engineering, University of Paisley, High Street, Paisley, Scotland, UK, PA1 2BE Tel/Fax: +44 (0)141 848 3400/3404, email: d.r.campbell@paisley.ac.uk ABSTRACT The performance of hearing aids in noisy reverberant surroundings remains a major source of complaint and discomfort to wearers. Given the current capabilities and pace of development in microelectronics, the major problem is to find successful speech enhancement schemes. Binaural unmasking experiments demonstrate an enhancement advantage, due to binaural correlation properties, which can lower the hearing threshold in noise and there is evidence that this may operate in frequency sub-bands. The performance is presented of an adaptive sub-band noise cancellation scheme which supports the possibility of performing "binaural unmasking" outwith the body, and is shown to be capable of out-performing a standard noise-cancellation scheme in the presence of reverberation.
### SP.4

ON SPEECH ENHANCEMENT ALGORITHMS BASED ON THE MMSE ESTIMATION Pascal SCALART (1), Jozue VIEIRA FILHO (2,3), José GERALDO CHIQUITO (3) (1) FRANCE TELECOM - CNET LAA/TSS/CMC Technopole ANTICIPA, 2 Avenue Pierre Marzin, 22307 Lannion Cedex, FRANCE (2) Universidade Estadual Paulista DEE/FEIS/UNESP, Av. Brasil Centro 56, Ilha Solteira - SP, BRAZIL (3) Universidade Estadual de Campinas (DECOM/FEE/UNICAMP), SP, BRAZIL E-mail : scalart@lannion.cnet.fr This paper addresses the problem of single-microphone frequency-domain MMSE noise reduction for speech enhancement in noisy environments. We first analyse the asymptotic performance of the MMSE estimate and compare these results with the Wiener filter. A practical implementation of the MMSE filter is then presented. Comparisons between the optimal and practical behaviour of the MMSE filter demonstrate that an effective improvement in the noise reduction process can be gained if greater attention is given to these estimators.
### SP.5

EVALUATION OF DIGITAL HEARING AID ALGORITHMS ON WEARABLE SIGNAL PROCESSOR SYSTEMS Uwe Rass, Gerhard H. Steeger Georg-Simon-Ohm-Fachhochschule, FB NF PO-Box 210320, D-90121 Nuernberg, Germany Tel: +49-911-5880-147, Fax: +49-911-5880-109 e-mail: rass@nf.fh-nuernberg.de, steeger@nf.fh-nuernberg.de ABSTRACT The benefit of hearing aid algorithms in everyday life can hardly be estimated from results obtained in the laboratory. Extensive field tests with many hearing impaired subjects are necessary to evaluate these processing schemes. A wearable digital hearing aid prototype is described which was developed specifically for that purpose. It is based on a fixed-point digital signal processor. This unit enables the testing of even highly sophisticated algorithms, with a changing interval of the accumulator pack of 10 hours. As application examples, a very flexible three channel dynamic compression algorithm and a binaural processing scheme for enhancing speech signals in noisy and reverberant environments are described. Application of 20 units in 3 European clinics has been started recently.
### SP.6

REDUCED-RANK NOISE REDUCTION: A FILTER-BANK INTERPRETATION Soeren Holdt Jensen (1) and Per Christian Hansen (2) (1) CPK, Aalborg University, Fredrik Bajers Vej 7, DK-9220 Aalborg OEst, Denmark. E-mail: shj@cpk.auc.dk (2) UNI-C, Building 304, Technical University of Denmark, DK-2800 Lyngby, Denmark. E-mail: Per.Christian.Hansen@uni-c.dk The key step in reduced-rank noise reduction algorithms is to approximate a matrix by another one with lower rank, typically by truncating a singular value decomposition (SVD). We give an explicit and closed-form derivation of the filter properties of the rank reduction operation and interpret this operation in the frequency domain by showing that the reduced-rank output signal is identical to that from a filter-bank whose analysis and synthesis filters are determined by the SVD. Our analysis includes the important general case in which pre- and dewhitening is used.
### SP.7

SPEAKER LOCALIZATION AND ITS APPLICATION TO TIME DELAY ESTIMATORS FOR MULTI-MICROPHONE SPEECH ENHANCEMENT SYSTEMS Martin Drews Institut fuer Fernmeldetechnik, Technische Universitaet Berlin Einsteinufer 25, D-10587 Berlin, Germany phone: +49 30 31424573, fax: +49 30 31425799 e-mail: drews@ftsu00.ee.tu-berlin.de ABSTRACT A time delay estimator for a multi-microphone speech enhancement system with 16 microphones is presented. It is based on a generalized cross-correlator and an improved peak detector. The problems associated with delay estimation in noisy speech signals are solved by performing a speaker localization and a plausibility check of the time delays derived from the speaker position. By applying these techniques to the time delay estimator, a significant reduction of the computational load is achieved, and the TDOA estimation errors are reduced. 
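A minimal sketch of cross-correlation-based time delay estimation for one microphone pair is given below. It uses the plain (unweighted) cross-correlation, the simplest member of the generalized cross-correlation family; the weighting, the improved peak detector and the localization step of the paper are not specified in the abstract and are not reproduced here. The plausibility check is only imitated by restricting the search to lags that are physically possible for an assumed microphone spacing. All names, the speed of sound and the test signal are assumptions of this sketch.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def max_plausible_lag(mic_distance_m, fs_hz):
    """Largest delay (in samples) a source can produce between two microphones."""
    return math.ceil(mic_distance_m / SPEED_OF_SOUND * fs_hz)


def cross_correlation(x, y, lag):
    """r(lag) = sum over n of y[n] * x[n - lag], for equal-length x and y."""
    n = len(x)
    return sum(y[i] * x[i - lag] for i in range(max(0, lag), min(n, n + lag)))


def estimate_delay(x, y, max_lag):
    """Lag in [-max_lag, max_lag] that maximizes the cross-correlation."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: cross_correlation(x, y, lag))


# Assumed test case: white noise reaching the second microphone 7 samples later.
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(512)]
true_delay = 7
y = [0.0] * true_delay + x[:-true_delay]

lag_limit = max_plausible_lag(mic_distance_m=0.2, fs_hz=16000)
estimated = estimate_delay(x, y, lag_limit)
print("search range: +/-", lag_limit, "samples; estimated delay:", estimated)
```

Limiting the search range both discards physically impossible peaks and cuts the number of lags evaluated, in line with the reduced computational load reported in the abstract.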
### SP.8

A WIDE-BAND SPEECH-MODEL PROCESS AS A TEST SIGNAL M.R. Serafat and U. Heute Institute for Network & System Theory, University Kiel, Germany Tel: +49 431 77572 401, Fax: +49 431 77572 403, E-Mail: res@techfak.uni-kiel.d400.de Some of the major problems in objective quality assessment of speech coding systems or in testing other adaptive speech transmission systems are the speaker dependence, reproducibility, and comparability of the measurement results, if natural speech is used as the test signal. These problems can be avoided by using suitable speech-model processes. In this paper, we present a wide-band speech-model process, which includes the same long- and short-time characteristics as natural speech. The controlling part of the generator of this process involves several trained Markov chains (MC) to adapt the time-varying properties of the process to those of natural speech. Furthermore, special care is taken of the necessary probability density function (PDF) asymmetries, because natural wide-band speech has an asymmetric PDF.
### SP.9

QUADRATIC CLASSIFIER WITH SLIDING TRAINING DATA SET IN ROBUST RECURSIVE IDENTIFICATION OF NON-STATIONARY AR MODEL OF SPEECH Milan Markovic Institute of Applied Mathematics and Electronics Kneza Milosa 37 11000 Belgrade Yugoslavia fax: 381-11 324-8681 e-mail: emarkovm@ubbg.etf.bg.ac.yu ABSTRACT In this work, a robust recursive procedure based on WRLS algorithm with VFF and a quadratic classifier with sliding training data set for identification of non-stationary AR model of speech production system is proposed. Experimental analysis is done according to the results obtained in analyzing speech signal with voiced and mixed excitation segments. Presented experimental results justify that two main problems of LPC speech analysis, non-stationarity of LPC parameters and non-appropriateness of AR modeling of speech (particularly on the voiced frames), can be solved by using the proposed robust procedure.
### SR.1

A New Training Algorithm For Hybrid HMM/ANN Speech Recognition Systems Herve Bourlard, Yochai Konig, Nelson Morgan, and Christophe Ris Faculte Polytechnique de Mons - TCTS, 31 Bld. Dolez, B-7000 Mons, Belgium. International Computer Science Institute, 1947 Center Street, Suite 600, Berkeley, CA 94704, USA. Email: bourlard@tcts.fpms.ac.be In this paper, we briefly describe REMAP, an approach for the training and estimation of posterior probabilities, and report its application to speech recognition. REMAP is a recursive algorithm that is reminiscent of the Expectation Maximization (EM) algorithm for the estimation of data likelihoods. Although very general, the method is developed in the context of a statistical model for transition-based speech recognition using Artificial Neural Networks (ANN) to generate probabilities for Hidden Markov Models (HMMs). In the new approach, we use local conditional posterior probabilities of transitions to estimate global posterior probabilities of word sequences. As with earlier hybrid HMM/ANN systems we have developed, ANNs are used to estimate posterior probabilities. In the new approach, however, the network is trained with targets that are themselves estimates of local posterior probabilities. Initial experimental results support the theory by showing an increase in the estimates of posterior probabilities of the correct sentences after REMAP iterations, and a decrease in error rate for an independent test set.
### SR.2

AUTOMATIC DISCOVERY OF WORD CLASSES THROUGH LATENT SEMANTIC ANALYSIS Jerome R. Bellegarda, John W. Butzberger, Yen-Lu Chow, Noah B. Coccaro, Devang Naik Interactive Media Group, Apple Computer, Cupertino, California 95014, USA (jerome@apple.com) A new approach is proposed for the automatic discovery of word classes in a given vocabulary. The method is based on a paradigm first formulated in the context of information retrieval, called latent semantic analysis. This paradigm leads to a parsimonious vector representation of each word in a suitable vector space, where familiar clustering techniques can be applied. The resulting word classes are intuitively satisfactory, and lead to a language model whose predictive power, as measured by perplexity, compares favorably with a conventional bigram's. Because of its semantic nature, this approach may prove useful as a complement to syntactically-oriented class-based n-gram techniques.
### SR.3

CONTINUOUS SPEECH RECOGNITION USING A NEW NEURAL NETWORK WITH TWO DIFFERENT STRUCTURES Noriyuki Ohtsuki+, Yoshikazu Miyanaga++, and Koji Tochinai++ +Department of Information Engineering, Kushiro National College of Technology Kushiro-shi 084, Japan. E-mail ohtsuki@kushiro-ct.ac.jp ++Division of Information Media Engineering, Faculty of Engineering Hokkaido University, Sapporo-shi 060, Japan. Tel. +81-11-706-6534, FAX. +81-11-709-6277 E-mail miyanaga@hudk.hokudai.ac.jp Abstract This report proposes a continuous speech recognition method using a new neural network which has two different structures. This method is able to recognize time-varying speech phonemes. The new neural network in this method consists of a self-organized clustering network and a multi-layered neural network. The self-organized clustering network extracts some characteristics of speech in the spectrum domain. The multi-layered neural network finds the time-varying characteristics of speech. From some experimental results, this report shows that the system is quite suitable for speech recognition.
### SR.4

SPEECH RECOGNITION WITH A NEURAL NETWORK TRACE-SEGMENTATION Euvaldo F. Cabral Jr. São Paulo University, Polytechnic School, Department of Electronic Engineering São Paulo - SP - Brazil Tel: +55 11 818-5267; Fax: +55 11 818-5718 email: euvaldo@lcs.poli.usp.br ABSTRACT Trace-segmentation (TS) is a method for non-linear time-normalization of a sequence of speech representation frames prior to recognition of the sequence. It has been shown in a recent work [1] that an Individual Trace-Segmentation (ITS), i.e. a separate segmentation of the trajectory described by each individual coefficient in the speech frame, leads to a much improved recognition which exceeds the performance provided by DTW recognition on the same database. This paper describes follow-on work on the ITS technique where a Multi-layer Perceptron has been used to perform an internal mapping in the original ITS input space in order to provide a tighter set of clusters of the speech sequences. This novel technique is called Neural Network Trace-Segmentation (NNTS) and has produced a significant improvement on the original ITS performance.
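The basic trace-segmentation idea (non-linear time-normalization of a frame sequence) can be sketched as resampling the feature trajectory at points equally spaced along its accumulated length, so that stationary stretches collapse and transitions are kept. This is a generic illustration under assumptions: the Euclidean distance, the linear interpolation and the function name are choices made here, and neither the ITS variant (one segmentation per coefficient) nor the NNTS mapping is reproduced.

```python
import math


def trace_segmentation(frames, num_points):
    """Resample a sequence of feature vectors at num_points positions
    equally spaced along the trace (accumulated Euclidean path length).

    Assumes len(frames) >= 2 and num_points >= 2.
    """
    # Accumulated path length at each original frame.
    cum = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        cum.append(cum[-1] + math.dist(prev, cur))
    total = cum[-1]

    out = []
    i = 0
    for j in range(num_points):
        s = total * j / (num_points - 1)  # target position along the trace
        # Find the segment [cum[i], cum[i + 1]] that contains s.
        while i < len(frames) - 2 and cum[i + 1] < s:
            i += 1
        seg = cum[i + 1] - cum[i]
        frac = (s - cum[i]) / seg if seg > 0 else 0.0
        out.append([a + frac * (b - a) for a, b in zip(frames[i], frames[i + 1])])
    return out


# A one-dimensional trajectory with long stationary stretches at both ends:
# the seven input frames reduce to three points along the trace.
frames = [[0.0], [0.0], [0.0], [1.0], [2.0], [2.0], [2.0]]
print(trace_segmentation(frames, 3))
```

The output length is fixed regardless of the input duration, which is what makes the normalized sequences directly comparable before recognition.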
### SR.5

RECOGNITION OF VOICED SPEECH FROM THE BISPECTRUM. Delopoulos, Anastasios Rangoussi, Maria Andersen, Janne. National Technical University of Athens, Dept. Of Electrical and Computer Engineering, Division of Computer Science, 9 Iroon Polytechneioy str. ATHENS, GR-15780, GREECE e-mail: adelo@image.ece.ntua.gr, maria@softlab.ece.ntua.gr Recognition of voiced speech phonemes is addressed in this paper using features extracted from the bispectrum of the speech signal. Voiced speech is modeled as a superposition of coupled harmonics, located at frequencies that are multiples of the pitch and modulated by the vocal tract. For this type of signal, nonzero bispectral values are shown to be guaranteed by the estimation procedure employed. The vocal tract frequency response is reconstructed from the bispectrum on a set of frequency points that are multiples of the pitch. An AR model is next fitted on this transfer function. The AR coefficients are used as the feature vector for the subsequent classification step. Any finite dimension vector classifier can be employed at this point. Experiments using the LVQ neural classifier give satisfactory classification scores on real speech data, extracted from the DARPA/TIMIT speech corpus.
### SR.6

Extraction of LP-Based Features from One-Bit Quantized Speech Signals for Recognition Purposes M.Felici, A.Ferrari, M.Borgatti, R.Guerrieri D.E.I.S - University of Bologna Viale Risorgimento, 2 40136 Bologna - ITALY {mfelici, aferrari, mborgatti, rguerrieri}@deis.unibo.it A simplified fixed-point computation of cepstral coefficients, based on linear predictive analysis and infinite clipping of speech signals, is described. The autocorrelation function of the clipped signal is directly used to compute the linear predictor coefficients. The performance of an isolated word recognition system based on these coefficients is presented and compared with a system which uses standard linear predictive cepstral features. The results show that these coefficients can be efficiently used for small dictionary speech recognition systems and, since the analog-to-digital conversion can be avoided, they are suitable for a low-voltage and low-power hardware implementation.
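The feature pipeline described in this abstract (infinite clipping, autocorrelation of the clipped signal, linear prediction, cepstral coefficients) can be sketched in floating point as follows. The paper's simplified fixed-point computation is not given in the abstract, so this sketch uses the textbook Levinson-Durbin recursion and the standard LPC-to-cepstrum recursion instead; the function names, the model order and the synthetic AR(1) test signal are assumptions.

```python
import random


def clip_to_one_bit(signal):
    """Infinite clipping: keep only the sign of each sample."""
    return [1.0 if s >= 0 else -1.0 for s in signal]


def autocorrelation(x, max_lag):
    """Biased autocorrelation estimate for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n)) / n for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[0..order-1] such that
    x_hat[n] = sum_i a[i] * x[n - 1 - i], from autocorrelation values r."""
    a = []
    err = r[0]
    for m in range(1, order + 1):
        k = (r[m] - sum(a[i] * r[m - 1 - i] for i in range(m - 1))) / err
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return a


def lpc_to_cepstrum(a, num_coeffs):
    """Cepstral coefficients c[1..num_coeffs] of the all-pole model 1 / (1 - sum a_i z^-i)."""
    c = []
    for n in range(1, num_coeffs + 1):
        acc = a[n - 1] if n <= len(a) else 0.0
        for k in range(1, n):
            if n - k <= len(a):
                acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


# Assumed test signal: a synthetic AR(1) process standing in for speech.
rng = random.Random(0)
signal = [0.0]
for _ in range(3999):
    signal.append(0.9 * signal[-1] + rng.gauss(0.0, 1.0))

clipped = clip_to_one_bit(signal)
r_clipped = autocorrelation(clipped, max_lag=8)
lpc = levinson_durbin(r_clipped, order=8)
cepstrum = lpc_to_cepstrum(lpc, num_coeffs=12)
print("clipped autocorrelation:", [round(v, 3) for v in r_clipped[:3]])
print("first cepstral coefficients:", [round(v, 3) for v in cepstrum[:3]])
```

Because the clipped signal takes only the values +1 and -1, each autocorrelation product is a sign agreement, which is what makes a low-power implementation without an analog-to-digital converter plausible.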
### SR.7

BLIND EQUALIZATION FOR ROBUST TELEPHONE BASED SPEECH RECOGNITION Laurent MAUUARY e-mail: mauuary@lannion.cnet.fr France Telecom, Centre National d'etudes des telecommunications, CNET/LAA/TSS/RCP, Technopole Anticipa, 2, avenue Pierre Marzin 22307 LANNION, FRANCE ABSTRACT An adaptive filter in a blind equalization scheme has recently been proposed in order to reduce telephone line effects for speech recognizers. This paper presents the principles of this filter and describes the implementation of a circular-convolution frequency domain adaptive filter in the blind equalization scheme. The property of a constant long-term speech spectrum helps to compute the gradient used for the adaptation of the weights. However, using this property in a straightforward manner results in a crude implementation of this filter. Alternative computations of the standard stochastic gradient algorithm are therefore evaluated. On the basis of the speech recognition results obtained from different speaker independent telephone databases, this filter proves to be efficient for the channel equalization task.
### SR.8

Connected Word Recognition in Extreme Noisy Environment using Weighted State Probabilities (WSP). T. Vaich and A. Cohen Recognition of continuous speech in extremely noisy environments is a difficult task. A novel algorithm is suggested to enhance the performance of recognition at very low SNRs. The left-to-right HMM Weighted State Probabilities (WSP) method considers not only the probability of getting the given observation sequence, but also the pattern of state probabilities. On a ten-digit (Hebrew) recognition task, with an SNR of 10 dB, the WSP has improved recognition results from 0% to 50%. It is suggested to apply the method, in conjunction with the PMC enhancement algorithm, to very low SNR word spotting systems.
### SR.9

HANDLING DISYNCHRONIZATION PHENOMENA WITH HMM IN CONNECTED SPEECH Pierre Jourlin Laboratoire d'Informatique C.E.R.I 339, Chemin des Meinajariès BP 1228 84911 Avignon Cedex 9 France Tel: +33 90 84 35 35 fax: +33 90 84 35 01 e-mail: jourlin@univ-avignon.fr Anticipation and retention phenomena between the different phonatory organs have been widely studied in the speech perception and production domain. However, few automatic speech recognition systems are able to handle them. In this paper, we define a product of valuated transitions automata handling these difficulties. Then, we use such automata in a recognition system based on HMM. This method is evaluated in two different contexts: bimodal and unimodal speech recognition. The results show an improvement for the product model over a synchronous one of 1.9% in the bimodal field and of 1.2% in the unimodal one.
### SR.10

STATISTICAL LIP MODELLING FOR VISUAL SPEECH RECOGNITION Juergen Luettin (1,2), Neil A. Thacker (1) and Steve W. Beet (1) (1) Dept. of Electronic and Electrical Engineering University of Sheffield Sheffield S1 3JD, UK (2) IDIAP CP 592, 1920, Martigny, Switzerland Luettin@idiap.ch, N.Thacker@shef.ac.uk, S.Beet@shef.ac.uk ABSTRACT We describe a speechreading (lipreading) system purely based on visual features extracted from grey level image sequences of the speaker's lips. Active shape models are used to track the lip contours while visual speech information is extracted from the shape of the contours. The distribution and temporal dependencies of the shape features are modelled by continuous density Hidden Markov Models. Experiments are reported for speaker independent recognition tests of isolated digits. The analysis of individual feature components suggests that speech relevant information is embedded in a low dimensional space and fairly robust to inter- and intra-speaker variability.
### SSP.1

ANALYTICAL LINKS BETWEEN STEERING VECTORS AND EIGENVECTORS Nadège THIRION, Jérôme MARS, Jean-Louis LACOUME CEPHAG-ENSIEG, BP 46, rue de la Houille Blanche, 38402 ST MARTIN D'HERES Cedex France Tél/Fax: (33) 76.82.64.21 / (33) 76.82.63.84 e-mail: thirion@cephag.observ-gr.fr We consider the problem of separation of convolutive mixtures of wideband signals impinging on an antenna of sensors, focusing on the case of interfering seismic waves. We examine the spectral matrix filtering method. The analytical study of its resolving power makes it possible to justify its use theoretically and also to explain its deficiencies in difficult contexts (waves of very close energies and/or too similar slowness, for instance). But first, this question leads us to discuss the links between two bases: the eigenvector basis and the steering vector basis.
### SSP.2

SEPARATION OF SEISMIC SIGNALS: A NEW CONCEPT BASED ON A BLIND ALGORITHM Nadège THIRION *, Jérôme MARS *, Jean-Luc BOELLE ** * CEPHAG-ENSIEG, BP 46, rue de la Houille Blanche, 38402 ST MARTIN D'HERES Cedex France Tél/Fax: (33) 76.82.64.21 / (33) 76.82.63.84 e-mail: thirion@cephag.observ-gr.fr ** Société Elf-Aquitaine CSTJF Avenue Larribeau, 64018 PAU Cedex, France In geophysical operations, the aims of signal processing are the separation and the identification of waves to get a better understanding of the onshore. The limits of the commonly used techniques may appear when waves are too close in terms of energies and/or slowness. We propose an alternative via a blind algorithm that exploits some of the concepts of blind separation of sources. The performance of such an approach is illustrated on field data.
### SSP.3

MULTICHANNEL DISTANCE FILTERING OF SEISMIC ELECTRIC SIGNALS G. Economou, A. Ifantis*, D. Sindoukas University of Patras, Physics Department, Electronics Laboratory, GR-26110 Patras, GREECE. Tel.: +30 61 997463, FAX: +30 61 997456, email: economou@physics.upatras.gr *- Technological Educational Institute of Patras, Dept. of Electrical Engng., Patras 26334. ABSTRACT A novel type of distance weighted multichannel filter is used to filter correlated multichannel 1-D seismic electric signals. These signals are weak, short time variations of the geoelectric field occurring prior to an earthquake. The new filters use intersample distances to compute coefficients. Both vector and componentwise correlation are utilised in the computation. The new composite distance filters better preserve sharp edges and correlated signal features while at the same time possessing very good noise suppression properties.
### SSP.4

HIGHER ORDER STATISTICS APPLIED TO WAVELET IDENTIFICATION OF MARINE SEISMIC SIGNALS Mohammed Boujida & Jean-Marc Boucher Télécom Bretagne, Département Signal et Communications BP 832, 29285 BREST Cedex, FRANCE Tel : 98 00 13 57, Fax: 98 00 10 12, E-mail : JM.Boucher@enst-bretagne.fr ABSTRACT The purpose of this paper is to present the use of higher order statistics to solve the blind identification problem for reflection seismic data. We develop and compare several non-parametric and parametric methods based on higher order statistics. To compare these methods, a non-minimum phase wavelet and a non-Gaussian reflectivity function are simulated. The methods are then applied to real high resolution marine seismic reflection data.
### SSP.5

FRESNEL RAYS AND RESOLUTION OF TOMOGRAPHIC IMAGING Claudio Chiaruttini D.I.N.M.A., University of Trieste, via Valerio, 10, I-34127 Trieste, Italy tel: +39 40 676 7157; fax: +39 40 676 3497 e-mail: chiaruttini@univ.trieste.it Alessandro Pregarz and Enrico Priolo Osservatorio Geofisico Sperimentale (OGS), Trieste, Italy Ray-theoretic travel-time tomography assumes an infinite signal bandwidth. When this condition is not met, energy propagates from source to receiver along Fresnel rays of finite cross-section, instead of infinitely thin mathematical rays. We use approximate analytical solutions of the weak scattering problem and numerical modelling of the full wave equation to discuss the resolution of bandlimited records. The setting of the numerical simulations is illustrative of a cross-well seismic experiment. We show that bandlimited travel-time data suffer an unexpected loss of resolution just along the mathematical ray. Nevertheless, this loss can be fully recovered by including signal amplitude in the inversion procedure. We also discuss the problem of time picking, and show that, to be consistent with the weak scattering assumption, arrival time must be estimated at the signal peak.
### VCI.2

A TESTBED FOR THE EVALUATION OF MPEG VIDEO TRANSMISSION ON ATM NETWORKS Christian J. van den Branden Lambrecht* and Andrea Basso+ *Signal Processing Laboratory, Swiss Federal Institute of Technology, CH-1015 Lausanne, Switzerland, vdb@lts.de.epfl.ch, http://ltswww.epfl.ch/~vdb/ +Telecommunications Laboratory, Swiss Federal Institute of Technology, CH-1015 Lausanne, Switzerland, basso@tcom.epfl.ch, http:/tcomwww.epfl.ch/ Most of the new broadcasting and multimedia applications intensively rely on networked video. The current trend for distributing digital video on broadband ISDN networks is towards the adaptation of MPEG streams on ATM networks. End-to-end testing of such communication systems is very important and requires robust testing methodologies that are capable of evaluating both coding and transmission errors. This paper proposes a complete architecture for doing so. The system is entirely automatic, relies on synthetic test patterns and estimates the subjective quality of video coding and network transmission.
### VCI.3

A PERFORMANCE MODEL FOR THE MPEG CODER G. Calvagno, G.A. Mian, A. Moro, R. Rinaldo Dipartimento di Elettronica e Informatica Via Gradenigo 6/a, 35131 Padova, Italy Tel: +39-49-827 7731, Fax: +39-49-827 7699, E-mail: calvagno@dei.unipd.it Abstract The MPEG video coding standard provides the syntax and semantics of bit streams representing compressed video. The underlying algorithm uses block matching motion compensation and block based DCT, with run-length coding of the quantized coefficients. It is important to derive models that predict, for a given input sequence, the algorithm performance in terms of quality versus bit rate. In this work, we show that a simple model can be used for this purpose, despite the complexity of the overall MPEG algorithm. The model can be conveniently used to determine the quantizer parameters that give a desired quality or bit rate. For instance, in buffer control, it is necessary to precisely adapt the input rate to the buffer content in order to prevent overflow and underflow.
### VCI.4

FEED-FORWARD BUFFERING AND RATE CONTROL BASED ON SCENE CHANGE FEATURES FOR MPEG VIDEO CODER Yoo-Sok Saw, Peter M. Grant, and John M. Hannah Dept of Electrical Engineering, University of Edinburgh, Edinburgh, EH9 3JL, UK. Tel: +44 131 6505655; fax: +44 131 650 6554 e-mail: ys@ee.ed.ac.uk Video traffic management has been a challenging task in the fields of network management and multi-media communication. Transmission buffering is widely used to smooth bursty traffic and maintain a steady traffic level by adapting the incoming source traffic to the buffer. This paper describes an efficient adaptive buffering scheme which is based on feed-forward control to adaptively handle the non-stationary nature of bursty video traffic. The performance of a series of quantisation scale mapping curves is presented in terms of occupancy and video quality.
### VCI.5

TREE-STRUCTURED LATTICE VECTOR QUANTIZATION Vincent Ricordel and Claude Labit IRISA/INRIA Rennes, Campus de Beaulieu, 35042 Rennes Cedex, France e-mail: ricordel@irisa.fr, labit@irisa.fr We have previously introduced a new vector quantizer (VQ) for the compression of digital image sequences. Our approach unifies two efficient coding methods: a fast lattice encoding and an unbalanced tree-structured codebook design according to a distortion vs. rate tradeoff. This tree-structured lattice VQ (TSLVQ) is based on the hierarchical packing of embedded truncated lattices. We now investigate which lattice is most efficient for this method. We also describe a fast test that detects the input vectors whose norm exceeds the maximum allowed by the TSLVQ. Finally, we analyse experimental results on image sequences, with our VQ embedded in a region-based coding scheme for a videophone application.
### VCI.6

Improving bit-rate and quality control for MPEG-2 video sources Giancarlo Cicalini*, Lorenzo Favalli*, Alessandro Mecocci** *Università di Pavia - Dipartimento di Elettronica via Ferrata, 1, I-27100 Pavia (PV) Italy; Tel: +39-382-505923; fax: +39-382-422583; e-mail: lorenzo@comel1.unipv.it **Università di Siena - Facoltà di Ingegneria; via Roma, 77, I-53100 Siena (SI), Italy tel: +39-577-2636041 fax: +39-577-263602; e-mail: mecocci@comel1.unipv.it Abstract. In video compression techniques, it is very important to implement the most efficient bit allocation strategy in order to achieve the best quality with the minimum number of bits. This paper presents a new feedback/feedforward controller for MPEG-2 coding that dynamically tunes the quantization parameters by analysing the image sequence from a psychovisual point of view. The analysis is carried out on an 8x8 pixel block basis to determine the visual characteristics of each macroblock. This pre-analysis classifies macroblocks and assigns quantization parameters to them according to a proposed scale measuring their visual relevance. A post-analysis procedure provides the final tuning. The system generates images of higher quality than the standard Test Model 5.
### VCI.7

CELL DELAY VARIATION PERFORMANCE OF CBR AND VBR MPEG-2 SOURCES IN AN ATM MULTIPLEXER Javier Zamora, Dimitris Anastassiou and Kand Ly Department of Electrical Engineering and Image Technology for New Media Center Columbia University, New York, NY 10027, USA e-mail: javier@ee.columbia.edu Video services impose specific constraints on the delay variation, or jitter, experienced when they are transmitted over packet networks such as ATM. This delay component is mainly generated in multiplexing processes and has a direct impact on the final QoS. In this paper the jitter issue is addressed in the environment of a video server connected to an ATM network. Both CBR and VBR MPEG-2 streams are considered as traffic sources. For each video source its delay variation is studied using first order and second order statistics such as jitter variance and GCRA, respectively. We study several traffic scenarios in which correlation between video sources is considered. Finally, the results obtained are compared with the M+D/D/1 model.
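The abstract names the GCRA (Generic Cell Rate Algorithm) without describing it. As a rough illustration only, the sketch below follows the commonly described virtual-scheduling form of the algorithm; the function name `gcra_conformance` and the parameters `increment` and `limit` are hypothetical and are not taken from the paper.

```python
def gcra_conformance(arrival_times, increment, limit):
    """Classify each cell as conforming (True) or non-conforming (False).

    increment: nominal spacing between cells (1 / contracted cell rate)
    limit:     tolerance, i.e. how early a cell may arrive and still conform
    Both names are illustrative assumptions, not notation from the paper.
    """
    tat = None  # theoretical arrival time of the next cell
    result = []
    for t in arrival_times:
        if tat is None:
            tat = t  # the first cell defines the reference time
        if t < tat - limit:
            # Cell arrived too early: non-conforming, reference unchanged.
            result.append(False)
        else:
            # Conforming cell: push the theoretical arrival time forward.
            result.append(True)
            tat = max(t, tat) + increment
    return result


cells = [0, 10, 15, 18, 40]
print(gcra_conformance(cells, increment=10, limit=3))
```

A larger `limit` tolerates more cell delay variation before cells are declared non-conforming, which is why the smallest tolerance at which a stream conforms can serve as a jitter measure.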
### VCI.8

A TEMPORAL MODE SELECTION IN THE MPEG-2 ENCODER SCHEME Laurent Piron Signal Processing Laboratory Swiss Federal Institute of Technology CH-1015 Lausanne, Switzerland Tel: +41 21 693 2605; fax +41 21 693 7600 e-mail: piron@ltssg2.epfl This paper deals with the mode decision in an MPEG-2 framework. An algorithm for mode decision is introduced. This algorithm is based on a rate-distortion criterion and takes into account the temporal dependency of the frames. This approach can allow a quality gain of more than one dB compared to the Test Model 5 (TM5) mode decision algorithm.
### VCI.9

REGION BASED CODING SCHEME WITH SCALABILITY FEATURES Olivier Egger, Frank Bossen, and Touradj Ebrahimi Signal Processing Laboratory Swiss Federal Institute of Technology at Lausanne CH-1015 Lausanne, Switzerland Email: egger@lts.de.epfl.ch ABSTRACT In order to satisfy the needs of new applications in a multimedia environment the problem of object-oriented coding has to be addressed. In this paper two main approaches are presented to tackle this problem. First, an algorithm for shape coding is presented. It is based on a chain coding algorithm where powerful modeling techniques are used to increase the compression ratio. Second, an algorithm for interior coding is described. It is based on an arbitrarily-shaped subband transform followed by a generalized embedded zerotree wavelet algorithm. It is shown in the paper that it achieves good compression results and has additional properties such as supporting arbitrarily-shaped regions, being computationally efficient, keeping the same dimensionality in the transformed domain, allowing perfect reconstruction and an intrinsic rate control mechanism. The presented results show that the two algorithms build an efficient basis to design object-oriented video coding schemes.
### VCI.10

A MODIFIED MPEG-1 SYSTEM BASED ON GENLOT S. H. Oguz, T. Q. Nguyen and Y. H. Hu ECE Department, University of Wisconsin-Madison 1415 Johnson Drive, Madison, WI 53706 U.S.A. Tel: 1 608 2655739; Fax: 1 608 2654623 e-mail: oguz@cae.wisc.edu, nguyen@ece.wisc.edu, hu@engr.wisc.edu In this study, a modification to ISO MPEG-1 and MPEG-2 digital video coding standards is proposed and preliminary results on its performance are reported. The proposed modification aims to improve the visual quality of MPEG-1 and MPEG-2 coding at medium-to-low bit-rate regimes by eliminating the blocking effect caused by the Discrete Cosine Transform. This goal is achieved without introducing a significant change in the MPEG hierarchy and algorithm. The theory of Lapped Orthogonal Transforms which constitutes a rather recently introduced tool for block transform coding suggests that they can reduce the blocking effect to very low levels. Hence, in the modified MPEG-like system, instead of the original two dimensional Discrete Cosine Transform, a Lapped Orthogonal Transformation is used as the basic spatial correlation reduction operation and also customized quantization and variable length codeword tables are provided to ensure efficiency. The modified coding algorithm is implemented in software. Simulations are made to compare its performance to the original MPEG-1 algorithm. As performance criteria, PSNR versus compression ratio (equivalently bit-rate) plots and also subjective ratings of visual quality are used.
### VCII.1

PARTITION PREDICTION FOR SEGMENTATION-BASED CODING TECHNIQUES Ferran Marques, Bernat Llorens and Antoni Gasull Universitat Politecnica de Catalunya Campus Nord - Modul D5 C/ Gran Capita, 08034 Barcelona, Spain E-mail: ferran@gps.tsc.upc.es This paper presents a general partition prediction scheme. It consists of four steps: region parametrization, region prediction, region ordering and partition creation. The evolution of each region is separated into two types: regular motion and shape deformation. Fourier Descriptors are used to parametrize both types of evolution, which are predicted separately in the Fourier domain. The predicted partition is built from the ordered combination of the predicted regions, using morphological tools. This technique is applied in the framework of segmentation-based video coding techniques for coding sequences of complete partitions as well as sequences of binary images (shape information in Video Object Planes -VOP-).
### VCII.2

TITLE: BIORTHOGONAL B-SPLINE FILTER BANKS FOR LOW BIT RATE VIDEO CODING AUTHORS: Sergio M. M. de Faria Mohammed Ghanbari AFFILIATION: Dep. of ESE, University of Essex Wivenhoe Park -- Colchester CO4 3SQ -- England Tel: +44 1206 872448; fax: +44 1206 872900 e-mail: defasa@essex.ac.uk, ghan@essex.ac.uk ABSTRACT: In this paper we investigate the performance of B-Spline filter banks for low bit rate image coding. The influence of certain characteristics of the analysis and synthesis of FIR filters are studied. These include the B-Spline polynomial order, the effects of coefficient truncation, coding quantisation and the distortion introduced by the filters themselves. Due to the high concentration of energy in the low frequency band, these biorthogonal filter banks have better capabilities to reconstruct signals from the lower frequency band than their counterparts. As a result a very low bit rate video codec can be designed by coarse quantisation of the higher bands.
### VCII.3

SCALABLE VIDEO CODING AT VERY LOW BIT RATES EMPLOYING RESOLUTION PYRAMIDS Klaus Illgner and Frank Mueller Institut für Elektrische Nachrichtentechnik RWTH Aachen, 52056 Aachen, Germany Tel: +49-241-80-7681; Fax: +49-241-8888-196 {illgner,mueller}@ient.rwth-aachen.de In this paper an approach for scalable video coding is described, based on the hybrid coding scheme. The scalability is achieved by decomposing the frames to be coded into a resolution pyramid. Motion estimation and compensation are performed at each level. The focus of the paper is to design motion estimation and compensation such that the resulting pyramid of vector fields as well as the pyramid of prediction errors can be coded efficiently.
### VCII.4

ADAPTIVE SUBBAND VQ FOR VERY LOW BIT RATE VIDEO CODING Stathis P. Voukelatos and John J. Soraghan Signal Processing Division, Dept. of Electronic and Electrical Eng., University of Strathclyde, Glasgow G1 1XW, Scotland, U.K., E-Mail: stathis@spd.eee.strath.ac.uk ABSTRACT A novel adaptive VQ based subband coding scheme for very low bit rate coding of video sequences is presented. Overlapped block motion estimation/compensation is employed to exploit interframe redundancy. A 2D wavelet transform (WT) is applied to the resulting displaced frame difference (DFD) signal. The WT coefficients are encoded using an adaptive vector quantization scheme in combination with a dynamic bit allocation strategy based on marginal analysis. Simulation results on videophone-type test sequences are given to evaluate the performance of the codec at very low bit rates. A comparative performance with the H.261 and H.263 video coding standards is also shown.
### VCII.5

VECTOR REPRESENTATION OF CHROMINANCE FOR VERY LOW BIT RATE CODING OF VIDEO Maciej Bartkowiak (1) Marek Domanski (1) Peter Gerken (2) (1) Politechnika Poznanska Instytut Elektroniki i Telekomunikacji ul. Piotrowo 3a 60-965 Poznan, Poland E-mail: mbartkow@et.put.poznan.pl domanski@et.put.poznan.pl (2) Institut fuer Theoretische Nachrichtentechnik und Informationsverarbeitung, Universitaet Hannover Appelstrasse 9A 30167 Hannover, Germany E-mail: gerken@tnt.uni-hannover.de A chrominance vector quantization technique is proposed as a preprocessing step prior to any kind of video coding (e.g. DCT-based or OBASC). The operation converts the stream of two-component chrominance vectors into a scalar stream of chrominance labels. The coder therefore processes two channels only: one luminance and one chrominance. After decoding, the two chrominance channels are reconstructed from the stream of labels of chrominance codebook entries. Experimental results with still images show a recognizable improvement of the subjective quality at a constant compression ratio.
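The label conversion described above can be sketched as a nearest-neighbour lookup. This is a minimal illustration that assumes a codebook is already available; the abstract does not say how the codebook is designed, and the function names `encode_chrominance` and `decode_chrominance` and the sample values are hypothetical.

```python
def encode_chrominance(pixels, codebook):
    """Map each two-component chrominance vector to the index (label)
    of the nearest codebook entry, using squared Euclidean distance."""
    labels = []
    for c1, c2 in pixels:
        nearest = min(
            range(len(codebook)),
            key=lambda k: (c1 - codebook[k][0]) ** 2 + (c2 - codebook[k][1]) ** 2,
        )
        labels.append(nearest)
    return labels


def decode_chrominance(labels, codebook):
    """Rebuild both chrominance components from the scalar label stream."""
    return [codebook[k] for k in labels]


# Illustrative codebook of three chrominance pairs (made-up values).
codebook = [(128, 128), (90, 240), (240, 110)]
pixels = [(130, 126), (95, 235), (238, 112), (128, 128)]
labels = encode_chrominance(pixels, codebook)
print(labels)
print(decode_chrominance(labels, codebook))
```

The single label channel is what the video coder would then process in place of the two original chrominance channels.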
### VCII.6

A LOW BIT RATE HIERARCHICAL VIDEO CODEC Kui Zhang, Miroslaw Bober and Josef Kittler Department of Electronic and Electrical Engineering, University of Surrey, Guildford GU2 5XH, United Kingdom e-mail:K.Zhang@surrey.ac.uk, M.Bober@surrey.ac.uk, J.Kittler@surrey.ac.uk The performance of a very low bit rate video codec largely depends on the efficient use of motion compensated prediction and on a good coding control strategy. In our previous approach, we proposed a multiple layer video codec using affine motion compensation. In this paper, we further extend our affine compensated multi-layer codec by incorporating a new block level and designing a coding control strategy. A measure of coherent motion is used in the decision process, which makes the codec perform efficiently at very low bit rates and for small image sequences (QCIF and sub-QCIF format). The experimental results obtained on 15 MPEG test sequences in QCIF format show an improvement in PSNR of 0.2 dB and a reduction in bit rate of 0.9 kbits/second.
### VCII.7

3-D SUBBAND CODING OF VIDEO USING RECURSIVE FILTER BANKS Marek Domanski and Roger Swierczynski Politechnika Poznanska, Instytut Elektroniki i Telekomunikacji, ul. Piotrowo 3a, 60-965 Poznan, Poland Phone: +48 61 782 762, Fax: +48 61 782 572, E-mail: domanski@et.put.poznan.pl , roger@et.put.poznan.pl Abstract A video coding technique based on a three-dimensional subband analysis with recursive spatial filter banks is proposed. Moreover, a simple technique to compress digital data in the subbands is described. In order to avoid annoying artifacts at edges and thin lines the filter banks are switched adaptively. Flat areas are processed with recursive filters exhibiting long impulse responses and good selectivity, while object edges and other detailed regions are processed with recursive filters with highly attenuated impulse responses and poorer selectivity. Even with a very simple encoding scheme, good visual quality has been obtained for real test video sequences in the CIF format encoded at bit rates of about 150 kbps. Further bit rate reduction could be obtained using a more sophisticated coder. An important advantage of the proposed technique is its simplicity.
### VCII.8

MOVING PICTURE FRACTAL CODING USING A MIXED APPROACH IFS AND MOTION J.-L. Dugelay and J.-M. Sadoul Institut EURECOM Multimedia Communications dept., 2229, route des Cretes, B.P. 193, 06904 Sophia Antipolis Cedex. Tel: +33 93 00 26 41; Fax: + 33 93 00 26 27 e-mail. dugelay@eurecom.fr url. http://www.eurecom.fr/~image This paper deals with a possible extension of the fractal compression algorithm defined for still images to moving pictures. The approach is a mixed one, combining inter-frame coding using block-matching with intra-frame coding using IFS.
### VCII.9

A NOVEL METHOD IN REDUCING THE COMPLEXITY OF FRACTAL ENCODING L.K. Ma, O.C. Au*, and M.L. Liou** Department of Electrical and Electronic Engineering The Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong. Tel: +852 2358-7053*, +852 2358-7055** Email: eeau@ee.ust.hk*, eeliou@ee.ust.hk** ABSTRACT Fractal coding is a promising technique for image compression. However, one of the challenges for cost effective implementation is to reduce the huge computational complexity of the encoder. In this paper, we propose a novel algorithm to address this issue. Firstly, we replace mean square error with mean absolute error as the distortion measure to reduce the number of multiplications. Secondly, we use statistical normalisation to eliminate the need to compute the scaling factor and offset during the search. Thirdly, we change the domain block search to a range block search to reduce the memory requirement. Simulation results suggest that our algorithm can reduce computation by three orders of magnitude for a QCIF image with negligible visual degradation.
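The first two ideas in this abstract can be illustrated with a small sketch. It assumes one particular normalisation (zero mean, unit mean absolute deviation), which the abstract does not specify, and it ignores block resampling and negative scaling factors; the names `normalise`, `mean_abs_error` and `best_match` are hypothetical.

```python
def normalise(block):
    """Shift a block to zero mean and scale it to unit mean absolute
    deviation (an assumed normalisation, not necessarily the paper's)."""
    n = len(block)
    mean = sum(block) / n
    centred = [v - mean for v in block]
    spread = sum(abs(v) for v in centred) / n
    if spread == 0:
        return centred  # flat block: already all zeros
    return [v / spread for v in centred]


def mean_abs_error(a, b):
    """Mean absolute error: no multiplications in the inner loop."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def best_match(target, candidates):
    """Index of the candidate block closest to the target after both are
    normalised, so no scale or offset has to be fitted per comparison."""
    t = normalise(target)
    # In a real encoder the candidates would be normalised once, up front.
    errors = [mean_abs_error(t, normalise(c)) for c in candidates]
    return min(range(len(candidates)), key=errors.__getitem__)


target = [10, 20, 30, 40]
candidates = [[5, 5, 9, 1], [1, 2, 3, 4], [7, 7, 7, 8]]
print(best_match(target, candidates))
```

Because `[1, 2, 3, 4]` differs from the target only by a positive scale and an offset, both blocks normalise to the same values and the match is found without fitting either parameter inside the search loop.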
### VCII.10

AUTOMATIC FRAME FITTING FOR SEMANTIC-BASED MOVING IMAGE CODING USING A FACIAL CODE-BOOK Paul M. Antoszczyszyn, John M. Hannah and Peter M. Grant Department of Electrical Engineering, The University of Edinburgh Edinburgh, EH9 3JL, UK Tel: +44 131 6505655; fax: +44 131 650 6554 e-mail: plma@ee.ed.ac.uk An entirely new method of automatic wire-frame fitting for semantic-based moving image coding is proposed. The algorithm utilises a code-book of facial images. All elements of the facial data-base are pre-processed and manually fitted with the wire frame model. Both pre-processing and manual fitting are a part of the facial images data-base preparation. As such, they are not a part of on-line processing of an unknown image. Only the pre-processed images (monochrome bitmaps) are used in automatic frame fitting. This allows a reduced space requirement for storage of the reference data-base.
### PL.2

MIXED ANALOG-DIGITAL MULTIRATE SIGNAL PROCESSING Sanjit K Mitra Department of Electrical and Computer Engineering University of California Santa Barbara, CA 93106, U.S.A. Jose E. Franca Department of Electrical and Computer Engineering Instituto Superior Tecnico Av. Rovisco Pais, 1, 1096 Lisboa Codex, Portugal ABSTRACT To achieve higher levels of integration there has been a growing interest in recent years in designing systems containing both analog and digital functions on a single integrated circuit. In most cases, these are inherently multirate systems because of the different sampling rates employed at various stages of the system. This paper reviews some recent developments in this area of integrated multirate analog-digital systems, with a special emphasis on their applications to communication systems.
### PL.3

EXPECTATION-BASED MULTI-FOCAL VISION FOR VEHICLE GUIDANCE Ernst D. Dickmanns Universitaet der Bundeswehr Muenchen D-85577 Neubiberg, Germany Tel: +89 6004 2077/3583; Fax: +89 6004 2082; e-mail: Ernst.Dickmanns@unibw-muenchen.de ABSTRACT Based on experience with several generations of vision systems for road vehicle guidance a new complex vehicle eye and corresponding control schemes for viewing direction control and feature extraction are proposed allowing new levels of performance with state of the art general purpose processors. Modeling along the time axis is the key to an efficient use of the degrees of freedom gained by saccadic viewing strategies.

### SS.1.1

A MRF BASED APPROACH TO COLOR IMAGE RESTORATION C.S. Regazzoni, E. Stringa, A.N. Venetsanopoulos* Department of Biophysical and Electronic Engineering (DIBE), University of Genoa Via all'Opera Pia 11A, 16145 Genova, ITALY Tel: +39 10 3532792; fax: +39 10 3532134 e-mail: carlo@dibe.unige.it *Department of Electrical and Computer Engineering, University of Toronto 10 King's College Road, Toronto, ON, CANADA ABSTRACT In this paper, a Markov Random Field (MRF)-based method is presented. MRF methods are based on a probabilistic representation of an image processing problem: the problem is represented as the maximization, over all possible solutions, of a probability measure computed from the input data. The optimization process is often computationally expensive. The coupled problem of restoring an image and extracting its edges is considered here. The deterministic mean-field annealing method presented in [1] is extended to the color case. The main advantage of this approach is that it obtains a sub-optimal solution faster than optimal stochastic methods (e.g., Simulated Annealing).
### SS.1.2

Resolution Enhancement of Color Video Brian C. Tom and Aggelos K. Katsaggelos Northwestern University Department of Electrical Engineering and Computer Science McCormick School of Engineering and Applied Science Evanston, IL 60208-3118 USA Tel: (847) 491-7164 Fax: (847) 491-4455 Email: briant@eecs.nwu.edu, aggk@eecs.nwu.edu In this paper, an approach to improve the spatial resolution of color video is presented. Such high resolution images are desired, for example, in video printing. Previous work has shown that the most important step in achieving high quality results is the accuracy of the motion field. It is well known that motion estimation is an ill-posed problem. However, in processing color video, additional information contained in the color channels may be used to improve the accuracy of the motion field over the motion field obtained with the use of only one channel. In turn, this improvement in the motion field will be shown through several experimental results to significantly improve the estimation of a high resolution image sequence from a corresponding observed low resolution sequence.
### SS.1.3

Noise modeling for smoothing the colour histogram L.Shafarenko, M.Petrou and J.Kittler Dept. of Electronic and Electrical Engineering, University of Surrey, Guildford GU2 5XH, United Kingdom. e-mail: l.shafarenko,m.petrou @ee.surrey.ac.uk In this paper we present a segmentation algorithm for colour images that uses the watershed algorithm to segment either the 2D or the 3D colour histogram of an image. For compliance with the way humans perceive colour, this segmentation has to take place in a perceptually uniform colour space like the space. To avoid oversegmentation, the watershed algorithm has to be applied to a smoothed out histogram. The noise, however, is inhomogeneous in the space and we present here the noise analysis for this space based on assumptions that are experimentally justified.
### SS.1.4

ADAPTIVE MULTICHANNEL L FILTERS BASED ON REDUCED ORDERING N. Nikolaidis I. Pitas Dept. of Electrical and Computer Engineering, University of Thessaloniki, Thessaloniki, GREECE, nikolaid@zeus.csd.auth.gr Dept. of Informatics, University of Thessaloniki, Thessaloniki, GREECE, pitas@zeus.csd.auth.gr Multichannel L filters based on the reduced ordering principle have recently been proposed as an effective nonlinear filtering structure for multivariate data. Evaluating the optimal coefficients for these filters requires a priori information on the signal statistics, which might not always be available. To overcome this, we propose adaptive multichannel L filters based on the LMS algorithm. Convergence issues for the new adaptive filter structures are studied. Experiments involving color images demonstrate the superior performance of the proposed filters in noise removal.
### SS.1.5

NEAREST NEIGHBOUR MULTICHANNEL FILTERS FOR IMAGE PROCESSING K.N. Plataniotis, D. Androutsos, A.N. Venetsanopoulos Department of Electrical and Computer Engineering University of Toronto Toronto, Ontario, M5S 1A4, Canada http://www.comm.toronto.edu/~dsp/dsp.html e-mail: kostas@dsp.toronto.edu This paper addresses the problem of noise attenuation for multichannel data. Two multichannel filters which utilize adaptively determined data dependent coefficients are introduced. The special case of colour image processing is studied as an important example of multichannel signal processing. Simulation results indicate that the new filters are computationally attractive and have excellent performance.
### SS.1.6

COLOR IMAGE FILTERING USING GENERALIZED COST FUNCTIONS D. Sindoukas, S. Fotopoulos, G. Economou University of Patras, Physics Department, Electronics Laboratory, GR-26110 Patras, GREECE. Tel.: +30 61 997465, FAX: +30 61 997456 email: spiros@physics.upatras.gr ABSTRACT This work investigates the concept of cost function (CF) in the context of image filtering. Optimal behaviour of the resulting filters with respect to noise attenuation and edge preservation is sought through the minimization of these functions. In some cases this behaviour can be controlled by proper adjustment of certain parameters. Function combinations are also considered. Finally, the proposed schemes are tested on real images and objective as well as subjective results are reported.
### SS.1.7

MORPHOLOGICAL LIKE OPERATORS FOR COLOR IMAGES Constantin Vertan, Viorel Popescu, Vasile Buzuloiu Department of Applied Electronics, Bucuresti "Politehnica" University fax: + 40 1 312. 31. 93 email: vertan@alpha.imag.pub.ro, vpopescu@edil.edil.pub.ro, buzuloiu@alpha.imag.pub.ro Primarily based on Serra's framework, mathematical morphology has become one of the most used nonlinear processing and analysis techniques. Later work extended the operators, initially defined on sets, to functions, in a general algebraic definition for multidimensional scalar signals. The case of vector valued images (or signals) is not included in this theory. The extension of mathematical morphology to color images is equivalent to the definition of an ordering relation in a vector space. In this paper we investigate several ordering relations in the color space, each of them leading to a definition of morphological operations. The performance of the filtering based on these operations is evaluated in terms of Normalized Mean Square Error (NMSE), Mean Chromaticity Error (MCRE), space topology preservation and visual subjective perception of image quality.
### SS.1.8

IMAGE SEGMENTATION BY AREA DECOMPOSITION OF HSV COMPONENTS Stephen J. Impey and J. Andrew Bangham School of Information Systems, University of East Anglia, Norwich NR4 7TJ, UK Email: sji@sys.uea.ac.uk Coloured images may be simplified with an area based sieve whilst preserving edges and, usually, colour up to the edges using either the hue, saturation and value (HSV) or red, green, blue (RGB) components. Furthermore, an image may be segmented by area. Applying the sieve to HSV components from a colour image appears to significantly improve the chances of finding objects in a scene, particularly when the objects have different colours. An example of finding cars in a car park scene is presented.
### SS.1.9

CLASSIFICATION OF MULTISPECTRAL REMOTE-SENSING IMAGES BY NEURAL NETWORKS F. Roli(1), S.B. Serpico(2), L. Bruzzone(2), and G. Vernazza(1) (1) Dept. of Electrical and Electronic Eng., University of Cagliari Piazza d'Armi, I-09123, Cagliari, Italy tel: +39 70 6755897; fax: +39 70 6755900 e-mail: vernazza@elettro1.unica.it (2) Dept. of Biophysical and Electronic Eng., University of Genoa Via all'Opera Pia, 11A, 16145, Genova, Italy tel: +39 10 3532752; fax: +39 10 3532134 e-mail: vulcano@dibe.unige.it ABSTRACT This paper addresses the classification of multispectral remote-sensing images by the neural-network approach. In particular, an experimental comparison of the performance of different neural models for classifying multisensor remote-sensing data is reported. Four neural classifiers are considered in the comparison: the Multilayer Perceptron, Probabilistic Neural Networks, Radial Basis Function networks and a kind of Structured Neural Networks.
### SS.1.10

NEURAL PROCESSING OF MULTISPECTRAL AND MULTITEMPORAL AVHRR DATA Vito Cappellini(*), Marco Benvenuti (), Carlo Di Chiara (), Stefano Fini () (*) University of Florence, Department of Electronic Engineering Via di S. Marta, 3 - 50139 Florence - Italy () Fondazione per la Meteorologia Applicata Via Caproni, 8 - 50145 Florence - Italy () Centro di Studi per l'Informatica applicata in Agricoltura Via Caproni, 8 - 50145 Florence - Italy ABSTRACT In recent years a large amount of multisensor data has been generated as a consequence of the development of remote sensing techniques for the analysis of the Earth's surface. The study of the evolution of the vegetation status is particularly useful in planning agro-ecological operations and in estimating vegetation development. In this paper, vegetation index data (NDVI) collected by the AVHRR sensor on the NOAA satellite are processed. These multitemporal data belong to a historical archive composed of ten years of ten-day images of the whole African continent. This archive has been implemented in the framework of a co-operation between NASA-GSF and the FAO Remote Sensing Centre (ARTEMIS project). The archive spans August 1981 to June 1991 and is composed of 356 georeferenced images having a spatial resolution of 7.6 km x 7.6 km. This set of NDVI data, collected over such a long period of time, is extremely useful when the annual and seasonal variations of the reflectance of the Earth's surface have to be investigated. In this work a new approach to NDVI data processing is presented: it is composed of both statistical analysis techniques and neural algorithms. The large number of images in the archive makes it extremely difficult to analyse the whole data set, particularly when personal computers are used for processing.
The method can be summarized in two fundamental steps: i) reduction of the number of images to be processed, controlling the loss of information by means of statistical techniques; ii) the use of a neural network for clustering the scene in order to highlight areas showing similar vegetation index variability. In the first processing step, the Principal Component transformation is applied to the images of each year, thus eliminating redundant information. In this way the number of images to be processed by the unsupervised classifier is dramatically reduced. The optimal number of classes is chosen by the chi-squared statistical test, suitably modified and applied to different classifications with a variable number of clusters. A three-layered neural network is used for clustering. This network is obtained by combining two well-known architectures: the first one is unsupervised (Kohonen map) whilst the second one is supervised (Grossberg layer). Finally, means and standard deviations of the vegetation index for each cluster as well as for each decade are computed.
### SS.2.2

VIDEO CODING USING ADAPTIVE GLOBAL MC AND LOCAL AFFINE MC Hirohisa Jozawa, Kazuto Kamikura, Kazuhisa Yanaka, and Hiroshi Watanabe NTT Human Interface Laboratories (jozawa@nttvdt.hil.ntt.jp) This paper describes an efficient video coding method using two-stage motion compensation (MC). The proposed MC method employs global MC (GMC) and overlapped block affine MC. GMC is adaptively turned on/off for each macroblock since GMC cannot predict all regions in an image. Simulation results show that the proposed coding method using two-stage MC significantly outperforms H.263 for sequences with fast motion. Performance improvements in PSNR are about 3-4 dB over H.263.
### SS.2.3

STANDARDS BASED VIDEO COMMUNICATIONS AT VERY LOW BIT-RATES Bernd Girod, Niko Faerber, and Eckehard Steinbach Lehrstuhl fuer Nachrichtentechnik University of Erlangen-Nuremberg Cauerstrasse 7, D-91058 Erlangen, Germany Tel: +49 9131 857100; fax: +49 9131 303840 E-mail: girod@nt.e-technik.uni-erlangen.de Video communication at very low bit-rates has made significant progress recently through the new ITU-T standard H.263. In this paper, we review the performance advances over the 1990 ITU-T standard H.261, and present a novel extension that allows robust transmission of moving video over highly unreliable channels, such as the mobile channel.
### SS.2.4

SELECTIVE CODING BY FOCUS OF ATTENTION: A NEW TOOL TO ACHIEVE VLBR VIDEO CODING Eric Nguyen, Claude Labit IRISA, Campus Universitaire de Beaulieu 35042 Rennes Cedex, France Tel: +33 99 84 72 60; fax: +33 99 84 71 71 {nguyen,labit}@irisa.fr Selective source coding is an essential part of very low bit rate (VLBR) image/video compression where a significant irrelevancy reduction has to be performed. In this paper, this reduction is described in the context of visual attention: the selection of relevant spatial information at the expense of other (non-relevant) information in order to maximize the efficiency of a particular visual communication task. We first give a general framework of selective coding. We then illustrate it with some examples of implementation using the generic wavelet representation as a stand-alone technique or for spatial encoding of the MC residuals in a MC-DPCM hybrid video coding scheme.
### SS.2.5

LOW BIT RATE VIDEO CODING FOR MOBILE MULTIMEDIA COMMUNICATIONS Reginald L. Lagendijk, Jan Biemond and Cor P. Quist Delft University of Technology, Department of Electrical Engineering, Information Theory Group P.O. Box 5031, 2600 GA Delft, The Netherlands Tel: +31 15 278 3731; Fax: +31 15 278 1843 e-mail: {lagendijk,biemond}@et.tudelft.nl; WWW: http://www-it.et.tudelft.nl In this paper we first describe the objectives of the Delft Mobile Multimedia Communications project. Next, the subject of lossy contour compression is considered in more detail as it is an essential component of most object or region-based compression techniques for low bit rate video coding. We propose an optimized B-splines approximation approach, which results in a 40 percent higher compression than the lossless conditional chain code method. Achieved rates are, depending on the tolerable deviation between original and coded contour, in the order of 0.70 to 0.90 bit per contour pixel.
### SS.2.6

A VERY LOW BIT-RATE VIDEO CODEC WITH OPTIMAL TRADE-OFF AMONG DVF, DFD AND SEGMENTATION Guido M. Schuster and Aggelos K. Katsaggelos Northwestern University Department of Electrical and Computer Engineering 2145 Sheridan Road, Evanston, Illinois 60208-3118, USA E-mail: gschuster@nwu.edu, aggk@eecs.nwu.edu In this paper we present a theory for the optimal bit allocation among quad-tree (QT) segmentation, displacement vector field (DVF) and displaced frame difference (DFD). The theory is applicable to variable block size motion compensated video coders (VBSMCVC), where the variable block sizes are encoded using the QT structure, the DVF is encoded by first order differential pulse code modulation (DPCM), the DFD is encoded by a block based scheme and an additive distortion measure is employed. We consider the case of a lossless VBSMCVC first, for which we develop the optimal bit allocation algorithm using Dynamic Programming (DP). We then consider a lossy VBSMCVC, for which we use Lagrangian relaxation and show how an iterative scheme, which employs the DP-based solution, can be used to find the optimal solution. We finally present a VBSMCVC based on the proposed theory, which employs a DCT-based DFD encoding scheme. We compare the proposed coder with H.263. The results show that it outperforms H.263 by about 25% in terms of bit rate for the same quality of reconstructed image.
### SS.2.7

SELECTIVE USE OF MODEL-BASED CODING FOR LARGE MOVING OBJECTS Don Pearson Department of Electronic Systems Engineering University of Essex, Colchester CO4 3SQ, UK Tel: +44 1206 872865; Fax: +44 1206 872900 Email: dep@essex.ac.uk Measurements using a continuous quality recording method have revealed the extent of quality variations that occur in MPEG2 pictures at low bit rates. Large moving objects in particular can give rise to severe troughs in quality. The complementary characteristics of model-based coding are examined with a view to a synthesis of the two methods in a switched coder, with possible increased overall coding efficiency.
### SS.2.8

VERY LOW BITRATE VIDEO CODING AND MPEG-4: STILL A GOOD RELATION Fernando Pereira Instituto Superior Técnico - Instituto de Telecomunicações Av. Rovisco Pais, 1096 Lisboa Codex, PORTUGAL Telephone: + 351 1 8418460; Fax: + 351 1 8418472 E-mail: eferbper@beta.ist.utl.pt ABSTRACT MPEG-4 emerged recently as an important development in the field of audio-visual coding aiming at establishing the first content-based audio-visual coding standard. This paper intends to analyse the current relation between MPEG-4 and very low bitrate video coding and corresponding applications, notably by considering the MPEG-4 objectives, functionalities and recent technical developments related to video coding.
### SS.2.9

DYNAMIC CODING FOR VISUAL COMMUNICATIONS Emmanuel REUSENS, Touradj EBRAHIMI, Roberto CASTAGNO, Corinne LE BUHAN and Murat KUNT Signal Processing Laboratory Swiss Federal Institute of Technology CH-1015 Lausanne, SWITZERLAND E-mail: reusens@lts.de.epfl.ch In this paper, a new approach to the problem of visual data representation in the framework of multimedia is introduced. The approach, named 'dynamic coding', consists in a dynamic combination of multiple representation models and segmentation strategies. Given an application, these two degrees of freedom are assembled so as to yield a specific profile which meets the specifications dictated by the application. The data is represented as the union of data segments, each described with a locally appropriate representation model. In order to illustrate this approach, a video compression system, based on the principles of dynamic coding, is proposed in the context of video-telephone/conference applications.
### SS.2.10

SEGMENTATION-BASED VIDEO CODING: TEMPORAL LINKING AND RATE CONTROL Philippe Salembier, Ferran Marques and Montse Pardas Universitat Politecnica de Catalunya Campus Nord - Modul D5 C/ Gran Capita, 08034 Barcelona, Spain E-mail: {philippe,ferran,montse}@gps.tsc.upc.es This paper analyzes the main elements that a segmentation-based video coding approach should be based on so that it can address coding efficiency and content-based functionalities. Such elements can be defined as temporal linking and rate control. The basic features of such elements are discussed and, in both cases, a specific implementation is proposed.
### SS.3.1

Chinese Remainder Theorem: Recent Trends and New Results in Filter Banks Design C.W.Kok and T.Q.Nguyen ECE Dept., University of Wisconsin Madison, 1415 Engineering Drive, Madison, WI 53706 Tel: (608)-265-4885 Fax: (608)-262-4623 email: ckok@cae.wisc.edu and nguyen@ece.wisc.edu Recent advances in the time domain methods have led to many new approaches in filter bank designs. The objective of this paper is to derive a unified theory for these time domain methods, based on the Chinese Remainder Theorem. Topics discussed in this paper include two-channel filter banks, M-channel filter banks and 2-D filter banks. Design examples are presented to demonstrate the theory.
### SS.3.2

ON PERFECT-RECONSTRUCTION FIR FILTER BANKS Eleftherios Kofidis{1} S. Theodoridis{2} N. Kalouptsidis{2} 1: Department of Computer Engineering and Informatics, University of Patras, Patras 265 00, Greece. E-mail: kofidis@cti.gr 2: Department of Informatics, Division of Communications and Signal Processing, University of Athens, Athens 157 71, Greece. E-mail: {stheodor,kalou}@di.uoa.gr This paper deals with the problem of designing an N-band maximally-decimated analysis filter bank given K of its filters, so that perfect reconstruction with FIR synthesis filters is possible. An algorithm for computing the N-K unknown analysis filters and the synthesis filters is given and the solution set is completely parametrized. The parametrization is exploited to optimize the frequency responses of the resulting filters and to derive a simple parametrization for the paraunitary case. The linear-phase case is also discussed with emphasis on 2-band filter banks. An example is provided to illustrate the theory.
### SS.3.3

LATTICE STRUCTURE FOR TWO-CHANNEL FILTER BANKS WITH COMPLEX COEFFICIENTS, WHICH YIELD SYMMETRIC WAVELET BASES Todor Cooklev*, Akinori Nishihara^, and Masaki Kato^ * Dept. Electr. Comp. Eng., University of Toronto, 10 King's College Rd., Toronto, ON M5S 1A4, Canada, cooklev@dsp.toronto.edu ^ Dept. Physical Electronics, Tokyo Inst. Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo, 152 Japan, aki@ss.titech.ac.jp ABSTRACT A new lattice structure is described. It is capable of implementing all paraunitary two-channel filter banks where the filters have complex coefficients and yield symmetric wavelet bases. This lattice structure, while being a general design method, can also be used to actually design the filter bank. These filter banks are, in fact, a special case of multifilter banks and can also be related to Golay-Rudin-Shapiro complementary polynomial pairs. The applications of such filter banks are to be found in subband coding and communications systems.
### SS.3.4

FIR OVERSAMPLED FILTER BANKS AND FRAMES IN l2(Z) Zoran Cvetkovic and Martin Vetterli Department of Electrical Engineering and Computer Sciences University of California, Berkeley, CA 94720, USA zoran@eecs.berkeley.edu, martin@eecs.berkeley.edu Perfect reconstruction FIR filter banks implement a particular class of signal expansions in l2(Z). These expansions are studied in this paper. Necessary and sufficient conditions on an FIR filter bank to implement a frame or a tight frame decomposition are given, as well as the necessary and sufficient condition for the feasibility of perfect reconstruction using FIR filters. Complete parameterizations of FIR filter banks satisfying these conditions are given. Further, we study the condition under which the minimal dual frame to the frame associated with an FIR filter bank is also FIR, and give a parameterization of a class of filter banks having this property. We then concentrate on the least constrained class, namely nonsubsampled filter banks, for which these frame conditions have particular forms.
### SS.3.5

AN ADAPTIVE PROJECTION ALGORITHM FOR MULTIRATE FILTER BANK OPTIMIZATION Dong-Yan Huang and Phillip A. Regalia Departement Signal & Image Institut National des Telecommunications 9, rue Charles Fourier F-91011 Evry cedex France huang@int-evry.fr, regalia@galaxie.int-evry.fr Abstract: We develop a new algorithm for multirate filter bank optimization, which finds application in subband coding or wavelet signal analysis. Although some impressive off-line algorithms have recently been developed for this purpose, the computational demand of such algorithms often renders them prohibitive for real-time applications. In this vein, adaptive filtering solutions remain of interest. A simple gradient descent algorithm may be ill suited due to the nonquadratic nature of the cost function to be minimized, and accordingly non-gradient algorithms may offer some attractive alternatives. The present paper describes a projection type algorithm, which aims to construct a lossless filter bank such that one of its impulse responses lies close to an extremal eigenvector of the input signal autocorrelation matrix. Though a formal convergence proof of the algorithm is not offered, simulations show that the algorithm converges to an acceptable vicinity of the global minimum point of the cost function.
### SS.3.6

CONSIDERATIONS IN THE DESIGN OF OPTIMUM COMPACTION FILTERS FOR SUBBAND CODERS Yuan-Pei Lin and P. P. Vaidyanathan yplin@systems.caltech.edu ppvnath@systems.caltech.edu Dept. of Electrical Engineering, 136-93, Caltech, Pasadena, CA 91125, U.S.A. Abstract Recently there has been considerable interest in the design of optimal paraunitary filter banks for a given class of inputs. In this paper we address a number of practical considerations associated with the design and implementation of optimal paraunitary filter banks.
### SS.3.7

ORTHOGONAL TRANSMULTIPLEXER: A MULTIUSER COMMUNICATIONS PLATFORM FROM FDMA TO CDMA Ali N. Akansu and Mehmet V. Tazebay New Jersey Institute of Technology Department of Electrical and Computer Engineering Center for Communications and Signal Processing Research University Heights, Newark, NJ 07102 ABSTRACT Orthogonal transmultiplexers have been successfully utilised for multi-user communications. They are of the FDMA type in their most common version. In the CDMA communications reported in the literature, mostly frequency-selective PR-QMFs were used in transmultiplexers as orthogonal user codes. This conflicts with the fundamentals of CDMA theory. We introduce novel M-valued spread spectrum PR-QMF codes in this paper. It is shown that the proposed M-valued spread spectrum PR-QMF codes with minimised auto- and cross-correlation properties outperform the conventional Gold codes in the CDMA communication scenarios considered in the paper.
### SS.3.8

ON EFFICIENT IMPLEMENTATION OF MULTIDIMENSIONAL MULTIRATE FILTERS DERIVED FROM ONE-DIMENSIONAL FILTERS Tsuhan Chen AT&T Research Room 4C528, 101 Crawfords Corner Road, Holmdel, NJ 07733, USA Tel: +1 908 949-2708 Fax: +1 908 957-8388 e-mail: tsuhan@research.att.com We study the efficient implementation of multidimensional (MD) filters used in multirate systems. These filters, typically having parallelepiped-shaped passband supports, can be derived from one-dimensional (1D) prototype filters. The resulting nonseparable MD filters have separable polyphase components that are combinations of the polyphase components of the 1D prototypes, so efficient implementation exists. We show that, for the two-dimensional case, all the polyphase components of the 1D prototypes are utilized. Therefore, there is no design overhead in this scheme.
### SS.3.10

MEASUREMENT AND SYMBOLIC ANALYSIS OF IMPLEMENTED MULTIRATE SYSTEMS Hans W. Schuessler and Frank Heinle Lehrstuhl fuer Nachrichtentechnik, Universitaet Erlangen-Nuernberg, Cauerstrasse 7, D-91058 Erlangen, Germany Phone : +49-9131-857101 Fax : +49-9131-303840 E-mail: heinle@nt.e-technik.uni-erlangen.de Multirate systems (MRS) play a major role in modern telecommunication. Important examples are filter banks for image or speech coding, transmultiplexers, and sampling rate converters. In general, these systems are designed without consideration of implementation aspects such as wordlength limitations. The performance of realized systems will therefore differ from the desired one depending on the system structure. Not all deviations can be calculated in closed form and even practicable calculations are often extensive and error-prone. Therefore, we present a method for measuring quantization effects in realized MRS. Furthermore, we introduce a new program for the symbolic analysis of MRS using the computer algebra program MAPLE.
### SS.3.11

EFFICIENT IIR SWITCHED-CAPACITOR DECIMATORS AND INTERPOLATORS F. A. P. Baruqui (1), A. Petraglia (2), S. K. Mitra (3) and J. E. Franca (4) (1) Programa de Engenharia Eletrica COPPE, EE/UFRJ - 21945-970 Rio de Janeiro, RJ, Brasil. baruqui@coe.ufrj.br (2) Programa de Engenharia Eletrica COPPE, EE/UFRJ - 21945-970 Rio de Janeiro, RJ, Brasil. antonio@coe.ufrj.br (3) Depto. of Elec. & Comp. Engineering - Univ. of California, Santa Barbara, CA 93106-9560. mitra@ece.ucsb.edu (4) Grupo de Circ. e Sist. Integrados, Inst. Superior Tecnico - Av. Rovisco Pais 1, 1096 Lisboa Codex Portugal. franca@ecsm4.ist.utl.pt ABSTRACT The IIR switched-capacitor decimators and interpolators proposed in this paper are based on the polyphase decomposition of an M-th band IIR lowpass filter, and use first- and second-order allpass switched-capacitor filters as basic building blocks. These blocks operate at the lower sampling rate, reducing power consumption, capacitance spread and total capacitance area. The resulting switched-capacitor network has low sensitivity with respect to capacitance ratio errors, especially in the passband, where very low sensitivity is guaranteed by using structurally allpass filters. These properties have been verified by computer based sensitivity analysis, and an illustrative design example, considering realistic specifications for video communication applications, is included in the paper, along with comparisons with other approaches reported in the literature. Laboratory results obtained with a prototype filter are shown as well.
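To illustrate why a polyphase decomposition lets the filtering run at the lower sampling rate, the following minimal sketch compares a direct decimator with its polyphase form. It is an assumption-laden stand-in: it uses an FIR filter in plain Python rather than the IIR allpass switched-capacitor structure of the paper, and the names `decimate_direct`, `decimate_polyphase` and the sample data are hypothetical.

```python
def decimate_direct(x, h, M):
    """Filter x with FIR h at the high rate, then keep every M-th output sample."""
    y = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
         for n in range(len(x))]
    return y[::M]


def decimate_polyphase(x, h, M):
    """Same result, but each branch only computes samples at the low (output) rate."""
    out_len = (len(x) + M - 1) // M
    y = [0.0] * out_len
    for p in range(M):
        e_p = h[p::M]                       # p-th polyphase component of h
        for m in range(out_len):
            acc = 0.0
            for j, coeff in enumerate(e_p):
                idx = (m - j) * M - p       # input sample feeding branch p
                if 0 <= idx < len(x):
                    acc += coeff * x[idx]
            y[m] += acc
    return y


# Hypothetical test data: integer samples keep the comparison exact.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
h = [1, 2, 3, 4, 5]
direct = decimate_direct(x, h, 3)
poly = decimate_polyphase(x, h, 3)
```

Both functions return the same output sequence; the polyphase version never computes the output samples that the direct version discards.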
### SS.4.1

COMBINED ACOUSTIC ECHO CONTROL AND NOISE REDUCTION FOR HANDS-FREE TELEPHONY - STATE OF THE ART AND PERSPECTIVES Rainer Martin and Peter Vary IND, Aachen University of Technology 52056 Aachen, Germany Tel: +49 241 806984; fax: +49 241 8888186 e-mail: martin@ind.rwth-aachen.de In this paper we summarize and discuss recent results in acoustic echo cancellation and noise reduction with emphasis on methods which combine both aspects. It is shown that echo control and noise reduction can support each other in a true synergy. The paper discusses fundamental issues of algorithm design and suggests that a frequency domain multi-microphone solution might be best suited to achieve the desired performance.
### SS.4.2

BINAURAL ANALYSIS METHODS AND THEIR RELATIONSHIP TO QUALITY EVALUATION OF HANDS-FREE TELECOMMUNICATION EQUIPMENT H. W. Gierlich HEAD acoustics GmbH, Ebertstr. 30a 52134 Herzogenrath, Germany, Tel.: +49 2407 57722; Fax: +49 2407 57799, e-mail: head-gr@infoac.rmi.de Since modern telecommunication equipment, especially hands-free telephones, incorporates sophisticated signal processing, the analysis methods must take into account the properties of human hearing. The basis for the correct acquisition of test data, used for auditory as well as instrumental measurements, is the binaural recording and binaural analysis of the test stimuli. The paper gives an overview of the ways in which binaural methods can be applied for quality evaluation. The paper focuses on methods for acquiring test data in the listening situation, in the conversational situation and for instrumental measurements using defined, artificial test stimuli. Various methods for playback of binaurally recorded sounds in different situations are shown.
### SS.4.3

IMPLEMENTATION AND EVALUATION OF AN ACOUSTIC ECHO CANCELLER USING DUO-FILTER CONTROL SYSTEM Yoichi Haneda, Shoji Makino, Junji Kojima, and Suehiro Shimauchi NTT Human Interface Laboratories (E-mail: haneda@splab.hil.ntt.jp) The developed acoustic echo canceller uses an exponentially weighted step-size projection algorithm and a duo-filter control system to achieve fast convergence and high speech quality. The duo-filter control system has an adaptive filter and a fixed filter, and uses variable-loss insertion. Evaluation of this system with multi-channel A/D and D/A converters showed that (1) the convergence speed is under 1.5 seconds for speech input when the adaptive filter length is 125 ms, (2) the residual echo level is nearly as low as the ambient noise level (average: under -20 dB; maximum: under -35 dB), and (3) near-end speech is sent with no disturbance during double talk.
### SS.4.4

IDENTIFYING THE TRUE ECHO PATH IMPULSE RESPONSES IN STEREOPHONIC ACOUSTIC ECHO CANCELLATION Fabrice Amand*, André Gilloire**, Jacob Benesty*** * CEFRIEL, Politecnico di Milano, Via Emanueli, 15, 20126 - Milano, Italy email: amand@mailer.cefriel.it ** CNET LAA/TSS/CMC Technopole Anticipa, 2 Avenue Pierre Marzin, 22307 Lannion Cedex, France e-mail: gilloire@lannion.cnet.fr *** Lucent Technologies, Bell Labs Innovations, 600 Mountain Avenue, Murray Hill, NJ 07974, USA ABSTRACT A fundamental problem in stereophonic acoustic echo cancellation for teleconferencing is whether the true impulse responses of the acoustic echo paths can be identified. This problem arises from the correlation between the two signals picked up in the remote room. We demonstrate by simple theoretical considerations and experiments that in real situations, due to the characteristics of the acoustic environment in the remote room, the identified impulse responses converge to the true echo path impulse responses.
### SS.4.5

ANALYSIS OF TWO STRUCTURES FOR COMBINED ACOUSTIC ECHO CANCELLATION AND NOISE REDUCTION Yann Guelou, Abdelkrim Benamar, Pascal Scalart France Telecom CNET LAA/TSS/CMC benamar@lannion.cnet.fr, scalart@lannion.cnet.fr This paper addresses the problem of speech enhancement in the context of GSM hands-free radiotelephony where the signal to be transmitted is corrupted by background noise and echo signals. We analyze possible schemes for combined acoustic echo cancellation (AEC) and noise reduction (NR) devices. Considering two AEC algorithms and one NR device, we show that the overall performance obtained by these schemes depends strongly on the intrinsic behaviour of the AEC algorithm considered. These results are confirmed by informal listening tests presented in this contribution.
### SS.4.6

PERFORMANCE OF ADAPTIVE DEREVERBERATION TECHNIQUES USING DIRECTIVITY CONTROLLED ARRAYS C. Marro*, Y. Mahieux*, K. U. Simmer** *FRANCE TELECOM - CNET LAA/TSS/CMC Technopole Anticipa, 2 avenue Pierre Marzin 22307 Lannion Cedex - FRANCE **Houpert Digital Audio, Wiener Str 5, D-28359 Bremen, GERMANY e-mails: marro@lannion.cnet.fr - mahieux@lannion.cnet.fr - u.simmer@proaudio.de ABSTRACT: The use of optimal postfiltering has been previously proposed to increase the performance of microphone arrays. In this paper, an analysis of the postfilter shows that its behaviour is closely related to that of the array. This is illustrated by considering a typical videoconferencing context. The results we provide demonstrate that the use of a directivity controlled array is required to ensure sufficient robustness of the whole system. It is also shown that the dereverberation performed by the postfilter is limited and that its main interest lies in a significant reduction of the acoustic echo even in the double talk case. This attractive property depends on the whole design of the array including its placement relative to the acoustic echo sources.
### SS.4.7

A HANDS-FREE PHONE SYSTEM BASED ON A PARTITIONED FREQUENCY-DOMAIN ADAPTIVE ECHO CANCELER Pius Estermann and August Kaelin Swiss Federal Institute of Technology Zurich, Switzerland, esterman@isi.ee.ethz.ch Providing means for hands-free conversation is of great interest for industry and is still a current research topic. In this paper, a partitioned frequency-domain adaptive FIR filter is applied in a hands-free phone system to provide echo compensation. It is optimally designed in such a way that it approaches the tracking behavior of the Recursive Least-Squares (RLS) algorithm, and it is combined with a new adaptive step-size control in order to cope with varying far-end/local speaker situations. Its performance is demonstrated by means of real speech signals. Assuming a loudspeaker-room-microphone impulse response duration of 3500 taps, an increase in the critical gain of 14dB has been obtained (for each phone) by using an adaptive echo canceler with 1152 taps.
### SS.4.8

ECHOCOMPENSATION AND NOISE SUPPRESSION FOR SPEECH RECOGNITION APPLICATIONS Dr. Walter Stammler, Matthias Schulz, Frank Scheppach Daimler-Benz Aerospace AG, Sensor Systems Woerthstrasse 85, D-89077 Ulm, Germany phone: +49 731 3925631, fax: +49 731 3927144 e-mail: scheppf@vs-ulm.dasa.de ABSTRACT This contribution deals with the role and the performance of echocompensation and noise suppression when used in combination with speech recognition systems. For two applications of interest (speech control in a car or via telephone) there are quite significant differences from classical echocompensation and noise suppression for telephone conferences. It will be pointed out how the systems are structured, what performance can be achieved and what real-time solutions look like.
### SS.4.9

HANDSFREE SPEAKING FOR COMMUNICATION TERMINALS Hans J. Matt and Michael Walker ALCATEL TELECOM, Lorenzstr. 10, D-70435 Stuttgart, Germany Tel: +49-711-869-32246 and -32556; Fax: +39-711-869-32302 e-mail: hmatt@rcs.sel.de and mwalker@rcs.sel.de Abstract In this paper some considerations for the realisation of the most natural possible handsfree speaking are presented. Its essential features comprise full duplex operation, speech loudness well adapted to the user's environment, background noise suppression and cancellation of line echoes. Furthermore, its algorithms must be able to work properly even under severe weaknesses caused by low cost components, to allow the realisation of economic products.
### SS.5.1

Title: AN IMPROVED FULLY PARALLEL STOCHASTIC GRADIENT ALGORITHM FOR SUBSPACE TRACKING Authors: Jeroen Dehaene (*), Marc Moonen (+), Joos Vandewalle (+) Affiliation: (*) Harvard University, Pierce Hall, Cruft lab 311, 29 Oxford street, Cambridge MA 02138, U.S.A. email : jeroen@arcadia.harvard.edu (+) Katholieke Universiteit Leuven, E.E Dept. (ESAT), K. Mercierlaan 94, 3001 Leuven, Belgium email: marc.moonen@esat.kuleuven.ac.be Abstract: A new algorithm is presented for principal component analysis and subspace tracking, which improves upon classical stochastic gradient based algorithms (SGA) as well as several other related algorithms that have been presented in the literature. The new algorithm is based on and inherits its main properties from a continuous-time algorithm, closely related to the QR flow. It gives the same estimates as classical SGA algorithms but requires only O(N.k) operations per update instead of O(N.k.k), where N is the dimension of the input vector and k is the number of principal components to be estimated. A parallel version with O(k) parallelism (processors) and throughput O(1/N) is straightforwardly derived. A fully parallel version, with throughput independent of the problem size O(1), may be obtained at the expense of O(N.N) additional operations.
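As a point of reference for what a classical stochastic gradient update looks like, the sketch below tracks a single principal component (k = 1) with Oja's rule. This is a well-known baseline, not the improved algorithm of the paper; the step size, the synthetic two-dimensional data and the function name `oja_update` are all assumptions chosen for illustration.

```python
import math
import random


def oja_update(w, x, mu):
    """One stochastic gradient step toward the principal eigenvector (Oja's rule)."""
    y = sum(wi * xi for wi, xi in zip(w, x))      # projection of x on w
    return [wi + mu * y * (xi - y * wi) for wi, xi in zip(w, x)]


random.seed(1)
w = [0.6, 0.8]                                    # arbitrary unit-norm start
for _ in range(20000):
    # Hypothetical input: variance 1 along axis 0, variance 0.01 along axis 1,
    # so the principal direction is the first coordinate axis.
    x = [random.gauss(0.0, 1.0), random.gauss(0.0, 0.1)]
    w = oja_update(w, x, mu=0.01)

norm = math.hypot(w[0], w[1])
alignment = abs(w[0]) / norm                      # approaches 1 when w aligns with axis 0
```

Each update costs O(N) operations for one component, which is the k = 1 case of the O(N.k) per-update cost quoted in the abstract.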
### SS.5.2

A MINIMAL, GIVENS ROTATION BASED FRLS LATTICE ALGORITHM Francois Desbouvries and Phillip A. Regalia Departement Signal & Image Institut National des Telecommunications 9 rue Charles Fourier 91011 Evry cedex, France desbou@int-evry.fr, regalia@int-evry.fr Abstract: We propose a new Givens rotation based least-squares lattice algorithm. Based on spherical trigonometry principles, this algorithm turns out to be a normalized version of the fast QRD-based least-squares lattice filter, introduced independently by Ling and Proudler et al. In contrast to those algorithms, the storage requirements of the new algorithm are minimal (in the system theory sense). From this, we show that the new algorithm satisfies the backward consistency property, and hence enjoys stable error propagation.
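For readers unfamiliar with the building block, a Givens rotation is a plane rotation chosen to zero one element of a pair. The sketch below shows only that elementary operation, not the lattice algorithm itself; the helper names `givens` and `apply_givens` are hypothetical.

```python
import math


def givens(a, b):
    """Return (c, s, r) such that [[c, s], [-s, c]] applied to (a, b) gives (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0, 0.0      # nothing to rotate
    return a / r, b / r, r


def apply_givens(c, s, x, y):
    """Rotate the pair (x, y) by the plane rotation defined by (c, s)."""
    return c * x + s * y, -s * x + c * y


c, s, r = givens(3.0, 4.0)
rotated = apply_givens(c, s, 3.0, 4.0)   # second component is annihilated
```

Because c*c + s*s = 1, the rotation is orthogonal and preserves the norm of the pair, which is the source of the numerical robustness usually attributed to Givens-based schemes.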
### SS.5.3

A HIGHLY PARALLEL MULTICHANNEL FAST QRD-LS ADAPTIVE ALGORITHM Athanasios A. Rontogiannis and Sergios Theodoridis Department of Informatics Division of Communications and Signal Processing University of Athens GR-157 71 Zografou, GREECE e-mail:{tronto,stheodor}@di.uoa.gr A new fast multichannel QR decomposition (QRD) least squares (LS) adaptive algorithm is presented in this paper. The algorithm deals with the general case of channels with different number of delay elements and is based exclusively on numerically robust orthogonal Givens rotations. The new scheme processes each channel separately and as a result it comprises scalar operations only. Moreover, the proposed algorithm is implementable on a very regular systolic architecture and offers substantially reduced computational complexity compared to previously derived multichannel fast QRD schemes.
### SS.5.4

Increasing the Performance of the LMS algorithm using an Adaptive Preconditioner. I. K. Proudler, I.D. Skidmore, and J.G. McWhirter. Rm. E506, DRA, St. Andrews Road, Malvern, Worcestershire, WR14 3PS, UK. Tel. +44 1684 894228 Fax. +44 1684 896502 e-mail: proudler@signal.dra.hmg.gb In this paper we outline a technique for increasing the convergence rate of the LMS algorithm by means of a preconditioning filter which reduces the eigenvalue spread of the input signal. Specifically we use a low order linear prediction lattice filter followed by a tapped-delay-line as the preconditioner. Some computer simulations are provided to demonstrate the increased convergence rate of the new algorithm. (c) British Crown Copyright 1996 / DERA.
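For context, the plain LMS update that such a preconditioner would feed can be sketched as follows. This is a minimal, noise-free system identification example under assumed values: the two-tap system `h_true`, the step size and the white input are hypothetical, and the lattice preconditioner described in the abstract is not reproduced.

```python
import random


def lms_identify(x, d, num_taps, mu):
    """Adapt FIR weights w so that the filter output tracks d (plain LMS)."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps                 # tapped delay line, newest sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y                         # a priori estimation error
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
    return w


random.seed(0)
h_true = [0.5, -0.3]                       # hypothetical unknown system
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = [h_true[0] * x[n] + (h_true[1] * x[n - 1] if n > 0 else 0.0)
     for n in range(len(x))]
w = lms_identify(x, d, num_taps=2, mu=0.1)
```

With a white input, as here, the eigenvalue spread of the input correlation matrix is small and the weights converge quickly; a strongly coloured input would slow the same update down, which is the situation the preconditioning filter is meant to address.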
### SS.5.5

Stabilizing the LFTF algorithm by leakage control Bernhard Nitsch and Stephan Binde Institut fuer Netzwerk- und Signaltheorie Fachgebiet Theorie der Signale Merkstrasse 25 D-64283 Darmstadt Germany To stabilize the FTF algorithm the accumulation of numerical errors can be prevented by introducing a leakage factor in the equation system. In state space description the leakage factor causes a reduction of the eigenvalues of the linearized error system matrix. By an appropriate choice of the leakage factor the eigenvalues can be transformed into the unit circle of the z-plane resulting in a stable round-off error system. The structure of the linearized error system matrix shall be analysed and by comparing the Leakage FTF algorithm (LFTF) with the Stabilized FTF algorithm (SFTF) and the NLMS algorithm in a real-time environment the success of this method is shown.
### SS.5.6

PAST INPUT RECONSTRUCTION IN BACKWARD CONSISTENT FAST LEAST-SQUARES ALGORITHMS Phillip A. Regalia Departement Signal & Image Institut National des Telecommunications 9, rue Charles Fourier F-91011 Evry cedex France e-mail: regalia@galaxie.int-evry.fr Abstract: We present an analytic solution to the past input reconstruction problem, which consists in describing all past input sequences which would give rise to a given set of variables in fast least-squares algorithms, whenever the variables in question are reachable.
### SS.5.7

ASYMPTOTIC ANALYSIS OF THE UNDERDETERMINED RECURSIVE LEAST-SQUARES ALGORITHM Authors: B. Baykal, O. Tanrikulu and A. G. Constantinides Affiliation: Signal Processing and Digital Systems Section Dept. of EE. Eng., Imperial College of Sci., Tech. and Med., London SW7 2BT, UK, Email: b.baykal@ic.ac.uk Abstract: The asymptotic analysis of the Underdetermined Recursive Least-Squares (URLS) algorithm is performed. In particular, the behaviour of the weight-error correlation matrix is investigated and the misadjustment is calculated. For highly correlated input signals the misadjustment is shown to be inversely proportional to the minimum eigenvalue of the underdetermined order autocorrelation matrix. Simulations are included to justify the conclusions.
### SS.5.8

ROBUSTNESS AND CONVERGENCE OF ADAPTIVE SCHEMES IN BLIND EQUALIZATION AND NEURAL NETWORK TRAINING Ali H. Sayed and Markus Rupp Ali H. Sayed, Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106-9560, sayed@ece.ucsb.edu Markus Rupp, Wireless Technology Research Dept., Lucent Technology, 791 Holmdel-Keyport Rd., Holmdel NJ 07733-0400, rupp@lucent.com We pursue a time-domain feedback analysis of adaptive schemes with nonlinear update relations. We consider commonly used algorithms in blind equalization and neural network training and study their performance in a purely deterministic framework. The derivation employs insights from system theory and feedback analysis, and it clarifies the combined effects of the step-size parameters and the nature of the nonlinear functionals on the convergence and robustness performance of the adaptive schemes.
### SS.5.9

MULTI-CHANNEL ADAPTIVE FILTERING APPLIED TO MULTI-CHANNEL ACOUSTIC ECHO CANCELLATION Jacob Benesty (1), Pierre Duhamel (2), Yves Grenier (2) (1) Lucent Technologies, Bell Labs Innovations, New Jersey, USA, jb@research.att.com (2) ENST, Dept. Signal, 46 rue Barrault, 75634 Paris Cedex 13, France duhamel@sig.enst.fr, grenier@sig.enst.fr This paper presents some new ways of deriving multi-channel (M-C) adaptive algorithms in the context of M-C acoustic echo cancellation (AEC). We first discuss the M-C identification problem which occurs in such systems by distinguishing between the ideal case, where the adaptive filters have exactly the length of the impulse responses of the distant room, and the real case. These properties also explain some problems encountered with the classical M-C least mean squares (LMS) algorithm: straightforward generalizations of the LMS algorithm and the affine projection algorithm (APA) to the M-C case are easily obtained. However, the resulting algorithms do not take into account the cross-correlation between the input signals (as the M-C RLS algorithm does), hence they converge very slowly. Based on an original writing of the M-C recursive least squares (RLS) algorithm, we obtain useful properties that are used to overcome this problem, and we derive algorithms that are efficient in terms of convergence rate.
### SS.5.10

A NEW FREQUENCY DOMAIN EQUALIZER FOR CHANNELS WITH LONG IMPULSE RESPONSE Kostas Berberidis (*) and Jacques Palicot (#) (*) Computer Technology Institute (C.T.I.) P.O. Box 1122 26110 Patras, GREECE E-mail: berberid@cti.gr (#) C.C.E.T.T., SRA/DCS 4 rue du Clos Courtel 35512 Cesson Sevigne Cedex, FRANCE E-mail: palicot@ccett.fr ABSTRACT: In this paper a recently introduced block Decision Feedback Equalizer (DFE) is further studied and developed. It is shown that the new technique is particularly suitable for channel equalization in applications involving channels with medium to long impulse responses. The new equalizer, which is implemented entirely in the frequency domain, offers remarkable savings in computational complexity compared to the conventional time domain DFE. Moreover, the new technique results in a Symbol Error Rate which is always lower (or much lower) than that of the existing frequency domain linear equalization techniques.
### SS.6.1

NONLINEAR FUZZY FILTERS: AN OVERVIEW Fabrizio Russo D.E.E.I. - University of Trieste, Via A. Valerio 10, Trieste I-34127, Italy Tel.: +39-40-6763015, FAX : +39-40-6763460, E-mail: rusfab@univ.trieste.it Emergent techniques based on Fuzzy Logic have successfully entered the area of nonlinear filters. Indeed, a variety of methods have been recently proposed in the literature which are able to perform detail-preserving smoothing of noisy image data yielding better results than classical operators. The aim of this paper is to present a selection of the most significant contributions in this field focussing on their similarities and differences. A brief introduction to the theory of fuzzy sets and systems is presented in order to make these results available to non-fuzzy researchers too.
### SS.6.2

DATA-DEPENDENT FILTERING BASED ON IF-THEN RULES AND ELSE RULE Akira Taguchi and Tomoaki Kimura Department of Electrical and Electronic Engineering Musashi Institute of Technology Setagaya-ku, Tokyo 158, Japan Tel: +81 3 3703 3111; Fax: +81 3 5707 2174 e-mail: taguchi@ipc.musashi-tech.ac.jp ABSTRACT We have previously proposed fuzzy filters based on local characteristics, in order to remove additive noise while preserving signal edges. These fuzzy filters were constructed from IF-THEN rules only. This paper presents a novel fuzzy filter which is constructed not only from IF-THEN rules but also from an ELSE rule. Many IF-THEN rules which have the same consequent can be integrated into one ELSE rule. As a result, introducing the ELSE rule makes it possible to increase the number of local characteristics used by the fuzzy filter without increasing the number of IF-THEN rules.
### SS.6.3

FUZZY CELL HOUGH TRANSFORM Vassilios Chatzis and Ioannis Pitas Department of Informatics University of Thessaloniki, 54006 Thessaloniki, GREECE Tel, fax: +30-31-996304 e-mail: pitas@zeus.csd.auth.gr In this paper a new variation of the Hough Transform is proposed. It can be used to detect shapes or curves in an image with better accuracy, especially in noisy images. It is based on a fuzzy split of the Hough Transform parameter space. The parameter space is split into fuzzy cells which are defined as fuzzy numbers. This fuzzy split of the parameter space makes it possible to exploit the uncertainty of the contour point locations, which increases when noisy images have to be used. The computation time is only slightly increased by this method in comparison with the classical Hough Transform.
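The idea of fuzzy cells can be sketched for straight-line detection as follows. This is a strong simplification and not the authors' formulation: it assumes the (theta, rho) line parameterization, makes only the rho axis fuzzy, and uses a triangular membership function so that each vote is shared between the two nearest rho cells. The function names and grid sizes are hypothetical.

```python
import math


def hough_accumulate(points, n_theta=180, rho_max=20, fuzzy=False):
    """Vote in a (theta, rho) accumulator for lines rho = x*cos(t) + y*sin(t).

    Rho cells are centred on the integers -rho_max..rho_max, so callers must
    keep |rho| below rho_max. With fuzzy=False each point casts one full vote
    in the nearest cell. With fuzzy=True the vote is split between the two
    neighbouring cells using a triangular membership function (an assumed,
    simplified stand-in for the fuzzy numbers mentioned in the abstract).
    """
    acc = [[0.0] * (2 * rho_max + 1) for _ in range(n_theta)]
    for x, y in points:
        for ti in range(n_theta):
            theta = math.pi * ti / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            if fuzzy:
                lower = math.floor(rho)
                for centre in (lower, lower + 1):
                    weight = 1.0 - abs(rho - centre)
                    if -rho_max <= centre <= rho_max:
                        acc[ti][centre + rho_max] += weight
            else:
                acc[ti][round(rho) + rho_max] += 1.0
    return acc


def peak(acc, rho_max=20):
    """Return (theta index, rho) of the accumulator maximum."""
    _, ti, ri = max(
        (value, ti, ri)
        for ti, row in enumerate(acc)
        for ri, value in enumerate(row)
    )
    return ti, ri - rho_max


if __name__ == "__main__":
    # Ten points on the horizontal line y = 5 (theta = 90 degrees, rho = 5).
    line = [(x, 5) for x in range(10)]
    crisp = hough_accumulate(line, fuzzy=False)
    soft = hough_accumulate(line, fuzzy=True)
    print("crisp votes at 89 and 90 degrees:", crisp[89][25], crisp[90][25])
    print("fuzzy peak:", peak(soft))
```

In this toy case the crisp accumulator gives the same count to neighbouring angles, because rounding hides small deviations in rho, whereas the fuzzy weights penalize those deviations and leave a single maximum.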
### SS.6.4

FUZZY CENTER WEIGHTED MEDIAN FILTERS Akira Taguchi and Nobunori Izawa Department of Electrical and Electronic Engineering Musashi Institute of Technology Setagaya-ku, Tokyo 158, Japan Tel: +81 3 3703 3111; Fax: +81 3 5707 2174 e-mail: taguchi@ipc.musashi-tech.ac.jp ABSTRACT Stack filters are a class of nonlinear filters, first introduced by Wendt et al. Stack filters perform well in many situations where linear filters fail. Stack filters include rank order filters, morphological filters and weighted median filters. A stack filter is defined by a Boolean function. The output of a Boolean function is restricted to two values (i.e., "0" or "1"). Intuitively, one would expect better performance from stack filters if the output of the Boolean function were allowed to vary continuously from 0 to 1. We call such functions fuzzy Boolean functions. In this paper we discuss fuzzy center weighted median (FCWM) filters, which are among the simplest fuzzy stack filters. Two design methods are presented.
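As background for the statement that a stack filter is defined by a Boolean function, the sketch below implements the classical (crisp) center weighted median filter in two ways: directly, by sorting a window in which the center sample is repeated, and as a stack filter, by threshold decomposition followed by a Boolean threshold function on each binary signal. It does not implement the fuzzy extension of the abstract. The function names, the edge-replication padding and the integer signal range are assumptions made for the illustration.

```python
def _weights(half_width, center_weight):
    """Window weights: 1 everywhere except the center sample."""
    if center_weight % 2 == 0:
        raise ValueError("center_weight must be odd so the total weight is odd")
    return [1] * half_width + [center_weight] + [1] * half_width


def _pad(x, half_width):
    """Replicate the edge samples (an assumed boundary rule)."""
    return [x[0]] * half_width + list(x) + [x[-1]] * half_width


def cwm_direct(x, half_width, center_weight):
    """Center weighted median by repeating samples and sorting."""
    weights = _weights(half_width, center_weight)
    padded = _pad(x, half_width)
    size = 2 * half_width + 1
    out = []
    for i in range(len(x)):
        expanded = []
        for value, weight in zip(padded[i:i + size], weights):
            expanded.extend([value] * weight)
        expanded.sort()
        out.append(expanded[len(expanded) // 2])
    return out


def cwm_stack(x, half_width, center_weight, levels):
    """Same filter as a stack filter, for integer samples in 0..levels-1.

    Each threshold t yields a binary signal. The Boolean function outputs 1
    when the weighted count of ones reaches the majority, and the binary
    outputs are summed over all thresholds.
    """
    weights = _weights(half_width, center_weight)
    majority = (sum(weights) + 1) // 2
    padded = _pad(x, half_width)
    size = 2 * half_width + 1
    out = [0] * len(x)
    for t in range(1, levels):
        binary = [1 if value >= t else 0 for value in padded]
        for i in range(len(x)):
            count = sum(w * b for w, b in zip(weights, binary[i:i + size]))
            if count >= majority:
                out[i] += 1
    return out


if __name__ == "__main__":
    signal = [0, 3, 3, 7, 0, 2, 2, 2, 5, 1]
    print(cwm_direct(signal, 2, 3))
    print(cwm_stack(signal, 2, 3, levels=8))
```

The two functions agree because the weighted median commutes with thresholding (the stacking property). A fuzzy variant in the spirit of the abstract would replace the hard majority test with a function taking values between 0 and 1, but its exact form is not given here.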
### SS.6.5

A FUZZY EXPERT SYSTEM FOR LOW LEVEL IMAGE SEGMENTATION Mauro Barni*, Sandro Rossi*, Alessandro Mecocci** *Department of Electronic Engineering, University of Florence Via di Santa Marta, 3 - 50139 Firenze, ITALY **Department of Electronic Engineering, University of Siena Via Roma, 56 - 53100 Siena, ITALY e-mail: barni@cosimo.ing.unifi.it Abstract. In this paper a general purpose fuzzy expert system is presented for low level image segmentation. By means of approximate reasoning based on fuzzy logic, the criticality of the choice of the several thresholds and parameters which usually must be tuned to make the expert system work properly is reduced. More specifically, it is shown that, for a constant number of rules in the expert system, the fuzzy approach makes it possible to build a more general system, capable of giving satisfactory results for a large number of images stemming from different applications. The validity of the approach is demonstrated by comparing the effectiveness of a classical expert system with that of its corresponding fuzzy version. The results show the superiority of the fuzzy system in terms of robustness and generality.
### SS.6.6

INTEGRATION OF LINGUISTIC KNOWLEDGE FOR COLOUR IMAGE SEGMENTATION T. CARRON, P. LAMBERT Laboratoire d'Automatique et de MicroInformatique Industrielle LAMII/CESALP - Universite de Savoie - B.P 806 - F.74016 Annecy Cedex (France) (CNRS G1047 - Information-Signal-ImageS) e-mail: carron@univ-savoie.fr - lambert@univ-savoie.fr The Hue, Chroma, Intensity (HCI) space is well suited to colour image segmentation. In this paper, we use fuzzy logic to integrate specific knowledge of the Hue component. Based upon several linguistic rules which build a symbolic cooperation between Hue and Intensity according to Chroma, a region growing segmentation with fuzzy aggregation is proposed. This fuzzy segmentation is compared with a technique using a Fuzzy C-Means algorithm in different colour spaces.
### SS.6.7

FUSION OF DATA FROM FUZZY INTEGRAL-BASED ACTIVE AND PASSIVE COLOUR STEREO VISION SYSTEMS FOR CORRESPONDENCE IDENTIFICATION Alois Knoll, Ralf Schroeder, and Andre Wolfram University of Bielefeld, Faculty of Technology, Department of Computer Science, Postfach 10 01 31, D-33501 Bielefeld, Germany e-mail: {knoll,andre}@techfak.uni-bielefeld.de As shown in our previous work, an approach using the fuzzy-integral [3] can be applied to solving the correspondence problem of active colour stereo vision systems [2]. Evaluating the similarity measure derived in [2] enables the identification of a correct match or otherwise indicates at least several possible matches. To reduce the remaining ambiguity further, the novel approach presented here uses data fusion techniques to make use of additional fuzzy feature-based information gathered by passive colour stereo procedures. Our experimental results, which are discussed in the paper, indicate that this new approach is considerably more effective than the approach using only intensity-based information for determining the similarity of line blocks in colour stereo images. We conclude the paper with a discussion of the potential of the method and directions of possible future research.
### SS.6.8

FUZZY CLUSTERING OF DIGITAL IMAGES BY EXPLOITING DENSITOMETRIC AND TOPOLOGICAL INFORMATION M. Mari, C. Garcia and S. Dellepiane Department of Biophysical and Electronic Engineering (DIBE) University of Genoa via Opera Pia, 11a, 16145 Genova, Italy Tel. +39 10 3532754; fax: +39 10 3532134 e-mail: silvana@dibe.unige.it ABSTRACT Topological features are very seldom exploited in image processing, partly because of the complexity of their extraction. Even when topological features are used, densitometric information is usually not considered at the same time. The simultaneous exploitation of such features, as proposed in the paper, allows a more appropriate automatic processing of digital images. A novel image segmentation approach based on fuzzy clustering is presented that exploits topological and densitometric image features. The novelty of this approach lies mainly in its use of simple and fast computation methods, which improves the handling of any digital image whenever automatic segmentation or data reduction is required.
Paper