Workshop on Automatic Speech Annotation and Analysis

19 – 23 October 2015


10:00 – 16:00


Room R903 in the Shirley Chan Building (Block R) at PolyU


Free of charge



PROCORE – France/Hong Kong Joint Research Scheme

VARIAMU - Variations in Action: A Multilingual Approach

Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University


Linguists in every branch of the field are increasingly expected to take large quantities of empirical data into account, often including hours of recorded speech, when formulating linguistic theories. The biggest obstacle they face today is not the availability of data but its annotation and analysis. Although a number of tools have been developed in recent years to automate these labor-intensive tasks, most of them require a level of expertise in computer science that puts them beyond the reach of most linguists working without the help of an engineer. The aim of this four-day workshop is to introduce linguists to the principles and tools available for automatic speech annotation and analysis. It will provide hands-on practice in installing and using the following tools, which are designed for linguists doing speech analysis:

SPPAS – Automatic Annotation of Speech: an annotation tool that allows users to automatically create, visualize, and search annotations of audio data. Among other things, it can produce automatic annotations at several linguistic levels, including utterance, word, syllable, and phoneme, from a speech recording and its transcription. SPPAS is multilingual and is currently implemented for French, English, Italian, Spanish, Catalan, Polish, Mandarin, Japanese, Taiwan Southern Min and Cantonese. It is multi-platform (Linux, Windows, and MacOS) and is open-source software issued under the GNU General Public License. In addition to phonetic segmentation and alignment, word segmentation for spoken Cantonese will also be introduced.

Momel – an algorithm for the automatic factoring of fundamental frequency contours into two components: a macromelodic component and a micromelodic component.

INTSINT – a prosodic equivalent of the International Phonetic Alphabet. Originally designed as a descriptive tool for linguistic annotation, INTSINT has since been implemented as an algorithm converting the output of the Momel algorithm to a sequence of discrete tonal symbols which can then be used as input to synthesise a fundamental frequency contour.

ProZed – a tool for testing prosodic models of rhythm and melody, using an analysis-by-synthesis paradigm to derive a synthetic output from an abstract representation of the prosody.

(Listed in alphabetical order)

Dr. Brigitte Bigi
Researcher, Laboratoire Parole et Langage, CNRS & Aix-Marseille Université, FRANCE. 

Prof. Daniel Hirst
Directeur de Recherches Emeritus, Laboratoire Parole et Langage, CNRS & Aix-Marseille Université, FRANCE.

Prof. Tan Lee
Associate Professor, Director of the Digital Signal Processing and Speech Technology Laboratory,
Department of Electronic Engineering, The Chinese University of Hong Kong.



10:00 - 12:00

19 Oct (Mon)
Lecture and Hands-on: Methodology on Speech Corpus Creation
Dr. Brigitte Bigi
  • Basic concepts of corpus creation
  • Methodology workflow
  • Introducing SPPAS

20 Oct (Tue)
Lecture and Hands-on: Running Analysis with
Dr. Brigitte Bigi
  • How to run statistical analysis with SPPAS
  • How to filter data with SPPAS

22 Oct (Thur)
The Phonetic Annotation of Speech Melody
Prof. Daniel Hirst
  • A survey of different systems for the automatic annotation of speech melody
  • Problems of obtaining a reliable F0 curve for the utterances analyzed
  • Problems of separating out micromelodic effects from macromelodic effects
  • Choosing and applying an appropriate model for the macromelodic pattern

23 Oct (Fri)
The Phonological Annotation of Speech Melody
Prof. Daniel Hirst
  • Different levels of analysis and representation of speech prosody
  • How those levels relate to current models of prosody such as ToBI
  • A description of a general multi-level and multilingual framework for prosodic annotation

12:00 - 14:00


14:00 - 16:00

19 Oct (Mon)
Lecture and Hands-on: Automatic Annotation with
Dr. Brigitte Bigi
  • Installing SPPAS
  • Using SPPAS

20 Oct (Tue)
Acoustic Analysis of Cantonese Speech
Prof. Tan Lee
  • The principles and application examples of acoustic analysis of Cantonese speech

22 Oct (Thur)
Momel and ProZed
Prof. Daniel Hirst
  • How to display the prosody of an utterance on a speaker-independent scale

23 Oct (Fri)
Prof. Daniel Hirst
  • How to describe intonation patterns in a language-independent way


Please send your full name, email address, institution, and the programme you are studying at your institution
to the following email address:

Because places are limited, registration will be processed on a first-come, first-served basis.


Please contact Dr. Roxana Fung (