The proposed methodology is designed for the human analyst (mostly researchers in Linguistics).
Therefore, we assume that the methodology is general enough to be useful for a broad class of research applications.
Different analytical domains - e.g. speech and gesture - and theoretical perspectives require a rigorous organization of the annotation procedure.
A very large number of dimensions have been annotated in the past on mono and multimodal corpora. To quote only a few, some frequent speech or language based annotations are speech transcript, segmentation into words, utterances, turns, or topical episodes, labeling of dialogue acts, and summaries; among video-based ones are gesture, posture, facial expression [...].
(Popescu-Belis, 2010)
In this tutorial, we will report on:
Garbage in, Garbage out.
The capture of multimodal corpora requires complex settings such as instrumented lecture and meetings rooms, containing capture devices for each of the modalities that are intended to be recorded, but also, most challengingly, requiring hardware and software for digitizing and synchronizing the acquired signals.
(Popescu-Belis, 2010)
Speech may be annotated for:
An EOT must include, at least:
The automatic systems must be adapted to deal with EOT.
A problem divided into 3 sub-tasks:
Tokenization is also known as "Text Normalization".
The main steps in SPPAS are:
mp | this | is | hum | an | enriched | transcription | number | one | |
this | is | + | hum | an | enrich | transcription | number | one | |
(Bigi 2011)
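The two token rows above (standard tokens, then faked tokens) can be illustrated with a minimal Python sketch. It is not SPPAS code. The input sentence, the markup conventions (`+` for a short pause, `word(end)` for an elision, `{...}` for a comment) and the toy number table are all assumptions, chosen only so that the output reproduces the two rows.

```python
import re

# Assumed conventions for this sketch only (not taken from the SPPAS source):
#   "+"          short pause marker
#   "word(end)"  elision: "end" is written but was not pronounced
#   "{...}"      transcriber comment
NUMBERS = {"1": "one", "2": "two", "3": "three"}  # toy number table

def normalize(transcription):
    """Return (standard_tokens, faked_tokens) for an enriched transcription."""
    text = re.sub(r"\{[^}]*\}", " ", transcription)  # drop comments
    text = re.sub(r"[.,;:!?]", " ", text.lower())    # drop punctuation
    standard, faked = [], []
    for token in text.split():
        token = NUMBERS.get(token, token)            # spell out known digits
        if token == "+":
            faked.append(token)                      # pause kept only in the faked form
            continue
        standard.append(re.sub(r"[()]", "", token))      # keep the elided letters
        faked.append(re.sub(r"\([^)]*\)", "", token))    # remove the elided letters
    return standard, faked

std, fak = normalize("This is + hum... an enrich(ed) transcription {loud} number 1!")
print(" ".join(std))  # this is hum an enriched transcription number one
print(" ".join(fak))  # this is + hum an enrich transcription number one
```

The standard form is closer to the written text, while the faked form stays closer to what was actually said.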
Phonetization is also known as grapheme-phoneme conversion.
Converting from written text into actual sounds, for any language, causes several problems that have their origins in the relative lack of correspondence between the spelling of the lexical items and their sound contents.
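One common way to handle the mismatch between spelling and sound is to look words up in a pronunciation dictionary. The sketch below illustrates that general idea and is not SPPAS code. The dictionary entries, the SAMPA-like symbols, the `-` and `|` separators and the `<UNK>` marker are all assumptions made for illustration.

```python
# Toy pronunciation dictionary (hypothetical entries, SAMPA-like symbols).
# Each word maps to one or more pronunciation variants.
PRON_DICT = {
    "this": ["D I s"],
    "is": ["I z"],
    "the": ["D @", "D i:"],
}
UNKNOWN = "<UNK>"  # assumed marker for out-of-dictionary words

def phonetize(tokens, pron_dict=PRON_DICT):
    """Look up each token and keep every variant, so a later step can choose."""
    result = []
    for token in tokens:
        variants = pron_dict.get(token.lower(), [UNKNOWN])
        # Phonemes joined by "-", variants joined by "|"
        result.append("|".join(v.replace(" ", "-") for v in variants))
    return result

print(phonetize(["this", "is", "the", "corpus"]))
# ['D-I-s', 'I-z', 'D-@|D-i:', '<UNK>']
```

Keeping all variants defers the choice of pronunciation to the alignment step, where the audio signal is available.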
SPPAS implements: (Bigi 2013)
In (Bigi et al. 2012), we compared 3 types of OT:
Manual alignment has been reported to take between 11 and 30 seconds per phoneme.
(Leung and Zue, 1984)
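To give a sense of scale, the reported range can be turned into a rough time estimate. The corpus size of 10,000 phonemes is a hypothetical figure chosen for illustration. Only the bounds of 11 and 30 seconds per phoneme come from the text above.

```python
def manual_alignment_hours(n_phonemes, sec_per_phoneme=(11, 30)):
    """Return (low, high) estimates, in hours, for aligning n_phonemes by hand."""
    low, high = sec_per_phoneme
    return n_phonemes * low / 3600, n_phonemes * high / 3600

# Hypothetical corpus of 10,000 phonemes
low, high = manual_alignment_hours(10_000)
print(f"{low:.1f} to {high:.1f} hours")  # 30.6 to 83.3 hours
```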
SPPAS (Python+Julius), available for English, French, Italian, Spanish, Catalan, Polish, Japanese, Mandarin Chinese, Taiwanese, Cantonese
(Bigi et al. 2010)
(Bigi et al. 2014)
(Hirst and Espesser, 1993)
(Tellier 2014)