Software

Software selection

To decide about usefulness and usability, it is necessary to know:

  1. about the license,
  2. about the ease of use,
  3. about the strengths/weaknesses for specific annotation purposes,
  4. about the type of data or analysis it is designed for,
  5. about its compatibility with other annotated data.


Finding and evaluating appropriate software (1)

1. About the license

Even if you can personally afford to pay for a software license, you may wish to share your methodology with other students or researchers who cannot afford to buy one.

Finding and evaluating appropriate software (2)

2. About the ease of use

If the software requires the help of an engineer each time you need to use it, this will seriously limit how much you can use it.

Finding and evaluating appropriate software (3)

3. About the strengths/weaknesses for specific annotation purposes

Finding and evaluating appropriate software (4)

4. About the type of data or analysis the tool/software is designed for

When annotating corpora at multiple linguistic levels, annotators may use different expert tools for different phenomena or types of annotation. These tools employ different data models and accompanying approaches to visualization, and they produce different output formats. (Chiarcos et al. 2008)

Finding and evaluating appropriate software (5)

5. About its compatibility with other annotated data

Brief overview of 3 software tools

Requirements:

  1. free and open-source (GPL),
  2. multi-platform, easy to use (GUI), with a tutorial and/or documentation,
  3. well-known in their communities, with publications and evaluations.

Praat: design & compatibility

  1. Type of data or analysis:
    • Manually annotating sound files
    • Visualizations of audio data: waveform or spectrogram, pitch contour, ...
    • Annotations on multiple layers, called tiers
    • Many plugins for different kinds of analysis
  2. Compatibility:
    • Annotation files are in several Praat-specific text formats, such as Praat-TextGrid
    • Interoperability: none!
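Because TextGrid is a plain-text format, annotations on tiers can be read outside Praat with a short script. The following is a minimal sketch, assuming the long (full) text variant of TextGrid and interval tiers only; the sample content and the function name read_interval_tiers are hypothetical and are not part of Praat.

```python
import re

# Hypothetical sample in Praat's long TextGrid text format (one interval tier).
SAMPLE_TEXTGRID = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1.5
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 1.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.7
            text = "hello"
        intervals [2]:
            xmin = 0.7
            xmax = 1.5
            text = "world"
'''

# Header of an interval tier: captures the tier name.
TIER_RE = re.compile(
    r'item \[\d+\]:\s*class = "IntervalTier"\s*name = "([^"]*)"'
)
# One interval: start time, end time, label (Praat doubles embedded quotes).
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([\d.]+)\s*'
    r'xmax = ([\d.]+)\s*'
    r'text = "((?:[^"]|"")*)"'
)


def read_interval_tiers(content):
    """Return {tier_name: [(start, end, label), ...]} for interval tiers."""
    tiers = {}
    headers = list(TIER_RE.finditer(content))
    for i, header in enumerate(headers):
        # A tier's intervals run from its header to the next tier header.
        stop = headers[i + 1].start() if i + 1 < len(headers) else len(content)
        chunk = content[header.end():stop]
        tiers[header.group(1)] = [
            (float(start), float(end), label.replace('""', '"'))
            for start, end, label in INTERVAL_RE.findall(chunk)
        ]
    return tiers


if __name__ == "__main__":
    for name, intervals in read_interval_tiers(SAMPLE_TEXTGRID).items():
        print(name, intervals)
```

This sketch ignores point tiers, the short TextGrid variant, and negative or exponent-style time values; a dedicated library would be a safer choice for real corpora.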

Praat: screenshot

http://www.praat.org

Elan: design & compatibility

  1. Type of data or analysis:
    • Creation of complex annotations in video (and audio) resources
    • Annotations can be created on multiple layers, that can be hierarchically interconnected and can correspond to different levels of linguistic analysis
  2. Compatibility:
    • Annotation files are in a specific XML format
    • Import from/export to a variety of other formats, including Praat-TextGrid
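Since Elan stores annotations as XML, its files can be inspected with any XML parser. The following is a minimal sketch using Python's standard library, assuming the usual EAF layout in which time slots hold millisecond values and time-aligned annotations refer to them by ID; the sample document and the function name read_aligned_tiers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced EAF document (header and linguistic types omitted).
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="700"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="words">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
                            TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>world</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_aligned_tiers(eaf_text):
    """Return {tier_id: [(start_s, end_s, value), ...]} for time-aligned annotations."""
    root = ET.fromstring(eaf_text)
    # Map each time slot ID to its value in milliseconds.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    tiers = {}
    for tier in root.iter("TIER"):
        rows = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            if start is None or end is None:
                continue  # skip annotations whose slots carry no time value
            value = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((start / 1000.0, end / 1000.0, value))
        tiers[tier.get("TIER_ID")] = rows
    return tiers


if __name__ == "__main__":
    print(read_aligned_tiers(SAMPLE_EAF))
```

Annotations on hierarchically dependent tiers that refer to a parent annotation rather than to time slots are not resolved here, so such tiers would come back empty in this sketch.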

Elan: screenshot

https://tla.mpi.nl/tools/tla-tools/elan/

SPPAS: design & compatibility

  1. Type of data or analysis:
    • Create, visualize and search annotations for audio data
    • Automatic speech segmentation from a recorded speech sound and its transcription
    • Plugins, CLI, scripting
  2. Compatibility:
    • Annotation files are in a specific XML format
    • Import from/export to a variety of other formats, including: Praat (TextGrid, PitchTier, IntensityTier), Elan (eaf), Annotation Pro (antx), Phonedit (mrk), HTK (lab, mlf), Sclite (ctm, stm), subtitles (sub, srt), Transcriber (trs, import only), Anvil (anvil, import only), CSV
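To illustrate what exporting to another format involves, the following minimal sketch turns a list of time-aligned intervals into SubRip (srt) subtitle text. It does not use the SPPAS API; the function names and the (start, end, label) tuple layout are assumptions made for this example only.

```python
def to_srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm, the SubRip timestamp layout."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def intervals_to_srt(intervals):
    """Convert [(start_s, end_s, label), ...] into SubRip text.

    Intervals with an empty label (for example silences) are skipped,
    and the remaining ones are numbered from 1.
    """
    blocks = []
    index = 0
    for start, end, label in intervals:
        if not label.strip():
            continue
        index += 1
        blocks.append(
            f"{index}\n"
            f"{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n"
            f"{label}\n"
        )
    # Subtitle blocks are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    tier = [(0.0, 0.7, "hello"), (0.7, 1.5, ""), (1.5, 2.0, "world")]
    print(intervals_to_srt(tier))
```

A conversion like this loses information: a subtitle file holds a single flat sequence of labels, so tier names and multiple layers cannot be represented in it.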

SPPAS: dedicated to automatic annotations

SPPAS: screenshot

http://sldr.org/sldr000800/preview/

Summary

 
  1. Introduction
  2. Software
  3. An annotation workflow
  4. Exploring
  5. Sharing
  6. References