Canal "cmr-getalp_org-koma" du corpus de français tchaté

This page: https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-koma-tei-v1
Back to corpus: https://hdl.handle.net/11403/comere/cmr-getalp_org

How to cite this resource

Falaise, A. (2014). Corpus de français tchaté getalp_org. In Chanier T. (ed) Banque de corpus CoMeRe Ortolang/CoMeRe. [https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-koma-tei-v1]

This form has been automatically extracted from the TEI file. For the full contents, see https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-koma-tei-v1.xml.

Overview of the corpus

This sub-corpus corresponds to an individual channel within cmr-getalp_org, a textchat corpus in French from the EpikNet Internet Relay Chat (IRC) network. The corpus was collected in 2004 and automatically encoded (Falaise 2005). It includes 4 million messages from 105 channels that are heterogeneous in their thematic and pragmatic nature. Topics vary from channel to channel and range from general chat, talking about everything and nothing, to specialised chat where, for example, programming problems or current affairs are discussed. Channels also differ on a pragmatic level: certain channels are dedicated to games (hangman, quizzes), whilst others include press releases from the Agence France Presse (AFP, the French press agency) or are dedicated to technical discussions that take a question-answer form, such as the channel dedicated to programming questions. The initial corpus was converted into TEI within the framework of the CoMeRe (Communication médiée par les réseaux) project. This project aims to assemble different network-mediated communication corpora in French (Internet, telecommunication), to structure them in a standard format and to release the corpora in an open access format for research purposes. The CoMeRe project has received support from ORTOLANG and the national consortium Corpus-écrits.

Keywords: applied_linguistics ; discourse_analysis ; text_and_corpus_linguistics ; primary_text ; dialogue ; Communication Médiée par les Réseaux ; CoMeRe ; clavardage ; Computer Mediated Communication ; CMC ; textchat ; IRC

References

Falaise, A. (2005). Constitution d'un corpus de français tchaté. Actes de RECITAL 2005, Dourdan. oai:hal.archives-ouvertes.fr:hal-00909667


Rationale for this corpus

This corpus is a subpart of the CoMeRe corpus databank

The CoMeRe (Communication Médiée par les Réseaux) project aims to build a kernel corpus assembling existing corpora of different CMC (Computer-Mediated Communication) genres and new corpora built on data extracted from the Internet. These heterogeneous corpora will be structured and processed in a uniform way and complemented with metadata. CoMeRe will be released as OpenData through the national infrastructure Ortolang, following constraints which will be reused for the forthcoming “Corpus de Référence du Français”. The project is supported by the national consortium Corpus-écrits, a sub-part of Huma-Num, and by Ortolang (the French counterpart of DARIAH).

The TEI structure used is an extension of TEI for CMC genres. This extension is developed by a European project whose participants are: Michael Beißwenger (DE), Thierry Chanier (FR), Isabella Chiari (IT), Maria Ermakova (DE), Maarten van Gompel (NL), Iris Hendrickx (NL), Axel Herold (DE), Henk van den Heuvel (NL), Lothar Lemnitzer (DE), Angelika Storrer (DE).


Description of the Interaction Space

CMC Environment

  • texchat-epiknet : Definition of the modality textchat. Type of messages used in cmr-getalp_org. Textchat features are those coming from EpikNet
  • Structure of interactions
    post: One post corresponds to one textchat turn, i.e. one participant's utterance.

    Data Collection

    Data collected : From 2004-02-03 to 2004-04-09
    rs: Blanquefort, France
    rs: 7008161
    rs: http://www.botstats.com
    rs: http://www.epiknet.org

    Language of the data: French

    Types of interaction

    channel: mode: w, textchat
    constitution: Messages typed by participants inside EpikNet IRC channels and then collected by Botstats.com
    derivation: type: original
    domain: type: public
    factuality: type: fact
    interaction: type: complete, active: plural, passive: many
    preparedness: type: spontaneous
    purpose: degree: high, General-purpose channel.

    Participants (extract)

    As explained in the tagUsage of the element post, the system offers no unambiguous way of identifying a participant interacting in a given channel (possibly over several weeks). Tracking the use of aliases may be one way of approaching this identification, but it is not completely reliable. Hence it is not possible to give a complete list of participants here. This identification may be a topic of investigation for future analyses.

    Person ID= cmr-get-c053-p87
    persName: RoMinet

    Person ID= cmr-get-c053-p102
    persName: Anonyme3816579 Anonyme3823806 Anonyme3823810 Anonyme4133778 Anonyme4136467 Anonyme4725192 Anonyme4725194 Anonyme4725200 Anonyme4725203 Anonyme4725204 Anonyme5166824 Anonyme5169017 Anonyme5170918 Anonyme5170976 Anonyme5173867 Anonyme5742573 Anonyme5750154 Anonyme5750274 Anonyme6588985 Anonyme6589953 Anonyme7055530 Anonyme7055532 Anonyme7504364 Ashley BOB_MATTE_Le_PV COCOTTEEEEEEEEEEEEEEEE CoChOn CoCoooooooooooooo Cramoh_Mechant DaWa DarkWilloW Dieu94 Forbidage LeSuicidaire LioneL LuX Luri-luri2 LyLy_P0WA Mary-Kate Matte-24-S2 Matte-Alias-S2 MiaMaGe MissOrphy MisterGellar TheDieu UlfGooD_HaveGooD VaMpIrO Vanessa Vive-La-Len WILLOW Wh-Dreams-Cannelle Wh-Dreams-Vaness Will_DivX_DoDo WilloW WilloW-Formate WilloW-HaveGooD WilloW-LinuX WilloW10 WilloW2 WilloW4 WilloW6 WilloW94 WilloW_BOUDE_ANGIE WilloW_HaVeGooD WilloW_HaWaY WilloW_HaWay WilloW_HaWay_Brb_Pasla_OUT_OFF WilloW_HaveGooD WilloW_Haway WilloW_Kiss_Cho WilloW_Kiss_LyLy WilloW_M_Alysea WilloW_M_IkkI WilloW_M_IkkI_et_quasi WilloW_M_Orphy WilloW_TheMoches WilloW_UlfGooD_HaveGooD WilloW_Very_OQp WilloW__HaWay WilloW_haway WilloW_identify WilloW|DivX WilloW|Moches_TeAm| WilloW|away Willou Willou_Dreams_Cannellou Willou_Dreams_Cannelou Willou_NumeriK Willouuuuuuuuuuu-KIssss-Cybele Willouuuuuuuuuuuuuu-KIssss-Cyb X-raison [[[]]]] appelle-moi-Dieu apppelle-moi-Dieu bigmaker_es_til_gentil bigmaker_est_gentil squatteur

    Person ID= cmr-get-c053-p565
    persName: Ur_Nammu Ur_Nammu`

    Person ID= cmr-get-c053-p566
    persName: Ur_Nammu


    Extracts of Interactions


    Composition of the corpus

    Collection cmr-getalp_org: list of files

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org---quizz---tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org--p-u-r--tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-18-25ans-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-actu-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-allsoluces-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-anaisgirl-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-angel-corp-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-blondin-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-botstats-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-c++-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-caline-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-cocktails-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-cstrike-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-darkcloud-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-dbz_legend-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-debian-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-deejays-world-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-deglingo-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-dragon-ball-z-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-edelweiss-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-edensensuelcam-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-enjoy-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-fac-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-ffmaniac-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-ffparadise-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-ffx-2-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-fikx-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-foldingathome-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-france1-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-francophone-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-funkycops-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-g-faction-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-games-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-gck-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-greatnothing-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-hikago-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-hinatalove-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-hokutoteam-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-humour-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-iquotes-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-irpg-chat-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-irpg-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-ishtar-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-japanimation-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-jump-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-koma-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-kyo-music-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-le-monde-des-reptiles-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-leseigneurdesanneaux-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-linux-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-madness-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-magnapoke-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-manga-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-manga4ever-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-manganimation-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-mew-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-mixi-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-nemo-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-ninou-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-nintendojofr-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-nokiagame.fr-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-php-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-planete-gundam-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-pokelord-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-politique-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-princedelu-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-programmation-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-qcradio-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-quebec-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-radioabf-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-radiofrhub-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-ragnarok-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-ragot-chan-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-raysanctuary-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-rhone-alpes-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-rien-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-sc-team-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-scripts-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-slackware-tei-v1

    https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-[dmb]dreamchan-tei-v1


    Download the whole corpus: https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-tei-v1.zip (ZIP file, 118.8 MB)

    nbMotsMessages=53098 ; nbevenements=34323 ; nbcommandes=514 ; nbmessages=18641 ; nbmots=229653 ; nbparticipants=477 ; nbconnectes=1530 ; nbformes2=2772 ; nbformes1=2849

    Information computed and described by A. Falaise as follows:

    nbconnectes: the number of unique users connecting or disconnecting, determined from connection logins (not nicknames); that is, roughly the number of users connected to the channel at one time or another.
    nbparticipants: the number of unique contributors, determined from nicknames (a single user may have several nicknames, in which case several contributors are counted); that is, the number of users sending messages and/or commands.
    nbmessages: the number of "chat-message" tags (see that tag).
    nbevenements: the number of "chat-event" tags (see that tag).
    nbcommandes: the number of "chat-command" tags (see that tag).
    nbMotsMessages: the number of words in the contributions within "chat-message" tags, excluding paratext (date, time, author's nickname). A word is defined as anything found between two whitespace characters or any of the characters .:/\'"+;!,?(){}[]
    nbmots: the number of words in all contributions, excluding paratext (date, time, author's nickname).
    nbformes1: the number of unique forms occurring at least once.
    nbformes2: the number of unique forms occurring at least twice.
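
    The word and form counts defined above could be approximated along the following lines. This is only a minimal sketch, not the script used by A. Falaise: the function names and the sample messages are hypothetical, and treating forms as case-sensitive is an assumption, since the description does not say whether case is normalised.

```python
import re
from collections import Counter

# Separators named in the definition: whitespace plus . : / \ ' " + ; ! , ? ( ) { } [ ]
SEPARATORS = re.compile(r"[\s.:/\\'\"+;!,?(){}\[\]]+")


def tokenize(text):
    """Split text on the separators and drop the empty strings left at the edges."""
    return [tok for tok in SEPARATORS.split(text) if tok]


def count_stats(messages):
    """Return (word count, forms seen at least once, forms seen at least twice).

    Assumption: forms are compared case-sensitively.
    """
    counts = Counter()
    for message in messages:
        counts.update(tokenize(message))
    nb_words = sum(counts.values())
    nb_forms1 = len(counts)
    nb_forms2 = sum(1 for n in counts.values() if n >= 2)
    return nb_words, nb_forms1, nb_forms2


# Hypothetical messages, not taken from the corpus.
sample = ["salut tout le monde!", "salut, ca va?", "re (de retour)"]
print(count_stats(sample))
```

    Note that, under this definition, an apostrophe splits a token, so "l'heure" counts as two words.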


    Credits

    principal: Falaise Achille, Chanier Thierry.
    compiler: Falaise Achille.
    editor: Chanier Thierry.
    data inputter: Hriba Linda, Jin Kun.
    developer: Lotin Paul.
    participant: Wigham Ciara.
    publisher: ORTOLANG (Outils et Ressources pour un Traitement Optimisé de la LANGue), Nancy, France.

    Publication Statement and Rights

    Publisher(s)

    Date: 2014-05-01

    Identifier(s)

    uri: cmr-getalp_org-koma-tei-v1
    short-uri: cmr-get-c053
    url: https://hdl.handle.net/11403/comere/cmr-getalp_org/cmr-getalp_org-koma-tei-v1

    Licence

    http://creativecommons.org/licenses/by-nc-sa/4.0/

    This corpus can be freely distributed and shared subject only to attribution, non-commercial use and share-alike. The way to reference / cite the corpus is given in the titleStmt.

    Rights holders of this corpus are: Kévin Labécot ; Achille Falaise ; Thierry Chanier