
This page: http://hdl.handle.net/11403/comere/cmr-polititweets
Back to Repository main page: http://hdl.handle.net/11403/comere

Ortolang: Open Resources and TOols for LANGuage

CoMeRe corpus cmr-polititweets-tei-v1: Polititweets corpus, tweets from influential political accounts 1

How to cite this resource

Longhi, J., Marinica, C., Borzic, B., Alkhouli, A. (2014). Polititweets : corpus de tweets provenant de comptes politiques influents 1. In Chanier T. (ed) Banque de corpus CoMeRe. Ortolang.fr : Nancy. [http://hdl.handle.net/11403/comere/cmr-polititweets/cmr-polititweets-tei-v1]


The Polititweets corpus gathers tweets from 7 personalities belonging to 6 different French political groups: Mélenchon, Bayrou, Copé, Fillon, Le Pen, Ayrault, Cohn-Bendit. The messages were extracted from the Twitter (https://twitter.com/) accounts (twittos in French) of these personalities, by a method that selected the messages of 205 twittos sent in 2013-14. The total corpus thus amounts to 34,273 messages (tweets) containing 502,085 tokens / written forms, punctuation excluded. A large part of the content of these messages relates to the campaign for the municipal elections of March 2014. The first version of this corpus was built by the project "Digital humanities and data journalism: the case of political vocabulary" at the University of Cergy-Pontoise. The initial corpus was converted to the TEI format within the framework of the CoMeRe project (Communication médiée par les réseaux, network-mediated communication), http://comere.org.

The complete corpus consists of 7 TEI files. The body of each file gathers the messages coming from about 30 user accounts / twittos. The first series of messages comes from the account of one of the 7 selected personalities. The CoMeRe project aims to gather different corpora that represent the forms of communication in French on networks (Internet, phone, etc.), all structured and documented in the same way, and distributed in open access for research purposes. The CoMeRe project has received the support of ORTOLANG (the French equivalent of DARIAH) and of the national consortium Written-Corpus ('Corpus-écrits', http://corpusecrits.corpus-ir.fr), a subsection of Huma-Num.
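As an illustration of how messages grouped by account inside a TEI body might be counted, here is a minimal Python sketch. It is not taken from the corpus documentation: the element name `post`, the attribute `who`, and the inline sample document are assumptions made for the example, so check the corpus manual for the actual markup before adapting it.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def count_messages_by_author(xml_text, message_tag="post", author_attr="who"):
    """Count message elements per author in a TEI document.

    `message_tag` and `author_attr` are hypothetical defaults; the real
    element and attribute names should be taken from the corpus manual.
    """
    root = ET.fromstring(xml_text)
    counts = Counter()
    for elem in root.iter():
        if local_name(elem.tag) == message_tag:
            counts[elem.get(author_attr, "unknown")] += 1
    return counts


# Invented sample, only shaped like a TEI file; not real corpus data.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div>
      <post who="#account-a">first message</post>
      <post who="#account-a">second message</post>
      <post who="#account-b">third message</post>
    </div>
  </body></text>
</TEI>"""

counts = count_messages_by_author(SAMPLE)
for author, n in sorted(counts.items()):
    print(author, n)
```

Matching on the local element name avoids hard-coding the TEI namespace, which keeps the sketch usable even if the files declare it differently than assumed here.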

Keywords: Tweet; Computer Mediated Communication; CMC


This corpus contains:

http://hdl.handle.net/11403/comere/cmr-polititweets/cmr-polititweets-tei-v1-manuel.pdf
http://hdl.handle.net/11403/comere/cmr-polititweets/cmr-polititweets-c001-tei-v1.xml
http://hdl.handle.net/11403/comere/cmr-polititweets/cmr-polititweets-c002-tei-v1.xml
http://hdl.handle.net/11403/comere/cmr-polititweets/cmr-polititweets-c003-tei-v1.xml
http://hdl.handle.net/11403/comere/cmr-polititweets/cmr-polititweets-c004-tei-v1.xml
http://hdl.handle.net/11403/comere/cmr-polititweets/cmr-polititweets-c005-tei-v1.xml
http://hdl.handle.net/11403/comere/cmr-polititweets/cmr-polititweets-c006-tei-v1.xml
http://hdl.handle.net/11403/comere/cmr-polititweets/cmr-polititweets-c007-tei-v1.xml



This corpus can be freely distributed and shared, subject only to attribution. The way to reference / cite the corpus is given in the bibliographicCitation.
The rights holders of this corpus are Julien Longhi and Thierry Chanier.