| details the way the accounts of 200
persons were selected. Here is an extract:
1) we started with 7 personalities from 6 different French political groups: JLMelenchon, Bayrou, Copé, Fillon, Lepen, Ayrault, Cohn-Bendit;
2) we gathered all the lists that mention them => 7087 lists;
3) among these lists, we selected those that had at least 6 twittos (user accounts) and that contained the character string *politic* in the name or description of the list => 120 lists (11K lines);
4) on these 120 lists, we selected 2934 tweets;
5) to be sure to select only political twittos (and not journalists...), we worked by thresholds. By selecting only the accounts listed on more than 12 lists, we obtained 205 political twittos.
For these 205 accounts, we retrieved the last 200 tweets of each account as of 27 March
2014, that is 34273 messages / tweets. This yielded a corpus centered on the
period between the two rounds of the 2014 local elections or, for the accounts that were
less active, covering the run-up to these elections or the previous ones (because the
time span covered by each account depends on how densely it publishes tweets: the
oldest tweet is dated 2009-03-04 11:59:49).
This corpus is a subpart of the CoMeRe corpus databank. The CoMeRe
(Communication Médiée par les Réseaux) project aims to
build a kernel corpus assembling existing corpora of different CMC (Computer-Mediated
Communication) genres and new corpora built on data extracted from the Internet. These
heterogeneous corpora will be structured and processed in a uniform way and complemented with
metadata. CoMeRe will be released as OpenData through the national infrastructure
Ortolang, following constraints which will be reused for the forthcoming “Corpus de
Référence du Français”. The project is supported by the national consortium Corpus-écrits, a sub-part of Huma-Num, and by Ortolang (French
correspondent for DARIAH).
The TEI structure used is an extension of TEI for CMC genres. This
extension is developed by a European project whose participants are: Michael Beißwenger
(DE), Thierry Chanier (FR), Isabella Chiari (IT), Maria Ermakova (DE), Maarten van Gompel
(NL), Iris Hendrickx (NL), Axel Herold (DE), Henk van den Heuvel (NL), Lothar Lemnitzer
(DE), Angelika Storrer (DE).
Description of the Interaction Space
CMC Environment
: Definition of the modality Tweet. Type of messages used in Tweets.
Structure of interactions
text: each text corresponds to the set of tweets coming from the same Twitter account.
post: one post corresponds to one tweet.
- xml:id: ID of the posting.
- when: date of the message on Twitter.
- who: ID of the Twitter account, see listPerson.
- type: type of message, cf. taxonomy. Not displayed here. Default value for
all tweets
p: This element appears inside the
distinct: This element appears inside
- twitter-hashtag. Then the element contains ident with
#, and ref with the URL of discussion topic
- twitter-retweet. Then the element contains ident with
- twitter-via. Then the element contains ident with
addressingTerm: Addressing terms address an utterance to a particular
interlocutor / twitto or refer to a twitto. They include:
- addressMarker with @
- addressee refers to a Twitter account
trailer: This element appears inside
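As an illustration of the structure described above, the following sketch reads one post and pulls out its attributes, hashtags and addressing terms. It is a simplified sketch under stated assumptions: the sample XML is hypothetical (no TEI namespace, no inline markup inside `p`), the function name `read_post` is invented for this example, and the hashtags and mentions are recovered with regular expressions on the plain text rather than from the corpus markup.

```python
import re
import xml.etree.ElementTree as ET

XML_NS = "{http://www.w3.org/XML/1998/namespace}"  # namespace of xml:id and xml:lang

# Simplified, hypothetical sample: no TEI namespace, no inline markup inside <p>.
SAMPLE = """
<post xml:id="p_ts-a449133103421063168" who="s-p551669623"
      when="2014-03-27T11:37:31" xml:lang="fra">
  <p>Avec 31.000 chômeurs en plus, il est temps que le chef de l'État assume
  une autre politique. @Le_Figaro #Chômage</p>
</post>
"""


def read_post(xml_text: str) -> dict:
    """Return the attributes and text of one <post>, plus hashtags and mentions."""
    post = ET.fromstring(xml_text)
    # Join all text inside <p> and collapse runs of whitespace.
    text = " ".join("".join(post.find("p").itertext()).split())
    return {
        "id": post.get(XML_NS + "id"),
        "who": post.get("who"),
        "when": post.get("when"),
        "lang": post.get(XML_NS + "lang"),
        "text": text,
        "hashtags": re.findall(r"#\w+", text),  # e.g. the ident of a twitter-hashtag
        "mentions": re.findall(r"@\w+", text),  # addressMarker followed by addressee
    }


record = read_post(SAMPLE)
```

Here `record["who"]` holds the account ID that would be looked up in listPerson, and `record["mentions"]` lists the addressing terms found in the tweet.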
Data Collection
Data collected : From 2009-03-04 to 2014-03-27
Twitter website
Out of the 30 Twitter accounts selected from French politicians, the last 200
tweets have been extracted. Since most of the messages were sent between the end of 2013 and
the end of March 2014, they mainly refer to discussions between the two rounds
of voting in the municipal elections of March 2014.
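To check this timing on a set of `when` values, one could count the timestamps that fall between the two rounds. A minimal sketch: the round dates (23 and 30 March 2014) are not stated in this description and are supplied here as an assumption, and the helper name `between_rounds` is hypothetical. The sample values are taken from the extracts in this document, plus the oldest date in the corpus.

```python
from datetime import datetime

# Assumption: dates of the first and second round of the 2014 municipal elections.
FIRST_ROUND = datetime(2014, 3, 23)
SECOND_ROUND = datetime(2014, 3, 30)


def between_rounds(when: str) -> bool:
    """True if an ISO 8601 `when` value lies between the two rounds."""
    return FIRST_ROUND <= datetime.fromisoformat(when) < SECOND_ROUND


whens = ["2014-03-27T12:10:14", "2014-03-26T21:13:51", "2009-03-04T11:59:49"]
# Share of the sample timestamps that fall between the two rounds.
share = sum(between_rounds(w) for w in whens) / len(whens)
```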
Language of the data:
Types of interaction
channel: mode: w
Message sent through a Twitter
constitution: Selected through automatic processing. See projectDesc for more
derivation: type: original
domain: type: public
domain of a message: politics
factuality: type: fact
interaction: type: complete
active: many
preparedness: type: spontaneous
purpose: political local elections
Participants (extract)
The list of participants, i.e. twittos, is given in sourceDesc.
Extracts of Interactions
head:Tweets de François Fillon
- POST: xml:id: p_ts-a449141337640558592
| who: s-p551669623
| when: 2014-03-27T12:10:14
| xml:lang: fra
p: Le 2ème tour doit surtout conduire F. #Hollande à sortir enfin de l'ambiguïté.
#Municipales2014 >>
- medium: TweetDeck
- favoritecount: 4
- retweetcount: 14
- POST: xml:id: p_ts-a449137666974834688
| who: s-p551669623
| when: 2014-03-27T11:55:39
| xml:lang: fra
p: A la rencontre des parisiennes et des parisiens du 4e avec
@vincentroger754 et
48.8555 - 2.3638
- medium: Twitter for iPhone
- favoritecount: 4
- retweetcount: 11
- POST: xml:id: p_ts-a449133103421063168
| who: s-p551669623
| when: 2014-03-27T11:37:31
| xml:lang: fra
p: Avec 31.000 chômeurs en plus, il est temps que le chef de l'État assume une autre
politique. @Le_Figaro #Chômage >>
- medium: web
- favoritecount: 5
- retweetcount: 28
- POST: xml:id: p_ts-a449123458921029632
| who: s-p551669623
| when: 2014-03-27T10:59:11
| xml:lang: fra
p: Il y a urgence : F. #Hollande doit se remanier lui-même.
#Municipales2014 >>
- medium: web
- favoritecount: 7
- retweetcount: 33
- POST: xml:id: p_ts-a449103491408334848
| who: s-p551669623
| when: 2014-03-27T09:39:50
| xml:lang: fra
p: RT
#municipales2014 Fillon: «Les Français ont exprimé leur rejet de
la politique de F. Hollande»
- medium: Twitter for iPhone
- retweetcount: 64
- isRetweet: true
- retweetedstatus_id: 449100421693329408
- POST: xml:id: p_ts-a448915754973679616
| who: s-p551669623
| when: 2014-03-26T21:13:51
| xml:lang: fra
p: Mes amis, notre victoire dimanche c'est le début du redressement national ! #Municipales2014
44.91178 - 4.91596
- medium: iOS
- favoritecount: 28
- retweetcount: 69
- POST: xml:id: p_ts-a448914334660710400
| who: s-p551669623
| when: 2014-03-26T21:08:12
| xml:lang: fra
p: Jusqu’à la dernière heure de ce scrutin, nos candidats ont besoin de chacun d’entre
nous. Rassemblons-nous pour aller chercher la victoire !
- medium: web
- favoritecount: 22
- retweetcount: 74
- POST: xml:id: p_ts-a448913849841102848
| who: s-p551669623
| when: 2014-03-26T21:06:16
| xml:lang: fra
p: Ici à Valence, le succès est à portée de votre main. A l’évidence,
est le maire déterminé et fédérateur dont Valence a besoin
- medium: web
- favoritecount: 7
- retweetcount: 23
- POST: xml:id: p_ts-a448913243923562496
| who: s-p551669623
| when: 2014-03-26T21:03:52
| xml:lang: fra
p: Le seul vote offensif contre le gouvernement, efficace pour la France, utile à nos
communes : c’est le vote droite républicaine et centre !
- medium: web
- favoritecount: 12
- retweetcount: 52
head:Tweets de France Diplomatie
- POST: xml:id: p_ts-a449141838645964800
| who: s-p29701712
| when: 2014-03-27T12:12:13
| xml:lang: fra
p: Le diplomate français à l'étranger est un vrai... couteau suisse ! A lire dans le
#blogdiplo :
- medium: web
- favoritecount: 4
- retweetcount: 4
- POST: xml:id: p_ts-a449137284227792896
| who: s-p29701712
| when: 2014-03-27T11:54:07
| xml:lang: fra
p: RT
@LaurentFabius: Avec mon homologue chinois
Wang Yi, tout à l'heure avant le séminaire franco-chinois
@France_en_Chine http://t.c…
- medium: web
- retweetcount: 10
- isRetweet: true
- retweetedstatus_id: 449118709479534592
- POST: xml:id: p_ts-a449137246567149568
| who: s-p29701712
| when: 2014-03-27T11:53:58
| xml:lang: fra
p: RT
@LaurentFabius: Le président chinois Xi
Jinping a rendu hommage au général de Gaulle hier
- medium: web
- retweetcount: 14
- isRetweet: true
- retweetedstatus_id: 449114721237467136
- POST: xml:id: p_ts-a449137025040793600
| who: s-p29701712
| when: 2014-03-27T11:53:06
| xml:lang: fra
p: RT
@LaurentFabius: Nous entendons travailler
avec la #Chine
pour préparer #Parisclimat2015 et aboutir à un accord universel et
- medium: web
- retweetcount: 13
- isRetweet: true
- retweetedstatus_id: 449103769868197888
- POST: xml:id: p_ts-a449136977623801858
| who: s-p29701712
| when: 2014-03-27T11:52:54
| xml:lang: fra
p: RT
@LaurentFabius: La France veut être au rdv
de ce nouvel âge de l'économie et de la société chinoise #diploéco
- medium: web
- retweetcount: 6
- isRetweet: true
- retweetedstatus_id: 449103584828080129
- POST: xml:id: p_ts-a449136897840123904
| who: s-p29701712
| when: 2014-03-27T11:52:35
| xml:lang: fra
p: #FranceChine
@LaurentFabius : Nous souhaitons que la
puisse accueillir très prochainement le #G20
- medium: web
- retweetcount: 5
- POST: xml:id: p_ts-a449136751735754752
| who: s-p29701712
| when: 2014-03-27T11:52:00
| xml:lang: fra
p: #FranceChine
@LaurentFabius: Nous visons 3 millions de
touristes chinois en #France
au lieu de la moitié #Chine
- medium: web
- retweetcount: 4
- POST: xml:id: p_ts-a449136500085915648
| who: s-p29701712
| when: 2014-03-27T11:51:00
| xml:lang: fra
p: #FranceChine
@LaurentFabius: Nous voulons favoriser
l'apprentissage du chinois dans le système éducatif francais #Chine
- medium: web
- retweetcount: 4
- POST: xml:id: p_ts-a449136215036805120
| who: s-p29701712
| when: 2014-03-27T11:49:52
| xml:lang: fra
p: RT
Il va y avoir des élections, un nouveau pouvoir légitime va émerger
- medium: web
- retweetcount: 9
- isRetweet: true
- retweetedstatus_id: 449076872941682688
Composition of the corpus
30 user accounts / twittos ; 4 920 posts / tweets ; 83 388 tokens
principal : Longhi Julien, Chanier Thierry.
compiler : Longhi Julien, Maricina Claudia.
editor : Chanier Thierry .
data inputter : Hriba Linda.
developer : Lotin Paul.
participant : Ledegen Gudrun.
publisher : ORTOLANG (Outils et Ressources pour un Traitement Optimisé de la
LANGue), Nancy:France
Publication Statement and Rights
Date: 2014-05-14
uri: cmr-polititweets-c004-tei-v1
Rights holders of this corpus are: Julien Longhi ; Thierry
This corpus can be freely distributed and shared subject only to
attribution. The way to reference / cite the corpus is given in the