cmr-polititweets-tei-v1-manuel.pdf
[…] details how the accounts of 200 persons were selected. An extract follows:
1) We started with 7 personalities from 6 different French political groups: JLMelenchon,
Bayrou, Copé, Fillon, Lepen, Ayraut, Cohn-Bendit.
2) We gathered all the lists mentioning them => 7087 lists.
3) Among these lists, we selected those that had at least 6 twittos (user accounts) and
that contained the character string *politic* in the name or description of the list
=> 120 lists (11K lines).
4) From these 120 lists, we selected 2934 messages / tweets.
5) To be sure to select only political twittos (and not journalistic ones...), we worked by
thresholds. By selecting only the accounts cited on more than 12 lists, we obtained 205
political twittos.
For these 205 accounts, we retrieved the last 200 messages / tweets of each person as of
27 March 2014, that is 34273 tweets. This yielded a corpus centred on the period between
the two rounds of the 2014 local elections or, for the less active accounts, on the run-up
to these elections or on the previous ones (the time span covered by each account depends
on how frequently it publishes tweets: the oldest tweet is dated 2009-03-04 11:59:49).
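The selection steps above can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual script: the dictionary keys `name`, `description` and `members`, the function names, and the case-insensitive matching are hypothetical. Only the thresholds (at least 6 accounts per list, the string politic, more than 12 lists per account) come from the description.

```python
from collections import Counter


def select_lists(lists, min_members=6, keyword="politic"):
    """Keep lists with at least `min_members` accounts whose name or
    description contains `keyword` (case-insensitive: an assumption)."""
    kept = []
    for lst in lists:
        text = (lst["name"] + " " + lst["description"]).lower()
        if len(lst["members"]) >= min_members and keyword in text:
            kept.append(lst)
    return kept


def select_accounts(lists, min_lists=12):
    """Keep accounts that appear on strictly more than `min_lists` lists."""
    # set() ensures an account is counted once per list
    counts = Counter(m for lst in lists for m in set(lst["members"]))
    return sorted(acc for acc, n in counts.items() if n > min_lists)
```

Note that a plain substring test on politic matches "politics" or "politicians" but not the French word "politique", which may be worth keeping in mind when interpreting the 120 retained lists.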
This corpus is a subpart of the CoMeRe corpus databank. The CoMeRe
(Communication Médiée par les Réseaux) project aims to
build a kernel corpus assembling existing corpora of different CMC (Computer-Mediated
Communication) genres and new corpora built on data extracted from the Internet. These
heterogeneous corpora will be structured and processed in a uniform way and complemented with
metadata. CoMeRe will be released as OpenData through the national infrastructure
Ortolang, following constraints which will be reused for the forthcoming “Corpus de
Référence du Français”. The project is supported by the national consortium Corpus-écrits,
a sub-part of Huma-Num, and by Ortolang (French counterpart to DARIAH).
The TEI structure used is an extension of TEI for CMC genres. This
extension is developed by a European project whose participants are: Michael Beißwenger
(DE), Thierry Chanier (FR), Isabella Chiari (IT), Maria Ermakova (DE), Maarten van Gompel
(NL), Iris Hendrickx (NL), Axel Herold (DE), Henk van den Heuvel (NL), Lothar Lemnitzer
(DE), Angelika Storrer (DE).
Description of the Interaction Space
CMC Environment
tweet: Definition of the modality Tweet. Type of messages used in Tweets.
Structure of interactions
text: each text corresponds to the set of tweets coming from the same Twitter
account
post: one post corresponds to one tweet.
- xml:id: ID of the posting.
- when: date of the message on Twitter.
- who: ID of the Twitter account, see listPerson.
- type: type of message, cf. taxonomy. Not displayed here. Default value for
all tweets
p: This element appears inside the
distinct: This element appears inside
- twitter-hashtag. Then the element contains ident with #, and ref with the URL of the discussion topic
- twitter-retweet. Then the element contains ident with RT
- twitter-via. Then the element contains ident with via
addressingTerm: An addressing term addresses an utterance to a particular
interlocutor / twitto or refers to a twitto. It includes:
- addressMarker with @
- addressee, which refers to a Twitter account
trailer: This element appears inside
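To make the distinct and addressingTerm markers described above more concrete, here is a minimal sketch of how the corresponding surface forms (#, RT, @) could be spotted in the raw text of a tweet. The regular expressions and the `annotate` function are illustrative assumptions, not the tool chain used to build the corpus, and the via marker is not handled.

```python
import re

# Surface patterns (assumptions): a hashtag is "#" + word characters,
# an addressee is "@" + word characters, a retweet starts with "RT".
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
RETWEET = re.compile(r"^RT\b")


def annotate(text):
    """Return the retweet flag, hashtags and addressees found in `text`."""
    return {
        "is_retweet": bool(RETWEET.match(text)),
        "hashtags": HASHTAG.findall(text),
        "addressees": MENTION.findall(text),
    }


sample = ("RT @aimonspau: Réunion de quartier au centre social "
          "la #pépinière avec @bayrou @MarcCABANE")
print(annotate(sample))
```

In Python 3, `\w` matches accented letters, so a hashtag such as #pépinière is captured whole. A pattern this simple would also match the part after @ in an e-mail address, so a production tool would probably need stricter rules.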
Data Collection
Data collected: from 2009-03-04 to 2014-03-27
location: Twitter website
For each of the 30 Twitter accounts selected among French politicians, the last 200
tweets were extracted. Since most of the messages were sent between the end of 2013 and
the end of March 2014, they mainly refer to discussions between the two rounds
of voting in the municipal elections of March 2014.
France
Language of the data: French
Types of interaction
channel: mode: w, Message sent through a Twitter account
constitution: Selected through automatic processing. See projectDesc for more
information
derivation: type: original
domain: type: public, domain of a message: politics
factuality: type: fact
interaction: type: complete, active: many
preparedness: type: spontaneous
purpose: political local elections
Participants (extract)
The list of participants, i.e. twittos, is given in sourceDesc.
Extracts of Interactions
head:Tweets de François Bayrou
- POST: xml:id: p_ts-a446716147963289600
| who: s-p17211968
| when: 2014-03-20T19:33:23
| xml:lang: fra
p: A. Juppé a vraiment raison. Ces spéculations sur 2017 sont ridicules et déplacées.
Qu'on nous laisse nous concentrer sur nos villes !
trailer:
- medium: web
- favoritecount: 27
- retweetcount: 77
- POST: xml:id: p_ts-a443151199412699136
| who: s-p17211968
| when: 2014-03-10T23:27:33
| xml:lang: fra
p: RT @aimonspau: Intervention François Bayrou meeting avec Alain Juppé samedi 8 mars - http://t.co/UzGkrU6rQD
trailer:
- medium: web
- retweetcount: 12
- isRetweet: true
- retweetedstatus_id: 443049867124805632
- POST: xml:id: p_ts-a442325971707133952
| who: s-p17211968
| when: 2014-03-08T16:48:24
| xml:lang: fra
p: RT @aimonspau: Une journée pour penser aux femmes et 6 ans pour agir concrètement.
Les candidates et les candidats d'#aimonspau s'engagent !
trailer:
- medium: web
- retweetcount: 14
- isRetweet: true
- retweetedstatus_id: 442191826486034432
- POST: xml:id: p_ts-a442195819312476160
| who: s-p17211968
| when: 2014-03-08T08:11:13
| xml:lang: fra
p: RT @aimonspau: Ce soir à 19h à la Foire Expo, venez nombreux au meeting @bayrou
@alainjuppe : pour des projets communs Bordeaux-Pau!
trailer:
- medium: web
- retweetcount: 15
- isRetweet: true
- retweetedstatus_id: 442187258599587840
- POST: xml:id: p_ts-a441595879028699136
| who: s-p17211968
| when: 2014-03-06T16:27:16
| xml:lang: fra
p: RT @MHollenweger: L"@Alternative_fr vous présente @Les_Europeens, ses candidats aux
élections européennes 2014. @UDI_off @MoDem @bayrou @JL…
trailer:
- medium: TweetDeck
- retweetcount: 7
- isRetweet: true
- retweetedstatus_id: 441587035678453761
- POST: xml:id: p_ts-a431798901436604416
| who: s-p17211968
| when: 2014-02-07T15:37:35
| xml:lang: fra
p: RT @aimonspau: Réunion de quartier au centre social la #pépinière avec @bayrou
@MarcCABANE @AnneCastera @VéroniqueLipsos @64Regis
trailer:
- medium: TweetDeck
- retweetcount: 6
- isRetweet: true
- retweetedstatus_id: 431798330755407873
- POST: xml:id: p_ts-a431390265388314624
| who: s-p17211968
| when: 2014-02-06T12:33:48
| xml:lang: fra
p: RT @francebleu: Merci d'avoir suivi notre LT de @bayrou dans #FBME avec @Bleu_Bearn et
@publicsenat #Bleumun http://t.co/Gr75SlhZU4
trailer:
- medium: TweetDeck
- retweetcount: 8
- isRetweet: true
- retweetedstatus_id: 431390231174983681
- POST: xml:id: p_ts-a431363554307231744
| who: s-p17211968
| when: 2014-02-06T10:47:40
| xml:lang: fra
p: RT @francebleucom: ECOUTER @bayrou est l'invité de France Bleu Midi Ensemble à 12h08
http://t.co/EpemFZ8ExO #mun64000
trailer:
- medium: TweetDeck
- retweetcount: 11
- isRetweet: true
- retweetedstatus_id: 431350302847946753
- POST: xml:id: p_ts-a430639846353166336
| who: s-p17211968
| when: 2014-02-04T10:51:54
| xml:lang: fra
p: Loi sur la #famille: Le gouvernement a eu peur de sa majorité. http://t.co/UCrmT4vlAf
trailer:
- medium: TweetDeck
- favoritecount: 10
- retweetcount: 33
head:Tweets de Élysée
- POST: xml:id: p_ts-a448961439969996800
| who: s-p16717501
| when: 2014-03-27T00:15:23
| xml:lang: fra
p: Dîner d’État en l'honneur de M. XI Jinping, président de la République populaire de
Chine http://t.co/DR76tR2S3v http://t.co/GXwlYZOz0x
trailer:
- medium: web
- favoritecount: 23
- retweetcount: 23
- POST: xml:id: p_ts-a448912658234748929
| who: s-p16717501
| when: 2014-03-26T21:01:32
| xml:lang: fra
p: [PHOTOS] Entretiens de @fhollande avec M. XI Jinping http://t.co/DcQuci3zAz #FranceChine
http://t.co/iPSRU0XusQ
trailer:
- medium: web
- favoritecount: 26
- retweetcount: 25
- POST: xml:id: p_ts-a448884792055316481
| who: s-p16717501
| when: 2014-03-26T19:10:48
| xml:lang: fra
p: #FranceChine > Pour ouvrir une nouvelle étape d’un partenariat
global stratégique franco-chinois étroit et solide http://t.co/udNiIiEFn4
trailer:
- medium: web
- favoritecount: 8
- retweetcount: 21
- POST: xml:id: p_ts-a448872927128477696
| who: s-p16717501
| when: 2014-03-26T18:23:40
| xml:lang: fra
p: [VIDÉO] Déclaration de @fhollande et de M. XI Jinping à l'occasion de leur
entretien à l’@Elysee http://t.co/ejA5IpT908 #FranceChine
trailer:
- medium: web
- favoritecount: 4
- retweetcount: 11
- POST: xml:id: p_ts-a448864047388450816
| who: s-p16717501
| when: 2014-03-26T17:48:23
| xml:lang: fra
p: [DIRECT] Visite d’État #FranceChine > Signatures d’accords et déclaration conjointe de
@fhollande et M. XI Jinping http://t.co/Fq3lOctqcs
trailer:
- medium: web
- favoritecount: 8
- retweetcount: 11
- POST: xml:id: p_ts-a448859895941451776
| who: s-p16717501
| when: 2014-03-26T17:31:53
| xml:lang: fra
p: [PHOTOS] Cérémonie d’accueil de M. XI Jinping à l’Hôtel national des Invalides
http://t.co/CKXZ2Bhwcs #FranceChine http://t.co/rxpckEqt3c
trailer:
- medium: web
- favoritecount: 13
- retweetcount: 36
- POST: xml:id: p_ts-a448848372787474433
| who: s-p16717501
| when: 2014-03-26T16:46:06
| xml:lang: fra
p: Le président @fhollande s'entretient avec M.Xi Jinping, président
de la République populaire de Chine #FranceChine http://t.co/DltaHKqHLG
trailer:
- medium: web
- favoritecount: 12
- retweetcount: 31
- POST: xml:id: p_ts-a448845726446284800
| who: s-p16717501
| when: 2014-03-26T16:35:34
| xml:lang: fra
p: La Garde républicaine escorte @fhollande et le président chinois M. XI Jinping
jusqu'à l'@Elysee https://t.co/13THwlK3b6 #FranceChine
trailer:
- medium: web
- favoritecount: 16
- retweetcount: 33
- POST: xml:id: p_ts-a448844139455475712
| who: s-p16717501
| when: 2014-03-26T16:29:16
| xml:lang: fra
p: Visite d’État > François Hollande @fhollande reçoit le président chinois M. XI Jinping à
l'@Elysee #FranceChine http://t.co/8rjQJuS53h
trailer:
- medium: web
- favoritecount: 13
- retweetcount: 35
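For readers who process the corpus programmatically, the first extract above might be read as sketched below. The XML string is an assumed, simplified serialization built only from the element and attribute names listed in this manual (post, p, xml:id, who, when, xml:lang). The actual corpus files may use a namespace, further nesting inside p, or a different form for who. `read_post` is a hypothetical helper, not part of any corpus tool.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# The "xml:" prefix is predefined and bound to this namespace.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Assumed serialization of the first extract (illustration only).
SAMPLE = """<post xml:id="p_ts-a446716147963289600" who="s-p17211968"
      when="2014-03-20T19:33:23" xml:lang="fra"><p>A. Juppé a vraiment raison. Ces spéculations sur 2017 sont ridicules et déplacées. Qu'on nous laisse nous concentrer sur nos villes !</p></post>"""


def read_post(xml_text):
    """Parse one post element into a plain dictionary."""
    post = ET.fromstring(xml_text)
    return {
        "id": post.get(XML_NS + "id"),
        "who": post.get("who"),
        # "when" is an ISO 8601 timestamp without time zone
        "when": datetime.fromisoformat(post.get("when")),
        "lang": post.get(XML_NS + "lang"),
        # itertext() also collects text nested in child elements of p
        "text": "".join(post.find("p").itertext()),
    }


record = read_post(SAMPLE)
print(record["who"], record["when"].date(), record["text"][:27])
```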
Composition of the corpus
30 user accounts / twittos ; 5 511 posts ; 92 386 tokens
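As a quick consistency check on these figures, the averages per account and per post can be computed directly (the variable names are illustrative):

```python
accounts, posts, tokens = 30, 5511, 92386

posts_per_account = posts / accounts  # 183.7
tokens_per_post = tokens / posts      # about 16.76

print(f"{posts_per_account:.1f} posts per account")
print(f"{tokens_per_post:.2f} tokens per post")
```

An average of about 184 posts per account is compatible with the cap of 200 tweets retrieved per account.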
Credits
principal : Longhi Julien, Chanier Thierry.
compiler : Longhi Julien, Marinica Claudia.
editor : Chanier Thierry .
data inputter : Hriba Linda, Borzic Boris, Alkhouli Abdulhafiz.
developer : Lotin Paul.
participant : Ledegen Gudrun.
publisher : ORTOLANG (Outils et Ressources pour un Traitement Optimisé de la
LANGue), Nancy, France.
Publication Statement and Rights
Publisher(s)
Date: 2014-05-14
Identifier(s)
uri: cmr-polititweets-c002-tei-v1
url: https://hdl.handle.net/11403/comere/cmr-polititweets/cmr-polititweets-c002-tei-v1
Licence
http://creativecommons.org/licenses/by/4.0/
Rights holders of this corpus are: Julien Longhi; Thierry Chanier.
This corpus can be freely distributed and shared subject only to
attribution. The way to reference / cite the corpus is given in the
titleStmt.