Conference papers

Exploring a corpus annotated in causal discourse relations for the study of causal lexical clues: Cross-Linguistic Discourse Annotation: applications and perspectives

Abstract :

1 Introduction

The study of Discourse Relations (DRs) is usually based on the Lexical Clues (LCs) commonly associated with them, such as connectives. For example, a corpus study of causal DRs can start from the analysis of connectives commonly associated with causality, such as because. Such a semasiological approach, which proceeds from a given LC towards DRs, has a significant advantage: LCs are much easier to locate in a corpus than DRs.

The approach presented here combines two complementary types of analysis. We first adopt an onomasiological approach, which proceeds from a given DR towards LCs: we analyze all the occurrences of this DR in a corpus in order to identify all the LCs that contribute to its interpretation. These first results are then complemented by a semasiological analysis: each LC that has been identified is projected onto the corpus in order to determine whether or not it specifically marks the given DR.

The onomasiological approach requires data that have previously been annotated with DRs. Before the ANNODIS corpus was built (ANNotation DIScursive de corpus; Péry-Woodley et al., 2009, 2011; Afantenos et al., 2012), no such data existed for French, and an onomasiological approach of this kind was simply impossible for this language. This rather new methodology has already been applied to a few DRs in the ANNODIS corpus (see Vergez-Couret, 2010, for an application to the Elaboration DR). We focus here on a specific family of DRs, causal DRs, and base our study on a corpus specifically annotated with them: the EXPLICADIS corpus (EXPLication et Argumentation en DIScours; Atallah, 2014; Atallah, 2015).

2 The EXPLICADIS corpus

In the ANNODIS project, 86 texts were segmented into Elementary Discourse Units (EDUs) and then annotated with a tagset of DRs inspired by SDRT relations (Segmented Discourse Representation Theory; Asher and Lascarides, 2003). The EXPLICADIS corpus was built as a continuation of ANNODIS: the 86 texts were reused and re-annotated with a new, more complete and accurate set of causal DRs. Then 31 more texts were added, segmented and annotated in order to better represent different text genres: narrative, expository and argumentative. The whole EXPLICADIS corpus comprises 117 texts, 4,580 EDUs and 39,103 tokens.

This new set of causal DRs was adopted to remedy the difficulties that ANNODIS annotators experienced with the first set of DRs and to account adequately for the data with a semantically clear set of relations (Atallah, 2014; Atallah et al., 2016). Like the previous set, it includes two types of relations: Explanation relations (hereafter Rh_Exp) and Result relations (hereafter Rh_Res). The new set is original in that it distinguishes four subtypes of DRs within each rhetorical type:

  • content-level DRs, which involve a causal link between the eventualities described in the propositional content: Explanation (1) and Result (2);
  • epistemic DRs, which involve a causal link between knowledge items and beliefs: Explanation ep (3) and Result ep (4);
  • inferential DRs, which involve a causal link between knowledge items: Explanation inf (5) and Result inf (6);
  • speech-act (or pragmatic) DRs, which involve a causal link between an eventuality described in the propositional content and a speech act: Explanation prag (7) and Result prag (8).
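The two-step method described in the Introduction can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Segment` type, the function names, the toy English sentences, the candidate clue list and the "share of occurrences" measure are all assumptions made for illustration, and the real identification of clues in annotated occurrences is a manual linguistic analysis, not a token lookup.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Segment:
    """A text span with an optional discourse-relation label (hypothetical format)."""
    text: str
    relation: Optional[str]  # e.g. "Explanation", "Result", or None if unannotated


def onomasiological(corpus: List[Segment], target: str,
                    candidates: Iterable[str]) -> Counter:
    """Step 1 (relation -> clues): count candidate clues found in spans
    annotated with the target relation."""
    found: Counter = Counter()
    for seg in corpus:
        if seg.relation != target:
            continue
        tokens = seg.text.lower().split()
        for clue in candidates:
            if clue in tokens:
                found[clue] += 1
    return found


def semasiological(corpus: List[Segment], clues: Iterable[str],
                   target: str) -> Dict[str, float]:
    """Step 2 (clue -> relations): project each clue onto the whole corpus and
    compute the share of its occurrences that carry the target relation."""
    specificity: Dict[str, float] = {}
    for clue in clues:
        hits = [seg for seg in corpus if clue in seg.text.lower().split()]
        if not hits:  # clue never occurs; nothing to measure
            continue
        marked = sum(1 for seg in hits if seg.relation == target)
        specificity[clue] = marked / len(hits)
    return specificity


# Invented toy data, not taken from ANNODIS or EXPLICADIS.
CORPUS = [
    Segment("the road was closed because it had snowed", "Explanation"),
    Segment("he stayed home because he was ill", "Explanation"),
    Segment("she left since the meeting was over", "Explanation"),
    Segment("since 2010 the town has grown", None),
    Segment("it rained and the match was cancelled", "Result"),
]
CANDIDATES = ["because", "since"]  # hypothetical clue list

found = onomasiological(CORPUS, "Explanation", CANDIDATES)
specificity = semasiological(CORPUS, found, "Explanation")
```

On this toy data, "because" occurs only in spans labelled Explanation, whereas "since" also occurs in an unlabelled temporal use, so only half of its occurrences carry the relation. This is the kind of contrast the semasiological projection is meant to reveal.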

https://hal.archives-ouvertes.fr/hal-02982984
Contributor : Françoise Grélaud
Submitted on : Tuesday, November 3, 2020 - 11:08:36 AM
Last modification on : Saturday, November 7, 2020 - 3:14:10 AM

Identifiers

  • HAL Id : hal-02982984, version 1

Citation

Caroline Atallah, Myriam Bras, Laure Vieu. Exploring a corpus annotated in causal discourse relations for the study of causal lexical clues: Cross-Linguistic Discourse Annotation: applications and perspectives. Text Link 2018, Mar 2018, Toulouse, France. pp.12-28. ⟨hal-02982984⟩
