From text to knowledge: factuality and degrees of certainty in Spanish – TAGFACT
The multitextual analysis accounts for different texts about the same event; the multidimensional analysis, for the various voices that evaluate the event; and the multilevel analysis, for the linguistic markers that express the point of view of the author and of the sources at the phonological, syntactic and discursive levels.
The first phase of the project is to create a tool for annotating the factual values of the events expressed in Spanish texts. The tool (i) identifies coreferences to events within the same text and across texts; (ii) identifies the various sources that express their degree of commitment to the certainty of the described events; and (iii) takes into account factuality markers at different linguistic levels when assigning factual values.
The second phase is to develop a rich and varied representation of factuality. This representation must not only account for the different elements and sources that play a role in it, but also apply measures to calculate intermediate values, based on the evaluations that the various sources give of the described events.
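To make the idea of a multi-source representation with intermediate values more concrete, here is a minimal sketch in Python. It is not the project's actual data model: the class names, the label set, the numeric scale from -1 to 1 and the use of an unweighted mean are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean

# Hypothetical scale: both the labels and the scores are assumptions,
# not the factual values used by the TAGFACT annotation scheme.
SCALE = {
    "counterfact": -1.0,
    "improbable": -0.5,
    "uncertain": 0.0,
    "probable": 0.5,
    "fact": 1.0,
}


@dataclass
class Evaluation:
    source: str   # who commits to the value (author, quoted speaker, ...)
    value: str    # a key of SCALE
    markers: list[str] = field(default_factory=list)  # linguistic cues behind the value


@dataclass
class Event:
    event_id: str  # shared by coreferent mentions within and across texts
    evaluations: list[Evaluation] = field(default_factory=list)

    def intermediate_value(self) -> float:
        """Unweighted mean of all source evaluations on the hypothetical scale."""
        return mean(SCALE[e.value] for e in self.evaluations)


# Three sources evaluate the same event differently.
event = Event("e1", [
    Evaluation("author", "probable", ["podría"]),
    Evaluation("minister", "fact", ["confirmó"]),
    Evaluation("opposition", "uncertain", ["duda"]),
])
print(event.intermediate_value())
```

An unweighted mean treats every source as equally reliable. A weighted average or a full distribution over values would be equally plausible choices under different assumptions.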
Finally, the third phase is to establish the degree of proximity between the readers’ evaluations, taken either individually or collectively, and the evaluations produced by the annotation tool.
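One simple way to picture such a comparison is sketched below in Python. It assumes, purely for illustration, that each evaluation is a score between -1 (the event did not happen) and 1 (the event happened), and it uses the mean absolute difference as the proximity measure. The function names and the measure itself are hypothetical, not the project's method.

```python
from statistics import mean


def proximity(tool: dict[str, float], reader: dict[str, float]) -> float:
    """Return a proximity in [0, 1]; 1 means identical scores on shared events."""
    shared = tool.keys() & reader.keys()
    if not shared:
        raise ValueError("no events in common")
    # Scores are assumed to lie in [-1, 1], so the largest possible gap is 2.
    return 1.0 - mean(abs(tool[e] - reader[e]) for e in shared) / 2.0


def collective(readers: list[dict[str, float]]) -> dict[str, float]:
    """Average the readers' scores for each event that any reader evaluated."""
    events = set().union(*readers)
    return {e: mean(r[e] for r in readers if e in r) for e in events}


# Hypothetical scores for two events.
tool_scores = {"e1": 1.0, "e2": 0.0}
reader_a = {"e1": 1.0, "e2": 1.0}
reader_b = {"e1": 1.0, "e2": 0.0}

print(proximity(tool_scores, reader_a))                          # individual reader
print(proximity(tool_scores, collective([reader_a, reader_b])))  # readers as a group
```

If evaluations were kept as categorical labels rather than scores, a chance-corrected agreement coefficient would be a more natural choice than a distance.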
In our project, we will focus on texts from political news; hence, the final results will be especially relevant to this field.
Ministerio de Economía, Industria y Competitividad – FFI2017-84008-P
- Alonso, L., I. Castellón, H. Curell, A. Fernández-Montraveta, S. Oliver, G. Vázquez (2018). “Proyecto TAGFACT: Del texto al conocimiento. Factualidad y grados de certeza en español”. Procesamiento del Lenguaje Natural, 61, pp. 151-154. ISSN: 1135-5948.
- Barrios, L., G. Vázquez (2020). “Las oraciones concesivas en español y la factualidad”. Estudios Filológicos, 66, 151-183.
- Fernández-Montraveta, A., G. Vázquez (2019). “Analysis of the production of pronominal constructions in Spanish in a learner corpus”. Journal of Research Design and Statistics in Linguistics and Communication Science, 5:1-2.
- Fernández-Montraveta, A., H. Curell, G. Vázquez, I. Castellón (2020). “The TAGFACT annotator and editor: A versatile tool”. Research in Corpus Linguistics, 8:1, 131-146.
- Rosá, A., I. Castellón, L. Chiruzzo, H. Curell, M. Etcheverry, A. Fernández, G. Vázquez, D. Wonsever (2019). “Overview of FACT at IberLEF 2019: Factuality Analysis and Classification Task”. Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019), co-located with the 35th Conference of the Spanish Society for Natural Language Processing (SEPLN 2019). Bilbao.
- Rosá, A., L. Alonso, I. Castellón, L. Chiruzzo, H. Curell, A. Fernández-Montraveta, S. Góngora, M. Malcouri, G. Vázquez, D. Wonsever (2020). “Overview of FACT at IBERLEF 2020: Events detection and classification”. IBERLEF-FACT. Universidad de Málaga.
- Vázquez, G., A. Fernández-Montraveta (2020). “Annotating Factuality in the Tagfact Corpus”. M. Fuster-Márquez, C. Gregori-Signes, J. Santaemilia Ruiz (eds.), Multiperspectives in analysis and corpus design. Granada: Comares, pp. 115-125.