From text to knowledge: factuality and degrees of certainty in Spanish – TAGFACT

The general objective of this project is to develop an exhaustive representation of the factuality expressed in Spanish texts by means of a multidimensional, multitextual and multilevel analysis, with the aim of arriving at an analysis that can be performed automatically to a very high degree.

The multitextual analysis accounts for different texts about the same event; the multidimensional analysis, for the various voices that evaluate the event; and the multilevel analysis, for the various linguistic markers that express the point of view of the author and the sources at the phonological, syntactic and discursive levels.

The first phase of the project is to create a tool for annotating the factual values of the events expressed in Spanish texts that (i) identifies coreferences to events within the same text and across texts; (ii) identifies the various sources that express their degree of commitment to the certainty of the described events; and (iii) takes into account markers of factuality at different linguistic levels when assigning the different factual values.
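As an illustration only, the three requirements above could be reflected in an annotation record along the following lines. This is a minimal sketch, not the project's actual data model: the class names, field names, factual-value labels and example sentences are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List

# All names and label values below are hypothetical, chosen for illustration.

@dataclass
class Marker:
    text: str   # surface form of the factuality marker
    level: str  # linguistic level at which it operates, e.g. "syntactic"

@dataclass
class EventMention:
    doc_id: str         # text in which the mention occurs
    span: str           # words that express the event
    event_id: str       # shared by coreferent mentions, within and across texts
    source: str         # voice that commits to the factual value
    factual_value: str  # assumed labels: "fact", "possible", "counterfact"
    markers: List[Marker] = field(default_factory=list)

def group_by_event(mentions: List[EventMention]) -> Dict[str, List[EventMention]]:
    """Collect coreferent mentions under their shared event identifier."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[mention.event_id].append(mention)
    return dict(groups)

# Invented example: two texts refer to the same event ("e1").
mentions = [
    EventMention("news_a", "se aprobó la ley", "e1", "author", "fact"),
    EventMention("news_b", "aunque se apruebe la ley", "e1", "opposition",
                 "possible", [Marker("aunque", "syntactic")]),
    EventMention("news_b", "dimitió el ministro", "e2", "author", "fact"),
]
groups = group_by_event(mentions)
```

Grouping by a shared event identifier is one simple way to keep mentions from different texts, sources and marker levels attached to the same event.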

The second phase is to develop a varied and rich representation of factuality. This representation must not only account for the different elements and sources that play a role in it, but also apply measures to calculate intermediate values, based on the different evaluations of the described events given by the various sources.
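One possible way to obtain such an intermediate value is a weighted mean over the evaluations that different sources give to the same event. The sketch below assumes a numeric scale from -1 to 1 and per-source weights; neither the scale, the labels nor the function name come from the project, and other measures would be equally plausible.

```python
# Hypothetical numeric scale for factual values (an assumption, not the
# project's own scale).
SCALE = {"counterfact": -1.0, "possible": 0.0, "fact": 1.0}

def intermediate_value(evaluations, weights=None):
    """Weighted mean of the sources' evaluations of a single event.

    evaluations: dict mapping source name -> factual-value label
    weights: optional dict mapping source name -> non-negative weight
             (sources without an entry get weight 1.0)
    """
    weights = weights or {}
    total = sum(weights.get(source, 1.0) for source in evaluations)
    if total == 0:
        raise ValueError("at least one source needs a positive weight")
    weighted = sum(SCALE[label] * weights.get(source, 1.0)
                   for source, label in evaluations.items())
    return weighted / total

# Invented example: three voices evaluate the same event.
evaluations = {"author": "fact", "government": "fact", "opposition": "counterfact"}
score = intermediate_value(evaluations)  # (1 + 1 - 1) / 3
```

Under these assumptions, a score near 1 means the sources largely present the event as a fact, a score near -1 that they largely deny it, and values in between reflect disagreement or uncertainty.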

Finally, the third phase is to establish the degree of proximity between readers’ evaluations, taken either individually or collectively, and the evaluations resulting from the annotation tool.
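The project description does not say how this proximity is measured. As a hedged illustration, one common option is Cohen's kappa, which corrects raw agreement for chance; the collective reading is approximated here by a majority vote across readers. The labels, data and function names are invented for the example.

```python
from collections import Counter

def observed_agreement(a, b):
    """Proportion of events on which two label sequences coincide."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa: agreement between two annotators corrected for chance."""
    p_observed = observed_agreement(a, b)
    n = len(a)
    counts_a, counts_b = Counter(a), Counter(b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_expected == 1.0:  # both sequences use one and the same label throughout
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)

def majority_label(labels):
    """Most frequent label; ties go to the label encountered first."""
    return Counter(labels).most_common(1)[0][0]

# Invented data: four events, one tool output, three readers.
tool = ["fact", "fact", "possible", "counterfact"]
readers = [
    ["fact", "possible", "possible", "counterfact"],
    ["fact", "fact", "possible", "counterfact"],
    ["fact", "fact", "fact", "counterfact"],
]

individual = cohen_kappa(readers[0], tool)  # one reader against the tool
collective_labels = [majority_label(column) for column in zip(*readers)]
collective = cohen_kappa(collective_labels, tool)  # readers as a group
```

Comparing the individual and the collective figures shows whether the tool is closer to a single reader or to the consensus of the group.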

In this project, we focus on the study of political news texts; hence, the final results will be especially relevant to this field.


Ministerio de Economía, Industria y Competitividad – FFI2017-84008-P

  • Barrios, L., G. Vázquez (2020). “Las oraciones concesivas en español y la factualidad”. Estudios Filológicos, 66, 151-183.
  • Fernández-Montraveta, A., G. Vázquez (2019). “Analysis of the production of pronominal constructions in Spanish in a learner corpus”. Journal of Research Design and Statistics in Linguistics and Communication Science, 5:1-2.
  • Vázquez, G., A. Fernández-Montraveta (2020). “Annotating Factuality in the Tagfact Corpus”. M. Fuster-Márquez, C. Gregori-Signes, J. Santaemilia Ruiz (eds.), Multiperspectives in analysis and corpus design. Granada: Comares, pp. 115-125.