Antonio Moreno Sandoval


Professional contact information

Universidad Autónoma de Madrid (UAM)

Fields of interest

Corpus design, compilation and types
Corpus-based computational linguistics

Research area keywords

computational linguistics

Knowledge transfer keywords

Text and Opinion Mining

Other details


Ph.D. in Linguistics (1991, UAM), B.A. in Hispanic Philology (1986, UAM).

Current Positions

  • Principal Researcher of the Computational Linguistics Laboratory (LLI-UAM) at Autonomous University of Madrid (since 2001). Web site:
  • Profesor Titular (Associate Professor) of General and Computational Linguistics at the Faculty of Philosophy, UAM (since 2001). Previously Profesor Asociado (Assistant Professor) in the same field from 1993. Member of the LLI-UAM since 1993.
  • Senior Researcher at the Instituto de Ingeniería del Conocimiento (IIC), UAM, since December 2009.

Previous positions

  • Research Visiting Scholar at New York University (1991-1992), Computer Science Department, supported by a Fulbright Postdoctoral grant.
  • Research Visiting Scholar at Universität Augsburg (1998), Chair of Applied Linguistics, supported by a DAAD Postdoctoral grant.
  • Researcher at the Madrid IBM Scientific Centre (1989-1991).


Research leadership experience

Principal Researcher in the projects C-ORAL-ROM (2001-2004, European Union), UAM Spanish Treebank (1998-… NYU), MULTIMEDICA (2011-14), BRAVO-RL (2007-2010), RILARIM (2005-2007, CICYT), MAVIR (2006-2015) and ACORDEON (2000-2002, CICYT).

Researcher in the projects EUROTRA (1988, EU), Silvia-NLQ (1989-1991, IBM), Proteus (1991-1995, NYU), ARIES (1995, CICYT), ATILA (1996-1999, CICYT) and Spanish Network of Speech Technologies (2003-2005).

Supervisor of 9 defended Ph.D. theses and 12 M.A. theses. Supervisor of 10 research and doctoral scholarships, among them those of Dong Yang (MAE-AECI), Manuel Alcántara (FPU-MEC), Doaa Samy (MAE-AECI), Prem Prakash (MAE-AECI), Ana González (FPI-CAM), Leonardo Campillos (CAM), Alicia González (FPU-UAM) and Yuanyi Liu (China Gov).

Corpora development and collection

Antonio Moreno-Sandoval has directed and supervised the following data collections:

  1. UAM Spanish Treebank
  2. The Spanish set of the C-ORAL-ROM corpus, a spontaneous speech collection in Italian, French, Portuguese and Spanish
  3. CHIEDE corpus of spontaneous child language
  4. C-ORAL-JAPAN, a spontaneous speech corpus of Japanese
  5. C-ORAL-CHINA, a spontaneous speech corpus of Chinese
  6. MAVIR corpus of spoken lectures on ICT (Spanish and English)
  7. MULTIMEDICA corpus of Medicine (Spanish, Japanese and Arabic)

Those data collections can be found at



Author of the books Lingüística computacional: introducción a los modelos simbólicos, probabilísticos y biológicos (Computational Linguistics: an Introduction to the Symbolic, Probabilistic and Biological Models) (1998) and Gramáticas de unificación y rasgos (“Unification Grammars”) (2001). In addition, he has contributed chapters to collective volumes such as Treebanks: building and using parsed corpora (2003, Kluwer) and Corpus Linguistics around the World (2006, Rodopi). He is the author of articles in international journals (Journal of Natural Language Engineering, Journal of Logic, Language and Computation, among others) and in the proceedings of international conferences (COLING, LREC, MUC-4, GULP-PRODE, SEPLN). In total, he has more than 110 publications.