Introduction to transformers: a breakthrough in NLP

  1. Sabrina Queipo 1
  2. Antonio Garcia-Cabot 2
  3. Eva Garcia-Lopez 2
  4. David de-Fitero Domínguez 2
  1. 1 Massachusetts Institute of Technology (MIT), USA
  2. 2 Universidad de Alcalá, Alcalá de Henares, Spain (ROR: https://ror.org/04pmn0e78)

Book:
ATICA2022: Aplicación de Tecnologías de la Información y Comunicaciones Avanzadas y Accesibilidad
  1. Luis Bengoechea (coord.)
  2. Paola C. Ingavélez (coord.)
  3. José Ramón Hilera (coord.)

Publisher: Editorial Universidad de Alcalá ; Universidad de Alcalá

ISBN: 9788419745538

Year of publication: 2023

Pages: 142-147

Type: Book chapter

Abstract

Transformers are among the most powerful classes of NLP models invented to date. From machine translation to speech recognition, models such as BERT and GPT-2 achieve state-of-the-art performance both in evaluation scores (e.g., BLEU) and in training time. What is novel about transformers? Like many scientific breakthroughs, they are the synthesis of several ideas, including transfer learning, attention, and scaling up neural networks. This paper describes the key mechanisms behind their success, presents variants of the transformer architecture, and gives an overview of the Hugging Face ecosystem. It aims to introduce the theory of transformers, which is essential for students who want to deploy and train their own models for practical applications.
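The attention mechanism mentioned in the abstract is, at its core, scaled dot-product attention: each query is compared against all keys, the similarities are normalized with a softmax, and the result weights the values. A minimal NumPy sketch of that idea (illustrative only, not code from the chapter; the function name is our own) might look like:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    # Similarity of every query to every key, scaled to keep
    # the softmax in a well-conditioned range.
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a convex combination of the value rows.
    return weights @ V

# With all-zero queries the attention weights are uniform,
# so each output row is simply the mean of the value rows.
Q = np.zeros((2, 3))
K = np.eye(3)
V = np.arange(9.0).reshape(3, 3)
out = scaled_dot_product_attention(Q, K, V)
```

In a full transformer this operation is applied in parallel across several heads, each with its own learned projections of Q, K and V.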