Codificación y anotación preliminar de un corpus oral multilingüe de conversaciones telefónicas interpretadas para el estudio de los ataques a la imagen

Marcelo Yuji Himoro; Antonio Pareja-Lora

doi:10.25267/PRAGMALINGUISTICA.2022.I30.19

Marcelo Yuji Himoro ¹
Antonio Pareja-Lora ²

¹ Universidad Nacional de Educación a Distancia, Madrid, Spain (ROR: https://ror.org/02msb5n36)
² Universidad de Alcalá, Alcalá de Henares, Spain (ROR: https://ror.org/04pmn0e78)

Journal: Pragmalingüística

ISSN: 1133-682X

Year of publication: 2022

Issue: 30

Pages: 413-432

Type: Article

DOI: 10.25267/PRAGMALINGUISTICA.2022.I30.19

Abstract

The growing and increasingly consolidated demand for telephone interpreting services has led to more academic research on them. The aim of this work was to create an oral corpus of interpreter-mediated telephone interactions, designed specifically for the study of face attacks. These interactions always involve Spanish, paired with German, Chinese, French, English or Russian. First, we briefly describe the milestones reached in previous work: the collection of the anonymised recordings of the conversations, their initial processing, their transcription and their translation. Second, we detail how the transcriptions were converted to the EXMARaLDA format and then synchronised with the recordings. Finally, we discuss the limitations and difficulties encountered during conversion and synchronisation.

Data source: Dialnet
