Marcos Garcia
Postdoctoral researcher in Natural Language Processing.
LyS Group (Language and Information Society)
- I am currently a member of the CiTIUS research centre at the University of Santiago de Compostela.
- I gave an invited talk on open resources for Galician at OpenCor 2020 (at PROPOR 2020, on 2 March).
- I am co-organizing the Workshop on Hybrid Intelligence for Natural Language Processing Tasks (HI4NLP), to be held at ECAI-2020.
- I co-organized techLING (track 3: Computational Linguistics), held in Corunha from 9 to 11 October.
- I was a guest editor of the special issue Natural Language Processing and Text Mining (in the open-access journal Information).
Education
- PhD in Linguistics (NLP), University of Santiago de Compostela (2014).
- D.E.A. (Diploma of Advanced Studies) in Galician and Portuguese Philology, University of Santiago de Compostela (2009).
- MA in Linguistics, University of Lisbon (2008).
- Degree (Licenciatura) in Portuguese Philology, University of Santiago de Compostela (2005).
Awards
- Best PhD dissertation award at PROPOR 2016.
- Extraordinary Doctorate Award in Arts and Humanities 2014/2015, USC.
Competitive funding
- Leonardo Grant for Researchers and Cultural Creators, BBVA Foundation, 2017.
- Juan de la Cierva incorporación 2016 (postdoctoral).
- Juan de la Cierva formación 2014 (postdoctoral).
- PhD Programme, University of Santiago de Compostela, 2010.
- Research Programme, Instituto Camões, 2007-2009.
Affiliations
- 2017-present: CITIC.
- 2016-present: LyS Group.
- 2011-2015: Researcher at CiTIUS.
- 2009-2015: GE / ProLNat Group.
- 2007-2008: Galabra Group.
- 2006-2007: Natural Language and Speech Group, NLX.
Teaching
- Introduction to Natural Language Processing for Lexicography (EMLex - European Master in Lexicography, UMinho), 2018/2019.
- Resources and tools for lexicography: use and design II (EMLex - European Master in Lexicography, USC), 2017/2018, 2018/2019.
- Languages and technologies (Faculty of Philology, UdC), 2016/2017, 2017/2018.
- General linguistics (Faculty of Philology, UdC), 2016/2017, 2017/2018.
- Spanish phonetics and phonology (Faculty of Philology, USC), 2011/2012.
- Computational analysis of Hispanic texts (Faculty of Philology, USC), 2010/2011.
- Acoustic phonetics: theory and software (Postgraduate programme in Voice Sciences at ISAVE), 2008/2009.
Publications
2019
- Garcia, Marcos, Marcos García-Salido and Margarita Alonso-Ramos, 2019. Weighted compositional vectors for translating collocations using monolingual corpora. In Computational and Corpus-Based Phraseology (EUROPHRAS 2019). Lecture Notes in Artificial Intelligence, 11755. Springer: 113-128.
- García-Salido, Marcos, Marcos Garcia and Margarita Alonso-Ramos, 2019. Identifying lexical bundles for an academic writing assistant in Spanish. In Computational and Corpus-Based Phraseology (EUROPHRAS 2019). Lecture Notes in Artificial Intelligence, 11755. Springer: 144-158.
- Gamallo, Pablo, Marcos Garcia and Patricia Martín-Rodilla, 2019. NER and Open Information Extraction for Portuguese. Notebook for IberLEF 2019 Portuguese Named Entity Recognition and Relation Extraction Tasks. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019), co-located with 35th Conference of the Spanish Society for Natural Language Processing (SEPLN 2019): 457-467.
- Canosa, Xavier, Pablo Gamallo, Xavier Varela, José Ángel Taboada, Paulo Martínez Lema and Marcos Garcia, 2019. Uma utilidade para o reconhecimento de topónimos em documentos medievais. Linguamática, 11(1), p. 3-15.
- Garcia, Marcos and Marcos García-Salido, 2019. A method to automatically identify diachronic variation in collocations. In Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change 2019 at the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019): 71-80, Florence.
- Gamallo, Pablo and Marcos Garcia, 2019. Unsupervised Compositional Translation of Multiword Expressions. In Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019) at the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019): 40-48, Florence.
- Garcia, Marcos, Marcos García-Salido and Margarita Alonso-Ramos, 2019. A comparison of statistical association measures for identifying dependency-based collocations in various languages. In Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019) at the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019): 49-59, Florence.
- Garcia, Marcos, Marcos García-Salido, Susana Sotelo Docío, Estela Mosqueira and Margarita Alonso-Ramos, 2019. Pay attention when you pay the bills. A multilingual corpus with dependency-based and semantic annotation of collocations. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019): 4012-4019, Florence.
- Garcia, Marcos, Marcos García-Salido and Margarita Alonso-Ramos, 2019. Towards the automatic construction of a multilingual dictionary of collocations using distributional semantics. In Proceedings of eLex 2019: Smart Lexicography: 747-762, Sintra.
- García-Salido, Marcos, Marcos Garcia and Margarita Alonso-Ramos, 2019. Towards a graded dictionary of Spanish collocations. In Proceedings of eLex 2019: Smart Lexicography: 849-864, Sintra.
- Garcia, Marcos, Marcos García-Salido and Miguel A. Alonso, 2019. Exploring cross-lingual word embeddings for the inference of bilingual dictionaries. In Proceedings of TIAD-2019 Shared Task – Translation Inference Across Dictionaries co-located with the 2nd Language, Data and Knowledge Conference (LDK 2019): 32-41. Leipzig. CEUR-WS, Vol. 2493.
- Garcia, Marcos, Marcos García-Salido and Margarita Alonso-Ramos, 2019. Discovering bilingual collocations in parallel corpora: A first attempt at using distributional semantics. In Irene Doval & María Teresa Sánchez-Nieto (eds.), Parallel corpora for contrastive and translation studies: New resources and applications. Studies in Corpus Linguistics, 90, p. 267-279. John Benjamins Publishing Company.
2018
- Garcia, Marcos, 2018. Comparing bilingual word embeddings to translation dictionaries for extracting multilingual collocation equivalents. In Stella Markantonatou, Carlos Ramisch, Agata Savary and Veronika Vincze (eds.), Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop. Phraseology and Multiword Expressions 3: 319-342. Language Science Press.
- Gamallo, Pablo, Marcos Garcia, César Piñeiro, Rodrigo Martínez-Castaño and Juan C. Pichel, 2018. LinguaKit: a Big Data-based multilingual tool for linguistic analysis and information extraction. In Proceedings of The Second International Workshop on Advances in Natural Language Processing (ANLP 2018) at The Fifth International Conference on Social Networks Analysis, Management and Security (SNAMS-2018): 239-244. Valencia.
- Gamallo, Pablo and Marcos Garcia, 2018. Task-Oriented Evaluation of Dependency Parsing with Open Information Extraction. In Villavicencio, Aline, Viviane Moreira, Alberto Abad, Helena Caseli, Pablo Gamallo, Carlos Ramisch, Hugo Gonçalo Oliveira and Gustavo Henrique Paetzold (eds.), Computational Processing of the Portuguese Language. 13th International Conference, PROPOR 2018, Canela, Brazil, Proceedings, volume 11122 of Lecture Notes in Artificial Intelligence: 77-82, Springer. (draft)
- Silva, João, Marcos Garcia, João Rodrigues and António Branco, 2018. LX-SemanticSimilarity. In 13th International Conference on the Computational Processing of the Portuguese Language (PROPOR 2018). Demo papers: 4-6. Canela, Brazil, 2018.
- Garcia, Marcos, 2018. Extracción automática de equivalentes multilingües de colocaciones. Procesamiento del Lenguaje Natural, 61, p. 131-134.
- García-Salido, Marcos and Marcos Garcia, 2018. Comparing learners’ and native speakers’ use of collocations in written Spanish. International Review of Applied Linguistics in Language Teaching (IRAL) 56(4), p. 401-426 (aop 2017). (draft)
- Gamallo, Pablo and Marcos Garcia, 2018. Dependency parsing with finite state transducers and compression rules. Information Processing & Management, 54(6), p. 1244-1261. (draft)
- García-Salido, Marcos, Marcos Garcia, Milka Villayandre and Margarita Alonso-Ramos, 2018. A Lexical Tool for Academic Writing in Spanish based on Expert and Novice Corpora. In Nicoletta Calzolari, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Koiti Hasida, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asunción Moreno, Jan Odijk, Stelios Piperidis and Takenobu Tokunaga (eds.), Proceedings of the 11th edition of the Language Resources and Evaluation Conference (LREC 2018), Miyazaki: 260-265.
- Gamallo, Pablo, Iván Rodríguez-Torres and Marcos Garcia, 2018. Distributional Semantics for Diachronic Search. Computers and Electrical Engineering (Special section on New Trends in Humanistic Informatics: Implementations and Applications), 65, p. 438-448.
- Garcia, Marcos, Carlos Gómez-Rodríguez and Miguel A. Alonso, 2018. New treebank or repurposed? On the feasibility of cross-lingual parsing of Romance languages with Universal Dependencies. Natural Language Engineering, 24(1), p. 91-122. (draft)
2017
- Querido, Andreia, Rita de Carvalho, João Rodrigues, Marcos Garcia, João Silva, Catarina Correia, Nuno Rendeiro, Rita Pereira, Marisa Campos and António Branco, 2017. LX-LR4DistSemEval: a collection of language resources for the evaluation of distributional semantic models of Portuguese. Revista da Associação Portuguesa de Linguística, 3, p. 265-283.
- Alonso-Ramos, Margarita, Marcos García-Salido and Marcos Garcia, 2017. Exploiting a Corpus to Compile a Lexical Resource for Academic Writing: Spanish Lexical Combinations. In Electronic lexicography in the 21st century. Proceedings of the eLex 2017 conference, Leiden: 571-584.
- Vilares, David, Marcos Garcia, Miguel A. Alonso and Carlos Gómez-Rodríguez, 2017. Towards Syntactic Iberian Polarity Classification. In Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA 2017) at EMNLP 2017: Conference on Empirical Methods in Natural Language Processing, Copenhagen: 67-73.
- Garcia, Marcos and Pablo Gamallo, 2017. A rule-based system for cross-lingual parsing of Romance languages with Universal Dependencies. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, Vancouver: 274-282.
- Gamallo, Pablo and Marcos Garcia, 2017. LinguaKit: uma ferramenta multilingue para a análise linguística e a extração de informação. Linguamática, 9(1), p. 19-28.
- García-Salido, Marcos, Marcos Garcia and Margarita Alonso-Ramos, 2017. Identificación de fórmulas recurrentes en español académico. In 9th International Conference on Corpus Linguistics (CILC 2017), Paris.
- Gamallo, Pablo, Iván Rodríguez-Torres and Marcos Garcia, 2017. A Web Interface for Diachronic Semantic Search in Spanish. In Proceedings of the Software Demonstrations at the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia: 45-48.
- Garcia, Marcos, Marcos García-Salido and Margarita Alonso-Ramos, 2017. Using bilingual word-embeddings for multilingual collocation extraction. In Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017) at the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia: 21-30.
2016
- Garcia, Marcos, Carlos Gómez-Rodríguez and Miguel A. Alonso, 2016. Creación de un treebank de dependencias universales mediante recursos existentes para lenguas próximas: el caso del gallego. Procesamiento del Lenguaje Natural, 57, p. 33-40.
- Garcia, Marcos, 2016. Universal Dependencies Guidelines for the Galician-TreeGal Treebank. Technical Report, LyS Group, University of Corunha.
- Garcia, Marcos, 2016. Semantic Relation Extraction. Resources, Tools and Strategies. In João Silva, Ricardo Ribeiro, Paulo Quaresma, André Adami and António Branco (eds.), PROPOR 2016, Computational Processing of the Portuguese Language. Lecture Notes in Artificial Intelligence, 9727. Springer: 141-152. Best PhD dissertation award, PROPOR 2016.
- Gamallo, Pablo and Marcos Garcia, 2016. Entity Linking with Distributional Semantics. In João Silva, Ricardo Ribeiro, Paulo Quaresma, André Adami and António Branco (eds.), PROPOR 2016, Computational Processing of the Portuguese Language. Lecture Notes in Artificial Intelligence, 9727. Springer: 177-188.
- Garcia, Marcos, 2016. Incorporating Lexico-semantic Heuristics into Coreference Resolution Sieves for Named Entity Recognition at Document-level. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis (eds.), Proceedings of the 10th edition of the Language Resources and Evaluation Conference (LREC 2016), Portorož: 3357-3361.
2015
- Gamallo, Pablo and Marcos Garcia, 2015. Multilingual Open Information Extraction. In Francisco Pereira, Penousal Machado, Ernesto Costa and Amílcar Cardoso (eds.): EPIA 2015, Progress in Artificial Intelligence. Lecture Notes in Computer Science, 9273. Berlin: Springer-Verlag: 711-722.
- Garcia, Marcos and Pablo Gamallo, 2015. Exploring the Effectiveness of Linguistic Knowledge for Biographical Relation Extraction. Natural Language Engineering, 21(4), p. 519-551 (First Online, 2013).
- Garcia, Marcos and Pablo Gamallo, 2015. Yet Another Suite of Multilingual NLP Tools. In José-Luis Sierra-Rodríguez, José Paulo Leal and Alberto Simões (eds.), Languages, Applications and Technologies. Communications in Computer and Information Science, 563. Switzerland: Springer: 65-75. Revised Selected Papers of the Symposium on Languages, Applications and Technologies (SLATE 2015), Madrid.
- Gamallo, Pablo, Marcos Garcia, Iria del Río and Isaac González López, 2015. Avalingua: Natural language processing for automatic error detection. In Marcus Callies and Sandra Götz (eds.), Learner Corpora in Language Testing and Assessment. Studies in Corpus Linguistics, 70, p. 35-58. John Benjamins Publishing Company. (draft)
2014
- Garcia, Marcos, 2014. Extracção de relações semânticas. Recursos, ferramentas e estratégias. PhD Thesis. University of Santiago de Compostela. Extraordinary Doctorate Award in Arts and Humanities 2014/2015, USC.
- Abuín, José Manuel, Juan Carlos Pichel, Tomás Fernández Pena, Pablo Gamallo and Marcos Garcia, 2014. Perldoop: Efficient Execution of Perl Scripts on Hadoop Clusters. In Proceedings of the 2014 IEEE International Conference on Big Data (IEEE Big Data 2014). Washington DC.
- Gamallo, Pablo, Juan Carlos Pichel, Marcos Garcia, José Manuel Abuín and Tomás Fernández Pena, 2014. Análisis morfosintáctico y clasificación de entidades nombradas en un entorno Big Data. Procesamiento del Lenguaje Natural, 53, p. 17-24.
- Garcia, Marcos and Pablo Gamallo, 2014. Entity-Centric Coreference Resolution of Person Entities for Open Information Extraction. Procesamiento del Lenguaje Natural, 53, p. 25-32.
- Garcia, Marcos, Pablo Gamallo, Iria Gayo and Miguel Anxo Pousada Cruz, 2014. PoS-tagging the Web in Portuguese. National varieties, text typologies and spelling systems. Procesamiento del Lenguaje Natural, 53, p. 95-101.
- Gamallo, Pablo, Marcos Garcia, Susana Sotelo and José Ramom Pichel, 2014. Comparing Ranking-based and Naive Bayes Approaches to Language Detection on Tweets. In Proceedings of TweetLID: Twitter Language Identification Workshop at XXX Congreso de la Sociedad Española de Procesamiento del Lenguaje Natural (SEPLN 2014), Girona: 12-16.
- Garcia, Marcos and Pablo Gamallo, 2014. An Entity-Centric Coreference Resolution System for Person Entities with Rich Linguistic Information. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, Dublin: 741-752.
- Gamallo, Pablo and Marcos Garcia, 2014. Citius: A Naive-Bayes Strategy for Sentiment Analysis on English Tweets. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin: 171-175.
- Garcia, Marcos and Pablo Gamallo, 2014. Multilingual corpora with coreferential annotation of person entities. In Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk and Stelios Piperidis (eds.), Proceedings of the 9th edition of the Language Resources and Evaluation Conference (LREC 2014), Reykjavik: 3229-3233.
2013
- Gamallo, Pablo, Marcos Garcia and Santiago Fernández-Lanza, 2013. A Naive-Bayes strategy for sentiment analysis on Spanish tweets. In Alberto Díaz Esteban, Iñaki Alegria and Julio Villena Román (eds.), Proceedings of the Workshop on Sentiment Analysis (TASS 2013) at the XXIX Congreso de la Sociedad Española de Procesamiento del Lenguaje Natural (SEPLN 2013), Madrid: 126-132.
- Gamallo, Pablo, Marcos Garcia and José Ramom Pichel, 2013. A Method to Lexical Normalisation of Tweets. In Alberto Díaz Esteban, Iñaki Alegria and Julio Villena Román (eds.), Proceedings of the Tweet Normalization Workshop at the XXIX Congreso de la Sociedad Española de Procesamiento del Lenguaje Natural (SEPLN 2013), Madrid: 81-85.
- Gamallo, Pablo, Marcos Garcia, Isaac González, Marta Muñoz and Iria del Río, 2013. An evaluation of Avalingua based on learner corpora. In Proceedings of the Workshop on (Learner) Corpora and their application in language testing and assessment at English corpus linguistics on the move: Applications and implications (ICAME 34), Santiago de Compostela: 52-53.
- Gamallo, Pablo and Marcos Garcia, 2013. FreeLing e TreeTagger: um estudo comparativo no âmbito do Português. Technical Report, ProLNat Group, University of Santiago de Compostela.
- Gamallo, Pablo, Marcos Garcia, Isaac González, Marta Muñoz and Iria del Río, 2013. Learning verb inflection using Cilenis conjugators. In Ana Gimeno (ed.), The Eurocall Review, 21(1), p. 12-19.
2012
- Gamallo, Pablo and Marcos Garcia, 2012. Técnicas de procesamiento del lenguaje natural en la Recuperación de Información. Novática, 215, p. 42-47.
- Garcia, Marcos, Iria Gayo and Isaac González López, 2012. Identificação e Classificação de Entidades Mencionadas em Galego. Estudos de Lingüística Galega, 4, p. 13-25.
- Gamallo, Pablo, Marcos Garcia and Santiago Fernández-Lanza, 2012. Dependency-Based Open Information Extraction. In Proceedings of the ROBUS-UNSUP 2012: Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP at the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012). Avignon: 10-18.
- Gamallo, Pablo and Marcos Garcia, 2012. Extraction of Bilingual Cognates from Wikipedia. In Helena Caseli, Aline Villavicencio, António Teixeira and Fernando Perdigão (eds.): PROPOR 2012, Computational Processing of the Portuguese Language. Lecture Notes in Artificial Intelligence, 7243. Berlin: Springer-Verlag: 63-72.
- Garcia, Marcos and Isaac González López, 2012. Automatic Phonetic Transcription by Phonological Derivation. In Helena Caseli, Aline Villavicencio, António Teixeira and Fernando Perdigão (eds.): PROPOR 2012, Computational Processing of the Portuguese Language. Lecture Notes in Artificial Intelligence, 7243. Berlin: Springer-Verlag: 350-361.
2011
- Garcia, Marcos and Pablo Gamallo, 2011. A Weakly-Supervised Rule-Based Approach for Relation Extraction. In Jose A. Lozano, Jose A. Gámez and José A. Moreno Pérez (eds.), Proceedings of the XIV Conference of the Spanish Association for Artificial Intelligence (CAEPIA 2011). Workshop on Knowledge Extraction and Exploitation from Semi-structured Online Sources (KEESOS). La Laguna.
- Gamallo, Pablo and Marcos Garcia, 2011. A Resource-Based Method for Named Entity Extraction and Classification. In L. Antunes and H. S. Pinto (eds.): EPIA 2011, Progress in Artificial Intelligence. Lecture Notes in Computer Science (LNCS/LNAI), 7026/2011. Berlin: Springer-Verlag: 610-623.
- Garcia, Marcos and Pablo Gamallo, 2011. Dependency-Based Text Compression for Semantic Relation Extraction. In Preslav Nakov, Zornitsa Kozareva, Kuzman Ganchev and Jerry Hobbs (eds.), Proceedings of the Workshop on Information Extraction and Knowledge Acquisition (IEKA 2011) at 8th International Conference on Recent Advances in Natural Language Processing (RANLP 2011), Hissar: 21-28.
- Garcia, Marcos and Pablo Gamallo, 2011. Evaluating Various Linguistic Features on Semantic Relation Extraction. In Galia Angelova, Kalina Bontcheva, Ruslan Mitkov and Nikolai Mikolov (eds.), Proceedings of the 8th International Conference on Recent Advances in Natural Language Processing (RANLP 2011), Hissar: 721-726.
- Garcia, Marcos and Isaac González López, 2011. Conversión Fonética Automática con Información Fonológica para el Gallego. Procesamiento del Lenguaje Natural, 47, p. 283-291.
- Garcia, Marcos and Pablo Gamallo, 2011. Resolución de Correferencia de Nombres de Persona para Extracción de Información Biográfica. Procesamiento del Lenguaje Natural, 47, p. 47-55.
- Garcia, Marcos and Pablo Gamallo, 2011. An Exploration of the Linguistic Knowledge for Semantic Relation Extraction in Spanish. In Patrick Saint-Dizier and Rutu Mehta-Melkar (eds.), Proceedings of the Joint Workshop FAM-LbR/KRAQ'11. Learning by Reading and its Applications in Intelligent Question-Answering at 22nd International Joint Conference on Artificial Intelligence (IJCAI'11), Barcelona: 7-12.
2010
- Garcia, Marcos, 2010. O Segmento lateral /l/ em Rima Interna. Sonoridade e Nuclearização em Português Europeu. Linguística. Revista de Estudos Linguísticos da Universidade do Porto, 5, p. 53-70.
- Garcia, Marcos and Pablo Gamallo, 2010. Análise Morfossintáctica para Português Europeu e Galego: Problemas, Soluções e Avaliação. Linguamática, 2(2), p. 59-67.
- Garcia, Marcos and Pablo Gamallo, 2010. Using Morphosyntactic Post-processing to Improve POS-tagging Accuracy. In Proceedings of the 9th International Conference on Computational Processing of Portuguese Language (PROPOR 2010). Extended Activities Proceedings, Porto Alegre.
- Garcia, Marcos and Pablo Gamallo, 2010. Do processamento morfológico à análise sintáctica de corpora multilíngue. In Actas del XXXIX Simposio Internacional de la Sociedad Española de Lingüística, Santiago de Compostela.
2009
- Garcia, Marcos, 2009. Como somos vistos em Portugal? A visão da Galiza através dos visitantes portugueses. In Actas do IX Congreso Internacional de Estudos Galegos. Novas achegas ao estudo da cultura galega II. Enfoques socio-históricos e lingüístico-literarios. Chapter III, 345-352 (2012).
- Garcia, Marcos, 2009. A imagem da Galiza através dos visitantes portugueses. Literatura, Turismo e Identidade. TIT, University of Santiago de Compostela.
2008
- Garcia, Marcos, 2008. Português Europeu e Galego. Estudo fonético e fonológico das consoantes em rima medial. MA Thesis. University of Lisbon.
- Garcia, Marcos, 2008. Aproximação ao rotacismo de /S/ pós-nasal nos dialectos ocidentais galegos. Estudos Linguísticos/Linguistic Studies, 1, p. 179-192.
- Garcia, Marcos, 2008. Turismo e Identidade. As motivações culturais dos visitantes portugueses à Galiza. Primeiras aproximações. In Helena Rebelo (coord.), Actas do IX Congresso da Associação Internacional de Lusitanistas (vol. 1), p. 265-270 (2011).
Projects
- 2020-2023: Study of lexical combinations in an academic corpus of novice writers for an academic writing aid tool (MICINN).
- 2018-2021: Advances in new answer extraction systems with semantic analysis and deep learning (MINECO).
- 2017-2019 (PI): Automatic extraction of multilingual collocation equivalents (FBBVA).
- 2017-2019: Corpus-based study of lexical combinations in academic Spanish for an academic writing aid tool (MINECO).
- 2015-2017: Language technologies for opinion analysis in social networks (MINECO).
- 2013-2016: HPCNLP: High-performance computing for Natural Language Processing (Galician Government).
- 2012: CELTIC: Strategic knowledge with competitive intelligence technologies (FEDER-Innterconecta).
- 2011-2013: OntoPedia: Automatic extraction of ontological and encyclopedic information about named entities (MICINN).
- 2011: CORUXA Biomedical Text Mining: automatic extractor and encoder of relevant medical information with open-source language engineering (Industrial project).
- 2010: COATI: multilingual opinion mining for industry and public administration (INCITE).
- 2008-2009: Automatic design of an ontology of proper names for a Question Answering system (MEC).
- 2005-2007: GramaXing - Computational grammar for the deep linguistic processing of Portuguese (FCT).
- 2004-2006: TagShare - Tools and resources for the annotation and shallow processing of Portuguese (FCT).
Committees
- ACL (2020; SRW 2020, 2019; Demos 2019, 2018).
- STIL 2019.
- CILC 2019.
- EMNLP (Demos 2019 and 2018).
- SEMAPRO 2018 and 2019.
- PROPOR (2018; SRW 2020).
- ICIW 2018 and 2019.
- NAACL-HLT 2018.
- CoNLL (2019 and UD Shared Tasks 2017 and 2018).
- SLATE 2017.
- LREC 2020, 2016, 2014.
- LinguaMÁTICA.
Reviewer
Resources and tools
- Diachronic Explorer.
- LX-SemanticSimilarity.
- LinguaKit.
- Galician-TreeGal (Universal Dependencies treebank).
- FreeLing (modules for Portuguese and Galician).
You can find some resources and tools (probably outdated) developed during my PhD at the University of Santiago de Compostela.