Dr Bogdan Babych

Associate Professor in Translation Studies

+44 (0)113 343 1085

Summary: Machine Translation; computational models for morphosyntax; Slavonic languages; Ukrainian

Location: Michael Sadler Building

Teaching Commitments:
MODL5003M Principles and Applications of Machine Translation
MODL5005M Computers and the Translator
MODL5001M Methods and Approaches to Translation Studies
MODL5009M English for Translators


Bogdan Babych is Associate Professor in Translation Studies within the School of Modern Languages and Cultures.

He works in Computational Linguistics and Natural Language Processing. His publications cover evaluating and improving the quality of Machine Translation (MT) with Information Extraction techniques; extracting translation equivalents from large non-parallel corpora; developing MT for under-resourced languages; hybrid MT; computational models for Ukrainian and Russian morphosyntax; machine translation to and from Slavonic languages; and using MT to support language learning and authoring in a non-native language. He previously worked as a computational linguist at L&H Speech Products, Belgium. He holds a PhD in Machine Translation from the University of Leeds and a degree in Ukrainian Linguistics from the Ukrainian National Academy of Sciences. He now coordinates the FP7 Marie Curie project HyghTra, which is developing a new hybrid MT architecture. He previously worked on two other FP7 projects: ACCURAT (enhancing MT using comparable corpora for under-resourced languages) and TTC (mining translation terminology from comparable corpora). In 2007-2009 he worked on his project Translation Strategies in Comparable Corpora, supported by a Leverhulme Early Career Research Fellowship.

Bogdan's webpage at the School of Computing - Natural Language Processing group: http://www.comp.leeds.ac.uk/bogdan/

Research projects

(For details see Projects webpage)

EU FP7 Marie Curie IAPP project HyghTra (2010-2014)
Project: Hybrid high-quality translation system
Role: Coordinator

EU FP7 ICT Project ACCURAT (2010-2012)
Project: Analysis and evaluation of Comparable Corpora for Under Resourced Areas of machine Translation
Role: Principal Investigator for Leeds team

Leverhulme Early Career Research Fellowship (2007-2009)
Project: Translation Strategies in Comparable Corpora
Role: Principal Investigator

Research Student Supervision

Bogdan is interested in supervising PhD and MA by research students in a range of areas and topics, which include:

Machine Translation and Computer-Assisted Translation

Evaluation of Machine Translation
Linguistic models for Machine Translation
MT in the workflow of professional translators and translation companies
Improving MT quality with Computational Linguistics (CL) technologies
Emerging CL technologies for Computer-Assisted Translation and Interpreting
Collaborative translation workflow

Computational and Corpus Linguistics

Multiword expressions and phraseology
Computational models of discourse
Computational aspects of Slavonic languages
Computational models of morphosyntax
Corpus linguistics
Computational complexity of language
Tree Adjoining Grammars and other mildly context-sensitive formalisms
Corpus-based translation studies
Computational Linguistics methods for research in humanities

Slavonic Languages and General Linguistics

Ukrainian linguistics
Morphosyntax of Slavonic Languages
Linguistic constructions and their formal models

Please send an email, or your proposal with a CV, to b.babych@leeds.ac.uk
(Programming skills and/or knowledge of statistical packages are an advantage)

Citations, indices and videolectures

Google Scholar citations
ACM citations
CiteSeer citations
MT Archive publications (with PDFs)
DBLP index
Humbox profile
Video of a talk at ACL 2007, Prague: Assisting Translators in indirect lexical transfer

Selected publications

(2014) Bogdan Babych, Jonathan Geiger, Mireia Ginestí Rosell, Kurt Eberle. Deriving de/het gender classification for Dutch nouns for rule-based MT generation tasks. In Proc of EACL 2014 Third Workshop on Hybrid Approaches to Translation (HyTra).

(2012) Bogdan Babych, Anthony Hartley, Kyo Kageura, Martin Thomas, & Masao Utiyama: MNH-TT: a collaborative platform for translator training. [Aslib 2012] Translating and the Computer 34, 29-30 November 2012, One Birdcage Walk, London, UK; 18pp. [PDF, 1710KB]; presentation by Martin Thomas: 41 slides [PDF, 8772KB]

(2012) Kurt Eberle, Bogdan Babych, Johanna Geiß, Mireia Ginestí-Rosell, Anthony Hartley, Reinhard Rapp, Serge Sharoff, & Martin Thomas: Design of a hybrid high quality machine translation system. EACL Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra): Proceedings of the workshop, 23-24 April 2012, Avignon, France; pp.101-112. [PDF, 381KB]

(2012) Mārcis Pinnis, Radu Ion, Dan Ştefănescu, Fangzhong Su, Inguna Skadiņa, Andrejs Vasiļjevs, & Bogdan Babych: ACCURAT toolkit for multi-level alignment and information extraction from comparable corpora. [ACL 2012] Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Jeju, Republic of Korea, 10 July 2012, System Demonstrations; pp.91-96. [PDF, 235KB]

(2012) Reinhard Rapp, Serge Sharoff, & Bogdan Babych: Identifying word translations from comparable documents without a seed lexicon. EACL Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra): Proceedings of the workshop, 23-24 April 2012, Avignon, France; pp.10-19. [PDF, 188KB]

(2010) Jo Drugan & Bogdan Babych: Shared resources, shared values? Ethical implications of sharing translation resources. JEC 2010: Second joint EM+/CNGL Workshop “Bringing MT to the user: research on integrating MT in the translation industry”, AMTA 2010, Denver, Colorado, November 4, 2010; pp.3-9. [PDF, 9,433KB]

(2009) Bogdan Babych, Anthony Hartley, & Serge Sharoff: Evaluation-guided pre-editing of source text: improving MT-tractability of light verb constructions. LREC 2008: 6th Language Resources and Evaluation Conference, Marrakech, Morocco, 26-30 May 2008; 4pp. [PDF, 57KB]

(2007) Bogdan Babych, Anthony Hartley, & Serge Sharoff: A dynamic dictionary for discovering indirect translation equivalents. Translating and the Computer 29. Proceedings of the twenty-ninth international conference on Translating and the Computer, 29-30 November 2007 (London: Aslib, 2007); 10pp. [PDF, 150KB]

(2007) Bogdan Babych, Anthony Hartley, & Serge Sharoff: Translating from under-resourced languages: comparing direct transfer against pivot translation. MT Summit XI, 10-14 September 2007, Copenhagen, Denmark. Proceedings; pp.29-35 [PDF, 197KB]

(2007) Bogdan Babych & Anthony Hartley: Sensitivity of automated models for MT evaluation: proximity-based vs. performance-based methods. MT Summit XI Workshop: Automatic procedures in MT evaluation, 11 September 2007, Copenhagen, Denmark, [Proceedings]; 22pp. [PDF of PPT presentation, 150KB]

(2007) Bogdan Babych, Anthony Hartley, Serge Sharoff, & Olga Mudraya: Assisting translators in indirect lexical transfer. ACL 2007: proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, June 2007; pp. 136-143 [PDF, 285KB]

(2006) Serge Sharoff, Bogdan Babych, & Anthony Hartley: Using comparable corpora to solve problems difficult for human translators. Coling-ACL 2006: Proceedings of the Coling/ACL 2006 Main Conference Poster Sessions, Sydney, July 2006; pp.739-746. [PDF, 250KB]

(2006) Serge Sharoff, Bogdan Babych, & Anthony Hartley: Using collocations from comparable corpora to find translation equivalents. LREC-2006: Fifth International Conference on Language Resources and Evaluation. Proceedings, Genoa, Italy, 22-28 May 2006; pp.465-470 [PDF, 1104KB]

(2006) Serge Sharoff, Bogdan Babych, Paul Rayson, Olga Mudraya, & Scott Piao: ASSIST: automated semantic assistance for translators. EACL-2006: 11th Conference of the European Chapter of the Association for Computational Linguistics, Posters and demonstrations, Trento, Italy, April 5-6, 2006; pp.139-142 [PDF, 69KB]

(2005) Bogdan Babych, Anthony Hartley and Debbie Elliott: Estimating the predictive power of n-gram MT evaluation metrics across language and text types. MT Summit X, Phuket, Thailand, September 13-15, 2005, Conference Proceedings: the tenth Machine Translation Summit; pp.412-418. [PDF, 180KB]

(2005) Bogdan Babych: Information extraction technology in machine translation: IE methods for improving and evaluating MT quality. PhD thesis, University of Leeds, Centre for Translation Studies, March 2005. 186pp. [PDF, 859KB]

(2004) Bogdan Babych, Debbie Elliott, and Anthony Hartley: Extending MT evaluation tools with translation complexity metrics. Coling 2004: 20th International Conference on Computational Linguistics, 23-27 August 2004, University of Geneva, Switzerland, Proceedings; 7pp. [PDF, 68KB]

(2004) Bogdan Babych & Anthony Hartley: Extending the BLEU MT evaluation method with frequency weightings. ACL 2004: 42nd annual meeting of the Association for Computational Linguistics: Proceedings of the conference, 21-26 July 2004, Barcelona, Spain; pp. 621-628. [PDF, 132KB]

(2004) Bogdan Babych, Debbie Elliott, & Anthony Hartley: Calibrating resource-light automatic MT evaluation: a cheap approach to ranking MT systems by the usability of their output. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.2031-2034. [PDF, 237KB]

(2004) Bogdan Babych & Anthony Hartley: Modelling legitimate translation variation for automatic evaluation of MT quality. LREC-2004: Fourth International Conference on Language Resources and Evaluation, Proceedings, Lisbon, Portugal, 26-28 May 2004; pp.833-836. [PDF, 283KB]