USING MACHINE TRANSLATION ENGINES IN THE CLASSROOM: A SURVEY OF TRANSLATION STUDENTS’ PERFORMANCE

This paper outlines the results of an experimental study exploring the impact of machine translation (MT) engines on the performance of translation students. Machine translation engines are software developed to translate source texts into target languages in a fully automatic mode; they can be classified according to their algorithm (example-based, rule-based, statistical, pragmatics-based, neural, hybrid) and their level of customisation (generic, customised, adaptive). The study took the form of a translation test held in September, during the first semester of the 2019/2020 academic year. The subjects were 48 undergraduate students of the School of Foreign Languages of V. N. Karazin Kharkiv National University (42 women and 6 men aged from 19 to 22) majoring in translation. They were divided into two sample groups: sample group 1 translated a text from English into Ukrainian with the help of an MT engine, while sample group 2 translated the same text from English into Ukrainian by hand. The MT engine chosen for the research was Microsoft Translator for personal use. The research was arranged into the following stages: preliminary (designing the experiment), main (carrying out the experiment) and conclusive (processing and interpreting the results). Comparing the students' average results confirmed the hypothesis formulated at the beginning of the study: students who had received no previous training in post-editing produced translations of poorer quality when using a modern machine translation engine than students who translated the same text without one. The students using the machine translation engine tended not to treat its output critically and thus scored more penal points than the students translating on their own. The evidence from this study implies the need to teach students to use machine translation in their work and to post-edit texts, by developing an appropriate cross-curricular teaching methodology.


Introduction
The way translators work has changed drastically in recent decades due to the extensive development of technologies. The modern translation market demands that large volumes of text be translated within the shortest possible time. This requirement can be fulfilled only with the help of translation technologies, with CAT tools at the top of the list. However, the situation on the technology market is now shifting as machine translation (MT) gains popularity thanks to significant improvements in its quality and the prospect of reduced translation costs.
MT can be defined as "a sub-field of computational linguistics (CL) or natural language processing (NLP) that investigates the use of software to translate text or speech from one natural language to another" (Qun and Xiaojun, 2015), while an MT engine is the software developed to translate source texts into target languages in a fully automatic mode without any human assistance (What is Machine Translation?, n.d.). The terms "MT" and "MT engine" are often used interchangeably in scientific papers; for example, "MT output" in fact means "the output produced by an MT engine".
The main criteria used to classify MT and MT engines include their algorithm and the level of customisation.
According to the algorithm, MT technologies include (Hutchins and Somers, 1992): Example-based Machine Translation (EBMT); Rule-based Machine Translation (RBMT); Statistical Machine Translation (SMT); Pragmatics-Based Machine Translation (PBMT); and Neural Machine Translation (NMT). Most modern MT engines, however, operate on a hybrid basis, combining NMT with some other technology (most commonly SMT).
According to the level of customisation, MT engines are classified into generic (not customised and not specialised in any domain), customisable (customised and specialised, which means they can be trained to ensure better terminology accuracy in a specific domain) and adaptive (trained on the results of post-editing carried out by a human translator, which allows such a system to make more accurate suggestions to translators) (What is Machine Translation?, n.d.).
MT engines can be of real help, but they are unable to replace a human translator: their use is subject to restrictions and limitations, and the results (translated texts) need human control and correction. They influence the translation process and the way professional translators work, which implies that training in their professional use should be introduced into the curriculum of students majoring in translation. The need for such training is explicitly outlined in the central European document governing translator pedagogy, the European Master's in Translation (2017), which states that future translators are to master the basics of MT and its impact on the translation process, acquire the skills of assessing the relevance of MT engines in a translation workflow, and implement the appropriate MT engines where relevant. Mellinger (2017) emphasises the knowledge and skills gap in translator pedagogy with regard to MT use, while Doherty & Kenny (2014) stress the need to develop an MT teaching methodology, which requires fundamental research on all aspects of the impact of MT technology on the translation process, post-editing and translators.
The importance of the issue is confirmed by extensive previous studies on the topic, especially by foreign researchers. A study of the impact of MT error types on post-editing effort indicators (Daems, Vandepitte, Hartsuiker, Macken, 2015) showed that MT quality was a significant predictor of all types of post-editing effort indicators and that different types of MT errors predicted different post-editing effort indicators. Fiederer and O'Brien (2009) investigated whether MT output quality is lower than the quality produced by human translators and found that machine-translated, post-edited output was judged by eleven suitably qualified raters to be of higher clarity and accuracy, while the human translations were judged to be of better style. Temizöz (2016) examined the quality of post-editing of MT output by subject-matter experts versus professional translators and showed that the post-editing quality achieved by subject-matter experts was mostly as high as that of professional translators, the only exception being the rendering of terminology. The results of studies measuring MT post-editing productivity are rather contradictory: some researchers (Jia, Carl, Wang, 2019) report a significant increase in productivity when post-editing MT as compared to translation by hand, while others (García, 2010) report that no differences were noted, which leaves the problem open to discussion and calls for further research.
In Ukraine, studies devoted to MT are at an initial stage. Kostikova et al. (2019) conducted a comparative analysis of the output of two MT engines, Google Translate and Pragma Online, in terms of four main criteria: adequacy, the use of correct word equivalents, the accuracy of terminology translation, and grammatical compliance. The analysis showed that Google Translate, being a neural MT engine, demonstrated better results and can thus be recommended for use when a general understanding of scientific and technical texts is needed.
Goodmanian, Sitko and Struk (2019) investigated the functional and pragmatic adequacy of journalistic texts translated with MT systems and concluded that MT systems do not satisfy the established translation quality requirements but can be quite effective when used by professional translators. The paper also provides a detailed analysis of MT mistakes.
Our previous research on the correctness of term rendering by an MT engine versus human translators showed that in most cases Google Translate rendered the terms incorrectly and that the ratio of the types of machine translation mistakes varies across spheres of knowledge.
All the studies mentioned above contributed significantly to solving the problem of studying the impact of MT on the translation process and translators, but there are still many aspects remaining unexplored and necessitating further research, specifically the impact of MT engines on the translation students' performance.
Thus, the purpose of the article is to study the impact of modern MT engines on the performance of translation students in terms of translation quality, as compared to translation made by hand. Finding out whether this impact really exists will help us to formulate pedagogical implications for teaching future translators to use modern MT engines efficiently. To achieve this purpose, we set the following tasks: 1) to define the aim of the experimental study; 2) to develop a preconceived hypothesis; 3) to describe the structure of the experimental study; 4) to outline the materials used to conduct the study; 5) to provide the results obtained; 6) to discuss and interpret them; 7) to suggest pedagogical implications.
The aim of the experimental study is to measure and compare the translation students' performance in the classroom in translating social and political texts from English into Ukrainian with the use of a modern MT engine and without it.
In order to conduct our study, we developed a preconceived hypothesis: translation students who received no previous training in post-editing texts translated with the use of a modern MT engine are likely to show a poorer quality (measured in the proficiency coefficient developed by V. Bespalko) of the translated text compared to the translation of the same text made by the students without the use of the MT engine (by hand).

Method and procedure
The research was conducted in the form of a translation test held in September, during the first semester of the 2019/2020 academic year. The subjects were 48 undergraduate students of the School of Foreign Languages of V. N. Karazin Kharkiv National University (42 women and 6 men aged from 19 to 22) majoring in translation. All the students were arranged into two groups with the same level of preparation: the same subject-matter knowledge, translation experience and language knowledge. The level of the students' preparation was tested statistically on the basis of their previous translation results, using an independent (unpaired) two-sample t-test, which compares the means of two independent groups in order to determine whether there is a significant difference between them. We formulated two hypotheses: H0, there is no significant difference between the means (translation performance results) of the two groups; H1, there is a significant difference between the means (translation performance results) of the two groups.
The calculations followed the algorithm provided in the work of Seredenko & Dolzhikova (2009) and indicated that the H0 hypothesis was not rejected: no significant difference was found in the level of preparation of the two groups.
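The comparison of group means described above can be illustrated with a short sketch. It assumes the classical Student's t-test with pooled variance, which is not necessarily the exact procedure of Seredenko & Dolzhikova (2009), and the marks below are hypothetical values, not the students' actual results. The critical value 2.228 is the standard two-tailed table value for 10 degrees of freedom at the 0.05 level.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for an unpaired two-sample t-test.

    Assumes equal variances in both groups (pooled-variance Student's test).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical previous marks of two small groups (illustration only).
group_1 = [4, 5, 3, 4, 4, 5]
group_2 = [4, 4, 3, 5, 4, 5]

t_value, df = pooled_t_statistic(group_1, group_2)

# Two-tailed critical value for df = 10 at alpha = 0.05 (standard t-table).
T_CRITICAL_DF10 = 2.228
significant = abs(t_value) > T_CRITICAL_DF10
print(f"t = {t_value:.3f}, df = {df}, reject H0: {significant}")
```

If the absolute value of t stays below the critical value, H0 is not rejected and the groups can be treated as comparable in their level of preparation.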
None of the students had previously been exposed to any training in the use of MT. The research was arranged into the following stages: preliminary (designing the experiment), main (carrying out the experiment) and conclusive (processing and interpreting the results) (Zinukova, 2018).
All the subjects were subdivided into two sample groups: sample group 1 performed translation from English into Ukrainian with the help of the MT engine while sample group 2 translated the same text from English into Ukrainian by hand.
The only resource sample group 1 could use was the MT engine. Students were instructed to use it as extensively as possible, so most of them used segment-by-segment pre-translation or even the batch pre-translation feature and then post-edited the result to attain the highest possible quality.
While translating the text without the help of MT, students of sample group 2 were allowed to access the Internet on their mobile phones and to use all its information resources except MT engines.
The source text was an excerpt on text mining taken from the Internet and numbering 1610 characters without spaces (295 words). The MT engine chosen for the research was Microsoft Translator for personal use, as it is one of the market leaders and combines two major algorithms: neural MT and statistical MT. According to the level of customisation, this engine can be classified as generic, as it is not customised and not specialised in any specific domain. The software is free of charge and is commonly used by students for their homework and first freelance translations. As the students had not been exposed to any specific training in post-editing, we intended to find out whether the use of such MT engines influences students' performance in terms of quality and whether this influence is negative or positive. The results obtained may answer the important question of whether a specialised course in MT post-editing needs to be introduced. The time allowed for the task was 80 minutes. The students were asked to produce a high-quality translation which would meet translation market requirements. After receiving all the instructions, the students worked on their translations autonomously. The researchers were present in the classroom throughout the experiment.

Evaluation method
As the aim of our study is to measure students' performance in terms of translation quality, we chose to use the proficiency coefficient developed by Bespalko (1989). In order to obtain it, we checked the students' translations according to the assessment scheme developed at the Mykola Lukash Translation Studies Department of the School of Foreign Languages of V. N. Karazin Kharkiv National University (Chernovaty, 2009, 2010, 2013), in which three types of mistakes are distinguished, each being marked with the corresponding number of penal points: mistakes of the first type (1 penal point), where the sense is rendered erroneously or important information is omitted; mistakes of the second type (0.5 penal points), where the sense is rendered ambiguously and may be perceived erroneously by the recipient; mistakes of the third type (0.1 penal points), where the sense is rendered correctly but with some spelling, grammar, stylistic and other minor mistakes.
Each mistake in the students' translations was marked with the appropriate number of penal points. Once checking was finished, the overall sum of penal points was calculated and then converted into a mark in the five-grade system following the scheme developed at the same Department.
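The summation of penal points can be sketched as follows. The weights per mistake type are those of the assessment scheme described above; the function name and the mistake counts are hypothetical and serve only as an illustration. The conversion of the sum into a five-grade mark is deliberately left out, since the Department's conversion scheme is not reproduced here.

```python
# Penal points per mistake type, as given in the assessment scheme.
PENALTY_BY_TYPE = {1: 1.0, 2: 0.5, 3: 0.1}


def total_penal_points(mistake_counts):
    """Sum penal points for a translation.

    mistake_counts maps a mistake type (1, 2 or 3) to the number of
    mistakes of that type found in the translation.
    """
    total = sum(PENALTY_BY_TYPE[kind] * count
                for kind, count in mistake_counts.items())
    # Round to suppress floating-point noise from the 0.1 weight.
    return round(total, 2)


# Hypothetical translation: one type-1, three type-2 and eleven type-3 mistakes.
example_counts = {1: 1, 2: 3, 3: 11}
example_total = total_penal_points(example_counts)
print(example_total)  # 1.0 + 1.5 + 1.1 = 3.6
```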
The 5-grade system is commonly used in Ukrainian higher education establishments to evaluate students' academic performance. The meaning of the marks and their ECTS scale equivalents are given in Table 1. We then converted the students' marks in the 5-grade system into the proficiency coefficient using the formula developed by Bespalko (1989): K = A/N, where A is the mark scored by the student for the correctly performed task and N is the maximum possible mark. The proficiency coefficient is an instrument for measuring the completeness of the learning process. The learning process can be considered completed when K ≥ 0.7, as students possessing this level of proficiency are ready for self-education and self-improvement in their further activities. When K < 0.7, the students are likely to make systematic mistakes in their further activities and remain unable to correct them.
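The coefficient and its 0.7 threshold can be expressed in a few lines. This is a minimal sketch: the function names are hypothetical, the mark 3.6 out of 5 is an invented example, and the group averages 0.60 and 0.76 are the values reported in this study.

```python
def proficiency_coefficient(mark, max_mark):
    """Bespalko's proficiency coefficient K = A / N."""
    if max_mark <= 0:
        raise ValueError("max_mark must be positive")
    return mark / max_mark


def learning_completed(k, threshold=0.7):
    """The learning process is considered completed when K >= 0.7."""
    return k >= threshold


# Hypothetical student mark of 3.6 out of a maximum of 5.
k_example = proficiency_coefficient(3.6, 5)

# Average coefficients reported for the two sample groups.
k_group_1, k_group_2 = 0.60, 0.76
gap = round(k_group_2 - k_group_1, 2)

print(f"K = {k_example:.2f}, completed: {learning_completed(k_example)}")
print(f"SG-1 completed: {learning_completed(k_group_1)}")
print(f"SG-2 completed: {learning_completed(k_group_2)}")
print(f"gap in favour of SG-2: {gap}")
```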

Results
The results of the test are provided in Tables 2 and 3. Table 2 shows that in sample group 1 the penal points for mistakes of the first type ranged from 0 (no mistakes of this type) to 3.0. Penal points for mistakes of the second type varied from 0 to 4.0, and those for mistakes of the third type from 0.5 to 2.8.
On average, the students of sample group 1 scored the largest number of penal points in two categories, mistakes of the second and third types, while the lowest number was recorded for mistakes of the first type. This can be viewed as positive, as it suggests that the students had a predominantly good understanding of the sense of the source text. In terms of the proficiency coefficient, however, the results are less encouraging. Sixteen students (61.5 %) demonstrated proficiency below 0.7 (varying from 0.40 to 0.66), which is an insufficient level according to Bespalko's system (Bespalko, 1989). Five students (19.3 %) showed a sufficient level of proficiency, with coefficients varying from 0.72 to 0.78. The result of 2 students (7.7 %) can be marked as 'good', as their coefficients are 0.82 and 0.86. Only 3 students (11.5 %) achieved an excellent result: 0.90 to 0.96.
The results of sample group 2 are given in Table 3. The figures in Table 3 demonstrate that, all in all, the results of sample group 2 are better. The average overall number of penal points amounts to 2.6, compared to 4.0 penal points scored by sample group 1. The students of the second group have fewer mistakes of all types: mistakes of the first type, 0.7 compared to 1.0 in group 1; mistakes of the second type, 0.8 compared to 1.5; mistakes of the third type, 1.1 compared to 1.5.
The average proficiency coefficient of sample group 2 amounted to 0.76, which is a satisfactory result under Bespalko's system (Bespalko, 1989). Seven students (31.8 %) did not manage to reach a satisfactory level, with proficiency coefficients varying from 0.44 to 0.66. Three students (13.7 %) achieved a proficiency coefficient within 0.74 to 0.78 (satisfactory). Five students (22.7 %) showed a proficiency coefficient at a good level, 0.82 to 0.88, while 7 students (31.8 %) demonstrated an excellent result: 0.90 to 0.96.
Figure 1 demonstrates the difference between the results of students' translations (in penal points) performed with the help of the MT engine (sample group 1, SG-1) and by hand (sample group 2, SG-2) from English into Ukrainian.

Figure 1. The results of students' translations (in penal points) performed with the help of the machine translation engine (sample group 1, SG-1) and by hand (sample group 2, SG-2) from English into Ukrainian
Figure 2 shows the overall difference between the results of students' translations (in proficiency coefficient) performed with the help of the MT engine (sample group 1, SG-1) and by hand (sample group 2, SG-2) from English into Ukrainian.

Figure 2. Overall results of students' translations (in the proficiency coefficient) performed with the help of the machine translation engine (sample group 1, SG-1) and by hand (sample group 2, SG-2) from English into Ukrainian
The data of the research were tested with methods of mathematical statistics, which showed that the results obtained are statistically significant. The difference in the overall proficiency coefficient of the two sample groups may indicate that the students of sample group 1 tended to rely excessively on the MT engine, a failed strategy leading to a larger number of mistakes of all three types, while the students of sample group 2 were inclined to think critically, having no such technological tool at their disposal.
The quantitative results of the research were complemented with qualitative ones: after the written test, the students of both sample groups completed a questionnaire. The questionnaire was simple and contained just five questions devoted to the main aspects of MT use. All 48 students (100 %) answered that they use MT in their work. 43 students (89.6 %) stated that they trust the results of MT, while 5 students (10.4 %) mostly distrusted them. 40 students (83.3 %) believe that MT output needs only minimal post-editing, while 8 students (16.7 %) disagreed. 42 students (87.5 %) are convinced that MT improves translation quality and 46 students (95.8 %) are persuaded that MT increases proficiency. 6 students (12.5 %) remain rather sceptical about the improvement of quality with the help of MT and 2 students (4.2 %) are not sure that MT can enhance proficiency.
The overall results of the questionnaire correlate well with the quantitative results obtained by checking the students' translations and show that, on the whole, students perceive MT technology positively and use it extensively, but tend to be excessively confident in the final MT output.

Discussion
The results of the research confirmed the hypothesis formulated before testing the students. The students using the MT engine tended not to treat its output critically and thus scored more penal points than the students translating on their own. The gap between the proficiency coefficient of sample group 1 (students translating from English into Ukrainian with the help of the MT engine) and sample group 2 (students translating the same text from English into Ukrainian by hand) amounted to 0.16 in favour of group 2. It is also worth mentioning that sample group 1 made more mistakes of the second type than sample group 2, which may be explained by the fact that the students tended to feel rather confident about the MT engine output, making only minor edits if any at all.
The overall result demonstrated by sample group 1 is too low in terms of translation quality: the proficiency coefficient of the group averaged 0.6, which is an unsatisfactory and rather poor result for undergraduate students. A natural question arises: why did the students perform so poorly in post-editing NMT? We can assume that the answer lies in the absence of any training in using MT and post-editing: most students are unaware of the possibilities and restrictions of MT and of the levels of post-editing, and they lack the necessary skills.
This assumption is in line with previous studies. Yamada (2019) investigated the impact of Google NMT on post-editing by student translators and concluded that the improved algorithms, which produce human-like translations, make it even harder for students to meet professional post-editing quality standards: students demonstrated a poorer correction rate when post-editing NMT than when post-editing SMT, and it became obvious that post-editing NMT requires almost the same competence as translating a text 'from scratch' or editing a human translation, which implies the need for specific training. Koponen & Salmi (2017) conducted a pilot study analysing, in terms of correctness and necessity, the edits performed in a light post-editing English-Finnish task by 16 translation students previously exposed to MT and post-editing teaching. The results demonstrated that although most corrections made by the students were correct (which may be the result of the previous training), a considerable share of them (34 %) were unnecessary, which may imply the need for further training and for some changes to the current MT and post-editing curriculum. Koehn & Germann (2014) studied the impact of four SMT systems on the quality of human post-editing performed by four fluent bilingual native speakers of German with no professional experience in translation. The investigation showed that the differences between post-editors were much larger than the differences between SMT systems, thus indicating the need for skilled post-editors specifically trained for such tasks.
O'Brien (2002) compared the skillsets of a post-editor and a translator and concluded that they were different, so it cannot be assumed that any qualified translator will be a successful post-editor. The researcher stresses the need for teaching post-editing, arguing that it will enable future translators to meet modern translation market requirements: good-quality translations within faster production times. It could also beneficially influence students' perceptions of MT and prepare them to work with modern MT engines.
The results obtained in our study, coupled with the findings of previous research, enabled us to arrive at some pedagogical implications.
Though the gains in translation technology in general, and in MT in particular, are quite impressive, technological tools are unlikely to be able to replace a human translator in all aspects of the translation process, as it involves more than just rendering words from one language into another. Researchers (Pym, 2019) therefore suggest that the best prospect for future translators lies not in competing with modern technologies but in complementing them, employing a wide range of neighbouring skills such as:
• selling trustworthiness rather than words, which means training students to act as a final authority approving or disapproving MT output, especially in such sensitive domains as medicine, law etc.;
• playing with the full orchestra of translation solutions, that is, teaching students to apply the whole range of translation solutions: text tailoring, creating new words and doing many other things machines are not capable of;
• engaging in automation-related activities, which means teaching post-editing, pre-editing, revision, reviewing, project management etc.;
• doing more than translating, which means teaching students to act as localisation experts, transcreation specialists, quality assurance managers etc.
As the use of MT and post-editing is an important task for modern translators, it should take a special place in professional training, which indicates the need to develop a methodology for teaching MT and post-editing. Previous research (Scansani, Bernardini, 2019) also showed that translation students who received training in MT trust the technology: they can actively interact with it and demonstrate a constructive attitude. The need for extensive use of MT engines in professional practice is suggested by the study of Guerberof (2009), which compared professional translators' productivity and quality when using machine-translated output and when processing fuzzy matches from translation memories, and showed that both indicators (productivity and quality) were higher when post-editing MT engine output. Garcia (2011) points out that training in MT would have a beneficial effect on translation students, increasing their productivity.
The MT course should provide both knowledge about MT and post-editing and practical skills, so it should be arranged into two parts: theoretical and practical. Within the theoretical part, students are to learn the main historical milestones of MT, the translation technology landscape and the place of MT in it, the classifications of MT engines, and the advantages and disadvantages of MT. The practical part should equip students with a wide range of skills, from choosing the MT system appropriate for a specific project and preparing for the translation process with the use of MT so as to reduce the level of post-editing required, to performing full and light post-editing according to the TAUS MT Post-editing Guidelines (TAUS, 2016). Moreover, as researchers (Samson, 2005) point out that translation technology skills are of a cross-curricular nature, MT is to be implemented in other translation courses as one of the means of enhancing translation performance: the so-called transversal, or "everyware", approach to the teaching of translation technologies (Raído, 2013).

Conclusions
In this article, we have studied the impact of MT on the performance of translation students in terms of quality. The data obtained from the experimental study suggest that MT engines for personal use (such as Microsoft Translator) can considerably and negatively influence the performance of untrained translation students, so to improve their results students need to acquire knowledge about MT and post-editing as well as master practical skills.
Modern technology thus exerts an enormous influence on the landscape of the translation industry, the way professional translators work and, consequently, the way we teach future translators. MT has become a major driving force on the translation market, and to remain competitive future translators are to master the technology and turn it into an indispensable aid in their professional activity. Achieving this goal requires a great deal of research devoted to various aspects of MT usage and the post-editing process. The present study reveals several directions for further research.
Considering the restrictions of our study, its results cannot be regarded as final. As the small number of students naturally limits the generalisability of the results, the study would benefit from being repeated with a larger number of participants. It would also be useful to measure the time the sample groups need to perform their translation and post-editing tasks. The study can also be replicated with other texts and language pairs. As the results of our research show that MT systems for personal use (such as Microsoft Translator) may exert some negative influence on the translation performance of students who have not been exposed to any training, in our future work we also plan to build theoretical grounds for developing our own course for teaching MT and post-editing to undergraduate students majoring in translation. Another important aspect to be studied is the influence of professionally oriented MT engines on students' performance in terms of quality. It is also important to conduct a comparative study to find out whether MT engines for personal and professional use influence the quality of students' translations differently. Should this research prove that professionally oriented MT engines also negatively influence the quality of translations performed by untrained students, it would be reasonable to cover them in the course. The effectiveness of the course is to be verified experimentally by comparing the results of students possessing no post-editing skills with those of students who have gained them, which is one more direction of our further research.