Machine Translation Accuracy: Google Translate Case Study

Table of Contents

Problem Definition
Task Definition
Proposed Solution
Innovation
Ethics
References

Problem Definition

Accurate machine translation is in high demand: users need technology that converts text from one language into another while preserving the meaning as closely as possible. According to Anggaira and Hadi (2017), Google Translate is inadequate for this purpose, particularly when translating less common languages. The software tends to make numerous morphological and syntactic errors, so a human language expert should review the output to fix mistakes and correct misinterpreted passages. As such, the problems that the solution attempted to address were the program’s low accuracy and frequent errors, especially when working with less widely used languages.

According to Castelvecchi (2016), Google’s algorithms previously relied on traditional methods rather than artificial neural networks. The tool would scan text word by word, searching its database of existing translations for similar cases. This approach handles shorter, uncomplicated sentences well, but intricate constructions can severely confuse the machine and cause the structure of the text after the problematic segment to fall apart. Google therefore had to adopt an approach that could analyze a sentence by starting from the smallest syntactic units and combining them to form meanings (Castelvecchi, 2016).
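The weakness of word-by-word lookup can be illustrated with a toy sketch. The lookup table, the function name, and the English-to-Spanish sentences below are invented for illustration and are not taken from Google's system; the sketch only assumes that each token is translated independently of its neighbors.

```python
# Toy illustration only: a tiny hand-written lookup table, not real translation data.
PHRASE_TABLE = {
    "the": "el",
    "cat": "gato",
    "black": "negro",
    "sleeps": "duerme",
}


def translate_word_by_word(sentence: str) -> str:
    """Translate each token independently; unknown tokens pass through unchanged."""
    tokens = sentence.lower().split()
    return " ".join(PHRASE_TABLE.get(token, token) for token in tokens)


# A short, simple sentence survives the lookup.
print(translate_word_by_word("The cat sleeps"))        # el gato duerme

# Adding an adjective exposes the problem: the English word order is copied,
# whereas Spanish would normally place the adjective after the noun.
print(translate_word_by_word("The black cat sleeps"))  # el negro gato duerme
```

Because no token sees its context, the lookup cannot reorder words or resolve ambiguity, which is the kind of failure that grows as sentences become more intricate.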

Task Definition

Machine translation tools accept text in a predefined language as an input and return text with the same meaning in another requested language as an output. According to Sreelekha, Bhattacharyya, Jha, and Malathi (2016), the market for the technology will grow in the future, and translation speed, cost, and quality are all significant factors in the success of an application. Google’s service provides translations quickly enough and is free for customers, which leaves quality as the remaining concern. As almost all machine translations are flawed, quality represents a vital area for improvement if a company wants to obtain an advantage over its competitors.

Expanding the application’s vocabulary and improving its understanding of intricate text constructions both enhance the translation process. According to Li, Zhang, and Zong (2016), unknown words represent a significant challenge for machine translation systems, because the system knows neither the meaning nor the type of such a word. However, according to Castelvecchi (2016), Google chose to concentrate on analyzing sentence structure and correctly interpreting word combinations. This choice is likely due to the size of the company’s database described by Castelvecchi (2016), which gives the application a considerably larger vocabulary than its competitors.

Proposed Solution

According to Castelvecchi (2016), Google chose to implement translation via neural network analysis to improve the company’s business process, as the approach is effective at improving the quality of tasks that benefit from analytical abilities. According to Luong and Manning (2015), neural machine translation is conceptually simple but can achieve results comparable to state-of-the-art traditional algorithms. The idea requires creating a neural network and training it on the existing translations compiled by Google. During the supervised learning process, the system learns to predict translations and builds an internal model that captures language rules and ways of resolving ambiguous situations.
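Supervised training of a neural translation model is commonly written as a maximum-likelihood problem. The formula below is a general textbook formulation, not a description of Google's exact procedure, and every symbol in it is an assumption introduced here: D is the collection of existing source and target sentence pairs, x is a source sentence, y is its reference translation with words y_1 through y_|y|, and θ denotes the network's parameters.

```latex
\hat{\theta} = \arg\max_{\theta} \sum_{(x, y) \in D} \sum_{t=1}^{|y|} \log p_{\theta}\left(y_t \mid y_{<t}, x\right)
```

In words, training adjusts the parameters so that the network assigns high probability to each word of the reference translation, given the source sentence and the words already produced.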

Google will likely optimize the procedures, as the translation tool is expected to assist large numbers of people simultaneously. According to Zhang and Zong (2015), the algorithm can be applied not only to analyze text mathematically but also to capture significant amounts of contextual information, improving the speed and accuracy of the translation. When viewed as a black box, the algorithm accepts a text passage and two languages as inputs (the application supports a language recognition option, but the feature can be viewed separately) and produces a matching section of text in the second language as the output.
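The black-box view can be sketched as a small interface. All names here (TranslationRequest, Translator, echo_translator, run) and the two-letter language codes are hypothetical and chosen for illustration; the stand-in translator does not translate anything and only shows the shape of the inputs and the output.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TranslationRequest:
    """The three inputs of the black box: a passage and two languages."""
    text: str
    source_lang: str  # hypothetical code such as "en"
    target_lang: str  # hypothetical code such as "es"


# Any function mapping a request to output text fits the black-box view.
Translator = Callable[[TranslationRequest], str]


def echo_translator(request: TranslationRequest) -> str:
    """Stand-in for a real model: tags the input with the language pair."""
    return f"[{request.source_lang}->{request.target_lang}] {request.text}"


def run(translator: Translator, text: str, source_lang: str, target_lang: str) -> str:
    """Validate the inputs, then delegate to the opaque translator."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if source_lang == target_lang:
        return text  # nothing to translate
    return translator(TranslationRequest(text, source_lang, target_lang))


print(run(echo_translator, "Good morning", "en", "es"))  # [en->es] Good morning
```

Keeping the translator behind a plain callable means that a phrase-based system and a neural one could be swapped without changing the calling code, which is what treating the algorithm as a black box implies.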

Innovation

Google is not the first to apply neural networks to machine translation, although the company may have introduced the approach in a commercial product before its competitors. It has, however, resolved specific issues in an innovative way, such as zero-shot translation (Verma, Jain, Basak, and Saksena, 2018). The concept refers to translating between language pairs for which the application has no direct point of reference, so that it has to “guess” at the correct answer based on what it has learned from other pairs. The ability to perform zero-shot translations significantly enhances the algorithm’s ability to translate text between less widely used languages, where the reference base and training opportunities might be lacking.
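One published way to make zero-shot translation possible is to train a single shared model on many language pairs and to mark each input with a token that names the desired output language. The essay's sources do not spell out this mechanism, so the token format, the set of trained pairs, and the function names below are assumptions used only to illustrate the idea.

```python
# Language pairs with direct training data (hypothetical example set).
TRAINED_PAIRS = {("en", "es"), ("es", "en"), ("en", "ko"), ("ko", "en")}


def tag_for_target(text: str, target_lang: str) -> str:
    """Prepend a target-language token so one shared model can serve every pair."""
    return f"<2{target_lang}> {text}"


def is_zero_shot(source_lang: str, target_lang: str) -> bool:
    """True when both languages were seen in training but never as a direct pair."""
    seen = {lang for pair in TRAINED_PAIRS for lang in pair}
    return (
        source_lang != target_lang
        and source_lang in seen
        and target_lang in seen
        and (source_lang, target_lang) not in TRAINED_PAIRS
    )


print(tag_for_target("Hola mundo", "ko"))  # <2ko> Hola mundo
print(is_zero_shot("es", "ko"))            # True
print(is_zero_shot("en", "es"))            # False
```

In this sketch, Spanish to Korean counts as zero-shot because the model has seen both languages, each paired with English, but never together; the request is still expressible because the target token is independent of the source language.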

Furthermore, the unexplored state of the neural machine translation field and the vast resources at Google’s disposal allow the company to modify the system with various innovative approaches. Google employs a large number of researchers (Research, n.d.) who continually investigate potential opportunities for improvement. As such, although the core idea is not innovative, the surrounding details enable a variety of new approaches and ideas.

Ethics

While the algorithm itself does not concern itself with ethics, Google Translate is subject to two kinds of ethical concern. The first is the ethics of the tool and its parent company with regard to the privacy of its users. According to Kamocki and O’Regan (2016), all the information entered into the application is processed on Google servers, and nothing prevents the company from saving the data and using it later. Furthermore, most users are either unaware of this fact or pay little attention to it, which exacerbates the risk.

Good practice regarding the issue would require Google to notify users that their inputs may be collected and analyzed and, ideally, to offer an option to decline such data collection. Schaub, Balebako, Durity, and Cranor (2015) describe a variety of factors that should be considered when designing privacy notices as well as use cases that match the various uses of Google Translate. Currently, the application keeps its privacy statement on a separate page twice removed from the main one, and the button that lets the user access the policy is small and does not attract significant attention.

References

Anggaira, A. S., & Hadi, M. S. (2017). Linguistic errors on narrative text translation using Google Translate. Pedagogy: Journal of English Language Teaching, 5(1), 1-14.

Castelvecchi, D. (2016). Deep learning boosts Google Translate tool. Web.

Kamocki, P., & O’Regan, J. (2016). Privacy issues in online machine translation services – European perspective. Web.

Li, X., Zhang, J., & Zong, C. (2016). Towards zero unknown word in neural machine translation. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (pp. 2852-2858). Palo Alto, CA: AAAI Press.

Luong, M. T., & Manning, C. D. (2015). Stanford neural machine translation systems for spoken language domains. Web.

Research. (n.d.). Web.

Schaub, F., Balebako, R., Durity, A. L., & Cranor, L. F. (2015). A design space for effective privacy notices. In Eleventh Symposium on Usable Privacy and Security (pp. 1-17). Ottawa, Canada: Carleton University.

Sreelekha, S., Bhattacharyya, P., Jha, S. K., & Malathi, D. (2016). A survey report on evolution of machine translation. International Journal of Control Theory and Applications, 9(33), 233-240.

Verma, M. N., Jain, A., Basak, A., & Saksena, K. B. (2018). Survey and analysis on language translator using neural machine translation. International Research Journal of Engineering and Technology, 5(4), 3720-3726.

Zhang, J., & Zong, C. (2015). Deep neural networks in machine translation: An overview. IEEE Intelligent Systems, 30(5), 16-25.