An investigation of Grammar Gender-bias Correction for Google Translate When Translating from English to French
Dr. El Sayed Mahmoud
Date of Defense
Honours Bachelor of Computer Science (Mobile Computing)
automatic language translation, machine learning, natural language processing, decision trees
Faculty of Applied Science & Technology (FAST)
This thesis investigates the use of machine learning and natural language processing (NLP) to help correct Google Translate's gender bias when translating from English to French. It does so by combining several techniques with a set of identified sentence-complexity patterns.
This work investigated how to address Google Translate's gender bias when translating from English to French. The developed solution, called the GT gender-bias corrector, combines natural language processing and machine learning methods. Natural language processing was used to analyze the original sentences and their translations grammatically, identifying their parts of speech. This part-of-speech analysis facilitated the identification of three patterns associated with the gender bias of Google Translate when translating from English to French. The three patterns were labeled simple, intermediate and complex to reflect their structural complexity. Sample texts representing the three patterns were generated and used to build a decision-tree-based classifier that automatically detects the pattern to which a text belongs. The GT gender-bias corrector was evaluated using a survey completed by participants with diverse levels of English and French fluency. The survey analysis showed that the corrector successfully addressed Google Translate's gender bias for the three patterns identified in this work.
© Ahmed Samy Merah
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Merah, Ahmed Samy, "An investigation of Grammar Gender-bias Correction for Google Translate When Translating from English to French" (2020). Student Theses. 1.