On my first day, Tuesday, May 28, I was introduced to the Joshua Decoder, an open-source statistical machine translation decoder for hierarchical and syntax-based machine translation. Juri Ganitkevitch helped me set up a small translation experiment from Spanish to English. The BLEU score we got, 0.006, was much lower than expected, so Juri believed that something was wrong with the run even though the sample was small, containing only 10,000 sentence pairs. We discovered that the decoding process was not using the Spanish-to-English translation data and therefore was not translating any of the words. The following day, we downloaded and configured the Joshua Decoder on my computer and ran the experiment again. This time the results were much better, with a BLEU score of 0.08. Through the translation results and the analysis the Joshua Decoder provides, I was able to get a better understanding of where translation was working well and which kinds of phrases resulted in a less precise translation.
With this small sample size, function words were translated correctly most of the time, whereas most of the content words remained in Spanish. The grammatical structure of the sentence was translated well; this may be because both English and Spanish have subject-verb-object (SVO) sentence structure. For example, the source sentence “la noticia aparecio en la bbc” was translated as “the learned aparecio in the bbc,” which contains the subject first, followed by the verb and finally the object. In order to test whether the decoder can accurately reorder sentences based on the target language's structure, I will need to use a different source language. Another problem with most of the translations is that adjectives still come after the nouns in the English output, as they do in Spanish. For example, the source phrase “medida social” should be translated as “social measure,” but is instead translated as “measure social.” If we were to increase the number of sentence pairs, it is plausible that we would see an improvement in the translations. I will need to evaluate a system with a substantial amount of training data in order to better assess the accuracy of the machine translations.
I have also been doing a lot of reading to get up to speed on machine translation and text-to-text generation. I learned the basics of machine translation by reading Statistical Machine Translation by Philipp Koehn. To gain a better understanding of text-to-text generation, I have read relevant research papers by Dr. Callison-Burch, Juri, and other researchers in the field of paraphrasing.
Dr. Callison-Burch brought up an interesting topic: creating software that can accurately evaluate machine translation systems without the use of human translators. BLEU is a metric designed for this purpose, but it does not account for the importance of individual words when scoring translation accuracy. A key example is the word “not,” which is crucial to a sentence's meaning but is given the same weight as the word “a” by the BLEU algorithm. For the Spanish sentence in the previous paragraph, the reference sentence used by the decoder is “the case was reported by bbc,” whereas Google Translate's output is “the news appeared on the bbc.” These two statements are paraphrases of each other, but BLEU only looks at how many words were translated exactly. Therefore, Google Translate's accurate translation would only record two words (“the,” “bbc”) as correct in the sentence, which would result in a BLEU score close to 0.
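To make this concrete, here is a minimal sketch of a BLEU-style score in Python (a simplified sentence-level version: single reference, modified n-gram precisions up to 4-grams, brevity penalty, no smoothing; the real metric is computed over a whole corpus). Running it on the example above shows why the paraphrase scores zero: only two unigrams overlap and no bigram matches, so the geometric mean collapses.

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of modified
    n-gram precisions times a brevity penalty, single reference."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each n-gram's count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # any zero precision zeroes the geometric mean
    brevity = min(1.0, math.exp(1 - len(ref) / len(cand)))
    return brevity * math.exp(sum(math.log(p) for p in precisions) / max_n)

reference = "the case was reported by bbc"
hypothesis = "the news appeared on the bbc"
print(bleu(hypothesis, reference))  # 0.0 despite being a good paraphrase
print(bleu(reference, reference))   # 1.0 for an exact match
```

Note that every token is treated identically: dropping “not” from a hypothesis costs exactly as much as dropping “a,” which is precisely the weakness discussed above.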
In order for novel machine translation systems to be compared, there needs to be a method that can accurately test and compare translations without the use of human translators. The problem with human evaluation is that it requires a lot of time, and human judges are not consistent in grading different sentences.
Dr. Callison-Burch spoke with me about possible future projects; we are considering topics such as legalese to plain English, prose to poetry, and a few more. I began developing a spreadsheet containing legalese sentences paired with their plain English translations, along with key legalese phrases and words and their plain English counterparts. I hope to build a solid, complete database of approximately 2,000 sentence translations (as of now I have only 33).