Word Alignments

Dr. Callison-Burch has asked me to perform Word Alignment HITs on Mechanical Turk. These HITs are available only to researchers working with him because the task is difficult. Word alignment, as defined by Wikipedia, is the natural language processing task of identifying translation relationships among the words (or more rarely multiword units) in a bitext, resulting […]

Mechanical Turk

As I introduced in a previous post, Specific Text-to-Text Generation Tasks, Mechanical Turk is a website that allows researchers to post HITs, or Human Intelligence Tasks. We can post HITs in which the Turker must rank the paraphrases produced by our machine translator. The Turker will rank the paraphrases from 1 to 5 based on two […]

Getting into the Data

I am now beginning to get into paraphrasing tasks, moving away from the two translation experiments that I ran and analyzed (the very small English-to-Spanish translation experiment and the larger English-to-Spanish translation during the Joshua tutorial). Juri gave me a large amount of paraphrasing input data that he asked me to […]

Joshua Tutorial

On Wednesday, I attended a Joshua Tutorial hosted by Matt Post. As a key developer of the Joshua Decoder, he was very helpful in building my understanding of the decoder and the overall pipeline. He helped me set up my account with the Human Language Technology Center of Excellence so I could have access to […]

Specific Text-to-Text Generation Tasks

For the past couple of days, I have focused on specific text-to-text generation tasks, including legalese-to-plain-English translations and prose-to-poetry translations. Legalese is the type of writing used in legal documents, which can be very difficult for the general public to understand. Signed in 2010, the Plain Writing Act requires that Federal […]