Dr. Callison-Burch has asked me to perform Word Alignment HITs on Mechanical Turk. Because the task is difficult, these HITs are available only to researchers working with him. Word alignment, as defined by Wikipedia, is the natural language processing task of identifying translation relationships among the words (or more rarely multiword units) in a bitext, resulting in a bipartite graph between the two sides of the bitext, with an arc between two words if and only if they are translations of one another. For these HITs, we are given a matrix like the one on the left, in which the computer has created the alignments. Some of those alignments are incorrect, so as Turkers we edit the matrix until it shows the correct alignments, like the matrix on the right. In the example below, the computer alignment was very good, most likely because of the symmetry of the two sentences:
The HITs also come with detailed annotation guidelines, which make them challenging for a typical Turker.
The computer can accurately align words that are the same in both sentences. Problems arise when a word appears in one sentence but not the other, or when segments of the sentences are paraphrases of each other.
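To make the matrix view concrete, here is a minimal Python sketch of how an alignment can be stored as a set of links and rendered as a grid. The sentence pair, the function names `to_matrix` and `unaligned_columns`, and the (vertical, horizontal) index convention are all assumptions made for illustration; this is not the format used by the HIT interface.

```python
# Hypothetical, lowercased sentence pair (not taken from the HITs).
vertical = "the reporter reads the report".split()
horizontal = "now the correspondent read the report".split()

# Each link is a (vertical_index, horizontal_index) pair: one arc of the
# bipartite graph, or one filled cell of the alignment matrix.
links = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)}


def to_matrix(links, n_rows, n_cols):
    """Build a 0/1 matrix with a 1 in every aligned cell."""
    matrix = [[0] * n_cols for _ in range(n_rows)]
    for row, col in links:
        matrix[row][col] = 1
    return matrix


def unaligned_columns(links, n_cols):
    """Return indices of horizontal words that have no link at all."""
    linked = {col for _, col in links}
    return [col for col in range(n_cols) if col not in linked]


matrix = to_matrix(links, len(vertical), len(horizontal))
for word, row in zip(vertical, matrix):
    print(f"{word:>10} " + " ".join(str(cell) for cell in row))

# Words that appear in one sentence but have no partner in the other.
leftover = [horizontal[c] for c in unaligned_columns(links, len(horizontal))]
print("unaligned:", leftover)
```

In this toy pair, "now" has no counterpart in the vertical sentence, so it is the only word reported as unaligned.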
The word alignment example to the right is the computer result. If the words are exactly the same in each sentence, such as “following,” or close enough, like “south american” versus “south america,” the computer can align them accurately. In other cases, however, the computer misses words that should be easy to align, such as “read” and “reads.” In still others, it correctly aligns synonyms, such as “correspondent” and “reporter.”
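The difference between exact matches and near matches such as “read” and “reads” can be illustrated with a small Python sketch. It compares strict string equality with a character-level similarity score from the standard library's `difflib`. The function names and the 0.8 threshold are assumptions chosen for illustration, and this is not how the aligner behind the HITs works.

```python
from difflib import SequenceMatcher


def exact_match(a, b):
    """True only when the two words are identical strings."""
    return a == b


def surface_match(a, b, threshold=0.8):
    """True when two words share most of their characters in order.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return SequenceMatcher(None, a, b).ratio() >= threshold


pairs = [
    ("following", "following"),
    ("read", "reads"),
    ("american", "america"),
    ("correspondent", "reporter"),
]

for a, b in pairs:
    print(f"{a!r} vs {b!r}: exact={exact_match(a, b)}, "
          f"surface={surface_match(a, b)}")
```

Surface similarity recovers inflectional variants like “read” and “reads,” but it cannot link synonyms such as “correspondent” and “reporter,” which share few characters. Aligning those requires knowledge about meaning rather than spelling.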
Example 2 – Human Alignment shows the alignments I produced. The phrase “now listen” should be marked with pink bars, as it does not appear anywhere in the vertical sentence. Also, “south american” describes the type of correspondent meek is, and Michael is a reporter for the “south american section,” so these two phrases should be blocked. It is also important to align “the report” in both sentences, even though it occurs at the beginning of the vertical sentence and at the end of the horizontal sentence.
Limitations of the computer aligner that a human aligner is not bound by:
1. The computer cannot align words or block phrases with gray boxes (implying the words/phrases are similar in meaning, yet stated differently in the two sentences).
2. The computer also tries to align every word in both sentences, even if the word or phrase should have no alignment.
3. The computer is unable to use the pink bars (which indicate that there is no alignment for the word).
4. It seems as if the computer aligner attempts to follow the diagonal as much as possible. In most instances, this comes close to an accurate alignment. In cases where the word order is reversed, such as “give the book to me” compared with “give me the book,” the computer aligner does not handle the reordering well and produces many inaccuracies. For example:
The fourth example to the left is a much better computer alignment. The computer was able to make a connection between “noted” and “said,” and between “inspection” and “inspecting.” Because of the computer alignment problems mentioned above, the final phrase of the sentence could not be boxed, as I have done in Example 4 – Human Alignment. The words “that,” “to,” and “in” appear only in the horizontal sentence and, therefore, should not be aligned to any of the words in the vertical sentence.
Because this sentence follows the diagonal pattern much better than Example 1, the computer alignment more closely resembles the human alignment. Note the gray box aligning “kadam” with “he.” This is a case of repetition, which is handled by using a black box to align “kadam” with “kadam” and then a gray box to align “kadam” with “he” (which we know refers to kadam from the context of the sentence). The computer aligner cannot draw on the context of a sentence in this way, and this is what makes paraphrasing so challenging.
This aligner is optimized for foreign language translations as opposed to English-to-English paraphrasing. Therefore, the aligner has difficulty aligning words that are the same except for their tense or conjugation. This was seen in Example 1, when the word “read” was not aligned with “reads.” I also noticed large discrepancies in the computer alignment when only one of the sentences contained commas. In almost every case, the computer aligner would align a comma with a random word in the other sentence.
The computer aligner’s inability to accurately align paraphrases with boxed alignments is probably its most detrimental aspect when using it for text-to-text generation tasks. Boxing paraphrases would require a better overall grasp of the sentence. As presented in Example 3, the phrase “all walks of life” is a paraphrase of the phrase “all circles of the society.” As humans, we can easily see that these are paraphrases, and we can block the phrases. The computer, by contrast, would need to align “of” with “circles of the society” and “circles” with “walks of life,” neither of which makes sense as an equivalence. Therefore, the computer is unable to accurately align paraphrases with this aligner.
A new aligner designed specifically for text-to-text alignment tasks would likely result in better text-to-text translations. Xuchen Yao appears to have developed an aligner that would work much better with paraphrases. Below is an example of the difference between the aligner I have discussed and Xuchen’s adapted aligner.
The aligner used for the HITs accurately aligns the diagonalized segment of the sentences, yet fails in the second half of the sentence, where there is anti-symmetry. Xuchen’s aligner, by contrast, accurately aligns both the first and second parts of the sentences.
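One way to quantify how far a sentence pair departs from the diagonal is to count pairs of links that cross each other. The Python sketch below does this for the “give the book to me” / “give me the book” pair mentioned earlier. The link sets are written by hand, and the function name `count_crossings` and the (vertical, horizontal) index convention are assumptions for illustration; it does not reproduce either aligner.

```python
from itertools import combinations

horizontal = "give the book to me".split()
vertical = "give me the book".split()

# Hand-written links as (vertical_index, horizontal_index) pairs.
# "to" has no counterpart and is left unaligned.
reordered = {(0, 0), (1, 4), (2, 1), (3, 2)}

# A purely diagonal alignment for comparison.
monotone = {(0, 0), (1, 1), (2, 2)}


def count_crossings(links):
    """Count pairs of links whose order flips between the two sentences."""
    return sum(
        1
        for (r1, c1), (r2, c2) in combinations(sorted(links), 2)
        if r1 < r2 and c1 > c2
    )


print("reordered:", count_crossings(reordered))
print("monotone:", count_crossings(monotone))
```

The reordered pair has two crossings, both caused by “me” moving ahead of “the book,” while the diagonal alignment has none. An aligner that strongly prefers the diagonal would be penalized for exactly these crossing links.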