Jointly Constraining Parsing and Word Alignment on Bitexts
David Burkett, John Blitzer and Daniel Klein
Most modern systems for syntactic machine translation require training data in the form of a bitext with word alignments and syntactic parses of one or both sides. Typically, word alignments and parses are generated in a preprocessing phase using independent word aligners and monolingual parsers. However, word alignments and parses are not, in fact, independent, and so it should be possible to improve both by imposing some system of mutual constraints.
Recently, we developed a model for jointly parsing a bitext using various features of the input derived from a pair of baseline monolingual parsers, the candidate parses themselves, and the posterior probabilities from a standard model of word alignment between the sentence pairs. The key intuition is shown in the example in Figure 1, where a state-of-the-art English parser has chosen an incorrect structure (a) that is incompatible with the (correctly chosen) output of a comparable Chinese parser.
Our model learns the appropriate correspondences between languages by inducing a latent alignment between tree structures. It is trained by alternating between two steps: finding the optimal tree alignment for each pair of candidate parses, and then optimizing feature weights under those optimal alignments. Using this technique, we are able to improve F1 by 1.8 on in-domain Chinese sentences and by 2.5 on out-of-domain English sentences. Furthermore, by using our joint parsing model to preprocess the input to a syntactic MT system, we are able to improve BLEU by 2.4 points over the same system trained with parses from our baseline monolingual parsers.
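The alternating training procedure described above can be illustrated with a small sketch. This is not the implementation behind the reported results: the feature names (`baseline_score`, `aligned_nodes`), the toy data, and the explicit enumeration of candidate tree alignments are all assumptions made for illustration, and a simple perceptron-style update stands in for the weight optimization step, whose actual objective is not specified here.

```python
# Illustrative sketch only: alternating between choosing latent tree
# alignments and updating feature weights. Feature names and data are
# hypothetical, and the perceptron update is a stand-in optimizer.

def dot(weights, feats):
    """Score a feature dict under the current weights."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())

def best_alignment(weights, alignments):
    """Pick the highest-scoring tree alignment for one candidate parse pair.

    `alignments` is a list of feature dicts, one per candidate alignment.
    """
    return max(alignments, key=lambda feats: dot(weights, feats))

def predict(weights, candidates):
    """Return the index of the best parse pair under its best alignment."""
    scores = [dot(weights, best_alignment(weights, a)) for a in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)

def train(examples, epochs=5, lr=1.0):
    weights = {}
    for _ in range(epochs):
        for candidates, gold_index in examples:
            # Step 1: with weights fixed, choose the best latent alignment
            # for every candidate parse pair.
            best = [best_alignment(weights, a) for a in candidates]
            # Step 2: with alignments fixed, update the weights
            # (perceptron-style, used here as a simple stand-in).
            scores = [dot(weights, feats) for feats in best]
            predicted = max(range(len(candidates)), key=scores.__getitem__)
            if predicted != gold_index:
                for name, value in best[gold_index].items():
                    weights[name] = weights.get(name, 0.0) + lr * value
                for name, value in best[predicted].items():
                    weights[name] = weights.get(name, 0.0) - lr * value
    return weights

# Toy data: each example is (candidate parse pairs, index of the gold pair).
# Each candidate parse pair is a list of possible alignments (feature dicts).
example_a = (
    [
        [{"baseline_score": 1.0, "aligned_nodes": 1.0}],   # baseline 1-best pair
        [{"baseline_score": 0.5, "aligned_nodes": 1.0},    # gold pair, poor alignment
         {"baseline_score": 0.5, "aligned_nodes": 3.0}],   # gold pair, good alignment
    ],
    1,
)
example_b = (
    [
        [{"baseline_score": 1.0, "aligned_nodes": 3.0}],   # gold pair
        [{"baseline_score": 0.5, "aligned_nodes": 1.0}],
    ],
    0,
)

examples = [example_a, example_b]
weights = train(examples)
```

On this toy data the learned weights favor parse pairs whose tree nodes correspond well, so `predict(weights, example_a[0])` selects the gold pair even though the baseline parsers scored it lower. In a realistic setting the space of tree alignments would be searched rather than listed explicitly.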
Constraining Parsing and Word Alignment
We are currently investigating methods for training models that incorporate constraints between parses on both sides of a bitext and an alignment between the words of the sentences (and possibly the tree structures in the candidate parses). We look forward to presenting these results soon.
Figure 1: Two possible parse pairs for a Chinese-English sentence pair. The parses in (a) are chosen by independent monolingual statistical parsers, but only the Chinese side is correct. The gold English parse shown in (b) is further down in the 100-best list, despite being more consistent with the gold Chinese parse. The circles show where the two parses differ. Note that in (b), the ADVP and PP nodes correspond nicely to Chinese tree nodes, whereas the correspondence for nodes in (a), particularly the SBAR node, is less clear.
- D. Burkett and D. Klein, "Two Languages are Better than One (for Syntactic Parsing)," Proceedings of EMNLP, 2008.