Semi-Supervised Tri-Training for Explicit Discourse Argument Expansion
This paper describes a novel application of semi-supervision for shallow discourse parsing. We use a neural approach for sequence tagging and focus on the extraction of explicit discourse arguments. First, additional unlabeled data is prepared for semi-supervised learning. From this data, weak annotations are generated in a first setting and later used in another setting to study performance differences. In our studies, we show an increase in the performance of our models that ranges between 2-10{\%} F1 score. Further, we give some insights to the generated discourse annotations and compare the developed additional relations with the training relations. We release this new dataset of explicit discourse arguments to enable the training of large statistical models.
PDF Abstract