Affiliations: Department of Enterprise Engineering, University of Roma, Tor Vergata, Rome, Italy
Note: Corresponding author: Danilo Croce, Department of Enterprise Engineering, University of Roma, Tor Vergata, Via del Politecnico 1, 00133 Roma, Rome, Italy. E-mail: croce@info.uniroma2.it
Abstract: The use of complex grammatical features in statistical language learning assumes the availability of large-scale training data and good-quality parsers, an assumption that is especially problematic for languages other than English. In this paper, we show how good-quality FrameNet Semantic Role Labeling systems can be obtained without relying on full syntactic parsing, by backing off to surface grammatical representations and structured learning. In line with this approach, the ioB Annotation Based Engine for srL (BABEL) has been implemented as a flexible system for Semantic Role Labeling based on a Structured Support Vector Machine learning framework. Although the underlying learning paradigm allows BABEL to be used when no syntactic parser is available, its accuracy is in line with state-of-the-art systems for English. BABEL is also among the best-performing Semantic Role Labeling systems for Italian, as recently evaluated in the role labeling task of the Frame Labeling over Italian Texts at the Evalita 2011 competition. Moreover, the same learning framework is applied to effectively acquire surface grammatical information, achieving state-of-the-art results on the Part-of-speech tagging task of the Evalita 2009 competition. Finally, BABEL can POS-tag more than 1,500 words per second, while the SRL module can process about 35 sentences per second, which makes it straightforward to use in Web-scale applications.
Keywords: Structured support vector machine, Semantic role labeling, Part-of-speech tagging