Affiliations: CIMeC, University of Trento, Rovereto (TN), Italy | DISI, University of Trento, Povo (TN), Italy
Note: Corresponding author. Truc-Vien T. Nguyen, CIMeC, University of Trento, Corso Bettini 31, 38068 Rovereto (TN), Italy. E-mail: trucvien.nguyen@gmail.com
Abstract: We present a method for incorporating global features into named entity recognizers using reranking techniques and the combination of two state-of-the-art NER learning algorithms: conditional random fields (CRFs) and support vector machines (SVMs). The reranker employs two kinds of features: flat and structured. The former are encoded by a polynomial kernel over entity features, whereas the latter are modeled by tree kernels that capture dependencies among tagged candidate examples. Experiments on two standard corpora in two languages, the Italian EVALITA 2009 and the English CoNLL 2003 datasets, show a large improvement over the CRF baseline in F-measure: from 80.34% to 84.33% and from 84.86% to 87.99%, respectively. Our analysis reveals that (i) both kernels provide a comparable improvement over the CRF baseline; and (ii) their combination improves on CRFs by much more than the sum of the individual contributions, suggesting an interesting synergy.
Keywords: Named entity recognition, reranking, kernel methods, conditional random fields