Article type: Research Article
Authors: Castillo, Esteban [a],* | Cervantes, Ofelia [a] | Vilariño, Darnes [b]
Affiliations: [a] Universidad de las Américas Puebla, Department of Computer Science, Mexico | [b] Benemérita Universidad Autónoma de Puebla, Faculty of Computer Science, Mexico
Correspondence: [*] Corresponding author. Esteban Castillo, Universidad de las Américas Puebla, Department of Computer Science, Mexico. E-mail: esteban.castillojz@udlap.mx.
Abstract: This paper presents an approach to authorship verification, a forensic text problem that consists of determining whether an unknown document was written by a particular author, given some samples of that author’s writing. The core of the approach is a graph representation from which relevant linguistic features are extracted using network analysis techniques. Graphs provide rich data structures for representing lexical and syntactic aspects of texts, allowing centrality measures to be reinterpreted to extract linguistic features that do not depend entirely on the stylistic elements of text documents. The proposed method is applied to the English-language partitions of the CLEF PAN 2014 and 2015 author verification datasets, producing competitive results that outperform the state-of-the-art baselines and are close to (or, in one case, surpass) the best results reported so far on the same training and test corpora. These experimental results show that our interpretation of four centrality measures (closeness, betweenness, degree and eigenvector) makes it possible to detect relevant patterns of an author’s writing style. In particular, words with high closeness that are part of chunk phrases, and words with high betweenness that are included in bigrams and trigrams, contribute more effectively to verifying document authorship.
Keywords: Authorship verification, supervised learning, syntactic flow graph, social network analysis, centrality measures
DOI: 10.3233/JIFS-181934
Journal: Journal of Intelligent & Fuzzy Systems, vol. 36, no. 6, pp. 6075-6087, 2019
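
To make the idea of scoring words by graph centrality concrete, here is a minimal Python sketch. It is not the authors' syntactic flow graph or feature pipeline: it assumes a much simpler undirected graph that links each word to the word that follows it, and it covers only two of the four measures named in the abstract (degree and closeness). The function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict, deque


def build_cooccurrence_graph(tokens):
    """Link each word to the word that follows it (undirected, no self-loops).

    Assumption: adjacency in the token sequence stands in for the richer
    lexical and syntactic relations described in the abstract.
    """
    graph = defaultdict(set)
    for left, right in zip(tokens, tokens[1:]):
        if left != right:
            graph[left].add(right)
            graph[right].add(left)
    return graph


def degree_centrality(graph):
    """Fraction of the other words each word is directly linked to."""
    n = len(graph)
    if n <= 1:
        return {word: 0.0 for word in graph}
    return {word: len(neighbours) / (n - 1) for word, neighbours in graph.items()}


def closeness_centrality(graph):
    """Reciprocal of the average shortest-path distance to reachable words."""
    scores = {}
    for source in graph:
        # Breadth-first search gives shortest distances in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


if __name__ == "__main__":
    tokens = "the cat sat on the mat".split()
    graph = build_cooccurrence_graph(tokens)
    closeness = closeness_centrality(graph)
    for word, score in sorted(degree_centrality(graph).items()):
        print(f"{word}: degree={score:.2f} closeness={closeness[word]:.2f}")
```

In a verification setting, per-word scores like these could serve as features for a supervised classifier; how the paper selects and combines them is not stated in the abstract, so that step is left out here.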