Article type: Research Article
Authors: Warintarawej, P. | Huchard, M. | Lafourcade, M. | Laurent, A.* | Pompidor, P.
Affiliations: LIRMM, CNRS-Université de Montpellier, Montpellier, France
Correspondence: [*] Corresponding author: A. Laurent, LIRMM, UMR 5506 CNRS-Université de Montpellier 2, 161 rue Ada, 34095 Montpellier Cedex 05, France. Tel.: +33 0 467 418 585; Fax: +33 0 467 418 500; E-mail: laurent@lirmm.fr
Abstract: Identifier names (e.g., packages, classes, methods, variables) are among the most important sources for software comprehension. They need to be analyzed in order to support collaborative software engineering and source code reuse, because they convey the domain concepts of the software. For instance, "getMinimumSupport" would be associated with the association rule concept in data mining software, while other identifiers are harder to recognize, such as those that mix parts of words (e.g., "initFeatSet"). We therefore propose methods that assist automatic software understanding by classifying identifier names into domain concept categories. We propose an innovative solution based on data mining algorithms that aims to learn the character patterns of identifier names. The main challenges are (1) to automatically split identifier names into relevant constituent subnames and (2) to build a model associating such a set of subnames with predefined domain concepts. For this purpose, we propose a novel way of splitting identifiers into their constituent words and use N-gram-based text classification to predict the related domain concept. In this article, we report the theoretical method and the algorithms we propose, together with experiments run on real software source code that show the value of our approach.
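To make the two steps named in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes a plain camelCase/underscore splitter and a simple character trigram overlap score in place of the splitting method and classifier described in the article. The function names (`split_identifier`, `char_ngrams`, `profile`, `classify`), the concept labels, and the training identifiers are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

def split_identifier(name):
    """Split on camelCase boundaries, underscores and digits (a naive assumption)."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]

def char_ngrams(word, n=3):
    """Return the character n-grams of a word padded with '_' on both sides."""
    padded = f"_{word}_"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def profile(identifiers, n=3):
    """Count the n-grams of every subname in a list of identifiers."""
    counts = Counter()
    for ident in identifiers:
        for sub in split_identifier(ident):
            counts.update(char_ngrams(sub, n))
    return counts

def classify(identifier, profiles, n=3):
    """Pick the concept whose n-gram profile overlaps most with the identifier."""
    grams = Counter()
    for sub in split_identifier(identifier):
        grams.update(char_ngrams(sub, n))

    def score(concept):
        prof = profiles[concept]
        total = sum(prof.values()) or 1
        # Weight each shared n-gram by its relative frequency in the profile.
        return sum(cnt * prof[g] / total for g, cnt in grams.items())

    return max(profiles, key=score)

# Hypothetical training identifiers for two illustrative domain concepts.
profiles = {
    "association_rules": profile(["getMinimumSupport", "minConfidence", "frequentItemset"]),
    "clustering": profile(["numClusters", "centroidDistance", "assignCluster"]),
}

print(split_identifier("initFeatSet"))
print(classify("initMinSupp", profiles))
```

Character n-grams are used here because abbreviated subnames such as "Supp" still share trigrams with their full forms ("Support"), which is the kind of character pattern the abstract refers to. A real system would need far more training data and a proper classifier.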
Keywords: Automatic software understanding, data mining, text classification, software engineering
DOI: 10.3233/IDA-150744
Journal: Intelligent Data Analysis, vol. 19, no. 4, pp. 761-778, 2015