Article type: Research Article
Authors: Karunaratne, Thashmee [a], [*] | Boström, Henrik [a] | Norinder, Ulf [b], [c], [d]
Affiliations: [a] Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden | [b] AstraZeneca Research and Development, Södertälje, Sweden | [c] Department of Pharmacy, Uppsala University, Uppsala, Sweden | [d] Department of Computational Chemistry, H. Lundbeck A/S, Valby, Denmark
Correspondence: [*] Corresponding author: Thashmee Karunaratne, Department of Computer and Systems Sciences, Stockholm University, Forum 100, SE-164 40 Kista, Sweden. E-mail: si-thk@dsv.su.se.
Abstract: Quantitative structure-activity relationship (QSAR) models have gained popularity in the pharmaceutical industry due to their potential to substantially decrease drug development costs by reducing expensive laboratory and clinical tests. QSAR modeling consists of two fundamental steps, namely, descriptor discovery and model building. Descriptor discovery methods are either based on chemical domain knowledge or purely data-driven. The former (chemoinformatics-based) and the latter (substructure-based) methods for QSAR modeling have been developed quite independently. As a consequence, evaluations involving both types of descriptor discovery method are rarely seen. In this study, a comparative analysis of chemoinformatics-based and substructure-based approaches is presented. Two chemoinformatics-based approaches (ECFI and SELMA) are compared to five approaches for substructure discovery (CP, graphSig, MFI, MoFa and SUBDUE) using 18 QSAR datasets. The empirical investigation shows that one of the chemoinformatics-based approaches, ECFI, results in significantly more accurate models than all other methods when each is used on its own. Results from combining descriptor sets are also presented, showing that adding ECFI descriptors to any other descriptor set improves predictive performance for that set, while models using ECFI descriptors can in many cases also be improved by adding descriptors generated by the other methods.
Keywords: QSAR modeling, chemical descriptors, graph mining
DOI: 10.3233/IDA-130581
Journal: Intelligent Data Analysis, vol. 17, no. 2, pp. 327-341, 2013