Feature gene selection method based on logistic and correlation information entropy
Gene expression profile data are high-dimensional, small-sample, nonlinear and numeric. To address these characteristics, logistic regression and correlation information entropy are introduced into feature gene selection. First, the gene variables are screened preliminarily by logistic regression to retain the genes that have a greater impact on classification. Next, a candidate feature set is generated by removing irrelevant features with the Relief algorithm. Redundant features are then removed from this candidate set using correlation information entropy. Finally, the resulting feature gene subset is classified with a support vector machine (SVM). Experimental results show that the proposed method obtains a smaller gene subset and achieves a higher recognition rate.
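As a rough illustration of the redundancy-removal step, the sketch below assumes one common definition of correlation information entropy: the eigenvalues λ_i of the N×N correlation matrix of the selected features are normalised to λ_i/N and their entropy is taken with logarithm base N, so the value is 1 for mutually uncorrelated features and 0 for perfectly correlated ones. The abstract gives neither the formula, the acceptance rule nor a threshold, so `drop_redundant`, the relevance ordering `ranked` (for instance, by Relief weight) and `threshold=0.9` are hypothetical choices for illustration, not the authors' procedure.

```python
import numpy as np


def correlation_information_entropy(X):
    """Entropy of the eigenvalue spectrum of the feature correlation matrix.

    X has shape (n_samples, n_features) with no constant columns.
    Returns a value in [0, 1]: 1 for uncorrelated features,
    0 for perfectly correlated ones.
    """
    n_features = X.shape[1]
    if n_features < 2:
        return 1.0  # a single feature cannot be redundant with itself
    R = np.corrcoef(X, rowvar=False)          # N x N correlation matrix
    eigvals = np.linalg.eigvalsh(R)           # R is symmetric
    p = np.clip(eigvals, 0.0, None) / n_features
    p = p[p > 1e-12]                          # treat 0 * log(0) as 0
    return float(-np.sum(p * np.log(p)) / np.log(n_features))


def drop_redundant(X, ranked, threshold=0.9):
    """Greedy filter (assumed rule): visit features from most to least
    relevant and keep one only if the entropy of the kept set stays
    at or above `threshold` after adding it."""
    selected = [ranked[0]]
    for j in ranked[1:]:
        trial = selected + [j]
        if correlation_information_entropy(X[:, trial]) >= threshold:
            selected = trial
    return selected


# Toy data: columns 0 and 1 are uncorrelated; column 2 nearly duplicates column 0.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
X = np.column_stack([np.sin(t), np.cos(t), np.sin(t) + 0.01 * np.cos(3 * t)])
kept = drop_redundant(X, ranked=[0, 2, 1])
print(kept)  # [0, 1]
```

In this toy run the near-duplicate column 2 is rejected because adding it drives the entropy of the pair close to 0, while the uncorrelated column 1 is kept. In the full method the surviving columns would then be passed to the SVM classifier.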