Feature Selection in Machine Learning: What is the statistical method to identif


Offline rashidacse

If you are using some sort of decision tree, a quick and dirty way is to throw the features into the model: you can identify the importance of each feature by how close its splits are to the root, since more informative features tend to be chosen for splits earlier.
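As a minimal sketch of this idea, assuming scikit-learn is available, a tree ensemble exposes built-in importances (mean impurity decrease across splits); the synthetic data and feature names below are illustrative only:

```python
# Sketch: ranking features with a tree ensemble's built-in importances.
# The data is synthetic: the label depends strongly on feature f0,
# weakly on f1, and not at all on f2.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (2.0 * X[:, 0] + 0.5 * X[:, 1]
     + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
for name, imp in zip(["f0", "f1", "f2"], model.feature_importances_):
    print(f"{name}: importance = {imp:.3f}")
```

On data like this, f0 should come out well ahead of f2, mirroring the "distance to the root" intuition.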

For one-dimensional features:
You can also use the area under the receiver operating characteristic (ROC) curve (AUC) to gauge the importance of a feature. In short, the curve plots, for each threshold, the false positive rate (FPR) against the true positive rate (TPR). You can imagine that a perfect feature would have TPR = 1.0 and FPR = 0.0, so its AUC is 1. A useless feature that has a constant value for all data points yields a ROC curve that is a straight line from (0,0) to (1,1), so its AUC is 0.5. So the higher the AUC, the more relevant the feature. In statistics this quantity is sometimes called the concordance index.

Another measure of predictive power is correlation, that is, the correlation between the feature value and the label. If you have already settled on a model, the correlation can instead be taken between the model's latent continuous variable (for example, in a logistic regression model, the logistic function applied to the feature's linear score) and the label. The square of this correlation is actually proportional to R-squared [1].
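The feature-vs-label correlation can be sketched with NumPy alone; with a binary label this is the point-biserial correlation, and the data and names below are illustrative assumptions:

```python
# Sketch: correlation between each feature and a binary label.
# Synthetic data: one feature shifted by the label, one unrelated.
import numpy as np

rng = np.random.default_rng(2)
y = rng.integers(0, 2, size=1000)
relevant = y + rng.normal(scale=1.0, size=1000)   # tracks the label, with noise
irrelevant = rng.normal(size=1000)                # independent of the label

for name, feat in [("relevant", relevant), ("irrelevant", irrelevant)]:
    r = np.corrcoef(feat, y)[0, 1]                # Pearson correlation
    print(f"{name}: r = {r:+.3f}, r^2 = {r * r:.3f}")
```

The squared correlation r^2 is the quantity the post relates to R-squared: it is the fraction of label variance a straight-line fit on that single feature would explain.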