Machine Learning 9: Feature Selection


FEATURE SELECTION
Feature selection is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. Feature selection techniques are used for four reasons:
  • simplification of models to make them easier to interpret by researchers/users
  • shorter training times
  • to avoid the curse of dimensionality
  • enhanced generalization by reducing overfitting


Add a new feature (feature extraction):
Feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps, and in some cases leading to better human interpretations.
  • Use human intuition
  • Code the new feature
  • Visualize
  • Repeat
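
As a small illustration of that loop, here is a sketch that builds a hypothetical derived feature (the bonus-to-salary ratio) with pandas and visualizes it against the label; the column names and toy data are assumptions made for the example, not part of the original notes:

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical raw data, just to make the example self-contained.
df = pd.DataFrame({
    "salary": [60000, 120000, 45000, 200000],
    "bonus":  [3000, 40000, 1000, 150000],
    "poi":    [0, 1, 0, 1],   # label we hope the new feature will separate
})

# Human intuition: the bonus-to-salary ratio might be informative.
df["bonus_salary_ratio"] = df["bonus"] / df["salary"]

# Visualize the new feature against the label, then decide whether to keep it.
plt.scatter(df["bonus_salary_ratio"], df["poi"])
plt.xlabel("bonus / salary")
plt.ylabel("poi label")
plt.show()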


Getting rid of a feature (feature selection):
There is an optimal number of features that balances bias and variance, and the process of finding this balance point is called regularization.
There are two big univariate feature selection tools in sklearn: SelectPercentile and SelectKBest. The difference is apparent from the names: SelectPercentile keeps the X% of features that are most powerful (where X is a parameter) and SelectKBest keeps the K features that are most powerful (where K is a parameter).
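
A minimal sketch of both tools, assuming the iris dataset and the ANOVA F-test score just for illustration:

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, SelectPercentile, f_classif

features, labels = load_iris(return_X_y=True)

# Keep the 2 strongest features according to the ANOVA F-score.
k_best = SelectKBest(score_func=f_classif, k=2)
features_k = k_best.fit_transform(features, labels)

# Keep the top 50% of features by the same score.
percentile = SelectPercentile(score_func=f_classif, percentile=50)
features_p = percentile.fit_transform(features, labels)

print(features.shape, features_k.shape, features_p.shape)  # (150, 4) (150, 2) (150, 2)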


Lasso Regression:
One of these methods is Lasso regression, which adds to the usual least-squares objective a penalty on the coefficients:


minimize SSE + λ * Σ|βᵢ|

The formula says we look for the best trade-off between a small sum of squared errors (SSE) and small coefficients: the stronger the penalty λ, the more coefficients are pushed to exactly zero, which effectively limits the number of features.
Lasso also ranks the features: each feature gets its own coefficient βᵢ, so ordering the features by the absolute value of their coefficients gives a list of the most important ones, while features whose coefficients end up at zero are discarded.


Sklearn:
from sklearn.linear_model import Lasso

# Toy data just to make the snippet runnable: one sample per row, two features.
features = [[1, 2], [2, 4], [3, 9], [4, 16]]
labels = [3, 6, 12, 20]

regression = Lasso()
regression.fit(features, labels)
print(regression.predict([[2, 4]]))  # predict() expects a 2D array of samples
print(regression.coef_)              # one coefficient per feature
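
Building on the ordering idea above, here is a minimal sketch of ranking features by the absolute value of their Lasso coefficients; the feature names are hypothetical, just for illustration:

# Rank the features used above by |coefficient|; zeros mean the feature was dropped.
feature_names = ["feature_a", "feature_b"]  # hypothetical names for the two toy columns
ranking = sorted(zip(feature_names, regression.coef_),
                 key=lambda pair: abs(pair[1]), reverse=True)
for name, coef in ranking:
    print(name, coef)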
