DECISION TREE
A decision tree is a flowchart-like structure in which each internal node represents a "test" on an attribute, each branch represents an outcome of the test, and each leaf node represents a class label. The paths from root to leaf represent classification rules.
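As a minimal hand-written sketch (not produced by any library), a tiny tree over two hypothetical attributes can be expressed as nested tests in Python; each if is an internal node's test and each return is a leaf's class label, so every path down to a return is one classification rule:
def classify(grade, speed_limit):
    # root node: test on the "grade" attribute
    if grade == "flat":
        return "fast"              # leaf node → class label
    # internal node on the "steep" branch: test on the "speed_limit" attribute
    if speed_limit == "yes":
        return "slow"              # leaf node → class label
    return "fast"                  # leaf node → class label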
Parameters:
- min_samples_split : the minimum number of samples a node must contain before it can be split further.
- criterion : the algorithm for choosing, at each step, the variable that best splits the set. Different algorithms use different metrics for measuring "best".
common ones : gini, information gain / entropy (explained below; a usage sketch follows this list)
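A minimal usage sketch of these parameters with scikit-learn's DecisionTreeClassifier (the values 10 and "entropy" are arbitrary choices for illustration; "entropy" is the criterion that corresponds to information gain):
from sklearn.tree import DecisionTreeClassifier

# require at least 10 samples in a node before it may be split further;
# choose splits by information gain ("entropy") instead of the default "gini"
clf = DecisionTreeClassifier(min_samples_split=10, criterion="entropy")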
Information gain:
It is a value indicating how useful a feature is for splitting the data. The algorithm tries to maximize information gain.
INFORMATION GAIN = ENTROPY (parent) - [weighted average] ENTROPY (children)
Entropy is a measure of impurity in the data:
ENTROPY = - Σ_i P_i log2(P_i)
where P_i is the fraction of examples in class i and the sum runs over all classes i.
0 ≤ entropy ≤ 1 (with two classes)
If all data is from the same class → entropy = 0
If the data is evenly split between the two classes → entropy = 1
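A small Python sketch of the entropy formula (the function name and the toy label lists are made up for illustration):
from collections import Counter
from math import log2

def entropy(labels):
    # sum over the classes of -P_i * log2(P_i)
    total = len(labels)
    return sum(-(count / total) * log2(count / total)
               for count in Counter(labels).values())

print(entropy(["slow", "slow", "slow", "slow"]))   # 0.0 → all data in one class
print(entropy(["slow", "slow", "fast", "fast"]))   # 1.0 → evenly split between two classes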
EXAMPLE:
GRADE (feature) | SPEED LIMIT (feature) | SPEED (label)
steep           | yes                   | slow
steep           | yes                   | slow
flat            | no                    | fast
steep           | no                    | fast
In this case the information gain provided by splitting on the feature "SPEED LIMIT" is higher than on "GRADE": splitting on SPEED LIMIT separates slow from fast perfectly (children entropy = 0, so the gain is 1), while the "steep" branch of GRADE still mixes slow and fast examples.
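Continuing the sketch above (this reuses the entropy() function defined earlier; the split groupings are read directly off the four rows of the table):
def information_gain(parent_labels, children_labels):
    # ENTROPY(parent) - weighted average of ENTROPY(children)
    total = len(parent_labels)
    weighted = sum(len(child) / total * entropy(child) for child in children_labels)
    return entropy(parent_labels) - weighted

speed = ["slow", "slow", "fast", "fast"]                                # the label column

# split on GRADE:        steep → rows 1, 2, 4   flat → row 3
print(information_gain(speed, [["slow", "slow", "fast"], ["fast"]]))    # ≈ 0.31

# split on SPEED LIMIT:  yes → rows 1, 2        no → rows 3, 4
print(information_gain(speed, [["slow", "slow"], ["fast", "fast"]]))    # 1.0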
Sklearn:
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier()                     # default criterion is "gini"
clf.fit(feature_training, label_training)          # learn the tree from the training set
prediction = clf.predict(feature_test)             # predicted labels for the test features
accuracy = clf.score(feature_test, label_test)     # mean accuracy on the test set
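To inspect the classification rules the fitted tree has learned (the root-to-leaf paths mentioned at the top), scikit-learn's export_text can print it; the feature names below are taken from the example and are only an assumption about how the training columns were ordered:
from sklearn.tree import export_text

# prints the learned tree as nested if/else rules, one line per node
print(export_text(clf, feature_names=["grade", "speed_limit"]))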