SVM (Support Vector Machine)

The support vector machine is a common discriminative method. In machine learning, it is a supervised learning model, usually used for pattern recognition, classification, and regression analysis.


In Matlab, the libsvm toolkit written by Lin Zhiren (Chih-Jen Lin) works well for SVM training. In Python, we have the sklearn toolkit for training machine learning algorithms; the Scikit-Learn library has implemented all the basic machine learning algorithms.

The following is adapted from the blog https://www.cnblogs.com/luyaoblog/p/6775342.html, with the original Python 2 code updated to Python 3.
The following uses the Iris flower dataset as an example:

The original Iris dataset downloaded from the UCI database has the following format: the first four columns are feature columns, and the fifth column is the category column. There are three categories: Iris-setosa, Iris-versicolor, and Iris-virginica.

It needs to be split with numpy.

Dataset download address: http://archive.ics.uci.edu/ml/machine-learning-databases/iris/

Just download iris.data.
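As a minimal sketch of this loading-and-splitting step (assuming iris.data has been downloaded to the working directory; the byte-string keys are used because the numpy versions this article targets pass bytes to loadtxt converters):

import numpy as np

def iris_type(s):
    # map the class-name byte string in column 4 to an integer label
    it = {b'Iris-setosa': 0, b'Iris-versicolor': 1, b'Iris-virginica': 2}
    return it[s]

data = np.loadtxt('iris.data', dtype=float, delimiter=',', converters={4: iris_type})
x, y = np.split(data, (4,), axis=1)  # columns 0-3 are features, column 4 is the label
print(x.shape, y.shape)  # (150, 4) (150, 1)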

Python3 Code:

from sklearn import svm
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib import colors
from sklearn.model_selection import train_test_split

def iris_type(s):
    # map the class-name byte string in column 4 to an integer label
    it = {b'Iris-setosa': 0, b'Iris-versicolor': 1, b'Iris-virginica': 2}
    return it[s]

path = 'C:\\Users\\dell\\desktop\\iris.data'  # data file path
data = np.loadtxt(path, dtype=float, delimiter=',', converters={4: iris_type})
x, y = np.split(data, (4,), axis=1)
x = x[:, :2]
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1, train_size=0.6)

# clf = svm.SVC(C=0.1, kernel='linear', decision_function_shape='ovr')
clf = svm.SVC(C=0.8, kernel='rbf', gamma=20, decision_function_shape='ovr')
clf.fit(x_train, y_train.ravel())
print(clf.score(x_train, y_train))  # training accuracy
y_hat = clf.predict(x_train)
print(clf.score(x_test, y_test))  # test accuracy
y_hat2 = clf.predict(x_test)

x1_min, x1_max = x[:, 0].min(), x[:, 0].max()  # range of column 0
x2_min, x2_max = x[:, 1].min(), x[:, 1].max()  # range of column 1
x1, x2 = np.mgrid[x1_min:x1_max:200j, x2_min:x2_max:200j]  # generate a 200x200 grid of sample points
grid_test = np.stack((x1.flat, x2.flat), axis=1)  # grid points as test inputs

mpl.rcParams['font.sans-serif'] = [u'SimHei']
mpl.rcParams['axes.unicode_minus'] = False
cm_light = mpl.colors.ListedColormap(['#A0FFA0', '#FFA0A0', '#A0A0FF'])
cm_dark = mpl.colors.ListedColormap(['g', 'r', 'b'])
grid_hat = clf.predict(grid_test)  # predicted class for each grid point
grid_hat = grid_hat.reshape(x1.shape)  # reshape to the same shape as the grid
alpha = 0.5

plt.pcolormesh(x1, x2, grid_hat, cmap=cm_light)  # show the predicted regions
plt.plot(x[:, 0], x[:, 1], 'o', alpha=alpha, color='blue', markeredgecolor='k')
plt.scatter(x_test[:, 0], x_test[:, 1], s=120, facecolors='none', zorder=10)  # circle the test set samples
plt.xlabel(u'Sepal length', fontsize=13)
plt.ylabel(u'Sepal width', fontsize=13)
plt.xlim(x1_min, x1_max)
plt.ylim(x2_min, x2_max)
plt.title(u'SVM classification', fontsize=15)
plt.show()
* split(data, split position, axis=1 (horizontal split, by columns) or 0 (vertical split, by rows)). Here np.split(data, (4,), axis=1) separates the first four feature columns from the fifth label column, as in the loading sketch above.

* x = x[:, :2]: only the first two columns of the feature vectors are used for training, so that the later plots are more intuitive (a two-dimensional decision surface can be drawn).

* sklearn.model_selection.train_test_split randomly divides the data into a training set and a test set: train_test_split(train_data, train_target, test_size=number, random_state=0)

Parameter interpretation:

train_data: the sample feature set to be split

train_target: the sample labels to be split

test_size: the sample proportion; if it is an integer, it is the number of samples

random_state: the random number seed.


Random number seed: it effectively labels this particular group of random numbers. When an experiment needs to be repeated, it guarantees that the same set of random numbers is obtained. For example, if you fill in 1 every time, you will get the same random array as long as the other parameters are the same; but if you fill in 0 or leave it empty, the result differs on each run. The generation of random numbers depends on the seed, and the relationship between them follows two rules: different seeds generate different random numbers; the same seed generates the same random numbers, even across different instances.
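As a minimal illustration of how random_state controls reproducibility (a toy array, not the iris data):

import numpy as np
from sklearn.model_selection import train_test_split

x = np.arange(20).reshape(10, 2)  # toy feature matrix: 10 samples, 2 features
y = np.arange(10)                 # toy labels

# same seed: the split is identical on every run
a_train, a_test, _, _ = train_test_split(x, y, train_size=0.6, random_state=1)
b_train, b_test, _, _ = train_test_split(x, y, train_size=0.6, random_state=1)
print(np.array_equal(a_train, b_train))  # True

# different seed: generally a different split
c_train, c_test, _, _ = train_test_split(x, y, train_size=0.6, random_state=2)
print(np.array_equal(a_train, c_train))  # usually False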

When kernel='linear', a linear kernel is used; the larger C is, the better the classification on the training data, but overfitting becomes possible (default C=1).

When kernel='rbf' (the default), a Gaussian kernel is used; the smaller gamma is, the more continuous the classification boundary; the larger gamma is, the more "scattered" the classification boundary and the better the classification on the training data, but overfitting becomes possible. The sketch below compares the two kernels.
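To make the C/gamma trade-off concrete, here is a minimal sketch reusing x_train, x_test, y_train, y_test from the script above; the parameter values are illustrative, and the exact scores depend on the data and the split:

from sklearn import svm

for name, clf in [
        ('linear, C=0.1', svm.SVC(C=0.1, kernel='linear', decision_function_shape='ovr')),
        ('rbf, gamma=1', svm.SVC(C=0.8, kernel='rbf', gamma=1, decision_function_shape='ovr')),
        ('rbf, gamma=20', svm.SVC(C=0.8, kernel='rbf', gamma=20, decision_function_shape='ovr')),
]:
    clf.fit(x_train, y_train.ravel())
    # a large gap between train and test accuracy suggests overfitting
    print(name, clf.score(x_train, y_train), clf.score(x_test, y_test))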

Linear kernel classification results:

RBF kernel classification results:

More Python and machine learning content at omegaxyz.com <http://www.omegaxyz.com/>