Support Vector Machine Algorithm

Last updated on Dec 15 2021
Paresha Dudhedia


Support Vector Machine (SVM) is one of the most popular supervised learning algorithms. It can be used for both classification and regression problems, but in machine learning it is primarily used for classification.
The goal of the SVM algorithm is to find the best line or decision boundary that segregates n-dimensional space into classes, so that new data points can be placed in the correct category in the future. This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help define the hyperplane. These extreme cases are called support vectors, and the algorithm is therefore termed a Support Vector Machine. Consider the diagram below, in which two different categories are separated by a decision boundary (hyperplane):

[Image: two classes separated by a decision boundary (hyperplane)]

Example: SVM can be understood with the example we used for the KNN classifier. Suppose we see a strange cat that also has some features of a dog. If we want a model that can accurately identify whether the animal is a cat or a dog, such a model can be created using the SVM algorithm. We first train the model with many images of cats and dogs so that it learns their different features, and then test it on this strange creature. The support vector machine draws a decision boundary between the two classes (cat and dog) using the extreme cases (support vectors), and on the basis of those support vectors it classifies the animal as a cat. Consider the diagram below:

[Image: SVM classifying the cat-vs-dog example using support vectors]

The SVM algorithm can be used for face detection, image classification, text categorization, and more.

Types of SVM

SVM can be of two types:
• Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be classified into two classes with a single straight line, the data is termed linearly separable, and the classifier used is called a Linear SVM classifier.
• Non-linear SVM: Non-linear SVM is used for non-linearly separable data. If a dataset cannot be classified with a straight line, the data is termed non-linear, and the classifier used is called a Non-linear SVM classifier (see the sketch after this list).
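In scikit-learn, the library used in the implementation later in this post, switching between the two comes down to the kernel argument of the SVC class. A minimal sketch:

# Sketch: the kernel argument selects a linear or a non-linear SVM.
from sklearn.svm import SVC

linear_svm = SVC(kernel='linear')    # straight-line decision boundary
nonlinear_svm = SVC(kernel='rbf')    # curved boundary via the RBF kernel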

Hyperplane and Support Vectors in the SVM algorithm:

Hyperplane: There can be multiple lines/decision boundaries that segregate the classes in n-dimensional space, but we need to find the best decision boundary for classifying the data points. This best boundary is known as the hyperplane of the SVM.
The dimension of the hyperplane depends on the number of features in the dataset: if there are 2 features (as shown in the images), the hyperplane is a straight line, and if there are 3 features, the hyperplane is a 2-dimensional plane.
We always create the hyperplane with the maximum margin, which means the maximum distance between the hyperplane and the nearest data points of each class.
Support Vectors:
The data points or vectors that are closest to the hyperplane and that affect its position are termed support vectors. Since these vectors support the hyperplane, they are called support vectors.
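To make this concrete, here is a minimal sketch with made-up toy points (not the dataset used later in this post) showing how scikit-learn exposes the support vectors of a fitted linear SVC, and how the margin width 2/||w|| can be read off its weight vector:

# Sketch: fit a linear SVC on toy points, then inspect the support
# vectors and compute the margin width 2/||w||.
import numpy as nm
from sklearn.svm import SVC

x = nm.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.0]])
y = nm.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear').fit(x, y)
print(clf.support_vectors_)      # the closest points, which fix the hyperplane
w = clf.coef_[0]                 # hyperplane weights (available for linear kernels)
print(2 / nm.linalg.norm(w))     # margin width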

How does SVM work?

Linear SVM:
The working of the SVM algorithm can be understood with an example. Suppose we have a dataset with two tags (green and blue) and two features, x1 and x2. We want a classifier that can classify any pair (x1, x2) of coordinates as either green or blue. Consider the image below:

[Image: dataset with two classes (green and blue) over features x1 and x2]

Since this is a 2-D space, we can separate these two classes with just a straight line. But there can be multiple lines that separate these classes. Consider the image below:

[Image: multiple candidate lines separating the two classes]

Hence, the SVM algorithm helps us find the best line or decision boundary; this best boundary is called the hyperplane. The SVM algorithm finds the points from both classes that lie closest to the line; these points are called support vectors. The distance between the support vectors and the hyperplane is called the margin, and the goal of SVM is to maximize this margin. The hyperplane with the maximum margin is called the optimal hyperplane.

[Image: optimal hyperplane with maximum margin and support vectors]

Non-Linear SVM:
If data is linearly arranged, we can separate it with a straight line, but for non-linear data we cannot draw a single straight line. Consider the image below:

[Image: non-linearly separable data in two dimensions]

So, to separate these data points, we need to add one more dimension. For linear data we used the two dimensions x and y, so for non-linear data we add a third dimension z, calculated as:
z = x² + y²
After adding the third dimension, the sample space looks like the image below:

[Image: the data after adding the third dimension z = x² + y²]

SVM will now divide the dataset into classes as follows. Consider the image below:

[Image: linear separating plane in the three-dimensional space]

Since we are in 3-D space, the decision boundary looks like a plane parallel to the x-y plane. If we convert it back to 2-D space by setting z = 1, it becomes:

[Image: the decision boundary viewed in 2-D space, a circle]

Hence, in the case of non-linear data, we get a decision boundary that is a circle of radius 1.
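A small numeric sketch (with made-up points) shows why this works: after the mapping z = x² + y², a simple threshold on z, i.e. a plane, separates the two classes:

# Sketch: the mapping z = x^2 + y^2 turns a circular boundary in 2-D
# into a linear one (a threshold on z) in 3-D.
import numpy as nm

inner = nm.array([[0.2, 0.1], [-0.3, 0.2], [0.1, -0.4]])   # class near the origin
outer = nm.array([[1.5, 0.5], [-1.2, 1.0], [0.8, -1.4]])   # class farther out

z_inner = (inner ** 2).sum(axis=1)   # all well below 1
z_outer = (outer ** 2).sum(axis=1)   # all well above 1
print(z_inner, z_outer)              # the plane z = 1 separates the classes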
Python Implementation of Support Vector Machine
Now we will implement the SVM algorithm using Python. Here we will use the same dataset, user_data, that we used in Logistic Regression and KNN classification.
• Data Pre-processing step
Up to the data pre-processing step, the code remains the same as before. Below is the code:

# Data Pre-processing Step
# importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# importing the dataset
data_set = pd.read_csv('user_data.csv')

# extracting the independent and dependent variables
x = data_set.iloc[:, [2, 3]].values
y = data_set.iloc[:, 4].values

# splitting the dataset into training and test sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)

# feature scaling
from sklearn.preprocessing import StandardScaler
st_x = StandardScaler()
x_train = st_x.fit_transform(x_train)
x_test = st_x.transform(x_test)

After executing the above code, the data will be pre-processed. The dataset will look like this:

[Image: the pre-processed dataset]

The scaled output for the test set will be:

[Image: the scaled output for the test set]
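As a side note, feature scaling matters for SVM because the margin is measured in Euclidean distance, so unscaled features can dominate it. An optional sketch, not part of the original tutorial: scikit-learn's Pipeline can bundle the scaler and the classifier so the same scaling is re-applied to any new data automatically.

# Optional sketch: bundle scaling and the classifier in one estimator.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

model = make_pipeline(StandardScaler(), SVC(kernel='linear', random_state=0))
# model.fit(x_train, y_train) would replace the separate scaling steps above.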

Fitting the SVM classifier to the training set:
Now the training set will be fitted to the SVM classifier. To create the classifier, we import the SVC class from the sklearn.svm module. Below is the code:

from sklearn.svm import SVC  # "Support Vector Classifier"
classifier = SVC(kernel='linear', random_state=0)
classifier.fit(x_train, y_train)

In the above code we used kernel='linear' because we are creating an SVM for linearly separable data; for non-linear data we would choose a different kernel. We then fitted the classifier to the training data (x_train, y_train).
Output:
Out[8]:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
    kernel='linear', max_iter=-1, probability=False, random_state=0,
    shrinking=True, tol=0.001, verbose=False)
The model performance can be altered by changing the value of C (the regularization parameter), gamma, and the kernel.
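For example, a small cross-validated grid search can pick these hyperparameters automatically. This is a sketch: the grid values are arbitrary illustrations, not tuned choices for this dataset.

# Sketch: tune the kernel, C, and gamma with a cross-validated grid search.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {'kernel': ['linear', 'rbf'],
              'C': [0.1, 1, 10],
              'gamma': ['scale', 0.1, 1]}
search = GridSearchCV(SVC(random_state=0), param_grid, cv=5)
search.fit(x_train, y_train)
print(search.best_params_)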
• Predicting the test set result:
Now we will predict the output for the test set. For this, we create a new vector, y_pred. Below is the code:

# Predicting the test set result
y_pred = classifier.predict(x_test)

After getting the y_pred vector, we can compare y_pred with y_test to check the difference between the actual and predicted values (a quick numeric check follows the output below).
Output: Below is the output for the prediction of the test set:

[Image: the predicted test set results (y_pred)]
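A quick numeric check makes that comparison concrete. This sketch uses scikit-learn's accuracy_score on the same vectors:

# Sketch: fraction of test predictions that match the actual labels.
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, y_pred))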

• Creating the confusion matrix:
Now we will check the performance of the SVM classifier: how many incorrect predictions it makes compared to the Logistic Regression classifier. To create the confusion matrix, we import the confusion_matrix function from the sklearn.metrics module and call it, storing the result in a new variable, cm. The function takes two main parameters: y_true (the actual values) and y_pred (the values returned by the classifier). Below is the code:

# Creating the confusion matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

Output:

[Image: the confusion matrix output]

As we can see in the above output image, there are 66 + 24 = 90 correct predictions and 8 + 2 = 10 incorrect predictions. We can therefore say that our SVM model performed better than the Logistic Regression model.
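For a binary problem like this one, the four entries of the matrix can also be unpacked directly. This is a sketch; in scikit-learn's convention, ravel() flattens the 2x2 matrix into true negatives, false positives, false negatives, and true positives, in that order:

# Sketch: unpack the 2x2 confusion matrix into its four counts.
tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)
print((tn + tp) / cm.sum())   # accuracy recomputed from the matrix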
• Visualizing the training set result:
Now we will visualize the training set result. Below is the code:

# Visualizing the training set result
from matplotlib.colors import ListedColormap
x_set, y_set = x_train, y_train
x1, x2 = nm.meshgrid(nm.arange(start=x_set[:, 0].min() - 1, stop=x_set[:, 0].max() + 1, step=0.01),
                     nm.arange(start=x_set[:, 1].min() - 1, stop=x_set[:, 1].max() + 1, step=0.01))
mtp.contourf(x1, x2, classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green')))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c=ListedColormap(('red', 'green'))(i), label=j)
mtp.title('SVM classifier (Training set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()
Output:

By executing the above code, we will get the output as:

[Image: SVM classifier regions on the training set]

As we can see, the above output looks similar to the Logistic Regression output. We get a straight line as the hyperplane because we used a linear kernel in the classifier; as discussed above, for 2-D space the hyperplane in SVM is a straight line.
• Visualizing the test set result:

# Visualizing the test set result
from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
x1, x2 = nm.meshgrid(nm.arange(start=x_set[:, 0].min() - 1, stop=x_set[:, 0].max() + 1, step=0.01),
                     nm.arange(start=x_set[:, 1].min() - 1, stop=x_set[:, 1].max() + 1, step=0.01))
mtp.contourf(x1, x2, classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green')))
mtp.xlim(x1.min(), x1.max())
mtp.ylim(x2.min(), x2.max())
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c=ListedColormap(('red', 'green'))(i), label=j)
mtp.title('SVM classifier (Test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()
Output:

By executing the above code, we will get the output as:

[Image: SVM classifier regions on the test set]

As we can see in the above output image, the SVM classifier has divided the users into two regions (purchased or not purchased). Users who purchased the SUV appear in the red region with red scatter points, and users who did not purchase the SUV appear in the green region with green scatter points. The hyperplane has separated the two classes of the Purchased variable.
So, this brings us to the end of the blog. This Tecklearn 'Support Vector Machine Algorithm' blog helps you with commonly asked questions if you are looking for a job in Data Science. If you wish to learn Data Science and build a career in the Data Science domain, check out our interactive Data Science using R Language Training, which comes with 24*7 support to guide you throughout your learning period. Please find the link for course details:

https://www.tecklearn.com/course/data-science-training-using-r-language/

Data Science using R Language Training

About the Course

Tecklearn's Data Science using R Language Training develops the knowledge and skills to visualize, transform, and model data in the R language. It helps you master Data Science with R concepts such as data visualization, data manipulation, machine learning algorithms, charts, and hypothesis testing through industry use cases and real-time examples. The Data Science certification training lets you master data analysis, R statistical computing, connecting R with the Hadoop framework, machine learning algorithms, time-series analysis, K-Means clustering, Naïve Bayes, business analytics, and more. This course will help you gain hands-on experience in deploying recommenders using R, evaluation, data transformation, and more.

Why Should you take Data Science Using R Training?

• The average salary of a Data Scientist using R is $123k per annum (Glassdoor.com).
• A recent market study shows that the Data Analytics market is expected to grow at a CAGR of 30.08% from 2020 to 2023, reaching $77.6 billion.
• IBM, Amazon, Apple, Google, Facebook, Microsoft, Oracle, and other MNCs worldwide use data science for their data analysis.

What you will Learn in this Course?

Introduction to Data Science
• Need for Data Science
• What is Data Science
• Life Cycle of Data Science
• Applications of Data Science
• Introduction to Big Data
• Introduction to Machine Learning
• Introduction to Deep Learning
• Introduction to R & RStudio
• Project Based Data Science
Introduction to R
• Introduction to R
• Data Exploration
• Operators in R
• Inbuilt Functions in R
• Flow Control Statements & User Defined Functions
• Data Structures in R
Data Manipulation
• Need for Data Manipulation
• Introduction to dplyr package
• select(), filter(), mutate(), sample_n(), sample_frac() and count() functions
• Getting summarized results with the summarise() function
• Combining different functions with the pipe operator
• Implementing SQL-like operations with sqldf()
Visualization of Data
• Loading different types of datasets in R
• Arranging the data
• Plotting the graphs
Introduction to Statistics
• Types of Data
• Probability
• Correlation and Co-variance
• Hypothesis Testing
• Standardization and Normalization
Introduction to Machine Learning
• What is Machine Learning?
• Machine Learning Use-Cases
• Machine Learning Process Flow
• Machine Learning Categories
• Supervised Learning algorithm: Linear Regression and Logistic Regression
Logistic Regression
• Intro to Logistic Regression
• Simple Logistic Regression in R
• Multiple Logistic Regression in R
• Confusion Matrix
• ROC Curve
Classification Techniques
• What is classification and what are its use cases?
• What is Decision Tree?
• Algorithm for Decision Tree Induction
• Creating a Perfect Decision Tree
• Confusion Matrix
• What is Random Forest?
• What is Naive Bayes?
• Support Vector Machine: Classification
Decision Tree
• Decision Tree in R
• Information Gain
• Gini Index
• Pruning
Recommender Engines
• What is Association Rules & its use cases?
• What is a Recommendation Engine and how does it work?
• Types of Recommendations
• User-Based Recommendation
• Item-Based Recommendation
• Difference: User-Based and Item-Based Recommendation
• Recommendation use cases
Time Series Analysis
• What is Time Series data?
• Time Series variables
• Different components of Time Series data
• Visualize the data to identify Time Series Components
• Implement ARIMA model for forecasting
• Exponential smoothing models
• Identifying different time series scenarios and choosing the appropriate Exponential Smoothing model for each

Got a question for us? Please mention it in the comments section and we will get back to you.

 
