Support Vector Machines

Neil J. Salkind

doi:10.4135/9781412952644

Edited by:
Neil J. Salkind
In: Encyclopedia of Measurement and Statistics
Chapter DOI: https://doi.org/10.4135/9781412952644.n445
Subject: Anthropology, Business and Management, Criminology and Criminal Justice, Communication and Media Studies, Counseling and Psychotherapy, Economics, Education, Geography, Health, History, Marketing, Nursing, Political Science and International Relations, Psychology, Social Policy and Public Policy, Social Work, Sociology, Science, Technology, Computer Science, Engineering, Mathematics, Medicine

Support vector machines (SVM) are a family of machine learning (classification) algorithms that construct a classifier to assign a group label to each case on the basis of its attributes. The algorithm requires two variables for each case in the data. The first is the group variable to be classified and predicted (such as disease status or treatment), and the second is the attribute variable, which is usually multidimensional numerical data (such as daily cigarette consumption or the abundance of particular enzymes in the blood). Normally, a classifier is learned from a set of training cases and is then applied to an independent test set to evaluate the classification accuracy.

SVM was developed by Vladimir Vapnik and colleagues in the 1990s. The simplest situation is binary (e.g., “disease” vs. “normal”), separable, linear classification. The method seeks a hyperplane that separates the two groups of data with the maximum margin, where the margin is defined as the distance between the hyperplane and the closest case in the data. The idea came originally from Vapnik-Chervonenkis theory, which shows that a bound on the generalization error of the classifier is minimized when the margin is maximized.
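The maximum-margin problem described above is commonly written as the following optimization problem. The notation is an assumption introduced here, not taken from the entry: x_i is the attribute vector of case i, y_i is its group label coded as -1 or +1, w is the normal vector of the hyperplane, and b is its offset.

```latex
\[
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^{2}
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b)\ \ge\ 1,
\qquad i = 1,\dots,n
\]
```

With this scaling, the closest cases satisfy the constraint with equality and lie at distance 1/‖w‖ from the hyperplane, so minimizing ‖w‖ is the same as maximizing the margin.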

Why the name support vector machines? The classification hyperplane found by SVM depends only on the support vectors, which are the cases closest to the hyperplane. All the remaining cases, farther away, do not contribute to the formulation of the classification hyperplane. The optimization problem is solved through quadratic programming techniques, and algorithms are available for fast implementation of SVM.
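As a minimal sketch of what "closest cases" means, the following Python snippet measures the distance of each case to a given hyperplane and picks out the support vectors. The six cases and the hyperplane are hypothetical toy values chosen by hand for illustration; nothing here is fitted by quadratic programming, and none of it comes from the entry's own example.

```python
import math

# Hypothetical toy data: two attributes per case, labels coded as +1 / -1.
cases = [(2.0, 0.0), (3.0, 1.0), (4.0, -1.0),
         (0.0, 0.0), (-1.0, 2.0), (-2.0, -1.0)]
labels = [1, 1, 1, -1, -1, -1]

# Hyperplane w.x + b = 0, set by hand (here the vertical line x1 = 1).
# For this toy data it separates the two groups with the widest gap.
w, b = (1.0, 0.0), -1.0


def signed_distance(x, y):
    """Distance from case x to the hyperplane, positive if x is on the correct side."""
    norm_w = math.sqrt(sum(wj * wj for wj in w))
    return y * (sum(wj * xj for wj, xj in zip(w, x)) + b) / norm_w


distances = [signed_distance(x, y) for x, y in zip(cases, labels)]

# The margin is the distance to the closest case; the support vectors
# are the cases that attain it.
margin = min(distances)
support_vectors = [x for x, d in zip(cases, distances) if math.isclose(d, margin)]

print("margin:", margin)
print("support vectors:", support_vectors)
```

Only one case from each group sits at the margin in this toy layout. Moving any of the other four cases farther from the line x1 = 1 would leave the hyperplane unchanged.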

SVM is applied to a wide range of classification problems, including handwriting recognition, face recognition, disease detection, and other biological problems. Two important extensions made SVM popular and feasible in practical applications: kernel methods and soft margin. Kernel methods extend linear SVM to construct a nonlinear classifier. The idea is to map the current space to a higher-dimensional space so that a nonlinear classifier in the current space becomes a linear one in the new, high-dimensional space. The dot product that defines the distance structure is simply replaced by a kernel function, and the optimization is performed similarly. Common choices of kernel functions include polynomial, radial basis, and sigmoid. Soft margin is used when the two groups of cases are not separable by any possible hyperplane. Penalties are given to “misclassified” cases in the target function, and the penalized “soft margin” is similarly optimized.
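The two extensions can be illustrated with a short Python sketch. It writes the three kernels in one common parameterization (the names gamma, coef0, and degree are assumptions, not taken from the entry), checks on one pair of points that a degree-2 polynomial kernel equals an ordinary dot product in a higher-dimensional space, and writes the soft-margin target function in its usual hinge-penalty form with an assumed penalty parameter C.

```python
import math


def dot(x, z):
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, gamma=1.0, coef0=0.0):
    return (gamma * dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    # Radial basis kernel: depends only on the squared distance between x and z.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)


def phi(x):
    """Explicit map from 2-D to 3-D whose dot product equals (x.z)**2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def soft_margin_objective(w, b, X, y, C=1.0):
    """0.5*||w||^2 plus C times the total penalty for cases that fall
    inside the margin or on the wrong side of the hyperplane."""
    penalty = sum(max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y))
    return 0.5 * dot(w, w) + C * penalty


# Kernel trick on one hypothetical pair of points.
x, z = (1.0, 2.0), (3.0, 4.0)
kernel_value = polynomial_kernel(x, z)   # computed in the original 2-D space
mapped_value = dot(phi(x), phi(z))       # same quantity via the 3-D space
print(kernel_value, mapped_value)
```

The kernel gives the higher-dimensional dot product without ever constructing the mapped vectors, which is why the optimization can proceed as in the linear case.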

Assumptions and Applications of SVM

SVM is a distribution-free method (in contrast to methods like linear discriminant analysis). It is thus more robust to skewed or otherwise ill-behaved data distributions. The major considerations when using SVM are the selection of a proper kernel function and of the penalty parameters for soft margins. Kernels and parameters are usually selected by maximizing the total accuracy in cross-validation.
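Parameter selection by cross-validation can be sketched in plain Python as follows. Several things here are assumptions made for illustration: the data are hypothetical toy values, the penalty parameter is called C, and the linear soft-margin classifier is fitted by simple subgradient descent as a stand-in for the quadratic programming solvers used in practice.

```python
import random


def train_linear_svm(X, y, C=1.0, epochs=200, lr=0.01):
    """Fit w, b by subgradient descent on
    0.5*||w||^2 + C * sum(max(0, 1 - y_i*(w.x_i + b)))."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0  # gradient of 0.5*||w||^2 is w
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1:
                # Case is inside the margin or misclassified: penalty is active.
                grad_w = [g - C * yi * xj for g, xj in zip(grad_w, xi)]
                grad_b -= C * yi
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(model, x):
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_accuracy(X, y, C, k=5):
    """Total accuracy over all held-out cases in k-fold cross-validation."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = train_linear_svm([X[i] for i in train_idx],
                                 [y[i] for i in train_idx], C=C)
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


def select_C(X, y, grid, k=5):
    scores = {C: cv_accuracy(X, y, C, k) for C in grid}
    return max(grid, key=lambda C: scores[C]), scores


# Hypothetical toy data: two attributes per case, labels coded as -1 / +1.
X = [(1.0, 1.0), (1.5, 0.5), (2.0, 1.5), (1.0, 2.0), (0.5, 1.5),
     (4.0, 4.0), (4.5, 3.5), (5.0, 4.5), (4.0, 5.0), (3.5, 4.5)]
y = [-1] * 5 + [1] * 5
grid = [0.01, 0.1, 1.0, 10.0]

best_C, scores = select_C(X, y, grid)
print("cross-validated accuracy by C:", scores)
print("selected C:", best_C)
```

The same loop extends to kernel choice by adding the kernel and its parameters to the grid. When several candidates tie on accuracy, this sketch keeps the first one in the grid.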

Using the Computer

Since SVM is a relatively modern statistical technique, it is not implemented in most major statistical software (e.g., SAS and S-PLUS). In the following, the e1071 extension package for R is used to implement SVM, and an example of classifying car fuel efficiency is demonstrated. The data are a subsample of 25 cars from the “auto-mpg” data set in the UCI Machine Learning Repository. Fuel efficiency, horsepower, and weight are shown in Table 1. The goal of the classification problem is to classify cars as inefficient or economic on the basis of their horsepower and weight. In Figure 1, inefficient cars are shown on the right, and economic cars are on the left. Linear SVM is applied, and the resulting classification hyperplane, margin, and three support vectors (triangles) are indicated.

...
