
Support Vector Machine Concept: VC Dimension


Given a set of n samples x_1, x_2, …, x_n, we want to assign each sample a label of either -1 or 1. In total, there are 2^n possible label combinations.

A class of learning machines H can be used to label the samples. If, for every label combination, we can find a learning machine h \in H that reproduces it exactly, we say that H shatters the n points.

The VC (Vapnik-Chervonenkis) dimension is then defined as the maximum number of points that can be shattered by H; it measures the capacity of the hypothesis class H.
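To make the definition concrete, here is a minimal brute-force sketch of the shattering check in Python (an illustration of mine, not from the original post; the function shatters and the threshold-classifier example are assumptions). It enumerates all 2^n label combinations and checks whether each one is produced by some hypothesis; this only works when a finite, representative set of hypotheses can be enumerated.

from itertools import product

def shatters(hypotheses, points):
    """True if every one of the 2^n labelings of `points` is produced
    exactly by at least one hypothesis in `hypotheses`."""
    realized = {tuple(h(x) for x in points) for h in hypotheses}
    return all(labels in realized
               for labels in product((-1, 1), repeat=len(points)))

# Hypothetical example: threshold classifiers h_t(x) = sign(x - t) on the
# real line. They shatter any single point, but never produce the labeling
# (1, -1) for two points x1 < x2, so their VC dimension is 1.
thresholds = [lambda x, t=t: 1 if x >= t else -1 for t in (-1.5, 0.5, 2.5)]
print(shatters(thresholds, [0.0]))       # True
print(shatters(thresholds, [0.0, 2.0]))  # False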

Note that VC(H) = 4 does not mean H can shatter any set of 4 points; it is enough that there exists some set of 4 points that H can shatter. It does imply, however, that no set of more than 4 points (say, 5) can be shattered by H.

Now consider a simple example: finding the VC dimension of a hypothesis class H, where H is the set of straight lines in the two-dimensional plane. Each line labels the points on one side as 1 and the points on the other side as -1.

We can find 3 points a, b and c, shown below, for which all 2^3 = 8 possible label combinations can be classified correctly by H.

[Figure: three non-collinear points a, b and c; each of the 8 labelings is separated by a straight line]

As mentioned before, it is also possible to find 3 points that H cannot shatter: for example, place a, b and c on the same line. Then there are label combinations that no straight line can classify correctly, such as labeling the middle point differently from the two outer points. But this does not affect the VC dimension, since it is defined as the maximum number of points that can be shattered.

One can try to separate 4 points using a straight line, but no matter where the 4 points are placed, some label combination cannot be separated. If the 4 points are in convex position, labeling the two diagonally opposite pairs with opposite signs (the XOR configuration) is not separable; if one point lies inside the triangle formed by the other three, labeling the inner point differently from the rest is not separable. Hence the VC dimension of straight lines in the plane is 3.
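These claims can be checked numerically. Below is a sketch of mine (assuming NumPy and SciPy are available; the helpers line_can_realize and lines_shatter are hypothetical names, not from the post). It uses the fact that a labeling is realizable by a straight line exactly when the two label groups are strictly linearly separable, which is a linear-programming feasibility problem.

from itertools import product
import numpy as np
from scipy.optimize import linprog

def line_can_realize(points, labels):
    """True if some line w.x + b = 0 produces exactly this labeling,
    i.e. the LP 'find (w, b) with y_i * (w . x_i + b) >= 1' is feasible."""
    X = np.asarray(points, dtype=float)
    y = np.asarray(labels, dtype=float)
    # Rewrite y_i * (w . x_i + b) >= 1 as -y_i * [x_i, 1] . (w, b) <= -1.
    A_ub = -y[:, None] * np.hstack([X, np.ones((len(X), 1))])
    res = linprog(c=np.zeros(3), A_ub=A_ub, b_ub=-np.ones(len(X)),
                  bounds=[(None, None)] * 3)  # w1, w2 and b are unbounded
    return res.success

def lines_shatter(points):
    """True if all 2^n labelings of `points` are realizable by lines."""
    return all(line_can_realize(points, labels)
               for labels in product((-1, 1), repeat=len(points)))

print(lines_shatter([(0, 0), (1, 0), (0, 1)]))          # True: 3 points shattered
print(lines_shatter([(0, 0), (1, 0), (2, 0)]))          # False: collinear
print(lines_shatter([(0, 0), (1, 1), (0, 1), (1, 0)]))  # False: 4 points

The last check fails on, among others, the XOR labeling that marks (0, 0) and (1, 1) as 1 and the other two points as -1, matching the argument above.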

 

