Unassigned classes in a multi-class SVM using the one-vs-all approach



How do you handle unassigned classes in a multi-class support vector machine (multi-class SVM) using the one-vs-all approach?

Let's say my training data has three classes: A, B, and C. I use 3 one-vs-all SVM classifiers. For a particular test instance, all the classifiers say 'Not A', 'Not B', and 'Not C' respectively. How do I assign this instance to one of the classes? I tried using an 'unassigned' class, but the error rate was too high.

Another question:
When two or more classifiers assign a test instance as positive, which one should I go with? I read somewhere that you should use the "output function of the SVM". What is that?

Matlab gives the following svmStruct:

  1. SupportVectors
  2. Alpha
  3. Bias
  4. KernelFunction
  5. SupportVectorIndices
  6. others

How do I use these to generate the "output value"?

Also, I don't quite understand Platt's technique. How do I implement it using the elements of svmStruct? This question is related, but it was not answered.


Another question: When two or more classifiers assign a test instance as positive, which one should I go with? I read somewhere that you should use "output function of the SVM". Now what's that?

When used as a classifier, the label for instance $\mathbf{x}$ is decided based on the sign of the decision value $f(\mathbf{x})$. The decision value of an SVM for instance $\mathbf{x}$ is computed as follows:

$$ f(\mathbf{x}) = \sum_{i\in \mathcal{S}} \alpha_i y_i \kappa(\mathbf{x}_i,\mathbf{x}) + b . $$

The decision value itself can be used to rank decisions by confidence in the predicted label. A higher absolute value $|f(\mathbf{x})|$ indicates a larger distance from $\mathbf{x}$ to the separating hyperplane and as such signifies higher confidence in the predicted label.
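
As a small illustration (plain MATLAB, variable names are mine): given a vector of decision values for a batch of test instances, the predicted labels and a confidence ranking follow directly from the sign and the magnitude of $f$.

```matlab
% f: column vector of decision values f(x), one per test instance
labels = sign(f);                         % predicted binary labels (+1 / -1)
[~, order] = sort(abs(f), 'descend');     % most confident predictions first
rankedLabels = labels(order);             % labels ordered by confidence
```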

Matlab gives the following svmStruct ... How do I use these to generate the "output value"?

Matlab's svmStruct has all the ingredients for you to implement the decision function $f(\cdot)$ yourself (a sketch follows the list):

  • SupportVectors: the set of $\mathbf{x}_i$'s,
  • Alpha: the $\alpha$ vector (probably even $\alpha.*y$),
  • Bias: the $b$ term,
  • KernelFunction: $\kappa(\cdot,\cdot)$.
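
A minimal sketch of $f(\cdot)$ built from these fields, assuming Alpha already holds the signed weights $\alpha_i y_i$ and ignoring any ScaleData normalisation or extra KernelFunctionArgs that svmtrain may have stored (the function name is mine):

```matlab
function f = svmDecisionValue(svmStruct, x)
% Decision value f(x) for a single test instance x (row vector),
% reconstructed from the fields of Matlab's svmStruct.
    sv    = svmStruct.SupportVectors;      % one support vector x_i per row
    alpha = svmStruct.Alpha;               % signed weights alpha_i * y_i
    b     = svmStruct.Bias;                % bias term b
    kfun  = svmStruct.KernelFunction;      % e.g. @(u,v) u*v' for a linear kernel
    K     = kfun(x, sv);                   % 1-by-#SV row of kernel evaluations
    f     = K * alpha(:) + b;              % f(x) = sum_i alpha_i y_i k(x_i, x) + b
end
```

The predicted label is then sign(f) and |f| serves as the (uncalibrated) confidence; it is worth checking the output against svmclassify, since the sign convention for the two groups may differ between Matlab versions.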

Let's say my training data has three classes A, B, and C. I use 3 SVM one-vs-all classifiers. For a particular test instance, all classifiers say 'Not A', 'Not B' and 'Not C' respectively. How do I assign this instance to one of the classes?

Choose the class whose classifier gives the least confident (calibrated) 'not this class' prediction. Do not compare decision values of different models directly, since their distributions may differ (see below for simple instructions).

Calibrating decision values

You can compare decision values from one single model with each other to perform ranking. You cannot, in general, compare decision values of different models with each other directly, as they are not calibrated.

In your case, unfortunately, you are in the latter situation, so you first need to perform some kind of calibration. A very rudimentary (but often sufficiently accurate) calibration approach is to divide each model's decision values by the standard deviation of the decision values that model produces on an independent test set (not the training set!).
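
Concretely, with the decision-value function sketched above, the per-model scale could be estimated like this (Xval is a hypothetical independent validation set, one instance per row):

```matlab
% Decision values of one model on an independent validation set
fVal = zeros(size(Xval, 1), 1);
for i = 1:size(Xval, 1)
    fVal(i) = svmDecisionValue(svmStruct, Xval(i, :));
end
sigma = std(fVal);    % scale used to calibrate this model's decision values
```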

For example, assume you have two models which disagree on a label: $$ sign[f_1(\mathbf{x})] \neq sign[f_2(\mathbf{x})]. $$ You want to follow the model whose decision is made with the highest confidence, but you cannot compare the two directly (since the distribution of decision values may differ between the models). This is why we calibrate them first:

$$ \begin{align} f_1^{cal}(\mathbf{x}) &= f_1(\mathbf{x}) / \sigma_1 \\ f_2^{cal}(\mathbf{x}) &= f_2(\mathbf{x}) / \sigma_2 \end{align} $$

Now simply follow the label from the model with the highest calibrated confidence in its decision (i.e. the highest absolute value of the calibrated decision value).
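
Putting it together for your three one-vs-all models, here is a sketch (svmStructs is a hypothetical cell array of the three trained models, sigmas the corresponding scales from above, and x a test instance). Note that taking the class with the largest calibrated decision value covers both situations from the question: if several models say 'positive' it picks the most confident one, and if all say 'negative' it picks the least confident rejection.

```matlab
classNames = {'A', 'B', 'C'};
fCal = zeros(1, 3);
for k = 1:3
    fCal(k) = svmDecisionValue(svmStructs{k}, x) / sigmas(k);  % calibrated value
end
[~, best] = max(fCal);              % most positive (or least negative) calibrated value
predictedClass = classNames{best};
```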