# Supervised Learning - Classifiers

Supervised learning techniques covered:

  • Feature selection: mutual information
  • Bayes decision rule: naïve Bayes
  • Linear classifiers: logistic regression, support vector machine
  • Nonlinear classifiers: k-nearest neighbors
  • Neural networks: single neuron ≈ logistic regression, deep neural networks
  • Regression: linear regression, polynomial regression, ridge regression

## 1. Classifiers

### 1.1 Naive Bayes Classifier

### 1.2 Nearest Neighbor Classifier (KNN)

### 1.3 Maximum Margin Classifier (SVM)

Since $c$ here is a constant, it can be absorbed into the parameters $w,b$ on the left-hand side of the "$\ge$" constraint, which gives the basic form of the Support Vector Machine (SVM): \(\min_{w,b}\|w\|^2\)

\[s.t.\text{ }y^i(w^Tx^i+b)\ge 1,\forall i\]

where $y^i=\pm 1$ indicates whether $x^i$ belongs to Class 1 or Class 2.

The support vectors are the points lying on the lines $w^Tx+b=\pm 1$.
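As a quick illustration of this formulation (a minimal sketch; the toy data, variable names, and the use of scikit-learn's `SVC` with a very large `C` to approximate the hard margin are assumptions, not part of the original notes):

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clouds with labels y in {-1, +1}
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(20, 2)),
               rng.normal(+2.0, 0.5, size=(20, 2))])
y = np.array([-1] * 20 + [+1] * 20)

# A linear SVC with a very large C approximates the hard-margin problem
#   min_{w,b} ||w||^2   s.t.   y^i (w^T x^i + b) >= 1
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

# The support vectors are the points with y^i (w^T x^i + b) ≈ 1
print("margins at support vectors:",
      y[clf.support_] * (X[clf.support_] @ w + b))
```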

#### 1.3.1 Dual problem of SVM

First, convert the SVM problem to the standard form \(\min_{w,b}\frac{1}{2}w^Tw\)

\[s.t.\text{ }1-y^i(w^Tx^i+b)\le 0,\forall i\]

Then construct the Lagrangian function \(L(w,b,\alpha)=\frac{1}{2}w^Tw+\sum_{i=1}^m\alpha_i(1-y^i(w^Tx^i+b))\)

  • when $1-y^i(w^Tx^i+b)=0$ (the constraint is active), $\alpha_i > 0$ is possible
  • when $1-y^i(w^Tx^i+b)<0$ (the constraint is inactive), $\alpha_i = 0$

Take the derivative with respect to each primal variable and set it to zero (to find the optimal $w^*$): \(\frac{\partial L}{\partial w}=0\to w=\sum_{i=1}^m\alpha_iy^ix^i\)

\[\frac{\partial L}{\partial b}=0\to \sum_{i=1}^m\alpha_iy^i=0\]
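Spelling out the substitution (an intermediate step not written out in the original, included for clarity): expanding the Lagrangian first gives

\[L=\frac{1}{2}w^Tw+\sum_{i=1}^m\alpha_i-\sum_{i=1}^m\alpha_iy^i(w^Tx^i)-b\sum_{i=1}^m\alpha_iy^i\]

The last term vanishes because $\sum_{i=1}^m\alpha_iy^i=0$, and substituting $w=\sum_{j=1}^m\alpha_jy^jx^j$ turns $\frac{1}{2}w^Tw$ into $\frac{1}{2}\sum_{i,j=1}^m\alpha_i\alpha_jy^iy^j({x^i}^Tx^j)$ and $\sum_{i=1}^m\alpha_iy^i(w^Tx^i)$ into $\sum_{i,j=1}^m\alpha_i\alpha_jy^iy^j({x^i}^Tx^j)$, leaving only a function of $\alpha$.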

Plugging these back into the Lagrangian therefore gives the dual problem of SVM: \(\max_\alpha\ \sum_{i=1}^m\alpha_i-\frac{1}{2}\sum_{i,j=1}^m\alpha_i\alpha_jy^iy^j({x^i}^Tx^j)\)

\[s.t.\begin{cases} \alpha_i\ge 0 \text{ }\forall i\\ \sum_{i=1}^m\alpha_iy^i=0 \end{cases}\]

Solving the above quadratic program yields the optimal $\alpha_i^*$ for every data point (the labels $y^i$ are given by the data).
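A minimal numerical sketch of that step, assuming SciPy is available (the function name `solve_svm_dual` is hypothetical; real implementations use dedicated QP or SMO solvers):

```python
import numpy as np
from scipy.optimize import minimize

def solve_svm_dual(X, y):
    """Maximize the hard-margin SVM dual over alpha for small toy problems."""
    m = len(y)
    K = np.outer(y, y) * (X @ X.T)            # y^i y^j (x^i)^T x^j

    def neg_dual(alpha):                      # negated dual: minimizing it maximizes the dual
        return 0.5 * alpha @ K @ alpha - alpha.sum()

    res = minimize(neg_dual, np.zeros(m),
                   bounds=[(0.0, None)] * m,               # alpha_i >= 0
                   constraints={"type": "eq",
                                "fun": lambda a: a @ y},   # sum_i alpha_i y^i = 0
                   method="SLSQP")
    return res.x
```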

#### 1.3.2 Solve the dual problem

First, recover $w$: \(w=\sum_{i=1}^m\alpha_iy^ix^i\)

Then, for any data point $i$ such that $\alpha_i>0$, solve for $b$ from \(1-y^i(w^Tx^i+b)=0\)
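Sketched in the same toy setting (again an assumption, building on the hypothetical `solve_svm_dual` above):

```python
import numpy as np

def recover_w_b(X, y, alpha, tol=1e-6):
    """Recover the primal (w, b) from the dual solution alpha."""
    w = (alpha * y) @ X                      # w = sum_i alpha_i y^i x^i
    sv = np.flatnonzero(alpha > tol)         # support vectors: alpha_i > 0
    # 1 - y^i (w^T x^i + b) = 0  =>  b = y^i - w^T x^i  (averaged over SVs for stability)
    b = float(np.mean(y[sv] - X[sv] @ w))
    return w, b
```

With the toy data from the earlier `SVC` sketch, `recover_w_b(X, y, solve_svm_dual(X, y))` should match the `(w, b)` reported there up to numerical tolerance.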

## 2. Regression

### 2.1 Linear Regression Model

### 2.2 Ridge Regression

### 2.3 Nonlinear Regression
