Welcome to LZN's Blog!
Wind extinguishes a candle but energizes fire.
W8 Unsupervised Learning
Notes
KMeans
- Process (see the sketch after this list):
  - Randomly initialize K cluster centroids.
  - Assign each data point to its nearest cluster centroid.
  - Move each cluster centroid to the mean of all the samples assigned to it.
  - Repeat the last two steps until the assignments stop changing (or the decrease in the cost falls below a threshold).
- If no sample is assigned to the k-th centroid, just eliminate that centroid.
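A minimal NumPy sketch of this loop, assuming the data X is an (m, n) array; kmeans and its seed parameter are illustrative names, and empty centroids are kept in place here rather than eliminated:

```python
import numpy as np

def kmeans(X, K, max_iters=100, tol=1e-6, seed=0):
    """Minimal K-means: random init, assign, move, repeat."""
    rng = np.random.default_rng(seed)
    # Randomly initialize centroids as K distinct training examples
    centroids = X[rng.choice(X.shape[0], K, replace=False)]
    for _ in range(max_iters):
        # Assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points; an empty
        # centroid keeps its old position (the notes suggest eliminating it)
        new_centroids = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(K)
        ])
        # Stop once the centroids barely move
        if np.linalg.norm(new_centroids - centroids) < tol:
            centroids = new_centroids
            break
        centroids = new_centroids
    return centroids, labels
```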
- Optimization Objectives
- Distortion cost function: J = (1/m) * sum_i ||x(i) - mu_c(i)||^2, the average squared distance from each example to its assigned centroid.
- Use multiple random initializations and keep the run with the lowest cost J to avoid bad local optima; this helps for small K but tends not to work for large K.
- Elbow method: plot the cost function J as a function of the number of clusters K and look for the elbow.
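A sketch of the distortion cost and the multiple-initialization / elbow recipe, reusing the kmeans() sketch above; distortion and best_of_n_runs are hypothetical helper names:

```python
import numpy as np

def distortion(X, centroids, labels):
    """Distortion cost J: average squared distance to the assigned centroid."""
    return np.mean(np.sum((X - centroids[labels]) ** 2, axis=1))

def best_of_n_runs(X, K, n_runs=50):
    """Run K-means from many random initializations, keep the lowest J."""
    best = None
    for seed in range(n_runs):
        centroids, labels = kmeans(X, K, seed=seed)
        J = distortion(X, centroids, labels)
        if best is None or J < best[0]:
            best = (J, centroids, labels)
    return best

# Elbow method: compute J for a range of K and look for the "elbow" in the plot
# Js = [best_of_n_runs(X, K)[0] for K in range(1, 11)]
```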
Dimensionality Reduction
- Aims: (1) data compression; (2) visualization.
- Principal Component Analysis (PCA)
- Apply feature scaling / mean normalization before performing PCA.
- PCA procedure (see the sketch at the end of this section):
  - Compute the covariance matrix Sigma.
  - Eigenvectors: [U, S, V] = svd(Sigma) or eig(Sigma).
  - U = [U1, U2, ..., Un] (columns); take Ureduce = the first k columns of U, which represent the k-dim features reduced from the n-dim originals.
  - z = transpose(Ureduce) * x
- Choose the smallest k such that 99% of the variance is retained.
- The diagonal matrix S is used to check variance: the retained fraction is (S11 + S22 + ... + Skk) / (S11 + S22 + ... + Snn).
- Do not use PCA to prevent overfitting: it uses no information from y; use regularization instead.
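A compact NumPy sketch of this pipeline, assuming the rows of X are examples; pca and variance_retained are illustrative names, and np.linalg.svd of Sigma stands in for the svd call above:

```python
import numpy as np

def pca(X, variance_retained=0.99):
    """PCA via SVD of the covariance matrix; picks the smallest k
    that retains the requested fraction of variance."""
    m = X.shape[0]
    # Mean normalization / feature scaling first
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Sigma = (X.T @ X) / m               # covariance matrix
    U, S, Vt = np.linalg.svd(Sigma)     # columns of U are the eigenvectors
    # Smallest k with (S11 + ... + Skk) / (S11 + ... + Snn) >= the target
    k = int(np.searchsorted(np.cumsum(S) / S.sum(), variance_retained)) + 1
    U_reduce = U[:, :k]                 # first k columns of U
    Z = X @ U_reduce                    # z = transpose(Ureduce) * x, per row
    return Z, U_reduce, k
```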
W7 Large Margin Classification
Notes
SVM - Support Vector Machine “Large Margin Classifiers”
- The SVM gives computational advantages compared with logistic regression; its optimization problem is easier to solve.
- Form of the SVM cost function: the logistic-regression-style J = A + lambda*B is rewritten as J = C*A + B, where A is the training-error term, B is the regularization term, and C plays the role of 1/lambda.
- For a linearly separable data set, a larger C yields a larger margin but makes the classifier more sensitive to outliers.
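A NumPy sketch of this reparametrized objective for a linear SVM with labels y in {0, 1}; svm_cost is a hypothetical helper built from hinge-style surrogates:

```python
import numpy as np

def svm_cost(theta, X, y, C):
    """Linear SVM objective J = C*A + B: A is the total hinge-style loss,
    B is the regularization term (labels y in {0, 1})."""
    z = X @ theta
    cost1 = np.maximum(0, 1 - z)        # surrogate used where y == 1
    cost0 = np.maximum(0, 1 + z)        # surrogate used where y == 0
    A = np.sum(y * cost1 + (1 - y) * cost0)
    B = 0.5 * np.sum(theta[1:] ** 2)    # do not regularize the bias theta[0]
    return C * A + B
```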
W6 Evaluation
Notes
Evaluation
- Split the data into a training set (60%), a cross-validation set (20%), and a test set (20%); see the split sketch after this list.
- If a learning algorithm is suffering from high variance, getting more training data is likely to help.
- A small neural network is prone to underfitting; a larger one is more prone to overfitting.
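A small sketch of the 60/20/20 split described above; split_60_20_20 is an illustrative helper name:

```python
import numpy as np

def split_60_20_20(X, y, seed=0):
    """Shuffle, then split into train (60%), cross-validation (20%), test (20%)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(0.6 * len(X))
    n_cv = int(0.8 * len(X))
    train, cv, test = idx[:n_train], idx[n_train:n_cv], idx[n_cv:]
    return (X[train], y[train]), (X[cv], y[cv]), (X[test], y[test])
```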
System Design
- Start from a simple model.
- Plot learning curves to decide if more data, features, etc. are likely to help
- Error analysis: (1) manually examine the examples the algorithm misclassified; (2) ask what additional features would help classify them correctly.
- Skewed classes: #positive examples ≪ #negative examples.
- Precision: #true positives / #predicted positives
- Recall: #true positives / #actual positives
- F1 score: 2PR/(P+R)
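A direct NumPy translation of these three definitions, with the positive class encoded as 1; precision_recall_f1 is an illustrative name:

```python
import numpy as np

def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for a skewed binary problem (positive = 1)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
    predicted_pos = np.sum(y_pred == 1)
    actual_pos = np.sum(y_true == 1)
    P = tp / predicted_pos if predicted_pos else 0.0
    R = tp / actual_pos if actual_pos else 0.0
    F1 = 2 * P * R / (P + R) if (P + R) else 0.0
    return P, R, F1
```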