Clasification
The training data comes in pairs. Yet the target values belong to a finite set. The goal is to pick the correct one as often as possible.
- Analyzing a text to determine if the sentiment is happy or sad.
- Using hockey players' statistics to predict which of two teams will probably win a match.
- Analyzing a song to determine if it is classical, techno, pop, or rock.
Binary Classification
There are only two classes to distinguish, typically "positive" or "negative".
Logistic Regression
If x is a Vector
Training
Loss Function
Cost Function
Evaluation
- Comparing training and testing result. If training accuracy is nearly 100%, it is overfitting.
- Since testing data is new data, it is used to test the accuracy on completely new data
Confusion Matrix
T = True F = False
Precision: TPos / (TPos + TNeg + TNeu)
Recall: TPos / (TPos / FNeg / FNeu)
F-Measure: 2*precision * recall (precision + recall)
Accuracy: (all true) / (all data)