Logistic Regression Algorithm
What is Logistic Regression?
Logistic Regression is a statistical model used for binary classification problems. Despite its name, it's a classification algorithm that predicts the probability of an outcome belonging to one of two classes.
How It Works
- Linear Combination: Calculate weighted sum of features
- Sigmoid Function: Transform to probability (0-1)
- Threshold: Classify based on probability (typically 0.5)
Formula
P(Y=1) = 1 / (1 + e^(-z))
where z = b0 + b1*x1 + b2*x2 + ... + bn*xn
Advantages
- Simple and Fast: Quick to train and predict
- Interpretable: Coefficients indicate feature importance
- Probability Output: Provides confidence in predictions
- No Hyperparameters: Minimal tuning required
- Works with Small Data: Doesn't require large datasets
Disadvantages
- Linear Boundaries: Cannot capture complex relationships
- Binary Only: Standard form only handles two classes
- Feature Engineering: Requires careful feature preparation
When to Use
- Binary classification problems
- When interpretability is important
- As a baseline model for comparison
- When you need probability estimates
Applications in CMMI-DCC
- Disease Classification: Disease vs. healthy
- Treatment Response: Responder vs. non-responder
- Feature Importance: Understanding which features predict outcomes
Related Terms
- Classification: Type of ML task
- Random Forest: More complex alternative
- ROC-AUC: Evaluation metric for classifiers