Outlier Detection Process
What is Outlier Detection?
Outlier Detection identifies data points that are significantly different from other observations. Outliers may indicate measurement errors, data entry mistakes, or genuinely unusual cases.
Methods in CMMI-DCC
IQR Method
- Uses the interquartile range, IQR = Q3 - Q1 (the spread of the middle 50% of values)
- Outliers: values < Q1 - 1.5 × IQR or > Q3 + 1.5 × IQR
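The rule above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the CMMI-DCC implementation; the function names, the sample readings, and the choice of linear ("inclusive") quartile interpolation are all assumptions made for the example.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    # Assumption: quartiles use linear interpolation over the data itself;
    # other quartile conventions give slightly different fences.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def iqr_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample: nine ordinary readings and one extreme value.
readings = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(iqr_bounds(readings))    # (9.375, 16.375)
print(iqr_outliers(readings))  # [102]
```

Because the fences are built from quartiles, a single extreme value barely moves them, which is why this method holds up on small or skewed samples.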
Z-Score Method
- Measures how many standard deviations a value lies from the mean: z = (x - mean) / standard deviation
- Outliers: |z| > 3 (a common default threshold)
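A minimal standard-library sketch of the z-score rule follows. The function name, the sample data, and the use of the sample standard deviation (n - 1 in the denominator) are assumptions for illustration. One caveat worth knowing: with the sample standard deviation, no value among n points can have |z| above (n - 1)/√n, so a threshold of 3 cannot flag anything when n is 10 or fewer.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Return the values whose |z-score| exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # assumption: sample standard deviation
    if std == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs((v - mean) / std) > threshold]

# Hypothetical sample: 20 ordinary readings and one extreme value.
readings = [10, 11, 12, 13, 14] * 4 + [102]
print(zscore_outliers(readings))  # [102]
```

Note that the extreme value itself inflates both the mean and the standard deviation, so this method is less robust than a quartile-based rule when outliers are large or numerous.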
Isolation Forest
- Machine learning-based: builds an ensemble of randomly split trees
- Identifies anomalies as the points that random splits isolate in the fewest steps
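As an illustration, scikit-learn's `IsolationForest` can be used roughly as shown below. This is a sketch on synthetic data, not necessarily how CMMI-DCC configures the method; the helper name `flag_anomalies`, the contamination rate, and the generated data are assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def flag_anomalies(X, contamination=0.01, seed=0):
    """Fit an Isolation Forest; return (labels, scores) for each row of X."""
    model = IsolationForest(
        n_estimators=100,
        contamination=contamination,  # assumed share of anomalies in the data
        random_state=seed,            # fixed seed for repeatable results
    )
    labels = model.fit_predict(X)        # -1 = anomaly, 1 = inlier
    scores = model.decision_function(X)  # lower = more anomalous
    return labels, scores

# Synthetic data: 200 ordinary values plus one extreme value in the last row.
rng = np.random.default_rng(0)
ordinary = rng.normal(loc=12.0, scale=1.5, size=(200, 1))
X = np.vstack([ordinary, [[102.0]]])

labels, scores = flag_anomalies(X)
print(labels[-1])              # label assigned to the extreme value
print(int(np.argmin(scores)))  # row index of the most anomalous point
```

Unlike the two statistical rules, this approach works on several columns at once and makes no assumption about the shape of the distribution, but the `contamination` setting directly controls how many rows are flagged and usually has to be chosen by the analyst.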
Handling Outliers
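- Remove: Delete the rows that contain outliers
- Cap/Winsorize: Replace each outlier with the nearest boundary value
- Transform: Apply a log or other transformation to reduce the influence of extreme values
- Keep: Retain outliers that are biologically meaningful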
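The first three options can be sketched as small helper functions. The names and the sample data are hypothetical; the boundary values are passed in explicitly, and the ones used here (9.375 and 16.375) are what the 1.5 × IQR rule yields for this sample when quartiles are linearly interpolated.

```python
import math

def remove_outliers(values, lower, upper):
    """Remove: keep only the values inside [lower, upper]."""
    return [v for v in values if lower <= v <= upper]

def cap_outliers(values, lower, upper):
    """Cap/Winsorize: clamp each value to the nearest boundary."""
    return [min(max(v, lower), upper) for v in values]

def log_transform(values):
    """Transform: log(1 + v) compresses large values; requires v > -1."""
    return [math.log1p(v) for v in values]

# Hypothetical sample with one extreme value.
readings = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
lower, upper = 9.375, 16.375

print(remove_outliers(readings, lower, upper))  # drops 102
print(cap_outliers(readings, lower, upper))     # 102 becomes 16.375
print(log_transform(readings))                  # order preserved, gap reduced
```

Removing changes the sample size, capping keeps every row but distorts the tail, and transforming preserves the ordering of values; which trade-off is acceptable depends on the downstream analysis.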
Related Terms
- Isolation Forest: ML-based outlier detection
- IQR Method: Quartile-based outlier detection
- Z-Score Method: Standard deviation-based detection