Outlier Detection Process

What is Outlier Detection?

Outlier Detection identifies data points that are significantly different from other observations. Outliers may indicate measurement errors, data entry mistakes, or genuinely unusual cases.

Methods in CMMI-DCC

IQR Method

  • Uses the interquartile range, IQR = Q3 - Q1
  • Outliers: values < Q1 - 1.5 × IQR or > Q3 + 1.5 × IQR
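
A minimal sketch of the IQR rule with NumPy. The function name iqr_outlier_mask, the parameter k, and the sample data are illustrative assumptions, not part of CMMI-DCC. Quartiles are computed with NumPy's default linear interpolation, so other tools may give slightly different fences.

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """Return a boolean mask that is True where a value lies outside the IQR fences."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])  # first and third quartiles
    iqr = q3 - q1
    lower = q1 - k * iqr  # lower fence
    upper = q3 + k * iqr  # upper fence
    return (values < lower) | (values > upper)

# Hypothetical sample: six similar readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 95]
mask = iqr_outlier_mask(data)
print(mask)  # only the last value (95) is flagged
```

For this sample, Q1 = 11 and Q3 = 12.5, so the fences are 8.75 and 14.75.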

Z-Score Method

  • Measures how many standard deviations each value lies from the mean: z = (x - mean) / standard deviation
  • Outliers: |z-score| > 3 (typically)
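
The z-score rule can be sketched as follows. The function name zscore_outlier_mask and the sample data are illustrative assumptions, and the population standard deviation (ddof=0) is a choice made for this example, not something the source specifies.

```python
import numpy as np

def zscore_outlier_mask(values, threshold=3.0):
    """Return a boolean mask that is True where |z-score| exceeds the threshold."""
    values = np.asarray(values, dtype=float)
    std = values.std(ddof=0)  # population standard deviation (an assumption)
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return np.zeros(values.shape, dtype=bool)
    z = (values - values.mean()) / std
    return np.abs(z) > threshold

# Hypothetical sample: 19 identical readings and one extreme value.
data = [10.0] * 19 + [100.0]
mask = zscore_outlier_mask(data)
print(np.flatnonzero(mask))  # only index 19 (the value 100) is flagged
```

With the population standard deviation, a sample of n points cannot produce |z| above sqrt(n - 1), so a threshold of 3 never fires on 10 or fewer points. The IQR method may be a better fit for very small samples.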

Isolation Forest

  • Machine learning-based
  • Isolates points with random splits in an ensemble of trees; points that are isolated in fewer splits are scored as more anomalous
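
A hedged sketch using scikit-learn's IsolationForest. The synthetic data, the seed, and the parameter values are assumptions for illustration; the source does not say which implementation or settings CMMI-DCC uses.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical data: 200 readings near 10 plus one extreme value at 100.
rng = np.random.default_rng(0)
normal_points = rng.normal(loc=10.0, scale=1.0, size=(200, 1))
X = np.vstack([normal_points, [[100.0]]])

# contamination="auto" uses the library's default score threshold.
model = IsolationForest(n_estimators=100, contamination="auto", random_state=0)
labels = model.fit_predict(X)  # -1 marks an outlier, 1 marks an inlier

print(labels[-1])  # label assigned to the extreme value
```

Unlike the IQR and z-score rules, this approach works on several columns at once. With contamination="auto", some points at the edge of the main cluster may also be labeled -1.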

Handling Outliers

  • Remove: Delete outlier rows
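  • Cap/Winsorize: Replace outlier values with the nearest boundary value
  • Transform: Apply a log or other transformation to reduce the influence of extreme values
  • Keep: Retain the values if they are biologically meaningful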
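
The first three options can be sketched with NumPy as shown below. The sample data and the use of the 1.5 × IQR fences as the capping boundaries are assumptions for illustration; a winsorization based on fixed percentiles would use different boundaries.

```python
import numpy as np

# Hypothetical sample with one extreme value.
values = np.array([10, 12, 11, 13, 12, 11, 95], dtype=float)

# Flag outliers with the IQR fences.
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (values < lower) | (values > upper)

# Remove: drop the flagged values.
removed = values[~is_outlier]

# Cap: pull values outside the fences back to the nearest fence.
capped = np.clip(values, lower, upper)

# Transform: log(1 + x) compresses large values; valid here because all values are > -1.
logged = np.log1p(values)

print(removed)
print(capped)
print(logged)
```

Removing shortens the dataset, while capping and transforming keep every row. Which option is appropriate depends on whether the extreme values are errors or genuine observations.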

Related Terms

  • Isolation Forest: ML-based outlier detection
  • IQR Method: Statistical outlier detection
  • Z-Score Method: Standard deviation-based detection