ML Pipeline Process

What is an ML Pipeline?

An ML Pipeline (Machine Learning Pipeline) is an automated workflow that chains data preprocessing, feature engineering, model training, and evaluation into a single repeatable process.

Pipeline Steps in CMMI-DCC

  1. Data Selection: Choose the data types and features to include
  2. Preprocessing: Handle missing values and scale features
  3. Feature Selection: Identify the most important variables
  4. Model Training: Train the selected algorithm
  5. Evaluation: Assess performance using cross-validation
  6. Results: Review feature importance, metrics, and predictions
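
The steps listed above can be sketched in plain Python. This is a minimal illustration using only the standard library, not a description of how CMMI-DCC implements its pipelines. The function names (fit_pipeline, predict, cross_validate, make_toy_data), the mean imputation and standardization, the class-mean feature score, and the nearest-centroid model are all assumptions chosen for brevity.

```python
import random
import statistics


def make_toy_data(n=40, seed=1):
    """Step 1 (hypothetical data): one informative and one noisy feature."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        signal = rng.gauss(4.0 * label, 0.5)  # informative feature
        noise = rng.gauss(0.0, 1.0)           # uninformative feature
        if i % 10 == 0:
            signal = None                     # simulate a missing value
        X.append([signal, noise])
        y.append(label)
    return X, y


def fit_pipeline(X, y, n_features):
    """Steps 2-4: fit preprocessing, pick features, and train on (X, y)."""
    cols = list(zip(*X))
    # Preprocessing: impute missing values (None) with the column mean, then scale.
    means = [statistics.fmean([v for v in col if v is not None]) for col in cols]
    filled = [[m if v is None else v for v in col] for col, m in zip(cols, means)]
    stds = [statistics.pstdev(col) or 1.0 for col in filled]  # 1.0 guards constant columns
    Z = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in zip(*filled)]

    # Feature selection: score each feature by how far apart the class means are.
    classes = sorted(set(y))

    def class_mean(j, c):
        return statistics.fmean([z[j] for z, label in zip(Z, y) if label == c])

    spread = [max(class_mean(j, c) for c in classes)
              - min(class_mean(j, c) for c in classes)
              for j in range(len(cols))]
    selected = sorted(range(len(cols)), key=lambda j: -spread[j])[:n_features]

    # Model training: a nearest-centroid classifier on the selected features.
    centroids = {c: [class_mean(j, c) for j in selected] for c in classes}
    return {"means": means, "stds": stds, "selected": selected,
            "centroids": centroids, "importance": spread}


def predict(model, row):
    """Apply the fitted preprocessing to one row and return the nearest class."""
    z = []
    for j in model["selected"]:
        v = model["means"][j] if row[j] is None else row[j]
        z.append((v - model["means"][j]) / model["stds"][j])

    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(z, model["centroids"][c]))

    return min(model["centroids"], key=dist)


def cross_validate(X, y, n_features, k=5, seed=0):
    """Step 5: k-fold cross-validation; returns the mean accuracy over folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_pipeline([X[i] for i in train], [y[i] for i in train], n_features)
        hits = sum(predict(model, X[i]) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return statistics.fmean(scores)


if __name__ == "__main__":
    X, y = make_toy_data()
    # Step 6: report a metric and the per-feature scores.
    print("cross-validated accuracy:", cross_validate(X, y, n_features=1))
    print("feature scores:", fit_pipeline(X, y, n_features=1)["importance"])
```

In this sketch, preprocessing and feature selection are refit inside each training fold, so the held-out rows never influence the statistics used to score them.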

Creating a Pipeline

Navigate to Analysis -> ML Pipelines -> New ML Pipeline

Pipeline Status

  • Queued: Waiting to start
  • Processing: Currently running
  • Completed: Finished successfully
  • Failed: Encountered an error
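
The four statuses above can be modeled as a small state machine. The sketch below is illustrative only: the transition table (Queued to Processing, then Processing to either Completed or Failed) is an assumption inferred from the status descriptions, and the names PipelineStatus, ALLOWED, advance, and is_finished are hypothetical rather than part of CMMI-DCC.

```python
from enum import Enum


class PipelineStatus(Enum):
    QUEUED = "Queued"
    PROCESSING = "Processing"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Assumed transitions: a run starts Queued, moves to Processing, and ends
# in exactly one of the two terminal states.
ALLOWED = {
    PipelineStatus.QUEUED: {PipelineStatus.PROCESSING},
    PipelineStatus.PROCESSING: {PipelineStatus.COMPLETED, PipelineStatus.FAILED},
    PipelineStatus.COMPLETED: set(),
    PipelineStatus.FAILED: set(),
}


def advance(current, new):
    """Return the new status, or raise if the transition is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new


def is_finished(status):
    """A status is terminal when no further transitions are defined for it."""
    return not ALLOWED[status]
```

A client polling for results would, under these assumptions, keep waiting while is_finished returns False and then branch on Completed versus Failed.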

Related Terms

  • Cross-Validation: Evaluating a model on data held out from training, repeated across several splits
  • Feature Selection: Choosing important variables
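  • Hyperparameter Tuning: Optimizing settings that are fixed before training (such as learning rate or tree depth), as opposed to parameters learned from the data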
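
To show how hyperparameter tuning and cross-validation fit together, here is a small standard-library sketch. It assumes a k-nearest-neighbors classifier whose hyperparameter is the neighbor count k, scored with leave-one-out cross-validation. The function names (knn_predict, loo_accuracy, grid_search) are hypothetical, and CMMI-DCC may tune models differently.

```python
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def loo_accuracy(X, y, k):
    """Leave-one-out cross-validation: each point is held out exactly once."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)


def grid_search(X, y, candidates):
    """Try every candidate k and keep the one with the best CV accuracy."""
    scores = {k: loo_accuracy(X, y, k) for k in candidates}
    best_k = max(scores, key=scores.get)  # ties go to the earliest candidate
    return best_k, scores
```

The value of k is never learned from the data during training; it is chosen from outside by comparing cross-validated scores, which is what distinguishes a hyperparameter from a model parameter.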