Help Center

All Help Topics

Clinical Data

CBC (Complete Blood Count) Abbreviation

Common blood test that measures various components of blood including red blood cells, white blood cells, and platelets.

HCT (Hematocrit) Abbreviation

The percentage of your blood volume that is made up of red blood cells.

HB (Hemoglobin) Abbreviation

The protein in red blood cells that carries oxygen throughout your body.

PLT (Platelet Count) Abbreviation

Number of platelets in a given volume of your blood; platelets help with clotting.

RBC (Red Blood Cell Count) Abbreviation

Number of red blood cells in a given volume of your blood; red blood cells carry oxygen.

WBC (White Blood Cell Count) Abbreviation

Number of white blood cells in a given volume of your blood; white blood cells fight infection.

Reference Range Metric

The expected range of values for a healthy population, used to interpret test results.

CBC Units of Measurement Metric

Standard units used to report Complete Blood Count (CBC) test results.

Reference Range Status Metric

Status flags indicating whether a test result falls within, above, or below the expected reference range.

HLQ (Health and Lifestyle Questionnaire) Abbreviation

Comprehensive questionnaire collecting health and lifestyle information from study participants.

Data Management

Bulk Upload Process

Feature for uploading multiple records or files at once.

CSV Format Data type

Comma-Separated Values file format used for data import and export.

Metadata Data type

Data about data: descriptive information about datasets and samples.

Data Validation Process

Process of checking uploaded data for errors and consistency.

Data Dictionary Process

Catalog of data elements with their definitions, formats, and relationships.

Export Process

Feature to download data from CMMI-DCC for external analysis.

Downloads UI component

Access to pre-generated and previously exported data files.

Master Submissions UI component

Central record of all participant data submissions across studies.

Data Processing

Standard Scaling Method

Standardization technique that rescales each feature to have a mean of 0 and a standard deviation of 1.
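
A minimal sketch of standard scaling in plain Python, for illustration only. The function name and the sample values are hypothetical, and the use of the population standard deviation is an assumption about how the scaling is computed rather than a statement of what CMMI-DCC does internally.

```python
from statistics import mean, pstdev

def standard_scale(values):
    """Rescale values to mean 0 and standard deviation 1 (population SD)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature has no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical hemoglobin readings, used only to demonstrate the call.
hemoglobin = [12.0, 14.0, 16.0]
scaled = standard_scale(hemoglobin)
print(scaled)
```

After scaling, the middle value sits at 0 and the other two are symmetric around it.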

MinMax Scaling Method

Normalization technique that transforms features to a fixed range, typically [0, 1].
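
As a rough sketch, MinMax scaling maps the smallest observed value to the bottom of the target range and the largest to the top. The function name, parameter names, and sample values below are hypothetical.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so min(values) -> new_min and max(values) -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature cannot be stretched; place it at the lower bound.
        return [new_min for _ in values]
    span = new_max - new_min
    return [new_min + (v - lo) / (hi - lo) * span for v in values]

# Hypothetical platelet counts, used only to demonstrate the call.
platelets = [150, 250, 400]
scaled = min_max_scale(platelets)
print(scaled)
```

Because the minimum and maximum define the range, a single extreme value compresses all other values toward one end.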

Iterative Imputer Method

Advanced missing value imputation method that models each feature with missing values as a function of other features.

Mean Imputation Method

Simple missing value imputation method that replaces missing values with the mean of the feature.

Robust Scaling Method

Scaling technique that centers each feature on its median and scales it by its quartile range, making it robust to outliers.
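
One way to sketch robust scaling is to subtract the median and divide by the interquartile range (Q3 minus Q1). The function name and data are hypothetical, and the choice of quartile interpolation (the "inclusive" method of Python's statistics module) is an assumption.

```python
from statistics import median, quantiles

def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    med = median(values)
    if iqr == 0:
        # No spread in the middle half of the data; map every value to 0.
        return [0.0 for _ in values]
    return [(v - med) / iqr for v in values]

# Hypothetical feature with one extreme value.
scaled = robust_scale([1, 2, 3, 4, 100])
print(scaled)
```

The extreme value stays extreme after scaling, but it does not distort the scale applied to the other four values.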

Median Imputation Method

Missing value imputation using the median, more robust to outliers than mean imputation.

Mode Imputation Method

Missing value imputation using the most frequent value, typically for categorical data.
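
The mean, median, and mode imputation methods can be sketched with one small helper. This is an illustration only: the function name, the use of None to mark a missing value, and the sample data are all assumptions.

```python
from statistics import mean, median, mode

def impute(values, strategy="mean"):
    """Replace None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]

# Hypothetical numeric column with one missing value and one extreme value.
wbc = [5.0, None, 6.0, 25.0]
print(impute(wbc, "mean"))
print(impute(wbc, "median"))

# Hypothetical categorical column, where only the mode makes sense.
print(impute(["plasma", None, "urine", "plasma"], "mode"))
```

In the numeric column the extreme value pulls the mean well above the median, which is why median imputation is described as more robust to outliers.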

Outlier Detection Process

Process of identifying data points that differ significantly from the majority of observations.

IQR Method Method

Statistical method for outlier detection using the Interquartile Range.
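
A common form of the IQR method flags values that fall more than k times the IQR below the first quartile or above the third quartile. The sketch below assumes k = 1.5, a widely used convention; the multiplier, function name, and data are assumptions, not settings confirmed for CMMI-DCC.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical measurements with one extreme value.
flagged = iqr_outliers([1, 2, 3, 4, 100])
print(flagged)
```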

Z-Score Method Method

Outlier detection method based on standard deviations from the mean.
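
A minimal sketch of z-score outlier detection: compute how many standard deviations each value lies from the mean and flag values beyond a threshold. The threshold of 3, the use of the population standard deviation, the function name, and the data are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def z_score_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical readings: twelve close to 5.0 and one extreme value.
readings = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 5.1, 4.9, 5.0, 5.1, 4.9, 5.0, 50.0]
flagged = z_score_outliers(readings)
print(flagged)
```

The extreme value inflates both the mean and the standard deviation, so this approach is less sensitive on small samples. With the population standard deviation and ten or fewer values, no point can exceed a z-score of 3.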

Machine Learning

Random Forest Algorithm

Ensemble learning method that builds multiple decision trees and combines their predictions.

XGBoost Algorithm

Optimized gradient boosting library that builds decision trees sequentially, widely used for predictive modeling on tabular data.

Feature Selection Process

Process of selecting a subset of relevant features for model construction.

Isolation Forest Algorithm

Unsupervised anomaly detection algorithm that identifies outliers by isolating observations in random decision trees.

Cross-Validation Method

Technique to evaluate machine learning models by splitting data into training and validation sets multiple times.
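
One common variant is k-fold cross-validation, where each sample is used for validation exactly once. The sketch below only generates the index splits; the function name is hypothetical, and it does not shuffle or stratify, which a real pipeline would often do.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop

folds = list(k_fold_indices(6, 3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

A model would be trained on each training split and scored on the matching validation split, and the scores averaged across folds.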

Gradient Boosting Algorithm

Ensemble machine learning technique that builds models sequentially to correct previous errors.

Logistic Regression Algorithm

Statistical model for binary classification that predicts the probability of an outcome.

SVM (Support Vector Machine) Algorithm

Classification algorithm that finds the optimal hyperplane to separate classes with maximum margin.

KNN (K-Nearest Neighbors) Algorithm

Classification algorithm that predicts based on the majority class of the K closest training examples.

Neural Network Algorithm

Machine learning model inspired by the brain, using interconnected layers of nodes to learn patterns.

ML Pipeline Process

Automated workflow for building, training, and evaluating machine learning models.

Target Variable Metric

The outcome variable that a machine learning model is trained to predict.

Task Type (ML) Metric

The type of machine learning problem: Classification (categories) or Regression (continuous values).

F1 Score Metric

Harmonic mean of precision and recall, balancing both for classification evaluation.

Accuracy Metric

Percentage of correct predictions out of all predictions made.

Precision Metric

Proportion of positive predictions that were actually positive.

Recall Metric

Proportion of actual positives that were correctly identified.
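
Accuracy, precision, recall, and F1 score can all be derived from the same four counts (true and false positives and negatives). The sketch below assumes binary labels with 1 as the positive class; the function name and the labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("need at least one prediction")
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a ratio is undefined (no positives predicted or present).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels: 2 true positives, 1 false positive, 2 false negatives, 3 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Here precision is 2/3 and recall is 1/2, and the F1 score (4/7) falls between them, closer to the lower value.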

MAE (Mean Absolute Error) Abbreviation

Average of absolute differences between predictions and actual values.

MSE (Mean Squared Error) Abbreviation

Average of squared differences between predictions and actual values.

RMSE (Root Mean Squared Error) Abbreviation

Square root of MSE, in the same units as the target variable.
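
MAE, MSE, and RMSE are all computed from the same list of prediction errors. A minimal sketch, with a hypothetical function name and made-up values:

```python
from math import sqrt

def regression_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) for paired actual and predicted values."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    if not errors:
        raise ValueError("need at least one prediction")
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return mae, mse, sqrt(mse)

# Hypothetical values: the errors are -1, 0, and 3.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 10.0]
mae, mse, rmse = regression_errors(actual, predicted)
print(mae, mse, rmse)
```

Squaring gives the single large error more weight, so RMSE comes out higher than MAE for the same predictions.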

ROC-AUC Metric

Area Under the ROC Curve, measuring classifier performance across all thresholds.

Confusion Matrix Metric

Table showing counts of true positives, false positives, true negatives, and false negatives.

Feature Importance Metric

Ranking of how much each feature contributes to model predictions.

Hyperparameter Tuning Process

Process of finding optimal algorithm settings to maximize model performance.

Grid Search Method

Exhaustive search over specified parameter combinations to find optimal settings.

Random Search Method

Random sampling of hyperparameter combinations, often faster than grid search.

Navigation

Global Search UI component

Search feature that finds data across all data types and participants.

Saved Searches UI component

Feature to save and reuse frequent search queries.

Filter Panel UI component

Sidebar interface for filtering data by various criteria.

Omics Data

Metabolomics Data type

Study of small molecules (metabolites) within cells, biofluids, tissues, or organisms.

HMDB ID Identifier

Unique identifier from the Human Metabolome Database for metabolites.

LOD (Limit of Detection) Metric

The lowest quantity of a substance that can be distinguished from the absence of that substance.

Biofluid Data type

Biological fluid sample collected for analysis (e.g., blood, urine, plasma).

Date of Measurement Data type

The date when the sample was measured or analyzed in the laboratory.

Proteomics Data type

Large-scale study of proteins, particularly their structures and functions.

Experimental Method Data type

The analytical technique or procedure used to measure or analyze samples in the laboratory.

UniProt ID Identifier

Unique identifier from the UniProt database for proteins.

Abundance Metric

Relative quantity or concentration of a molecule (protein, metabolite, etc.) measured in omics experiments.

Fold Change Metric

Ratio of molecular abundance between two conditions, indicating up- or down-regulation.
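
A small sketch of the calculation, assuming the ratio is taken as treated over control. Fold change is often also reported on a log2 scale so that increases and decreases are symmetric around 0; whether CMMI-DCC reports raw or log2 values is not stated here, and the function name and values are hypothetical.

```python
from math import log2

def fold_change(treated, control):
    """Return (ratio, log2 ratio) of abundance in treated relative to control."""
    if treated <= 0 or control <= 0:
        raise ValueError("abundances must be positive to form a ratio and its log")
    ratio = treated / control
    return ratio, log2(ratio)

# Hypothetical abundances.
print(fold_change(8.0, 2.0))  # ratio above 1: higher in treated
print(fold_change(1.0, 4.0))  # ratio below 1: lower in treated
```

A four-fold increase and a four-fold decrease give log2 values of +2 and -2, which is easier to compare than 4 and 0.25.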

Sequencing Technology Technology

High-throughput DNA sequencing methods used to determine the order of nucleotides in genetic material.

Alpha Diversity Metric

Measure of microbial diversity within a single sample or community.
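
Alpha diversity can be summarized with several indices. The sketch below uses the Shannon index, one commonly used option; that choice is an assumption, as are the function name and the taxon counts.

```python
from math import log

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one observed read or organism")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * log(p) for p in proportions)

# Hypothetical per-taxon counts for two samples with four taxa each.
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 1]
print(shannon_index(even_sample))
print(shannon_index(skewed_sample))
```

Both samples contain four taxa, but the evenly distributed sample scores higher because no single taxon dominates.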

Sample Type Identifier

Classification of biological samples based on their source, collection method, or biological matrix.

Metagenomics Data type

Genomic analysis of microbial communities directly from environmental or clinical samples.

Microbiome Data type

The community of microorganisms (bacteria, viruses, fungi) living in a specific environment like the gut.

Microbial Diversity Metric

Measure of the variety and abundance of different microbial species in a sample.

16S rRNA Sequencing Technology

Targeted sequencing method for identifying and classifying bacteria based on the 16S ribosomal RNA gene.

WGS (Whole Genome Sequencing) Abbreviation

Complete sequencing of an organism's entire genome, providing comprehensive genetic information.

Instrument (Sequencing) Technology

The specific sequencing hardware used to generate genomic data.

Project & Study

CMMI-DCC Abbreviation

Canadian Microbiome Mapping Initiative - Data Coordination Centre, the central hub for CMMI data.

CMMI ID Identifier

Unique participant identifier used across all CMMI studies to link data from multiple sources.

Study ID Identifier

Study-specific identifier for organizing and grouping participant data.

Cohort Data type

A group of participants sharing common characteristics for research purposes.

BCGP (British Columbia Generations Project) Abbreviation

A longitudinal health study from British Columbia providing clinical and omics data for CMMI research.

Participant ID Identifier

Unique identifier assigned to each study participant, used to link data across different data types.

Sample ID Identifier

Unique identifier for biological samples collected from participants.
