Interactive ML Concept Simulator

Teach Overfitting, Underfitting, and Cross-Validation dynamically.

💡 Classroom Instructions: Click inside the canvas above to plot custom data points! Try plotting a noisy wave pattern, then raise the polynomial degree to watch the curve overfit.

TRAIN SET: MSE (Loss) 0.00 | MAPE 0.0% | R² Score 0.00
TEST SET: MSE (Loss) 0.00 | MAPE 0.0% | R² Score 0.00

📊 Understanding Data Splits & Color Coding

Training Data (Blue Points)

These are the points the model sees during the learning phase. The curve is fitted to these positions and nothing else. As you increase the polynomial degree, notice how the blue line bends more and more sharply to pass closer to each point, which lowers the Train Error score.
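The effect of polynomial degree on the Train Error can be reproduced outside the simulator. The sketch below is only an illustration, not the simulator's own code: it assumes NumPy is available, invents 12 noisy wave points (`x_train`, `y_train`), and uses an ordinary least-squares fit via `numpy.polynomial.Polynomial.fit`. The helper name `train_mse` is hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical "blue" training points: a noisy wave (made-up data for illustration).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 12)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.2, x_train.size)


def train_mse(degree):
    """Fit a least-squares polynomial to the training points and return its training MSE."""
    model = Polynomial.fit(x_train, y_train, deg=degree)
    residuals = y_train - model(x_train)
    return float(np.mean(residuals ** 2))


# Training error can only go down (or stay equal) as the degree grows,
# because each higher-degree model contains the lower-degree ones.
train_errors = {d: train_mse(d) for d in (1, 3, 6, 11)}
for degree, err in train_errors.items():
    print(f"degree {degree:2d}: train MSE = {err:.6f}")
```

With 12 points, a degree-11 polynomial can pass through every one of them, so its training error drops to essentially zero. That number alone says nothing about how the curve behaves between the points.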

Validation / Test Data (Red Points)

These points are excluded from the fitting calculation to simulate "unseen" real-world data. They estimate how well the model generalizes. When overfitting occurs (high polynomial degrees), the curve passes through or very close to the blue points but swings far away from the red points, so the Test / Validation Error rises sharply.
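To make the train/test gap concrete, here is a minimal sketch that computes the three metrics shown in the panel (MSE, MAPE, R²) on both sets. It uses the standard textbook definitions; whether the simulator computes them exactly this way is an assumption. The data, the offset of 2 on the targets (added so MAPE never divides by zero), and the names `noisy_wave` and `evaluate` are all hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial


def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Undefined if y_true contains zeros."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


rng = np.random.default_rng(1)


def noisy_wave(x):
    # Made-up data: a wave shifted up by 2 so targets stay away from zero (keeps MAPE defined).
    return 2.0 + np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, x.size)


x_train = np.linspace(0.0, 1.0, 12)    # "blue" points, used for fitting
y_train = noisy_wave(x_train)
x_test = np.linspace(0.04, 0.96, 9)    # "red" points, never used for fitting
y_test = noisy_wave(x_test)


def evaluate(degree):
    """Fit on the training points only, then score the same model on both sets."""
    model = Polynomial.fit(x_train, y_train, deg=degree)
    report = {}
    for name, x, y in (("train", x_train, y_train), ("test", x_test, y_test)):
        pred = model(x)
        report[name] = {"mse": mse(y, pred), "mape": mape(y, pred), "r2": r2_score(y, pred)}
    return report


for degree in (3, 11):
    for name, scores in evaluate(degree).items():
        print(f"degree {degree:2d} {name:5s} "
              f"MSE={scores['mse']:.4f} MAPE={scores['mape']:.1f}% R2={scores['r2']:.3f}")
```

At degree 11 the training MSE is effectively zero because the curve interpolates all 12 blue points, while the test MSE stays well above it. Comparing the two rows for each degree is the same reading the TRAIN SET and TEST SET panels are meant to support.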

🔬 Dynamic Cross-Validation Logic: When 5-Fold Cross-Validation is enabled, each point on the graph displays the number of the fold it has been assigned to. Moving the Active Testing Fold slider makes the chosen fold the Red Validation set and turns the points in all other folds into Blue Training data. Stepping the slider through all five folds shows how every sample gets a turn to act as validation data!
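The fold logic described above can be sketched in a few lines. This is an illustrative reconstruction, not the simulator's source: fold numbers running from 1 to 5, the shuffled assignment, and the names `assign_folds` and `split` are assumptions.

```python
import numpy as np

N_FOLDS = 5


def assign_folds(n_points, n_folds=N_FOLDS, seed=0):
    """Give every point a fold number in 1..n_folds, as evenly as possible, in shuffled order."""
    rng = np.random.default_rng(seed)
    folds = np.arange(n_points) % n_folds + 1
    rng.shuffle(folds)
    return folds


def split(folds, active_fold):
    """Return (train_mask, val_mask): the active fold is red (validation), the rest blue (training)."""
    val_mask = folds == active_fold
    return ~val_mask, val_mask


folds = assign_folds(20)
times_validated = np.zeros(folds.size, dtype=int)

# Moving the slider from 1 to 5 corresponds to one pass of this loop per position.
for active_fold in range(1, N_FOLDS + 1):
    train_mask, val_mask = split(folds, active_fold)
    times_validated += val_mask
    print(f"active fold {active_fold}: {train_mask.sum()} training points, "
          f"{val_mask.sum()} validation points")
```

After the loop, every entry of `times_validated` is 1: each point served as validation data exactly once and as training data in the other four rounds. Averaging a validation metric over the five rounds gives the usual cross-validated estimate.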