Train a real neural network in your browser — watch it learn to separate data point by point.
A neural network is, at its core, a collection of numbers (weights) organized into layers. During training, each weight is adjusted slightly to shrink the gap between the network's prediction and the correct answer. Repeat thousands of times and the network "learns" the pattern.
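To make that concrete, here is a minimal sketch in plain Python (not the code behind this demo): a "network" with a single weight `w`, where prediction = `w * x`. Each training pass nudges `w` against the gradient of the squared error, and after many repetitions `w` settles on the pattern in the data (here, y = 2x).

```python
# A one-weight "network": prediction = w * x.
# Training nudges w to shrink the gap between prediction and target.
def train(pairs, w=0.0, lr=0.1, epochs=100):
    for _ in range(epochs):
        for x, y in pairs:
            pred = w * x
            grad = 2 * (pred - y) * x   # gradient of (pred - y)**2 w.r.t. w
            w -= lr * grad              # small nudge toward lower error
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x
w = train(data)
print(round(w, 3))  # → 2.0
```

A real network has thousands of such weights, but every one of them is updated by this same nudge-against-the-gradient rule.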
Backpropagation computes how much each weight contributed to the error, then nudges it in the direction that reduces the loss. The learning rate controls how big each nudge is — too high and it overshoots, too low and it's slow.
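You can see the learning-rate trade-off directly with gradient descent on the simplest possible loss, loss(w) = w², whose gradient is 2w. This is an illustrative sketch, not the demo's code:

```python
# Gradient descent on loss(w) = w**2 from the same start, three learning rates.
def descend(lr, w=1.0, steps=20):
    for _ in range(steps):
        w -= lr * 2 * w   # nudge w against the gradient 2w
    return w

print(descend(0.01))  # too low: barely moved toward the minimum at 0
print(descend(0.4))   # reasonable: essentially at 0 after 20 steps
print(descend(1.1))   # too high: each step overshoots, and w blows up
```

Each step multiplies `w` by (1 − 2·lr), so a small rate shrinks it slowly, a moderate rate shrinks it fast, and a rate above 1 flips the sign past the minimum and grows the error instead.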
Set the network to 2 hidden neurons and train on the XOR dataset. Can it solve XOR? Now try 4 neurons — what changes? Why does XOR require at least one hidden layer?
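As a hint for the last question: XOR is not linearly separable, so no single neuron drawing one straight line can compute it, but a hidden layer can. Here is a hand-wired sketch (weights chosen by hand, not learned) showing one way two hidden units crack it: one detects OR, one detects AND, and the output fires for "OR but not AND".

```python
# XOR via one hidden layer with hand-set weights and step activations.
def step(z):
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)   # hidden unit 1: fires for OR(x1, x2)
    h2 = step(x1 + x2 - 1.5)   # hidden unit 2: fires for AND(x1, x2)
    return step(h1 - h2 - 0.5) # output: OR and not AND = XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", xor_net(a, b))  # → 0, 1, 1, 0
```

Training has to discover a division of labor like this on its own, which is why extra hidden neurons often help: more units mean more chances to land on a working split.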