AI & ML Algorithm Calculator with Step-by-Step Math

Learn the theory, plug in your data, and watch the full manual calculation exactly how your exam expects it.
Built by a student, for students.

Show full manual working
Verify exam answers instantly
Covers 10+ core CS syllabi
Adversarial Search

Alpha-Beta Pruning

Alpha-Beta Pruning is an optimization of the Minimax algorithm that eliminates the need to evaluate every node in a game tree. Imagine playing chess and spotting a candidate move that immediately loses your Queen — a grandmaster stops analyzing that line instantly, because no follow-up move can make it worthwhile. Alpha-Beta formalizes that intuition: it maintains two bounds, α\alpha (the best score MAX can guarantee) and β\beta (the best score MIN can guarantee), and prunes entire subtrees the moment those bounds cross — delivering the identical optimal decision as full Minimax while evaluating a fraction of the nodes.

Adversarial Search

Minimax Algorithm

Minimax is a decision-making algorithm used in two-player, zero-sum games like Chess and Tic-Tac-Toe to determine the provably optimal move. Unlike a reckless gambler hoping the opponent blunders, Minimax assumes the opponent plays perfectly — every single time. It recursively maps every possible future game state into a tree, then propagates scores backward from the terminal outcomes: the MAXMAX player selects the branch with the highest guaranteed score, while the MINMIN player selects the branch that minimizes it, until the best move at the root is mathematically certain.

Pathfinding Search

A* Search Algorithm

AA^* (pronounced A-star) is the gold standard pathfinding algorithm in Artificial Intelligence, guaranteeing the shortest path between two points while remaining dramatically more efficient than exhaustive search. Where a GPS routing a delivery truck must balance roads already driven against estimated distance to the destination, AA^* formalizes exactly that tradeoff: it evaluates every candidate node using f(n)=g(n)+h(n)f(n) = g(n) + h(n), where g(n)g(n) is the exact cost from the start and h(n)h(n) is an admissible heuristic estimate to the goal. By expanding only the most promising nodes and permanently closing exhausted ones, AA^* provably finds the optimal path without exploring the entire graph.

Uninformed Search

Uniform Cost Search (Dijkstra's Algorithm)

Uniform Cost Search (UCS) — widely known as Dijkstra's Algorithm — is the foundational cost-optimal search algorithm in Artificial Intelligence, guaranteeing the cheapest path between two nodes in a weighted graph. Where BFS counts hops, UCS reads the price tag on every road: a direct two-mile shortcut and a scenic fifty-mile detour are not equal, and UCS treats them accordingly. By expanding nodes in strict order of their accumulated path cost using a priority queue, UCS systematically eliminates every cheaper alternative before committing to a solution, making it the definitive algorithm when edge weights represent real-world costs like distance, time, or fuel.

Pathfinding Search

Greedy Best-First Search

Greedy Best-First Search is a pathfinding algorithm that navigates toward a goal using only a heuristic estimate of remaining distance, completely ignoring the cost of the path already traveled. Imagine navigating a city by always turning toward the skyline landmark you're heading for — it feels intuitive until a river blocks your direct route and you realize you've walked miles in the wrong direction. By exclusively minimizing h(n)h(n), the estimated distance to the goal, Greedy Search expands nodes faster than any cost-aware algorithm, but its myopia makes it vulnerable to dead ends and suboptimal paths that a balanced algorithm like AA^* would have avoided entirely.

Machine Learning

Evaluation Metrics (Confusion Matrix)

A Confusion Matrix is the definitive evaluation framework for classification models in Machine Learning, transforming raw predictions into a structured breakdown of exactly where a model succeeds and where it fails. A model that correctly classifies 90% of patients as healthy sounds impressive — until you discover it missed every single cancer case, a failure mode raw accuracy completely conceals. By organizing predictions into a 2×22 \times 2 grid of True Positives, True Negatives, False Positives, and False Negatives, the matrix unlocks four precise metrics — Accuracy, Precision, Recall, and F1 Score — each exposing a different dimension of model performance that no single number can capture alone.

Distance-Based

K-Nearest Neighbors (KNN)

K-Nearest Neighbors (KNN) is a supervised classification algorithm built on a deceptively simple premise: a data point is most likely to belong to the same class as the points surrounding it. A real estate appraiser estimating a neighborhood's character doesn't build a complex model — they look at the nearest comparable properties and let proximity speak for itself. KNN formalizes that intuition by computing the distance between a new data point and every training example, then assigning the majority class among the kk closest neighbors — requiring no training phase, no learned parameters, and no assumptions about the underlying data distribution.

Tree-Based

Decision Tree (ID3)

A Decision Tree is a supervised learning algorithm that makes predictions by recursively partitioning a dataset into increasingly pure subsets, producing a human-readable tree of if-then rules. Imagine a doctor diagnosing a patient by asking the most discriminating question first — fever or no fever, then chest pain or not — narrowing the possibilities with each answer until a diagnosis is certain. The ID3 (Iterative Dichotomiser 3) variant formalizes that intuition using Entropy to measure impurity at each node and Information Gain to select the feature that produces the cleanest split, guaranteeing the most informative question is always asked first.

Tree-Based

Random Forest Classifier

Random Forest is an ensemble learning algorithm that constructs a collection of Decision Trees and aggregates their predictions through majority voting, producing a classifier far more robust than any individual tree could achieve alone. A single financial analyst studying one quarter of data might confidently recommend a bad investment — but a committee of analysts, each independently reviewing different data samples and metrics, collectively arrives at a far more reliable verdict. Random Forest formalizes that wisdom by training each tree on a bootstrap sample of the training data and restricting every node split to a random subset of p\sqrt{p} features, deliberately introducing diversity that cancels out individual errors and eliminates the overfitting that makes standalone Decision Trees unreliable on unseen data.

Clustering

K-Means Clustering

K-Means Clustering is the foundational unsupervised learning algorithm in Machine Learning, discovering hidden structure in unlabeled data by partitioning it into kk distinct groups without any target labels to guide it. A retail company analyzing millions of anonymous purchase records has no predefined customer categories — K-Means finds them anyway, grouping buyers by behavior rather than by any label a human assigned. The algorithm iteratively assigns each data point to its nearest centroid, then recomputes each centroid as the mean of its assigned points, repeating until the assignments stabilize — producing compact, well-separated clusters used everywhere from customer segmentation to image compression to anomaly detection.

Probabilistic

Naive Bayes Classifier

Naive Bayes is a probabilistic classification algorithm grounded in Bayes' Theorem, predicting the most likely class for a data point by computing the posterior probability of each class given the observed features. A spam filter trained on millions of emails doesn't analyze sentence structure or sender intent — it simply multiplies the probability of each individual word appearing in spam, and that blunt calculation proves remarkably effective. The "naive" assumption that all features are conditionally independent given the class label is statistically aggressive and rarely true in practice, yet Naive Bayes consistently delivers fast, accurate classifications in high-dimensional domains like text analysis, medical diagnosis, and real-time content filtering precisely because that simplification makes the math tractable.

Association Rules

Apriori Algorithm

The Apriori algorithm is the foundational technique in association rule mining, discovering hidden co-occurrence patterns in transactional datasets that drive recommendation systems, market basket analysis, and medical diagnosis. When a supermarket notices that customers who buy nappies on Friday evenings also buy beer at a rate far above chance, that insight didn't come from intuition — it came from Apriori systematically mining thousands of receipts for statistically significant item combinations. The algorithm exploits a single elegant principle — that every subset of a frequent itemset must itself be frequent — to aggressively prune the search space at each pass, making it computationally feasible to extract meaningful association rules from datasets containing thousands of distinct items.

Evolutionary

Genetic Algorithm (One-Max)

The Genetic Algorithm is a population-based optimization technique inspired by Darwinian natural selection, evolving candidate solutions toward an optimal answer through successive generations of selection, crossover, and mutation. Where calculus-based optimizers require a smooth, differentiable landscape to navigate, a Genetic Algorithm treats the search space like an ecosystem — brutal, competitive, and indifferent to mathematical convenience. The One-Max problem is the canonical entry point: starting from a population of random binary strings, the algorithm scores each string by counting its 1s, selects the fittest survivors as parents, recombines their bits through crossover, and introduces random mutations — repeating until natural selection converges the entire population to a string of all 1s.

Evolutionary

Genetic Algorithm (Knapsack)

The Genetic Knapsack problem takes the basic Genetic Algorithm and applies it to a real-world scenario where you have to make the best possible choices under a strict limit. A military logistics officer packing a supply helicopter faces the same fundamental problem — maximum mission value, strict weight ceiling, and far too many possible combinations to evaluate exhaustively. The Genetic Algorithm encodes each candidate packing decision as a binary chromosome where each bit represents whether an item is included or excluded, then evolves the population through selection, crossover, and mutation while enforcing the weight constraint as a fitness penalty — producing near-optimal solutions to a problem that belongs to the computationally intractable class of NP-hard combinatorics.

Regression

K-Nearest Neighbors (KNN) Regression

K-Nearest Neighbors Regression is a non-parametric algorithm that predicts continuous numerical values by averaging the outcomes of the most similar training examples, rather than voting on a class label. Where KNN Classification answers "which category does this belong to?", KNN Regression answers "what number should this be?" — predicting a house's sale price from its square footage and bedroom count by finding the kk most similar homes and averaging what they actually sold for. Because it fits no global equation and makes no distributional assumptions, KNN Regression adapts naturally to complex, nonlinear relationships that a straight regression line would fundamentally misrepresent.

Regression

Linear Regression

Linear Regression is the foundational supervised learning algorithm for predicting continuous numerical outcomes, modeling the relationship between one or more input variables and a target value as a straight line. A professor estimating a student's final exam score from hours studied isn't guessing — they're applying the same core logic: more hours studied predicts a higher score, and that relationship can be quantified precisely. By minimizing the sum of squared residuals between predicted and actual values, Linear Regression finds the unique best-fit line through the data, producing a mathematical equation that extrapolates reliably to new inputs within the range of the training distribution.

Regression

Multiple Linear Regression

Multiple Linear Regression extends simple linear regression to model a target variable as a linear combination of two or more input features simultaneously, fitting a hyperplane through multi-dimensional data rather than a line through two-dimensional points. Predicting a house's sale price from square footage alone ignores half the picture — the neighborhood, the age of the building, the number of bathrooms each independently shift the outcome, and Multiple Linear Regression quantifies every contribution at once. By solving for all coefficients simultaneously using the Normal Equation — a single matrix operation of the form β=(XTX)1XTy\beta = (X^TX)^{-1}X^Ty — the algorithm finds the unique hyperplane that minimizes total prediction error across every feature in the dataset.

Uninformed Search

Breadth-First Search (BFS)

Breadth-First Search (BFS) is the foundational uninformed search algorithm in Artificial Intelligence, exploring a graph level by level until the target is found. Imagine a drop of water hitting a pond — the ripples expand outward perfectly evenly in all directions, reaching every point one step away before any point two steps away. BFS formalizes that expansion using a FIFO queue, guaranteeing it discovers the shallowest path to the goal first, making it the definitive algorithm for finding shortest paths in unweighted graphs and mazes where no heuristic information is available.

Uninformed Search

Depth-First Search (DFS)

Depth-First Search (DFS) is an uninformed search algorithm that explores a graph by committing fully to one path before considering any alternative. Where BFS spreads outward like ripples on a pond, DFS behaves like a maze-runner who keeps one hand on the wall — plunging as deeply as possible down a single corridor until hitting a dead end, then backtracking to the nearest unexplored junction. By using a stack rather than a queue, DFS trades the path-length guarantees of BFS for dramatic memory efficiency, since it only needs to remember the nodes along the current active path rather than every node on every frontier level.