A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Convolutional Neural Networks

Graduate · Depth 95 in the knowledge graph

27 topics build on this

655 prerequisites beneath it

Core Idea

CNNs exploit spatial structure by using convolutional layers that learn small local filters. Pooling reduces the spatial dimensions while keeping the strongest feature responses. Sharing filter weights across positions cuts the parameter count and makes the layers translation equivariant. CNNs remain a standard choice for computer vision tasks.

Explainer

From backpropagation, you know how to train a fully connected neural network by computing gradients of a loss function with respect to every weight. Now imagine feeding a 256×256 color image into such a network. The input has 256 × 256 × 3 = 196,608 values. If the first hidden layer has just 1,000 neurons, that is nearly 200 million weights in a single layer, far too many to train effectively, and the network would have no built-in notion that nearby pixels are more related than distant ones. Convolutional neural networks address both problems by replacing full connections with small, sliding filters that exploit the spatial structure of images.
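The arithmetic above can be checked with a short sketch. The function names are hypothetical, biases are ignored, and the choice of 64 filters of size 3×3 for the convolutional layer is an illustrative assumption, not a value from the text.

```python
def dense_weights(height, width, channels, hidden_units):
    """Weights in a fully connected layer that reads the flattened image (biases ignored)."""
    return height * width * channels * hidden_units


def conv_weights(kernel_size, in_channels, num_filters):
    """Weights in a convolutional layer with square kernels (biases ignored)."""
    return kernel_size * kernel_size * in_channels * num_filters


# Fully connected: 256x256 RGB image into 1,000 hidden neurons.
dense = dense_weights(256, 256, 3, 1000)

# Convolutional: 64 filters of size 3x3 over 3 input channels (assumed sizes).
conv = conv_weights(3, 3, 64)

print(dense)  # 196608000
print(conv)   # 1728
```

The convolutional count does not depend on the image size at all, because the same small filters are reused at every position.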

A convolutional layer applies a small filter (typically 3×3 or 5×5 pixels, spanning all input channels) that slides across the entire input, computing a dot product at each position. This produces a feature map: a 2D output where each value indicates how strongly that local patch of the image matches the filter's pattern. A single layer applies many such filters in parallel, each learning to detect a different feature. In early layers, filters typically learn edges, corners, and color gradients. In deeper layers, they compose these into textures, parts (like eyes or wheels), and eventually whole objects. The critical insight is weight sharing: the same filter with the same weights is applied at every spatial position. This means the network uses the same detector everywhere, dramatically reducing the number of parameters and making the layer translation equivariant. With a stride of 1, if a cat's ear moves 50 pixels to the right in the image, the corresponding activation in the feature map also shifts by 50 pixels.
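A minimal pure-Python sketch of the sliding-filter operation may make this concrete. It assumes a single-channel image, no padding, and a stride of 1, and it computes cross-correlation (no kernel flip), which is what deep learning libraries conventionally call convolution. The edge filter and the test image are hypothetical examples, not learned values.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and take a dot product at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hand-written vertical-edge detector (a trained network would learn its own weights).
edge_filter = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]


def image_with_bar(col, size=8):
    """size x size image of zeros with a bright vertical bar in column `col`."""
    return [[1 if j == col else 0 for j in range(size)] for _ in range(size)]


fmap_a = conv2d(image_with_bar(2), edge_filter)
fmap_b = conv2d(image_with_bar(4), edge_filter)  # same bar, moved 2 pixels right

print(fmap_a[0])  # [3, 0, -3, 0, 0, 0]
print(fmap_b[0])  # [0, 0, 3, 0, -3, 0]
```

Moving the bar two pixels to the right moves the response pattern two positions to the right in the feature map, which is the translation equivariance described above.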

Pooling layers (typically max pooling) follow convolutional layers and reduce the spatial dimensions by summarizing small regions, for example by taking the maximum value in each 2×2 block. This serves two purposes: it reduces the computational cost for subsequent layers, and it introduces a degree of translation invariance, because a small shift that keeps a strong activation inside the same pooling block produces the same pooled output. The combination of convolution (detecting local features with shared weights) followed by pooling (compressing spatial resolution) is repeated several times, creating a hierarchy of increasingly abstract representations. In a classic classification CNN, the final feature maps are flattened and fed into one or more fully connected layers that produce the classification output.
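Max pooling can be sketched in a few lines. This is an illustrative version that assumes non-overlapping blocks (stride equal to the block size) and drops any trailing rows or columns that do not fill a block; the feature maps are made-up examples.

```python
def max_pool2d(fmap, size=2):
    """Take the maximum over each non-overlapping size x size block of a 2D feature map."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A single strong activation at row 1, column 1.
original = [[0, 0, 0, 0],
            [0, 9, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]

# Same activation moved to (0, 0): still inside the top-left 2x2 block.
shifted_within_block = [[9, 0, 0, 0],
                        [0, 0, 0, 0],
                        [0, 0, 0, 0],
                        [0, 0, 0, 0]]

# Same activation moved to (1, 2): now in the top-right 2x2 block.
shifted_across_blocks = [[0, 0, 0, 0],
                         [0, 0, 9, 0],
                         [0, 0, 0, 0],
                         [0, 0, 0, 0]]

print(max_pool2d(original))               # [[9, 0], [0, 0]]
print(max_pool2d(shifted_within_block))   # [[9, 0], [0, 0]]
print(max_pool2d(shifted_across_blocks))  # [[0, 9], [0, 0]]
```

The second case shows the invariance to small shifts, and the third shows its limit: once the activation crosses a block boundary, the pooled output changes.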

Training a CNN uses the same backpropagation algorithm you already know, but the gradient computation is adapted for the convolution operation. Because weights are shared across all spatial positions, the gradient for each filter weight is the sum of gradients from every position where that filter was applied. Weight sharing makes CNNs far more parameter-efficient than fully connected networks on the same input, and in practice usually easier to train. Well-known architectures such as ResNet, VGG, and EfficientNet are variations on this theme, adding skip connections, deeper stacks, and architecture search, respectively. The core principle remains unchanged: by building spatial locality and weight sharing into the network's structure, CNNs encode a powerful inductive bias (the assumption that the same local patterns are relevant regardless of where they appear) that makes them highly effective for images, video, audio spectrograms, and other data with grid-like structure.
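The summed gradient for a shared weight can be illustrated with a one-dimensional sketch. It assumes a single filter, no padding, a stride of 1, and a simple squared-output loss chosen only for illustration; all names and numbers are hypothetical. The analytic gradient is compared against a finite-difference estimate as a sanity check.

```python
def conv1d(x, w):
    """No-padding, stride-1 cross-correlation: y[i] = sum_k w[k] * x[i + k]."""
    n_out = len(x) - len(w) + 1
    return [sum(w[k] * x[i + k] for k in range(len(w))) for i in range(n_out)]


def loss(x, w):
    """Illustrative loss: half the sum of squared outputs."""
    return 0.5 * sum(y * y for y in conv1d(x, w))


def filter_gradient(x, w):
    """dL/dw[k] is the sum, over every output position i, of dL/dy[i] * x[i + k]."""
    y = conv1d(x, w)
    upstream = y  # for this loss, dL/dy[i] = y[i]
    return [
        sum(upstream[i] * x[i + k] for i in range(len(y)))
        for k in range(len(w))
    ]


def numerical_gradient(x, w, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grads = []
    for k in range(len(w)):
        w_plus, w_minus = list(w), list(w)
        w_plus[k] += eps
        w_minus[k] -= eps
        grads.append((loss(x, w_plus) - loss(x, w_minus)) / (2 * eps))
    return grads


x = [1.0, 2.0, 0.0, -1.0, 3.0]
w = [0.5, -1.0, 2.0]

analytic = filter_gradient(x, w)
numeric = numerical_gradient(x, w)

print(analytic)  # [-3.5, -10.0, 22.0]
```

Each of the three filter weights receives contributions from all three output positions, which is the only change from the per-weight gradient of a fully connected layer.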

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Linear Regression in Machine Learning → Neural Network Fundamentals → Backpropagation Algorithm → Multilayer Perceptrons (MLPs) → Activation Functions in Neural Networks → Convolutional Neural Networks

Longest path: 96 steps · 655 total prerequisite topics

Prerequisites (6)

Backpropagation Algorithm (hard)
Matrix Multiplication (soft)
Partial Derivatives: Definition and Computation (soft)
Matrix Operations (soft)
Activation Functions in Neural Networks (soft)
Graph Neural Networks (soft)

Leads To (6)

Capsule Networks (hard)
Deep Learning for Signal Processing (hard)
Deep Q-Networks (DQN) (hard)
Object Detection Networks (hard)
Semantic Segmentation (hard)
Transfer Learning in Neural Networks (soft)