What is the primary role of an activation function in a neural network neuron?
A. To normalize the weighted sum so it always falls between 0 and 1
B. To introduce non-linearity, allowing the network to learn non-linear relationships
C. To initialize the weights before training begins
D. To compute the gradient during backpropagation
Answer: B
Without a non-linear activation function, stacking multiple layers of neurons produces only a linear (affine) transformation, equivalent to a single linear layer regardless of depth. A non-linear activation function (e.g., ReLU, sigmoid, tanh) breaks this linearity, enabling the network to represent and learn complex, non-linear mappings from inputs to outputs.
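As a rough illustration of what "non-linear" means here, the sketch below checks the additivity property f(a + b) = f(a) + f(b), which every linear map satisfies. The helper name is_additive and the test values are assumptions chosen for illustration; a single failing pair is enough to show a function is not linear.

```python
import math

def relu(z):
    # ReLU: passes positive values through, clips negatives to 0
    return max(0.0, z)

def sigmoid(z):
    # Logistic sigmoid: maps any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def is_additive(f, a, b, tol=1e-9):
    # Hypothetical helper: checks f(a + b) == f(a) + f(b) for one pair (a, b)
    return abs(f(a + b) - (f(a) + f(b))) < tol

a, b = -2.0, 3.0  # illustrative values, not from the quiz

# A linear function passes the check for this pair...
linear_ok = is_additive(lambda z: 4.0 * z, a, b)

# ...while the three activations named above all fail it.
relu_ok = is_additive(relu, a, b)
sigmoid_ok = is_additive(sigmoid, a, b)
tanh_ok = is_additive(math.tanh, a, b)

print(linear_ok, relu_ok, sigmoid_ok, tanh_ok)
```

For example, relu(-2 + 3) is 1, but relu(-2) + relu(3) is 3, so ReLU cannot be written as a linear map.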
Question 2 True / False
A deep neural network with many layers but only linear activation functions can approximate any continuous function, given enough neurons.
T. True
F. False
Answer: False
The composition of any number of linear functions is itself a linear function. No matter how many layers you stack, a purely linear network collapses to a single matrix multiplication (and bias addition). The universal approximation theorem requires non-linear activations. Depth without non-linearity adds no representational power.
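The collapse described above can be checked numerically. The following is a minimal sketch in plain Python, assuming two small layers with made-up weights (W1, b1, W2, b2 are illustrative values, and matvec, matmul, and affine are hypothetical helpers). With no activation between the layers, the result matches a single layer with W = W2·W1 and b = W2·b1 + b2.

```python
def matvec(W, x):
    # Matrix-vector product: one dot product per row of W
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    # Matrix-matrix product A @ B using the columns of B
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def affine(W, b, x):
    # One layer without activation: W x + b
    return [z + bi for z, bi in zip(matvec(W, x), b)]

def relu_vec(v):
    # Element-wise ReLU
    return [max(0.0, z) for z in v]

# Illustrative weights (assumptions, not from the quiz)
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]   # layer 1: 2 inputs -> 3 units
b1 = [0.5, -1.0, 2.0]
W2 = [[1.0, 0.0, -2.0], [2.0, 1.0, 1.0]]     # layer 2: 3 units -> 2 outputs
b2 = [1.0, 0.0]
x = [2.0, -3.0]

# Two stacked layers with linear (identity) activation
two_layer = affine(W2, b2, affine(W1, b1, x))

# The same map collapsed into a single layer
W = matmul(W2, W1)
b = [z + bi for z, bi in zip(matvec(W2, b1), b2)]
one_layer = affine(W, b, x)

# Inserting ReLU between the layers: the collapsed layer no longer matches
with_relu = affine(W2, b2, relu_vec(affine(W1, b1, x)))

print(two_layer, one_layer, with_relu)
```

For this input the two linear versions agree exactly, while the ReLU version differs, which is consistent with the claim that depth only adds representational power when a non-linearity sits between the layers.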
Question 3 Short Answer
In a single neuron, what computation is performed before the activation function is applied?
Model answer: The neuron computes a weighted sum of its inputs plus a bias term: z = w₁x₁ + w₂x₂ + ... + wₙxₙ + b. Equivalently, z = w · x + b, the dot product of the weight vector w and the input vector x, shifted by b. The activation function f is then applied to z to produce the neuron's output, f(z).
The weights determine how much each input contributes to the pre-activation value z; the bias shifts the activation threshold. Viewing a neuron as a linear model (as in linear regression) followed by a non-linear activation function helps connect neural networks to the linear models they build on.
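The computation in the model answer can be sketched in a few lines of plain Python. The function name neuron and the numeric values are assumptions chosen for illustration; sigmoid is used as the activation f, though any of the activations mentioned earlier would fit the same shape.

```python
import math

def sigmoid(z):
    # Logistic sigmoid activation
    return 1.0 / (1.0 + math.exp(-z))

def neuron(weights, inputs, bias, activation):
    # Pre-activation: z = w1*x1 + w2*x2 + ... + wn*xn + b
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Output: the activation function applied to z
    return z, activation(z)

# Illustrative values (assumptions, not from the quiz)
weights = [0.5, -1.0, 2.0]
inputs = [2.0, 1.0, 0.5]
bias = -1.0

z, out = neuron(weights, inputs, bias, sigmoid)
print(z, out)
```

Here the weighted sum is 1.0 - 1.0 + 1.0 = 1.0, and adding the bias of -1.0 gives z = 0.0, so the sigmoid output is 0.5.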