The mutual information I(S; R) between stimulus S and neural response R measures how much information the response conveys about the stimulus. Why is I(S; R) = 0 not the same as independence in the usual sense?
A. It is the same; I(S; R) = 0 implies independence
B. I(S; R) = 0 means the response provides zero information about the stimulus, but this is different from independence if the response depends deterministically on other variables
C. The concepts are the same, just different terminology
D. I(S; R) measures only linear relationships
I(S; R) = 0 means the response R provides zero information about the stimulus S. This is exactly statistical independence: I(S; R) = D_KL(p(s,r) || p(s)p(r)), which is zero if and only if p(s,r) = p(s)p(r). Independence from S does not mean R is unstructured, however. R can still depend on other variables (e.g., internal noise, behavioral state) that are unrelated to S. A neuron's response could be highly structured (deterministic firing patterns) yet carry zero information about the stimulus if that structure reflects internal dynamics rather than stimulus-driven changes. In practice, I(S; R) > 0 for sensory neurons, and its magnitude reflects the fidelity of the neural code.
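The equivalence between I(S; R) = 0 and p(s,r) = p(s)p(r) can be checked numerically. The following is a minimal sketch, not a method from the quiz itself: the two joint tables and the function name `mutual_information` are made up for illustration, with two stimuli and two response levels each.

```python
import math

def mutual_information(joint):
    """I(S;R) in bits from a joint table where joint[s][r] = p(s, r)."""
    p_s = [sum(row) for row in joint]           # marginal over responses
    p_r = [sum(col) for col in zip(*joint)]     # marginal over stimuli
    info = 0.0
    for s, row in enumerate(joint):
        for r, p_sr in enumerate(row):
            if p_sr > 0:                        # 0 * log(0) is taken as 0
                info += p_sr * math.log2(p_sr / (p_s[s] * p_r[r]))
    return info

# Hypothetical case 1: the response distribution (20% / 80%) is the same
# for both stimuli, so the joint factorizes into p(s) * p(r).
independent = [[0.5 * 0.2, 0.5 * 0.8],
               [0.5 * 0.2, 0.5 * 0.8]]

# Hypothetical case 2: the response matches the stimulus 90% of the time.
informative = [[0.45, 0.05],
               [0.05, 0.45]]

i_indep = mutual_information(independent)
i_inform = mutual_information(informative)
print(f"factorized joint:   I = {i_indep:.4f} bits")
print(f"stimulus-dependent: I = {i_inform:.4f} bits")
```

In the first table the response is far from uniform (it has structure), yet it carries no information about the stimulus because that structure is the same for every stimulus.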
Question 2 True / False
Fisher information quantifies the precision of neural encoding of a parameter theta, while mutual information quantifies information content. Are these concepts measuring different things?
T. True
F. False
Answer: True
Fisher information F(theta) measures how sensitively the response distribution changes with theta: F(theta) = E[(d/d_theta log p(r|theta))^2]. This is the expected squared slope (score) of the log-likelihood, which under standard regularity conditions equals its expected negative curvature, -E[d^2/d_theta^2 log p(r|theta)]. A sharply peaked log-likelihood means small changes in theta are detectable. The Cramer-Rao bound states that the variance of any unbiased estimator of theta is at least 1/F(theta). Mutual information I(S; R) quantifies the total information (in bits) about the stimulus conveyed by the response. Both are relevant: Fisher information captures local precision (how sharply the neuron distinguishes nearby stimuli), while mutual information captures global information (averaged over the entire stimulus distribution). For Gaussian encodings, the two are related, but for highly non-Gaussian responses (threshold effects, saturation), they can differ substantially.
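To make the definition concrete, the sketch below evaluates F(theta) = E[(d/d_theta log p(r|theta))^2] for a model neuron. Everything about the model is an assumption chosen for illustration: Poisson spike counts, a Gaussian tuning curve with made-up parameters, and the helper names. For Poisson counts with mean f(theta), the expectation reduces to f'(theta)^2 / f(theta), which the code uses as a cross-check.

```python
import math

def tuning_curve(theta, peak=20.0, center=0.0, width=1.0, baseline=1.0):
    """Hypothetical Gaussian tuning curve: mean spike count at stimulus theta."""
    return baseline + peak * math.exp(-0.5 * ((theta - center) / width) ** 2)

def poisson_log_pmf(r, mean):
    """log p(r | mean) for a Poisson spike count r."""
    return r * math.log(mean) - mean - math.lgamma(r + 1)

def fisher_information(theta, eps=1e-4, r_max=200):
    """E[(d/dtheta log p(r|theta))^2], score taken by central differences."""
    mean = tuning_curve(theta)
    total = 0.0
    for r in range(r_max + 1):
        p = math.exp(poisson_log_pmf(r, mean))
        score = (poisson_log_pmf(r, tuning_curve(theta + eps))
                 - poisson_log_pmf(r, tuning_curve(theta - eps))) / (2 * eps)
        total += p * score ** 2
    return total

def fisher_closed_form(theta, eps=1e-4):
    """Poisson shortcut: F(theta) = f'(theta)^2 / f(theta)."""
    slope = (tuning_curve(theta + eps) - tuning_curve(theta - eps)) / (2 * eps)
    return slope ** 2 / tuning_curve(theta)

f_flank = fisher_information(1.0)   # on the steep flank of the tuning curve
f_peak = fisher_information(0.0)    # at the peak, where the slope is zero
cramer_rao_sd = 1 / math.sqrt(f_flank)  # lower bound on estimator std. dev.

print(f"F on flank: {f_flank:.3f} (closed form {fisher_closed_form(1.0):.3f})")
print(f"F at peak:  {f_peak:.3e}")
print(f"Cramer-Rao bound on std. dev. at the flank: {cramer_rao_sd:.3f}")
```

Under these assumptions the neuron is most precise on the flank of its tuning curve and locally uninformative at the peak, even though its firing rate is highest there. This illustrates why Fisher information is a local measure.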
Question 3 Short Answer
Explain the relationship between neural information rate (bits per spike or bits per second) and the metabolic cost of neural firing. Why do neurons use high firing rates?
Model answer: Neural firing has metabolic costs: each action potential requires ATP to run the ion pumps (Na+/K+-ATPase) that restore ionic gradients, so a higher firing rate consumes more energy. However, higher rates allow the neuron to transmit more information per unit time. The information rate (bits per second or bits per spike) is constrained by bandwidth: the refractory period (~1-2 ms) caps the firing rate, and spike timing is only reliable to a finite precision, so a spike train is effectively a discrete, quantized signal. To transmit more bits per second at a fixed temporal resolution, a neuron must fire more spikes. As a rough illustration, with a timing resolution of about 10 ms, a 1 s window contains 100 distinguishable time bins, so the timing of a single spike can convey at most log_2(100) ≈ 6.6 bits. A neuron firing at a higher rate (e.g., 100 Hz rather than 10 Hz) places more spikes in that window and, given sufficient timing precision, can encode more. The energy-information tradeoff is fundamental: neurons are thought to set their firing rates to maximize information transmitted per unit metabolic cost, balancing efficiency and performance based on task demands.
Sensory neurons in dim light fire at high rates despite metabolic cost because visual information is precious. Motor neurons controlling precise movement also fire at high rates. In contrast, neurons encoding slowly-changing (low-bandwidth) information fire at low rates, minimizing cost. The nervous system appears to allocate neural resources (firing rate, energy) based on information requirements — regions with high information needs have higher firing rates and metabolic consumption.
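The trade-off described in the model answer can be illustrated with a simple entropy bound. This is a sketch under stated assumptions, not a measurement: time is divided into 2 ms bins (a value chosen to match the refractory scale above), each bin holds at most one spike, bins are treated as independent, and metabolic cost is taken to be proportional to spike count. The entropy of such a train is an upper bound on the information it can carry. The function names are illustrative.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a single spike / no-spike bin with spike probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def entropy_bound(rate_hz, bin_s=0.002):
    """Upper bound on information for a spike train with independent bins.

    Returns (bits per second, bits per spike). bin_s is an assumed
    temporal resolution, not a measured quantity.
    """
    p_spike = rate_hz * bin_s                    # probability of a spike per bin
    bits_per_second = binary_entropy(p_spike) / bin_s
    bits_per_spike = bits_per_second / rate_hz   # proxy for bits per unit energy
    return bits_per_second, bits_per_spike

low = entropy_bound(10.0)    # sparse firing
high = entropy_bound(100.0)  # high-rate firing

for label, (per_s, per_spike) in (("10 Hz", low), ("100 Hz", high)):
    print(f"{label:>6}: <= {per_s:6.1f} bits/s, <= {per_spike:4.2f} bits/spike")
```

Under these assumptions the higher rate raises the bound on bits per second but lowers the bound on bits per spike, so a neuron that needs bandwidth pays for it in energy efficiency.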
Question 4 Multiple Choice
In a population of N neurons encoding a stimulus parameter theta, the information carried by the population grows as N (or better with correlation-reducing mechanisms). How does this relate to the concept of 'redundancy' vs. 'synergy' in neural coding?
A. Redundancy and synergy are the same concept
B. Redundancy (shared information between neurons) reduces population information below N*I_single, while synergy (information from combinations of neurons not in individuals) increases it. Optimal populations minimize redundancy and exploit synergy
C. Populations are always redundant because neurons are correlated
D. Populations are always synergistic because of connectivity
The total information in a population can be decomposed: I_pop = sum_i I_single_i - I_redundancy + I_synergy. If each neuron independently carried I_single bits, and they were uncorrelated, the population would carry N*I_single bits. In reality, neurons share information (redundancy), reducing the total. But neurons can also carry information in combinations (higher-order statistics) that no individual neuron carries (synergy). Optimal populations minimize redundancy (reduce correlation) while exploiting synergy (use population patterns for high-fidelity encoding). In the visual cortex, neurons encoding similar features are often correlated (redundancy) but their population patterns are optimized for downstream decoding (synergy). The balance between redundancy and synergy depends on the task and efficiency constraints.
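Two toy populations make the distinction concrete. The sketch below is purely illustrative (two binary neurons, equally likely outcomes, made-up function names): it compares the population information I_pop with the sum of single-neuron informations. Note that this difference only gives the net balance I_synergy - I_redundancy from the decomposition above; it does not separate the two terms.

```python
import math
from collections import defaultdict

def mutual_information(pairs):
    """I(S;R) in bits from a list of equally likely (stimulus, response) pairs."""
    weight = 1.0 / len(pairs)
    p_sr, p_s, p_r = defaultdict(float), defaultdict(float), defaultdict(float)
    for s, r in pairs:
        p_sr[(s, r)] += weight
        p_s[s] += weight
        p_r[r] += weight
    return sum(p * math.log2(p / (p_s[s] * p_r[r])) for (s, r), p in p_sr.items())

def population_vs_sum(triples):
    """triples are equally likely (s, r1, r2). Returns (I_pop, I_1 + I_2)."""
    i_pop = mutual_information([(s, (r1, r2)) for s, r1, r2 in triples])
    i_1 = mutual_information([(s, r1) for s, r1, _ in triples])
    i_2 = mutual_information([(s, r2) for s, _, r2 in triples])
    return i_pop, i_1 + i_2

# Redundant toy population: both neurons simply copy the binary stimulus.
redundant = [(0, 0, 0), (1, 1, 1)]

# Synergistic toy population: the stimulus is the XOR of the two responses,
# so neither neuron alone says anything about it.
synergistic = [(r1 ^ r2, r1, r2) for r1 in (0, 1) for r2 in (0, 1)]

red_pop, red_sum = population_vs_sum(redundant)
syn_pop, syn_sum = population_vs_sum(synergistic)
print(f"redundant:   I_pop = {red_pop:.2f}, sum of singles = {red_sum:.2f}")
print(f"synergistic: I_pop = {syn_pop:.2f}, sum of singles = {syn_sum:.2f}")
```

In the redundant case the population carries less than the sum of its parts, and in the XOR case it carries more, because the information exists only in the joint pattern.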