A GNN is trained to classify nodes in a citation network. After training, you discover the model uses 3 layers of neighborhood aggregation. What information does each node's learned representation capture?
A. Each node's representation encodes features of all nodes within 3 hops (its 3-hop neighborhood)
B. Each node's representation captures only its own features, since layers process nodes independently
C. Each node's representation encodes features of its immediate neighbors only, regardless of depth
D. Each node's representation is an average of all other nodes in the graph
Each GNN layer aggregates information from 1-hop neighbors. After k layers, a node's representation has 'seen' its k-hop neighborhood. With 3 layers, every node incorporates feature information from nodes up to 3 edges away. This is the core mechanism that allows GNNs to capture structural context.
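To make the k-hop claim concrete, here is a minimal sketch in plain Python. It tracks only which nodes can influence a target node, not the feature values themselves. The path graph, the adjacency-list format, and the name `receptive_field` are illustrative assumptions, not part of any particular GNN library.

```python
def receptive_field(adj, node, num_layers):
    """Nodes whose features can reach `node` after `num_layers` rounds
    of 1-hop aggregation (hypothetical helper for illustration)."""
    seen = {node}
    for _ in range(num_layers):
        # One layer: every node already in the field pulls in its neighbors.
        seen = seen | {nbr for n in seen for nbr in adj[n]}
    return seen


# Assumed toy graph: a path 0 - 1 - 2 - 3 - 4 - 5.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

for k in range(4):
    print(k, sorted(receptive_field(adj, 0, k)))
# 0 [0]
# 1 [0, 1]
# 2 [0, 1, 2]
# 3 [0, 1, 2, 3]
```

With 3 layers, node 0 is influenced by nodes 1, 2, and 3, but nodes 4 and 5 remain outside its receptive field.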
Question 2 Multiple Choice
What is the key difference between a Graph Convolutional Network (GCN) and a Graph Attention Network (GAT) in how they aggregate neighbor information?
A. GCNs treat all neighbors as equally important (scaled by degree); GATs learn to weight neighbors differently based on their relevance
B. GCNs use attention mechanisms; GATs use fixed degree-normalized aggregation
C. GCNs work on directed graphs; GATs only work on undirected graphs
D. GCNs can handle graph-level tasks; GATs are restricted to node-level tasks
GCNs apply a fixed aggregation based on the normalized adjacency matrix: each neighbor's weight is determined by node degrees (1/sqrt(d_i * d_j) in the standard formulation) rather than learned, so it does not depend on the neighbor's features. GATs borrow the attention mechanism to learn per-neighbor importance weights, allowing the model to focus on the most relevant connections. This makes GATs more expressive in settings where neighbor importance varies.
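The contrast can be sketched with scalar node features in plain Python. This is a simplified illustration, not either model's full layer: the feature transform W is omitted, self-loops are assumed in both cases, and `a_src` and `a_dst` stand in for attention parameters that a real GAT would learn. The function names and the star graph are hypothetical.

```python
import math


def gcn_weights(adj, i):
    """Fixed GCN coefficients 1/sqrt(d_i * d_j) over node i and its neighbors.
    Degrees include an assumed self-loop."""
    deg = {n: len(nbrs) + 1 for n, nbrs in adj.items()}
    return {j: 1.0 / math.sqrt(deg[i] * deg[j]) for j in [i] + adj[i]}


def gat_weights(adj, feats, i, a_src, a_dst, slope=0.2):
    """Attention coefficients for node i: softmax over
    LeakyReLU(a_src * h_i + a_dst * h_j), with scalar features."""
    scores = {}
    for j in [i] + adj[i]:
        e = a_src * feats[i] + a_dst * feats[j]
        scores[j] = e if e > 0 else slope * e  # LeakyReLU
    z = sum(math.exp(s) for s in scores.values())
    return {j: math.exp(s) / z for j, s in scores.items()}


# Assumed toy graph: node 0 connected to nodes 1, 2, 3.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
feats = {0: 0.0, 1: 1.0, 2: 1.0, 3: 5.0}

print(gcn_weights(adj, 0))                                # neighbors 1, 2, 3 get identical weights
print(gat_weights(adj, feats, 0, a_src=1.0, a_dst=1.0))   # neighbor 3 dominates
```

Changing `feats` leaves the GCN coefficients untouched, because they depend only on the graph structure, while the GAT coefficients shift toward whichever neighbors score highest.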
Question 3 True / False
Stacking more GNN layers allows each node to incorporate feature information from more distant nodes in the graph.
T. True
F. False
Answer: True
Each aggregation layer extends the receptive field by one hop. After k layers, each node's representation reflects its k-hop neighborhood. This is directly analogous to how deep convolutional networks capture larger spatial contexts by stacking convolutional layers.
Question 4 True / False
GNNs handle graph-structured data by first converting each graph to a fixed-length feature vector (flattening the structure), then feeding that vector into a standard neural network.
T. True
F. False
Answer: False
This describes the naive (and flawed) approach to graphs. GNNs instead operate directly on the graph structure using neighborhood aggregation; they do not flatten the input graph. Flattening loses structural information (who is connected to whom) and requires a canonical node ordering, which graphs don't have. The entire point of GNNs is to design operations that respect graph structure.
Question 5 Short Answer
Why can't you simply flatten a graph into a fixed-length vector and feed it into a standard feedforward neural network, and how does the message-passing framework address this limitation?
Model answer: Graphs have variable size and no canonical node ordering: the same graph can be described by many different adjacency matrices depending on how nodes are numbered. Flattening therefore ties the input to an arbitrary numbering and obscures which nodes are connected. Message passing addresses this by defining computations that do not depend on how nodes are numbered: each node aggregates information from its neighbors through learned functions, so the representation reflects the actual topology rather than an arbitrary indexing.
The fundamental challenge is that a graph has no canonical node order, so any fixed-length encoding built from one particular numbering is order-dependent and lossy. Message passing is permutation-equivariant: relabeling the nodes relabels the output representations in the same way and leaves their values unchanged. This symmetry is what allows GNNs to generalize across graphs of different sizes and structures.
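A small check in plain Python illustrates both halves of this answer. It uses a single mean-aggregation step with no learned weights as a stand-in for a message-passing layer; the helper names, the 3-node path graph, and the chosen relabeling are all assumptions made for the example.

```python
def mean_aggregate(adj, feats):
    """One message-passing step without learned weights:
    average each node's feature with its neighbors' features."""
    return {
        i: (feats[i] + sum(feats[j] for j in adj[i])) / (1 + len(adj[i]))
        for i in adj
    }


def relabel(adj, feats, perm):
    """Rename node i to perm[i] in both the adjacency list and the features."""
    new_adj = {perm[i]: [perm[j] for j in nbrs] for i, nbrs in adj.items()}
    new_feats = {perm[i]: x for i, x in feats.items()}
    return new_adj, new_feats


def flatten(adj):
    """Naive fixed-length encoding: row-major adjacency matrix in label order."""
    n = len(adj)
    return [1 if j in adj[i] else 0 for i in range(n) for j in range(n)]


# Assumed toy graph: a path 0 - 1 - 2 with scalar features.
adj = {0: [1], 1: [0, 2], 2: [1]}
feats = {0: 1.0, 1: 2.0, 2: 6.0}
perm = {0: 2, 1: 0, 2: 1}  # an arbitrary relabeling of the same graph

adj_p, feats_p = relabel(adj, feats, perm)
out = mean_aggregate(adj, feats)
out_p = mean_aggregate(adj_p, feats_p)

# The flattened encoding changes even though the graph is the same.
print(flatten(adj) == flatten(adj_p))                # False
# Each node keeps its representation; only its label moves.
print(all(out_p[perm[i]] == out[i] for i in adj))    # True
```

A feedforward network fed the flattened vector would see two different inputs for one graph, whereas the aggregation step assigns each node the same value under either numbering.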