Questions: Neural Language Models and Transformers

5 questions to test your understanding

Question 1 Multiple Choice

A transformer model, trained only on next-token prediction with no explicit grammatical rules, correctly handles subject-verb agreement across long embedded relative clauses in sentence types that appear rarely in its training data. What would this finding most strongly suggest?

A. The model has memorized the specific sentences from training data
B. Statistical pattern-matching over sufficient data can produce some degree of structural generalization, challenging the claim that LLMs purely match surface patterns
C. The model has an innate grammatical faculty equivalent to Universal Grammar
D. Long-distance dependencies are not actually processed by the attention mechanism
Question 2 Multiple Choice

What problem with earlier sequential neural architectures does the transformer's attention mechanism directly solve?

A. Sequential models could not be parallelized during training, making them impossible to scale
B. Information from early in a sequence could fade out before the end, making long-range dependencies hard to capture; attention allows direct connections between any two positions
C. Sequential models could not process sentences longer than about 20 words
D. Attention allows the model to access external knowledge bases that sequential models could not
Question 3 True / False

Large language models are trained on next-token prediction — they learn to predict which word comes next — without being given explicit rules about grammar or meaning.

T. True
F. False
Question 4 True / False

LLMs' strong performance on language benchmarks demonstrates that human language acquisition does not require innate grammatical knowledge, definitively settling the debate over Universal Grammar.

T. True
F. False
Question 5 Short Answer

Why does the transformer's attention mechanism give it an advantage over step-by-step sequential processing for understanding language? Give an example of a sentence type where this advantage is particularly important.

Think about your answer, then reveal below.
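
For readers who want to see the mechanism behind Questions 2 and 5 in concrete terms, the following is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under stated assumptions, not how any particular model is implemented: the token vectors are hypothetical and hand-picked, there are no learned query/key/value projections, and no causal mask is applied. It only shows that each position receives a nonzero weight on every other position in a single step, however far apart they are.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    """Row i holds position i's attention weights over all positions."""
    d = len(keys[0])
    weights = []
    for q in queries:
        # Scaled dot product between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights.append(softmax(scores))
    return weights

def attend(queries, keys, values):
    """Each output vector is a weighted average of all value vectors."""
    weights = attention_weights(queries, keys)
    dim = len(values[0])
    return [
        [sum(w * v[j] for w, v in zip(row, values)) for j in range(dim)]
        for row in weights
    ]

# Hypothetical toy vectors (an assumption for illustration only):
# "keys" and "are" point in the same direction, "cabinet" in another.
tokens = ["keys", "to", "the", "cabinet", "are"]
vectors = [[2.0, 0.0], [0.0, 0.1], [0.0, 0.1], [0.0, 2.0], [2.0, 0.0]]

weights = attention_weights(vectors, vectors)
outputs = attend(vectors, vectors, vectors)

# How strongly does the verb "are" attend to each earlier word?
for token, w in zip(tokens, weights[4]):
    print(f"are -> {token}: {w:.3f}")
```

In this toy setup the verb "are" places more weight on the distant subject "keys" than on the nearby noun "cabinet", because the weight depends on vector similarity rather than on distance. A step-by-step sequential model would instead have to carry information about "keys" through every intermediate word.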