Questions — Neural Language Models and Transformers

Question 1 Multiple Choice

A transformer model, trained only on next-token prediction with no explicit grammatical rules, correctly handles subject-verb agreement across long embedded relative clauses in sentence types that appear rarely in its training data. What would this finding most strongly suggest?

A. The model has memorized the specific sentences from training data

B. Statistical pattern-matching over sufficient data can produce some degree of structural generalization, challenging the claim that LLMs purely match surface patterns

C. The model has an innate grammatical faculty equivalent to Universal Grammar

D. Long-distance dependencies are not actually processed by the attention mechanism

Question 2 Multiple Choice

What problem with earlier sequential neural architectures does the transformer's attention mechanism directly solve?

A. Sequential models could not be parallelized during training, making them impossible to scale

B. Information from early in a sequence could fade out before the end, making long-range dependencies hard to capture; attention allows direct connections between any two positions

C. Sequential models could not process sentences longer than about 20 words

D. Attention allows the model to access external knowledge bases that sequential models could not

Question 3 True / False

Large language models are trained on next-token prediction (they learn to predict which token comes next) without being given explicit rules about grammar or meaning.

T. True

F. False

Question 4 True / False

LLMs' strong performance on language benchmarks demonstrates that human language acquisition does not require innate grammatical knowledge, definitively settling the debate over Universal Grammar.

T. True

F. False

Question 5 Short Answer

Why does the transformer's attention mechanism give it an advantage over step-by-step sequential processing for understanding language? Give an example of a sentence type where this advantage is particularly important.

