Questions: Multi-Task Learning

5 questions to test your understanding

Score: 0 / 5
Question 1 Multiple Choice

A team builds a single multi-task model to simultaneously predict (1) movie review sentiment and (2) whether a medical record indicates diabetes. Both tasks perform worse than single-task baselines. What is the most likely cause?

AThe batch size was too small to support gradient updates from two different loss functions simultaneously.
BThe tasks don't share meaningful feature structure, so shared layers are pulled toward incompatible representations, harming both tasks.
CMulti-task learning always requires significantly more training data than single-task models to achieve competitive performance.
DThe learning rate should be doubled to compensate for the gradient signal being split across two tasks.
Question 2 Multiple Choice

In hard parameter sharing, why does training with auxiliary tasks often improve performance on the MAIN task, even when no new labeled examples are added for that task?

AAuxiliary tasks supply more training labels for the main task by transferring examples across task heads.
BShared layers are forced to learn features that generalize across all tasks, acting as an implicit regularizer that prevents overfitting to quirks in the main task's training data.
CAuxiliary tasks reduce the effective learning rate for the main task's output head, preventing gradient explosion.
DSeparate task-specific heads isolate the auxiliary tasks, ensuring they don't influence the shared representation at all.
Question 3 True / False

Multi-task learning can improve a model's performance on a target task even when no additional labeled data is provided for that target task.

TTrue
FFalse
Question 4 True / False

Adding more tasks to a multi-task learning setup usually improves the performance of most task in the model, because more diverse gradients produce better shared representations.

TTrue
FFalse
Question 5 Short Answer

Explain why task compatibility is critical for multi-task learning to work, using the concept of shared representations.

Think about your answer, then reveal below.