A documentary filmmaker wants to show how a family's grief evolves across three months. Which approach best leverages the affordances of each mode?
A. A written essay describing the family's routines in precise detail
B. A single photograph capturing an emotional moment in the home
C. A film combining footage, spoken narration, and music to show change over time
D. An audio recording of a family interview to convey intimacy and voice
Answer: C
Video shows process over time (how something unfolds, changes, and moves), which is exactly what the filmmaker needs. It also allows mode interaction: footage, narration, and music each shape how the others are read, producing meaning that no single mode achieves alone. The other options each use a single mode well, but none can show temporal change and emotional development simultaneously the way film does. The key insight is matching mode to communicative need, not reaching for the most familiar option.
Question 2 Multiple Choice
A student writes an article about air pollution data and adds a large stock photo of a smoggy skyline because the page looks bare. What does this illustrate about multimodal composition?
A. Effective use of the image mode: visual content improves reader engagement
B. Indexical use of image: the photo points to the real phenomenon described
C. Decorative use of image: adds volume without adding meaning
D. Strategic mode selection based on the emotional needs of the audience
Answer: C
A decorative image fills space but does no communicative work; it is the multimodal equivalent of padding. An image earns its place only if it is informational (conveying data), indexical (pointing to something real and specific), or emotional (producing a response tied to the content). A generic stock photo of an unnamed smoggy city does none of these: it carries no data, points to no specific place, and produces no response tied to the article's findings. The most tempting wrong answer is A: adding images often *feels* like an improvement, but engagement and meaning are different things.
Question 3 True / False
Combining text with an image generally produces meaning equal to the sum of what each mode contributes separately.
T. True
F. False
Answer: False
Modes *interact* — they shape how each other is read — which means the combined meaning can be greater than, less than, or qualitatively different from what either mode achieves alone. A caption reframes a photograph; a photograph makes a caption feel factual. When modes work against each other (cheerful music under disturbing images), the tension itself becomes the meaning. Treating multimodal composition as simple addition misses this interactive dimension entirely.
Question 4 True / False
Context — including the medium, situation, and audience expectations — determines which modes are available and appropriate for a given composition.
T. True
F. False
Answer: True
A peer-reviewed article, a protest poster, and a podcast each permit different modes and create different expectations. A podcast must convey everything aurally; a poster must work at a glance with minimal text; an academic article expects text with purposeful figures. Effective multimodal composers do not choose a medium and then fill it — they start with purpose and audience, assess what modes the context makes available, and design accordingly. Context constrains and shapes modal choice.
Question 5 Short Answer
Why is it insufficient to simply 'add images to text' when composing multimodally? What should guide the choice of which mode to use for a given communicative purpose?
Model answer: Adding images without purpose creates decoration — volume without meaning. Every mode should be chosen because it is the best tool for a specific communicative task: images for spatial relationships or emotional tone, video for process over time, audio for intimacy and presence, text for precision and logical structure. The guiding question is: what cognitive or emotional work does this element need to do, and which mode is best suited to do it in this context?
The deeper point is that modes have affordances — things they are particularly good at — and effective composition assigns communicative tasks to the mode that handles them best. Decorative choices waste the reader's attention and dilute the impact of the elements that are genuinely load-bearing. Starting from purpose (what do I need to communicate?) rather than from medium (what should I fill this space with?) is the practical discipline multimodal composition requires.