An e-commerce platform embeds full customer information (name, address, email) inside every order document. A customer updates their shipping address. What problem does this create?
A. The update fails because document databases do not support partial document updates
B. Every order document containing the old address must be updated, creating write-side redundancy across potentially thousands of records
C. The schema-free nature of document databases prevents field-level updates
D. Nothing; embedding is always the correct approach for data that is read together
Answer: B
This is the classic write-side redundancy problem caused by embedding data that is shared and updated independently. When customer info is embedded in every order, a single address change requires updating every order document, a potentially expensive scatter-update. This is why shared, frequently updated data is better *referenced* (storing a customer ID and resolving it in application code) than embedded. The decision still depends on the access pattern: if orders must always display the address as it was at order time and never need the current address, embedding a snapshot may be intentional.
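To make the write cost concrete, here is a minimal sketch that uses plain Python dictionaries in place of a real document database. The collection shapes and the helper names (`update_address_embedded`, `update_address_referenced`, `order_with_customer`) are hypothetical and chosen only for illustration.

```python
# Toy in-memory "collections"; a real document database would replace these.
customers = {"c1": {"name": "Ada", "address": "1 Old Street"}}

# Embedded model: each order carries its own copy of the customer data.
orders_embedded = [
    {"order_id": i, "customer": {"id": "c1", "name": "Ada", "address": "1 Old Street"}}
    for i in range(1000)
]

# Referenced model: each order stores only the customer ID.
orders_referenced = [{"order_id": i, "customer_id": "c1"} for i in range(1000)]


def update_address_embedded(orders, customer_id, new_address):
    """Rewrite the address in every order that embeds this customer."""
    writes = 0
    for order in orders:
        if order["customer"]["id"] == customer_id:
            order["customer"]["address"] = new_address
            writes += 1
    return writes


def update_address_referenced(customers, customer_id, new_address):
    """Rewrite the address once, in the customer document."""
    customers[customer_id]["address"] = new_address
    return 1


def order_with_customer(order, customers):
    """Application-level join: resolve the customer ID at read time."""
    return {**order, "customer": customers[order["customer_id"]]}


embedded_writes = update_address_embedded(orders_embedded, "c1", "2 New Road")
referenced_writes = update_address_referenced(customers, "c1", "2 New Road")
print(embedded_writes, referenced_writes)  # 1000 1
```

The referenced model pays for its single write with an extra lookup on every read (`order_with_customer`), which is the other half of the tradeoff.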
Question 2 Multiple Choice
A developer says their document database application has 'no schema' and therefore requires no schema migrations. Why is this claim misleading?
A. It is not misleading; document databases truly have no schema requirements
B. The database has no schema, but the application code enforces an implicit schema by expecting specific fields and types; evolving that code is effectively a schema migration
C. Document databases have schemas stored in a separate metadata collection that must be updated
D. Schema migrations are only required when adding new collections, not when modifying fields
Answer: B
Schema flexibility means the storage engine doesn't enforce field presence or types: any document can have any fields. But the application code that reads those documents still expects certain fields to exist with certain types. When you add a required field or change a type, every existing document that doesn't conform can cause application errors unless the code handles it. Migrating all existing documents (or handling the missing-field case in code) is functionally a schema migration. Libraries like Mongoose make this implicit schema explicit at the application layer.
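The two options mentioned above, handling old shapes at read time or rewriting stored documents, can be sketched as follows. This is an illustrative example in plain Python with hypothetical field names (`age`, `email_verified`) and helper names (`read_user`, `migrate_in_place`); it does not use any real database driver.

```python
def read_user(doc):
    """Normalize a stored document to the shape the current code expects."""
    # Older documents predate the "email_verified" field; supply a default.
    email_verified = doc.get("email_verified", False)
    # Older documents stored "age" as a string; coerce it to an int.
    age = int(doc["age"])
    return {"name": doc["name"], "age": age, "email_verified": email_verified}


def migrate_in_place(docs):
    """Eager alternative: rewrite every stored document to the new shape."""
    for i, doc in enumerate(docs):
        docs[i] = read_user(doc)
    return docs


old_doc = {"name": "Ada", "age": "36"}  # written by an earlier code version
new_doc = {"name": "Lin", "age": 41, "email_verified": True}

print(read_user(old_doc))
print(read_user(new_doc))
```

Either way, the code has to know about every historical document shape, which is exactly the work a schema migration does.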
Question 3 True / False
Operations on a single document in a document database are guaranteed to be atomic.
T. True
F. False
Answer: True
Atomicity at the document level is a core guarantee of document databases: you will never see a document in a half-updated state. This is intentional. Since a document is the unit of data retrieval, it should also be the unit of consistency. The critical consequence is that your document boundaries become your consistency boundaries. If two related pieces of data need to update atomically, they must either be in the same document or you must use the database's multi-document transaction support, where it is available, and accept its associated overhead.
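As a rough illustration of "the document is the unit of consistency", the toy store below applies an update to a private copy and swaps it in only if the whole update succeeds. `ToyStore`, `update_one`, and `place_order` are hypothetical names for this sketch; it models the guarantee and is not how any particular database implements it.

```python
import copy
import threading


class ToyStore:
    """In-memory stand-in for a document store; not a real database API."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def insert(self, doc_id, doc):
        with self._lock:
            self._docs[doc_id] = copy.deepcopy(doc)

    def get(self, doc_id):
        with self._lock:
            return copy.deepcopy(self._docs[doc_id])

    def update_one(self, doc_id, mutate):
        """Apply `mutate` to a private copy, then swap it in all at once.

        If `mutate` raises, the stored document is left untouched, so
        readers never observe a half-applied change.
        """
        with self._lock:
            draft = copy.deepcopy(self._docs[doc_id])
            mutate(draft)
            self._docs[doc_id] = draft


def place_order(cart):
    # Both fields live in one document, so they change together or not at all.
    cart["items"].append("book")
    if cart["balance"] < 20:
        raise ValueError("insufficient balance")
    cart["balance"] -= 20


store = ToyStore()
store.insert("cart1", {"items": [], "balance": 10})
try:
    store.update_one("cart1", place_order)
except ValueError:
    pass  # the update failed midway; the stored document must be unchanged

unchanged = store.get("cart1")
print(unchanged)  # {'items': [], 'balance': 10}
```

If `items` and `balance` lived in two separate documents, this single-document guarantee would no longer cover them, and a multi-document transaction would be needed instead.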
Question 4 True / False
Embedding most related data in one document is typically preferable to referencing because it eliminates joins and makes reads faster.
T. True
F. False
Answer: False
Embedding is the right choice when data is always read together, is owned by the parent document, and doesn't grow unboundedly. But it causes problems when the embedded data is shared across many documents (requiring scattered updates), when it changes frequently and independently, or when it grows large (document bloat makes every read fetch more data than needed). The correct principle is to model based on access patterns: embed what is read together, reference what is updated independently or shared. There is no universally correct choice.
Question 5 Short Answer
How should you decide whether to embed related data inside a document or reference it by ID, and what are the key tradeoffs?
Model answer: Embed when the related data is always read together with the parent, is owned exclusively by that document, has bounded size, and is updated atomically with the parent. Reference when the related data is shared across many documents, changes frequently and independently, grows without bound, or needs to be updated without touching the parent. The tradeoff: embedding gives fast single-read access but creates write-side redundancy for shared data; referencing avoids redundancy but requires application-level join logic and extra queries.
This decision is the central modeling skill in document databases and has no universal answer. The guiding question is always: which queries does my application run most often, and how does my data change? A comment embedded in a blog post is always read with the post and edited rarely, so embed it. A product category shared across thousands of product documents and renamed regularly should be referenced. The document boundary is also the atomicity boundary, so consistency requirements influence the choice as well.
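The rules of thumb from the model answer can be written down as a small checklist. The sketch below is only a heuristic under stated assumptions: the `Relationship` fields and the `suggest_model` function are hypothetical, and real decisions should be driven by measured access patterns rather than four booleans.

```python
from dataclasses import dataclass


@dataclass
class Relationship:
    read_with_parent: bool       # always fetched together with the parent?
    owned_by_parent: bool        # belongs to exactly one parent document?
    bounded_size: bool           # cannot grow without limit?
    updated_independently: bool  # changes without the parent changing?


def suggest_model(rel: Relationship) -> str:
    """Return "embed" or "reference" using the rules of thumb above."""
    if not rel.owned_by_parent or not rel.bounded_size or rel.updated_independently:
        return "reference"
    return "embed" if rel.read_with_parent else "reference"


# Assumption for this example: the comment list on a post stays bounded.
blog_comment = Relationship(
    read_with_parent=True, owned_by_parent=True,
    bounded_size=True, updated_independently=False,
)
product_category = Relationship(
    read_with_parent=True, owned_by_parent=False,
    bounded_size=True, updated_independently=True,
)

print(suggest_model(blog_comment))      # embed
print(suggest_model(product_category))  # reference
```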