Explain why Diag(M) — rather than a set of axioms describing M's properties — is the right tool for finding structures that 'contain' M in the embedding sense.
Think about your answer, then reveal below.
Model answer: Axioms describing M's properties are general: they say what kind of structure M is (e.g., 'a group', 'a field'), but they are satisfied by many structures that have nothing to do with M specifically. A model of the group axioms need not contain M as a subgroup; it only needs to be some group. Diag(M) is specific to M: it names every element of M with a new constant and records every atomic and negated atomic sentence that holds of those named elements. Any structure satisfying Diag(M) must interpret all those constants so that exactly those facts hold, and the map sending each element of M to the interpretation of its constant is then an embedding: the negated equalities make it injective, and the atomic and negated atomic facts make it preserve relations and functions in both directions. So the structure contains an isomorphic copy of M. The diagram technique thus replaces a semantic relationship ('M embeds into N', verified by exhibiting an embedding) with a syntactic condition (some expansion of N models a particular theory) that can be manipulated with compactness and other logical tools.
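For a finite structure the point can be checked by brute force. The sketch below is only an illustration under simplifying assumptions: structures are finite, the language has a single binary relation E, and the names `diagram` and `satisfies` are made up for this example. It records both the atomic sentences and their negations, since the negations are what force injectivity and stop extra edges from appearing between named elements.

```python
from itertools import product

def diagram(universe, edges):
    """Atomic diagram of a finite structure M = (universe, E).

    Each element a of M serves as its own constant c_a. A literal is a
    tuple (kind, a, b, value): kind "E" stands for E(c_a, c_b), kind "="
    stands for c_a = c_b, and value says whether the sentence or its
    negation belongs to the diagram.
    """
    literals = set()
    for a, b in product(universe, repeat=2):
        literals.add(("E", a, b, (a, b) in edges))
        literals.add(("=", a, b, a == b))
    return literals

def satisfies(n_edges, interp, literals):
    """Check that N, with c_a interpreted as interp[a], models every literal."""
    for kind, a, b, value in literals:
        x, y = interp[a], interp[b]
        holds = ((x, y) in n_edges) if kind == "E" else (x == y)
        if holds != value:
            return False
    return True

# M: two points joined by one directed edge.
diag_m = diagram({0, 1}, {(0, 1)})

# N: a directed path p -> q -> r.
n_edges = {("p", "q"), ("q", "r")}

ok = satisfies(n_edges, {0: "p", 1: "q"}, diag_m)         # an embedding
no_edge = satisfies(n_edges, {0: "p", 1: "r"}, diag_m)    # E(c_0, c_1) fails
collapsed = satisfies(n_edges, {0: "q", 1: "q"}, diag_m)  # not injective
```

Here `ok` is true while `no_edge` and `collapsed` are false: an interpretation of the constants satisfies the diagram exactly when it is an embedding of M into N.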
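This syntactic handle on embeddings is what makes the diagram construction so useful in model theory. The upward Löwenheim-Skolem theorem, for example, is proved by taking the diagram of an infinite structure M, adding new constants together with sentences saying that they name pairwise distinct elements, and applying compactness: any finite subset of the combined theory mentions only finitely many new constants and so has a model (M itself, with those constants assigned to distinct elements), hence the whole theory has a model, and that model contains a copy of M and is at least as large as the set of new constants. (To get an elementary extension, as the theorem requires, the same argument is run with the elementary diagram, which records every sentence true in M rather than only the atomic and negated atomic ones.) The diagram turns a construction problem (build a larger structure containing M) into a consistency problem (does this theory have a model?), which is solved by compactness.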
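The compactness step can be written out as follows. This is a standard textbook-style sketch, not a full proof; the symbols $\kappa$, $d_\alpha$, $c_a$, $T$ and $N$ are notation introduced here, and $M$ is assumed to be infinite.

```latex
% Notation introduced for this sketch: \kappa, d_\alpha, c_a, T, T_0, N.
Let $M$ be an infinite structure and $\kappa$ an infinite cardinal.
Add a constant $c_a$ for each $a \in M$ and fresh constants $d_\alpha$
for $\alpha < \kappa$, and set
\[
  T \;=\; \operatorname{Diag}(M) \,\cup\,
          \{\, d_\alpha \neq d_\beta : \alpha < \beta < \kappa \,\}.
\]
A finite $T_0 \subseteq T$ mentions only finitely many of the new
constants, say $d_{\alpha_1}, \dots, d_{\alpha_n}$. Since $M$ is infinite,
interpreting these as $n$ distinct elements of $M$, and each $c_a$ as $a$,
gives a model of $T_0$. By compactness $T$ has a model $N$, and then
\[
  a \mapsto c_a^{N} \ \text{is an embedding } M \hookrightarrow N,
  \qquad |N| \ge \kappa .
\]
```

The only place the structure of $M$ is used is in checking finite satisfiability; everything else is handled by the theory $T$.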