A deterministic finite automaton (DFA) is the simplest model of computation: a finite set of states, an input alphabet, a transition function that maps each state-symbol pair to exactly one next state, a start state, and a set of accept states. A DFA reads an input string one symbol at a time, following transitions deterministically, and accepts if it ends in an accept state. The class of languages recognized by DFAs is exactly the regular languages — a proper subset of the context-free languages and far weaker than what Turing machines can decide.
Draw state diagrams for small DFAs that accept concrete languages — strings ending in "01", strings with an even number of 1s, binary multiples of 3. Trace inputs through the diagram by hand before formalizing the transition function as a table. This builds intuition for how finite memory constrains what can be recognized.
A DFA is the simplest formal model of computation — and its simplicity is the point. You can think of a DFA as a machine with a fixed, finite number of "moods" (states), and its only memory is which mood it currently occupies. Each time it reads a symbol, it transitions — deterministically, with no choice — to a new state dictated by the transition function δ: Q × Σ → Q. When the input string is exhausted, the machine reports whether it's in an accepting state. That's the entire mechanism.
From your set-theoretic background, the formal definition is a 5-tuple (Q, Σ, δ, q₀, F): a finite set of states Q, an input alphabet Σ, the total function δ, a start state q₀ ∈ Q, and a set of accept states F ⊆ Q. The word "total" is critical — δ must be defined for every (state, symbol) pair. If you want to reject certain inputs, you introduce a dead state (or sink state), a state from which all transitions loop back to itself with no path to acceptance. Its existence keeps δ total without granting any additional accepting power.
The key insight about DFAs is that their computational power is exactly bounded by what finite memory can represent. A DFA can recognize "all binary strings with an even number of 1s" because you only need to track a parity bit — two states suffice, one for "seen an even count so far" and one for "odd count so far." But a DFA cannot recognize {aⁿbⁿ : n ≥ 1} because that requires remembering an arbitrarily large count n, which exceeds any finite state space. The class of languages DFAs recognize — the regular languages — is precisely the class where finite memory suffices.
Drawing state diagrams is the most effective way to internalize this. Each circle is a state, each directed arrow is a transition labeled by a symbol, and double circles mark accept states. A DFA recognizing "binary strings representing multiples of 3" needs exactly three states, corresponding to remainders 0, 1, and 2 modulo 3. The structure of the problem — what you need to remember about the input history — determines the minimum number of states. This is not incidental: it is the content of the Myhill-Nerode theorem, which characterizes regular languages exactly in terms of the number of "distinguishable" input histories.