Gene regulatory networks (GRNs) describe how transcription factors, signaling molecules, and regulatory elements control gene expression patterns. Computational GRN inference reconstructs these regulatory relationships from data: co-expression networks identify genes whose expression rises and falls together, ChIP-seq maps where transcription factors bind and so nominates direct targets, and perturbation experiments (knockouts, overexpression) establish causal regulatory links. Methods range from correlation-based (WGCNA) to information-theoretic (ARACNE, which uses mutual information) to probabilistic and causal modeling (Bayesian networks, Granger causality). GRNs help explain how cells establish and maintain identity, respond to signals, and develop from undifferentiated precursors.
Build a small co-expression network from an RNA-seq time series using WGCNA: identify gene modules, find hub genes in each module, and check whether the hubs are known transcription factors. Then examine a published GRN for a well-studied system (e.g., embryonic stem cell pluripotency network) and trace how key transcription factors regulate each other in feedback loops.
A cell's identity — whether it is a neuron, a liver cell, or a stem cell — is defined by which genes it expresses. Gene regulatory networks are the wiring diagrams that control these expression patterns. Transcription factors bind to regulatory DNA elements and activate or repress their target genes, which may include other transcription factors, creating cascades and feedback loops that establish and maintain cell states. Understanding these networks is central to developmental biology, cancer research, and cellular reprogramming.
Co-expression network analysis is the most accessible entry point. Given expression data across many conditions or time points, genes that consistently rise and fall together are grouped into modules. WGCNA (Weighted Gene Co-expression Network Analysis) is a widely used tool for this: it computes pairwise correlations, raises their absolute values to a power (the soft threshold) to create a weighted network, and identifies modules by hierarchical clustering of a dissimilarity derived from topological overlap. Hub genes, those with the highest connectivity within a module, are candidates for key regulators. If a module's hub is a known transcription factor with binding motifs enriched in the promoters of module members, the evidence for a regulatory relationship strengthens. But co-expression is correlation, not causation: the network represents co-regulation patterns, not direct regulatory wiring.
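The soft-thresholding and connectivity steps can be sketched in plain Python. This is a minimal illustration, not the WGCNA R package: it skips topological overlap, hierarchical clustering, and module detection, and it treats all genes as a single module. The gene names, expression values, and the power `beta = 6` are illustrative assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / math.sqrt(var_x * var_y)


def soft_threshold_adjacency(expr, beta=6):
    """Weighted (unsigned) adjacency: |correlation| raised to the power beta."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            weight = abs(pearson(expr[g], expr[h])) ** beta
            adj[g][h] = weight
            adj[h][g] = weight
    return adj


def connectivity(adj):
    """Sum of edge weights per gene; the highest value marks the hub candidate."""
    return {g: sum(neighbors.values()) for g, neighbors in adj.items()}


# Hypothetical toy data: geneA, geneB, geneC trend together; geneD does not.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "geneB": [1.1, 2.3, 2.9, 4.2, 4.8, 6.1],
    "geneC": [0.9, 2.1, 3.2, 3.9, 5.3, 5.8],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}

adj = soft_threshold_adjacency(expr, beta=6)
k = connectivity(adj)
hub = max(k, key=k.get)
print("hub candidate:", hub)
```

Raising correlations to a power pushes weak correlations toward zero while keeping strong ones, so the network stays weighted rather than being cut at a hard threshold. In a real analysis the power would be chosen from the data rather than fixed in advance.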
Direct regulatory inference requires additional data. ChIP-seq maps where transcription factors bind across the genome, identifying potential direct targets. Motif analysis scans promoter and enhancer sequences for transcription factor binding site matches. Perturbation experiments (CRISPR knockouts, siRNA knockdowns, inducible overexpression) measure the functional consequence of changing a regulator's activity. The most powerful GRN inference integrates all three kinds of evidence: expression data (which genes change), binding data (which genes are directly bound), and perturbation data (which changes are caused by the regulator). Tools such as SCENIC and CellOracle take steps in this direction. SCENIC combines co-expression with motif enrichment around candidate targets to define regulons, and CellOracle combines motif-scanned chromatin accessibility data with single-cell expression to build context-specific networks and simulate perturbations in silico.
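One simple way to picture this integration is to count how many independent evidence types support each candidate regulator-to-target edge. The sketch below is a hypothetical illustration only: it is not how SCENIC or CellOracle work internally, and the names `TF1`, `geneA` through `geneD`, and the three evidence sets are made up.

```python
def score_edges(coexpressed, bound, perturbed):
    """Count the evidence types (0 to 3) supporting each (regulator, target) edge."""
    scores = {}
    for evidence in (coexpressed, bound, perturbed):
        for edge in evidence:
            scores[edge] = scores.get(edge, 0) + 1
    return scores


# Hypothetical evidence sets for one regulator.
coexpressed = {("TF1", "geneA"), ("TF1", "geneB"), ("TF1", "geneC")}  # expression
bound = {("TF1", "geneA"), ("TF1", "geneB"), ("TF1", "geneD")}        # binding
perturbed = {("TF1", "geneA"), ("TF1", "geneC")}                      # knockout response

scores = score_edges(coexpressed, bound, perturbed)

# Edges supported by all three evidence types are the strongest candidates
# for direct, functional regulation.
high_confidence = sorted(edge for edge, s in scores.items() if s == 3)
print(high_confidence)
```

Edges with partial support are still informative. A target that is bound but unresponsive to perturbation may reflect non-functional binding, and a target that responds without binding may be regulated indirectly.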
The resulting networks reveal fundamental organizational principles. Feedforward loops (regulator A activates B, and both A and B activate C) can filter out transient signals, because C responds only when A stays active long enough for B to accumulate. Negative feedback loops (A activates B, B represses A) can produce oscillations or stabilize expression levels, while positive feedback loops (mutual activation or mutual repression) can create bistable switches that lock in a cell state. Master regulators, transcription factors at the top of regulatory hierarchies, can reprogram cell identity when ectopically expressed, as demonstrated by the Yamanaka factors that convert fibroblasts to induced pluripotent stem cells. Network topology analysis identifies these key nodes, helps prioritize therapeutic targets in disease, and provides a mechanistic framework for understanding how genotype (variants in regulatory elements) connects to phenotype (altered gene expression programs and cellular behavior).
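These motifs can be found mechanically once a network is written as a set of signed, directed edges. The following is a minimal sketch under stated assumptions: the edge dictionary is a made-up toy network, +1 means activation and -1 means repression, and only two-node feedback loops are considered. A loop whose signs multiply to a positive value is a positive feedback loop (often associated with bistability); a negative product marks a negative feedback loop (often associated with oscillation or homeostasis).

```python
def find_feedforward_loops(signed_edges):
    """Return (A, B, C) triples where A->B, B->C, and A->C all exist."""
    targets = {}
    for src, dst in signed_edges:
        targets.setdefault(src, set()).add(dst)
    loops = []
    for a, a_targets in targets.items():
        for b in a_targets:
            if b == a:
                continue
            for c in targets.get(b, set()):
                if c != a and c != b and c in a_targets:
                    loops.append((a, b, c))
    return sorted(loops)


def classify_two_node_feedback(signed_edges):
    """Label each mutually connected pair as positive or negative feedback."""
    loops = {}
    for (a, b), sign_ab in signed_edges.items():
        # a < b visits each pair once and skips self-loops.
        if a < b and (b, a) in signed_edges:
            product = sign_ab * signed_edges[(b, a)]
            loops[(a, b)] = "positive" if product > 0 else "negative"
    return loops


# Hypothetical toy network: +1 = activation, -1 = repression.
signed_edges = {
    ("A", "B"): +1, ("A", "C"): +1, ("B", "C"): +1,  # feedforward loop
    ("X", "Y"): +1, ("Y", "X"): -1,                  # X activates Y, Y represses X
    ("P", "Q"): -1, ("Q", "P"): -1,                  # mutual repression
}

ffls = find_feedforward_loops(signed_edges)
feedback = classify_two_node_feedback(signed_edges)
print(ffls)
print(feedback)
```

Mutual repression counts as positive feedback because two repressions in series reinforce the starting state, which is the logic behind a toggle switch.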