Questions — Deep Learning for Signal Processing

Question 1 Multiple Choice

A CNN trained on spectrograms to classify audio typically learns filters in the first layer that resemble frequency-selective patterns (e.g., bands centered at different frequencies). Is this coincidence, or is there a deeper reason the network discovers frequency structure?

A. Coincidence: the network randomly discovers useful features through gradient descent

B. The network is implicitly performing a time-frequency decomposition because the spectrogram itself is a time-frequency representation, and early convolutional filters learn to extract time-frequency patterns, analogous to handcrafted features like MFCCs

C. The network is forced by the loss function to learn frequencies for audio classification

D. Neural networks always discover frequency structure; there is no special connection to spectrograms
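For reference, the term "time-frequency representation" used in the options can be made concrete with a small sketch. The code below computes a magnitude spectrogram with a naive per-frame DFT, using only the Python standard library. The frame length, hop size, sample rate, and the toy two-tone signal are all assumptions chosen for illustration; a real pipeline would more likely use an FFT library.

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram via a naive per-frame DFT with a Hann window.

    Returns a list of frames; each frame holds frame_len // 2 + 1
    magnitudes, one per frequency bin.
    """
    # The Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

SAMPLE_RATE = 1024  # hypothetical value, chosen so tones fall on exact bins

def tone(bin_index, n, frame_len=64):
    """Sine whose frequency sits exactly on DFT bin `bin_index`."""
    freq = bin_index * SAMPLE_RATE / frame_len
    return math.sin(2 * math.pi * freq * n / SAMPLE_RATE)

# Toy signal: energy in bin 4 for the first half, bin 12 for the second.
signal = [tone(4, n) if n < 256 else tone(12, n) for n in range(512)]

spec = spectrogram(signal)
# Strongest frequency bin in each time frame.
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
```

Each row of `spec` is one time frame and each column is one frequency bin, so a 2D convolutional filter sliding over it sees local patterns in both time and frequency.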

Question 2 Multiple Choice

Recurrent neural networks (LSTMs, GRUs) are well-suited for time-series signal processing because they have internal state (memory) that persists across time steps. But computational cost grows with sequence length. How does this affect real-time signal processing on long signals?

A. RNNs work fine for long signals; the memory requirement is negligible compared to CNNs

B. RNNs process sequentially (the state must be computed for step 1, then 2, then 3, ...), so processing a long signal requires a forward pass through many steps. Chunking (dividing the signal into windows) or attention mechanisms (allowing direct interaction between distant time steps) can reduce latency, though chunking loses the global temporal context

C. RNNs are designed for long sequences; there is no computational limitation

D. Use only CNNs for real-time processing; RNNs should be avoided
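The sequential dependence and the effect of chunking mentioned in the options can be illustrated with a toy recurrence. The sketch below uses a scalar Elman-style cell rather than an LSTM or GRU, and the weights, chunk size, and input signal are hypothetical values picked only for the demonstration.

```python
import math

def rnn_forward(xs, h0=0.0, w_x=0.5, w_h=0.9):
    """Scalar recurrence h_t = tanh(w_x * x_t + w_h * h_{t-1}).

    The weights are arbitrary illustrative values, not trained parameters.
    """
    h = h0
    states = []
    for x in xs:  # inherently sequential: step t needs h from step t-1
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states

signal = [math.sin(0.1 * n) for n in range(200)]
chunk = 50  # hypothetical window length

# Reference: one pass over the whole signal.
full = rnn_forward(signal)

# Streaming: process chunk by chunk, carrying the last state forward.
carried, h = [], 0.0
for start in range(0, len(signal), chunk):
    out = rnn_forward(signal[start:start + chunk], h0=h)
    carried.extend(out)
    h = out[-1]

# Independent chunks: the state is reset at every boundary,
# so context from earlier chunks is discarded.
reset = []
for start in range(0, len(signal), chunk):
    reset.extend(rnn_forward(signal[start:start + chunk], h0=0.0))

max_gap_carried = max(abs(a - b) for a, b in zip(full, carried))
max_gap_reset = max(abs(a - b) for a, b in zip(full, reset))
```

Carrying the state across chunk boundaries reproduces the full pass, while resetting it makes the outputs diverge after the first boundary. Independent chunks can be processed in parallel, which is where the latency benefit comes from.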

Question 3 True / False

Transfer learning in signal processing: train a network on a large dataset (e.g., general audio from YouTube), then fine-tune on your target task (e.g., whale call detection) with limited labeled data. Why does transfer learning work, and when does it fail?

T. True

F. False

Question 4 True / False

Attention mechanisms in neural networks allow the network to focus on relevant parts of the input signal when processing. For time-series signal processing, how does attention provide an advantage over fixed convolutional receptive fields?

T. True

F. False

Question 5 Short Answer

Explain the difference between supervised learning (labeled audio data) and unsupervised learning (unlabeled signal) for signal processing. When is unsupervised learning necessary, and what are the challenges?

