Keyboard: Space bar to play/pause, left/right cursor keys and Home/End keys to jump.
Mouse: Click to change position, hover over any spectrogram to change source.
spectrogram of a 30-second test clip containing vocals
corresponding ground truth
spectrogram of corresponding vocal track
predictions of CNN-α (trained on weak labels)
saliency map of CNN-α
summarized saliency map of CNN-α
predictions of CNN-β (trained on predictions of CNN-α)
saliency map of CNN-β
summarized saliency map of CNN-β
predictions of CNN-γ (trained on tanh-squashed saliency map of CNN-β)
for comparison: spectrogram of estimated vocal track extracted with KAML