Toggle Spectrogram Preview for Audio #384

Open
@Path-A

Description

Is your feature request related to a problem? Please describe.
Classifying or segmenting audio with only a waveform preview can be time-consuming or difficult, especially with noisy audio data. Some data is more easily segmented by looking at frequency content over time.

Describe the solution you'd like
Include a toggle to preview a spectrogram representation of an audio clip. Common Python libraries for generating spectrograms include librosa and scipy.signal.
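For reference, here is a minimal sketch of how such a spectrogram could be computed with `scipy.signal.spectrogram`. The helper name `compute_spectrogram`, the window settings, and the synthetic test tone are assumptions made for illustration; nothing here reflects how Label Studio would actually implement the preview.

```python
import numpy as np
from scipy import signal


def compute_spectrogram(samples, sample_rate, nperseg=512, noverlap=384):
    """Return (freqs_hz, times_s, power_db) for a mono signal.

    nperseg and noverlap are illustrative defaults, not tuned values.
    """
    freqs, times, power = signal.spectrogram(
        samples,
        fs=sample_rate,
        window="hann",
        nperseg=nperseg,
        noverlap=noverlap,
    )
    # Convert power to decibels; the small floor avoids log(0).
    power_db = 10.0 * np.log10(power + 1e-12)
    return freqs, times, power_db


# Synthetic stand-in for a real clip: one second of a 1 kHz tone at 16 kHz.
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
samples = np.sin(2 * np.pi * 1_000 * t)

freqs, times, power_db = compute_spectrogram(samples, sample_rate)

# The frequency bin with the highest average energy should sit near 1 kHz.
peak_hz = freqs[power_db.mean(axis=1).argmax()]
```

The returned `power_db` array has one row per frequency bin and one column per time frame, so it can be rendered directly as an image aligned with the waveform's time axis.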

Describe alternatives you've considered
I've manually generated the spectrograms and saved them as images to be used within the image classification labeling tool. The downsides of this are threefold.

  1. Labeling audio this way does not allow for temporal segmentation. The user must classify the entire spectrogram rather than a vertical slice of it. A user could, in theory, use the image annotation tool instead, but this would be tedious, and the user would need to convert each bounding box to its corresponding time range in the audio clip.
  2. The user can no longer listen to the audio clip while viewing the spectrogram image.
  3. The user-generated spectrograms require additional temporary storage.
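To illustrate the conversion mentioned in the first downside, the sketch below maps a bounding box's horizontal extent to a time range. It assumes the spectrogram image spans the whole clip with a linear time axis and no margins or axis labels; the function name `bbox_to_time_range` and its parameters are hypothetical.

```python
def bbox_to_time_range(x_left_px, x_right_px, image_width_px, clip_duration_s):
    """Map a bounding box's horizontal extent to (start_s, end_s).

    Assumes pixel column 0 is time 0 and the last column is the end of
    the clip, with time increasing linearly in between.
    """
    if image_width_px <= 0:
        raise ValueError("image_width_px must be positive")
    if clip_duration_s < 0:
        raise ValueError("clip_duration_s must be non-negative")

    # Accept boxes drawn right-to-left and clamp them to the image.
    left = max(0.0, min(x_left_px, x_right_px))
    right = min(float(image_width_px), max(x_left_px, x_right_px))

    start_s = left / image_width_px * clip_duration_s
    end_s = right / image_width_px * clip_duration_s
    return start_s, end_s


# A box covering pixels 100..250 of a 1000 px wide image of a 10 s clip.
start_s, end_s = bbox_to_time_range(100, 250, 1000, 10.0)
```

The vertical extent of the box is ignored because only the time axis matters for temporal segmentation. A user would have to repeat this step for every box, which is part of why the workaround is tedious.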

Additional context
Each user's spectrogram needs may differ; for example, their sound of interest may lie in the low- or high-frequency region of the spectrogram. To keep the implementation simple, the preview could use default spectrogram parameters that generalize well and potentially allow users to zoom in on this general spectrogram. A more robust solution would let the user specify a few parameters to generate the spectrogram they want. Lastly, I include an example of a log-scaled spectrogram with its accompanying waveform.
[Image: example of a log-scaled spectrogram with its accompanying waveform]
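As a rough sketch of the "defaults plus a few user parameters" idea, the code below computes a spectrogram with NumPy only and restricts it to a frequency band of interest. The `SpectrogramParams` class, its field names, and its default values are hypothetical and chosen purely for illustration. "Log-scaled" is interpreted here as decibel amplitude scaling, which is an assumption about the example image.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass(frozen=True)
class SpectrogramParams:
    # Hypothetical defaults for illustration; not tuned for any dataset.
    n_fft: int = 1024
    hop_length: int = 256
    f_min_hz: float = 0.0
    f_max_hz: Optional[float] = None  # None means "up to Nyquist"


def log_spectrogram(samples, sample_rate, params=SpectrogramParams()):
    """Return (freqs_hz, power_db) limited to the requested band."""
    if len(samples) < params.n_fft:
        raise ValueError("signal is shorter than one analysis window")

    window = np.hanning(params.n_fft)
    n_frames = 1 + (len(samples) - params.n_fft) // params.hop_length
    frames = np.stack([
        samples[i * params.hop_length: i * params.hop_length + params.n_fft]
        * window
        for i in range(n_frames)
    ])

    # Power spectrum per frame, shape (n_frames, n_fft // 2 + 1).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(params.n_fft, d=1.0 / sample_rate)

    # "Zoom": keep only the frequency bins inside the requested band.
    f_max = sample_rate / 2 if params.f_max_hz is None else params.f_max_hz
    keep = (freqs >= params.f_min_hz) & (freqs <= f_max)

    # Decibel scaling; rows are frequency bins, columns are time frames.
    power_db = 10.0 * np.log10(power[:, keep].T + 1e-12)
    return freqs[keep], power_db


# Synthetic clip with a low (440 Hz) and a high (6 kHz) component.
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
samples = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 6_000 * t)

# A user interested only in low frequencies overrides a single parameter.
low_band = SpectrogramParams(f_max_hz=2_000.0)
freqs, power_db = log_spectrogram(samples, sample_rate, low_band)
peak_hz = freqs[power_db.mean(axis=1).argmax()]
```

Keeping the parameters in one small object means the simple version can ship with defaults only, while a later version could expose individual fields in the interface without changing the computation.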

Labels: audio, community:feature-request, community:reviewed, editor, feature, often asked
