Because Nobody Wants to Edit Drums: Building Trainable Audio Production Tools via Machine Learning

There is vast untapped potential for "intelligent" audio tools that empower human musicians, composers, producers, and engineers to work more efficiently, and that enable customized, trainable experiences unlocking new forms of creativity. When people speak of machine learning and audio, they usually mean speech recognition or speech synthesis; and when "machine learning" and "music" appear together, it's often either data analytics of music-business (micro)transactions for publishing & streaming, or algorithmically generated MIDI compositions. In this session, we will instead explore some of the core technologies behind tools that help humans better understand and manipulate audio signals and the instruments that produce them, that organize and find samples & loops based on learnable preferences, and that act as novel, trainable signal-processing 'plug-ins'. The music industry is facing numerous challenges, but new tools can offer opportunities for efficient, collaborative, and above all creative activities that inspire artists, engineers, and listeners alike.

Topics We'll Look At:
- A quick survey of commercial ML tech currently available for the music production toolchain
- Classification and regression of time series data (not just for stocks & weather)
- Convolutional and Recurrent Neural Networks for audio (admit it, you're tired of MNIST & cats-vs-dogs; see the sketch after this list)
- Cloud computing (or, empowering laptop users with GPU-compute capability)
- Data-parallel multi-GPU and multiprocessor execution (for speed)
- Homomorphic encryption (to preserve IP & privacy)
- Object detection (e.g. for imagery of vibrating musical instruments, and for audio segmentation)
- A little math, such as function spaces and convexity (to help your training converge quickly)
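
As a concrete (and entirely hypothetical) illustration of the convolutional-network item above, here is a minimal Keras sketch, using the TensorFlow backend, that classifies audio clips from their mel-spectrogram "images". The shapes, class count, and random training data are placeholders, not code from the session itself:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder dimensions: 128 mel bands x 87 frames is roughly a
# 2-second clip at 22050 Hz with librosa's default hop length.
n_mels, n_frames, n_classes = 128, 87, 10

model = keras.Sequential([
    keras.Input(shape=(n_mels, n_frames, 1)),       # spectrogram as a 1-channel "image"
    layers.Conv2D(16, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(n_classes, activation="softmax"),  # e.g. instrument/drum classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data, just to show the training call.
X = np.random.rand(32, n_mels, n_frames, 1).astype("float32")
y = np.random.randint(0, n_classes, size=32)
model.fit(X, y, epochs=2, batch_size=8)
```

In a real workflow the labels might be drum types or instrument families, and the spectrograms would come from a feature-extraction step like the librosa sketch further below.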

Frameworks & Libraries Discussed:
- Keras, with the TensorFlow backend
- PyTorch
- librosa (see the feature-extraction sketch after this list)
- Flask
- OpenMined (a public decentralized interface for the distribution & consumption of private data for training machine learning models)
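
To show how these pieces fit together, here is a minimal librosa sketch: load a clip, compute the log-mel spectrogram that a model like the one above would consume, and detect onsets for segmenting drum hits. The filename is a placeholder:

```python
import numpy as np
import librosa

# "kick_sample.wav" is a hypothetical file path; any audio file works.
y, sr = librosa.load("kick_sample.wav", sr=22050)

# Log-scaled mel spectrogram: a common input representation for the
# convolutional models listed above.
S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
log_S = librosa.power_to_db(S, ref=np.max)
print("spectrogram shape (mels x frames):", log_S.shape)

# Onset detection, handy for segmenting individual drum hits in a take.
onset_frames = librosa.onset.onset_detect(y=y, sr=sr)
onset_times = librosa.frames_to_time(onset_frames, sr=sr)
print(f"detected {len(onset_times)} onsets")
```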

What You'll Get:
A survey, in sights and sounds, of ear-opening, Nashville-grown open-source machine learning application development that you can get involved in. We've applied some of the same algorithms to medical audio as well. What could be more "Nashville" than music, healthcare, and technology?

Who This is For:
- Anyone interested in machine learning and/or musical audio
- Those with an interest in seeing research-grade technology made accessible to consumers
- People who want a broad survey with references and links they can follow for further investigation
- Those looking to extend their expertise from other data science fields into the musical audio arena

Scott Hawley

Associate Professor of Physics at Belmont University, and Founder of ASPIRE Research Co-op
