Comment by prashp - Hacker Neue

prashp Aug 20, 2024

Is it possible to model reverb using a neural network (e.g. WaveNet or LSTMs) for real-time use? Is this what something like Neural DSP is doing under the hood?

duped Aug 20, 2024

The trick is not to use NNs for the DSP itself but to discover the parameters for the DSP. In other words, you hardcode the signal-flow architecture using a common technique like an FDN (feedback delay network), then train a NN to find "good"-sounding parameters, for example by comparing the output to a convolution reverb or to recordings.
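
As a rough illustration of the "fixed signal flow, tunable parameters" idea, here is a minimal FDN sketch in plain Python. The function name `fdn_reverb`, the four delay lengths, and the `decay` and `wet` values are all assumptions chosen for the example, not values from the comment; they are the kind of parameters an optimizer or NN could be asked to find. The training loop itself is not shown.

```python
def fdn_reverb(x, delays=(149, 211, 263, 293), decay=0.85, wet=0.3):
    """Four-line feedback delay network with a fixed Hadamard feedback matrix.

    delays, decay and wet are illustrative defaults (assumptions), i.e. the
    parameters a search or training procedure would tune.
    """
    n = len(delays)
    # 4x4 Hadamard matrix scaled by 1/2 is orthogonal, so the feedback loop
    # loses energy only through `decay`.
    hadamard = [[1, 1, 1, 1],
                [1, -1, 1, -1],
                [1, 1, -1, -1],
                [1, -1, -1, 1]]
    mix = [[0.5 * v for v in row] for row in hadamard]
    buffers = [[0.0] * d for d in delays]  # one circular buffer per delay line
    pos = [0] * n
    out = []
    for sample in x:
        # Oldest sample in each delay line.
        taps = [buffers[i][pos[i]] for i in range(n)]
        # Mix the taps through the feedback matrix and attenuate.
        fed = [decay * sum(mix[i][j] * taps[j] for j in range(n)) for i in range(n)]
        for i in range(n):
            buffers[i][pos[i]] = sample + fed[i]
            pos[i] = (pos[i] + 1) % delays[i]
        out.append((1.0 - wet) * sample + wet * sum(taps) / n)
    return out


impulse = [1.0] + [0.0] * 1999
tail = fdn_reverb(impulse)
```

All of the state lives in the delay buffers and the only operations are multiplies and adds, which matches the point that a reverb wants lots of state and no nonlinearity.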

The thing about reverbs is that they require a lot of state, and nonlinearity is undesirable.

squeaky-clean Aug 20, 2024

For reverb I don't see much practical use, mainly because you can capture a pretty-much-perfect recreation of a real space with an impulse response. No need for thousands or millions of rounds of training a network. For unrealistic reverbs, you have the problem that to get training data you'd have to invent several unrealistic reverb effects to apply to sounds. And once you've made those effects, there's not really any reason to neural-netify them.
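
To make the impulse-response point concrete: for a linear, time-invariant space, applying the reverb is just convolving the dry signal with the recorded impulse response. The sketch below is a direct-form convolution in plain Python; the name `convolve` and the toy three-sample impulse response are assumptions for illustration. Real-time plugins typically use FFT-based convolution instead of this O(N*M) loop.

```python
def convolve(dry, impulse_response):
    """Direct-form convolution of a dry signal with an impulse response."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, s in enumerate(dry):
        if s == 0.0:
            continue  # silent input samples contribute nothing
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


# Toy impulse response (an assumption): direct sound plus two decaying echoes.
ir = [1.0, 0.5, 0.25]
wet = convolve([1.0, 2.0], ir)
```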

For NeuralDSP it's a bit different because they use NNs to simulate a guitar amp circuit, which is a nonlinear system, so there's no simple way to "capture" the effect the way you can for reverb sims or speaker sims. And while you can make a very accurate model using something like SPICE, that won't run in real time. With traditional amp modeling you basically take the SPICE version and try to optimize and cheat as much as you can so it can run in real time, at the cost of accuracy.
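
A small check shows why an impulse response is not enough for a nonlinear system. A linear filter scales its output when the input is scaled, so one capture describes it at every level; a clipping stage does not. The `linear_fir` and `clipper` functions below are hypothetical stand-ins (a two-tap filter and a tanh waveshaper), not a model of any real amp.

```python
import math


def linear_fir(x, h=(1.0, 0.5)):
    """Two-tap FIR filter: a stand-in for a linear effect such as a reverb."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def clipper(x, drive=5.0):
    """tanh waveshaper: a crude stand-in for an overdriven amp stage."""
    return [math.tanh(drive * s) for s in x]


def is_homogeneous(system, x, gain=2.0, tol=1e-9):
    """True if scaling the input by `gain` scales the output by `gain`."""
    scaled_input = system([gain * s for s in x])
    scaled_output = [gain * s for s in system(x)]
    return all(abs(a - b) <= tol for a, b in zip(scaled_input, scaled_output))


signal = [0.1, -0.3, 0.5]
print(is_homogeneous(linear_fir, signal))  # linear filter passes the check
print(is_homogeneous(clipper, signal))     # clipper fails it
```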

So that's what NeuralDSP's goal is, a system that approximates the amplifier but can also be computed in real-time, except done using a trained NN instead of a human-optimized variant of the SPICE circuit.

They have a couple of whitepapers on their website, though none of them go deep enough to really give away their secret sauce. But basically, according to them, making a NN model of an amplifier at a fixed setting is fairly simple. Where they had to get novel is adjustable settings/parameters, e.g. turning the drive up or turning the treble down. Just capturing a few hundred or a few thousand models at different parameter settings and cross-fading between them doesn't sound realistic. So they had to come up with a larger model architecture that can "learn" those parameter changes.

https://arxiv.org/pdf/2403.08559

intalentive Aug 20, 2024

It’s not that hard, you just collect a lot of data. Much easier with a robot turning the knobs. Predict the next sample based on input and knob settings.
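
One way to read "predict the next sample based on input and knob settings" is as a supervised dataset where each example pairs a short window of dry input plus the knob positions with the recorded output sample. The helper below is only a sketch of that framing; `make_examples`, the `context` length, and the knob tuple are hypothetical names, and this is not claimed to be how Neural DSP or NAM build their data.

```python
def make_examples(dry, wet, knobs, context=4):
    """Build (features, target) pairs for one recording at one knob setting.

    features = the last `context` dry samples followed by the knob values
    target   = the recorded (processed) sample at the same time step
    """
    examples = []
    for t in range(context - 1, len(dry)):
        window = dry[t - context + 1 : t + 1]
        examples.append((list(window) + list(knobs), wet[t]))
    return examples


# Toy data (assumptions): a ramp as input, a scaled ramp as the "amp" output,
# and two knob positions normalised to the range 0..1.
pairs = make_examples([0, 1, 2, 3, 4], [0, 10, 20, 30, 40], knobs=(0.5, 0.2), context=3)
```

Repeating this over many knob settings (the robot turning the knobs) gives a single model the chance to learn how the controls change the sound.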

landonxjames Aug 20, 2024

Not sure about Neural DSP or reverbs in general, but real-time neural-network-based DSP seems very possible. The open source Neural Amp Modeler[1] would be a good place to start diving in.

[1] https://www.neuralampmodeler.com/the-code

prashp OP Aug 20, 2024

I have tried NAM but with limited success in modeling some time-based effects (e.g. octave shifting). However, I have not tried to model reverb effects.

intalentive Aug 20, 2024

To handle time-based effects you need a custom architecture.

https://www.research.ed.ac.uk/en/publications/neural-modelli...

Don’t use NAM. Learn PyTorch.

prashp OP Aug 20, 2024

NAM uses PyTorch for its NN implementation?

intalentive Aug 20, 2024

It is 100% possible and there are a slew of tricks you can use to get big performance boosts with negligible cost to accuracy.

prashp OP Aug 20, 2024

Do you know what the tricks are?

intalentive Aug 20, 2024

1. Don’t use LSTMs (4 vector-matrix multiplies) or GRUs (3 multiplies). Use a fixed HiPPO matrix to update state. That's just 1 multiply, and since the matrix is fixed you can unroll during training, which is much faster than backprop through time.

2. Write SIMD intrinsics by hand. None of the libraries are as fast.

3. Don’t use sigmoid or tanh functions as your nonlinear activation. Instead, approximate them with the softsign function, x / (1 + |x|), which is much cheaper.
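
Tricks 1 and 3 can be sketched in a few lines of plain Python. The softsign function is x / (1 + |x|). For the state update, the example uses a fixed diagonal matrix as a simplified stand-in (an assumption; a real HiPPO matrix is dense, not diagonal) to show why a fixed linear recurrence can be unrolled: the state at time t is just a convolution of the input with precomputed powers of the matrix, so training does not have to step through time. The names `run_recurrence` and `run_unrolled` are hypothetical.

```python
def softsign(x):
    """Cheap saturating activation: one abs, one add, one divide."""
    return x / (1.0 + abs(x))


def run_recurrence(u, a, b):
    """Sequential form: state[i] <- a[i] * state[i] + b[i] * u_t."""
    state = [0.0] * len(a)
    states = []
    for sample in u:
        state = [a[i] * state[i] + b[i] * sample for i in range(len(a))]
        states.append(state)
    return states


def run_unrolled(u, a, b):
    """Unrolled form: state_t[i] = sum over k of a[i]**k * b[i] * u[t - k]."""
    # Kernels depend only on the fixed a and b, so they are computed once.
    kernels = [[(a[i] ** k) * b[i] for k in range(len(u))] for i in range(len(a))]
    return [[sum(kernels[i][k] * u[t - k] for k in range(t + 1))
             for i in range(len(a))]
            for t in range(len(u))]


u = [1.0, 0.5, -0.25, 0.0, 2.0]
a, b = [0.9, 0.5], [1.0, 0.3]  # fixed (not trained) decay rates and input gains
sequential = run_recurrence(u, a, b)
unrolled = run_unrolled(u, a, b)
activated = [[softsign(v) for v in state] for state in unrolled]
```

Both forms produce the same states up to rounding. Note that softsign approaches its limits of -1 and 1 more slowly than tanh, so a model trained with tanh usually needs retraining rather than a drop-in swap.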

It depends on the exact architecture, but these optimizations have yielded 10-30x improvements for single-threaded, real-time CPU audio applications.

When GPU audio matures all this may be unnecessary.
