RT FFT Convolver
Status: 🚧 Work in progress. The core engine is functional; offline high-fidelity mode and crossfade IR switching are still planned.
Every guitarist who runs direct into a DAW or amp sim uses an impulse response (IR): a short WAV file that captures how a real speaker cabinet colours the sound. The problem is latency. Standard FFT convolution is efficient but introduces a full buffer of delay before any output arrives. For a 64-sample buffer at 48 kHz that is about 1.3 ms (64 / 48,000 s): barely noticeable in isolation, but stack it with driver and interface round-trips and it becomes unplayable.
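The figure above is simply buffer length divided by sample rate. A minimal sketch of that arithmetic follows; the function name `buffer_latency_ms` is illustrative and not part of this crate.

```rust
/// Delay, in milliseconds, added by buffering `frames` samples
/// at `sample_rate_hz`. Illustrative helper, not a crate API.
fn buffer_latency_ms(frames: u32, sample_rate_hz: u32) -> f64 {
    f64::from(frames) / f64::from(sample_rate_hz) * 1_000.0
}
```

Calling `buffer_latency_ms(64, 48_000)` gives roughly 1.33 ms, while a 512-sample buffer at the same rate gives roughly 10.67 ms, which is why block-based convolution gets worse quickly as buffers grow.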
The solution is partitioned convolution with a zero-latency head: apply the very first samples of the IR directly in the time domain (zero added delay), then switch to FFT-based processing for the remainder of the IR, the tail, where efficiency matters. The result is a convolution engine that sounds like a full FFT convolution but feels like there is none.
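The head/tail split works because convolution is linear: convolving with the head and with the tail separately, then adding the tail's result delayed by the head length, reproduces the full convolution. The sketch below shows only that identity. It is not this crate's implementation; the names `convolve` and `convolve_split` are hypothetical, and the tail is convolved in the time domain here purely to keep the example dependency-free, where a real engine would use an FFT stage.

```rust
/// Plain direct-form convolution: y[n] = sum over k of h[k] * x[n - k].
fn convolve(x: &[f32], h: &[f32]) -> Vec<f32> {
    if x.is_empty() || h.is_empty() {
        return Vec::new();
    }
    let mut y = vec![0.0f32; x.len() + h.len() - 1];
    for (n, &xn) in x.iter().enumerate() {
        for (k, &hk) in h.iter().enumerate() {
            y[n + k] += xn * hk;
        }
    }
    y
}

/// Convolve by splitting the IR into a head (first `head_len` taps) and a tail.
/// The tail's contribution only starts `head_len` samples into the output,
/// which is the slack a block-based stage can use to do its work.
fn convolve_split(x: &[f32], h: &[f32], head_len: usize) -> Vec<f32> {
    if x.is_empty() || h.is_empty() {
        return Vec::new();
    }
    let split = head_len.min(h.len());
    let (head, tail) = h.split_at(split);
    let mut y = vec![0.0f32; x.len() + h.len() - 1];
    // Head: contributes from output sample 0, so nothing is delayed.
    for (i, v) in convolve(x, head).into_iter().enumerate() {
        y[i] += v;
    }
    // Tail: same input, but its output is shifted by the head length.
    for (i, v) in convolve(x, tail).into_iter().enumerate() {
        y[i + split] += v;
    }
    y
}
```

For any `head_len`, `convolve_split` should match `convolve` up to floating-point rounding. Choosing a head at least as long as the tail stage's block size is what lets the FFT part run a block behind without the listener hearing any delay.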
rt-fft-convolver is a real-time-safe Rust library for zero-latency partitioned convolution. It is designed as a building block for guitar cabinet simulators, reverb engines, and DAW plugins.
Features
- Zero algorithmic latency: direct-form convolution for the IR head; overlap-save FFT for the tail. No added delay beyond what the IR itself dictates.
- Real-time safe: zero heap allocations in the audio callback. Construct and load on a non-real-time thread; call process from the render thread.
- Anti-denormal protection: built-in DenormalGuard / flush_to_zero() to prevent the CPU spikes caused by near-zero floating-point values during silent passages.
- Multi-channel support: mono (UniformPartitionEngine), independent stereo (StereoConvolver), and true-stereo 4-channel matrix (TrueStereoConvolver).
- Auto-resampling: load_ir() reads any common WAV format and resamples to the host sample rate in one call.
- Auto-normalizing mixer: Mixer computes the IR's L2-norm gain at setup time and provides an equal-power dry/wet crossfade with no per-sample branches.
- SIMD acceleration: complex multiply-accumulate uses the wide crate for vectorized throughput on x86/ARM.
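To make the mixer bullet concrete, here is a rough sketch of the two ideas it names: a normalization gain of 1 / ‖IR‖₂ computed once, and equal-power dry/wet gains. This is an assumption about how such a mixer could work, not the crate's actual code; `l2_norm_gain`, `equal_power_gains`, and `mix_block` are hypothetical names, and the cosine/sine law is one common equal-power choice.

```rust
/// Normalization gain 1 / ||ir||_2, so IRs of different energy land at a
/// similar level. Returns 1.0 for an all-zero IR to avoid dividing by zero.
fn l2_norm_gain(ir: &[f32]) -> f32 {
    let energy: f32 = ir.iter().map(|s| s * s).sum();
    if energy > 0.0 {
        1.0 / energy.sqrt()
    } else {
        1.0
    }
}

/// Equal-power (dry, wet) gains for a wet amount in [0, 1].
/// The pair always satisfies dry^2 + wet^2 == 1.
fn equal_power_gains(wet_amount: f32) -> (f32, f32) {
    let theta = wet_amount.clamp(0.0, 1.0) * std::f32::consts::FRAC_PI_2;
    (theta.cos(), theta.sin())
}

/// Blend one block. Both gains are computed ahead of time, so the loop
/// body is a multiply-add with no per-sample branch.
fn mix_block(dry: &[f32], wet: &[f32], out: &mut [f32], dry_gain: f32, wet_gain: f32) {
    for ((o, d), w) in out.iter_mut().zip(dry).zip(wet) {
        *o = dry_gain * d + wet_gain * w;
    }
}
```

In this sketch the setup code would multiply the wet gain from `equal_power_gains` by `l2_norm_gain(ir)` once, then pass the two resulting constants to `mix_block` on every callback.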
Quick start
[dependencies]
rt-fft-convolver = "0.1"
Mono cabinet simulation:
use rt_fft_convolver::{UniformPartitionEngine, load_ir};
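// Off the render thread: these calls allocate.
let block_size = 64; // host buffer size in samples (example value)
let ir = load_ir("cabinet.wav", 48_000)?;
let mut engine = UniformPartitionEngine::new(&ir.channels[0], block_size);
let mut output = vec![0.0f32; block_size]; // pre-allocate the output buffer here, not in the callback
// On the render thread: zero allocation. `input` is the host's block of `block_size` samples.
engine.process(&input, &mut output);
Stereo with a shared mono IR: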
use rt_fft_convolver::{StereoConvolver, load_ir};
let ir = load_ir("cabinet.wav", 48_000)?;
let mut conv = StereoConvolver::new_mono_ir(&ir.channels[0], block_size);
conv.process(&in_l, &in_r, &mut out_l, &mut out_r);
True-stereo (4-channel mic capture):
use rt_fft_convolver::{TrueStereoConvolver, load_ir};
// A true-stereo IR WAV has 4 channels: LL, LR, RL, RR
let ir = load_ir("true_stereo_cab.wav", 48_000)?;
let mut conv = TrueStereoConvolver::new(
    &ir.channels[0], // L → L
    &ir.channels[1], // L → R
    &ir.channels[2], // R → L
    &ir.channels[3], // R → R
    block_size,
);
conv.process(&in_l, &in_r, &mut out_l, &mut out_r);
Dry/wet mix with auto-normalization:
use rt_fft_convolver::Mixer;
// Compute the IR gain (1 / ‖IR‖₂) once, then blend 30% dry / 70% wet
let mut mixer = Mixer::new(&ir_samples, 0.7);
mixer.process(&dry, &wet, &mut output);
Anti-denormal guard (create it at the start of your audio callback):
use rt_fft_convolver::DenormalGuard;
fn audio_callback(buffer: &mut [f32]) {
    let _guard = DenormalGuard::new(); // sets FTZ for this scope
    // … process …
}
Alternatives
| | rt-fft-convolver | fft-convolver | zita-convolver |
|---|---|---|---|
| Language | Rust | Rust | C++ |
| Algorithmic latency | 0 samples | 1 block | 1 block (small stage) |
| Real-time safe | β | β | β |
| True-stereo (4-ch) | β | β | β (up to 64ch) |
| WAV load + resample | β | β | β |
| Dry/wet mixer | β | β | β |
| Anti-denormal guard | β | β | β |
| License | MIT | MIT | GPL-3 |
fft-convolver is a faithful Rust port of the well-known HiFi-LoFi FFTConvolver C++ library. It uses uniform partitioning with an optional two-stage variant for long IRs. One block of latency is inherent in the algorithm; that is the trade-off this library avoids with its direct-form head.
zita-convolver is the reference C++ implementation used in Guitarix and several LV2 plugins. It supports large multi-channel matrices and uses a dual-partition scheme, but it requires linking C++ and carries a GPL licence, whose copyleft terms restrict its use in closed-source commercial plugins.
What's coming
- Offline high-fidelity mode (non-partitioned, full precision for studio rendering)
- Crossfade IR switch: seamless cabinet changes without audio clicks