RT FFT Convolver

Status: 🚧 Work in progress — core engine is functional, offline high-fidelity mode and crossfade IR switching are still planned.

Most guitarists who run direct into a DAW or amp sim use an impulse response (IR): a short WAV file that captures how a real speaker cabinet colours the sound. The problem is latency. Standard block-based FFT convolution is efficient but adds a full buffer of delay before any output arrives. For a 64-sample buffer at 48 kHz that is about 1.3 ms (64 / 48,000 s). On its own that is barely perceptible, but stacked on top of driver and interface round-trips it can make the instrument feel sluggish to play.
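The latency figure is simply block size divided by sample rate. A minimal sketch of that arithmetic follows; `block_latency_ms` is a hypothetical helper for illustration, not part of this crate.

```rust
/// Delay in milliseconds contributed by buffering `block_size` samples
/// at `sample_rate` Hz. Hypothetical helper, for illustration only.
fn block_latency_ms(block_size: usize, sample_rate: u32) -> f64 {
    block_size as f64 / sample_rate as f64 * 1000.0
}
```

At 48 kHz, doubling the block size doubles this figure, which is why the delay adds up quickly at larger buffer settings.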

The solution is partitioned convolution with a zero-latency head: apply the first samples of the IR directly in the time domain, which adds no delay, then switch to FFT-based processing for the rest of the tail, where efficiency matters. The result is an engine that sounds like a full FFT convolution but adds no buffering latency of its own.
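The head/tail split works because convolution is linear: convolving with the whole IR equals convolving with the head plus convolving with the tail delayed by the head length. The sketch below demonstrates only that identity. It is not this crate's implementation: both function names are hypothetical, and the tail uses the direct form instead of block FFTs so the example stays dependency-free.

```rust
/// Direct-form (time-domain) convolution: y[n] = sum over k of h[k] * x[n - k].
/// The output has x.len() + h.len() - 1 samples.
fn convolve_direct(x: &[f32], h: &[f32]) -> Vec<f32> {
    if x.is_empty() || h.is_empty() {
        return Vec::new();
    }
    let mut y = vec![0.0f32; x.len() + h.len() - 1];
    for (n, &xn) in x.iter().enumerate() {
        for (k, &hk) in h.iter().enumerate() {
            y[n + k] += xn * hk;
        }
    }
    y
}

/// Convolve with the IR split at `head_len`. The head is applied directly;
/// the tail is convolved separately and added `head_len` samples later.
/// (Assumption for this sketch: a real engine would run the tail through
/// FFT partitions, which is where the efficiency comes from.)
fn convolve_split(x: &[f32], ir: &[f32], head_len: usize) -> Vec<f32> {
    if x.is_empty() || ir.is_empty() {
        return Vec::new();
    }
    let split = head_len.min(ir.len());
    let (head, tail) = ir.split_at(split);
    let mut y = vec![0.0f32; x.len() + ir.len() - 1];
    for (i, v) in convolve_direct(x, head).into_iter().enumerate() {
        y[i] += v;
    }
    for (i, v) in convolve_direct(x, tail).into_iter().enumerate() {
        y[i + split] += v; // the tail's contribution starts `split` samples in
    }
    y
}
```

The first `head_len` output samples depend only on the head, so they can be produced immediately while the block-based tail processing catches up.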

rt-fft-convolver is a real-time-safe Rust library for zero-latency partitioned convolution. It is designed as a building block for guitar cabinet simulators, reverb engines, and DAW plugins.

Features

Zero algorithmic latency — direct-form convolution for the IR head; overlap-save FFT for the tail. No added delay beyond what the IR itself dictates.
Real-time safe — zero heap allocations in the audio callback. Construct and load on a non-real-time thread; call process from the render thread.
Anti-denormal protection — built-in DenormalGuard / flush_to_zero() to prevent the CPU spikes caused by near-zero floating-point values during silent passages.
Multi-channel support — mono (UniformPartitionEngine), independent stereo (StereoConvolver), and true-stereo 4-channel matrix (TrueStereoConvolver).
Auto-resampling — load_ir() reads any common WAV format and resamples to the host sample rate in one call.
Auto-normalizing mixer — Mixer computes the IR’s L2-norm gain at setup time and provides an equal-power dry/wet crossfade with no per-sample branches.
SIMD acceleration — complex multiply-accumulate uses the wide crate for vectorized throughput on x86/ARM.
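For readers unfamiliar with the frequency-domain step, the operation being vectorized is a complex multiply-accumulate: each partition's spectrum is multiplied bin by bin with the matching input spectrum and summed into an accumulator. A scalar sketch is shown below. The function name and the split real/imaginary slice layout are assumptions for illustration; the crate's internal layout and its `wide`-based code may differ.

```rust
/// Accumulate the complex product a[i] * b[i] into acc[i].
/// Complex numbers are stored as separate real and imaginary slices
/// (an assumed layout, chosen because it vectorizes easily).
fn complex_mul_acc(
    acc_re: &mut [f32],
    acc_im: &mut [f32],
    a_re: &[f32],
    a_im: &[f32],
    b_re: &[f32],
    b_im: &[f32],
) {
    let n = acc_re.len();
    assert!(acc_im.len() == n && a_re.len() == n && a_im.len() == n);
    assert!(b_re.len() == n && b_im.len() == n);
    for i in 0..n {
        // (a_re + i a_im)(b_re + i b_im)
        //   = (a_re b_re - a_im b_im) + i (a_re b_im + a_im b_re)
        acc_re[i] += a_re[i] * b_re[i] - a_im[i] * b_im[i];
        acc_im[i] += a_re[i] * b_im[i] + a_im[i] * b_re[i];
    }
}
```

The length checks are done once up front and the loop body has no branches, which is the shape that maps well onto SIMD lanes.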

Quick start

[dependencies]
rt-fft-convolver = "0.1"

Mono cabinet simulation:

use rt_fft_convolver::{UniformPartitionEngine, load_ir};

// Off the render thread: allocates
let ir = load_ir("cabinet.wav", 48_000)?;
let mut engine = UniformPartitionEngine::new(&ir.channels[0], block_size);
let mut output = vec![0.0f32; block_size]; // preallocate the output buffer here too

// On the render thread: zero allocation
engine.process(&input, &mut output);

Stereo with a shared mono IR:

use rt_fft_convolver::{StereoConvolver, load_ir};

let ir = load_ir("cabinet.wav", 48_000)?;
let mut conv = StereoConvolver::new_mono_ir(&ir.channels[0], block_size);

conv.process(&in_l, &in_r, &mut out_l, &mut out_r);

True-stereo (4-channel mic capture):

use rt_fft_convolver::{TrueStereoConvolver, load_ir};

// A true-stereo IR WAV has 4 channels: LL, LR, RL, RR
let ir = load_ir("true_stereo_cab.wav", 48_000)?;
let mut conv = TrueStereoConvolver::new(
    &ir.channels[0], // L → L
    &ir.channels[1], // L → R
    &ir.channels[2], // R → L
    &ir.channels[3], // R → R
    block_size,
);

conv.process(&in_l, &in_r, &mut out_l, &mut out_r);
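The four-channel routing in the comments above amounts to a 2×2 matrix of convolutions: each output channel is the sum of two convolved inputs. A minimal sketch of that routing follows. Both function names are hypothetical, and it uses a truncated direct-form convolution, not the crate's partitioned engine.

```rust
/// Direct-form convolution truncated to the input length
/// (the tail is discarded to keep the sketch short).
fn convolve_same_len(x: &[f32], h: &[f32]) -> Vec<f32> {
    let mut y = vec![0.0f32; x.len()];
    for n in 0..x.len() {
        for (k, &hk) in h.iter().enumerate() {
            if k > n {
                break;
            }
            y[n] += hk * x[n - k];
        }
    }
    y
}

/// True-stereo routing:
///   out_l = in_l * IR_LL + in_r * IR_RL
///   out_r = in_l * IR_LR + in_r * IR_RR
/// where `*` denotes convolution.
fn true_stereo(
    in_l: &[f32],
    in_r: &[f32],
    ir_ll: &[f32],
    ir_lr: &[f32],
    ir_rl: &[f32],
    ir_rr: &[f32],
) -> (Vec<f32>, Vec<f32>) {
    assert_eq!(in_l.len(), in_r.len());
    let ll = convolve_same_len(in_l, ir_ll);
    let lr = convolve_same_len(in_l, ir_lr);
    let rl = convolve_same_len(in_r, ir_rl);
    let rr = convolve_same_len(in_r, ir_rr);
    let out_l: Vec<f32> = ll.iter().zip(&rl).map(|(a, b)| a + b).collect();
    let out_r: Vec<f32> = lr.iter().zip(&rr).map(|(a, b)| a + b).collect();
    (out_l, out_r)
}
```

This is why true-stereo costs roughly four mono convolutions per block, compared with two for independent stereo.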

Dry/wet mix with auto-normalization:

use rt_fft_convolver::Mixer;

// Compute IR gain (1 / ‖IR‖₂) once, then blend 30% dry / 70% wet
let mut mixer = Mixer::new(&ir_samples, 0.7);
mixer.process(&dry, &wet, &mut output);
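To make the two ideas behind the mixer concrete, here is a rough sketch of an L2-norm gain and an equal-power blend. All three function names are hypothetical, and the sine/cosine law is one common equal-power curve assumed for illustration; the crate's `Mixer` may use a different curve internally.

```rust
/// Normalization gain 1 / ||ir||_2. Returns 1.0 for an all-zero IR
/// so that silence is not scaled by infinity.
fn l2_norm_gain(ir: &[f32]) -> f32 {
    let energy: f32 = ir.iter().map(|s| s * s).sum();
    if energy > 0.0 {
        1.0 / energy.sqrt()
    } else {
        1.0
    }
}

/// Equal-power gains for `mix` in [0, 1] (0 = all dry, 1 = all wet).
/// Returns (dry_gain, wet_gain) with dry_gain^2 + wet_gain^2 == 1.
fn equal_power_gains(mix: f32) -> (f32, f32) {
    let theta = mix.clamp(0.0, 1.0) * std::f32::consts::FRAC_PI_2;
    (theta.cos(), theta.sin())
}

/// Blend one block. Gains are computed once per block, so the
/// per-sample loop contains no branches.
fn mix_block(dry: &[f32], wet: &[f32], out: &mut [f32], ir_gain: f32, mix: f32) {
    let (g_dry, g_wet) = equal_power_gains(mix);
    let g_wet = g_wet * ir_gain;
    for ((o, d), w) in out.iter_mut().zip(dry).zip(wet) {
        *o = g_dry * d + g_wet * w;
    }
}
```

Computing the IR gain once at setup keeps perceived loudness roughly stable when switching between IRs of very different energy.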

Anti-denormal guard (create it at the start of your audio callback):

use rt_fft_convolver::DenormalGuard;

fn audio_callback(buffer: &mut [f32]) {
    let _guard = DenormalGuard::new(); // sets FTZ for this scope
    // … process …
}
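For intuition about what flushing to zero does, the sketch below applies the same effect in software, one sample at a time. It is an illustration only: `flush_denormals` is a hypothetical name, and the guard above is described as setting the CPU's FTZ mode, which handles this in hardware at no per-sample cost.

```rust
/// Replace subnormal (denormal) values with 0.0.
/// Software illustration of the flush-to-zero effect.
fn flush_denormals(buf: &mut [f32]) {
    for s in buf.iter_mut() {
        if s.is_subnormal() {
            *s = 0.0;
        }
    }
}
```

Subnormals typically appear when a decaying tail keeps shrinking toward zero during silence, which is exactly where a convolution or feedback path spends a lot of time.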

Alternatives

	rt-fft-convolver	fft-convolver	zita-convolver
Language	Rust	Rust	C++
Algorithmic latency	0 samples	1 block	1 block (small stage)
Real-time safe	✅	✅	✅
True-stereo (4-ch)	✅	❌	✅ (up to 64ch)
WAV load + resample	✅	❌	❌
Dry/wet mixer	✅	❌	❌
Anti-denormal guard	✅	❌	—
License	MIT	MIT	GPL-3

fft-convolver is a faithful Rust port of the well-known HiFi-LoFi FFTConvolver C++ library. It uses uniform partitioning with an optional two-stage variant for long IRs. One block of latency is inherent in the algorithm — that is the trade-off this library avoids with its direct-form head.

zita-convolver is the reference C++ implementation used in Guitarix and several LV2 plugins. It supports large multi-channel matrices and uses a dual-partition scheme, but it requires linking C++ and carries a GPL licence, which restricts its use in closed-source commercial plugins.

What’s coming

Offline high-fidelity mode (non-partitioned, full precision for studio rendering)
Crossfade IR switch — seamless cabinet changes without audio clicks
