PCAP Toolkit

A colleague’s problem kicked this off: replaying PCAPs through a Suricata probe was painful because timestamps had to be manually shifted to align with the replay window. That one fix quickly grew into a full-featured PCAP processing toolkit written in Rust — designed around a core constraint: captures that exceed available RAM.

The two-pass chronological sort keeps only ~20 bytes per packet in memory (timestamp, offset, length), so it scales to terabyte-sized captures without loading packet data into RAM. A second standout is the pure-Rust BPF filter parser, with no libpcap linkage, which covers the subset of the tcpdump grammar commonly used in practice.
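The two-pass idea can be sketched roughly as follows. This is a minimal illustration, not the toolkit's actual code: it assumes a classic little-endian pcap file with microsecond timestamps (24-byte global header, 16-byte record headers), and the names `IndexEntry`, `build_index`, and `write_sorted` are hypothetical. Pass one reads only record headers and seeks over payloads; pass two copies records to the output in timestamp order.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Per-packet index entry: 8 + 8 + 4 = 20 bytes of data
/// (Rust pads this struct to 24 bytes unless it is packed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct IndexEntry {
    ts_micros: u64, // capture timestamp in microseconds
    offset: u64,    // byte offset of the record header in the input
    len: u32,       // captured payload length (incl_len)
}

const GLOBAL_HEADER_LEN: usize = 24;
const RECORD_HEADER_LEN: usize = 16;

/// Pass 1: walk the record headers and collect one entry per packet.
/// Payload bytes are skipped with a seek and never read.
fn build_index<R: Read + Seek>(input: &mut R) -> io::Result<Vec<IndexEntry>> {
    let mut index = Vec::new();
    let mut offset = input.seek(SeekFrom::Start(GLOBAL_HEADER_LEN as u64))?;
    let mut hdr = [0u8; RECORD_HEADER_LEN];
    loop {
        match input.read_exact(&mut hdr) {
            Ok(()) => {}
            // End of file (or a truncated trailing header) ends the scan.
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let secs = u32::from_le_bytes([hdr[0], hdr[1], hdr[2], hdr[3]]) as u64;
        let micros = u32::from_le_bytes([hdr[4], hdr[5], hdr[6], hdr[7]]) as u64;
        let len = u32::from_le_bytes([hdr[8], hdr[9], hdr[10], hdr[11]]);
        index.push(IndexEntry { ts_micros: secs * 1_000_000 + micros, offset, len });
        // Jump to the next record header.
        offset = input.seek(SeekFrom::Current(len as i64))?;
    }
    Ok(index)
}

/// Pass 2: sort the index, then copy each record (header + payload)
/// from its original offset to the output.
fn write_sorted<R: Read + Seek, W: Write>(
    input: &mut R,
    output: &mut W,
    index: &mut [IndexEntry],
) -> io::Result<()> {
    // A stable sort keeps file order for packets with equal timestamps.
    index.sort_by_key(|e| e.ts_micros);

    let mut global = [0u8; GLOBAL_HEADER_LEN];
    input.seek(SeekFrom::Start(0))?;
    input.read_exact(&mut global)?;
    output.write_all(&global)?;

    let mut buf = Vec::new();
    for entry in index.iter() {
        buf.resize(RECORD_HEADER_LEN + entry.len as usize, 0);
        input.seek(SeekFrom::Start(entry.offset))?;
        input.read_exact(&mut buf)?;
        output.write_all(&buf)?;
    }
    Ok(())
}
```

Only the index lives in memory; the working buffer holds a single packet at a time. Merging several input files would additionally need a file identifier per entry, which this sketch omits.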

stats — streaming single-pass summary: time range, packet/byte counts, unique IPs, and per-flow statistics keyed on 5-tuple with deterministic bidirectional flow IDs
filtering — composable rules: protocol, IP/CIDR, port ranges, TCP flags, packet length, time window, and BPF expressions (tcp and dst port 443)
two-pass sorting — RAM-efficient chronological merge across multiple input files; optional on-disk index (~20 MB / 1 M packets) for massive captures; output can be time-sliced into hourly/daily files
traffic modification — timestamp shifting, per-protocol payload truncation, IP address mapping including cross-family IPv4↔IPv6 remapping with full header and checksum recalculation
export — JSON (JSONL), Apache Parquet, and Apache Avro for direct ingestion into DuckDB, Spark, Snowflake, or Elasticsearch; optional Zstd payload compression
replay — honour original inter-packet timing or apply a speed multiplier; send to one or multiple interfaces simultaneously
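One way to obtain the deterministic bidirectional flow IDs mentioned under stats is to order the two endpoints canonically before hashing, so that both directions of a conversation produce the same key. The sketch below is an assumption about how this could be done, not the toolkit's implementation: `Endpoint` and `flow_id` are hypothetical names, and FNV-1a is used only because it is simple and has no per-process random seed.

```rust
use std::net::IpAddr;

/// One side of a flow. Deriving `Ord` gives a total order over
/// (ip, port), which is enough to pick a canonical direction.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Endpoint {
    ip: IpAddr,
    port: u16,
}

/// Fold `bytes` into a running 64-bit FNV-1a hash.
fn fnv1a(hash: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(hash, |h, b| (h ^ *b as u64).wrapping_mul(0x0000_0100_0000_01b3))
}

/// Same ID for A->B and B->A: the smaller endpoint is always hashed first.
fn flow_id(protocol: u8, a: Endpoint, b: Endpoint) -> u64 {
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    let mut h = 0xcbf2_9ce4_8422_2325u64; // FNV-1a 64-bit offset basis
    h = fnv1a(h, &[protocol]);
    for ep in [lo, hi] {
        match ep.ip {
            IpAddr::V4(v4) => h = fnv1a(h, &v4.octets()),
            IpAddr::V6(v6) => h = fnv1a(h, &v6.octets()),
        }
        h = fnv1a(h, &ep.port.to_be_bytes());
    }
    h
}
```

Because the hash has no random seed, the same 5-tuple yields the same ID across runs and machines, which makes IDs comparable between separate exports.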

Compared to similar tools

Feature	pcap-toolkit	tshark	tcpreplay	gopherCap
RAM-efficient sort (TB-scale)	✅	❌	❌	❌
Chronological merge across files	✅	❌	❌	❌
BPF expression filtering	✅	✅	✅	✅
Timestamp shifting	✅	❌	✅	✅
IP address remapping	✅	❌	✅	❌
Cross-family IPv4↔IPv6 remap	✅	❌	❌	❌
Parquet / Avro export	✅	❌	❌	❌
Per-flow statistics	✅	✅	❌	❌
Multi-interface simultaneous replay	✅	❌	❌	✅
No libpcap dependency	✅	❌	❌	❌

tshark remains the best choice for interactive protocol inspection and deep display-filter queries. tcpreplay excels at replaying at line rate. gopherCap handles large-scale replay scenarios. pcap-toolkit fills the gap between them: batch processing, pipeline integration, and traffic modification at scale — without requiring libpcap on the host.

Common workflows

Prepare a capture for Suricata replay by merging a week of files, shifting timestamps to a chosen start time, then replaying:

pcap-toolkit sort week/*.pcap --output sorted.pcap --timestamp-start 2024-06-01T00:00:00Z
pcap-toolkit replay sorted.pcap --interface eth0
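Conceptually, a timestamp shift of this kind moves the earliest packet onto the requested start time and applies the same offset to every other packet, so inter-packet gaps are preserved. A minimal sketch, assuming timestamps are held as microseconds since the Unix epoch (`shift_timestamps` is a hypothetical helper, not part of the CLI):

```rust
/// Shift all timestamps so the earliest one lands on `target_start_micros`.
/// Gaps between packets are unchanged; input order is preserved.
fn shift_timestamps(ts_micros: &[u64], target_start_micros: u64) -> Vec<u64> {
    let first = match ts_micros.iter().min() {
        Some(&t) => t,
        None => return Vec::new(), // empty capture: nothing to shift
    };
    ts_micros
        .iter()
        // `t - first` cannot underflow because `first` is the minimum.
        .map(|&t| target_start_micros + (t - first))
        .collect()
}
```

For example, timestamps `[1_000, 1_250, 4_000]` shifted to a start of `1_000_000` become `[1_000_000, 1_000_250, 1_003_000]`.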

Export to Parquet for DuckDB analysis:

pcap-toolkit export capture.pcap --output traffic.parquet
# duckdb: SELECT src_ip, dst_ip, dst_port, count(*) FROM read_parquet('traffic.parquet') GROUP BY ALL

Incident triage — get flow stats, then filter to suspicious traffic:

pcap-toolkit stats capture.pcap
pcap-toolkit sort capture.pcap --output suspicious.pcap \
    --src-ip 10.0.0.0/8 --filter "tcp and dst port 443"
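A CIDR rule such as `10.0.0.0/8` boils down to comparing the leading prefix bits of two addresses. The following is an illustrative sketch for IPv4 only; `in_cidr` is a hypothetical function and the toolkit's own matching logic may differ.

```rust
use std::net::Ipv4Addr;

/// Returns true when `addr` falls inside `network/prefix_len`.
/// Prefix lengths above 32 are treated as 32.
fn in_cidr(addr: Ipv4Addr, network: Ipv4Addr, prefix_len: u8) -> bool {
    let bits = u32::from(prefix_len.min(32));
    // A /0 prefix matches everything; shifting by 32 would overflow,
    // so it is handled as a special case.
    let mask = if bits == 0 { 0 } else { u32::MAX << (32 - bits) };
    (u32::from(addr) & mask) == (u32::from(network) & mask)
}
```

With this definition, `10.1.2.3` matches `10.0.0.0/8` while `11.0.0.1` does not.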

Merge and time-slice a month of captures into hourly files:

pcap-toolkit sort day*.pcap --output /archive/ --slice 1h --on-disk
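Time-slicing amounts to rounding each timestamp down to the start of its slice and routing the packet to the file for that slice. A small sketch of the bucketing step, assuming whole-second timestamps and slices aligned to the Unix epoch (the helper names are hypothetical, and how the toolkit aligns and names its slices is not specified here):

```rust
use std::collections::BTreeMap;

/// Start of the slice that contains `ts_secs`.
/// `slice_secs` must be non-zero (3600 for hourly slices).
fn slice_start(ts_secs: u64, slice_secs: u64) -> u64 {
    ts_secs - ts_secs % slice_secs
}

/// Count packets per slice; the BTreeMap keeps slices in chronological order.
fn packets_per_slice(ts_secs: &[u64], slice_secs: u64) -> BTreeMap<u64, usize> {
    let mut counts = BTreeMap::new();
    for &ts in ts_secs {
        *counts.entry(slice_start(ts, slice_secs)).or_insert(0) += 1;
    }
    counts
}
```

Since the sorted output is already chronological, a writer only ever needs one slice file open at a time: it switches files whenever `slice_start` changes.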

Installation

cargo install pcap-toolkit

Pre-built binaries for Linux (x86_64, aarch64), macOS, and Windows are available on the releases page. Replay requires CAP_NET_RAW on Linux.

Repo · crates.io · lib.rs