PCAP Toolkit
A colleague’s problem kicked this off: replaying PCAPs through a Suricata probe was painful because timestamps had to be manually shifted to align with the replay window. That one fix quickly grew into a full-featured PCAP processing toolkit written in Rust — designed around a core constraint: captures that exceed available RAM.
The two-pass chronological sort keeps only ~20 bytes per packet in memory (timestamp, offset, length), so it scales to terabyte-sized captures without loading packet data into RAM. A second standout is the pure-Rust BPF filter parser, which needs no libpcap linkage and covers the tcpdump grammar used in practice.
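The two-pass idea can be sketched roughly as follows. This is a minimal illustration, not the toolkit's actual code: it assumes a classic little-endian pcap file (24-byte global header, 16-byte record headers, microsecond timestamps), and the names `IndexEntry`, `build_index`, and `write_sorted` are hypothetical. The three fields add up to 20 bytes of data, though the compiler may pad the in-memory struct.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

const GLOBAL_HEADER_LEN: u64 = 24; // classic pcap global header
const RECORD_HEADER_LEN: usize = 16; // ts_sec, ts_usec, incl_len, orig_len

/// Hypothetical index entry: 8 + 8 + 4 = 20 bytes of field data per packet.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IndexEntry {
    pub ts_micros: u64, // timestamp in microseconds
    pub offset: u64,    // byte offset of the record header in the input
    pub len: u32,       // captured length of the packet body
}

/// Pass 1: read record headers only and seek past every packet body.
pub fn build_index<R: Read + Seek>(input: &mut R) -> io::Result<Vec<IndexEntry>> {
    let mut index = Vec::new();
    let mut offset = input.seek(SeekFrom::Start(GLOBAL_HEADER_LEN))?;
    let mut hdr = [0u8; RECORD_HEADER_LEN];
    loop {
        match input.read_exact(&mut hdr) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        let sec = u32::from_le_bytes([hdr[0], hdr[1], hdr[2], hdr[3]]) as u64;
        let usec = u32::from_le_bytes([hdr[4], hdr[5], hdr[6], hdr[7]]) as u64;
        let len = u32::from_le_bytes([hdr[8], hdr[9], hdr[10], hdr[11]]);
        index.push(IndexEntry { ts_micros: sec * 1_000_000 + usec, offset, len });
        // Skip the body; the returned position is the next record's offset.
        offset = input.seek(SeekFrom::Current(len as i64))?;
    }
    Ok(index)
}

/// Pass 2: sort the small index, then copy records in timestamp order.
pub fn write_sorted<R: Read + Seek, W: Write>(
    input: &mut R,
    index: &mut [IndexEntry],
    out: &mut W,
) -> io::Result<()> {
    index.sort_by_key(|e| e.ts_micros); // stable: ties keep capture order
    let mut global = [0u8; GLOBAL_HEADER_LEN as usize];
    input.seek(SeekFrom::Start(0))?;
    input.read_exact(&mut global)?;
    out.write_all(&global)?;
    let mut buf = Vec::new(); // holds a single record at a time
    for e in index.iter() {
        buf.resize(RECORD_HEADER_LEN + e.len as usize, 0);
        input.seek(SeekFrom::Start(e.offset))?;
        input.read_exact(&mut buf)?;
        out.write_all(&buf)?;
    }
    Ok(())
}
```

Only the index and one record buffer are resident at any time, so memory grows with packet count rather than capture size. The price is random-access reads during the second pass.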
- stats — streaming single-pass summary: time range, packet/byte counts, unique IPs, and per-flow statistics keyed on 5-tuple with deterministic bidirectional flow IDs
- filtering — composable rules: protocol, IP/CIDR, port ranges, TCP flags, packet length, time window, and BPF expressions (tcp and dst port 443)
- two-pass sorting — RAM-efficient chronological merge across multiple input files; optional on-disk index (~20 MB / 1 M packets) for massive captures; output can be time-sliced into hourly/daily files
- traffic modification — timestamp shifting, per-protocol payload truncation, IP address mapping including cross-family IPv4↔IPv6 remapping with full header and checksum recalculation
- export — JSON (JSONL), Apache Parquet, and Apache Avro for direct ingestion into DuckDB, Spark, Snowflake, or Elasticsearch; optional Zstd payload compression
- replay — honour original inter-packet timing or apply a speed multiplier; send to one or multiple interfaces simultaneously
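One way to get a deterministic bidirectional flow ID from a 5-tuple is to put the two endpoints in a canonical order before hashing, so both directions of a conversation map to the same key. The sketch below illustrates that idea only. The `FlowKey` type, its field layout, and the choice of FNV-1a as the hash are assumptions, not the toolkit's documented scheme.

```rust
use std::net::IpAddr;

/// Hypothetical flow key: endpoints stored in sorted order, plus protocol.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct FlowKey {
    pub lo: (IpAddr, u16), // the smaller (address, port) endpoint
    pub hi: (IpAddr, u16), // the larger (address, port) endpoint
    pub proto: u8,         // IP protocol number, e.g. 6 = TCP, 17 = UDP
}

/// FNV-1a step over a byte slice, continuing from state `h`.
fn fnv1a(bytes: &[u8], mut h: u64) -> u64 {
    for b in bytes {
        h ^= *b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

impl FlowKey {
    /// Build a direction-independent key by ordering the two endpoints.
    pub fn new(src: IpAddr, sport: u16, dst: IpAddr, dport: u16, proto: u8) -> Self {
        let a = (src, sport);
        let b = (dst, dport);
        let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
        FlowKey { lo, hi, proto }
    }

    /// 64-bit ID that does not depend on process, run, or packet direction.
    pub fn id(&self) -> u64 {
        let mut h = 0xcbf2_9ce4_8422_2325u64; // FNV-1a offset basis
        for (ip, port) in [self.lo, self.hi] {
            h = match ip {
                IpAddr::V4(v4) => fnv1a(&v4.octets(), h),
                IpAddr::V6(v6) => fnv1a(&v6.octets(), h),
            };
            h = fnv1a(&port.to_be_bytes(), h);
        }
        fnv1a(&[self.proto], h)
    }
}
```

A fixed, hand-written hash is used here instead of the standard library's `HashMap` default because that hasher is randomly seeded per process, which would make IDs differ between runs.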
Compared to similar tools
| Feature | pcap-toolkit | tshark | tcpreplay | gopherCap |
|---|---|---|---|---|
| RAM-efficient sort (TB-scale) | ✅ | ❌ | ❌ | ❌ |
| Chronological merge across files | ✅ | ❌ | ❌ | ❌ |
| BPF expression filtering | ✅ | ✅ | ✅ | ✅ |
| Timestamp shifting | ✅ | ❌ | ✅ | ✅ |
| IP address remapping | ✅ | ❌ | ✅ | ❌ |
| Cross-family IPv4↔IPv6 remap | ✅ | ❌ | ❌ | ❌ |
| Parquet / Avro export | ✅ | ❌ | ❌ | ❌ |
| Per-flow statistics | ✅ | ✅ | ❌ | ❌ |
| Multi-interface simultaneous replay | ✅ | ❌ | ❌ | ✅ |
| No libpcap dependency | ✅ | ❌ | ❌ | ❌ |
tshark remains the best choice for interactive protocol inspection and deep display-filter queries. tcpreplay excels at replaying at line rate. gopherCap handles large-scale replay scenarios. pcap-toolkit fills the gap between them: batch processing, pipeline integration, and traffic modification at scale — without requiring libpcap on the host.
Common workflows
Prepare a capture for Suricata replay (merge a week of files, shift timestamps to begin at the chosen start time, then replay):
pcap-toolkit sort week/*.pcap --output sorted.pcap --timestamp-start 2024-06-01T00:00:00Z
pcap-toolkit replay sorted.pcap --interface eth0
Export to Parquet for DuckDB analysis:
pcap-toolkit export capture.pcap --output traffic.parquet
# duckdb: SELECT src_ip, dst_ip, dst_port, count(*) FROM read_parquet('traffic.parquet') GROUP BY ALL
Incident triage — get flow stats, then filter to suspicious traffic:
pcap-toolkit stats capture.pcap
pcap-toolkit sort capture.pcap --output suspicious.pcap \
--src-ip 10.0.0.0/8 --filter "tcp and dst port 443"
Merge and time-slice a month of captures into hourly files:
pcap-toolkit sort day*.pcap --output /archive/ --slice 1h --on-disk
Installation
cargo install pcap-toolkit
Pre-built binaries for Linux (x86_64, aarch64), macOS, and Windows are available on the releases page. Replay requires CAP_NET_RAW on Linux.