Ultimately, upstreaming didn’t happen, even though we ironed out a lot of the
overhead, ensuring that it could be compiled out cleanly and had zero runtime
overhead if the syscall wasn’t used to enable it. This was largely due to a
shift to the TCP_INFO socket option for all TCP stats, rather than the
dedicated RFC4898 instrumentation.
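For context, TCP_INFO is read with a single `getsockopt` call that returns the kernel’s `struct tcp_info`. Here’s a minimal, Linux-only sketch of pulling a few fields out of it; the byte offsets assume the field order in `linux/tcp.h`, where `tcpi_rtt` is the 16th `u32` after eight leading `u8` fields:

```python
# Sketch: reading a few TCP stats via the TCP_INFO socket option.
# Linux-only; offsets assume the struct tcp_info layout in linux/tcp.h.
import socket
import struct

def tcp_info_snapshot(sock: socket.socket) -> dict:
    """Return a small subset of struct tcp_info for a connected TCP socket."""
    buf = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
    # First 8 bytes are u8 fields: state, ca_state, retransmits, probes,
    # backoff, options, and two bytes of bitfields.
    state = buf[0]
    # u32 fields follow; tcpi_rtt (smoothed RTT, usec) and tcpi_rttvar are
    # the 16th and 17th u32s, i.e. offset 8 + 15 * 4.
    rtt_us, rttvar_us = struct.unpack_from("<II", buf, 8 + 15 * 4)
    return {"state": state, "rtt_us": rtt_us, "rttvar_us": rttvar_us}

if __name__ == "__main__":
    # Loopback connection purely to have a live socket to inspect.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    cli = socket.create_connection(srv.getsockname())
    peer, _ = srv.accept()
    print(tcp_info_snapshot(cli))
    cli.close(); peer.close(); srv.close()
```

Handy, but it’s a fixed snapshot struct: anything not in `tcp_info` simply isn’t observable this way.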
This has itched at the back of my brain ever since, as the RFC4898 spec is far
richer than anything that could be captured in tcp_info, and arguably more
useful.
Thanks to Kris Nóva, I heard about eBPF, and specifically how it allows a sandboxed userspace program to attach to kernel functions. It was already being used for network traffic analysis, and even rerouting, but it struck me that maybe it could also be used to collect rich statistics on TCP connections.
It’s taken a few attempts to get the approach right, and BPF itself is quite finicky: the C code is only validated at runtime, the error messages are arcane, and knowing when to copy things from kernel space to userspace isn’t well documented. Still, it can be picked up (even if I don’t yet feel like I know exactly what I’m doing).
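To give a flavour of what a kernel-side hook looks like, here’s a sketch in the libbpf/CO-RE style that counts retransmissions per socket by attaching a kprobe to `tcp_retransmit_skb`. The hook choice and map name are illustrative, not the project’s actual code, and it needs a BPF toolchain and a running kernel to load:

```c
// Sketch of a kernel-side eBPF program (libbpf/CO-RE style). The hooked
// function and the map are illustrative; this is not the project's code.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

// Per-connection retransmit counters, keyed by the socket pointer.
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, u64);
    __type(value, u64);
} retrans_count SEC(".maps");

// Fires whenever the kernel retransmits a segment on a TCP socket.
SEC("kprobe/tcp_retransmit_skb")
int BPF_KPROBE(count_retransmit, struct sock *sk)
{
    u64 key = (u64)sk;
    u64 one = 1, *val;

    val = bpf_map_lookup_elem(&retrans_count, &key);
    if (val)
        __sync_fetch_and_add(val, 1);
    else
        bpf_map_update_elem(&retrans_count, &key, &one, BPF_ANY);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

A userspace loader then reads the map periodically; the verifier rejects anything it can’t prove safe, which is where most of the arcane error messages come from.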
In its current form, the project is hooked into a few kernel functions, allowing ~half of the RFC4898 metrics to be collected. It dumps the results to stdout as JSON, and the latest set of commits has refined the project structure. It’s just beyond “proof of concept” phase.
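Since the output is one JSON object per line on stdout, it pipes nicely into other tools. As a sketch of what consuming it might look like, with hypothetical field names (`src`, `segs_retrans`) standing in for the real schema:

```python
# Sketch: ranking connections from JSON-lines output on stdin.
# Field names ("src", "segs_retrans") are illustrative, not the real schema.
import json
import sys

def worst_connections(lines, n=3):
    """Return the n records with the most retransmitted segments."""
    records = [json.loads(line) for line in lines if line.strip()]
    return sorted(records, key=lambda r: r.get("segs_retrans", 0),
                  reverse=True)[:n]

if __name__ == "__main__":
    for rec in worst_connections(sys.stdin):
        print(rec)
```

Something like `collector | python3 rank.py` would then surface the lossiest connections first.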
Next up:

- Add more hooks
- Validate the metrics
- Add a bunch of tests!
If any of you reading this are interested in helping out, the repo is open for business!