eBPF. It doesn’t stand for anything. But it might mean bank
March 10, 2025
Meta says it has managed to reduce the CPU cycles of its top services by 20 percent through its Strobelight profiling orchestration suite, which relies on the open source eBPF project.
Fans of the mass audience monetization biz may be delighted to learn that this translates to a 10 to 20 percent reduction in the number of servers required for Facebooking, Instagramming, WhatsApping, and whatever it is that people do with headsets and avatars.
eBPF doesn’t stand for anything anymore. It used to be an acronym for extended Berkeley Packet Filter but its remit has expanded to the point that the alphabet jumble is no longer moored to a limiting identity.
The open source software, which has its own foundation, provides a way to run sandboxed programs within the operating system kernel, for Linux and, as a work-in-progress, for Windows. The idea is to run software relatively safely within a privileged kernel context without having to build and insert kernel modules, package the software as a driver, or recompile the kernel to include the desired functionality.
Running within the kernel is useful for service optimization, particularly at scale where small bottlenecks and inefficiencies can be amplified to great detriment. Collecting data across a diverse set of systems without degrading performance, such that the data is consistent and interpretable across multiple kernel versions, is not a trivial challenge.
Meta developed open source Strobelight, which orchestrates a variety of profiling applications that utilize eBPF, to collect observability data – logs of system events, metrics that measure performance, traces of network connections. Its goal was to make its infrastructure more efficient, which reduces expenses and has operational advantages.
“eBPF allows the safe injection of custom code into the kernel, which enables very low overhead collection of different types of data and unlocks so many possibilities in the observability space that it’s hard to imagine how Strobelight would work without it,” said Meta software engineer Jordan Rome in January.
Strobelight presently consists of 42 different profiling applications, a number of arguable significance. These profilers measure memory, function call counts, events in various programming languages, AI GPU usage, service request latency, and so on.
As noted in the eBPF Foundation’s recent case study of Meta’s stupendous server savings, 15,000 servers’ worth of annual capacity were saved with a single one-character code change.
It was an ampersand (&). But it will be evaluated by Meta bean counters as a dollar sign.
According to Rome, “A seasoned performance engineer was looking through Strobelight data and discovered that by filtering on a particular std::vector function call (using the symbolized file and line number) he could identify computationally expensive array copies that happen unintentionally with the ‘auto’ keyword in C++.”
After finding one of these costly array copies in the path of one of Meta’s major ad services, the engineer determined that the vector copy wasn’t intentional. So he added an “&” after the auto keyword to turn the copy into a reference, which avoids unnecessary data duplication by pointing to the data rather than reproducing it.
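For readers who don't speak C++, the difference can be sketched roughly as follows. This is an illustration only, not Meta's code: the names Tracked, Service, make_service, and the two counting functions are all hypothetical, and the copy counter exists purely to make the hidden work visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical global counter: how many element copies have happened.
std::size_t g_copies = 0;

// Hypothetical element type that records every copy made of it.
struct Tracked {
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++g_copies; }
};

// Hypothetical stand-in for a service object that exposes its data by reference.
struct Service {
    std::vector<Tracked> items;
    const std::vector<Tracked>& get_items() const { return items; }
};

// Builds a service holding n elements without triggering any copies.
Service make_service(std::size_t n) {
    Service s;
    s.items.reserve(n);  // reserve up front so growth never copies elements
    for (std::size_t i = 0; i < n; ++i) s.items.emplace_back(static_cast<int>(i));
    return s;
}

// Plain `auto` deduces std::vector<Tracked>, dropping the reference,
// so the entire vector is duplicated element by element.
std::size_t copies_with_auto(const Service& s) {
    g_copies = 0;
    auto items = s.get_items();
    assert(items.size() == s.items.size());
    return g_copies;
}

// `auto&` deduces const std::vector<Tracked>&, so nothing is duplicated.
std::size_t copies_with_auto_ref(const Service& s) {
    g_copies = 0;
    auto& items = s.get_items();
    assert(items.size() == s.items.size());
    return g_copies;
}
```

Under these assumptions, calling copies_with_auto on a service holding 1,000 elements reports 1,000 copies, while copies_with_auto_ref reports none. Both functions see exactly the same data; the only difference in the source is the one character.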
“It was a one-character commit, which, after it was shipped to production, equated to an estimated 15,000 servers in capacity savings per year,” said Rome.
One can only imagine the savings to be had from applying the delete character. ®