zkalc's methodology
Disclaimer: zkalc aims to provide adequate results while being easy to use. Actual results will differ from our estimates.
On this page, we describe our benchmarking pipeline and how we derive our results. In short:
- For each supported operation, we run benchmarks that measure its performance. We use criterion.rs to take multiple samples and then select the average.
- We collect the benchmark results inside `frontend/data/` and make them freely available.
- For each operation, we fit a function to its benchmark results. We use linear interpolation inside the benchmark bounds and least squares regression outside the benchmark bounds.
- When a user queries zkalc for an operation of size $n$, we estimate its running time by evaluating the fitted function at $n$.
We will now go deeper into the above process, also linking to the relevant places in the codebase.
Running benchmarks
Inside `backend/`, we collect benchmarks for all the supported public-key operations under different configurations. All benchmarks were run with multithreading enabled and with the best settings available.
Consider this benchmark for the arkworks library. At compile time, we enable the feature flags `parallel`, `std`, and `asm`. The microbenchmarks inside look a bit like the following:
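(What follows is our own hedged sketch, not the verbatim benchmark from `backend/`; the operation name `add_ff` and the setup are illustrative.)

```rust
use ark_bls12_381::Fr;
use ark_ff::UniformRand;
use criterion::{criterion_group, criterion_main, Criterion};

// Measure one addition in the BLS12-381 scalar field. criterion.rs
// runs the closure many times and reports statistics over the samples.
fn bench_add_ff(c: &mut Criterion) {
    let mut rng = ark_std::test_rng();
    c.bench_function("add_ff", |b| {
        let lhs = Fr::rand(&mut rng);
        let rhs = Fr::rand(&mut rng);
        b.iter(|| lhs + rhs)
    });
}

criterion_group!(benches, bench_add_ff);
criterion_main!(benches);
```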
Running `make` inside `backend/` will run every benchmark, storing results inside `perf/`, where they are post-processed and collected into our API format.
Collecting benchmarks
We store our benchmarking results as JSON inside `frontend/data`, and they are free for everyone to use.
Given a curve key `curve` (available in `curves.json`), a library key `lib` that supports the given curve (available in `libraries.json`), and a machine key `machine` (available in `machines.json`), the samples can be fetched from the file `data/{curve}/{library}/{machine}.json`.
For instance, this is the file reporting the benchmarks for curve BLS12-381 using arkworks on an M1 Pro 2021:
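(The real file is larger; the snippet below is only an illustrative sketch of its shape, with placeholder field names and made-up numbers rather than the authoritative schema.)

```json
{
  "add_ff": { "range": [1], "results": [14.2] },
  "msm_G1": { "range": [2, 4, 8], "results": [35000.0, 52000.0, 83000.0] }
}
```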
Estimating the running time
Now we have benchmark data for every operation in the `perf/` directory. The next step is to fit a function $f$ to each operation's data, so that when a user queries us for an operation of arbitrary size $n$, we can answer by evaluating $f(n)$.
We consider simple operations like basic scalar multiplication and field addition (which are not amortized) to be sequential computations. That is, if a single operation takes $x$ seconds, then $n$ such operations will take $n \cdot x$ seconds. That results in the simple linear function $f(n) = n \cdot x$.
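For instance, with an illustrative (made-up) timing of $x = 50\,\text{ns}$ per operation, the model predicts

$$f(10^6) = 10^6 \cdot 50\,\text{ns} = 50\,\text{ms}.$$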
More complicated operations like MSMs and pairing products are amortized, and their performance doesn't follow a simple linear curve. For such operations, we collect benchmark data for various sizes. For example, consider the figure below, which displays the benchmark data from an MSM operation over a range of sizes (both axes are in log scale):
To answer user queries within the benchmark range, we perform linear interpolation over the benchmark data.
That is, for each pair of adjacent benchmark points $(n_i, t_i)$ and $(n_{i+1}, t_{i+1})$, we trace the line that goes through both points. We end up with a piecewise linear function that covers the entire benchmark range, as we can see in the figure below:
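Concretely, for a query $n$ with $n_i \le n \le n_{i+1}$, the segment's estimate is the standard line through the two surrounding samples (a sketch of the idea; the code may equivalently work in log-log coordinates, matching the plot):

$$f(n) = t_i + \frac{t_{i+1} - t_i}{n_{i+1} - n_i}\,(n - n_i).$$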
For user queries outside of the benchmarking range, we extrapolate via non-linear least squares. To make things more exciting, we decided that it should be done... in JavaScript, inside your browser.
Here is an example of the extrapolation behavior for an MSM outside of the benchmarking range (that is, beyond the largest benchmarked size):
In the specific case of MSMs, Pippenger's algorithm is well known to have asymptotic complexity $O(n / \log n)$. Hence, we use least squares to fit the data set to a function of the form $h(n) = a \cdot \frac{n}{\log n} + b$, solving for the coefficients $a$ and $b$.
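Spelled out, the fit chooses the coefficients that minimize the squared residuals over the benchmark samples $(n_i, t_i)$:

$$\min_{a,\,b} \sum_i \left( t_i - a \cdot \frac{n_i}{\log n_i} - b \right)^2.$$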
We do not expect extrapolation to faithfully follow the benchmarks. We believe, however, that the estimates provide a rough idea of how long an algorithm will take. In addition, it's worth pointing out that while we try to provide useful results for arbitrary operation sizes, we cannot guarantee that the requested machine will be able to handle the requested size. For example, a commodity laptop will likely run out of memory before it finishes computing a sufficiently large MSM.
At the end of this process, we end up with a piecewise function for each operation that we can query inside and outside the benchmarking range to answer user queries.
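To make the shape of such a piecewise function concrete, here is a minimal Rust sketch (the production estimator lives in the JavaScript frontend; the function name, the below-range fallback, and all numbers here are assumptions of this sketch, not zkalc's exact logic):

```rust
/// `samples`: (size, seconds) pairs sorted by size, from the benchmarks.
/// `(a, b)`: least-squares coefficients for the `a * n / ln(n) + b` tail.
fn estimate(samples: &[(f64, f64)], (a, b): (f64, f64), n: f64) -> f64 {
    let (n_min, t_min) = samples[0];
    let (n_max, _) = *samples.last().unwrap();
    if n <= n_min {
        // Below the smallest benchmark: assume linear scaling (a guess).
        return t_min * n / n_min;
    }
    if n > n_max {
        // Beyond the largest benchmark: Pippenger-shaped extrapolation.
        return a * n / n.ln() + b;
    }
    // Inside the range: trace the line through the two surrounding points.
    let i = samples.partition_point(|&(ni, _)| ni < n) - 1;
    let ((n0, t0), (n1, t1)) = (samples[i], samples[i + 1]);
    t0 + (t1 - t0) * (n - n0) / (n1 - n0)
}

fn main() {
    // Made-up MSM samples and fitted coefficients, for demonstration only.
    let samples = [(1024.0, 0.010), (2048.0, 0.018), (4096.0, 0.033)];
    let coeffs = (1.2e-5, 0.0);
    println!("estimated: {:.3} s", estimate(&samples, coeffs, 3000.0));
}
```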
Hazmat
We provide an estimator function for everyone to use inside the browser's console. Just open Developer Tools and check out the function `estimator`.
There are still lots of ways we can improve zkalc. If you want to hack around, and you don't generally like opening the browser console when a stranger tells you to, check out the TODO file to see how you can also help!