@thevilledev
Contributor

Motivation

The min, max, mean, and median builtin functions used reflection for every element of the input array, calling reflect.Value.Index(i).Interface() in a loop. Each .Interface() call boxes the element into an interface value, which allocates on the heap, so a 100-element input costs more than 100 allocations per call.
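The allocation pattern described above can be reproduced in isolation. The following is a minimal sketch, not the library's code: minReflect, minDirect, and makeInput are hypothetical helpers, and the input values are kept above 1000 on the assumption that this avoids any small-integer boxing shortcut in the runtime.

```go
package main

import (
	"fmt"
	"reflect"
	"testing"
)

// minReflect mimics the pattern from the motivation: every element is boxed
// through reflect.Value.Interface() before it is compared.
// Hypothetical stand-in, not the library's builtin. Expects a non-empty []int.
func minReflect(input any) int {
	v := reflect.ValueOf(input)
	best := v.Index(0).Interface().(int)
	for i := 1; i < v.Len(); i++ {
		if x := v.Index(i).Interface().(int); x < best {
			best = x
		}
	}
	return best
}

// minDirect iterates the typed slice without reflection.
// Expects a non-empty slice.
func minDirect(input []int) int {
	best := input[0]
	for _, x := range input[1:] {
		if x < best {
			best = x
		}
	}
	return best
}

// makeInput builds n descending ints, all greater than 1000.
func makeInput(n int) []int {
	data := make([]int, n)
	for i := range data {
		data[i] = 1000 + (n - i)
	}
	return data
}

func main() {
	data := makeInput(100)
	var boxed any = data // box the slice once, outside the measured code

	reflectAllocs := testing.AllocsPerRun(100, func() { _ = minReflect(boxed) })
	directAllocs := testing.AllocsPerRun(100, func() { _ = minDirect(data) })

	fmt.Printf("reflection: %.0f allocs/op, direct: %.0f allocs/op\n",
		reflectAllocs, directAllocs)
}
```

testing.AllocsPerRun reports the average number of heap allocations per call, which makes it a quick check before writing full benchmarks. Exact counts depend on the Go version, so none are stated here.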

Changes

Avoid reflection and per-element allocations for common typed slices ([]int, []float64, []any) by adding type-switch fast paths that iterate directly without calling reflect.Value.Interface().
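For readers who have not looked at the diff, the shape of such a fast path is roughly as follows. This is an illustrative sketch under assumptions, not the code in this PR: minNumber, minOf, minReflect, and elemToFloat are hypothetical names, the result is always returned as float64 for brevity, the empty-input error is an assumption, and the fallback reads values through reflect.Value accessors rather than reproducing the original reflection path.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

var errEmpty = errors.New("min: empty input")

// minOf returns the smallest element of a non-empty typed slice.
func minOf[T int | float64](s []T) T {
	best := s[0]
	for _, x := range s[1:] {
		if x < best {
			best = x
		}
	}
	return best
}

// minNumber tries the typed fast paths first and only then uses reflection.
func minNumber(input any) (float64, error) {
	switch s := input.(type) {
	case []int:
		if len(s) == 0 {
			return 0, errEmpty
		}
		return float64(minOf(s)), nil // no reflection, no per-element boxing
	case []float64:
		if len(s) == 0 {
			return 0, errEmpty
		}
		return minOf(s), nil
	}
	return minReflect(input) // e.g. []int32, []uint64, custom numeric types
}

// minReflect is a simplified reflection fallback for other slice types.
func minReflect(input any) (float64, error) {
	v := reflect.ValueOf(input)
	if v.Kind() != reflect.Slice && v.Kind() != reflect.Array {
		return 0, fmt.Errorf("min: unsupported argument type %T", input)
	}
	if v.Len() == 0 {
		return 0, errEmpty
	}
	var best float64
	for i := 0; i < v.Len(); i++ {
		x, err := elemToFloat(v.Index(i))
		if err != nil {
			return 0, err
		}
		if i == 0 || x < best {
			best = x
		}
	}
	return best, nil
}

// elemToFloat reads a numeric reflect.Value without calling Interface().
func elemToFloat(v reflect.Value) (float64, error) {
	if v.Kind() == reflect.Interface {
		v = v.Elem() // unwrap elements of []any
	}
	switch {
	case v.CanInt():
		return float64(v.Int()), nil
	case v.CanUint():
		return float64(v.Uint()), nil
	case v.CanFloat():
		return v.Float(), nil
	}
	return 0, fmt.Errorf("min: unsupported element kind %s", v.Kind())
}

func main() {
	fmt.Println(minNumber([]int{3, 1, 2}))        // fast path
	fmt.Println(minNumber([]float64{2.5, -1, 4})) // fast path
	fmt.Println(minNumber([]int32{7, 5, 9}))      // reflection fallback
}
```

The type switch costs a single type comparison per call, so inputs that miss the fast path pay almost nothing extra before reaching the fallback.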

Added benchmarks:

  • Benchmark_min
  • Benchmark_max
  • Benchmark_mean
  • Benchmark_median

Benchstat results against master branch:

cpu: Apple M1
          │  master.out  │               fix.out               │
          │    sec/op    │   sec/op     vs base                │
_min-8      2850.5n ± 0%   427.7n ± 1%  -85.00% (p=0.000 n=10)
_max-8      2877.5n ± 1%   443.2n ± 0%  -84.60% (p=0.000 n=10)
_mean-8     2140.0n ± 1%   227.2n ± 0%  -89.38% (p=0.000 n=10)
_median-8    4.801µ ± 1%   1.360µ ± 1%  -71.67% (p=0.000 n=10)
geomean      3.030µ        491.9n       -83.76%

          │  master.out  │               fix.out                │
          │     B/op     │     B/op      vs base                │
_min-8      1648.00 ± 0%     48.00 ± 0%  -97.09% (p=0.000 n=10)
_max-8      1648.00 ± 0%     48.00 ± 0%  -97.09% (p=0.000 n=10)
_mean-8     1656.00 ± 0%     56.00 ± 0%  -96.62% (p=0.000 n=10)
_median-8   4.391Ki ± 0%   2.047Ki ± 0%  -53.38% (p=0.000 n=10)
geomean     2.071Ki          128.2       -93.95%

          │  master.out  │              fix.out               │
          │  allocs/op   │ allocs/op   vs base                │
_min-8      102.000 ± 0%   2.000 ± 0%  -98.04% (p=0.000 n=10)
_max-8      102.000 ± 0%   2.000 ± 0%  -98.04% (p=0.000 n=10)
_mean-8     103.000 ± 0%   3.000 ± 0%  -97.09% (p=0.000 n=10)
_median-8    211.00 ± 0%   11.00 ± 0%  -94.79% (p=0.000 n=10)
geomean       122.6        3.390       -97.24%

Further comments

Other typed slices (e.g., []int32, []uint64, custom numeric types) fall back to the original reflection-based path. Open to feedback on whether more typed slices should be included here.

Similar optimizations could potentially be applied to other builtins like sum.

@antonmedv
Member

Nice! I like such optimizations!

@thevilledev
Contributor Author

Thanks! So much fun :)

Needs more tests and fine-tuning, but I'll mark it ready once there 🤞

@thevilledev changed the title from "perf(builtin): min, max, mean, median fast paths" to "WIP perf(builtin): min, max, mean, median fast paths" on Jan 21, 2026
@thevilledev force-pushed the perf/min-max-median-mean branch from b9bb62d to 0918bbd on January 22, 2026 09:03
Avoid reflection and per-element allocations for common typed slices
([]int, []float64, []any) by adding type-switch fast paths that iterate
directly without calling reflect.Value.Interface().

For []any containing numeric types, the fast path handles int and float64
directly, falling back to reflection for other numeric types (int32, etc.)
to keep the code compact while still avoiding per-element recursion.

Falls back to reflection for other slice types to maintain compatibility.

Signed-off-by: Ville Vesilehto <ville@vesilehto.fi>
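
The []any handling described in the commit message above could look roughly like this. It is a sketch under assumptions rather than the committed code: meanAny and toFloat are hypothetical names, and returning an error for empty input is an assumption about the desired behavior.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
)

// toFloat converts one element to float64. int and float64 are handled by a
// plain type switch; other numeric types (int32, uint8, ...) use reflection.
func toFloat(x any) (float64, error) {
	switch n := x.(type) {
	case int:
		return float64(n), nil
	case float64:
		return n, nil
	}
	v := reflect.ValueOf(x)
	switch {
	case v.CanInt():
		return float64(v.Int()), nil
	case v.CanUint():
		return float64(v.Uint()), nil
	case v.CanFloat():
		return v.Float(), nil
	}
	return 0, fmt.Errorf("mean: invalid element type %T", x)
}

// meanAny averages a []any in one flat loop, with no per-element recursion.
func meanAny(items []any) (float64, error) {
	if len(items) == 0 {
		return 0, errors.New("mean: empty input") // assumed behavior
	}
	var sum float64
	for _, item := range items {
		x, err := toFloat(item)
		if err != nil {
			return 0, err
		}
		sum += x
	}
	return sum / float64(len(items)), nil
}

func main() {
	// int and float64 take the type-switch path; int32 uses reflection.
	fmt.Println(meanAny([]any{1, 2.5, int32(4)}))
	fmt.Println(meanAny([]any{1, "two"}))
}
```

Keeping only int and float64 in the type switch matches the trade-off stated in the commit message: the common cases stay cheap, and the rarer numeric types share one reflection branch instead of a dozen extra cases.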
@thevilledev force-pushed the perf/min-max-median-mean branch from 0918bbd to 4e035dd on January 22, 2026 09:23