
[Repo Assist] Perf: optimise filterAsync, chooseAsync, and foldAsync with direct enumerators #276

Merged
dsyme merged 4 commits into main from
repo-assist/perf-filterasync-chooseasync-foldasync-2026-03-13-1740bdfb976aa7c3
Mar 14, 2026

@github-actions (Contributor)

🤖 This PR was created by Repo Assist, an automated AI assistant. See #275.

Closes #275

Root cause

Three high-frequency combinators were implemented using the asyncSeq computation builder (or an indirect composition), which routes every element through the AsyncGenerator / GenerateCont machinery — allocating generator objects, dispatching through virtual Apply() calls, and building right-associating continuation chains. For straightforward, non-parallel operations this overhead is unnecessary.

Changes

| Combinator | Before | After |
|---|---|---|
| filterAsync | asyncSeq { for … } builder | OptimizedFilterAsyncEnumerator (direct while-loop) |
| chooseAsync (non-AsyncSeqOp path) | asyncSeq { for … } builder | OptimizedChooseAsyncEnumerator (direct while-loop) |
| foldAsync (non-AsyncSeqOp path) | scanAsync + lastOrDefault | direct loop, no intermediate sequence |

The new enumerators follow exactly the same pattern already used for mapAsync (OptimizedMapAsyncEnumerator, introduced in an earlier performance pass). No public API or semantics change.
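To illustrate the foldAsync row, here is a minimal sketch of what a direct loop with no intermediate sequence can look like. The `IAsyncEnum` interface, `ofList`, and `foldAsyncDirect` are hypothetical stand-ins written for this example. They are not the library's actual types or the code merged in this PR.

```fsharp
// Hypothetical minimal pull-based enumerator; a stand-in, not the library's real type.
type IAsyncEnum<'T> =
    abstract MoveNext : unit -> Async<'T option>

// Hypothetical helper that exposes a list through the interface above.
let ofList (items: 'T list) : IAsyncEnum<'T> =
    let rest = ref items
    { new IAsyncEnum<'T> with
        member _.MoveNext() =
            async {
                match rest.Value with
                | [] -> return None
                | x :: tail ->
                    rest.Value <- tail
                    return Some x } }

// Fold by pulling elements in a loop and threading the accumulator directly,
// instead of building a sequence of intermediate states and taking the last one.
let foldAsyncDirect (folder: 'S -> 'T -> Async<'S>) (state: 'S) (source: IAsyncEnum<'T>) : Async<'S> =
    async {
        let acc = ref state
        let running = ref true
        while running.Value do
            let! next = source.MoveNext()
            match next with
            | None -> running.Value <- false
            | Some x ->
                let! s = folder acc.Value x
                acc.Value <- s
        return acc.Value }

// Usage: sum the numbers 1 to 4.
let total =
    ofList [ 1 .. 4 ]
    |> foldAsyncDirect (fun s x -> async { return s + x }) 0
    |> Async.RunSynchronously
```

The only state carried across iterations is the accumulator and a loop flag, which is the kind of saving the "no intermediate sequence" entry describes.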

New benchmark classes added:

  • AsyncSeqFilterChooseFoldBenchmarks — isolates filterAsync, chooseAsync, and foldAsync at 1 000 and 10 000 elements.
  • AsyncSeqPipelineBenchmarks — measures a composed map → filter → fold pipeline and toArrayAsync.
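As a rough idea of the shape such a benchmark class can take, the sketch below uses BenchmarkDotNet attributes with the public AsyncSeq API. The class name, method names, and workloads are assumptions made for illustration. The benchmark classes actually added in this PR are not shown on this page and may differ.

```fsharp
open BenchmarkDotNet.Attributes
open FSharp.Control

// Hypothetical benchmark class; not the AsyncSeqFilterChooseFoldBenchmarks source.
[<MemoryDiagnoser>]
type FilterFoldSketchBenchmarks() =

    // Element counts matching the sizes quoted in the PR description.
    [<Params(1000, 10000)>]
    member val N = 0 with get, set

    // filterAsync where every element passes the predicate.
    [<Benchmark>]
    member this.FilterAsyncAllPass() =
        AsyncSeq.replicate this.N 1
        |> AsyncSeq.filterAsync (fun _ -> async { return true })
        |> AsyncSeq.iter ignore
        |> Async.RunSynchronously

    // foldAsync summing all elements; returns the sum so the work is observable.
    [<Benchmark>]
    member this.FoldAsync() =
        AsyncSeq.replicate this.N 1
        |> AsyncSeq.foldAsync (fun acc x -> async { return acc + x }) 0
        |> Async.RunSynchronously
```

`MemoryDiagnoser` is what produces allocation columns alongside the mean timings.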

Trade-offs

  • No behaviour change; the new enumerators are strictly more efficient.
  • The filterAsync and chooseAsync enumerators hold an inner while loop rather than co-routine-style yield, which is slightly less familiar but consistent with the existing OptimizedMapAsyncEnumerator style.
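The inner while loop mentioned above can be sketched as follows. This is an illustration under assumptions: `IAsyncEnum`, `ofList`, `filterAsyncDirect`, and `drain` are hypothetical names, and the real OptimizedFilterAsyncEnumerator is not reproduced here.

```fsharp
// Hypothetical minimal pull-based enumerator; a stand-in, not the library's real type.
type IAsyncEnum<'T> =
    abstract MoveNext : unit -> Async<'T option>

// Hypothetical helper that exposes a list through the interface above.
let ofList (items: 'T list) : IAsyncEnum<'T> =
    let rest = ref items
    { new IAsyncEnum<'T> with
        member _.MoveNext() =
            async {
                match rest.Value with
                | [] -> return None
                | x :: tail ->
                    rest.Value <- tail
                    return Some x } }

// Each MoveNext keeps pulling from the source inside a while loop until an
// element passes the predicate or the source is exhausted.
let filterAsyncDirect (pred: 'T -> Async<bool>) (source: IAsyncEnum<'T>) : IAsyncEnum<'T> =
    { new IAsyncEnum<'T> with
        member _.MoveNext() =
            async {
                let result = ref None
                let searching = ref true
                while searching.Value do
                    let! next = source.MoveNext()
                    match next with
                    | None -> searching.Value <- false
                    | Some x ->
                        let! keep = pred x
                        if keep then
                            result.Value <- Some x
                            searching.Value <- false
                return result.Value } }

// Collect everything from an enumerator into a list (helper for the usage below).
let rec drain (e: IAsyncEnum<'T>) (acc: 'T list) : Async<'T list> =
    async {
        let! next = e.MoveNext()
        match next with
        | None -> return List.rev acc
        | Some x -> return! drain e (x :: acc) }

// Usage: keep the even numbers from 1 to 6.
let evens =
    ofList [ 1 .. 6 ]
    |> filterAsyncDirect (fun x -> async { return x % 2 = 0 })
    |> fun e -> drain e []
    |> Async.RunSynchronously
```

A chooseAsync variant would have the same loop shape, with the predicate replaced by a function returning an option and the loop stopping at the first `Some` result.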

Test Status

dotnet build FSharp.Control.AsyncSeq.sln --no-incremental
→ Build succeeded, 0 errors, 7 pre-existing warnings

dotnet test FSharp.Control.AsyncSeq.sln --no-build
→ Passed! Failed: 0, Passed: 317, Skipped: 0, Total: 317

Generated by Repo Assist ·

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@346204513ecfa08b81566450d7d599556807389f

…umerators

- filterAsync: replace asyncSeq-builder with OptimizedFilterAsyncEnumerator,
  avoiding AsyncGenerator allocation and generator-chain dispatch per element.
- chooseAsync (non-AsyncSeqOp path): replace asyncSeq-builder with
  OptimizedChooseAsyncEnumerator for the same reason.
- foldAsync (non-AsyncSeqOp path): replace scanAsync+lastOrDefault composition
  with a direct loop, eliminating the intermediate async sequence and its
  generator machinery entirely.
- Add AsyncSeqFilterChooseFoldBenchmarks and AsyncSeqPipelineBenchmarks to
  measure the affected operations and catch future regressions.

All 317 existing tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@dsyme (Contributor) commented Mar 13, 2026

/repo-assist Add benchmark.net benchmarks and take them before/after

@dsyme (Contributor) commented Mar 13, 2026

Oh I see the benchmarks are there. Run them and tell me before/after

Use BenchmarkSwitcher.FromAssembly instead of custom argument parsing,
so BenchmarkDotNet CLI options (--filter, --job short, --inProcess, etc.)
work out of the box when running the benchmarks directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@github-actions (Contributor, Author)

Commit pushed: 4c4275a

Generated by Repo Assist

@github-actions (Contributor, Author)

🤖 Repo Assist ran BenchmarkDotNet benchmarks before/after this PR.

The benchmark runner was also updated in this PR to use BenchmarkSwitcher.FromAssembly so the full BenchmarkDotNet CLI (e.g. --filter, --job short, --inProcess) works out of the box.
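A benchmark entry point built on `BenchmarkSwitcher.FromAssembly` typically looks like the sketch below. This is an assumed shape, not the runner file from this PR; it uses the executing assembly to discover benchmark classes.

```fsharp
open System.Reflection
open BenchmarkDotNet.Running

[<EntryPoint>]
let main argv =
    // Hand the raw command-line arguments to BenchmarkDotNet so that its own
    // options (for example --filter or --job short) are parsed by the library
    // rather than by custom code.
    BenchmarkSwitcher
        .FromAssembly(Assembly.GetExecutingAssembly())
        .Run(argv)
    |> ignore
    0
```

With this shape, any arguments passed to the benchmark executable are interpreted by BenchmarkDotNet directly.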

Benchmarks run with --job short --inProcess (3 warmup + 3 iterations, in-process, Release build) on AMD EPYC 7763 / .NET 8.0.24.


AsyncSeqFilterChooseFoldBenchmarks

| Method | N | Before (mean) | After (mean) | Speedup | Before (alloc) | After (alloc) | Alloc reduction |
|---|---|---|---|---|---|---|---|
| filterAsync (all pass) | 1 000 | 5 439 μs | 3 178 μs | 1.71× | 6 433 KB | 3 456 KB | 1.86× |
| filterAsync (none pass) | 1 000 | 4 159 μs | 2 293 μs | 1.81× | 4 596 KB | 2 368 KB | 1.94× |
| chooseAsync (all selected) | 1 000 | 605 μs | 585 μs | 1.03× | 1 214 KB | 1 214 KB | 1.00× |
| foldAsync | 1 000 | 328 μs | 316 μs | 1.04× | 666 KB | 666 KB | 1.00× |
| filterAsync (all pass) | 10 000 | 55 842 μs | 31 813 μs | 1.75× | 62 780 KB | 33 714 KB | 1.86× |
| filterAsync (none pass) | 10 000 | 41 959 μs | 22 390 μs | 1.87× | 44 847 KB | 23 090 KB | 1.94× |
| chooseAsync (all selected) | 10 000 | 5 498 μs | 5 439 μs | 1.01× | 11 846 KB | 11 846 KB | 1.00× |
| foldAsync | 10 000 | 2 894 μs | 2 863 μs | 1.01× | 6 497 KB | 6 497 KB | 1.00× |

AsyncSeqPipelineBenchmarks (map → filter → fold composed)

| Method | N | Before (mean) | After (mean) | Speedup | Before (alloc) | After (alloc) | Alloc reduction |
|---|---|---|---|---|---|---|---|
| MapFilterFold | 1 000 | 12 983 μs | 3 584 μs | 3.62× | 16.42 MB | 3.93 MB | 4.18× |
| MapFilterFold | 10 000 | 131 351 μs | 35 549 μs | 3.69× | 164.1 MB | 39.22 MB | 4.18× |
| ToArray (control) | 1 000 | 2 168 μs | 2 171 μs | 1.00× | 2.13 MB | 2.13 MB | |
| ToArray (control) | 10 000 | 21 405 μs | 21 274 μs | 1.00× | 21.33 MB | 21.34 MB | |

Key observations:

  • filterAsync is 1.7–1.9× faster and allocates 1.86–1.94× less; the asyncSeq { for } builder had meaningful overhead.
  • chooseAsync and foldAsync improve by at most 1.04× in isolation, with unchanged allocations, suggesting the bottleneck for those paths was already elsewhere (likely the inner async dispatch per element rather than the builder scaffolding).
  • The composed pipeline (map → filter → fold) shows the largest gain: 3.6–3.7× faster with 4.18× less allocation. The gains compound because the optimised filterAsync sits in the hot path.
  • ToArray (unchanged) is flat, confirming the timing environment was stable.
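The Speedup and Alloc reduction columns are plain before/after ratios. The short script below recomputes a few of them from the table values as a sanity check; the helper name `ratio` is introduced here for illustration only.

```fsharp
// Ratio of the "before" measurement to the "after" measurement.
let ratio (before: float) (after: float) = before / after

// (label, before, after) triples copied from the tables above.
let rows =
    [ ("filterAsync (all pass), N = 1 000, mean in μs", 5439.0, 3178.0)
      ("MapFilterFold, N = 1 000, mean in μs", 12983.0, 3584.0)
      ("MapFilterFold, N = 1 000, alloc in MB", 16.42, 3.93) ]

for (label, before, after) in rows do
    printfn "%s: %.2fx" label (ratio before after)
```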

Generated by Repo Assist for issue #276

@dsyme dsyme marked this pull request as ready for review March 14, 2026 00:19
@dsyme dsyme merged commit 7c774fe into main Mar 14, 2026
1 check passed