
Add LLM benchmarking framework to staging #2405

Open

kubraaksux wants to merge 1 commit into apache:main from kubraaksux:llm-benchmark

Conversation

@kubraaksux

Generic LLM benchmark suite for evaluating inference performance across different backends (vLLM, Ollama, OpenAI, MLX).

Features:

  • Multiple workload categories: math (GSM8K), reasoning (BoolQ, LogiQA), summarization (XSum, CNN/DM), JSON extraction
  • Pluggable backend architecture for different inference engines (see the interface sketch after this list)
  • Performance metrics: latency, throughput, memory usage
  • Accuracy evaluation per workload type
  • HTML report generation
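
The pluggable backend idea could look roughly like the following. This is a minimal sketch under assumed names (`InferenceBackend`, `GenerationResult`, and `OllamaBackend` are illustrative, not the classes actually in this PR); the Ollama adapter targets Ollama's standard `/api/generate` HTTP endpoint.

```python
# Minimal sketch of a pluggable backend interface. All class and method
# names here are hypothetical illustrations, not the PR's actual code.
import abc
import time
from dataclasses import dataclass


@dataclass
class GenerationResult:
    text: str          # model output
    latency_s: float   # wall-clock time for the request
    tokens: int        # completion tokens, used for throughput


class InferenceBackend(abc.ABC):
    """Common interface each engine adapter (vLLM, Ollama, OpenAI, MLX) implements."""

    @abc.abstractmethod
    def generate(self, prompt: str, max_tokens: int = 256) -> GenerationResult:
        ...


class OllamaBackend(InferenceBackend):
    """Example adapter: calls a local Ollama server over its HTTP API."""

    def __init__(self, model: str = "llama3", host: str = "http://localhost:11434"):
        self.model = model
        self.host = host

    def generate(self, prompt: str, max_tokens: int = 256) -> GenerationResult:
        import requests  # assumed dependency of the benchmark suite
        start = time.perf_counter()
        resp = requests.post(
            f"{self.host}/api/generate",
            json={
                "model": self.model,
                "prompt": prompt,
                "options": {"num_predict": max_tokens},
                "stream": False,
            },
            timeout=120,
        )
        resp.raise_for_status()
        body = resp.json()
        return GenerationResult(
            text=body.get("response", ""),
            latency_s=time.perf_counter() - start,
            tokens=body.get("eval_count", 0),
        )
```

Adding a new engine then only means writing another `InferenceBackend` subclass, which is what makes the architecture pluggable.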

This framework can be used to evaluate SystemDS LLM inference components once they are developed.
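
To illustrate how such a harness might drive one backend over a workload and aggregate the latency, throughput, and accuracy metrics listed above, here is a rough sketch building on the `InferenceBackend` interface from the previous block (`run_workload` and the substring accuracy check are hypothetical):

```python
# Hypothetical driver loop; names and the simple substring accuracy
# check are illustrative, not the PR's actual evaluation logic.
def run_workload(backend: InferenceBackend, samples: list[dict]) -> dict:
    """Run (prompt, answer) pairs and aggregate latency, throughput, accuracy."""
    results, correct = [], 0
    for sample in samples:
        out = backend.generate(sample["prompt"])
        results.append(out)
        # Crude exact-match check, e.g. for GSM8K-style numeric answers.
        if sample["answer"] in out.text:
            correct += 1
    total_time = sum(r.latency_s for r in results)
    total_tokens = sum(r.tokens for r in results)
    return {
        "mean_latency_s": total_time / len(results),
        "throughput_tok_per_s": total_tokens / total_time if total_time else 0.0,
        "accuracy": correct / len(samples),
    }
```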

