Hi,
We are using the built-in evals to measure LLM performance. Because of the inherent stochasticity of the process, we would like to evaluate the same dataset multiple times. This would allow one to gauge more robustly how, e.g., a change of prompt shifts the distribution of scores.
At the moment the workaround would be to manually collect the EvaluationReports from several runs, group the individual ReportCases, and aggregate them into a new report (roughly as in the sketch below). While doable, this feels somewhat cumbersome, and at the same time like something others would benefit from as well!
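For reference, here is a minimal sketch of what the workaround looks like on our side. It assumes the pydantic_evals layout where `Dataset.evaluate_sync()` returns an `EvaluationReport` exposing its `ReportCase`s via `.cases`, and that an `EvaluationReport` can be constructed directly from a name plus a list of cases; the dataset, task and `N_RUNS` below are placeholders:

```python
from pydantic_evals import Case, Dataset
from pydantic_evals.reporting import EvaluationReport

# Placeholder dataset; in practice this is our real eval dataset.
dataset = Dataset(
    cases=[Case(name="example", inputs="What is 2 + 2?", expected_output="4")],
)

async def answer(question: str) -> str:
    # Stand-in for the LLM call under evaluation.
    return "4"

N_RUNS = 5

# Evaluate the same dataset several times to sample the outcome distribution.
reports = [dataset.evaluate_sync(answer) for _ in range(N_RUNS)]

# Group the individual ReportCases from all runs and aggregate them into a
# single combined report (assumes EvaluationReport accepts name + cases).
combined = EvaluationReport(
    name="combined-runs",
    cases=[case for report in reports for case in report.cases],
)
combined.print()
```

A built-in way to run a dataset N times and get back a combined (or per-run) report would remove the need for this glue code.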
References
No response