common : refactor common_sampler + grammar logic changes #17937

ggerganov · 2025-12-11T13:19:42Z

Extracting some refactoring portions from #17004 to make the review easier:

Make the management of llama objects (samplers, contexts, model) simpler and safer
The common_init_result now also owns the sampler chains constructed during common_init_from_params()
The sampler chains of common_init_result are constructed before the model and the context. This is needed for #17004 (sampling : add support for backend sampling), which optionally passes the samplers during the construction of the context
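The ownership and construction order described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy_sampler`, `toy_model`, `toy_context`, `toy_init_result` and `toy_init` are hypothetical stand-ins and not the real llama.cpp types or the actual layout of common_init_result.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the llama objects (not the real llama.cpp types).
struct toy_sampler { int seq_id; };
struct toy_model   { std::string path; };
struct toy_context {
    const toy_model * model = nullptr;
    std::vector<const toy_sampler *> samplers; // optionally handed over at construction
};

// A single owner for everything created during initialization.
// Members are destroyed in reverse declaration order:
// context first, then model, then the sampler chains.
struct toy_init_result {
    std::vector<std::unique_ptr<toy_sampler>> samplers;
    std::unique_ptr<toy_model>                model;
    std::unique_ptr<toy_context>              context;
};

static toy_init_result toy_init(const std::string & model_path, int n_seq) {
    assert(n_seq >= 0);

    toy_init_result res;

    // 1. sampler chains are built first, one per sequence
    for (int i = 0; i < n_seq; ++i) {
        res.samplers.push_back(std::make_unique<toy_sampler>(toy_sampler{i}));
    }

    // 2. then the model
    res.model = std::make_unique<toy_model>(toy_model{model_path});

    // 3. the context comes last, so it can receive the already-built samplers
    res.context = std::make_unique<toy_context>();
    res.context->model = res.model.get();
    for (const auto & s : res.samplers) {
        res.context->samplers.push_back(s.get());
    }

    return res;
}
```

Because the result object owns all three kinds of objects through `std::unique_ptr`, callers in this sketch never free anything by hand, which is the kind of safety the change is aiming for.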

Another change related to the grammar logic (the explanation is in the referenced comment):

No longer maintain a separate sampler chain for the grammar
Merge the grammar into the main common_sampler chain
The grammar is now always applied first to the raw logits, before the rest of the samplers
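A minimal sketch of what "grammar first, then the rest of the chain" means, assuming a toy chain made of plain functions over a logits vector. The names (`toy_stage`, `toy_grammar_stage`, `toy_temp_stage`, `toy_chain_sample`) are hypothetical and the final greedy pick is a simplification; none of this is the common_sampler API.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

using toy_logits = std::vector<float>;
using toy_stage  = std::function<void(toy_logits &)>;

// Grammar stage: masks every token the (hypothetical) grammar predicate rejects.
static toy_stage toy_grammar_stage(std::function<bool(int)> allowed) {
    return [allowed](toy_logits & logits) {
        for (std::size_t i = 0; i < logits.size(); ++i) {
            if (!allowed((int) i)) {
                logits[i] = -std::numeric_limits<float>::infinity();
            }
        }
    };
}

// Temperature stage: scales the logits; masked entries stay at -inf.
static toy_stage toy_temp_stage(float temp) {
    assert(temp > 0.0f);
    return [temp](toy_logits & logits) {
        for (float & l : logits) {
            l /= temp;
        }
    };
}

// Runs all stages in order on a copy of the raw logits, then picks greedily.
static int toy_chain_sample(const std::vector<toy_stage> & chain, toy_logits logits) {
    for (const auto & stage : chain) {
        stage(logits);
    }
    std::size_t best = 0;
    for (std::size_t i = 1; i < logits.size(); ++i) {
        if (logits[i] > logits[best]) {
            best = i;
        }
    }
    return (int) best;
}
```

Placing the grammar stage at the front of the single chain means every later stage only ever sees grammar-valid candidates, so no second chain is needed.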

The main reason for this change is to make the integration of #17004 compatible with grammar usage and to simplify the logic for handling the grammar when it is present. The main concern is that this will likely hurt performance when grammar sampling is involved, since we no longer do the "rejection sampling" trick. I think it's better to put effort into optimizing the performance of the grammar in general, so that we don't need the trick at all.
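To make the performance concern concrete, here is a toy comparison of the two strategies. It assumes the "rejection sampling" trick means: sample without the grammar, check only the sampled token, and fall back to a full grammar pass if it is rejected. That reading, the greedy pick, and all names (`toy_counting_grammar`, `toy_sample_rejection`, `toy_sample_grammar_first`) are assumptions for illustration, not code from this PR.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical grammar wrapper that counts how many tokens it had to check.
struct toy_counting_grammar {
    std::function<bool(int)> allowed;
    int n_checks;

    explicit toy_counting_grammar(std::function<bool(int)> fn)
        : allowed(std::move(fn)), n_checks(0) {}

    bool check(int token) {
        ++n_checks;
        return allowed(token);
    }
};

// Greedy pick; ties resolve to the lowest index.
static int toy_argmax(const std::vector<float> & logits) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < logits.size(); ++i) {
        if (logits[i] > logits[best]) {
            best = i;
        }
    }
    return (int) best;
}

// Grammar first: every token in the vocabulary is checked on every step.
static int toy_sample_grammar_first(std::vector<float> logits, toy_counting_grammar & g) {
    for (std::size_t i = 0; i < logits.size(); ++i) {
        if (!g.check((int) i)) {
            logits[i] = -std::numeric_limits<float>::infinity();
        }
    }
    return toy_argmax(logits);
}

// Rejection trick: check only the sampled token, fall back to the full pass on rejection.
static int toy_sample_rejection(const std::vector<float> & logits, toy_counting_grammar & g) {
    const int candidate = toy_argmax(logits);
    if (g.check(candidate)) {
        return candidate; // fast path: a single grammar check
    }
    return toy_sample_grammar_first(logits, g);
}
```

In this sketch the rejection path costs one grammar check whenever the unconstrained pick is already valid, while the grammar-first path always costs one check per vocabulary entry. That gap is the motivation for optimizing the grammar itself.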

examples/batched/batched.cpp

common : refactor common_sampler + grammar logic changes

7ee3c35

ggerganov requested review from JohannesGaessler and ngxson as code owners December 11, 2025 13:19

ggerganov mentioned this pull request Dec 11, 2025

server: handle limiting maximum reasoning budget #17750

Open

loci-dev mentioned this pull request Dec 11, 2025

UPSTREAM PR #17937: common : refactor common_sampler + grammar logic changes auroralabs-loci/llama.cpp#523

Open

github-actions bot added examples server labels Dec 11, 2025

tests : increase max_tokens to get needed response

68c5654

github-actions bot added the python python script changes label Dec 11, 2025

ggerganov requested a review from danbev December 12, 2025 12:32

danbev reviewed Dec 12, 2025

examples/batched/batched.cpp (outdated, resolved)

danbev reviewed Dec 12, 2025

examples/batched/batched.cpp (outdated, resolved)

examples/batched/batched.cpp (resolved)

batched : fix uninitialized samplers

2fa9874

danbev approved these changes Dec 12, 2025

ggerganov merged commit 254098a into master Dec 14, 2025
75 of 78 checks passed

ggerganov deleted the gg/common-refactor branch December 14, 2025 08:11

3 participants
