Name and Version
version: 7157 (583cb83)
built with clang version 19.1.5 for x86_64-pc-windows-msvc
Operating systems
Windows
GGML backends
CUDA
Hardware
RTX 5060 Ti + 9950x + 96 GB RAM
Models
gpt-oss-120b (launched via the --gpt-oss-120b-default preset; see below for extra parameters that might affect it)
Problem description & steps to reproduce
I am running llama-server with the following flags (assembled into a single launch sketch below the list):
--gpt-oss-120b-default
--ctx-size 0
--kv-unified
--jinja
--chat-template-kwargs {\"reasoning_effort\":\"high\"}
-ub 2048 -b 2048
--cpu-moe --n-gpu-layers 999
--prio -1
--parallel 8
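For convenience, here is the same invocation assembled into a single launch call. This is just a sketch, assuming llama-server is on PATH; the flags are exactly the ones listed above, passed without shell escaping:

```python
import subprocess

# Launch sketch: same flags as listed above, as an argv list so no
# shell quoting is needed. Blocks while the server runs.
cmd = [
    "llama-server",
    "--gpt-oss-120b-default",
    "--ctx-size", "0",
    "--kv-unified",
    "--jinja",
    "--chat-template-kwargs", '{"reasoning_effort":"high"}',
    "-ub", "2048", "-b", "2048",
    "--cpu-moe", "--n-gpu-layers", "999",
    "--prio", "-1",
    "--parallel", "8",
]
subprocess.run(cmd, check=True)
```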
On top of this I run two parallel agentic workflows: one via llama.vscode, the other via codex.
Both contexts are far from 64K (each somewhere in the 20K range). Yet llama-server falls back to full prompt re-processing "due to lack of cache data".
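To reproduce without the real agents, something like the following stand-in should exercise the same pattern. This is my own sketch, assuming the server's default 127.0.0.1:8080 address and its OpenAI-compatible /v1/chat/completions endpoint; the agent names are just labels:

```python
import json
import threading
import urllib.request

BASE = "http://127.0.0.1:8080/v1/chat/completions"  # llama-server default port

def agent(name: str, rounds: int = 3) -> None:
    # Grow the conversation each round so the server should extend the
    # cached prefix instead of recomputing it, mirroring an agentic loop.
    messages = [{"role": "user",
                 "content": f"[{name}] step 1: describe llama.cpp KV caching."}]
    for step in range(2, rounds + 2):
        body = json.dumps({"messages": messages, "max_tokens": 256}).encode()
        req = urllib.request.Request(
            BASE, data=body, headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            reply = json.load(resp)["choices"][0]["message"]["content"]
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "user", "content": f"[{name}] step {step}: continue."})

# Two concurrent "workflows", like llama.vscode + codex in my setup.
threads = [threading.Thread(target=agent, args=(name,))
           for name in ("vscode", "codex")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```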
First Bad Commit
No response
Relevant log output
srv params_from_: Chat format: GPT-OSS
slot get_availabl: id 7 | task -1 | selected slot by LCP similarity, sim_best = 0.901 (> 0.100 thold), f_keep = 0.823
slot launch_slot_: id 7 | task -1 | sampler chain: logits -> logit-bias -> penalties -> dry -> top-n-sigma -> top-k -> typical -> top-p -> min-p -> xtc -> temp-ext -> dist
slot launch_slot_: id 7 | task 60314 | processing task
slot update_slots: id 7 | task 60314 | new prompt, n_ctx_slot = 131072, n_keep = 0, task.n_tokens = 27291
slot update_slots: id 7 | task 60314 | n_past = 24582, slot.prompt.tokens.size() = 29882, seq_id = 7, pos_min = 29755, n_swa = 128
state_read_meta: failed to find available cells in kv cache
state_seq_set_data: error loading state: failed to restore kv cache
slot update_slots: id 7 | task 60314 | failed to restore context checkpoint (pos_min = 21700, pos_max = 24516, size = 99.068 MiB)
slot update_slots: id 7 | task 60314 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
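For what it's worth, my reading of the numbers in that log, stated as an assumption based on the SWA discussion in #13194 rather than on the actual server code:

```python
# My interpretation of the log values above; the real check lives in
# llama.cpp's server code, so treat this purely as an assumption.
n_past  = 24582   # where the new prompt diverges from the cached tokens
pos_min = 29755   # oldest position still held in the SWA KV cache
n_swa   = 128     # sliding window size reported for this model

# With SWA, anything older than (pos_min - n_swa) has been evicted, so
# the cache can only resume at or after this position:
oldest_resumable = pos_min - n_swa   # 29627

if n_past < oldest_resumable:
    # The divergence point falls inside the evicted region, so the server
    # needs a context checkpoint (here: pos 21700..24516). That restore
    # failed ("failed to find available cells in kv cache"), which is
    # what forced the full re-processing.
    print("cache cannot serve the common prefix; checkpoint required")
```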