
Conversation

@ngxson (Collaborator) commented Dec 12, 2025

Some models, like deepseek-ocr (#17400), gemma3n (#17961) and LFM2-audio (#17694), add quite a lot of code to clip.cpp, making it hard to track changes.

This PR moves the cgraph builder of each vision/audio model into its own file, mirroring the recent refactoring in libllama by @pwilkin

Migration guide for pending PRs

  1. Create a new file under mtmd/models/your_model.cpp
  2. Implement the code inside the newly created file (simply copy your existing build_*() function from clip.cpp into it; a minimal sketch is shown after this list)
    NOTE: You won't be able to access clip_ctx directly from here. However, you can add custom fields to clip_graph and modify its constructor inside clip.cpp to copy the fields you need from clip_ctx
  3. Add the class definition to models.h
  4. Modify clip_image_build_graph() in clip.cpp to use the new cgraph
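
To make these steps concrete, here is a minimal sketch of what such a file could look like. Every name in it (clip_graph_your_model, embeddings, proj_w, proj_b, ctx0, gf) is a placeholder for illustration rather than the actual mtmd API; the real members are whatever fields you copy into clip_graph in step 2.

```cpp
// mtmd/models/your_model.cpp -- illustrative sketch only; all identifiers are
// placeholders, not the real mtmd/clip_graph API.
#include "models.h"

ggml_cgraph * clip_graph_your_model::build() {
    // start from the input tensor your old build_*() function produced in clip.cpp
    ggml_tensor * cur = embeddings;

    // example layer: a single projection followed by GELU, using stock ggml ops
    cur = ggml_mul_mat(ctx0, proj_w, cur);
    cur = ggml_add(ctx0, cur, proj_b);
    cur = ggml_gelu(ctx0, cur);

    // register the result so clip_image_build_graph() (step 4) can execute it
    ggml_build_forward_expand(gf, cur);
    return gf;
}
```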

@ngxson (Collaborator, Author) commented Dec 12, 2025

Test results:

[vision] OK:   ggml-org/SmolVLM-500M-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/SmolVLM2-2.2B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/SmolVLM2-500M-Video-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/gemma-3-4b-it-GGUF:Q4_K_M
[vision] OK:   THUDM/glm-edge-v-5b-gguf:Q4_K_M
[vision] OK:   second-state/Llava-v1.5-7B-GGUF:Q2_K
[vision] OK:   cjpais/llava-1.6-mistral-7b-gguf:Q3_K_M
[vision] OK:   ibm-research/granite-vision-3.2-2b-GGUF:Q4_K_M
[vision] OK:   second-state/MiniCPM-Llama3-V-2_5-GGUF:Q2_K
[vision] OK:   openbmb/MiniCPM-V-2_6-gguf:Q2_K
[vision] OK:   openbmb/MiniCPM-o-2_6-gguf:Q4_0
[vision] OK:   bartowski/Qwen2-VL-2B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2.5-VL-3B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/InternVL2_5-1B-GGUF:Q8_0
[vision] OK:   ggml-org/InternVL3-1B-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/Qwen2.5-Omni-3B-GGUF:Q4_K_M
[vision] OK:   ggml-org/LFM2-VL-450M-GGUF:Q8_0
[vision] OK:   ggml-org/granite-docling-258M-GGUF:Q8_0
[vision] OK:   ggml-org/LightOnOCR-1B-1025-GGUF:Q8_0
[audio]  OK:   ggml-org/ultravox-v0_5-llama-3_2-1b-GGUF:Q8_0
[audio]  OK:   ggml-org/Qwen2.5-Omni-3B-GGUF:Q4_K_M
[audio]  OK:   ggml-org/Voxtral-Mini-3B-2507-GGUF:Q4_K_M
[vision] OK:   ggml-org/pixtral-12b-GGUF:Q4_K_M
[vision] OK:   ggml-org/Mistral-Small-3.1-24B-Instruct-2503-GGUF
[vision] OK:   ggml-org/Qwen2-VL-2B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2-VL-7B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2.5-VL-3B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2.5-VL-7B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen3-VL-2B-Instruct-GGUF:Q8_0
[vision] OK:   ggml-org/InternVL3-8B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/InternVL3-14B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2.5-Omni-7B-GGUF:Q4_K_M
[audio]  OK:   ggml-org/ultravox-v0_5-llama-3_1-8b-GGUF:Q4_K_M
[audio]  OK:   ggml-org/Qwen2.5-Omni-7B-GGUF:Q4_K_M
[vision] OK:   ggml-org/Qwen2.5-VL-72B-Instruct-GGUF:Q4_K_M
[vision] OK:   ggml-org/Llama-4-Scout-17B-16E-Instruct-GGUF:IQ1_S

Member

The file naming convention is to use dashes instead of underscores for C++ source files

@ngxson merged commit e39a2ce into ggml-org:master Dec 12, 2025
68 of 69 checks passed
sfallah added a commit to sfallah/llama.cpp that referenced this pull request Dec 13, 2025