Collaborator

@CISC CISC commented Dec 14, 2025

Handle rope scaling in set_gguf_parameters to deduplicate code and to support the new rope_parameters (into which rope_theta has also moved), introduced in huggingface/transformers#39847

Obsoletes #18008

@github-actions github-actions bot added the python python script changes label Dec 14, 2025
@CISC CISC requested review from ggerganov and ngxson December 14, 2025 03:25
Member

@ggerganov ggerganov left a comment

Confirming that gpt-oss converts correctly. I will take a look at the rest of the changes, but probably won't be able to give much additional feedback on the Python code.

self.model_name = model_name
self.dir_model_card = dir_model # overridden in convert_lora_to_gguf.py

# Ensure "rope_theta" and "rope_type" is mirrored in rope_parameters
Collaborator

I think it's probably better to extract this into a new method called load_rope_params

Collaborator Author

I considered it, but it got a little awkward as a static method and didn't make much sense as a regular method.
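
For readers following this thread, here is a minimal sketch of what the mirroring described in the code comment could look like as a standalone function. It is not the merged code: the name mirror_rope_params is hypothetical, and the fallback from rope_parameters to the older rope_scaling key, as well as the legacy "type" spelling, are assumptions about how older configs are laid out.

```python
from typing import Any


def mirror_rope_params(hparams: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of hparams whose "rope_parameters" holds rope_theta and rope_type.

    Hypothetical helper for illustration only; key names follow the comment
    under review, the fallback order is an assumption.
    """
    out = dict(hparams)
    # Prefer the new "rope_parameters" dict, else fall back to "rope_scaling" (assumed).
    rope = dict(out.get("rope_parameters") or out.get("rope_scaling") or {})

    # Legacy configs keep rope_theta at the top level; copy it in if missing.
    if "rope_theta" not in rope and "rope_theta" in out:
        rope["rope_theta"] = out["rope_theta"]

    # Assumed legacy spelling: "type" instead of "rope_type".
    if "rope_type" not in rope and "type" in rope:
        rope["rope_type"] = rope["type"]

    out["rope_parameters"] = rope
    return out


# Two illustrative config shapes (values are made up for the example).
legacy = {"rope_theta": 10000.0, "rope_scaling": {"type": "yarn", "factor": 4.0}}
modern = {"rope_parameters": {"rope_type": "yarn", "rope_theta": 10000.0, "factor": 4.0}}
```

Written as a free function that takes and returns a plain dict, the helper needs no instance state, which is one way around the static-versus-regular method question raised above. The input dict is copied rather than modified in place.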

@CISC CISC merged commit 5c8a717 into master Dec 14, 2025
8 of 9 checks passed
@CISC CISC deleted the cisc/convert-rope-parameters branch December 14, 2025 15:04
4 participants