Fix nvfp4 convert_and_update_tensor shape check #2670
Signed-off-by: 乙划 <zht108229@antgroup.com>
for more information, see https://pre-commit.ci
Greptile Overview
Greptile Summary
This PR updates This fits into the PyTorch NVFP4 quantization flow by ensuring shape consistency checks and downstream buffer creation use a consistent logical shape when both rowwise and columnwise representations are present.
Confidence Score: 3/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Py as Python NVFP4Tensor
participant Q as NVFP4Quantizer::convert_and_update_tensor
participant R as rowwise_data buffer
participant C as columnwise_data buffer
Py->>Q: convert_and_update_tensor(tensor)
Q->>Py: read attrs (_rowwise_data/_columnwise_data/...)
alt columnwise_data exists
Q->>C: getTensorShape(columnwise_data)
C-->>Q: shape(fp4-col)
Q-->>Q: convert_shape_back_from_fp4(..., true)
alt rowwise_data also exists
Q->>R: getTensorShape(rowwise_data)
R-->>Q: shape(fp4-row)
Q-->>Q: convert_shape_back_from_fp4(..., false)
Q-->>Q: compressShapeTo2D(expected_shape)
Q-->>Q: NVTE_CHECK(col_shape == expected_shape_2d)
Q-->>Q: shape = expected_shape (ground truth)
end
else only rowwise_data exists
Q->>R: getTensorShape(rowwise_data)
R-->>Q: shape(fp4-row)
Q-->>Q: convert_shape_back_from_fp4(..., false)
end
Q-->>Py: allocate missing buffers using inferred shape
Q-->>Py: return updated TensorWrapper + py::object
1 file reviewed, 2 comments
Additional Comments (2)
@skydoorkai I think the comments from Greptile make sense (even though they are not very high priority). Could you fix those? Other than that LGTM.
Description
This fixes #2607.
For NVFP4, columnwise data always uses an enforced 2D shape. As a result, the original check fails whenever rowwise_data has a 3D shape.
The fix:
(1) expected_data is derived from rowwise_data's shape and enforced into a 2D shape before the comparison.
(2) rowwise_data's shape is used as the "ground truth" shape.
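The two points above can be illustrated with a minimal Python sketch. It is not the C++ code changed in this PR: the helper names `compress_shape_to_2d` and `infer_logical_shape` are hypothetical, and the sketch assumes both shapes have already been converted back from their packed FP4 layout to logical shapes.

```python
from math import prod


def compress_shape_to_2d(shape):
    """Flatten all leading dimensions into one and keep the last dimension."""
    shape = tuple(shape)
    if len(shape) == 0:
        return (1, 1)
    # prod(()) == 1, so a 1-D shape (n,) becomes (1, n).
    return (prod(shape[:-1]), shape[-1])


def infer_logical_shape(rowwise_shape, columnwise_shape_2d):
    """Pick the logical shape, checking consistency when both shapes exist.

    Hypothetical helper: rowwise_shape may be N-D, columnwise_shape_2d is
    assumed to be the enforced-2D logical shape of the columnwise data.
    """
    if rowwise_shape is None and columnwise_shape_2d is None:
        raise ValueError("at least one of the two shapes must be provided")
    if rowwise_shape is None:
        # Only columnwise data exists: its 2D shape is all we know.
        return tuple(columnwise_shape_2d)
    if columnwise_shape_2d is not None:
        # (1) Compare in 2D, since the columnwise shape is always 2D.
        expected_2d = compress_shape_to_2d(rowwise_shape)
        if tuple(columnwise_shape_2d) != expected_2d:
            raise ValueError(
                f"shape mismatch: columnwise {tuple(columnwise_shape_2d)} "
                f"vs expected {expected_2d}"
            )
    # (2) The rowwise shape is the ground truth; it may be 3D or higher.
    return tuple(rowwise_shape)
```

With a 3D rowwise shape such as `(4, 8, 16)` and a columnwise shape of `(32, 16)`, comparing the raw shapes directly would report a mismatch, whereas this sketch accepts the pair and returns `(4, 8, 16)`.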