Conversation
🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18247

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 251 Pending as of commit 2e04f71 with merge base eb92cec.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
Pull request overview
Updates the LLaMA example LoRA linear implementation to call an nn.Linear module in forward() (instead of torch.nn.functional.linear), presumably to align with module-based patterns.
Changes:
- Replace `torch.nn.functional.linear(x, self.weight, self.bias)` with `self.linear(x)`.
- Introduce `self.linear = nn.Linear(...)` during `LoRALinear` initialization.
Comments suppressed due to low confidence (1)
examples/models/llama/lora.py:36
`linear` and `weight` are now undefined after switching to `self.linear`; `bias = linear.bias ...` and `register_parameter("weight", ...)` will raise at init time. Either derive `bias`/`weight` from `self.linear` (or remove the extra `register_parameter` calls entirely) so construction works.
```python
self.linear = nn.Linear(in_dim, out_dim, bias=use_bias)
bias = linear.bias if self.use_bias else None
self.register_parameter("weight", nn.Parameter(weight))
self.register_parameter(
    "bias", nn.Parameter(bias) if bias is not None else None
)
```
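One way to resolve the reviewer's concern (a minimal sketch, not the PR's actual fix; the class name here is illustrative) is to drop the extra `register_parameter` calls and expose `weight`/`bias` as properties that read off the wrapped module:

```python
import torch.nn as nn


class LoRALinearInit(nn.Module):
    """Sketch: derive weight/bias from the wrapped nn.Linear instead of
    re-registering parameters. Names are illustrative assumptions."""

    def __init__(self, in_dim: int, out_dim: int, use_bias: bool = False):
        super().__init__()
        self.use_bias = use_bias
        # The wrapped linear already owns its weight/bias parameters.
        self.linear = nn.Linear(in_dim, out_dim, bias=use_bias)

    @property
    def weight(self) -> nn.Parameter:
        # Backward-compatible access for callers that expect `.weight`.
        return self.linear.weight

    @property
    def bias(self):
        return self.linear.bias if self.use_bias else None
```

Construction then works without referencing the now-undefined local `linear`/`weight` names, and there is only one copy of each parameter.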
Pull request overview
Updates the Llama LoRA implementation to call an nn.Linear submodule directly (instead of torch.nn.functional.linear) and simplifies quantization filtering accordingly.
Changes:
- Refactors `LoRALinear` to wrap a real `nn.Linear` module and use it in `forward`.
- Adds partial state-dict backward compatibility by remapping the legacy `weight` key.
- Simplifies the 8da*w quantization `filter_fn` to only consider `nn.Linear` modules.
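The legacy-key remapping described above can be sketched as a pure function over the state dict (key names are assumptions based on the review summary, not the PR's exact code):

```python
def remap_legacy_keys(state_dict: dict, prefix: str = "") -> dict:
    """Sketch: map an old checkpoint's flat 'weight'/'bias' keys onto
    the new wrapped-module layout ('linear.weight'/'linear.bias')."""
    legacy_to_new = {
        prefix + "weight": prefix + "linear.weight",
        prefix + "bias": prefix + "linear.bias",
    }
    # Leave every other key (e.g. 'lora_a.weight') untouched.
    return {legacy_to_new.get(k, k): v for k, v in state_dict.items()}
```

Running an old checkpoint's keys through this before `load_state_dict` keeps pre-refactor checkpoints loadable.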
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| examples/models/llama/source_transformation/quantize.py | Simplifies the 8da4w/8da8w quantization filter to target nn.Linear modules only. |
| examples/models/llama/lora.py | Refactors LoRALinear to delegate to an internal nn.Linear and updates forward/state-dict behavior. |
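Because `LoRALinear` now contains a real `nn.Linear`, the simplified filter can be a plain type check. A hedged sketch, assuming the torchao-style `filter_fn(module, fqn)` signature:

```python
import torch.nn as nn


def linear_only_filter(module: nn.Module, fqn: str) -> bool:
    """Sketch of the simplified quantization filter: matching nn.Linear
    alone now covers both LoRA and non-LoRA models, since LoRALinear
    wraps an nn.Linear submodule. The (module, fqn) signature follows
    the torchao filter_fn convention."""
    return isinstance(module, nn.Linear)
```

The previous special-casing of LoRA modules becomes unnecessary: the quantizer reaches the base projection by recursing into submodules.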
Force-pushed 45fd1f3 to 1ebb7d7.
Pull request overview
Updates the LLaMA LoRA implementation to wrap a real nn.Linear submodule (instead of calling torch.nn.functional.linear directly), enabling TorchAO quantization tooling to recognize and quantize the base linear layer consistently across LoRA and non-LoRA models.
Changes:
- Refactors
LoRALinearto containself.linear: nn.Linear, addsweight/biasproperties for backward-compatible access, and remaps old checkpoint keys on load. - Simplifies TorchAO 8da*xw quantization filtering to target
nn.Linearmodules only (no special-casing LoRA modules). - Makes XNNPACK constant serialization keys content-hash-based (removes tensor-name prefix) and updates the LoRA CI expectation string accordingly.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| examples/models/llama/source_transformation/quantize.py | Simplifies TorchAO quantization filter to match nn.Linear modules and apply group-size compatibility logic. |
| examples/models/llama/lora.py | Refactors LoRALinear to wrap an nn.Linear submodule; adds BC weight/bias accessors and state-dict key remapping. |
| backends/xnnpack/operators/node_visitor.py | Changes named constant key generation to be solely SHA256-based to stabilize dedup/indexing behavior. |
| .ci/scripts/test_lora.sh | Updates expected output prefix text for the quantized LoRA test case. |
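The content-hash key change described for `node_visitor.py` can be illustrated as follows; the function name and surrounding usage are hypothetical, only the SHA256-of-bytes idea is taken from the review summary:

```python
import hashlib


def constant_key(data: bytes) -> str:
    """Sketch of a purely content-based key for serialized constants:
    hashing only the bytes (no tensor-name prefix) means identical
    constants deduplicate regardless of what they are named."""
    return hashlib.sha256(data).hexdigest()
```

With a name prefix in the key, the same constant reached under two names would serialize twice; keying on content alone stabilizes dedup and indexing, which is presumably why the LoRA CI expectation string changed.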
```diff
  self.use_bias = use_bias
  self.dropout = dropout

- linear = nn.Linear(in_dim, out_dim, bias=use_bias)
- weight = linear.weight
- bias = linear.bias if self.use_bias else None
- self.register_parameter("weight", nn.Parameter(weight))
- self.register_parameter(
-     "bias", nn.Parameter(bias) if bias is not None else None
- )

+ self.linear = nn.Linear(in_dim, out_dim, bias=use_bias)
+ self.dropout = nn.Dropout(p=dropout) if dropout > 0.0 else nn.Identity()
+ self.lora_a = nn.Linear(in_features=in_dim, out_features=rank, bias=False)
```
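Putting the new layout together, the forward pass presumably becomes a call into the wrapped module plus the low-rank update. A sketch under assumed attribute names: `lora_b` and the `alpha/rank` scaling follow the standard LoRA formulation and are not necessarily this PR's exact code:

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Sketch of the refactored module: a real nn.Linear submodule plus
    low-rank adapters, composed in the standard LoRA way."""

    def __init__(self, in_dim, out_dim, rank=8, alpha=16.0, dropout=0.0, use_bias=False):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=use_bias)
        self.dropout = nn.Dropout(p=dropout) if dropout > 0.0 else nn.Identity()
        self.lora_a = nn.Linear(in_features=in_dim, out_features=rank, bias=False)
        self.lora_b = nn.Linear(in_features=rank, out_features=out_dim, bias=False)
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base projection now goes through the nn.Linear submodule, so
        # quantization filters that match nn.Linear can find it.
        out = self.linear(x)
        lora_out = self.lora_b(self.lora_a(self.dropout(x)))
        return out + self.scaling * lora_out
```

Since `self.linear`, `self.lora_a`, and `self.lora_b` are all `nn.Linear` instances, a filter that matches `nn.Linear` quantizes them uniformly across LoRA and non-LoRA models.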
Summary
Update the LoRA definition to use `nn.Linear` instead of `torch.nn.functional.linear`.
executorch/examples/models/llama/static_attention.py, line 1154 in eb92cec
Test plan
CI