
Arm backend: Add Ethos-U FVP tests for MLPerf Tiny models #18225

Merged
tirwu01 merged 1 commit into pytorch:main from tirwu01:mlperf-tiny-models
Mar 19, 2026
Conversation

@tirwu01
Collaborator

@tirwu01 tirwu01 commented Mar 17, 2026

Add model definitions and Arm backend tests for four MLPerf Tiny benchmark models: ResNet8, DS-CNN, Deep AutoEncoder, and MobileNetV1-0.25.

Model definitions are placed under examples/models/mlperf_tiny/. Each model has tests for tosa_FP, tosa_INT, u55_INT and u85_INT pipelines in backends/arm/test/models/.

Notable model adaptations for Arm delegation:

  • Deep AutoEncoder: Fuse Linear + BatchNorm1d pairs before export since the TOSA quantizer only annotates conv + batch_norm patterns.
  • DS-CNN: Replace AvgPool2d(24, 5) with AdaptiveAvgPool2d(1) to satisfy the Ethos-U55 stride <= 3 constraint; the DecomposeAdaptiveAvgPool2dPass decomposes it into stride-1 pools.
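
The two adaptations above can be illustrated with a dependency-free sketch. It is not the PR's code: it uses plain Python lists instead of PyTorch modules, the layer sizes and the 24x24 feature map are hypothetical, and it reads `AvgPool2d(24, 5)` as kernel size 24 with stride 5. The first half folds inference-mode batch-norm statistics into a linear layer's weight and bias, which is what fusing a Linear + BatchNorm1d pair amounts to. The second half checks that a pool whose window covers the whole feature map equals the global mean that `AdaptiveAvgPool2d(1)` computes.

```python
import math
import random


def linear(x, weight, bias):
    """y = W x + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def batch_norm_eval(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch norm using running statistics."""
    return [
        g * (v - m) / math.sqrt(s + eps) + b
        for v, g, b, m, s in zip(y, gamma, beta, mean, var)
    ]


def fuse_linear_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the batch-norm scale and shift into the linear weight and bias."""
    scale = [g / math.sqrt(s + eps) for g, s in zip(gamma, var)]
    fused_w = [[sc * w for w in row] for row, sc in zip(weight, scale)]
    fused_b = [(b - m) * sc + be for b, m, sc, be in zip(bias, mean, scale, beta)]
    return fused_w, fused_b


def avg_pool2d(fmap, kernel, stride):
    """Square average pooling without padding over a 2D list."""
    out_h = (len(fmap) - kernel) // stride + 1
    out_w = (len(fmap[0]) - kernel) // stride + 1
    area = kernel * kernel
    return [
        [
            sum(
                fmap[i * stride + di][j * stride + dj]
                for di in range(kernel)
                for dj in range(kernel)
            ) / area
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def global_avg_pool(fmap):
    """Mean over the whole map, i.e. what AdaptiveAvgPool2d(1) yields per channel."""
    return sum(map(sum, fmap)) / (len(fmap) * len(fmap[0]))


rng = random.Random(0)

# --- Linear + BatchNorm1d fusion (hypothetical 4 -> 3 layer) ---
in_features, out_features = 4, 3
W = [[rng.uniform(-1, 1) for _ in range(in_features)] for _ in range(out_features)]
b = [rng.uniform(-1, 1) for _ in range(out_features)]
gamma = [rng.uniform(0.5, 1.5) for _ in range(out_features)]
beta = [rng.uniform(-1, 1) for _ in range(out_features)]
running_mean = [rng.uniform(-1, 1) for _ in range(out_features)]
running_var = [rng.uniform(0.5, 2.0) for _ in range(out_features)]
x = [rng.uniform(-1, 1) for _ in range(in_features)]

reference = batch_norm_eval(linear(x, W, b), gamma, beta, running_mean, running_var)
fused_W, fused_b = fuse_linear_bn(W, b, gamma, beta, running_mean, running_var)
fused = linear(x, fused_W, fused_b)
fusion_error = max(abs(r - f) for r, f in zip(reference, fused))
print("max |bn(linear(x)) - fused(x)| =", fusion_error)

# --- Full-window average pool vs. global mean (hypothetical 24x24 map) ---
fmap = [[rng.random() for _ in range(24)] for _ in range(24)]
pooled = avg_pool2d(fmap, kernel=24, stride=5)
pool_error = abs(pooled[0][0] - global_avg_pool(fmap))
print("output size:", len(pooled), "x", len(pooled[0]), " difference:", pool_error)
```

Under the assumed shape the window spans the entire map, so the stride never takes effect and swapping in the adaptive pool leaves the output unchanged.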

Change-Id: I8dbf5e8a4b80996faab9f850c21740899f6b36fd


cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

@pytorch-bot

pytorch-bot bot commented Mar 17, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18225

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 3 Pending

As of commit 3056745 with merge base fb90480 (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 17, 2026
@tirwu01
Collaborator Author

tirwu01 commented Mar 17, 2026

@pytorchbot label ciflow/trunk

@tirwu01 tirwu01 added the partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm label Mar 17, 2026
@tirwu01 tirwu01 requested review from SaoirseARM and mansnils March 17, 2026 10:20
@zingo
Collaborator

zingo commented Mar 17, 2026

Hi @digantdesai, @rascani and @psiddh, this adds a few new mpu nice models to the example folder and needs a Meta review 🙏 🙂

@tirwu01 tirwu01 added the release notes: none Do not include this in the release notes label Mar 17, 2026
}


def test_mobilenet_v1_025_tosa_FP():
Contributor

Do these add any coverage beyond what we already have for MobileNet? If they aren't too different from the existing tests, let's add them only under examples. The rationale is CI job frequency.

Collaborator Author

@tirwu01 tirwu01 Mar 18, 2026

Hi, MobileNetV1-0.25 is a distinct model from MobileNetV2/V3: it is the specific architecture used in the MLPerf Tiny benchmark. These four models (ResNet8, DS-CNN, Deep AutoEncoder, MobileNetV1-0.25) are the standard MLPerf Tiny benchmark suite and are tested together as a set.
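
To make the "-0.25" suffix concrete, here is a small sketch that is not taken from the PR. It assumes the standard MobileNetV1 channel plan from the original MobileNet paper and a width multiplier of 0.25. The helper names are hypothetical, and the model definition under examples/models/mlperf_tiny/ may differ in detail.

```python
# Output channels of the first convolution followed by the 13 depthwise
# separable blocks of MobileNetV1 at width multiplier 1.0 (assumed plan).
BASE_CHANNELS = [32, 64, 128, 128, 256, 256, 512, 512, 512, 512, 512, 512, 1024, 1024]


def scale_channels(base, alpha):
    """Apply a width multiplier to every layer's channel count."""
    return [max(1, int(c * alpha)) for c in base]


def standard_conv_params(c_in, c_out, k=3):
    """Weights in a regular k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Weights in a k x k depthwise conv plus a 1 x 1 pointwise conv (bias ignored)."""
    return k * k * c_in + c_in * c_out


scaled = scale_channels(BASE_CHANNELS, 0.25)
print("channels at alpha=0.25:", scaled)

# Compare the two block types for the first scaled stage.
c_in, c_out = scaled[0], scaled[1]
print("standard 3x3 conv weights:      ", standard_conv_params(c_in, c_out))
print("depthwise separable conv weights:", depthwise_separable_params(c_in, c_out))
```

MobileNetV1 stacks these plain depthwise separable blocks, whereas MobileNetV2 builds on inverted residual blocks, so the two families exercise different operator patterns.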

}


def test_ds_cnn_tosa_FP():
Contributor

same here as MV1

}


def test_deep_autoencoder_tosa_FP():
Contributor

same here as MV1

}


def test_resnet8_tosa_FP():
Contributor

same here as MV1

psiddh added a commit that referenced this pull request Mar 18, 2026
… and torch patterns

Add 21 new test cases across 3 files that exercise the Cortex-M
quantizer and pass manager on small composite models. These mirror
the Arm backend's test_nn_modules/test_nn_functional/test_torch_functions
pattern (PR #18225) but target the Cortex-M pipeline.

Tests cover: ConvBnReLU, LinearReLU, ConvTranspose2d, AdaptiveAvgPool2d,
MaxPool2d, AvgPool2d, Softmax, Hardswish, Hardsigmoid, depthwise
separable conv, inverted residual blocks (MobileNet-style), and
multi-op functional compositions.

All tests use test_dialect() which runs quantize→export→to_edge→
run_passes→compare_outputs entirely on the host (no FVP needed).

Co-authored-by: Claude <noreply@anthropic.com>
Add model definitions and Arm backend tests for four MLPerf Tiny
benchmark models: ResNet8, DS-CNN, Deep AutoEncoder, and
MobileNetV1-0.25.

Model definitions are placed under examples/models/mlperf_tiny/.
Each model has tests for tosa_FP, tosa_INT, u55_INT and u85_INT
pipelines in backends/arm/test/models/.

Notable model adaptations for Arm delegation:
- Deep AutoEncoder: Fuse Linear + BatchNorm1d pairs before export
  since the TOSA quantizer only annotates conv + batch_norm patterns.
- DS-CNN: Replace AvgPool2d(24, 5) with AdaptiveAvgPool2d(1) to
  satisfy the Ethos-U55 stride <= 3 constraint; the
  DecomposeAdaptiveAvgPool2dPass decomposes it into stride-1 pools.

Change-Id: I8dbf5e8a4b80996faab9f850c21740899f6b36fd
Signed-off-by: Tirui Wu <tirui.wu@arm.com>
@tirwu01 tirwu01 force-pushed the mlperf-tiny-models branch from 03c86e3 to 3056745 Compare March 19, 2026 10:49
@tirwu01 tirwu01 requested a review from lucylq as a code owner March 19, 2026 10:49
@tirwu01 tirwu01 merged commit 1925873 into pytorch:main Mar 19, 2026
386 of 395 checks passed

Labels

ciflow/trunk
CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
partner: arm (For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm)
release notes: none (Do not include this in the release notes)

5 participants