[FEATURE]: Track chat templates in modelCard for ML model supply chain integrity

## Describe the feature

Chat templates are executable programs, typically written in Jinja2, that are bundled with instruction-tuned and chat-finetuned models and translate structured input into the token sequences the model expects. They ship with the large majority of deployed LLMs and SLMs today, including diffusion-based language models such as LLaDA 2.0. They are usually distributed as template strings, either embedded in GGUF file metadata or stored in `tokenizer_config.json` within HuggingFace model repositories.

They run automatically before model inference, mapping conversational roles (user, assistant, system) into model-specific serialized formats using special tokens (e.g., Llama's `<|start_header_id|>`, Qwen's `<|im_start|>`, Mistral's `[INST]`).
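As a rough illustration of that mapping, the sketch below mimics in plain Python what a ChatML-style template does with a list of messages. It is a stand-in for a Jinja2 template, not a real one: `render_chatml` and its `add_generation_prompt` parameter are names chosen here for illustration, and real templates usually handle more cases (tools, default system prompts, BOS tokens).

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def render_chatml(messages: List[Dict[str, str]],
                  add_generation_prompt: bool = True) -> str:
    """Serialize role/content messages into a ChatML-style prompt string."""
    parts = []
    for message in messages:
        # Each turn is wrapped in start/end delimiters, with the role on its own line.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = render_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
])
print(prompt)
```

Because every user message passes through this step, any change to the template changes what the model actually sees, even when the weights and the calling code are untouched.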

Despite their critical role in the inference pipeline, chat templates are not currently tracked in CycloneDX model cards. This means a component that directly controls how all user input reaches a model has no representation in the BOM.

### Why this matters

Recent peer-reviewed research shows that a maliciously modified chat template is sufficient to backdoor a model at inference time. The attacker only needs to change the template file itself. Across 18 models spanning 7 families, triggered backdoors achieved roughly 80% success rates while staying dormant under normal use and evading HuggingFace's existing security scanners.

The work has been accepted as a workshop paper at ICLR 2026:

> Inference-Time Backdoors via Hidden Instructions in LLM Chat Templates
>
> https://arxiv.org/abs/2602.04653v2

The paper demonstrates two attack scenarios:

1. **Integrity degradation**: The model produces plausible but subtly wrong answers (e.g., incorrect dates for historical facts) while maintaining fluent output.
2. **Forbidden resource emission**: The model emits attacker-controlled URLs, either in plaintext, hidden in HTML comments, or Base64-encoded.

### Relationship to existing work

Issue #702 already identifies chat templates as a TODO item for the MLBOM 2.0 schema rework, noting that `inputs` and `outputs` "are really (chat) template parameters (which may vary by template as models can have multiple)." This proposal provides a concrete, research-backed design for that item and makes the security case for treating it as a first-class field rather than a property extension.

This proposal targets CycloneDX 2.0, aligning with the broader MLBOM schema improvements tracked in #702 and the modularization goals in #631.

## Possible solutions

Add a `chatTemplate` object to `modelCard.modelParameters` (or to the top-level `modelCard` if fields are reorganized per #702).

### Proposed schema structure

```json
"chatTemplate": {
  "type": "object",
  "description": "Chat template used to format structured input into token sequences.",
  "properties": {
    "format": {
      "type": "string",
      "description": "The template language or format.",
      "examples": ["jinja2", "chatml"]
    },
    "content": {
      "type": "string",
      "description": "The raw template content."
    },
    "hashes": {
      "type": "array",
      "description": "Cryptographic hashes of the template for integrity verification.",
      "items": { "$ref": "#/definitions/hash" }
    },
    "signature": {
      "$ref": "#/definitions/signature",
      "description": "Cryptographic signature from the model provider for provenance."
    },
    "specialTokens": {
      "type": "array",
      "description": "Special tokens the template uses for role demarcation.",
      "items": {
        "type": "object",
        "properties": {
          "role": { "type": "string" },
          "startDelimiter": { "type": "string" },
          "endDelimiter": { "type": "string" }
        }
      }
    }
  }
}
```
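To make the shape concrete, here is a minimal sketch of one instance of the proposed object, built in Python and printed as JSON. The template string is a shortened, hypothetical ChatML-style example, and hashing the UTF-8 bytes of `content` is an assumption: the proposal does not yet say what exactly the hash covers. The `alg`/`content` pair follows the existing CycloneDX `hash` object.

```python
import hashlib
import json

# Hypothetical ChatML-style template, shortened for illustration.
TEMPLATE = (
    "{% for message in messages %}"
    "<|im_start|>{{ message['role'] }}\n{{ message['content'] }}<|im_end|>\n"
    "{% endfor %}"
)

chat_template = {
    "format": "jinja2",
    "content": TEMPLATE,
    "hashes": [
        {
            "alg": "SHA-256",
            # Assumption: the hash is taken over the UTF-8 bytes of `content`.
            "content": hashlib.sha256(TEMPLATE.encode("utf-8")).hexdigest(),
        }
    ],
    "specialTokens": [
        {"role": role, "startDelimiter": "<|im_start|>", "endDelimiter": "<|im_end|>"}
        for role in ("system", "user", "assistant")
    ],
}

print(json.dumps({"chatTemplate": chat_template}, indent=2))
```

The optional `signature` field is omitted here because producing one requires a signing key and a canonicalization step that are outside the scope of this sketch.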

This reuses existing CycloneDX primitives (`hash`, `signature`) rather than introducing new types, which keeps the addition minimal while directly addressing the paper's core recommendations:

| Paper recommendation                  | CycloneDX mechanism              |
|---------------------------------------|----------------------------------|
| Treat templates as security-relevant  | First-class schema field         |
| Cryptographic signing for provenance  | Existing `signature` type        |
| Automated anomaly detection           | `hashes` enable integrity checks |
| Deployer-side auditing                | `content` field for inspection   |

Since models can support multiple chat templates (as noted in #702), this field could instead be defined as an array of such objects, one per template variant.
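A deployer-side integrity check against such an array could look roughly like the following. This is a sketch under stated assumptions, not a proposed reference implementation: `make_variant` and `find_matching_variant` are hypothetical helpers, the two template strings are made up, and the hash is again assumed to cover the UTF-8 bytes of `content`.

```python
import hashlib
from typing import Any, Dict, List, Optional


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hex digest of the UTF-8 encoding of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_variant(content: str) -> Dict[str, Any]:
    """Build a minimal chatTemplate entry (hypothetical helper)."""
    return {
        "format": "jinja2",
        "content": content,
        "hashes": [{"alg": "SHA-256", "content": sha256_hex(content)}],
    }


def find_matching_variant(variants: List[Dict[str, Any]],
                          deployed_template: str) -> Optional[int]:
    """Return the index of the BOM variant whose SHA-256 matches, else None."""
    digest = sha256_hex(deployed_template)
    for index, variant in enumerate(variants):
        for entry in variant.get("hashes", []):
            if entry.get("alg") == "SHA-256" and entry.get("content", "").lower() == digest:
                return index
    return None


# Made-up template strings standing in for a default and a tool-use variant.
default_template = "{% for m in messages %}{{ m['content'] }}{% endfor %}"
tool_template = "{% for m in messages %}{{ m['role'] }}: {{ m['content'] }}{% endfor %}"
bom_variants = [make_variant(default_template), make_variant(tool_template)]

# A template modified after the BOM was produced no longer matches any variant.
tampered = default_template + "{# modified #}"
print(find_matching_variant(bom_variants, default_template))
print(find_matching_variant(bom_variants, tampered))
```

A hash match only shows that the deployed template is the one recorded in the BOM; establishing that the recorded template came from the model provider is the job of the `signature` field.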

## Alternatives

- **Property extension only**: Use the existing `properties` name-value mechanism to store template metadata. This is possible today but provides no structure, no integrity guarantees, and no tooling interoperability.
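- **External reference**: Point to the template via `externalReferences`. This captures the location (and, optionally, hashes of the referenced resource) but not the template content, format, or special tokens, limiting offline auditing and template-level integrity verification.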

## Additional context

- ICLR 2026 workshop paper (accepted): https://arxiv.org/abs/2602.04653v2
- CycloneDX 2.0 MLBOM schema discussion: #702
- CycloneDX 2.0 tracker: #631
- HuggingFace chat template documentation: https://huggingface.co/docs/transformers/main/en/chat_templating
- The paper also shows that "defensive" chat templates can improve model robustness against jailbreaks by 12.5% without degrading benign performance, suggesting that template tracking has value for both security auditing and safety documentation.
