
Showcase: automated template compatibility testing + AI debugging workflow alignment #280

Draft
kalanyuz wants to merge 162 commits into cre-reliability-load-test from experimental/agent-skills

kalanyuz commented Feb 23, 2026

PR: Showcase automated template compatibility testing and AI debugging workflow alignment

Summary

Brief overview of what this PR implements:

  1. Template Compatibility Guardrail: Adds a unified compatibility suite for template IDs 1..5 (including IDs 3 and 5) with a drift canary.
  2. Process and Operational Hardening: Adds CI path-gated enforcement, documents embedded-vs-dynamic source modes, and codifies repeatable QA/template/TUI agent workflows.
flowchart LR
    A[Template or creinit change] --> B[Path Filter Job]
    B --> C[TestTemplateCompatibility]
    C --> D[PR Signal]
    D --> E[QA + Skills + Docs Alignment]

Changes

1. Template Compatibility Automation

File: test/template_compatibility_test.go

What changed:

  • Added TestTemplateCompatibility table-driven suite for template IDs 1..5.
  • Added explicit cases for TS_HelloWorld_Template3 and TS_ConfHTTP_Template5.
  • Added drift canary TestTemplateCompatibility_AllTemplatesCovered.

Result: Template init/build/simulate regressions are caught deterministically and coverage drift is flagged early.
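The drift canary idea above can be sketched in a few lines of Go. This is a minimal illustration, not the code in test/template_compatibility_test.go: the `compatCase` type, the `registeredIDs` slice, and the `uncoveredIDs` helper are hypothetical names, and the placeholder case names for IDs 1, 2, and 4 are invented for the example. Only TS_HelloWorld_Template3 and TS_ConfHTTP_Template5 come from the PR description.

```go
package main

import (
	"fmt"
	"sort"
)

// compatCase is a hypothetical row in the compatibility test table.
type compatCase struct {
	TemplateID int
	Name       string
}

// registeredIDs stands in for whatever the real template registry exposes (assumption).
var registeredIDs = []int{1, 2, 3, 4, 5}

// compatTable stands in for the table that drives the compatibility suite.
// Names for IDs 1, 2, and 4 are placeholders.
var compatTable = []compatCase{
	{TemplateID: 1, Name: "placeholder-1"},
	{TemplateID: 2, Name: "placeholder-2"},
	{TemplateID: 3, Name: "TS_HelloWorld_Template3"},
	{TemplateID: 4, Name: "placeholder-4"},
	{TemplateID: 5, Name: "TS_ConfHTTP_Template5"},
}

// uncoveredIDs returns every registry ID that has no row in the table,
// sorted ascending. A canary test would fail when the result is non-empty.
func uncoveredIDs(registry []int, table []compatCase) []int {
	covered := make(map[int]bool, len(table))
	for _, c := range table {
		covered[c.TemplateID] = true
	}
	var missing []int
	for _, id := range registry {
		if !covered[id] {
			missing = append(missing, id)
		}
	}
	sort.Ints(missing)
	return missing
}

func main() {
	// All five registered IDs have a table row, so nothing is reported.
	fmt.Println(uncoveredIDs(registeredIDs, compatTable)) // prints []

	// Simulate a new template (ID 6) added to the registry without a test row.
	fmt.Println(uncoveredIDs(append(registeredIDs, 6), compatTable)) // prints [6]
}
```

In a real suite the same check would presumably live in a `testing.T` function that calls `t.Fatalf` with the missing IDs, so adding a template without a table row fails the build rather than passing silently.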

2. CI Enforcement Path

File: .github/workflows/pull-request-main.yml

What changed:

  • Added template-compat-path-filter.
  • Added ci-test-template-compat on Linux + Windows.
  • Runs go test -v -timeout 20m -run TestTemplateCompatibility ./test/ when template-impacting paths change.

Result: Compatibility checks become a targeted PR guardrail instead of ad-hoc local validation.
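The path gate is implemented in the workflow as a bash `grep -E` over `git diff --name-only`, as quoted in the review comment further down this page. The decision it makes reduces to an anchored prefix match, which the Go sketch below restates for illustration only. The function name `shouldRunTemplateCompat` and the variable `templateImpactingPrefixes` are hypothetical; the prefixes and the `merge_group` shortcut are taken from the quoted workflow step.

```go
package main

import (
	"fmt"
	"strings"
)

// templateImpactingPrefixes mirrors the prefixes in the workflow's grep pattern.
// The pattern also lists cmd/creinit/template/, which cmd/creinit/ already covers.
var templateImpactingPrefixes = []string{"cmd/creinit/", "test/"}

// shouldRunTemplateCompat reports whether the compatibility job should run.
// Merge-queue events always run it; otherwise any changed file under a
// template-impacting prefix triggers it.
func shouldRunTemplateCompat(eventName string, changedFiles []string) bool {
	if eventName == "merge_group" {
		return true
	}
	for _, file := range changedFiles {
		for _, prefix := range templateImpactingPrefixes {
			// HasPrefix matches from the start of the path, like grep's ^ anchor.
			if strings.HasPrefix(file, prefix) {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(shouldRunTemplateCompat("pull_request", []string{"README.md"}))
	fmt.Println(shouldRunTemplateCompat("pull_request", []string{"cmd/creinit/creinit.go"}))
	fmt.Println(shouldRunTemplateCompat("merge_group", nil))
}
```

Because the match is anchored at the start of the path, a file such as docs/test/notes.md does not trigger the job even though it contains "test/".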

3. Template Repository Operational Setup

Files:

  • submodules.yaml
  • scripts/setup-submodules.sh
  • Makefile
  • .gitignore

What changed:

  • Added clone/update/clean workflow for external cre-templates reference checkout.
  • Added make targets and managed gitignore behavior.

Result: External template workspace setup is reproducible without using Git submodules.

4. Agent and Framework Alignment

Files:

  • AGENTS.md
  • .claude/skills/cre-add-template/**
  • .claude/skills/cre-qa-runner/**
  • .claude/skills/cre-cli-tui-testing/**
  • .claude/skills/using-cre-cli/**
  • testing-framework/*.md
  • .claude/skills/skill-audit-report.md

What changed:

  • Documented the current embedded template source and the upcoming, branch-gated dynamic pull mode.
  • Updated skills with clearer invocation boundaries and guidance for handling dynamic mode.
  • Aligned testing-framework docs with the dual-source model and an advisory-first rollout of dynamic validation.

Result: Consistent operator + agent guidance across docs, skills, and CI expectations.

Impact

Modified (selected files)

  • test/template_compatibility_test.go - New unified compatibility suite + canary.
  • .github/workflows/pull-request-main.yml - Path filter + compatibility job.
  • AGENTS.md - Source-mode architecture and maintenance workflow.
  • submodules.yaml - External template repo relationship config.
  • scripts/setup-submodules.sh - Deterministic clone/update/clean script.
  • .claude/skills/cre-add-template/SKILL.md - Template workflow and checklist boundaries.
  • .claude/skills/cre-qa-runner/SKILL.md - QA runbook/reporting + source provenance guidance.
  • testing-framework/README.md - Embedded baseline + branch-gated dynamic framing.

Affected (no runtime behavior change)

  • ✅ The cmd/creinit/* runtime flow stays on the embedded-template baseline.
  • ✅ Dynamic-template behavior is documented as branch-gated, not as the active default.

Testing

Compatibility Suite

  • go test -v -timeout 20m -run TestTemplateCompatibility ./test/
  • ✅ Covers IDs 1, 2, 3, 4, and 5, plus the canary coverage guard.

Skill/Docs Consistency Checks

  • ✅ Skill description overlap scan (rg "^description:" .claude/skills/*/SKILL.md).
  • ✅ Cross-doc alignment pass across AGENTS.md, testing-framework/*.md, and updated skill docs.

Deployment Notes

Pre-Deployment Requirements

  • No production deployment steps required.
  • CI workflow changes need only the standard Actions permissions already present in the repo workflow.

Deployment Order

  1. Merge PR.
  2. Let the PR workflow enforce the template-compat job on relevant path changes.
  3. Use updated skills/runbooks for subsequent template additions and pre-release QA.

Backward Compatibility

  • Backward compatible.
  • No CLI API/runtime behavior changes introduced by this PR.

Verification Checklist

  • Unified compatibility test covers template IDs 1..5.
  • Canary guard exists for registry/test-table drift.
  • CI path filter + compatibility job added.
  • Skills updated for invocation clarity and dynamic-mode branch gating.
  • Testing framework docs aligned to embedded-now + dynamic-next model.

Rollback Plan

If issues occur:

  1. Revert workflow additions in .github/workflows/pull-request-main.yml.
  2. Revert compatibility test file test/template_compatibility_test.go.
  3. Revert docs/skills updates in AGENTS.md, .claude/skills/**, and testing-framework/*.md.

silaslenihan and others added 30 commits October 27, 2025 20:09
…alues (#124)

* Limit LogTrigger filtering / event encoding to indexed fields

* Added sanitation for empty topic values

* added missing nil reflect import for empty bindings
* feat: add preview-build pipeline

* Potential fix for code scanning alert no. 7: Workflow does not contain permissions

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* update

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* fix compile error

* regenerate bindings
* feat: add update command

* feat: add update command

* fix: use commandPath for excluded commands

* chore: gendoc

* chore: address comments

* chore: address comments

* chore: address comments

* chore: update maxExtractSize to 500MB for windows

* improve message on windows

* delay tmpDir cleanup
* update init cmd next step. Added RPC prompt for por template.

* updated workflow settings template so we have different value per environment for config and workflow name

* lint fix

* Added rpc-url flag for init cmd

* fix init test

* chore: fix tests

* fix cre init that would validate rpc-url flag

* fix cre init that would validate rpc-url flag

* fix cre_init test

* update docs

* remove comment

---------

Co-authored-by: anirudhwarrier <12178754+anirudhwarrier@users.noreply.github.com>
Co-authored-by: Akshay Aggarwal <aks.agg94@gmail.com>
* logic to check latest release from repo

* Updated update check script + logger priority

* added sync.WaitGroup so that the goroutines for version check and telemetry don't get killed by process termination (may happen if the executed command is very fast, such as cre version)

* cleaned up comments

* updated version check timeout to 6 secs

* gendoc

* fixed linter errors

* fixed comments

* Fix update display for sub-root commands (e.g. workflow, account, secrets). Make sure the update warning is always displayed last

* go mod tidy

* gendocs

* moved update package to internal/update

* fixed order of sequence for update check message display

* fix telemetry not being called

* Revert waitgroup to time.sleep

* fix emitter event function

* revert telemetry change

---------

Co-authored-by: Akshay Aggarwal <akshay.aggarwal@smartcontract.com>
Co-authored-by: anirudhwarrier <12178754+anirudhwarrier@users.noreply.github.com>
* Limit LogTrigger filtering / event encoding to indexed fields

* Added sanitation for empty topic values

* added missing nil reflect import for empty bindings

* Reworked log trigger encoding in bindings

* bumped sdk version and regenerated example bindings
* verify ownership before execution

* lint

* fix lint issues

---------

Co-authored-by: anirudhwarrier <12178754+anirudhwarrier@users.noreply.github.com>
increase timeout on http requests
* add update cmd in the exclusion for update check

* added version check for cre update to skip if already on the latest
* fix context cancel and improve user event detail

* mod tidy

* fix tests
* reorder and enrich

* lint

* cleanup
* reorder and enrich

* lint

* cleanup

* Add flags and decrease sleep
fix: error on refresh token
Reduce sleep duration before exiting the command.
* Update TS templates to use the latest SDK

* Point to beta release
* Validate credentials

* fix tests
* Add workflowID to user event

* clean up logs
Update default ts sdk version to beta.1
* add workflow language to runtimecontext

* lint

* review feedback
poopoothegorilla and others added 24 commits February 6, 2026 14:00
* Truncate tag to allow longer workflow names

* make lint
Suggest cre-sdk 1.0.9 as default in typescript templates
* Bump chainlink/v2 -- validate confidential HTTP in simulator

* Fix templates and tests

* fix ts template

* fix
…257)

* Add secrets to cre cli e2e tests, test conf http in simulator

* remove comment

* bump cl core to develop branch
Bump confidentialhttp SDK and move EncryptOutput to HTTPRequest

cre-sdk-go PR #101 moved EncryptOutput from ConfidentialHTTPRequest to
HTTPRequest. Update the por_workflow test to match the new API.
Added worldchain mainnet mock forwarder
* Initial change: convert conf http to DON runtime

* update go mods to latest sha

* bump go mods

---------

Co-authored-by: Prashant Yadav <prashant.yadav@smartcontract.com>
…rapper functions with functional options pattern:

    - ui.Confirm(title, ...ConfirmOption): yes/no prompt, with WithLabels() and WithDescription()
    - ui.Input(title, ...InputOption): single text input, with WithInputDescription() and WithPlaceholder()
    - ui.Select[T](title, options): generic selection prompt using SelectOption[T]
    - ui.InputForm(fields): multi-field form using InputField structs (with validation, suggestions, etc.)
  - internal/ui/prompts_test.go: unit tests for option functions and struct construction
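For readers unfamiliar with the functional options pattern named in this commit message, here is a minimal Go sketch of how a call like ui.Confirm(title, ...ConfirmOption) is typically wired. It is an assumption-laden illustration, not the code in internal/ui: the `confirmConfig` struct, its fields, the `newConfirmConfig` helper, and the parameter lists of `WithLabels` and `WithDescription` are all guesses. Only the option names themselves appear in the commit message.

```go
package main

import "fmt"

// confirmConfig holds the settings for a yes/no prompt.
// The struct and its fields are hypothetical.
type confirmConfig struct {
	title       string
	description string
	yesLabel    string
	noLabel     string
}

// ConfirmOption mutates a confirmConfig; each With* function returns one.
type ConfirmOption func(*confirmConfig)

// WithLabels overrides the default button labels (assumed signature).
func WithLabels(yes, no string) ConfirmOption {
	return func(c *confirmConfig) {
		c.yesLabel = yes
		c.noLabel = no
	}
}

// WithDescription adds explanatory text under the title (assumed signature).
func WithDescription(text string) ConfirmOption {
	return func(c *confirmConfig) {
		c.description = text
	}
}

// newConfirmConfig applies defaults first, then each option in order,
// so later options win over earlier ones.
func newConfirmConfig(title string, opts ...ConfirmOption) confirmConfig {
	cfg := confirmConfig{title: title, yesLabel: "Yes", noLabel: "No"}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := newConfirmConfig(
		"Overwrite existing project?",
		WithLabels("Overwrite", "Cancel"),
		WithDescription("Files in the target directory will be replaced."),
	)
	fmt.Printf("%s [%s/%s]\n%s\n", cfg.title, cfg.yesLabel, cfg.noLabel, cfg.description)
}
```

The appeal of the pattern is that callers who want the defaults pass only the title, and new options can be added later without changing existing call sites.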
Comment on lines 17 to 45
    runs-on: ubuntu-latest
    outputs:
      run-template-compat: ${{ steps.filter.outputs.run_template_compat }}
    steps:
      - name: Checkout the repo
        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 #4.1.7
        with:
          fetch-depth: 0

      - name: Detect template-impacting changes
        id: filter
        shell: bash
        run: |
          if [[ "${{ github.event_name }}" == "merge_group" ]]; then
            echo "run_template_compat=true" >> "$GITHUB_OUTPUT"
            exit 0
          fi

          base_sha="${{ github.event.pull_request.base.sha }}"
          head_sha="${{ github.event.pull_request.head.sha }}"
          changed_files="$(git diff --name-only "${base_sha}" "${head_sha}")"

          if echo "${changed_files}" | grep -E '^(cmd/creinit/|cmd/creinit/template/|test/)' >/dev/null; then
            echo "run_template_compat=true" >> "$GITHUB_OUTPUT"
          else
            echo "run_template_compat=false" >> "$GITHUB_OUTPUT"
          fi

  ci-test-template-compat:

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 3 days ago

To fix the problem, add an explicit permissions block to the template-compat-path-filter job so its GITHUB_TOKEN is scoped to the least privileges it needs. This job only checks out code and runs local git diff using the cloned repo; it doesn’t need to write to GitHub or access special scopes like id-token. Thus, contents: read is sufficient and matches CodeQL’s suggested minimal configuration.

Concretely, in .github/workflows/pull-request-main.yml, under jobs: template-compat-path-filter:, add a permissions: section at the same indentation level as runs-on: and outputs:, with contents: read. No additional imports or methods are needed; it is a pure YAML configuration change and does not alter the job’s functional behavior.

Suggested changeset 1
.github/workflows/pull-request-main.yml

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/pull-request-main.yml b/.github/workflows/pull-request-main.yml
--- a/.github/workflows/pull-request-main.yml
+++ b/.github/workflows/pull-request-main.yml
@@ -15,6 +15,8 @@
 jobs:
   template-compat-path-filter:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
     outputs:
       run-template-compat: ${{ steps.filter.outputs.run_template_compat }}
     steps:
EOF
@kalanyuz kalanyuz changed the base branch from main to feature/dynamic-templates February 23, 2026 10:31
@kalanyuz kalanyuz changed the base branch from feature/dynamic-templates to cre-reliability-load-test February 23, 2026 10:31
kalanyuz and others added 3 commits February 24, 2026 15:16
- Add policy snapshot for required merge gates vs advisory diagnostics

- Add canonical PASS/FAIL/SKIP/BLOCKED reason taxonomy and S/AI/M mapping

- Mark Playwright credential bootstrap as proposal-only, non-baseline
…inements (#290)

* validation: end-to-end skill QA, validation report, and framework refinements

Ran the full cre-qa-runner skill end-to-end against the testing framework
deliverables. This PR captures the validation evidence, fixes discovered
during the process, and skill improvements.

Key changes:
- Validation report with 38 PASS / 1 FAIL (pre-existing) / 27 SKIP / 19 BLOCKED
- Validation plan, execution strategy, and stakeholder handoff report
- Failure taxonomy (12 codes) and evidence block format in reporting-rules.md
- Skill-auditor skill added, audit report expanded to all 6 skills
- Playwright setup doc and skill reference updates
- CI path filter now includes internal/ for template compat
- cre-qa-runner SKILL.md: added rule to preserve all template checklist items
- .env.example for credential setup documentation
- Scripts patched: rg -> grep, init_report.sh template path fix

Made-with: Cursor

* docs(validation): add manual operator validation results to report

Adds Section D documenting 18 manual checks Wilson ran independently
in the Cursor IDE terminal, plus the end-to-end cre-qa-runner skill
test that produced the dated QA report.

Made-with: Cursor

* chore: remove skill-audit-report.md

Made-with: Cursor

* fix: resolve all 4 open P3 gaps from validation report

- Update validation-and-report-plan.md Stream 4 to reflect playwright-cli
  skill exists (was marked "Does not exist")
- Merge design doc taxonomy codes into reporting-rules.md (12 → 16 codes:
  added FAIL_TUI, FAIL_NEGATIVE_PATH, FAIL_CONTRACT, BLOCKED_AUTH)
- Add Code column to QA report template for FAIL/BLOCKED traceability
- Improve collect_versions.sh terminal detection (Cursor, VS Code, TERM
  fallback instead of "unknown")

Made-with: Cursor
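The taxonomy codes named in the commit above (FAIL_TUI, FAIL_NEGATIVE_PATH, FAIL_CONTRACT, BLOCKED_AUTH) all begin with one of the four canonical statuses. Assuming that convention holds for every code in reporting-rules.md, which this page does not confirm, a report tool could recover the status from a code as in the Go sketch below. The `Status` type and `statusOf` function are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Status is one of the four canonical result states from the taxonomy.
type Status string

const (
	Pass    Status = "PASS"
	Fail    Status = "FAIL"
	Skip    Status = "SKIP"
	Blocked Status = "BLOCKED"
)

// statusOf derives the status from a reason code by taking the text before
// the first underscore. It assumes every code is prefixed with its status;
// codes that do not follow that convention return false.
func statusOf(code string) (Status, bool) {
	prefix, _, _ := strings.Cut(code, "_")
	switch s := Status(prefix); s {
	case Pass, Fail, Skip, Blocked:
		return s, true
	}
	return "", false
}

func main() {
	for _, code := range []string{"FAIL_TUI", "FAIL_NEGATIVE_PATH", "FAIL_CONTRACT", "BLOCKED_AUTH"} {
		status, ok := statusOf(code)
		fmt.Println(code, status, ok)
	}
}
```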