
【QA TEST Don't merge 】support eb5 #6944

Open
mmglove wants to merge 2 commits into PaddlePaddle:develop from mmglove:liucong-eb5

Conversation


@mmglove mmglove commented Mar 20, 2026

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

  • Add at least one tag to the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If there are none, state the reason in this PR.
  • Provide accuracy results.
  • If the current PR targets a release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

Copilot AI review requested due to automatic review settings March 20, 2026 03:04

paddle-bot bot commented Mar 20, 2026

Thanks for your contribution!

@paddle-bot paddle-bot bot added the contributor External developers label Mar 20, 2026
Contributor

Copilot AI left a comment

Pull request overview

The PR title is "support eb5". Judging from the diff, the changes mainly touch model loading/quantization (especially NVFP4 MoE) and the weight post-processing path, insert a large number of debug logs, and make a breaking change to the layer-count logic in FDConfig.override_name_from_config().

Changes:

  • Adjusts modules_to_convert(): merges exclude patterns from multiple sources and tries to adapt to different model prefix names.
  • Modifies NVFP4 MoE: toggles the gate/up weight-load-order switch, skips the blockscale swizzle/interleave step, and adds many runtime logs.
  • Adds (or leaves commented-out) debug logs in the weight transpose, Linear weight loading, and default loader paths; and hard-codes num_hidden_layers = 1.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 16 comments.

Show a summary per file
File — Description
fastdeploy/model_executor/utils.py — Adds logging and branch handling (including an early return) to the weight transpose / ckpt suffix rename paths.
fastdeploy/model_executor/model_loader/default_loader_v1.py — Adds commented-out parameter-printing debug code.
fastdeploy/model_executor/layers/utils.py — Extends the sources of module-exclusion rules for quantization conversion and the prefix-adaptation logic.
fastdeploy/model_executor/layers/quantization/nvfp4.py — Changes the NVFP4 MoE weight-load order and blockscale handling, and adds many info logs.
fastdeploy/model_executor/layers/moe/moe.py — Adds several commented-out debug logs.
fastdeploy/model_executor/layers/linear.py — Introduces info logs in multiple places (construction / loading / shard-loading paths).
fastdeploy/config.py — Comments out the original remove_tail_layer logic and hard-codes num_hidden_layers = 1.

Comment on lines +133 to +137
logger.info(f"weight_name:{weight_name}")
weight = getattr(layer, weight_name)
if not weight._is_initialized():
    logger.info("权重没初始化啊!")
    return
Copilot AI Mar 20, 2026

The new logger.info calls in process_weight_transpose fire during every weight post-processing step, and one message is in Chinese ("权重没初始化啊!"), which creates production log noise and an internationalization problem. Suggest deleting these info logs, or at least downgrading them to logger.debug with English messages.

Copilot uses AI. Check for mistakes.
Comment on lines +363 to +369
# if hasattr(self, "num_hidden_layers") and self.runner != "pooling":
#     if hasattr(self, "remove_tail_layer"):
#         if self.remove_tail_layer is True:
#             self.num_hidden_layers -= 1
#         elif isinstance(self.remove_tail_layer, int):
#             self.num_hidden_layers -= self.remove_tail_layer
self.num_hidden_layers = 1
Copilot AI Mar 20, 2026

Forcing num_hidden_layers = 1 in override_name_from_config affects the layer configuration of virtually every model (e.g. the number of decoder layers built, KV cache shapes), which is a globally breaking change; commenting out the entire original remove_tail_layer logic also changes existing behavior. Suggest restoring the original logic and adjusting the layer count only for specific models/configurations where it is genuinely needed, controlled explicitly through a config option.
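As a hedged illustration of the suggested fix, the sketch below restores the commented-out remove_tail_layer logic and gates any layer-count override behind an explicit flag. `FDConfigSketch` is a stand-in, not the real FDConfig, and `force_num_hidden_layers` is a hypothetical knob invented for illustration:

```python
class FDConfigSketch:
    """Minimal stand-in for FDConfig to illustrate the suggested fix."""

    def __init__(self, num_hidden_layers, runner="generate",
                 remove_tail_layer=None, force_num_hidden_layers=None):
        self.num_hidden_layers = num_hidden_layers
        self.runner = runner
        if remove_tail_layer is not None:
            self.remove_tail_layer = remove_tail_layer
        # Hypothetical explicit override instead of hard-coding 1 for all models.
        self.force_num_hidden_layers = force_num_hidden_layers

    def override_name_from_config(self):
        # Original remove_tail_layer logic, restored as-is.
        if hasattr(self, "num_hidden_layers") and self.runner != "pooling":
            if hasattr(self, "remove_tail_layer"):
                if self.remove_tail_layer is True:
                    self.num_hidden_layers -= 1
                elif isinstance(self.remove_tail_layer, int):
                    self.num_hidden_layers -= self.remove_tail_layer
        # Debug-only truncation must be requested explicitly, never unconditional.
        if self.force_num_hidden_layers is not None:
            self.num_hidden_layers = self.force_num_hidden_layers
```

With this shape, the QA experiment of running a single layer stays possible (pass the flag explicitly) without breaking every other model by default.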

Comment on lines 508 to +512
@property
def load_up_proj_weight_first(self) -> bool:
    # FlashInfer CUTLASS kernel assumes [Up, Gate] Proj as W13
    # currently defaults to True
    return True
    return False
Copilot AI Mar 20, 2026

Changing NVFP4's load_up_proj_weight_first from True to False directly changes the load order of the MoE gate/up shards (moe.py uses it to compute param_shard_offset), which can swap the gate/up weights and break accuracy. If this is only meant to support a specific checkpoint/model, toggle it via config or when a specific format is detected, rather than changing it globally for all NVFP4.
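One way to make the toggle checkpoint-specific rather than global, sketched with a hypothetical `ckpt_gate_up_order` attribute (not an existing FastDeploy field):

```python
class NVFP4MoEMethodSketch:
    """Illustrative only: gate the shard order on a per-checkpoint flag."""

    def __init__(self, ckpt_gate_up_order="up_gate"):
        # "up_gate" means the checkpoint stores [Up, Gate] as W13;
        # "gate_up" is the exceptional layout this PR seems to target.
        self.ckpt_gate_up_order = ckpt_gate_up_order

    @property
    def load_up_proj_weight_first(self) -> bool:
        # FlashInfer CUTLASS kernel assumes [Up, Gate] Proj as W13,
        # so the default stays True; only known exceptions flip it.
        return self.ckpt_gate_up_order == "up_gate"
```

The default preserves today's behavior for every NVFP4 model, while the exceptional checkpoint opts out explicitly.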

Comment on lines 545 to +546
return loaded_weight_name
fd_suffix_map = {}
Copilot AI Mar 20, 2026

In rename_offline_ckpt_suffix_to_fd_suffix, the outer scope already defines fd_suffix_map (marked noqa); re-creating a variable with the same name inside fn shadows the outer one and adds redundancy, hurting readability. Suggest removing the unused outer fd_suffix_map, or keeping the outer one and only updating that reference inside fn, avoiding the duplicate definition.
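The shadowing issue can be avoided by closing over a single mapping, as in this self-contained sketch (`make_renamer` is illustrative, not the actual FastDeploy helper):

```python
def make_renamer(suffix_map):
    # Reuse the one outer mapping instead of re-binding a same-named dict
    # inside the closure (which shadows it and confuses readers).
    def fn(name):
        for old, new in suffix_map.items():
            if name.endswith(old):
                return name[: -len(old)] + new
        return name
    return fn
```

The closure reads the outer `suffix_map` directly; there is never a second, inner `fd_suffix_map` binding to reason about.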

Comment on lines +363 to +369
# if hasattr(self, "num_hidden_layers") and self.runner != "pooling":
#     if hasattr(self, "remove_tail_layer"):
#         if self.remove_tail_layer is True:
#             self.num_hidden_layers -= 1
#         elif isinstance(self.remove_tail_layer, int):
#             self.num_hidden_layers -= self.remove_tail_layer
self.num_hidden_layers = 1
Copilot AI Mar 20, 2026

The current PR title does not follow the repository's "[TAG] Title" convention (e.g. [BugFix] ...), and the description is still mostly template placeholders (Motivation/Modifications/Tests unfilled). To avoid trouble at merge and traceback time, suggest adding the title tag and describing the motivation, impact, and verification method in the PR description.

Comment on lines +562 to +565
logger.info(f"up_gate_proj_input_scale:{up_gate_proj_input_scale}")
logger.info(f"up_gate_proj_weight_scale_2:{up_gate_proj_weight_scale_2}")
logger.info(f"down_proj_input_scale:{down_proj_input_scale}")
logger.info(f"down_proj_weight_scale_2:{layer.down_proj_weight_scale_2}")
Copilot AI Mar 20, 2026

The logger.info calls added at the end of process_weights_after_loading() print scale values for every MoE layer after weight loading; this is noisy and may leak internal numeric details, and formatting potentially large tensors has a cost. Suggest removing them, or downgrading to debug behind a controllable switch.

Comment on lines +227 to 228
logger.info(f"self.quant_method:{self.quant_method}")
self.quant_method.process_loaded_weights(self, weight_tensor)
Copilot AI Mar 20, 2026

The logger.info added in load_weight() fires for every Linear weight load; large models with many layers will generate massive logs and slow down loading. Suggest deleting it, or switching to logger.debug behind a configurable switch.
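A minimal sketch of the suggested downgrade, using a level guard so the hot path pays nothing when debug is off (the logger name and wrapper function are hypothetical):

```python
import logging

logger = logging.getLogger("fastdeploy.sketch")  # hypothetical logger name

def load_weight_logged(quant_method_name: str) -> None:
    # Hot-path logging: debug level, and skip the formatting work
    # entirely unless debug is actually enabled.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("self.quant_method:%s", quant_method_name)
```

With %-style lazy formatting plus the isEnabledFor guard, the per-weight cost at INFO level is a single level check.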

)

self.hidden_size = fd_config.model_config.hidden_size
logger.info(f"prefix:{prefix}")
Copilot AI Mar 20, 2026

The logger.info added in ReplicatedLinear.__init__ prints the prefix for every linear layer at model-build time; the log volume is large and it is of limited help for locating problems. Suggest removing it or downgrading to debug, enabled only via a switch during troubleshooting.

Suggested change
logger.info(f"prefix:{prefix}")
logger.debug(f"prefix:{prefix}")

"gate",
], f"loaded_shard_id must be one of ['qkv', 'gate'], but got {loaded_shard_id}"

logger.info(f"loaded_shard_id:{loaded_shard_id}")
Copilot AI Mar 20, 2026

The logger.info on loaded_shard_id in weight_loader() fires at shard granularity (especially during sharded loading / TP scenarios), and can easily flood the logs. Suggest downgrading to debug or removing it.

Suggested change
logger.info(f"loaded_shard_id:{loaded_shard_id}")
logger.debug(f"loaded_shard_id:{loaded_shard_id}")

@mmglove
Author

mmglove commented Mar 20, 2026

AI CI Agent Test

This is a test comment from AI CI Agent.

  • Time: 2026-03-20T11:30:43.867741
  • Repo: PaddlePaddle/FastDeploy
  • PR: 6944

@mmglove
Author

mmglove commented Mar 20, 2026

AI CI Agent | skill: pr_review_agent

Request Changes - P0/P1 issue list:

P0 - Critical issues:

  1. fastdeploy/config.py:367 - self.num_hidden_layers = 1 forces the layer count of every model to 1, breaking normal operation of all models

P1 - Major issues:

  1. fastdeploy/model_executor/layers/quantization/nvfp4.py:509 - the load_up_proj_weight_first return value changed from True to False, altering the MoE weight load order and possibly loading weights incorrectly

  2. fastdeploy/model_executor/layers/quantization/nvfp4.py:535-538 - the assertions on weight_scale.shape[2] % 16 == 0 and dtype == paddle.float8_e4m3fn were commented out, removing key validation logic and risking runtime errors

  3. fastdeploy/model_executor/layers/quantization/nvfp4.py:540,546 - _process_scale_interleaved is skipped and the raw weights are used directly, which may break NVFP4 quantization accuracy

  4. Multiple logger.info debug logs were added (linear.py, nvfp4.py, utils.py), hurting inference performance

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 20, 2026 06:26
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ mmglove
❌ mpgemm
You have signed the CLA already but the status is still pending? Let us recheck it.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

default_initializer=paddle.nn.initializer.Constant(0),
is_bias=False,
)
logger.info(f"weight_tmp:{weight_tmp}")
Copilot AI Mar 20, 2026

logger.info(f"weight_tmp:{weight_tmp}") prints the Parameter object (potentially including device/shape, etc.); called frequently during the loading phase, it noticeably slows things down and pollutes the logs. Suggest deleting this info log, or using debug and printing only a summary such as shape/dtype.

Suggested change
logger.info(f"weight_tmp:{weight_tmp}")
logger.debug(
    "Created temporary weight parameter for %s with shape=%s, dtype=%s",
    weight_name,
    tuple(weight_tmp.shape),
    str(weight_tmp.dtype),
)

Comment on lines 543 to 547
def fn(loaded_weight_name, is_moe):
    if fd_config.quant_config is None or fd_config.quant_config.is_checkpoint_bf16:
        return loaded_weight_name
    fd_suffix_map = {}
    # Can be extended to other offline quantization suffixes if needed.
Copilot AI Mar 20, 2026

Reassigning fd_suffix_map = {} inside fn shadows the outer variable of the same name and is redundant (the branches below overwrite it anyway). Suggest deleting the line or reusing the outer mapping directly, avoiding the unnecessary reset and the readability cost.

Comment on lines 535 to +542
for name, weight_scale in [
    ("up_gate", layer.up_gate_proj_weight_scale),
    ("down", layer.down_proj_weight_scale),
]:
    assert weight_scale.shape[2] % 16 == 0, f"Expected {name}_weight_scale.dim(2) to be divisible by 16"
    assert (
        weight_scale.dtype == paddle.float8_e4m3fn
    ), f"{name} Weight Blockscale must be represented as FP8-E4M3"

up_gate_proj_blockscale_swizzled = _process_scale_interleaved(layer.up_gate_proj_weight_scale)
free_tensor(layer.up_gate_proj_weight_scale)
layer.up_gate_proj_weight_scale = None
if weight_scale.shape[2] % 4 != 0:
    logger.warning(
        "NVFP4 %s_weight_scale K' not multiple of 4: shape=%s, group_size=%s",
Copilot AI Mar 20, 2026

This removes the hard assertions on weight_scale dtype / dimension divisibility (previously used to guarantee the preconditions of the subsequent scale interleave/swizzle); execution now continues even when the shape does not satisfy the requirements, which may cause the FlashInfer kernel to read out of bounds or produce wrong numbers. Suggest keeping the necessary assertions or explicit errors (at least that K' is divisible by 4 and the dtype is float8_e4m3fn) instead of merely warning and continuing.
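A minimal, hypothetical sketch of replacing silent warnings with fail-fast errors; plain strings stand in for paddle dtypes, and `check_blockscale` is not a FastDeploy function:

```python
def check_blockscale(shape, dtype, name):
    # Fail fast with an explicit error instead of continuing after a
    # warning; the caller never reaches the kernel with bad inputs.
    if shape[2] % 16 != 0:
        raise ValueError(
            f"Expected {name}_weight_scale.dim(2) divisible by 16, got {shape[2]}"
        )
    if dtype != "float8_e4m3fn":
        raise ValueError(
            f"{name} weight blockscale must be FP8-E4M3, got {dtype}"
        )
```

Raising at load time converts a potential out-of-bounds kernel read into a clear, attributable error message.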

Comment on lines 1181 to 1183

logger.info(f"loaded_shard_id:{loaded_shard_id}")
if loaded_shard_id == "qkv":
Copilot AI Mar 20, 2026

weight_loader prints loaded_shard_id at info level on the hot path; it is called at high frequency during sharded weight loading, bloating logs and slowing loading. Suggest deleting it or downgrading to debug, recorded only when a troubleshooting switch is on.

param.tensor_track.mark(start=param_shard_offset, end=param_shard_offset + param_shard_size)

param = slice_fn(param, output_dim, start=param_shard_offset, end=param_shard_offset + param_shard_size)
logger.info(f"loaded_weight.shape:{loaded_weight.shape}")
Copilot AI Mar 20, 2026

The logger.info(loaded_weight.shape) added in qkv_weight_loader sits on a loop / shard-loading path, generating a large volume of logs and hurting load performance. Suggest deleting it, or using debug and emitting the necessary information only on the failure branch (before the assert fails).

Suggested change
logger.info(f"loaded_weight.shape:{loaded_weight.shape}")
if param.shape != loaded_weight.shape:
    logger.debug(
        "Shape mismatch before loading qkv weight, param.shape=%s, loaded_weight.shape=%s",
        param.shape,
        loaded_weight.shape,
    )

Comment on lines +548 to 560
# up_gate_proj_blockscale_swizzled = _process_scale_interleaved(layer.up_gate_proj_weight_scale)
up_gate_proj_blockscale_swizzled = layer.up_gate_proj_weight_scale
create_parameter_and_copy(
    layer, name="up_gate_proj_blockscale_swizzled", weight=up_gate_proj_blockscale_swizzled
)
down_proj_blockscale_swizzled = _process_scale_interleaved(layer.down_proj_weight_scale)
free_tensor(layer.up_gate_proj_weight_scale)
layer.up_gate_proj_weight_scale = None

# down_proj_blockscale_swizzled = _process_scale_interleaved(layer.down_proj_weight_scale)
down_proj_blockscale_swizzled = layer.down_proj_weight_scale
create_parameter_and_copy(layer, name="down_proj_blockscale_swizzled", weight=down_proj_blockscale_swizzled)
free_tensor(layer.down_proj_weight_scale)
layer.down_proj_weight_scale = None
Copilot AI Mar 20, 2026

blockscale_swizzled for up_gate/down now reuses the raw weight_scale directly (with _process_scale_interleaved commented out), but the downstream flashinfer_cutlass_fused_moe still consumes it as a "swizzled blockscale" (passed into quant_scales as view(int32)). If the kernel expects an interleaved/swizzled layout, this produces wrong results. Suggest restoring _process_scale_interleaved, or explicitly implementing swizzle logic aligned with the kernel, and documenting the rationale in comments/docs.

Comment on lines +604 to +614
logger.info(f"up_gate_proj_input_scale:{layer.up_gate_proj_input_scale_quant}")
logger.info(f"g1_alphas:{layer.g1_alphas}")
logger.info(
    f"layer.up_gate_proj_blockscale_swizzled:{layer.up_gate_proj_blockscale_swizzled.view(paddle.float8_e4m3fn)}"
)
logger.info(f"down_proj_input_scale_quant:{layer.down_proj_input_scale_quant}")
logger.info(
    f"layer.down_proj_blockscale_swizzled:{layer.down_proj_blockscale_swizzled.view(paddle.float8_e4m3fn)}"
)
logger.info(f"g2_alphas:{layer.g2_alphas}")

Copilot AI Mar 20, 2026

This uses logger.info on the MoE apply hot path to print the contents of scales and weight blockscales (including view(float8) tensor values), which severely degrades performance and can flood the logs. Suggest deleting these info logs, or switching to debug and printing only shape/dtype/statistics (max/min), off by default.

Suggested change
logger.info(f"up_gate_proj_input_scale:{layer.up_gate_proj_input_scale_quant}")
logger.info(f"g1_alphas:{layer.g1_alphas}")
logger.info(
f"layer.up_gate_proj_blockscale_swizzled:{layer.up_gate_proj_blockscale_swizzled.view(paddle.float8_e4m3fn)}"
)
logger.info(f"down_proj_input_scale_quant:{layer.down_proj_input_scale_quant}")
logger.info(
f"layer.down_proj_blockscale_swizzled:{layer.down_proj_blockscale_swizzled.view(paddle.float8_e4m3fn)}"
)
logger.info(f"g2_alphas:{layer.g2_alphas}")
def _log_tensor_stats(name, tensor):
    # Debug-only tensor stats to avoid logging full tensor contents in hot path
    try:
        t_min = float(paddle.min(tensor))
        t_max = float(paddle.max(tensor))
    except Exception:
        t_min, t_max = None, None
    logger.debug(
        "MoE quant tensor stats - %s: shape=%s, dtype=%s, min=%s, max=%s",
        name,
        list(tensor.shape),
        str(tensor.dtype),
        t_min,
        t_max,
    )

_log_tensor_stats("up_gate_proj_input_scale_quant", layer.up_gate_proj_input_scale_quant)
_log_tensor_stats("g1_alphas", layer.g1_alphas)
_log_tensor_stats("up_gate_proj_blockscale_swizzled", layer.up_gate_proj_blockscale_swizzled)
_log_tensor_stats("down_proj_input_scale_quant", layer.down_proj_input_scale_quant)
_log_tensor_stats("down_proj_blockscale_swizzled", layer.down_proj_blockscale_swizzled)
_log_tensor_stats("g2_alphas", layer.g2_alphas)

Comment on lines +293 to +302
# logger.info(f"param:{param}")
output_size = param[expert_id - self.expert_id_offset].shape[SHARD_ID_TO_SHARDED_DIM["gate"]]
shard_offsets = [
# (shard_id, shard_offset, shard_size)
("gate", 0, output_size // 2 * self.tp_size),
("up", output_size // 2 * self.tp_size, output_size // 2 * self.tp_size),
]

# logger.info(f"shard_offsets是啥:{shard_offsets}")

Copilot AI Mar 20, 2026

Several commented-out logger.info debug leftovers were added in this function. Suggest deleting these commented lines before merging to keep the code readable; for debugging, use a controlled debug switch.

@mmglove
Author

mmglove commented Mar 20, 2026

AI CI Agent | skill: ci_failed_agent

Details

CI Failure Analysis

Now let me generate the full analysis report:


🔍 PR #6944 CI Run Analysis


📋 PR Basic Info

Field       Value
Title       【QA TEST Don't merge 】support eb5
Author      @mmglove
Status      🟢 OPEN
Branch      liucong-eb5 → develop
Mergeable   ⚠️ CONFLICTING (has conflicts)
PR link     #6944
Created     2026-03-20 03:04:22 UTC

🚦 CI Check Results

Summary: ✅ 1 passed | ❌ 1 failed | ⏳ 1 pending

Failed jobs

Job name                 Duration  Details
Trigger Jenkins for PR   54s       View

Passed / pending jobs

Job name                               Duration  Details
Remove skip-ci labels on new commits   5s        View
license/cla                            -         Sign the CLA

❌ Failure Analysis

Trigger Jenkins for PR (Exit Code: 1)

Error summary: the Jenkins test job paddle_fastdeploy_metax_smoketest failed

Specific error:

Traceback (most recent call last):
  File "/app/main.py", line 131, in <module>
    main()
  File "/app/main.py", line 123, in main
    raise Exception(
Exception: Build status returned "FAILURE". Build has failed ☹️.

Detailed execution steps:

1. ✅ Connected to Jenkins (cicd.metax-tech.com)
2. ✅ Triggered the Jenkins job: paddle_fastdeploy_metax_smoketest #5251
3. ❌ The Jenkins job's internal tests failed

Jenkins job link: https://cicd.metax-tech.com/job/paddle_fastdeploy_metax_smoketest/5251/console


🔬 Root Cause Analysis

Dimension        Conclusion
Failure type     Jenkins internal test failure
Direct cause     The smoketest did not pass
Possible causes
  • 🔸 The newly added eb5 feature code has a problem
  • 🔸 The code conflicts with the develop branch (the PR shows CONFLICTING)
  • 🔸 A test environment / configuration problem
  • 🔸 Dependent components not installed or configured correctly

💡 Fix Suggestions

1. First priority: resolve the branch conflicts

# 1. Sync the latest code from develop
git checkout liucong-eb5
git fetch origin develop
git rebase origin/develop

# 2. Commit after resolving the conflicts
git add .
git rebase --continue
git push origin liucong-eb5 --force

2. Check the detailed Jenkins test logs

Visit the link below to see which specific test case failed:

3. Locate the problem from the Jenkins log

Search the Jenkins log for these keywords:

- FAILED (test failure)
- AssertionError (assertion error)
- error: (compile error)
- Traceback (Python stack trace)

4. Reproduce the test locally

# Try running the smoketest locally to verify the feature
# For the exact command, refer to the Jenkins job configuration or the project docs

5. Sign the CLA

# The CLA status is PENDING and needs to be signed
# Visit: https://cla-assistant.io/PaddlePaddle/FastDeploy?pullRequest=6944

⚠️ Special Note

This is a QA test PR whose title explicitly says "Don't merge", so the failure is expected test feedback. Suggestions:

  1. ✅ Confirm whether this is an expected test failure or a code problem
  2. ✅ If it is a code problem, fix it and re-trigger CI
  3. ✅ After the fix, trigger CI again to verify

📌 Next Steps

Action                      Command / Link
Inspect the conflict        git checkout develop && git pull && git checkout liucong-eb5 && git merge develop
View the Jenkins log        Console Output
Re-trigger CI               Comment /retest on the PR, or push a new commit
Sign the CLA                CLA Assistant

Analysis time: 2026-03-20


Would you like me to post this analysis report as a comment on PR #6944?

@mmglove
Author

mmglove commented Mar 20, 2026

AI CI Agent | skill: ci_failed_agent

Details

CI Failure Analysis

🔍 PR #6944 CI Run Analysis


🚦 CI Check Results

Summary: ✅ 1 passed | ❌ 1 failed | ⏳ 1 running

Job name                               Duration  Details
Trigger Jenkins for PR                 54s       View
Remove skip-ci labels on new commits   5s        View
license/cla                            -         View

❌ Failure Analysis

Trigger Jenkins for PR (Exit Code: 1)

Error summary: the Jenkins job build failed

Specific error:

JENKINS_ACTION: Successfully connected to Jenkins.
JENKINS_ACTION: Start to start jenkins job
JENKINS_ACTION: jenkins job url: https://cicd.metax-tech.com/job/paddle_fastdeploy_metax_smoketest/5251/console
Exception: Build status returned "FAILURE". Build has failed ☹️.

Root cause analysis:

Fix suggestions:

  1. Visit the Jenkins console for the detailed log:
    https://cicd.metax-tech.com/job/paddle_fastdeploy_metax_smoketest/5251/console
    
  2. Locate the specific failure cause from the Jenkins log (it could be a test failure, a compile error, or an environment issue)
  3. Re-trigger CI after fixing the code

📌 Common Failure Handling

Failed job                How to handle
Trigger Jenkins for PR    Check the Jenkins console log to locate the specific failure

Analysis time: 2026-03-20

@mmglove
Author

mmglove commented Mar 20, 2026

AI CI Agent Test - Suggestion Demo

Suggested fix

- name: test
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Fix permission issue
      run: chmod +x scripts/fix.sh

Code suggestion

-     print("debug info")
+     logger.debug("debug info")

Full suggestion

def process_data(data):
-     result = data.strip().lower()
+     result = data.strip().lower().strip()
     return result
  • Time: 2026-03-20T18:21:37.864034
  • Repo: PaddlePaddle/FastDeploy
  • PR: 6944

@mmglove
Author

mmglove commented Mar 20, 2026

AI CI Agent Test - Suggestion Demo

Suggested fix

- name: test
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Fix permission issue
      run: chmod +x scripts/fix.sh

Code suggestion

-     print("debug info")
+     logger.debug("debug info")

Full suggestion

def process_data(data):
-     result = data.strip().lower()
+     result = data.strip().lower().strip()
     return result
  • Time: 2026-03-20T18:21:39.244401
  • Repo: PaddlePaddle/FastDeploy
  • PR: 6944


Labels

contributor External developers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants