3 changes: 3 additions & 0 deletions conf/serviceConfig/longjob.xml
Original file line number Diff line number Diff line change
@@ -25,6 +25,9 @@
<message>
<name>org.zstack.header.longjob.APIResumeLongJobMsg</name>
</message>
<message>
<name>org.zstack.header.longjob.APISuspendLongJobMsg</name>
</message>
<message>
<name>org.zstack.header.longjob.APICleanLongJobMsg</name>
</message>
@@ -0,0 +1,34 @@
package org.zstack.header.longjob;

import org.zstack.header.message.APIEvent;
import org.zstack.header.rest.RestResponse;

@RestResponse(allTo = "inventory")
public class APISuspendLongJobEvent extends APIEvent {
private LongJobInventory inventory;

public LongJobInventory getInventory() {
return inventory;
}

public void setInventory(LongJobInventory inventory) {
this.inventory = inventory;
}

public APISuspendLongJobEvent() {
super();
}

public APISuspendLongJobEvent(String apiId) {
super(apiId);
}

public static APISuspendLongJobEvent __example__() {
APISuspendLongJobEvent event = new APISuspendLongJobEvent();
LongJobInventory inv = new LongJobInventory();
inv.setUuid(uuid());
inv.setState(LongJobState.Suspended);
event.setInventory(inv);
return event;
}
}
@@ -0,0 +1,32 @@
package org.zstack.header.longjob

import org.zstack.header.longjob.LongJobInventory
import org.zstack.header.errorcode.ErrorCode

doc {

title "在这里输入结构的名称"

⚠️ Potential issue | 🟡 Minor

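Document title is not filled in correctly

title "在这里输入结构的名称" ("enter the structure name here") is placeholder text and should be replaced with the actual structure name, for example "暂停长任务事件" ("suspend long job event") or "APISuspendLongJobEvent".

Suggested fix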
-	title "在这里输入结构的名称"
+	title "暂停长任务事件"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
title "在这里输入结构的名称"
title "暂停长任务事件"
🤖 Prompt for AI Agents
In
`@header/src/main/java/org/zstack/header/longjob/APISuspendLongJobEventDoc_zh_cn.groovy`
at line 8, Replace the placeholder title string in
APISuspendLongJobEventDoc_zh_cn.groovy: change the line title "在这里输入结构的名称" to a
proper Chinese title such as "暂停长任务事件" (or the English identifier
"APISuspendLongJobEvent") so the document header accurately reflects the
structure; update only the title(...) call in that file to the chosen
descriptive text.


ref {
name "inventory"
path "org.zstack.header.longjob.APISuspendLongJobEvent.inventory"
desc "null"
type "LongJobInventory"
since "5.5.0"
clz LongJobInventory.class
}
field {
name "success"
desc ""
type "boolean"
since "5.5.0"
}
Comment on lines +10 to +23

⚠️ Potential issue | 🟡 Minor

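Field descriptions are missing or are placeholders

  • Line 13: desc "null" should describe what the inventory field means
  • Line 20: desc "" should describe what the success field means
Suggested fix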
 	ref {
 		name "inventory"
 		path "org.zstack.header.longjob.APISuspendLongJobEvent.inventory"
-		desc "null"
+		desc "长任务详情"
 		type "LongJobInventory"
 		since "5.5.0"
 		clz LongJobInventory.class
 	}
 	field {
 		name "success"
-		desc ""
+		desc "操作是否成功"
 		type "boolean"
 		since "5.5.0"
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
ref {
name "inventory"
path "org.zstack.header.longjob.APISuspendLongJobEvent.inventory"
desc "null"
type "LongJobInventory"
since "5.5.0"
clz LongJobInventory.class
}
field {
name "success"
desc ""
type "boolean"
since "5.5.0"
}
ref {
name "inventory"
path "org.zstack.header.longjob.APISuspendLongJobEvent.inventory"
desc "Long job inventory details"
type "LongJobInventory"
since "5.5.0"
clz LongJobInventory.class
}
field {
name "success"
desc "Whether the operation succeeded"
type "boolean"
since "5.5.0"
}
🤖 Prompt for AI Agents
In
`@header/src/main/java/org/zstack/header/longjob/APISuspendLongJobEventDoc_zh_cn.groovy`
around lines 10 - 23, The documentation contains missing placeholder
descriptions for the APISuspendLongJobEvent fields; update the ref block for
"inventory" (path "org.zstack.header.longjob.APISuspendLongJobEvent.inventory",
clz LongJobInventory) to include a concise Chinese description of what the
LongJobInventory represents (e.g., long job metadata being suspended), and
update the field block "success" to include a Chinese description indicating
whether the suspend operation succeeded (boolean true/false); ensure both desc
strings are meaningful, concise, and match existing doc style.

ref {
name "error"
path "org.zstack.header.longjob.APISuspendLongJobEvent.error"
desc "错误码,若不为null,则表示操作失败, 操作成功时该字段为null",false
type "ErrorCode"
since "5.5.0"
clz ErrorCode.class
}
}
@@ -0,0 +1,38 @@
package org.zstack.header.longjob;

import org.springframework.http.HttpMethod;
import org.zstack.header.identity.Action;
import org.zstack.header.message.APIMessage;
import org.zstack.header.message.APIParam;
import org.zstack.header.rest.RestRequest;

@Action(category = LongJobConstants.ACTION_CATEGORY)
@RestRequest(
path = "/longjobs/{uuid}/actions",
isAction = true,
method = HttpMethod.PUT,
responseClass = APISuspendLongJobEvent.class
)
public class APISuspendLongJobMsg extends APIMessage implements LongJobMessage {
@APIParam(resourceType = LongJobVO.class, checkAccount = true)
private String uuid;

public String getUuid() {
return uuid;
}

public void setUuid(String uuid) {
this.uuid = uuid;
}

public static APISuspendLongJobMsg __example__() {
APISuspendLongJobMsg msg = new APISuspendLongJobMsg();
msg.setUuid(uuid());
return msg;
}

@Override
public String getLongJobUuid() {
return uuid;
}
}
@@ -0,0 +1,58 @@
package org.zstack.header.longjob

import org.zstack.header.longjob.APISuspendLongJobEvent

doc {
title "SuspendLongJob"

category "longjob"

desc """在这里填写API描述"""

rest {
request {
url "PUT /v1/longjobs/{uuid}/actions"

header (Authorization: 'OAuth the-session-uuid')

clz APISuspendLongJobMsg.class

desc """"""

params {

column {
name "uuid"
enclosedIn "suspendLongJob"
desc "资源的UUID,唯一标示该资源"
location "url"
type "String"
optional false
since "5.5.0"
}
column {
name "systemTags"
enclosedIn ""
desc "系统标签"
location "body"
type "List"
optional true
since "5.5.0"
}
column {
name "userTags"
enclosedIn ""
desc "用户标签"
location "body"
type "List"
optional true
since "5.5.0"
}
}
}

response {
clz APISuspendLongJobEvent.class
}
}
}
@@ -10,6 +10,7 @@
public interface LongJob {
void start(LongJobVO job, ReturnValueCompletion<APIEvent> completion);
default void cancel(LongJobVO job, ReturnValueCompletion<Boolean> completion) {}
default void suspend(LongJobVO job, ReturnValueCompletion<Boolean> completion) {}
default void resume(LongJobVO job, ReturnValueCompletion<APIEvent> completion) {}
Comment on lines +13 to 14

⚠️ Potential issue | 🟡 Minor

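Please add Javadoc for the new interface method.

suspend(...) is a new interface method and, per the coding guidelines, needs Javadoc. The comment should describe the default no-op behavior and how completion is expected to be invoked.

📝 Suggested comment example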
+    /**
+     * Suspend the long job if supported.
+     * Default implementation is no-op; implementations should invoke completion.
+     */
     default void suspend(LongJobVO job, ReturnValueCompletion<Boolean> completion) {}
🤖 Prompt for AI Agents
In `@header/src/main/java/org/zstack/header/longjob/LongJob.java` around lines 13
- 14, Add Javadoc to the new LongJob.suspend(LongJobVO,
ReturnValueCompletion<Boolean>) method (and update resume(LongJobVO,
ReturnValueCompletion<APIEvent>) if missing) that states this default
implementation is a no-op, documents the semantics of the parameters (LongJobVO
job and the completion callback), and clearly describes the expected behavior
for calling completion (e.g., invoke completion.success(true/false) on success
or completion.fail(Throwable) on error). Reference the LongJob interface and the
exact method signatures so the doc clarifies default behavior and the
caller/implementer contract for invoking the provided ReturnValueCompletion.
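
As an illustration of the contract described above, here is a minimal, self-contained sketch. It does not use the real ZStack types: `SketchCompletion`, `SketchLongJob`, `PlainJob`, `PausableJob`, and `SuspendSupport` are hypothetical stand-ins, and the meaning given to `success(true)` is an assumption rather than the project's documented contract. The reflection check is similar in spirit to the `method.isDefault()` test in `checkBehaviorSupported`, but it is not that method.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for ZStack's ReturnValueCompletion<T>; not the real class.
interface SketchCompletion<T> {
    void success(T value);
    void fail(String reason);
}

// Hypothetical, reduced version of the LongJob interface with only suspend().
interface SketchLongJob {
    /**
     * Suspends the job if the implementation supports it.
     * The default implementation is a no-op and never invokes {@code completion},
     * so callers should check for support before calling it.
     *
     * @param jobUuid    UUID of the job to suspend
     * @param completion invoked with success(true) once the job is paused (assumed meaning),
     *                   or fail(reason) if suspending failed
     */
    default void suspend(String jobUuid, SketchCompletion<Boolean> completion) {}
}

// A job type that keeps the default: suspend is treated as unsupported.
final class PlainJob implements SketchLongJob {}

// A job type that overrides suspend and honours the completion contract.
final class PausableJob implements SketchLongJob {
    private boolean paused;

    @Override
    public void suspend(String jobUuid, SketchCompletion<Boolean> completion) {
        paused = true;
        completion.success(true);
    }

    boolean isPaused() {
        return paused;
    }
}

final class SuspendSupport {
    // An inherited default method means the job type did not implement suspend.
    static boolean supportsSuspend(SketchLongJob job) {
        try {
            Method m = job.getClass().getMethod("suspend", String.class, SketchCompletion.class);
            return !m.isDefault();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the default never calls `completion`, a caller that skipped the support check would wait forever; that is the behavior the Javadoc needs to spell out.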

default void clean(LongJobVO job, NoErrorCompletion completion) {}
default Class getAuditType() {
@@ -11,6 +11,7 @@ public interface LongJobFactory {
LongJob getLongJob(String jobName);
TreeMap<String, String> getFullJobName();
boolean supportCancel(String jobName);
boolean supportSuspend(String jobName);
boolean supportResume(String jobName);
boolean supportClean(String jobName);
}
10 changes: 10 additions & 0 deletions longjob/src/main/java/org/zstack/longjob/LongJobFactoryImpl.java
@@ -30,6 +30,7 @@ public class LongJobFactoryImpl implements LongJobFactory, Component {
private TreeMap<String, String> fullJobName = new TreeMap<>();

private Set<String> notSupportCancelJobType = new HashSet<>();
private Set<String> notSupportSuspendJobType = new HashSet<>();
private Set<String> notSupportResumeJobType = new HashSet<>();
private Set<String> notSupportCleanJobType = new HashSet<>();

@@ -80,6 +81,11 @@ public boolean supportCancel(String jobName) {
return !notSupportCancelJobType.contains(jobName);
}

@Override
public boolean supportSuspend(String jobName) {
return !notSupportSuspendJobType.contains(jobName);
}

@Override
public boolean supportResume(String jobName) {
return !notSupportResumeJobType.contains(jobName);
@@ -96,6 +102,10 @@ private void checkBehaviorSupported(String jobName, LongJob job) {
notSupportCancelJobType.add(jobName);
}

if (method.getName().equals("suspend") && method.isDefault()) {
notSupportSuspendJobType.add(jobName);
}

if (method.getName().equals("resume") && method.isDefault()) {
notSupportResumeJobType.add(jobName);
}
76 changes: 76 additions & 0 deletions longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java
@@ -127,6 +127,8 @@ private void handleApiMessage(APIMessage msg) {
handle((APIRerunLongJobMsg) msg);
} else if (msg instanceof APIResumeLongJobMsg) {
handle((APIResumeLongJobMsg) msg);
} else if (msg instanceof APISuspendLongJobMsg) {
handle((APISuspendLongJobMsg) msg);
} else if (msg instanceof APICleanLongJobMsg) {
handle((APICleanLongJobMsg) msg);
} else {
@@ -424,6 +426,80 @@ public String getName() {
});
}

private void handle(APISuspendLongJobMsg msg) {
thdf.chainSubmit(new ChainTask(msg) {
@Override
public String getSyncSignature() {
return "longjob-" + msg.getUuid();
}

@Override
public void run(SyncTaskChain chain) {
final APISuspendLongJobEvent evt = new APISuspendLongJobEvent(msg.getId());
suspendLongJob(msg.getUuid(), new ReturnValueCompletion<LongJobVO>(chain) {
@Override
public void success(LongJobVO vo) {
evt.setInventory(LongJobInventory.valueOf(vo));
bus.publish(evt);
chain.next();
}

@Override
public void fail(ErrorCode errorCode) {
evt.setError(errorCode);
bus.publish(evt);
chain.next();
}
});
}

@Override
public String getName() {
return String.format("suspend-longjob-%s", msg.getUuid());
}
});
}

private void suspendLongJob(String uuid, ReturnValueCompletion<LongJobVO> completion) {
Tuple t = Q.New(LongJobVO.class).eq(LongJobVO_.uuid, uuid).select(LongJobVO_.state, LongJobVO_.jobName).findTuple();
LongJobState currentState = t.get(0, LongJobState.class);
String jobName = t.get(1, String.class);

if (currentState == LongJobState.Suspended) {
LongJobVO vo = dbf.findByUuid(uuid, LongJobVO.class);
completion.success(vo);
return;
}

if (currentState != LongJobState.Running) {
completion.fail(err(ORG_ZSTACK_LONGJOB_10002, LongJobErrors.NOT_SUPPORTED, "can only suspend running jobs, current state: %s", currentState));
return;
}

if (!longJobFactory.supportSuspend(jobName)) {
completion.fail(err(ORG_ZSTACK_LONGJOB_10002, LongJobErrors.NOT_SUPPORTED, "job type %s does not support suspend", jobName));
return;
}

LongJobVO vo = dbf.findByUuid(uuid, LongJobVO.class);
LongJob job = longJobFactory.getLongJob(vo.getJobName());

job.suspend(vo, new ReturnValueCompletion<Boolean>(completion) {
@Override
public void success(Boolean suspended) {
LongJobVO updatedVo = changeState(uuid, LongJobStateEvent.suspend);
logger.info(String.format("longjob [uuid:%s, name:%s] has been suspended", vo.getUuid(), vo.getName()));
completion.success(updatedVo);
}
Comment on lines +487 to +493

⚠️ Potential issue | 🟠 Major

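The suspended return value must be handled; otherwise the job may be incorrectly marked as Suspended.
Currently the state switches to Suspended whether job.suspend(...) returns true or false. If an implementation returns false to indicate that the job was not actually suspended, this produces an incorrect state.

Change the state only when suspended == true, or, per the contract, treat false as a failure or keep the original state.

🔧 Suggested fix example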
job.suspend(vo, new ReturnValueCompletion<Boolean>(completion) {
    @Override
    public void success(Boolean suspended) {
-        LongJobVO updatedVo = changeState(uuid, LongJobStateEvent.suspend);
-        logger.info(String.format("longjob [uuid:%s, name:%s] has been suspended", vo.getUuid(), vo.getName()));
-        completion.success(updatedVo);
+        if (Boolean.TRUE.equals(suspended)) {
+            LongJobVO updatedVo = changeState(uuid, LongJobStateEvent.suspend);
+            logger.info(String.format("longjob [uuid:%s, name:%s] has been suspended", vo.getUuid(), vo.getName()));
+            completion.success(updatedVo);
+        } else {
+            logger.info(String.format("longjob [uuid:%s, name:%s] suspend returned false; keep state unchanged", vo.getUuid(), vo.getName()));
+            completion.success(dbf.findByUuid(uuid, LongJobVO.class));
+        }
    }
🤖 Prompt for AI Agents
In `@longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java` around
lines 487 - 493, The success handler for job.suspend currently ignores the
suspended Boolean and always transitions the job to Suspended; modify the
anonymous ReturnValueCompletion<Boolean> in the job.suspend(...) call so it
checks the suspended value and only calls changeState(uuid,
LongJobStateEvent.suspend) and completion.success(...) when suspended == true,
and otherwise call completion.fail(...) or completion.success(vo) (to keep
original state) per the method contract; update the success(Boolean suspended)
method in that block and ensure logging reflects whether the suspend actually
succeeded.


@Override
public void fail(ErrorCode errorCode) {
logger.error(String.format("failed to suspend longjob [uuid:%s, name:%s]: %s", vo.getUuid(), vo.getName(), errorCode));
completion.fail(errorCode);
}
});
}
Comment on lines +484 to +501

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Find the LongJobState and LongJobStateEvent enum definitions
find . -name "*.java" -type f | xargs grep -l "enum LongJobState\|enum LongJobStateEvent" | head -5

Repository: MatheMatrix/zstack

Length of output: 253


🏁 Script executed:

# Find cancelLongJob and the other state-transition methods in LongJobManagerImpl.java
rg -n "cancelLongJob|suspendLongJob|resumeLongJob|changeState" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java -A 5 -B 1

Repository: MatheMatrix/zstack

Length of output: 6832


🏁 Script executed:

# Find the definitions of LongJobState and LongJobStateEvent
rg -n "enum LongJobState|enum LongJobStateEvent" --type=java -A 25

Repository: MatheMatrix/zstack

Length of output: 4185


🏁 Script executed:

# Inspect all possible transitions from the Running state
rg -n "Running\.transactions" header/src/main/java/org/zstack/header/longjob/LongJobState.java -A 10

Repository: MatheMatrix/zstack

Length of output: 750


🏁 Script executed:

# Inspect the implementation of the changeState method
rg -n "private.*changeState|public.*changeState" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java -A 15

Repository: MatheMatrix/zstack

Length of output: 44


🏁 Script executed:

# Look for the changeState method a different way
rg -n "changeState" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java | head -20

Repository: MatheMatrix/zstack

Length of output: 824


🏁 Script executed:

# Find the full implementation of changeState
rg -n "def changeState|LongJobVO changeState" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java -A 20

Repository: MatheMatrix/zstack

Length of output: 44


🏁 Script executed:

# Use ast-grep to find the changeState method
ast-grep --pattern 'LongJobVO changeState($_, $_) { $$$ }'

Repository: MatheMatrix/zstack

Length of output: 44


🏁 Script executed:

# Find the changeState method definition by searching for its return type
rg -n "changeState\(" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java -B 2 | grep -E "private|public|protected" -A 2

Repository: MatheMatrix/zstack

Length of output: 1026


🏁 Script executed:

# Search the whole file for the method signature
cat longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java | grep -n "changeState" -B 5 | head -50

Repository: MatheMatrix/zstack

Length of output: 2176


🏁 Script executed:

# Inspect a specific line range of the file directly to locate the changeState method definition
wc -l longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java

Repository: MatheMatrix/zstack

Length of output: 131


🏁 Script executed:

# Find the full definition of changeState (in the latter part of the file)
tail -400 longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java | head -200 | grep -n "changeState" -B 5 -A 15

Repository: MatheMatrix/zstack

Length of output: 4088


🏁 Script executed:

# Search all method definitions to find changeState
rg -n "^\s*(private|public|protected|static).*changeState" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java

Repository: MatheMatrix/zstack

Length of output: 44


🏁 Script executed:

# Use sed to find the changeState method definition
sed -n '900,1005p' longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java | grep -n "changeState" -B 5 -A 20

Repository: MatheMatrix/zstack

Length of output: 1126


🏁 Script executed:

# Use sed to view a specific line range of the file directly
sed -n '750,850p' longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java

Repository: MatheMatrix/zstack

Length of output: 3523


🏁 Script executed:

# Search for method definitions with a LongJobVO return type
rg -n "LongJobVO.*changeState|changeState.*\{" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java -B 1 -A 20 | head -80

Repository: MatheMatrix/zstack

Length of output: 4100


🏁 Script executed:

# Search for a changeState method declared private or public
sed -n '800,900p' longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java | grep -n "changeState\|private\|public"

Repository: MatheMatrix/zstack

Length of output: 513


🏁 Script executed:

# Use awk to find the exact location of the changeState method
awk '/private.*changeState|public.*changeState/{print NR":"$0; for(i=1;i<=30;i++){getline; print NR":"$0}}' longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java | head -60

Repository: MatheMatrix/zstack

Length of output: 44


🏁 Script executed:

# Search for the changeState method definition, working backwards
rg -n "^\s*private\s+LongJobVO\s+changeState|^\s*public\s+LongJobVO\s+changeState" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java -A 30

Repository: MatheMatrix/zstack

Length of output: 44


🏁 Script executed:

# Go through the whole file and search for every LongJobVO changeState method
grep -n "LongJobVO changeState" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java -A 25

Repository: MatheMatrix/zstack

Length of output: 44


🏁 Script executed:

# Search for all lines containing changeState
rg "changeState" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java -n

Repository: MatheMatrix/zstack

Length of output: 824


🏁 Script executed:

# Try other patterns for the method definition
rg -n "changeState.*\(" longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java | tail -20

Repository: MatheMatrix/zstack

Length of output: 824


🏁 Script executed:

# Inspect the end of the file; changeState may be defined there
tail -200 longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java

Repository: MatheMatrix/zstack

Length of output: 7196


🏁 Script executed:

# Find the class declaration and inheritance of LongJobManagerImpl
head -50 longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java | grep -n "class\|extends\|implements"

Repository: MatheMatrix/zstack

Length of output: 44


🏁 Script executed:

# Search all files for the changeState method definition
rg -n "LongJobVO.*changeState\s*\(" --type=java -B 2 -A 20 | head -100

Repository: MatheMatrix/zstack

Length of output: 10796


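Consider adding an intermediate state for consistency, although the current design may rely on idempotency

Compared with cancelLongJob, the cancel path sets the intermediate Canceling state before calling job.cancel() (line 319) so that concurrent requests cannot interfere. suspendLongJob sets no intermediate state before calling job.suspend(); it only moves to Suspended through changeState() after success (line 490).

This means two concurrent suspend requests could both pass the Running state check (line 465) and call job.suspend(). Note, however:

  1. changeState() synchronizes with synchronized (jobUuids.intern(uuid))
  2. The state machine allows the idempotent transition Suspended → suspend → Suspended

This suggests the current design may rely on the suspend operation being idempotent. Document explicitly whether suspend is designed to be idempotent, or consider adding an intermediate Suspending state to match the cancel design.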

🤖 Prompt for AI Agents
In `@longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java` around
lines 484 - 501, The suspend path lacks the intermediate "Suspending" state used
by cancelLongJob, which can allow concurrent suspend calls to race; before
calling job.suspend(...) in suspendLongJob, set the intermediate state (e.g.,
changeState(uuid, LongJobStateEvent.suspending) or the appropriate enum) while
holding the same synchronization (synchronized(jobUuids.intern(uuid))) as
cancelLongJob, then call job.suspend; on success move to Suspended (as now) and
on failure revert or record the proper state and propagate the error, mirroring
the cancelLongJob pattern (use the same locking, changeState calls, and error
handling around job.suspend to ensure consistency).
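
To make the intermediate-state idea concrete, here is a minimal sketch under stated assumptions. `SketchState` and `SuspendGate` are hypothetical names, the `Suspending` state does not exist in the PR as shown, and the lock is a plain `synchronized` method rather than the `jobUuids.intern(uuid)` monitor used by `changeState()`. It only shows how a Running → Suspending step lets exactly one request reach `job.suspend()`.

```java
// Hypothetical states; Suspending is the proposed intermediate state, not an existing one.
enum SketchState { Running, Suspending, Suspended }

// Guards the suspend path of a single job so that only one request calls job.suspend().
final class SuspendGate {
    private SketchState state = SketchState.Running;

    // Returns true for the one caller that moves Running -> Suspending;
    // every other caller sees a non-Running state and is rejected.
    synchronized boolean tryBeginSuspend() {
        if (state != SketchState.Running) {
            return false;
        }
        state = SketchState.Suspending;
        return true;
    }

    // Called from the job's completion callback: commit on success, roll back otherwise.
    synchronized void finishSuspend(boolean suspended) {
        if (state != SketchState.Suspending) {
            throw new IllegalStateException("no suspend in progress, state: " + state);
        }
        state = suspended ? SketchState.Suspended : SketchState.Running;
    }

    synchronized SketchState current() {
        return state;
    }
}
```

Rolling back to Running on failure is one possible choice; a real implementation might instead record a failure state, depending on how the job reports errors.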


private void handle(ResumeLongJobMsg msg) {
thdf.chainSubmit(new ChainTask(msg) {
@Override
22 changes: 22 additions & 0 deletions longjob/src/main/java/org/zstack/longjob/LongJobUtils.java
@@ -162,11 +162,33 @@ static LongJobStateEvent getEventOnError(ErrorCode errorCode) {
return LongJobStateEvent.suspend;
} else if (errorCode.isError(LongJobErrors.CANCELED)) {
return LongJobStateEvent.canceled;
} else if (isRecoverableError(errorCode)) {
return LongJobStateEvent.suspend;
} else {
return LongJobStateEvent.fail;
}
}

/**
* Check if an error is marked as recoverable in its opaque field.
* Any business module can mark an error as recoverable by setting
* "longJobRecoverable" to true in the error code's opaque field.
* This allows the long job framework to automatically suspend instead of fail
* for recoverable errors, enabling automatic retry after service restart.
*
* @param errorCode the error code to check
* @return true if the error is marked as recoverable, false otherwise
*/
private static boolean isRecoverableError(ErrorCode errorCode) {
// Check if error code has recoverable flag in opaque
// Any business module can set this flag to indicate the error is recoverable
Object recoverable = errorCode.getFromOpaque("longJobRecoverable");
if (recoverable instanceof Boolean && (Boolean) recoverable) {
return true;
}
return false;
}
Comment on lines +165 to +190

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# 1. Find the getEventOnError method definition and call sites
echo "=== Looking for getEventOnError definition and calls ==="
rg -nP --type=java 'getEventOnError' longjob/

# 2. Check whether a related opaque key constant already exists
echo -e "\n=== Checking for existing OPAQUE constants ==="
rg -nP --type=java 'longJobRecoverable|OPAQUE.*KEY' longjob/

# 3. Look for ErrorCode handling patterns
echo -e "\n=== Checking ErrorCode null handling patterns ==="
rg -nP --type=java 'if.*errorCode\s*==\s*null|errorCode\.get' longjob/ | head -20

Repository: MatheMatrix/zstack

Length of output: 920


🏁 Script executed:

#!/bin/bash
# Inspect the context around line 721 of LongJobManagerImpl.java to see whether errorCode can be null
echo "=== Context around line 721 in LongJobManagerImpl.java ==="
sed -n '710,735p' longjob/src/main/java/org/zstack/longjob/LongJobManagerImpl.java | cat -n

# Also inspect the full getEventOnError method definition
echo -e "\n=== Full getEventOnError method ==="
sed -n '160,192p' longjob/src/main/java/org/zstack/longjob/LongJobUtils.java | cat -n

Repository: MatheMatrix/zstack

Length of output: 3296


🏁 Script executed:

#!/bin/bash
# Check whether the fail callback comes from an interface definition, and whether ErrorCode should be allowed to be null
echo "=== Search for fail callback interface ==="
rg -nP --type=java 'void\s+fail\s*\(\s*ErrorCode' longjob/ | head -5

# Look at how ErrorCode is designed in ZStack
echo -e "\n=== Check ErrorCode null patterns in broader codebase ==="
fd --type f -name "*.java" header | xargs rg -l 'ErrorCode.*null|null.*ErrorCode' | head -5

# Check whether this opaque key is used or defined elsewhere
echo -e "\n=== Search for similar opaque key patterns ==="
rg -nP --type=java 'getFromOpaque|setToOpaque' longjob/ | head -10

Repository: MatheMatrix/zstack

Length of output: 1646


🏁 Script executed:

#!/bin/bash
# Inspect the ErrorCode interface/class definition to see whether it should be allowed to be null
echo "=== Check ErrorCode definition ==="
fd --type f -name "ErrorCode.java" | xargs cat -n | head -50

# Look at more getFromOpaque usages to see whether a related constant is defined elsewhere
echo -e "\n=== Look for opaque key constants in ErrorCode or related classes ==="
rg -nP --type=java 'OPAQUE|Opaque' header/ | grep -i key | head -10

# Check for an existing pattern of constant definitions
echo -e "\n=== Check for constant definitions for opaque keys ==="
rg -nP --type=java 'static\s+final\s+String.*=' header/ | grep -i opaque | head -10

Repository: MatheMatrix/zstack

Length of output: 933


🏁 Script executed:

#!/bin/bash
# Inspect the full definition of ErrorCode.java
echo "=== Check ErrorCode definition ==="
find . -name "ErrorCode.java" -path "*/header/*" -exec cat {} \; | head -100

# Look at the opaque-related implementation
echo -e "\n=== Check opaque implementation in ErrorCode ==="
find . -name "ErrorCode.java" -path "*/header/*" -exec grep -A5 "putToOpaque\|getFromOpaque" {} \;

Repository: MatheMatrix/zstack

Length of output: 3078


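Extract the opaque key string into a constant

Per the coding guidelines, magic values should be avoided. Define the "longJobRecoverable" string as a static final constant so it can be reused across modules and typos are prevented. Also add a defensive null check in isRecoverableError() to make the code more robust.

Suggested change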
+    private static final String LONG_JOB_RECOVERABLE_OPAQUE_KEY = "longJobRecoverable";
+
     private static boolean isRecoverableError(ErrorCode errorCode) {
+        if (errorCode == null) {
+            return false;
+        }
-        Object recoverable = errorCode.getFromOpaque("longJobRecoverable");
-        if (recoverable instanceof Boolean && (Boolean) recoverable) {
-            return true;
-        }
-        return false;
+        Object recoverable = errorCode.getFromOpaque(LONG_JOB_RECOVERABLE_OPAQUE_KEY);
+        return Boolean.TRUE.equals(recoverable);
     }
🤖 Prompt for AI Agents
In `@longjob/src/main/java/org/zstack/longjob/LongJobUtils.java` around lines 165
- 190, Extract the magic string "longJobRecoverable" into a static final
constant (e.g., LONG_JOB_RECOVERABLE) in LongJobUtils and replace its usages
inside isRecoverableError(ErrorCode) with that constant; also harden
isRecoverableError by null-checking errorCode and the opaque value (ensure
errorCode != null, getFromOpaque(...) != null) before instanceof/boolean checks
and return false on nulls to avoid NPEs.
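
For context, this is roughly how a business module could mark an error as recoverable and how the hardened check would read it. This is a sketch, not ZStack code: `SketchErrorCode` is a hypothetical stand-in that models only an opaque map with `putToOpaque`/`getFromOpaque`, and `RecoverableErrors` and `markRecoverable` are illustrative names. Only the key string "longJobRecoverable" comes from the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for ErrorCode, reduced to its opaque key/value map.
final class SketchErrorCode {
    private final Map<String, Object> opaque = new HashMap<>();

    void putToOpaque(String key, Object value) {
        opaque.put(key, value);
    }

    Object getFromOpaque(String key) {
        return opaque.get(key);
    }
}

final class RecoverableErrors {
    // Single definition of the key, shared by the code that sets it and the code that reads it.
    static final String LONG_JOB_RECOVERABLE_OPAQUE_KEY = "longJobRecoverable";

    // What a business module would do before failing its completion (illustrative helper).
    static SketchErrorCode markRecoverable(SketchErrorCode errorCode) {
        errorCode.putToOpaque(LONG_JOB_RECOVERABLE_OPAQUE_KEY, Boolean.TRUE);
        return errorCode;
    }

    // Null-safe check: a null error, a missing flag, and a non-Boolean value all yield false.
    static boolean isRecoverable(SketchErrorCode errorCode) {
        return errorCode != null
                && Boolean.TRUE.equals(errorCode.getFromOpaque(LONG_JOB_RECOVERABLE_OPAQUE_KEY));
    }
}
```

`Boolean.TRUE.equals(...)` replaces the `instanceof` check plus cast, so a value such as the string "true" is treated as not recoverable, which matches the original behavior.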


private static void setExecuteTimeIfNeed(LongJobVO job) {
if (job.getExecuteTime() == null) {
long time = (System.currentTimeMillis() - job.getCreateDate().getTime()) / 1000;