Skip to content

refactor: 给kook适配器尽可能覆盖pydantic数据类,使其更容易获取数据和校验消息结构#5719

Open
shuiping233 wants to merge 13 commits intoAstrBotDevs:masterfrom
shuiping233:refactor/add_kook_event_data_type
Open

refactor: 给kook适配器尽可能覆盖pydantic数据类,使其更容易获取数据和校验消息结构#5719
shuiping233 wants to merge 13 commits intoAstrBotDevs:masterfrom
shuiping233:refactor/add_kook_event_data_type

Conversation

@shuiping233
Copy link
Contributor

@shuiping233 shuiping233 commented Mar 3, 2026

Modifications / 改动点

  • astrbot/core/platform/sources/kook, 将大部分ws消息事件和请求接口定义成pydantic数据类,并将大量字段get方法改成更好使用的类型判断
  • tests/test_kook, pydantic的to_json行为和之前有点不同,所以要跟着修改一下测试用例
  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果


Checklist / 检查清单

  • 😊 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。/ If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
  • 👀 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”。/ My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
  • 🤓 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到了 requirements.txtpyproject.toml 文件相应位置。/ I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
  • 😮 我的更改没有引入恶意代码。/ My changes do not introduce malicious code.

Summary by Sourcery

重构 KOOK 适配器和客户端,以在 WebSocket 事件、API 响应以及卡片消息结构中使用强类型的 Pydantic 模型,并相应调整消息解析、自发消息过滤以及测试。

Enhancements:

  • 使用 Pydantic 数据类和可区分联合类型替代临时的基于 dict 的 KOOK WebSocket 和 API 负载处理,用于 message、hello、resume-ack 事件以及网关/用户响应。
  • 通过引入显式的模块类型枚举、共享基础模型工具,以及用于文本和图片的结构化解析辅助函数,加强 KOOK 卡片消息和模块的类型定义。
  • 通过追踪机器人昵称/用户名、使用模式匹配优化信号处理,以及改进对无效响应或事件的错误日志记录,改进 KOOK 客户端行为。

Tests:

  • 更新 KOOK 测试和夹具以与基于 Pydantic 的模型、新的枚举用法以及针对卡片和订单消息调整后的 JSON 序列化保持一致。
Original summary in English

Summary by Sourcery

Refactor the KOOK adapter and client to use strongly-typed Pydantic models for WebSocket events, API responses, and card message structures, and adjust message parsing, self-message filtering, and tests accordingly.

Enhancements:

  • Replace ad-hoc dict-based KOOK WebSocket and API payload handling with Pydantic data classes and discriminated unions for message, hello, resume-ack events, and gateway/user responses.
  • Strengthen KOOK card message and module typing by introducing an explicit module type enum, shared base model utilities, and structured parsing helpers for text and images.
  • Improve KOOK client behavior by tracking bot nickname/username, refining signal handling via pattern matching, and enhancing error logging for invalid responses or events.

Tests:

  • Update KOOK tests and fixtures to align with Pydantic-based models, new enum usages, and adjusted JSON serialization for card and order messages.

@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Mar 3, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

此拉取请求对KOOK适配器进行了重大重构,通过全面引入Pydantic数据类来处理KOOK平台的WebSocket事件和API响应。这一改进旨在为适配器提供更强大的类型安全、自动数据验证和更清晰的数据结构定义,从而显著提升代码的健壮性、可维护性和开发效率。它简化了消息的解析和构造过程,使得适配器在处理复杂数据时更加可靠和易于理解。

Highlights

  • Pydantic数据类引入: KOOK适配器现在尽可能使用Pydantic数据类来定义WebSocket消息事件和请求接口,以增强数据获取的便利性和消息结构的校验。
  • 消息处理逻辑优化: 重构了消息处理方法,将大量字段的get方法替换为更易用的类型判断,并直接使用Pydantic模型进行数据访问和验证。
  • 测试用例更新: 由于Pydantic的to_json行为与之前有所不同,相应的KOOK测试用例已进行修改以适应新的序列化方式。
Changelog
  • astrbot/core/platform/sources/kook/kook_adapter.py
    • 引入了pydantic库和新的KOOK类型定义。
    • 修改了_should_ignore_event_by_bot_nickname方法,直接使用Pydantic事件对象的author_id
    • 更新了_on_received方法以接收KookMessageEvent,并增加了对系统消息的显式处理。
    • 重构了_parse_kmarkdown_text_message_parse_card_message方法,利用Pydantic模型进行消息解析和验证。
    • 新增了_handle_section_text_handle_image_group辅助方法,以简化卡片消息的解析逻辑。
    • 调整了convert_message方法,使其接受KookMessageEvent并使用Pydantic字段获取消息属性和进行类型检查。
  • astrbot/core/platform/sources/kook/kook_client.py
    • 引入了pydantic库和多个新的KOOK事件/响应类型。
    • bot_name属性更新为bot_nickname,并新增了bot_username属性。
    • 重构了get_bot_infoget_gateway_url方法,使用Pydantic模型解析和验证API响应。
    • 修改了listen方法,将传入的WebSocket消息解析为KookWebsocketEvent Pydantic对象。
    • 更新了_handle_signal_handle_hello_handle_pong_handle_reconnect_handle_resume_ack方法,以处理Pydantic事件对象。
  • astrbot/core/platform/sources/kook/kook_event.py
    • 引入了KookModuleType枚举。
    • handle_audio函数中,将FileModuletype字段更新为KookModuleType.AUDIO
  • astrbot/core/platform/sources/kook/kook_types.py
    • 移除了dataclass装饰器和field导入,全面转向Pydantic BaseModel
    • 引入了KookBaseDataClass作为所有Pydantic模型的基础类,提供了from_dictfrom_jsonto_dictto_json方法。
    • 定义了KookModuleType枚举,用于表示各种卡片模块类型。
    • 更新了现有的卡片消息元素和模块类(如PlainTextElementKmarkdownElementImageElementButtonElementHeaderModuleSectionModule等),使其继承自KookBaseDataClass,并使用Literal类型结合KookModuleType枚举值定义type字段。
    • 修改了AnyModule的定义,使用AnnotatedField(discriminator="type")以实现更好的类型判别。
    • 重构了KookCardMessage,使其继承自KookBaseDataClass,并移除了自定义的to_dict/to_json方法,转而依赖KookBaseDataClass的实现。
    • KookCardMessageContainer添加了from_dict类方法。
    • 引入了新的Pydantic模型,用于KOOK WebSocket信令(KookMessageSignal)、频道类型(KookChannelType)、作者信息(KookAuthor)、KMarkdown内容(KookKMarkdown)、额外数据(KookExtra)以及各种事件类型(KookMessageEventKookHelloEventKookResumeAckEventKookWebsocketEvent)。
    • KookWebsocketEvent中实现了model_validator,用于在解析前将signal注入到data字段中,以便Pydantic正确进行类型判别。
    • 添加了API响应的Pydantic模型:KookApiResponseBaseKookUserMeDataKookUserMeResponsekookGatewayIndexDataKookGatewayIndexResponse
  • tests/test_kook/data/kook_card_data.json
    • 调整了JSON对象内部键的顺序,使其与Pydantic的默认序列化顺序保持一致,特别是将type字段移至开头。
  • tests/test_kook/test_kook_event.py
    • 更新了OrderMessage的实例化方式,使用关键字参数(index=1)而非位置参数。
    • 修改了音频卡片消息的预期JSON字符串,以反映新的Pydantic序列化顺序。
  • tests/test_kook/test_kook_types.py
    • 引入了KookModuleType枚举。
    • test_all_kook_card_type中,更新了FileModule的实例化方式,使用KookModuleType.FILE
Activity
  • 拉取请求由shuiping233创建。
  • 作者提供了详细的更改描述,包括动机和修改点。
  • 作者确认此更改不是破坏性变更。
  • 作者已完成检查清单,表明已进行讨论、测试,未引入新依赖,且未包含恶意代码。
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@dosubot
Copy link

dosubot bot commented Mar 3, 2026

Related Documentation

Checked 1 published document(s) in 1 knowledge base(s). No updates required.

How did I do? Any feedback?  Join Discord

@dosubot dosubot bot added the area:platform The bug / feature is about IM platform adapter, such as QQ, Lark, Telegram, WebChat and so on. label Mar 3, 2026
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - 我发现了 4 个问题,并给出了一些整体反馈:

  • _parse_card_message 中,当 KookCardMessageContainer.from_dict 抛出 ValidationError 时,你目前只是记录了错误然后继续执行,这会导致 card_list 未定义并在运行时出错;建议在记录日志后立刻返回一个安全的兜底值(例如 return [], "")。
  • KookCardMessageContainer.from_dict 当前返回的是一个裸的 list[KookCardMessage],而不是 KookCardMessageContainer 实例,这对于一个定义在该类型上的 classmethod 来说有些出乎意料;建议返回 cls(card_list),以便调用方始终获得容器类型。
  • kookGatewayIndexData 的命名(首字母小写)与其他 Pydantic 模型(如 KookUserMeData 等)不一致;建议重命名为 KookGatewayIndexData 以保持清晰和一致性。
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `_parse_card_message`, when `KookCardMessageContainer.from_dict` raises a `ValidationError` you only log the error and continue, which will leave `card_list` undefined and lead to a runtime error; consider returning a safe fallback (e.g., `return [], ""`) immediately after logging.
- `KookCardMessageContainer.from_dict` currently returns a bare `list[KookCardMessage]` rather than an instance of `KookCardMessageContainer`, which is surprising for a classmethod on that type; consider returning `cls(card_list)` so callers consistently get the container type.
- The naming of `kookGatewayIndexData` (lowercase leading letter) is inconsistent with the rest of the Pydantic models (`KookUserMeData`, etc.); consider renaming it to `KookGatewayIndexData` for clarity and consistency.

## Individual Comments

### Comment 1
<location path="astrbot/core/platform/sources/kook/kook_adapter.py" line_range="276-277" />
<code_context>
+        try:
+            card_list = KookCardMessageContainer.from_dict(json.loads(content))
+
+        except pydantic.ValidationError as e:
+            logger.error(f"[KOOK] 解析卡片消息错误: {e}")
+            logger.error(f"[KOOK] 原始消息内容: {data.to_json()}")

</code_context>
<issue_to_address>
**issue (bug_risk):** Catching pydantic.ValidationError here will currently raise NameError and also leaves card_list undefined.

Consider importing `pydantic` in this module and returning early from `_parse_card_message` on validation failure (e.g. `return [], ""`) so later code doesn’t iterate over an undefined `card_list`. Alternatively, ensure the exception handling path always initializes `card_list` before it is used.
</issue_to_address>

### Comment 2
<location path="astrbot/core/platform/sources/kook/kook_types.py" line_range="253-254" />
<code_context>
-        return json.dumps(self.to_dict(), indent=indent, ensure_ascii=ensure_ascii)
-

 class KookCardMessageContainer(list[KookCardMessage]):
     """卡片消息容器(列表),此类型可以直接to_json后发送出去"""
</code_context>
<issue_to_address>
**suggestion (bug_risk):** from_dict on KookCardMessageContainer returns a plain list instead of an instance of cls, which can be surprising.

`from_dict` currently returns a plain `list[KookCardMessage]` (`[KookCardMessage.from_dict(item) for item in raw_data]`), which conflicts with being defined on `KookCardMessageContainer` and with the type hint/`to_json` behavior. This inconsistency can cause subtle type/behavior issues. Consider returning a `KookCardMessageContainer` instance instead, e.g. `cls(KookCardMessage.from_dict(item) for item in raw_data)`.
</issue_to_address>

### Comment 3
<location path="astrbot/core/platform/sources/kook/kook_types.py" line_range="220" />
<code_context>

 # 所有模块的联合类型
-AnyModule = (
+AnyModule = Annotated[
     HeaderModule
     | SectionModule
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new KOOK models by removing unnecessary discriminated unions and base-class abstractions while keeping the same typing and behavior.

You can reduce the new abstraction/indirection without losing any of the new functionality.

### 1. Drop discriminated `AnyModule` while keeping enums/literals

You already have `KookModuleType` and per‑module `type: Literal[... ]`, so you don’t actually need a discriminated union here. Pydantic will still serialize/deserialize correctly as long as the field names/types match KOOK’s payloads.

Current:

```python
AnyModule = Annotated[
    HeaderModule
    | SectionModule
    | ImageGroupModule
    | ContainerModule
    | ActionGroupModule
    | ContextModule
    | DividerModule
    | FileModule
    | CountdownModule
    | InviteModule,
    Field(discriminator="type"),
]
```

This can be simplified to:

```python
AnyModule = (
    HeaderModule
    | SectionModule
    | ImageGroupModule
    | ContainerModule
    | ActionGroupModule
    | ContextModule
    | DividerModule
    | FileModule
    | CountdownModule
    | InviteModule
)
```

`KookCardMessage.modules: list[AnyModule]` will still validate/serialize fine because each concrete module has a fixed `type` literal; the only thing you lose is automatic discriminated union resolution, which you’re not relying on anywhere else.

### 2. Simplify `KookWebsocketEvent` by moving parsing to the client

The discriminated union + pre‑validator on `KookWebsocketEvent` introduces quite a bit of protocol‑specific cleverness into the type layer:

```python
class KookWebsocketEvent(KookBaseDataClass):
    signal: KookMessageSignal = Field(..., alias="s")
    data: Annotated[
        KookMessageEvent | KookHelloEvent | KookResumeAckEvent | None,
        Field(discriminator="signal"),
    ] = Field(None, alias="d")

    @model_validator(mode="before")
    @classmethod
    def _inject_signal_into_data(cls, data: Any) -> Any:
        ...
```

Since you still match on `event.signal` in the client, you can simplify `KookWebsocketEvent` to treat `data` as an untyped payload and do the per‑signal parsing where you already branch:

```python
class KookWebsocketEvent(KookBaseDataClass):
    signal: KookMessageSignal = Field(..., alias="s")
    d: dict[str, Any] | None = Field(None, alias="d")
    sn: int | None = None
```

Then, in your client code:

```python
async def _handle_signal(self, raw_event: dict[str, Any]) -> None:
    event = KookWebsocketEvent.from_dict(raw_event)

    match event.signal:
        case KookMessageSignal.MESSAGE:
            msg = KookMessageEvent.from_dict(event.d or {})
            # handle message
        case KookMessageSignal.HELLO:
            hello = KookHelloEvent.from_dict(event.d or {})
            # handle hello
        case KookMessageSignal.RESUME_ACK:
            resume = KookResumeAckEvent.from_dict(event.d or {})
            # handle resume-ack
        # ...
```

You retain the same strong typing for `KookMessageEvent`, `KookHelloEvent`, `KookResumeAckEvent`, but the WebSocket envelope becomes a straightforward representation of the raw protocol, without discriminator plumbing or mutation in a pre‑validator.

### 3. Narrow the use of `KookBaseDataClass` to where it adds value

Right now `KookBaseDataClass` mostly renames `model_validate`/`model_dump`/`model_dump_json` and turns on `extra="allow"` globally:

```python
class KookBaseDataClass(BaseModel):
    model_config = ConfigDict(extra="allow", arbitrary_types_allowed=True)

    @classmethod
    def from_dict(cls, raw_data: dict):
        return cls.model_validate(raw_data)

    @classmethod
    def from_json(cls, raw_data: str | bytes | bytearray):
        return cls.model_validate_json(raw_data)

    def to_dict(self, by_alias=False, exclude_none=True) -> dict:
        return self.model_dump(by_alias=by_alias, exclude_none=exclude_none)

    def to_json(...):
        return self.model_dump_json(...)
```

If you want to keep the friendlier `from_*`/`to_*` names (they’re convenient for HTTP/WS payloads), consider:

* Using `KookBaseDataClass` only for **API/WS payload models** (`KookMessageEvent`, `KookUserMeResponse`, etc.).
* Letting card models (`KookCardMessage`, `KookCardModelBase` and friends) inherit directly from `BaseModel` since they already have simple usage and don’t need `extra="allow"`:

```python
class KookCardModelBase(BaseModel):
    type: str

class KookCardMessage(BaseModel):
    type: Literal[KookModuleType.CARD] = KookModuleType.CARD
    # ...
```

This reduces one layer of inheritance/indirection in the hot path for card handling, while keeping the convenience helpers where they actually simplify call‑sites (HTTP and WebSocket parsing).
</issue_to_address>

### Comment 4
<location path="astrbot/core/platform/sources/kook/kook_adapter.py" line_range="268" />
<code_context>

-    def _parse_card_message(self, data: dict) -> tuple[list, str]:
-        content = data.get("content", "[]")
+    def _parse_card_message(self, data: KookMessageEvent) -> tuple[list, str]:
+        content = data.content
         if not isinstance(content, str):
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying `_parse_card_message` to a single linear traversal with a generic image-extraction helper so the hot path depends only on a few core fields instead of many module-specific helpers and pattern matches.

You can keep the typed models and still reduce complexity/coupling a bit in the hot path with small, local changes.

### 1. Make `_parse_card_message` robust and more linear

Right now, if `KookCardMessageContainer.from_dict` fails, `card_list` is undefined and the function continues. Also, the pattern-matching + helpers cause a lot of jumping for a very narrow extraction (text + image URLs).

You can keep the typed models but make the traversal more linear and self‑contained:

```python
def _parse_card_message(self, data: KookMessageEvent) -> tuple[list, str]:
    content = data.content
    if not isinstance(content, str):
        content = str(content)

    try:
        card_list = KookCardMessageContainer.from_dict(json.loads(content))
    except pydantic.ValidationError as e:
        logger.error(f"[KOOK] 解析卡片消息错误: {e}")
        logger.error(f"[KOOK] 原始消息内容: {data.to_json()}")
        # 明确失败返回,避免 card_list 未定义
        return [Plain(text="[卡片消息解析失败]")], "[卡片消息解析失败]"

    text_parts: list[str] = []
    images: list[str] = []

    for card in card_list:
        for module in card.modules:
            module_type = module.type

            if module_type == "section":
                # inline 原来的 _handle_section_text
                text = getattr(module, "text", None)
                if isinstance(text, (KmarkdownElement, PlainTextElement)):
                    if text.content:
                        text_parts.append(text.content)

            elif module_type in ("container", "image-group"):
                # 复用一个简洁的图片提取 helper(见下)
                images.extend(self._extract_image_urls(module.elements))

            elif module_type == "header":
                header_text = getattr(getattr(module, "text", None), "content", None)
                if header_text:
                    text_parts.append(header_text)
            else:
                logger.debug(f"[KOOK] 跳过或未处理模块: {module_type}")

    text = "".join(text_parts)
    message: list = []
    if text:
        message.append(Plain(text=text))
    for img_url in images:
        message.append(Image(file=img_url))
    return message, text
```

### 2. Replace two module‑specific helpers with one focused image extractor

Instead of `_handle_section_text` and `_handle_image_group(module: ContainerModule | ImageGroupModule)`, you can keep a single helper that only depends on the *elements*, not on the specific module classes. This keeps the adapter less tightly coupled to the full card schema:

```python
def _extract_image_urls(self, elements) -> list[str]:
    """从 elements 中提取合法图片 URL,不关心具体 module 类型。"""
    valid_urls: list[str] = []
    for el in elements or []:
        image_src = getattr(el, "src", None)
        if not isinstance(image_src, str):
            logger.warning(
                f'[KOOK] 处理卡片中的图片时发生错误,图片url "{image_src}" 应该为str类型, 而不是 "{type(image_src)}" '
            )
            continue
        if not image_src.startswith(("http://", "https://")):
            logger.warning(f"[KOOK] 屏蔽非http图片url: {image_src}")
            continue
        valid_urls.append(image_src)
    return valid_urls
```

This way:

- The hot path only cares about `module.type`, `module.text.content`, and `elements[].src`, similar to the original dict-based version.
- You still benefit from typed models and validation, but the adapter logic is less tightly coupled to every specific module class.
- The number of helper methods shrinks and the control flow becomes easier to read without losing functionality.
</issue_to_address>

Sourcery 对开源项目免费——如果你觉得我们的代码审查有帮助,欢迎分享 ✨
帮我变得更有用!请在每条评论上点击 👍 或 👎,我会根据你的反馈改进后续的代码审查。
Original comment in English

Hey - I've found 4 issues, and left some high level feedback:

  • In _parse_card_message, when KookCardMessageContainer.from_dict raises a ValidationError you only log the error and continue, which will leave card_list undefined and lead to a runtime error; consider returning a safe fallback (e.g., return [], "") immediately after logging.
  • KookCardMessageContainer.from_dict currently returns a bare list[KookCardMessage] rather than an instance of KookCardMessageContainer, which is surprising for a classmethod on that type; consider returning cls(card_list) so callers consistently get the container type.
  • The naming of kookGatewayIndexData (lowercase leading letter) is inconsistent with the rest of the Pydantic models (KookUserMeData, etc.); consider renaming it to KookGatewayIndexData for clarity and consistency.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `_parse_card_message`, when `KookCardMessageContainer.from_dict` raises a `ValidationError` you only log the error and continue, which will leave `card_list` undefined and lead to a runtime error; consider returning a safe fallback (e.g., `return [], ""`) immediately after logging.
- `KookCardMessageContainer.from_dict` currently returns a bare `list[KookCardMessage]` rather than an instance of `KookCardMessageContainer`, which is surprising for a classmethod on that type; consider returning `cls(card_list)` so callers consistently get the container type.
- The naming of `kookGatewayIndexData` (lowercase leading letter) is inconsistent with the rest of the Pydantic models (`KookUserMeData`, etc.); consider renaming it to `KookGatewayIndexData` for clarity and consistency.

## Individual Comments

### Comment 1
<location path="astrbot/core/platform/sources/kook/kook_adapter.py" line_range="276-277" />
<code_context>
+        try:
+            card_list = KookCardMessageContainer.from_dict(json.loads(content))
+
+        except pydantic.ValidationError as e:
+            logger.error(f"[KOOK] 解析卡片消息错误: {e}")
+            logger.error(f"[KOOK] 原始消息内容: {data.to_json()}")

</code_context>
<issue_to_address>
**issue (bug_risk):** Catching pydantic.ValidationError here will currently raise NameError and also leaves card_list undefined.

Consider importing `pydantic` in this module and returning early from `_parse_card_message` on validation failure (e.g. `return [], ""`) so later code doesn’t iterate over an undefined `card_list`. Alternatively, ensure the exception handling path always initializes `card_list` before it is used.
</issue_to_address>

### Comment 2
<location path="astrbot/core/platform/sources/kook/kook_types.py" line_range="253-254" />
<code_context>
-        return json.dumps(self.to_dict(), indent=indent, ensure_ascii=ensure_ascii)
-

 class KookCardMessageContainer(list[KookCardMessage]):
     """卡片消息容器(列表),此类型可以直接to_json后发送出去"""
</code_context>
<issue_to_address>
**suggestion (bug_risk):** from_dict on KookCardMessageContainer returns a plain list instead of an instance of cls, which can be surprising.

`from_dict` currently returns a plain `list[KookCardMessage]` (`[KookCardMessage.from_dict(item) for item in raw_data]`), which conflicts with being defined on `KookCardMessageContainer` and with the type hint/`to_json` behavior. This inconsistency can cause subtle type/behavior issues. Consider returning a `KookCardMessageContainer` instance instead, e.g. `cls(KookCardMessage.from_dict(item) for item in raw_data)`.
</issue_to_address>

### Comment 3
<location path="astrbot/core/platform/sources/kook/kook_types.py" line_range="220" />
<code_context>

 # 所有模块的联合类型
-AnyModule = (
+AnyModule = Annotated[
     HeaderModule
     | SectionModule
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new KOOK models by removing unnecessary discriminated unions and base-class abstractions while keeping the same typing and behavior.

You can reduce the new abstraction/indirection without losing any of the new functionality.

### 1. Drop discriminated `AnyModule` while keeping enums/literals

You already have `KookModuleType` and per‑module `type: Literal[... ]`, so you don’t actually need a discriminated union here. Pydantic will still serialize/deserialize correctly as long as the field names/types match KOOK’s payloads.

Current:

```python
AnyModule = Annotated[
    HeaderModule
    | SectionModule
    | ImageGroupModule
    | ContainerModule
    | ActionGroupModule
    | ContextModule
    | DividerModule
    | FileModule
    | CountdownModule
    | InviteModule,
    Field(discriminator="type"),
]
```

This can be simplified to:

```python
AnyModule = (
    HeaderModule
    | SectionModule
    | ImageGroupModule
    | ContainerModule
    | ActionGroupModule
    | ContextModule
    | DividerModule
    | FileModule
    | CountdownModule
    | InviteModule
)
```

`KookCardMessage.modules: list[AnyModule]` will still validate/serialize fine because each concrete module has a fixed `type` literal; the only thing you lose is automatic discriminated union resolution, which you’re not relying on anywhere else.

### 2. Simplify `KookWebsocketEvent` by moving parsing to the client

The discriminated union + pre‑validator on `KookWebsocketEvent` introduces quite a bit of protocol‑specific cleverness into the type layer:

```python
class KookWebsocketEvent(KookBaseDataClass):
    signal: KookMessageSignal = Field(..., alias="s")
    data: Annotated[
        KookMessageEvent | KookHelloEvent | KookResumeAckEvent | None,
        Field(discriminator="signal"),
    ] = Field(None, alias="d")

    @model_validator(mode="before")
    @classmethod
    def _inject_signal_into_data(cls, data: Any) -> Any:
        ...
```

Since you still match on `event.signal` in the client, you can simplify `KookWebsocketEvent` to treat `data` as an untyped payload and do the per‑signal parsing where you already branch:

```python
class KookWebsocketEvent(KookBaseDataClass):
    signal: KookMessageSignal = Field(..., alias="s")
    d: dict[str, Any] | None = Field(None, alias="d")
    sn: int | None = None
```

Then, in your client code:

```python
async def _handle_signal(self, raw_event: dict[str, Any]) -> None:
    event = KookWebsocketEvent.from_dict(raw_event)

    match event.signal:
        case KookMessageSignal.MESSAGE:
            msg = KookMessageEvent.from_dict(event.d or {})
            # handle message
        case KookMessageSignal.HELLO:
            hello = KookHelloEvent.from_dict(event.d or {})
            # handle hello
        case KookMessageSignal.RESUME_ACK:
            resume = KookResumeAckEvent.from_dict(event.d or {})
            # handle resume-ack
        # ...
```

You retain the same strong typing for `KookMessageEvent`, `KookHelloEvent`, `KookResumeAckEvent`, but the WebSocket envelope becomes a straightforward representation of the raw protocol, without discriminator plumbing or mutation in a pre‑validator.

### 3. Narrow the use of `KookBaseDataClass` to where it adds value

Right now `KookBaseDataClass` mostly renames `model_validate`/`model_dump`/`model_dump_json` and turns on `extra="allow"` globally:

```python
class KookBaseDataClass(BaseModel):
    model_config = ConfigDict(extra="allow", arbitrary_types_allowed=True)

    @classmethod
    def from_dict(cls, raw_data: dict):
        return cls.model_validate(raw_data)

    @classmethod
    def from_json(cls, raw_data: str | bytes | bytearray):
        return cls.model_validate_json(raw_data)

    def to_dict(self, by_alias=False, exclude_none=True) -> dict:
        return self.model_dump(by_alias=by_alias, exclude_none=exclude_none)

    def to_json(...):
        return self.model_dump_json(...)
```

If you want to keep the friendlier `from_*`/`to_*` names (they’re convenient for HTTP/WS payloads), consider:

* Using `KookBaseDataClass` only for **API/WS payload models** (`KookMessageEvent`, `KookUserMeResponse`, etc.).
* Letting card models (`KookCardMessage`, `KookCardModelBase` and friends) inherit directly from `BaseModel` since they already have simple usage and don’t need `extra="allow"`:

```python
class KookCardModelBase(BaseModel):
    type: str

class KookCardMessage(BaseModel):
    type: Literal[KookModuleType.CARD] = KookModuleType.CARD
    # ...
```

This reduces one layer of inheritance/indirection in the hot path for card handling, while keeping the convenience helpers where they actually simplify call‑sites (HTTP and WebSocket parsing).
</issue_to_address>

### Comment 4
<location path="astrbot/core/platform/sources/kook/kook_adapter.py" line_range="268" />
<code_context>

-    def _parse_card_message(self, data: dict) -> tuple[list, str]:
-        content = data.get("content", "[]")
+    def _parse_card_message(self, data: KookMessageEvent) -> tuple[list, str]:
+        content = data.content
         if not isinstance(content, str):
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying `_parse_card_message` to a single linear traversal with a generic image-extraction helper so the hot path depends only on a few core fields instead of many module-specific helpers and pattern matches.

You can keep the typed models and still reduce complexity/coupling a bit in the hot path with small, local changes.

### 1. Make `_parse_card_message` robust and more linear

Right now, if `KookCardMessageContainer.from_dict` fails, `card_list` is undefined and the function continues. Also, the pattern-matching + helpers cause a lot of jumping for a very narrow extraction (text + image URLs).

You can keep the typed models but make the traversal more linear and self‑contained:

```python
def _parse_card_message(self, data: KookMessageEvent) -> tuple[list, str]:
    content = data.content
    if not isinstance(content, str):
        content = str(content)

    try:
        card_list = KookCardMessageContainer.from_dict(json.loads(content))
    except pydantic.ValidationError as e:
        logger.error(f"[KOOK] 解析卡片消息错误: {e}")
        logger.error(f"[KOOK] 原始消息内容: {data.to_json()}")
        # 明确失败返回,避免 card_list 未定义
        return [Plain(text="[卡片消息解析失败]")], "[卡片消息解析失败]"

    text_parts: list[str] = []
    images: list[str] = []

    for card in card_list:
        for module in card.modules:
            module_type = module.type

            if module_type == "section":
                # inline 原来的 _handle_section_text
                text = getattr(module, "text", None)
                if isinstance(text, (KmarkdownElement, PlainTextElement)):
                    if text.content:
                        text_parts.append(text.content)

            elif module_type in ("container", "image-group"):
                # 复用一个简洁的图片提取 helper(见下)
                images.extend(self._extract_image_urls(module.elements))

            elif module_type == "header":
                header_text = getattr(getattr(module, "text", None), "content", None)
                if header_text:
                    text_parts.append(header_text)
            else:
                logger.debug(f"[KOOK] 跳过或未处理模块: {module_type}")

    text = "".join(text_parts)
    message: list = []
    if text:
        message.append(Plain(text=text))
    for img_url in images:
        message.append(Image(file=img_url))
    return message, text
```

### 2. Replace two module‑specific helpers with one focused image extractor

Instead of `_handle_section_text` and `_handle_image_group(module: ContainerModule | ImageGroupModule)`, you can keep a single helper that only depends on the *elements*, not on the specific module classes. This keeps the adapter less tightly coupled to the full card schema:

```python
def _extract_image_urls(self, elements) -> list[str]:
    """从 elements 中提取合法图片 URL,不关心具体 module 类型。"""
    valid_urls: list[str] = []
    for el in elements or []:
        image_src = getattr(el, "src", None)
        if not isinstance(image_src, str):
            logger.warning(
                f'[KOOK] 处理卡片中的图片时发生错误,图片url "{image_src}" 应该为str类型, 而不是 "{type(image_src)}" '
            )
            continue
        if not image_src.startswith(("http://", "https://")):
            logger.warning(f"[KOOK] 屏蔽非http图片url: {image_src}")
            continue
        valid_urls.append(image_src)
    return valid_urls
```

This way:

- The hot path only cares about `module.type`, `module.text.content`, and `elements[].src`, similar to the original dict-based version.
- You still benefit from typed models and validation, but the adapter logic is less tightly coupled to every specific module class.
- The number of helper methods shrinks and the control flow becomes easier to read without losing functionality.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

这次重构做得非常出色,通过引入 Pydantic 数据类,极大地提升了 KOOK 适配器的代码质量、可读性和健壮性。对 WebSocket 事件和 API 响应使用强类型模型,特别是利用 Pydantic 的可区分联合(Discriminated Unions)来处理不同类型的事件和卡片模块,是非常好的实践。这使得数据校验和访问更加安全和直观。代码整体结构更清晰,也更容易维护和扩展。我在代码中发现了一些可以进一步优化的小地方,请查看具体的审查评论。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:platform The bug / feature is about IM platform adapter, such as QQ, Lark, Telegram, WebChat and so on. size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant