fix: parser treats 'atomic' in identifiers as BEGIN ATOMIC keyword#4747
fix: parser treats 'atomic' in identifiers as BEGIN ATOMIC keyword#47477ttp wants to merge 3 commits intosupabase:developfrom
Conversation
xdmden
left a comment
There was a problem hiding this comment.
Nice fix! The whitespace boundary check is the right approach. Two suggestions:
-
Simplify and fix
offset == 0edge case — whenoffset == 0, there's noBEGINbeforeATOMIC, so it should never enterAtomicState. Changingoffset >= 0tooffset > 0handles this and lets us collapse the nested ifs into one condition. Since whitespace characters are all ASCII, we can also skip the rune decode and just cast the byte directly. -
More test coverage — the current test covers
atomic_test(prefix), but it's worth covering suffix (my_function_atomic), mid-identifier (is_atomic), and digit-boundary (col1atomic) cases too, since these are all patterns users are likely to hit.
Co-authored-by: Dmitry Deniskin <166256275+xdmden@users.noreply.github.com>
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughModified the ATOMIC block detection logic in the parser to require a whitespace character immediately before the BEGIN_ATOMIC keyword, preventing false matches when "atomic" appears as part of identifiers. Added test cases to verify that "atomic" substrings in table names, function names, and column names do not trigger atomic block parsing. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@pkg/parser/state.go`:
- Around line 49-53: The current check that enters AtomicState only validates
the preceding boundary and still misclassifies identifiers that start with
"atomic" (e.g., "atomic_test"); update the logic around the detection that uses
BEGIN_ATOMIC/END_ATOMIC so it also validates the trailing boundary before
returning AtomicState: either peek the rune(s) immediately after the matched
"atomic" and ensure the next rune is not an identifier character (letters,
digits, or '_'), or introduce a small candidate state that consumes the
following rune and only transitions to AtomicState if that rune is not an
identifier char; reference the existing AtomicState type and the
BEGIN_ATOMIC/END_ATOMIC constants and ensure you use the same
identifier-character test used elsewhere (e.g., unicode.IsLetter/IsDigit or the
repo's isIdentChar helper) when deciding the trailing boundary.
- Around line 49-53: Replace the duplicated nested if with the intended single
condition: remove the original if using offset >= 0 and keep the new combined
condition (offset > 0 && strings.EqualFold(string(data[offset:]), BEGIN_ATOMIC)
&& unicode.IsSpace(rune(data[offset-1]))), so the function returns
&AtomicState{prev: s, delimiter: []byte(END_ATOMIC)} under that single check;
ensure only one if block surrounds that return and that BEGIN_ATOMIC/END_ATOMIC
and AtomicState are referenced as in the diff.
🧹 Nitpick comments (1)
pkg/parser/state_test.go (1)
171-191: Add coverage for identifiers that start withatomic(no schema prefix).The new cases cover
"atomic"inside identifiers, but not the prefix case likeatomic_test, which is still a likely false-positive. Adding a test here will lock the fix.✅ Suggested test
t.Run("atomic in table name", func(t *testing.T) { sql := []string{"create table public.atomic_test (id int);", "\nselect 1;"} checkSplit(t, sql) }) + + t.Run("atomic prefix in identifier", func(t *testing.T) { + sql := []string{"create table atomic_test (id int);", "\nselect 1;"} + checkSplit(t, sql) + })
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Summary
Parser incorrectly treats
atomicsubstring in identifiers asBEGIN ATOMICkeyword, causing migration failures.Problem
Migrations with table/column names containing
atomicfail with:ERROR: cannot insert multiple commands into a prepared statement (SQLSTATE 42601)
Example:
Root cause: Parser matches atomic in identifiers as the BEGIN ATOMIC keyword, enters a state where semicolons are ignored, and merges multiple statements into one.
Solution
Added word boundary check: ATOMIC only recognized as keyword when preceded by whitespace.
This prevents false matches in identifiers while preserving correct BEGIN ATOMIC function body handling.
Related
Summary by CodeRabbit
Bug Fixes
Tests