Skip to content

ZJIT: Refine singleton class invariant to only invalidate on method shadowing#949

Closed
tekknolagi wants to merge 35 commits intoShopify:masterfrom
tekknolagi:singleton-class-shadowing-invariant
Closed

ZJIT: Refine singleton class invariant to only invalidate on method shadowing#949
tekknolagi wants to merge 35 commits intoShopify:masterfrom
tekknolagi:singleton-class-shadowing-invariant

Conversation

@tekknolagi
Copy link

Summary

  • Refine NoSingletonClass invariant to NoSingletonClassWithShadowingMethod: only invalidate when a method defined on a singleton class shadows a method on the original class
  • Move the C hook from make_singleton_class() in class.c to rb_method_entry_make() in vm_method.c, checking rb_method_entry(super, mid) for shadowing
  • Creating a singleton class alone, or defining non-shadowing methods on one, no longer invalidates JIT code

Test plan

  • 1431 unit tests pass (including 5 new singleton class tests)
  • 27 integration tests pass
  • CI

tomascco and others added 30 commits February 23, 2026 22:03
Following up ruby#16225, I'm implementing fold on AND (&) operations.

- Ruby turns this operation into a C `&` in https://github.com/tomascco/ruby/blob/880eb3c76f09c44c85389ac5efbc555afafa3e6f/numeric.c#L5197-L5210
- Rustc compiler emits an `and` assembly instruction on x86: https://godbolt.org/z/Ee74vKTfb

So the operations are equivalent. I tested manually some cases and the results match.
The following code:

```ruby
x = Object.new
sc1 = x.singleton_class
sc2 = sc1.singleton_class
x.freeze
```

Would freeze sc1 but not sc2, even though sc1 would be frozen.

Handle this by walking the class pointer chain for the object.
If the class is a singleton class, and it isn't frozen, and the
attached object for the singleton class is the object, the
singleton class should be frozen, and we move to the next iteration.

Fixes [Bug #20319]
Add a new optimization pass that deduplicates PatchPoint instructions
asserting the same invariant within a basic block when no intervening
instruction could invalidate it.

This is common in practice — e.g., `1 + 2 + 3` produces two identical
`PatchPoint MethodRedefined(Integer, +, cme)` because each `+` call
inserts one. The second is redundant since nothing between them can
redefine Integer#+.

The approach adds a `PatchPoint` leaf to the abstract heap DAG so the
effect system can distinguish instructions that could invalidate
invariants (writes to PatchPoint, e.g. Send/CCall) from those that
cannot (e.g. IncrCounter, Const). The pass maintains a HashSet of
seen invariants per block and skips duplicates, clearing the set when
an instruction writes to the PatchPoint heap.

Removed PatchPoints' orphaned Snapshot dependencies are cleaned up by
the subsequent DCE pass.

I added a new test with snapshots to `zjit/src/hir/tests.rs` in the first commit
so that you can see how the change affects it in the second commit:
ruby@88cf26b#diff-38be021d8d0035cca77d3eab5ccc530109e7832f4711a5beb4dfced7001787ae

closes Shopify#779
This will allow prism to pass buffer sizes to the Ruby GC.

It also helps avoid buffer overflow as it confirms the size was
correctly tracked all the way until the buffer is freed.

ruby/prism@a5c3ee3e4c
There's no reason to ever call this and the "count" value is ignored.
Because this is part of the public API I left it as a stub, but it could
likely be removed.
This doesn't work because actions/checkout uses a fetch depth of 1, so
every CI run acts as though man pages were made in the most recent
commit.

Additionally the VCS code uses committed date, so this check will fail
if any PR is made that changes man pages, but then is merged the
following day JST.
We can hopefully de-duplicate this GetEP soon.
…uby#16237)

Both instructions are pure computations with no side effects:
IsBitNotEqual is a cmp+csel (identical to IsBitEqual, already Empty),
and FixnumBitCheck is a test+csel on a tagged Fixnum bit. Neither
reads memory, writes anything, allocates, or can raise exceptions.
Guards don't write to (e.g. PatchPoint) memory, just change if we
side-exit into the interpreter or not, so lower their effects.
Previously, when passing an unshareable proc to a Ractor we would get
the message:

    'Ractor.new': allocator undefined for Proc (TypeError)

With this change we get:

    'Ractor.new': can not copy unshareable object #<Proc:0x00007f1b31713600 test.rb:2> (Ractor::Error)

Which should hopefully be a little clearer about the problem and make it
simpler to debug.
…by#16240)

CheckInterrupts only reads ec->interrupt_flag and conditionally
side-exits — it doesn't touch PatchPoint invariants or the allocator.
Add an InterruptFlag leaf under Memory in the abstract heap DAG and
narrow CheckInterrupts from Any to read(InterruptFlag)|write(Control).

This lets remove_redundant_patch_points see through CheckInterrupts,
eliminating duplicate PatchPoints that were previously separated by
an interrupt check.
Before:

    Optimized HIR:
    fn block in <main>@benchmarks/setivar.rb:40:
    bb1():
      EntryPoint interpreter
      v1:BasicObject = LoadSelf
      Jump bb3(v1)
    bb2():
      EntryPoint JIT(0)
      v4:BasicObject = LoadArg :self@0
      Jump bb3(v4)
    bb3(v6:BasicObject):
      v10:CPtr = GetEP 1
      v11:BasicObject = LoadField v10, :obj@0xffffffffffffffe8    <=
      PatchPoint NoSingletonClass(TheClass@0x1037633c0)
      PatchPoint MethodRedefined(TheClass@0x1037633c0, set_value_loop@0xf5f1, cme:0x103a14650)
      v21:HeapObject[class_exact:TheClass] = GuardType v11, HeapObject[class_exact:TheClass]
      v22:BasicObject = SendDirect v21, 0x16cf41630, :set_value_loop (0x16cf41670)
      CheckInterrupts
      Return v22

After:

    Optimized HIR:
    fn block in <main>@benchmarks/setivar.rb:40:
    bb1():
      EntryPoint interpreter
      v1:BasicObject = LoadSelf
      Jump bb3(v1)
    bb2():
      EntryPoint JIT(0)
      v4:BasicObject = LoadArg :self@0
      Jump bb3(v4)
    bb3(v6:BasicObject):
      v10:CPtr = GetEP 1
      v11:BasicObject = LoadField v10, :obj@-0x18    <=
      PatchPoint NoSingletonClass(TheClass@0x102bc3420)
      PatchPoint MethodRedefined(TheClass@0x102bc3420, set_value_loop@0xf5f1, cme:0x102e945f0)
      v21:HeapObject[class_exact:TheClass] = GuardType v11, HeapObject[class_exact:TheClass]
      v22:BasicObject = SendDirect v21, 0x16dadd630, :set_value_loop (0x16dadd670)
      CheckInterrupts
      Return v22
Previously classes and modules were pre-aged. Ie. as soon as they're
allocated they are aged to old_age - 1. This was done with the
assumption that classes are generally always long lived, so we should
assume that any that survive a single GC can be considered old.

This commit keeps the same semantics, but moves the logic out of the
allocation path, in order to simplify allocation. Classes and modules
are now set to old-age the first time they are marked.
Previously we printed the struct name of T_DATA both in
rb_raw_obj_info_buitin_type and in rb_raw_obj_info_common via type_name.
…ruby#16249)

Before this change, gen_function_stub and
gen_function_stub_hit_trampoline communicated via a scratch register.
We would like gen_function_stub_hit_trampoline to have more freedom with
regard to the registers it uses, especially for the CCall in to
function_stub_hit.  Instead of communicating via scratch register, we'll
communicate via stack.

Practically speaking, this means:

* Stop using x15 (scratch reg) to communicate iseq call addr from call
  stub to function sub hit trampoline; use stack instead
* Don't try to CCall with x15 as first argument; can't use scratch reg
  in parallel move of arguments

Here is pseudo assembly of before this commit:

```
some_send_direct_in_a_ruby_method(JIT code):
  mov x15, gen_function_stub
  mov x0, self
  mov x1, 1
  blr x15

gen_function_stub:
  mov x15, 0xISEQADDR (the address of the ISEQ we _want_ to compile)
  jmp function_stub_hit_trampoline

function_stub_hit_trampoline:
  function prologue
  cpush ALL_JIT_REGS
  mov x0, x15 # currently x15 is 0xISEQADDR
  mov x1, CFP
  mov x2, SP
  blr function_stub_hit
  mov x15, x0 # write jump address to x15 (code pointer for compiled iseq)
  cpop ALL_JIT_REGS
  function epilogue
  jmp x15
```

Here is pseudo assembly of after this commit:

```
some_send_direct_in_a_ruby_method(JIT code):
  mov x15, gen_function_stub
  mov x0, self
  mov x1, 1
  blr x15

gen_function_stub:
  mov x15, 0xISEQADDR (the address of the ISEQ we _want_ to compile)
  push x15
  jmp function_stub_hit_trampoline

function_stub_hit_trampoline:
  pop x15 # get the ISEQ addr from gen_function_stub
  function prologue
  cpush ALL_JIT_REGS
  mov x0, x15 # currently x15 is 0xISEQADDR
  mov x1, CFP
  mov x2, SP
  blr function_stub_hit
  mov x15, x0 # write jump address to x15 (code pointer for compiled iseq)
  cpop ALL_JIT_REGS
  function epilogue
  jmp x15
```
LibreSSL docs state this can fail. OpenSSL has no docs for this, but the
implementation goes via BIO_ctrl() which can fail (e.g. via a callback).
Example Valgrind trace:
```
Invalid read of size 1
  memmove (at /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
  memcpy (string_fortified.h:29)
  ossl_str_new (ossl.c:107)
  ossl_membio2str (ossl_bio.c:36)
  ossl_x509ext_get_value (ossl_x509ext.c:390)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  rb_catch_obj (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  rb_catch_obj (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_vm_invoke_proc (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_proc_call_kw (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  ruby_run_node (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  (below main) (libc_start_call_main.h:58)
```

ruby/openssl@4fb09446d6
ndossche and others added 5 commits February 26, 2026 12:10
If it fails, then the duplicated item is not freed.
Solve it by making sure the push cannot fail by reserving the necessary
stack size upfront.

Example Valgrind trace:
```
14,009 (384 direct, 13,625 indirect) bytes in 1 blocks are definitely lost in loss record 27,287 of 27,373
  malloc (at /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
  CRYPTO_zalloc (at /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
  <unknown stack frame>
  ASN1_item_new (at /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
 *ossl_x509_alloc (ossl_x509cert.c:97)
  <unknown stack frame>
  rb_class_new_instance_pass_kw (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  rb_catch_obj (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  rb_catch_obj (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
  <unknown stack frame>
  <unknown stack frame>
  <unknown stack frame>
  rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3)
```

ruby/openssl@b53628b127
This converts the cc_refinement_table in the VM into a standard
TypedData object using rb_gc_declare_weak_references to allow it to be
.

This allows us to store all the refinements in a flat array, and simply
remove them as that are GC'd, we don't need a set for efficient deletes.

This should also allow us to assume that imemo_callcache never requires
additional cleanup.
…hadowing

Previously, the NoSingletonClass invariant invalidated JIT code whenever
a singleton class was created for an instance of a given class. This was
overly conservative—many uses of singleton classes (freeze, introspection,
non-shadowing methods) don't affect method dispatch for the original class.

Refine the invariant to NoSingletonClassWithShadowingMethod: only
invalidate when a method is defined on a singleton class that shadows
a method on the original class. The hook moves from make_singleton_class()
in class.c to rb_method_entry_make() in vm_method.c, where it checks
whether the newly defined method exists on the superclass chain.

Correctness is always maintained by GuardType, which checks the exact
class of the receiver—singleton class objects fail this check and
side-exit regardless of the patchpoint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@tekknolagi tekknolagi closed this Feb 26, 2026
@semgrep-pr-bot
Copy link

⚠️ Deprecated LLM Models Detected

This PR contains references to 1 deprecated model(s). Consider migrating to the recommended replacements.

⚠️ o3 (openai)

  • Occurrences: 1
  • Severity: WARNING
  • Shopify Replacement: gpt-5.1
  • Hotswap Date: 2025-12-15

Locations:

  • bootstraptest/test_ractor.rb:1367

Recommended Action: Replace with gpt-5.1 before 2025-12-15


📚 Resources

💡 This is an informational warning - it will not block your PR. However, please plan to migrate away from deprecated models before their shutdown dates.
💡 Please reach out to #help-proxy-shopify-ai if you have any questions.

🤖 Posted by Deprecated Models GitHub App

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.