ZJIT: Refine singleton class invariant to only invalidate on method shadowing#949
Closed
tekknolagi wants to merge 35 commits intoShopify:masterfrom
Closed
ZJIT: Refine singleton class invariant to only invalidate on method shadowing#949tekknolagi wants to merge 35 commits intoShopify:masterfrom
tekknolagi wants to merge 35 commits intoShopify:masterfrom
Conversation
Following up ruby#16225, I'm implementing fold on AND (&) operations. - Ruby turns this operation into a C `&` in https://github.com/tomascco/ruby/blob/880eb3c76f09c44c85389ac5efbc555afafa3e6f/numeric.c#L5197-L5210 - Rustc compiler emits an `and` assembly instruction on x86: https://godbolt.org/z/Ee74vKTfb So the operations are equivalent. I tested manually some cases and the results match.
The following code: ```ruby x = Object.new sc1 = x.singleton_class sc2 = sc1.singleton_class x.freeze ``` Would freeze sc1 but not sc2, even though sc1 would be frozen. Handle this by walking the class pointer chain for the object. If the class is a singleton class, and it isn't frozen, and the attached object for the singleton class is the object, the singleton class should be frozen, and we move to the next iteration. Fixes [Bug #20319]
Add a new optimization pass that deduplicates PatchPoint instructions asserting the same invariant within a basic block when no intervening instruction could invalidate it. This is common in practice — e.g., `1 + 2 + 3` produces two identical `PatchPoint MethodRedefined(Integer, +, cme)` because each `+` call inserts one. The second is redundant since nothing between them can redefine Integer#+. The approach adds a `PatchPoint` leaf to the abstract heap DAG so the effect system can distinguish instructions that could invalidate invariants (writes to PatchPoint, e.g. Send/CCall) from those that cannot (e.g. IncrCounter, Const). The pass maintains a HashSet of seen invariants per block and skips duplicates, clearing the set when an instruction writes to the PatchPoint heap. Removed PatchPoints' orphaned Snapshot dependencies are cleaned up by the subsequent DCE pass. I added a new test with snapshots to `zjit/src/hir/tests.rs` in the first commit so that you can see how the change affects it in the second commit: ruby@88cf26b#diff-38be021d8d0035cca77d3eab5ccc530109e7832f4711a5beb4dfced7001787ae closes Shopify#779
This will allow prism to pass buffer sizes to the Ruby GC. It also helps avoid buffer overflow as it confirms the size was correctly tracked all the way until the buffer is freed. ruby/prism@a5c3ee3e4c
There's no reason to ever call this and the "count" value is ignored. Because this is part of the public API I left it as a stub, but it could likely be removed.
This doesn't work because actions/checkout uses a fetch depth of 1, so every CI run acts as though man pages were made in the most recent commit. Additionally the VCS code uses committed date, so this check will fail if any PR is made that changes man pages, but then is merged the following day JST.
We can hopefully de-duplicate this GetEP soon.
…uby#16237) Both instructions are pure computations with no side effects: IsBitNotEqual is a cmp+csel (identical to IsBitEqual, already Empty), and FixnumBitCheck is a test+csel on a tagged Fixnum bit. Neither reads memory, writes anything, allocates, or can raise exceptions.
Guards don't write to (e.g. PatchPoint) memory, just change if we side-exit into the interpreter or not, so lower their effects.
Previously, when passing an unshareable proc to a Ractor we would get
the message:
'Ractor.new': allocator undefined for Proc (TypeError)
With this change we get:
'Ractor.new': can not copy unshareable object #<Proc:0x00007f1b31713600 test.rb:2> (Ractor::Error)
Which should hopefully be a little clearer about the problem and make it
simpler to debug.
…by#16240) CheckInterrupts only reads ec->interrupt_flag and conditionally side-exits — it doesn't touch PatchPoint invariants or the allocator. Add an InterruptFlag leaf under Memory in the abstract heap DAG and narrow CheckInterrupts from Any to read(InterruptFlag)|write(Control). This lets remove_redundant_patch_points see through CheckInterrupts, eliminating duplicate PatchPoints that were previously separated by an interrupt check.
Before:
Optimized HIR:
fn block in <main>@benchmarks/setivar.rb:40:
bb1():
EntryPoint interpreter
v1:BasicObject = LoadSelf
Jump bb3(v1)
bb2():
EntryPoint JIT(0)
v4:BasicObject = LoadArg :self@0
Jump bb3(v4)
bb3(v6:BasicObject):
v10:CPtr = GetEP 1
v11:BasicObject = LoadField v10, :obj@0xffffffffffffffe8 <=
PatchPoint NoSingletonClass(TheClass@0x1037633c0)
PatchPoint MethodRedefined(TheClass@0x1037633c0, set_value_loop@0xf5f1, cme:0x103a14650)
v21:HeapObject[class_exact:TheClass] = GuardType v11, HeapObject[class_exact:TheClass]
v22:BasicObject = SendDirect v21, 0x16cf41630, :set_value_loop (0x16cf41670)
CheckInterrupts
Return v22
After:
Optimized HIR:
fn block in <main>@benchmarks/setivar.rb:40:
bb1():
EntryPoint interpreter
v1:BasicObject = LoadSelf
Jump bb3(v1)
bb2():
EntryPoint JIT(0)
v4:BasicObject = LoadArg :self@0
Jump bb3(v4)
bb3(v6:BasicObject):
v10:CPtr = GetEP 1
v11:BasicObject = LoadField v10, :obj@-0x18 <=
PatchPoint NoSingletonClass(TheClass@0x102bc3420)
PatchPoint MethodRedefined(TheClass@0x102bc3420, set_value_loop@0xf5f1, cme:0x102e945f0)
v21:HeapObject[class_exact:TheClass] = GuardType v11, HeapObject[class_exact:TheClass]
v22:BasicObject = SendDirect v21, 0x16dadd630, :set_value_loop (0x16dadd670)
CheckInterrupts
Return v22
Previously classes and modules were pre-aged. Ie. as soon as they're allocated they are aged to old_age - 1. This was done with the assumption that classes are generally always long lived, so we should assume that any that survive a single GC can be considered old. This commit keeps the same semantics, but moves the logic out of the allocation path, in order to simplify allocation. Classes and modules are now set to old-age the first time they are marked.
Previously we printed the struct name of T_DATA both in rb_raw_obj_info_buitin_type and in rb_raw_obj_info_common via type_name.
…ruby#16249) Before this change, gen_function_stub and gen_function_stub_hit_trampoline communicated via a scratch register. We would like gen_function_stub_hit_trampoline to have more freedom with regard to the registers it uses, especially for the CCall in to function_stub_hit. Instead of communicating via scratch register, we'll communicate via stack. Practically speaking, this means: * Stop using x15 (scratch reg) to communicate iseq call addr from call stub to function sub hit trampoline; use stack instead * Don't try to CCall with x15 as first argument; can't use scratch reg in parallel move of arguments Here is pseudo assembly of before this commit: ``` some_send_direct_in_a_ruby_method(JIT code): mov x15, gen_function_stub mov x0, self mov x1, 1 blr x15 gen_function_stub: mov x15, 0xISEQADDR (the address of the ISEQ we _want_ to compile) jmp function_stub_hit_trampoline function_stub_hit_trampoline: function prologue cpush ALL_JIT_REGS mov x0, x15 # currently x15 is 0xISEQADDR mov x1, CFP mov x2, SP blr function_stub_hit mov x15, x0 # write jump address to x15 (code pointer for compiled iseq) cpop ALL_JIT_REGS function epilogue jmp x15 ``` Here is pseudo assembly of after this commit: ``` some_send_direct_in_a_ruby_method(JIT code): mov x15, gen_function_stub mov x0, self mov x1, 1 blr x15 gen_function_stub: mov x15, 0xISEQADDR (the address of the ISEQ we _want_ to compile) push x15 jmp function_stub_hit_trampoline function_stub_hit_trampoline: pop x15 # get the ISEQ addr from gen_function_stub function prologue cpush ALL_JIT_REGS mov x0, x15 # currently x15 is 0xISEQADDR mov x1, CFP mov x2, SP blr function_stub_hit mov x15, x0 # write jump address to x15 (code pointer for compiled iseq) cpop ALL_JIT_REGS function epilogue jmp x15 ```
LibreSSL docs state this can fail. OpenSSL has no docs for this, but the implementation goes via BIO_ctrl() which can fail (e.g. via a callback). Example Valgrind trace: ``` Invalid read of size 1 memmove (at /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) memcpy (string_fortified.h:29) ossl_str_new (ossl.c:107) ossl_membio2str (ossl_bio.c:36) ossl_x509ext_get_value (ossl_x509ext.c:390) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> rb_catch_obj (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> rb_catch_obj (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_vm_invoke_proc (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_proc_call_kw (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> ruby_run_node (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> (below main) (libc_start_call_main.h:58) ``` ruby/openssl@4fb09446d6
If it fails, then the duplicated item is not freed. Solve it by making sure the push cannot fail by reserving the necessary stack size upfront. Example Valgrind trace: ``` 14,009 (384 direct, 13,625 indirect) bytes in 1 blocks are definitely lost in loss record 27,287 of 27,373 malloc (at /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) CRYPTO_zalloc (at /usr/lib/x86_64-linux-gnu/libcrypto.so.3) <unknown stack frame> ASN1_item_new (at /usr/lib/x86_64-linux-gnu/libcrypto.so.3) *ossl_x509_alloc (ossl_x509cert.c:97) <unknown stack frame> rb_class_new_instance_pass_kw (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> rb_catch_obj (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_yield (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) rb_ary_each (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> rb_catch_obj (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) <unknown stack frame> <unknown stack frame> <unknown stack frame> rb_vm_exec (at /usr/lib/x86_64-linux-gnu/libruby-3.2.so.3.2.3) ``` ruby/openssl@b53628b127
This converts the cc_refinement_table in the VM into a standard TypedData object using rb_gc_declare_weak_references to allow it to be . This allows us to store all the refinements in a flat array, and simply remove them as that are GC'd, we don't need a set for efficient deletes. This should also allow us to assume that imemo_callcache never requires additional cleanup.
…hadowing Previously, the NoSingletonClass invariant invalidated JIT code whenever a singleton class was created for an instance of a given class. This was overly conservative—many uses of singleton classes (freeze, introspection, non-shadowing methods) don't affect method dispatch for the original class. Refine the invariant to NoSingletonClassWithShadowingMethod: only invalidate when a method is defined on a singleton class that shadows a method on the original class. The hook moves from make_singleton_class() in class.c to rb_method_entry_make() in vm_method.c, where it checks whether the newly defined method exists on the superclass chain. Correctness is always maintained by GuardType, which checks the exact class of the receiver—singleton class objects fail this check and side-exit regardless of the patchpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
NoSingletonClassinvariant toNoSingletonClassWithShadowingMethod: only invalidate when a method defined on a singleton class shadows a method on the original classmake_singleton_class()in class.c torb_method_entry_make()in vm_method.c, checkingrb_method_entry(super, mid)for shadowingTest plan