Conversation
N5N3
added a commit
that referenced
this pull request
Oct 29, 2021
commit c054dbc Author: Shuhei Kadowaki <40514306+aviatesk@users.noreply.github.com> Date: Fri Oct 29 01:31:55 2021 +0900 optimizer: eliminate allocations (JuliaLang#42833) commit 6a9737d Author: Jeff Bezanson <jeff.bezanson@gmail.com> Date: Thu Oct 28 12:23:53 2021 -0400 fix JuliaLang#42659, move `jl_coverage_visit_line` to runtime library (JuliaLang#42810) commit c762f10 Author: Marc Ittel <35898736+MarcMush@users.noreply.github.com> Date: Thu Oct 28 12:19:13 2021 +0200 change `julia` to `julia-repl` in docstrings (JuliaLang#42824) Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com> commit 9f52ec0 Author: Dilum Aluthge <dilum@aluthge.com> Date: Thu Oct 28 05:30:11 2021 -0400 CI (Buildkite): Update all rootfs images to the latest versions (JuliaLang#42802) * CI (Buildkite): Update all rootfs images to the latest versions * Re-sign all of the signed pipelines commit 404e584 Author: DilumAluthgeBot <43731525+DilumAluthgeBot@users.noreply.github.com> Date: Wed Oct 27 21:11:04 2021 -0400 🤖 Bump the Statistics stdlib from 74897fe to 5256d57 (JuliaLang#42826) Co-authored-by: Dilum Aluthge <dilum@aluthge.com> commit c74814e Author: Jeff Bezanson <jeff.bezanson@gmail.com> Date: Wed Oct 27 16:34:46 2021 -0400 reset `RandomDevice` file from `__init__` (JuliaLang#42537) This prevents us from seeing an invalid `IOStream` object from a saved system image, and also ensures the files are opened once for all threads. commit 05ed348 Author: Jeff Bezanson <jeff.bezanson@gmail.com> Date: Wed Oct 27 15:24:17 2021 -0400 only visit nonfunction_mt once when traversing method tables (JuliaLang#42821) commit d71b77d Author: DilumAluthgeBot <43731525+DilumAluthgeBot@users.noreply.github.com> Date: Tue Oct 26 20:39:08 2021 -0400 🤖 Bump the Downloads stdlib from 5f1509d to dbb0625 (JuliaLang#42811) Co-authored-by: Dilum Aluthge <dilum@aluthge.com> commit b4fddc1 Author: DilumAluthgeBot <43731525+DilumAluthgeBot@users.noreply.github.com> Date: Tue Oct 26 14:46:20 2021 -0400 🤖 Bump the Pkg stdlib from bc32103f to 26918395 (JuliaLang#42806) Co-authored-by: Dilum Aluthge <dilum@aluthge.com> commit 6a386de Author: Dilum Aluthge <dilum@aluthge.com> Date: Tue Oct 26 12:15:51 2021 -0400 CI (Buildkite): make sure to hit ignore any unencrypted repo keys, regardless of where they are located in the repository (JuliaLang#42803) commit 021a6b5 Author: Shuhei Kadowaki <40514306+aviatesk@users.noreply.github.com> Date: Wed Oct 27 01:08:33 2021 +0900 optimizer: clean up inlining test code (JuliaLang#42804) commit 16eb196 Merge: 21ebabf 1510eaa Author: Shuhei Kadowaki <40514306+aviatesk@users.noreply.github.com> Date: Tue Oct 26 23:25:41 2021 +0900 Merge pull request JuliaLang#42766 from JuliaLang/avi/42754 optimizer: fix JuliaLang#42754, inline union-split const-prop'ed sources commit 21ebabf Author: Kristoffer Carlsson <kcarlsson89@gmail.com> Date: Tue Oct 26 16:11:32 2021 +0200 simplify code loading test now that TOML files are parsed with a real TOML parser (JuliaLang#42328) commit 1510eaa Author: Shuhei Kadowaki <aviatesk@gmail.com> Date: Mon Oct 25 01:35:12 2021 +0900 optimizer: fix JuliaLang#42754, inline union-split const-prop'ed sources This commit complements JuliaLang#39754 and JuliaLang#39305: implements a logic to use constant-prop'ed results for inlining at union-split callsite. Currently it works only for cases when constant-prop' succeeded for all (union-split) signatures. > example ```julia julia> mutable struct X # NOTE in order to confuse `fieldtype_tfunc`, we need to have at least two fields with different types a::Union{Nothing, Int} b::Symbol end; julia> code_typed((X, Union{Nothing,Int})) do x, a # this `setproperty` call would be union-split and constant-prop will happen for # each signature: inlining would fail if we don't use constant-prop'ed source # since the approximated inlining cost of `convert(fieldtype(X, sym), a)` would # end up very high if we don't propagate `sym::Const(:a)` x.a = a x end |> only |> first ``` > before this commit ```julia CodeInfo( 1 ─ %1 = Base.setproperty!::typeof(setproperty!) │ %2 = (isa)(a, Nothing)::Bool └── goto #3 if not %2 2 ─ %4 = π (a, Nothing) │ invoke %1(_2::X,🅰️ :Symbol, %4::Nothing)::Any └── goto #6 3 ─ %7 = (isa)(a, Int64)::Bool └── goto #5 if not %7 4 ─ %9 = π (a, Int64) │ invoke %1(_2::X,🅰️ :Symbol, %9::Int64)::Any └── goto #6 5 ─ Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{} └── unreachable 6 ┄ return x ) ``` > after this commit ```julia CodeInfo( 1 ─ %1 = (isa)(a, Nothing)::Bool └── goto #3 if not %1 2 ─ Base.setfield!(x, :a, nothing)::Nothing └── goto #6 3 ─ %5 = (isa)(a, Int64)::Bool └── goto #5 if not %5 4 ─ %7 = π (a, Int64) │ Base.setfield!(x, :a, %7)::Int64 └── goto #6 5 ─ Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{} └── unreachable 6 ┄ return x ) ``` commit 4c3ae20 Author: Chris Foster <chris42f@gmail.com> Date: Tue Oct 26 21:48:32 2021 +1000 Make Base.ifelse a generic function (JuliaLang#37343) Allow user code to directly extend `Base.ifelse` rather than needing a special package for it. commit 2e388e3 Author: Shuhei Kadowaki <aviatesk@gmail.com> Date: Mon Oct 25 01:30:09 2021 +0900 optimizer: eliminate excessive specialization in inlining code This commit includes several code quality improvements in inlining code: - eliminate excessive specializations around: * `item::Pair{Any, Any}` constructions * iterations on `Vector{Pair{Any, Any}}` - replace `Pair{Any, Any}` with new, more explicit data type `InliningCase` - remove dead code
N5N3
pushed a commit
that referenced
this pull request
Oct 29, 2021
This commit complements JuliaLang#39754 and JuliaLang#39305: implements a logic to use constant-prop'ed results for inlining at union-split callsite. Currently it works only for cases when constant-prop' succeeded for all (union-split) signatures. > example ```julia julia> mutable struct X # NOTE in order to confuse `fieldtype_tfunc`, we need to have at least two fields with different types a::Union{Nothing, Int} b::Symbol end; julia> code_typed((X, Union{Nothing,Int})) do x, a # this `setproperty` call would be union-split and constant-prop will happen for # each signature: inlining would fail if we don't use constant-prop'ed source # since the approximated inlining cost of `convert(fieldtype(X, sym), a)` would # end up very high if we don't propagate `sym::Const(:a)` x.a = a x end |> only |> first ``` > before this commit ```julia CodeInfo( 1 ─ %1 = Base.setproperty!::typeof(setproperty!) │ %2 = (isa)(a, Nothing)::Bool └── goto #3 if not %2 2 ─ %4 = π (a, Nothing) │ invoke %1(_2::X,🅰️ :Symbol, %4::Nothing)::Any └── goto #6 3 ─ %7 = (isa)(a, Int64)::Bool └── goto #5 if not %7 4 ─ %9 = π (a, Int64) │ invoke %1(_2::X,🅰️ :Symbol, %9::Int64)::Any └── goto #6 5 ─ Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{} └── unreachable 6 ┄ return x ) ``` > after this commit ```julia CodeInfo( 1 ─ %1 = (isa)(a, Nothing)::Bool └── goto #3 if not %1 2 ─ Base.setfield!(x, :a, nothing)::Nothing └── goto #6 3 ─ %5 = (isa)(a, Int64)::Bool └── goto #5 if not %5 4 ─ %7 = π (a, Int64) │ Base.setfield!(x, :a, %7)::Int64 └── goto #6 5 ─ Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{} └── unreachable 6 ┄ return x ) ```
N5N3
pushed a commit
that referenced
this pull request
Mar 25, 2022
In JuliaLang#44635, we observe that occasionally a call to `view(::SubArray, ::Colon, ...)` dispatches to the wrong function. The post-inlining IR is in relevant part: ``` │ │ %8 = (isa)(I, Tuple{Colon, UnitRange{Int64}, SubArray{Int64, 2, UnitRange{Int64}, Tuple{Matrix{Int64}}, false}})::Bool └───│ goto #3 if not %8 2 ──│ %10 = π (I, Tuple{Colon, UnitRange{Int64}, SubArray{Int64, 2, UnitRange{Int64}, Tuple{Matrix{Int64}}, false}}) │ │ @ indices.jl:324 within `to_indices` @ multidimensional.jl:859 │ │┌ @ multidimensional.jl:864 within `uncolon` │ ││┌ @ indices.jl:351 within `Slice` @ indices.jl:351 │ │││ %11 = %new(Base.Slice{Base.OneTo{Int64}}, %7)::Base.Slice{Base.OneTo{Int64}} │ │└└ │ │┌ @ essentials.jl:251 within `tail` │ ││ %12 = Core.getfield(%10, 2)::UnitRange{Int64} │ ││ %13 = Core.getfield(%10, 3)::SubArray{Int64, 2, UnitRange{Int64}, Tuple{Matrix{Int64}}, false} │ │└ │ │ @ indices.jl:324 within `to_indices` └───│ goto #5 │ @ indices.jl:324 within `to_indices` @ indices.jl:333 │┌ @ tuple.jl:29 within `getindex` 3 ──││ %15 = Base.getfield(I, 1, true)::Function │ │└ │ │ invoke Base.to_index(A::SubArray{Int64, 3, Array{Int64, 3}, Tuple{Vector{Int64}, Base.Slice{Base.OneTo{Int64}}, UnitRange{Int64}}, false}, %15::Function)::Union{} ``` Here we expect the `isa` at `%8` to always be [1]. However, we seemingly observe the result that the branch is not taken and we instead end up in the fallback `to_index`, which (correctly) complains that the colon should have been dereferenced to an index. After some investigation of the relevant rr trace, what turns out to happen here is that the va tuple we compute in codegen gets garbage collected before the call to `emit_isa`, causing a use-after-free read, which happens to make `emit_isa` think that the isa condition is impossible, causing it to fold the branch away. The fix is to simply add the relevant GC root. It's a bit unfortunate that this wasn't caught by the GC verifier. It would have in principle been capable of doing so, but it is currently disabled for C++ sources. It would be worth revisiting this in the future to see if it can't be made to work. Fixes JuliaLang#44635. [1] The specialization heuristics decided to widen `Colon` to `Function`, which doesn't make much sense here, but regardless, it shouldn't crash.
N5N3
pushed a commit
that referenced
this pull request
Apr 3, 2022
Currently the optimizer handles abstract callsite only when there is a single dispatch candidate (in most cases), and so inlining and static-dispatch are prohibited when the callsite is union-split (in other word, union-split happens only when all the dispatch candidates are concrete). However, there are certain patterns of code (most notably our Julia-level compiler code) that inherently need to deal with abstract callsite. The following example is taken from `Core.Compiler` utility: ```julia julia> @inline isType(@nospecialize t) = isa(t, DataType) && t.name === Type.body.name isType (generic function with 1 method) julia> code_typed((Any,)) do x # abstract, but no union-split, successful inlining isType(x) end |> only CodeInfo( 1 ─ %1 = (x isa Main.DataType)::Bool └── goto #3 if not %1 2 ─ %3 = π (x, DataType) │ %4 = Base.getfield(%3, :name)::Core.TypeName │ %5 = Base.getfield(Type{T}, :name)::Core.TypeName │ %6 = (%4 === %5)::Bool └── goto #4 3 ─ goto #4 4 ┄ %9 = φ (#2 => %6, #3 => false)::Bool └── return %9 ) => Bool julia> code_typed((Union{Type,Nothing},)) do x # abstract, union-split, unsuccessful inlining isType(x) end |> only CodeInfo( 1 ─ %1 = (isa)(x, Nothing)::Bool └── goto #3 if not %1 2 ─ goto #4 3 ─ %4 = Main.isType(x)::Bool └── goto #4 4 ┄ %6 = φ (#2 => false, #3 => %4)::Bool └── return %6 ) => Bool ``` (note that this is a limitation of the inlining algorithm, and so any user-provided hints like callsite inlining annotation doesn't help here) This commit enables inlining and static dispatch for abstract union-split callsite. The core idea here is that we can simulate our dispatch semantics by generating `isa` checks in order of the specialities of dispatch candidates: ```julia julia> code_typed((Union{Type,Nothing},)) do x # union-split, unsuccessful inlining isType(x) end |> only CodeInfo( 1 ─ %1 = (isa)(x, Nothing)::Bool └── goto #3 if not %1 2 ─ goto #9 3 ─ %4 = (isa)(x, Type)::Bool └── goto #8 if not %4 4 ─ %6 = π (x, Type) │ %7 = (%6 isa Main.DataType)::Bool └── goto #6 if not %7 5 ─ %9 = π (%6, DataType) │ %10 = Base.getfield(%9, :name)::Core.TypeName │ %11 = Base.getfield(Type{T}, :name)::Core.TypeName │ %12 = (%10 === %11)::Bool └── goto #7 6 ─ goto #7 7 ┄ %15 = φ (#5 => %12, #6 => false)::Bool └── goto #9 8 ─ Core.throw(ErrorException("fatal error in type inference (type bound)"))::Union{} └── unreachable 9 ┄ %19 = φ (#2 => false, #7 => %15)::Bool └── return %19 ) => Bool ``` Inlining/static-dispatch of abstract union-split callsite will improve the performance in such situations (and so this commit will improve the latency of our JIT compilation). Especially, this commit helps us avoid excessive specializations of `Core.Compiler` code by statically-resolving `@nospecialize`d callsites, and as the result, the # of precompiled statements is now reduced from `2005` ([`master`](f782430)) to `1912` (this commit). And also, as a side effect, the implementation of our inlining algorithm gets much simplified now since we no longer need the previous special handlings for abstract callsites. One possible drawback would be increased code size. This change seems to certainly increase the size of sysimage, but I think these numbers are in an acceptable range: > [`master`](f782430) ``` ❯ du -shk usr/lib/julia/* 17604 usr/lib/julia/corecompiler.ji 194072 usr/lib/julia/sys-o.a 169424 usr/lib/julia/sys.dylib 23784 usr/lib/julia/sys.dylib.dSYM 103772 usr/lib/julia/sys.ji ``` > this commit ``` ❯ du -shk usr/lib/julia/* 17512 usr/lib/julia/corecompiler.ji 195588 usr/lib/julia/sys-o.a 170908 usr/lib/julia/sys.dylib 23776 usr/lib/julia/sys.dylib.dSYM 105360 usr/lib/julia/sys.ji ```
N5N3
pushed a commit
that referenced
this pull request
Jun 27, 2022
…Lang#45790) Currently the `@nospecialize`-d `push!(::Vector{Any}, ...)` can only take a single item and we will end up with runtime dispatch when we try to call it with multiple items: ```julia julia> code_typed(push!, (Vector{Any}, Any)) 1-element Vector{Any}: CodeInfo( 1 ─ $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing │ %2 = Base.arraylen(a)::Int64 │ Base.arrayset(true, a, item, %2)::Vector{Any} └── return a ) => Vector{Any} julia> code_typed(push!, (Vector{Any}, Any, Any)) 1-element Vector{Any}: CodeInfo( 1 ─ %1 = Base.append!(a, iter)::Vector{Any} └── return %1 ) => Vector{Any} ``` This commit adds a new specialization that it can take arbitrary-length items. Our compiler should still be able to optimize the single-input case as before via the dispatch mechanism. ```julia julia> code_typed(push!, (Vector{Any}, Any)) 1-element Vector{Any}: CodeInfo( 1 ─ $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000001, 0x0000000000000001))::Nothing │ %2 = Base.arraylen(a)::Int64 │ Base.arrayset(true, a, item, %2)::Vector{Any} └── return a ) => Vector{Any} julia> code_typed(push!, (Vector{Any}, Any, Any)) 1-element Vector{Any}: CodeInfo( 1 ─ %1 = Base.arraylen(a)::Int64 │ $(Expr(:foreigncall, :(:jl_array_grow_end), Nothing, svec(Any, UInt64), 0, :(:ccall), Core.Argument(2), 0x0000000000000002, 0x0000000000000002))::Nothing └── goto #7 if not true 2 ┄ %4 = φ (#1 => 1, #6 => %14)::Int64 │ %5 = φ (#1 => 1, #6 => %15)::Int64 │ %6 = Base.getfield(x, %4, true)::Any │ %7 = Base.add_int(%1, %4)::Int64 │ Base.arrayset(true, a, %6, %7)::Vector{Any} │ %9 = (%5 === 2)::Bool └── goto #4 if not %9 3 ─ goto #5 4 ─ %12 = Base.add_int(%5, 1)::Int64 └── goto #5 5 ┄ %14 = φ (#4 => %12)::Int64 │ %15 = φ (#4 => %12)::Int64 │ %16 = φ (#3 => true, #4 => false)::Bool │ %17 = Base.not_int(%16)::Bool └── goto #7 if not %17 6 ─ goto #2 7 ┄ return a ) => Vector{Any} ``` This commit also adds the equivalent implementations for `pushfirst!`.
N5N3
pushed a commit
that referenced
this pull request
Jun 29, 2022
When calling `jl_error()` or `jl_errorf()`, we must check to see if we
are so early in the bringup process that it is dangerous to attempt to
construct a backtrace because the data structures used to provide line
information are not properly setup.
This can be easily triggered by running:
```
julia -C invalid
```
On an `i686-linux-gnu` build, this will hit the "Invalid CPU Name"
branch in `jitlayers.cpp`, which calls `jl_errorf()`. This in turn
calls `jl_throw()`, which will eventually call `jl_DI_for_fptr` as part
of the backtrace printing process, which fails as the object maps are
not fully initialized. See the below `gdb` stacktrace for details:
```
$ gdb -batch -ex 'r' -ex 'bt' --args ./julia -C invalid
...
fatal: error thrown and no exception handler available.
ErrorException("Invalid CPU name "invalid".")
Thread 1 "julia" received signal SIGSEGV, Segmentation fault.
0xf75bd665 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo>, std::_Select1st<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> >, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__k=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h:1277
1277 /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h: No such file or directory.
#0 0xf75bd665 in std::_Rb_tree<unsigned int, std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo>, std::_Select1st<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> >, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__k=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_tree.h:1277
#1 std::map<unsigned int, JITDebugInfoRegistry::ObjectInfo, std::greater<unsigned int>, std::allocator<std::pair<unsigned int const, JITDebugInfoRegistry::ObjectInfo> > >::lower_bound (__x=<optimized out>, this=0x248) at /usr/local/i686-linux-gnu/include/c++/9.1.0/bits/stl_map.h:1258
#2 jl_DI_for_fptr (fptr=4155049385, symsize=symsize@entry=0xffffcfa8, slide=slide@entry=0xffffcfa0, Section=Section@entry=0xffffcfb8, context=context@entry=0xffffcf94) at /cache/build/default-amdci5-4/julialang/julia-master/src/debuginfo.cpp:1181
#3 0xf75c056a in jl_getFunctionInfo_impl (frames_out=0xffffd03c, pointer=4155049385, skipC=0, noInline=0) at /cache/build/default-amdci5-4/julialang/julia-master/src/debuginfo.cpp:1210
#4 0xf7a6ca98 in jl_print_native_codeloc (ip=4155049385) at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:636
#5 0xf7a6cd54 in jl_print_bt_entry_codeloc (bt_entry=0xf0798018) at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:657
#6 jlbacktrace () at /cache/build/default-amdci5-4/julialang/julia-master/src/stackwalk.c:1090
#7 0xf7a3cd2b in ijl_no_exc_handler (e=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:605
#8 0xf7a3d10a in throw_internal (ct=ct@entry=0xf070c010, exception=<optimized out>, exception@entry=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:638
#9 0xf7a3d330 in ijl_throw (e=0xf0794010) at /cache/build/default-amdci5-4/julialang/julia-master/src/task.c:654
#10 0xf7a905aa in ijl_errorf (fmt=fmt@entry=0xf7647cd4 "Invalid CPU name \"%s\".") at /cache/build/default-amdci5-4/julialang/julia-master/src/rtutils.c:77
#11 0xf75a4b22 in (anonymous namespace)::createTargetMachine () at /cache/build/default-amdci5-4/julialang/julia-master/src/jitlayers.cpp:823
#12 JuliaOJIT::JuliaOJIT (this=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/jitlayers.cpp:1044
#13 0xf7531793 in jl_init_llvm () at /cache/build/default-amdci5-4/julialang/julia-master/src/codegen.cpp:8585
#14 0xf75318a8 in jl_init_codegen_impl () at /cache/build/default-amdci5-4/julialang/julia-master/src/codegen.cpp:8648
#15 0xf7a51a52 in jl_restore_system_image_from_stream (f=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2131
#16 0xf7a55c03 in ijl_restore_system_image_data (buf=0xe859c1c0 <jl_system_image_data> "8'\031\003", len=125161105) at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2184
#17 0xf7a55cf9 in jl_load_sysimg_so () at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:424
#18 ijl_restore_system_image (fname=0x80a0900 "/build/bk_download/julia-d78fdad601/lib/julia/sys.so") at /cache/build/default-amdci5-4/julialang/julia-master/src/staticdata.c:2157
#19 0xf7a3bdfc in _finish_julia_init (rel=rel@entry=JL_IMAGE_JULIA_HOME, ct=<optimized out>, ptls=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/init.c:741
#20 0xf7a3c8ac in julia_init (rel=<optimized out>) at /cache/build/default-amdci5-4/julialang/julia-master/src/init.c:728
#21 0xf7a7f61d in jl_repl_entrypoint (argc=<optimized out>, argv=0xffffddf4) at /cache/build/default-amdci5-4/julialang/julia-master/src/jlapi.c:705
#22 0x080490a7 in main (argc=3, argv=0xffffddf4) at /cache/build/default-amdci5-4/julialang/julia-master/cli/loader_exe.c:59
```
To prevent this, we simply avoid calling `jl_errorf` this early in the
process, punting the problem to a later PR that can update guard
conditions within `jl_error*`.
mbauman
pushed a commit
that referenced
this pull request
Dec 11, 2023
This is part of the work to address JuliaLang#51352 by attempting to allow the compiler to perform SRAO on persistent data structures like `PersistentDict` as if they were regular immutable data structures. These sorts of data structures have very complicated internals (with lots of mutation, memory sharing, etc.), but a relatively simple interface. As such, it is unlikely that our compiler will have sufficient power to optimize this interface by analyzing the implementation. We thus need to come up with some other mechanism that gives the compiler license to perform the requisite optimization. One way would be to just hardcode `PersistentDict` into the compiler, optimizing it like any of the other builtin datatypes. However, this is of course very unsatisfying. At the other end of the spectrum would be something like a generic rewrite rule system (e-graphs anyone?) that would let the PersistentDict implementation declare its interface to the compiler and the compiler would use this for optimization (in a perfect world, the actual rewrite would then be checked using some sort of formal methods). I think that would be interesting, but we're very far from even being able to design something like that (at least in Base - experiments with external AbstractInterpreters in this direction are encouraged). This PR tries to come up with a reasonable middle ground, where the compiler gets some knowledge of the protocol hardcoded without having to know about the implementation details of the data structure. The basic ideas is that `Core` provides some magic generic functions that implementations can extend. Semantically, they are not special. They dispatch as usual, and implementations are expected to work properly even in the absence of any compiler optimizations. However, the compiler is semantically permitted to perform structural optimization using these magic generic functions. In the concrete case, this PR introduces the `KeyValue` interface which consists of two generic functions, `get` and `set`. The core optimization is that the compiler is allowed to rewrite any occurrence of `get(set(x, k, v), k)` into `v` without additional legality checks. In particular, the compiler performs no type checks, conversions, etc. The higher level implementation code is expected to do all that. This approach closely matches the general direction we've been taking in external AbstractInterpreters for embedding additional semantics and optimization opportunities into Julia code (although we generally use methods there, rather than full generic functions), so I think we have some evidence that this sort of approach works reasonably well. Nevertheless, this is certainly an experiment and the interface is explicitly declared unstable. ## Current Status This is fully working and implemented, but the optimization currently bails on anything but the simplest cases. Filling all those cases in is not particularly hard, but should be done along with a more invasive refactoring of SROA, so we should figure out the general direction here first and then we can finish all that up in a follow-up cleanup. ## Obligatory benchmark Before: ``` julia> using BenchmarkTools julia> function foo() a = Base.PersistentDict(:a => 1) return a[:a] end foo (generic function with 1 method) julia> @benchmark foo() BenchmarkTools.Trial: 10000 samples with 993 evaluations. Range (min … max): 32.940 ns … 28.754 μs ┊ GC (min … max): 0.00% … 99.76% Time (median): 49.647 ns ┊ GC (median): 0.00% Time (mean ± σ): 57.519 ns ± 333.275 ns ┊ GC (mean ± σ): 10.81% ± 2.22% ▃█▅ ▁▃▅▅▃▁ ▁▃▂ ▂ ▁▂▄▃▅▇███▇▃▁▂▁▁▁▁▁▁▁▁▂▂▅██████▅▂▁▁▁▁▁▁▁▁▁▁▂▃▃▇███▇▆███▆▄▃▃▂▂ ▃ 32.9 ns Histogram: frequency by time 68.6 ns < Memory estimate: 128 bytes, allocs estimate: 4. julia> @code_typed foo() CodeInfo( 1 ─ %1 = invoke Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}(Base.HashArrayMappedTries.undef::UndefInitializer, 1::Int64)::Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}} │ %2 = %new(Base.HashArrayMappedTries.HAMT{Symbol, Int64}, %1, 0x00000000)::Base.HashArrayMappedTries.HAMT{Symbol, Int64} │ %3 = %new(Base.HashArrayMappedTries.Leaf{Symbol, Int64}, :a, 1)::Base.HashArrayMappedTries.Leaf{Symbol, Int64} │ %4 = Base.getfield(%2, :data)::Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}} │ %5 = $(Expr(:boundscheck, true))::Bool └── goto #5 if not %5 2 ─ %7 = Base.sub_int(1, 1)::Int64 │ %8 = Base.bitcast(UInt64, %7)::UInt64 │ %9 = Base.getfield(%4, :size)::Tuple{Int64} │ %10 = $(Expr(:boundscheck, true))::Bool │ %11 = Base.getfield(%9, 1, %10)::Int64 │ %12 = Base.bitcast(UInt64, %11)::UInt64 │ %13 = Base.ult_int(%8, %12)::Bool └── goto #4 if not %13 3 ─ goto #5 4 ─ %16 = Core.tuple(1)::Tuple{Int64} │ invoke Base.throw_boundserror(%4::Vector{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}}, %16::Tuple{Int64})::Union{} └── unreachable 5 ┄ %19 = Base.getfield(%4, :ref)::MemoryRef{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}} │ %20 = Base.memoryref(%19, 1, false)::MemoryRef{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}} │ Base.memoryrefset!(%20, %3, :not_atomic, false)::MemoryRef{Union{Base.HashArrayMappedTries.HAMT{Symbol, Int64}, Base.HashArrayMappedTries.Leaf{Symbol, Int64}}} └── goto #6 6 ─ %23 = Base.getfield(%2, :bitmap)::UInt32 │ %24 = Base.or_int(%23, 0x00010000)::UInt32 │ Base.setfield!(%2, :bitmap, %24)::UInt32 └── goto #7 7 ─ %27 = %new(Base.PersistentDict{Symbol, Int64}, %2)::Base.PersistentDict{Symbol, Int64} └── goto #8 8 ─ %29 = invoke Base.getindex(%27::Base.PersistentDict{Symbol, Int64},🅰️ :Symbol)::Int64 └── return %29 ``` After: ``` julia> using BenchmarkTools julia> function foo() a = Base.PersistentDict(:a => 1) return a[:a] end foo (generic function with 1 method) julia> @benchmark foo() BenchmarkTools.Trial: 10000 samples with 1000 evaluations. Range (min … max): 2.459 ns … 11.320 ns ┊ GC (min … max): 0.00% … 0.00% Time (median): 2.460 ns ┊ GC (median): 0.00% Time (mean ± σ): 2.469 ns ± 0.183 ns ┊ GC (mean ± σ): 0.00% ± 0.00% ▂ █ ▁ █ ▂ █▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁█ █ 2.46 ns Histogram: log(frequency) by time 2.47 ns < Memory estimate: 0 bytes, allocs estimate: 0. julia> @code_typed foo() CodeInfo( 1 ─ return 1 ```
N5N3
pushed a commit
that referenced
this pull request
Apr 22, 2024
Followup to JuliaLang#53833 Fixes a failure seen in JuliaLang#53974 (below) I believe this is the more correct check to make? The heapsnapshot generated from this PR is viewable in vscode. ``` 2024-04-06 09:33:58 EDT From worker 7: ERROR: Base.InvalidCharError{Char}('\xc1\xae') 2024-04-06 09:33:58 EDT From worker 7: Stacktrace: 2024-04-06 09:33:58 EDT From worker 7: [1] throw_invalid_char(c::Char) 2024-04-06 09:33:58 EDT From worker 7: @ Base ./char.jl:86 2024-04-06 09:33:58 EDT From worker 7: [2] UInt32 2024-04-06 09:33:58 EDT From worker 7: @ ./char.jl:133 [inlined] 2024-04-06 09:33:58 EDT From worker 7: [3] category_code 2024-04-06 09:33:58 EDT From worker 7: @ ./strings/unicode.jl:339 [inlined] 2024-04-06 09:33:58 EDT From worker 7: [4] isassigned 2024-04-06 09:33:58 EDT From worker 7: @ ./strings/unicode.jl:355 [inlined] 2024-04-06 09:33:58 EDT From worker 7: [5] isassigned 2024-04-06 09:33:58 EDT From worker 7: @ /cache/build/tester-amdci5-14/julialang/julia-master/julia-41d026beaf/share/julia/stdlib/v1.12/Unicode/src/Unicode.jl:138 [inlined] 2024-04-06 09:33:58 EDT From worker 7: [6] print_str_escape_json(stream::IOStream, s::String) 2024-04-06 09:33:58 EDT From worker 7: @ Profile.HeapSnapshot /cache/build/tester-amdci5-14/julialang/julia-master/julia-41d026beaf/share/julia/stdlib/v1.12/Profile/src/heapsnapshot_reassemble.jl:239 2024-04-06 09:33:59 EDT From worker 7: [7] (::Profile.HeapSnapshot.var"#5#6"{IOStream})(strings_io::IOStream) 2024-04-06 09:33:59 EDT From worker 7: @ Profile.HeapSnapshot /cache/build/tester-amdci5-14/julialang/julia-master/julia-41d026beaf/share/julia/stdlib/v1.12/Profile/src/heapsnapshot_reassemble.jl:192 ```
N5N3
pushed a commit
that referenced
this pull request
Apr 22, 2024
…ce. (JuliaLang#54113) The former also handles vectors of pointers, which can occur after vectorization: ``` #5 0x00007f5bfe94de5e in llvm::cast<llvm::PointerType, llvm::Type> (Val=<optimized out>) at llvm/Support/Casting.h:578 578 assert(isa<To>(Val) && "cast<Ty>() argument of incompatible type!"); (rr) up #6 GCInvariantVerifier::visitAddrSpaceCastInst (this=this@entry=0x7ffd022fbf56, I=...) at julia/src/llvm-gc-invariant-verifier.cpp:66 66 unsigned ToAS = cast<PointerType>(I.getDestTy())->getAddressSpace(); (rr) call I.dump() %23 = addrspacecast <4 x ptr addrspace(10)> %wide.load to <4 x ptr addrspace(11)>, !dbg !43 ``` Fixes aborts seen in JuliaLang#53070
N5N3
pushed a commit
that referenced
this pull request
Aug 31, 2024
…aLang#55600) As an application of JuliaLang#55545, this commit avoids the insertion of `:throw_undef_if_not` nodes when the defined-ness of a slot is guaranteed by abstract interpretation. ```julia julia> function isdefined_nothrow(c, x) local val if c val = x end if @isdefined val return val end return zero(Int) end; julia> @code_typed isdefined_nothrow(true, 42) ``` ```diff diff --git a/old b/new index c4980a5c9c..3d1d6d30f0 100644 --- a/old +++ b/new @@ -4,7 +4,6 @@ CodeInfo( 3 ┄ %3 = φ (#2 => x, #1 => #undef)::Int64 │ %4 = φ (#2 => true, #1 => false)::Bool └── goto #5 if not %4 -4 ─ $(Expr(:throw_undef_if_not, :val, :(%4)))::Any -└── return %3 +4 ─ return %3 5 ─ return 0 ) => Int64 ```
N5N3
pushed a commit
that referenced
this pull request
Oct 4, 2024
…JuliaLang#55803) This slightly improves our (LLVM) codegen for `Core.throw_methoderror` and `Core.current_scope` ```julia julia> foo() = Core.current_scope() julia> bar() = Core.throw_methoderror(+, nothing) ``` Before: ```llvm ; Function Signature: foo() define nonnull ptr @julia_foo_2488() #0 { top: %0 = call ptr @jl_get_builtin_fptr(ptr nonnull @"+Core.#current_scope#2491.jit") %Builtin_ret = call nonnull ptr %0(ptr nonnull @"jl_global#2492.jit", ptr null, i32 0) ret ptr %Builtin_ret } ; Function Signature: bar() define void @julia_bar_589() #0 { top: %jlcallframe1 = alloca [2 x ptr], align 8 %0 = call ptr @jl_get_builtin_fptr(ptr nonnull @"+Core.#throw_methoderror#591.jit") %jl_nothing = load ptr, ptr @jl_nothing, align 8 store ptr @"jl_global#593.jit", ptr %jlcallframe1, align 8 %1 = getelementptr inbounds ptr, ptr %jlcallframe1, i64 1 store ptr %jl_nothing, ptr %1, align 8 %Builtin_ret = call nonnull ptr %0(ptr nonnull @"jl_global#592.jit", ptr nonnull %jlcallframe1, i32 2) call void @llvm.trap() unreachable } ``` After: ```llvm ; Function Signature: foo() define nonnull ptr @julia_foo_713() #0 { top: %thread_ptr = call ptr asm "movq %fs:0, $0", "=r"() #5 %tls_ppgcstack = getelementptr inbounds i8, ptr %thread_ptr, i64 -8 %tls_pgcstack = load ptr, ptr %tls_ppgcstack, align 8 %current_scope = getelementptr inbounds i8, ptr %tls_pgcstack, i64 -72 %0 = load ptr, ptr %current_scope, align 8 ret ptr %0 } ; Function Signature: bar() define void @julia_bar_1581() #0 { top: %jlcallframe1 = alloca [2 x ptr], align 8 %jl_nothing = load ptr, ptr @jl_nothing, align 8 store ptr @"jl_global#1583.jit", ptr %jlcallframe1, align 8 %0 = getelementptr inbounds ptr, ptr %jlcallframe1, i64 1 store ptr %jl_nothing, ptr %0, align 8 %jl_f_throw_methoderror_ret = call nonnull ptr @jl_f_throw_methoderror(ptr null, ptr nonnull %jlcallframe1, i32 2) call void @llvm.trap() unreachable } ```
N5N3
pushed a commit
that referenced
this pull request
Oct 23, 2024
Prior to this, especially on macOS, the gc-safepoint here would cause the process to segfault as we had already freed the current_task state. Rearrange this code so that the GC interactions (except for the atomic store to current_task) are all handled before entering GC safe, and then signaling the thread is deleted (via setting current_task = NULL, published by jl_unlock_profile_wr to other threads) is last. ``` ERROR: Exception handler triggered on unmanaged thread. Process 53827 stopped * thread #5, stop reason = EXC_BAD_ACCESS (code=2, address=0x100018008) frame #0: 0x0000000100b74344 libjulia-internal.1.12.0.dylib`jl_delete_thread [inlined] jl_gc_state_set(ptls=0x000000011f8b3200, state='\x02', old_state=<unavailable>) at julia_threads.h:272:9 [opt] 269 assert(old_state != JL_GC_CONCURRENT_COLLECTOR_THREAD); 270 jl_atomic_store_release(&ptls->gc_state, state); 271 if (state == JL_GC_STATE_UNSAFE || old_state == JL_GC_STATE_UNSAFE) -> 272 jl_gc_safepoint_(ptls); 273 return old_state; 274 } 275 STATIC_INLINE int8_t jl_gc_state_save_and_set(jl_ptls_t ptls, Target 0: (julia) stopped. (lldb) up frame #1: 0x0000000100b74320 libjulia-internal.1.12.0.dylib`jl_delete_thread [inlined] jl_gc_state_save_and_set(ptls=0x000000011f8b3200, state='\x02') at julia_threads.h:278:12 [opt] 275 STATIC_INLINE int8_t jl_gc_state_save_and_set(jl_ptls_t ptls, 276 int8_t state) 277 { -> 278 return jl_gc_state_set(ptls, state, jl_atomic_load_relaxed(&ptls->gc_state)); 279 } 280 #ifdef __clang_gcanalyzer__ 281 // these might not be a safepoint (if they are no-op safe=>safe transitions), but we have to assume it could be (statically) (lldb) frame #2: 0x0000000100b7431c libjulia-internal.1.12.0.dylib`jl_delete_thread(value=0x000000011f8b3200) at threading.c:537:11 [opt] 534 ptls->root_task = NULL; 535 jl_free_thread_gc_state(ptls); 536 // then park in safe-region -> 537 (void)jl_gc_safe_enter(ptls); 538 } ``` (test incorporated into JuliaLang#55793)
N5N3
pushed a commit
that referenced
this pull request
Oct 23, 2024
E.g. this allows `finalizer` inlining in the following case:
```julia
mutable struct ForeignBuffer{T}
const ptr::Ptr{T}
end
const foreign_buffer_finalized = Ref(false)
function foreign_alloc(::Type{T}, length) where T
ptr = Libc.malloc(sizeof(T) * length)
ptr = Base.unsafe_convert(Ptr{T}, ptr)
obj = ForeignBuffer{T}(ptr)
return finalizer(obj) do obj
Base.@assume_effects :notaskstate :nothrow
foreign_buffer_finalized[] = true
Libc.free(obj.ptr)
end
end
function f_EA_finalizer(N::Int)
workspace = foreign_alloc(Float64, N)
GC.@preserve workspace begin
(;ptr) = workspace
Base.@assume_effects :nothrow @noinline println(devnull, "ptr = ", ptr)
end
end
```
```julia
julia> @code_typed f_EA_finalizer(42)
CodeInfo(
1 ── %1 = Base.mul_int(8, N)::Int64
│ %2 = Core.lshr_int(%1, 63)::Int64
│ %3 = Core.trunc_int(Core.UInt8, %2)::UInt8
│ %4 = Core.eq_int(%3, 0x01)::Bool
└─── goto #3 if not %4
2 ── invoke Core.throw_inexacterror(:convert::Symbol, UInt64::Type, %1::Int64)::Union{}
└─── unreachable
3 ── goto #4
4 ── %9 = Core.bitcast(Core.UInt64, %1)::UInt64
└─── goto #5
5 ── goto #6
6 ── goto #7
7 ── goto #8
8 ── %14 = $(Expr(:foreigncall, :(:malloc), Ptr{Nothing}, svec(UInt64), 0, :(:ccall), :(%9), :(%9)))::Ptr{Nothing}
└─── goto #9
9 ── %16 = Base.bitcast(Ptr{Float64}, %14)::Ptr{Float64}
│ %17 = %new(ForeignBuffer{Float64}, %16)::ForeignBuffer{Float64}
└─── goto #10
10 ─ %19 = $(Expr(:gc_preserve_begin, :(%17)))
│ %20 = Base.getfield(%17, :ptr)::Ptr{Float64}
│ invoke Main.println(Main.devnull::Base.DevNull, "ptr = "::String, %20::Ptr{Float64})::Nothing
│ $(Expr(:gc_preserve_end, :(%19)))
│ %23 = Main.foreign_buffer_finalized::Base.RefValue{Bool}
│ Base.setfield!(%23, :x, true)::Bool
│ %25 = Base.getfield(%17, :ptr)::Ptr{Float64}
│ %26 = Base.bitcast(Ptr{Nothing}, %25)::Ptr{Nothing}
│ $(Expr(:foreigncall, :(:free), Nothing, svec(Ptr{Nothing}), 0, :(:ccall), :(%26), :(%25)))::Nothing
└─── return nothing
) => Nothing
```
However, this is still a WIP. Before merging, I want to improve EA's
precision a bit and at least fix the test case that is currently marked
as `broken`. I also need to check its impact on compiler performance.
Additionally, I believe this feature is not yet practical. In
particular, there is still significant room for improvement in the
following areas:
- EA's interprocedural capabilities: currently EA is performed ad-hoc
for limited frames because of latency reasons, which significantly
reduces its precision in the presence of interprocedural calls.
- Relaxing the `:nothrow` check for finalizer inlining: the current
algorithm requires `:nothrow`-ness on all paths from the allocation of
the mutable struct to its last use, which is not practical for
real-world cases. Even when `:nothrow` cannot be guaranteed, auxiliary
optimizations such as inserting a `finalize` call after the last use
might still be possible (JuliaLang#55990).
N5N3
pushed a commit
that referenced
this pull request
Apr 6, 2025
If a MethodError arises on a anonyomous function, the words "anonymous function" are printed in the error like so: ```julia g=(x,y)->x+y g(1,2,3) ``` ``` ERROR: MethodError: no method of the anonymous function var"#5#6" matching (::var"#5#6")(::Int64, ::Int64, ::Int64) The function `#5` exists, but no method is defined for this combination of argument types. Closest candidates are: (::var"#5#6")(::Any, ::Any) @ Main REPL[4]:1 ``` See the [original pull request](JuliaLang#57319) and [issue](JuliaLang#56325) JuliaLang#56325
N5N3
pushed a commit
that referenced
this pull request
Apr 6, 2025
Fixes https://buildkite.com/julialang/julia-master/builds/46446#0195f712-1844-4e81-8b16-27b953fedcd3/899-1778 ``` | Error in testset Profile: | Test Failed at /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/stdlib/v1.13/Profile/test/runtests.jl:231 | Expression: occursin("@Compiler" * slash, str) | Evaluated: occursin("@Compiler/", "Overhead ╎ [+additional indent] Count File:Line Function\n=========================================================\n ╎9 @juliasrc/task.c:1249 start_task\n ╎ 9 @juliasrc/julia.h:2353 jl_apply\n ╎ 9 @juliasrc/gf.c:3693 ijl_apply_generic\n ╎ 9 @juliasrc/gf.c:3493 _jl_invoke\n ╎ 9 [unknown stackframe]\n ╎ 9 @Distributed/src/process_messages.jl:287 (::Distributed.var\"#handle_msg##2#handle_msg##3\"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()\n ╎ ╎ 9 @Distributed/src/process_messages.jl:70 run_work_thunk(thunk::Distributed.var\"#handle_msg##4#handle_msg##5\"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)\n ╎ ╎ 9 @Distributed/src/process_messages.jl:287 (::Distributed.var\"#handle_msg##4#handle_msg##5\"{Distributed.CallMsg{:call_fetch}})()\n ╎ ╎ 9 @juliasrc/builtins.c:841 jl_f__apply_iterate\n ╎ ╎ 9 @juliasrc/julia.h:2353 jl_apply\n ╎ ╎ 9 @juliasrc/gf.c:3693 ijl_apply_generic\n ╎ ╎ ╎ 9 @juliasrc/gf.c:3493 _jl_invoke\n ╎ ╎ ╎ 9 @Base/Base_compiler.jl:223 kwcall(::@NamedTuple{seed::UInt128}, ::typeof(invokelatest), ::Function, ::String, ::Vararg{String})\n ╎ ╎ ╎ 9 @juliasrc/builtins.c:841 jl_f__apply_iterate\n ╎ ╎ ╎ 9 @juliasrc/julia.h:2353 jl_apply\n ╎ ╎ ╎ 9 @juliasrc/gf.c:3693 ijl_apply_generic\n ╎ ╎ ╎ ╎ 9 @juliasrc/gf.c:3493 _jl_invoke\n ╎ ╎ ╎ ╎ 9 @juliasrc/builtins.c:853 jl_f_invokelatest\n ╎ ╎ ╎ ╎ 9 @juliasrc/julia.h:2353 jl_apply\n ╎ ╎ ╎ ╎ 9 @juliasrc/gf.c:3693 ijl_apply_generic\n ╎ ╎ ╎ ╎ 9 @juliasrc/gf.c:3493 _jl_invoke\n ╎ ╎ ╎ ╎ ╎ 9 [unknown stackframe]\n ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:7 kwcall(::@NamedTuple{seed::UInt128}, ::typeof(runtests), name::String, path::String)\n ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:7 runtests\n ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:13 runtests(name::String, path::String, isolate::Bool; seed::UInt128)\n ╎ ╎ ╎ ╎ ╎ 9 @Base/env.jl:265 withenv(f::var\"#4#5\"{UInt128, String, String, Bool, Bool}, keyvals::Pair{String, Bool})\n ╎ ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:27 (::var\"#4#5\"{UInt128, String, String, Bool, Bool})()\n ╎ ╎ ╎ ╎ ╎ ╎ 9 @Base/timing.jl:621 macro expansion\n ╎ ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:29 macro expansion\n ╎ ╎ ╎ ╎ ╎ ╎ 9 @Test/src/Test.jl:1835 macro expansion\n ╎ ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:37 macro expansion\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @Base/Base.jl:303 include(mod::Module, _path::String)\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @Base/loading.jl:2925 _include(mapexpr::Function, mod::Module, _path::String)\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/gf.c:3693 ijl_apply_generic\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/gf.c:3493 _jl_invoke\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @Base/loading.jl:2865 include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String)\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @Base/boot.jl:489 eval(m::Module, e::Any)\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/toplevel.c:1095 ijl_toplevel_eval_in\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/toplevel.c:1050 ijl_toplevel_eval\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/toplevel.c:978 jl_toplevel_eval_flex\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/toplevel.c:1038 jl_toplevel_eval_flex\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:897 jl_interpret_toplevel_thunk\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:557 eval_body\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:557 eval_body\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:557 eval_body\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:692 eval_body\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:193 eval_stmt_value\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:242 eval_value\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:124 do_call\n ╎ ╎ ╎ ╎ ... ```
adienes
pushed a commit
that referenced
this pull request
Sep 9, 2025
Use an atomic fetch and add to fix a data race in `Module()` identified by tsan: ``` ./usr/bin/julia -t4,0 --gcthreads=1 -e 'Threads.@threads for i=1:100 Module() end' ================== WARNING: ThreadSanitizer: data race (pid=5575) Write of size 4 at 0xffff9bf9bd28 by thread T9: #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4) #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4) #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968) #3 <null> <null> (0xffff76a21164) #4 <null> <null> (0xffff76a1f074) #5 <null> <null> (0xffff76a1f0c4) #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04) #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04) #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4) #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4) Previous write of size 4 at 0xffff9bf9bd28 by thread T10: #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4) #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4) #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968) #3 <null> <null> (0xffff76a21164) #4 <null> <null> (0xffff76a1f074) #5 <null> <null> (0xffff76a1f0c4) #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04) #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04) #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4) #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4) Location is global 'jl_new_module__.mcounter' of size 4 at 0xffff9bf9bd28 (libjulia-internal.so.1.13+0x3dbd28) ```
adienes
pushed a commit
that referenced
this pull request
Sep 9, 2025
Simplify `workqueue_for`. While not strictly necessary, the acquire load
in `getindex(once::OncePerThread{T,F}, tid::Integer)` makes
ThreadSanitizer happy. With the existing implementation, we get false
positives whenever a thread other than the one that originally allocated
the array reads it:
```
==================
WARNING: ThreadSanitizer: data race (pid=6819)
Atomic read of size 8 at 0xffff86bec058 by main thread:
#0 getproperty Base_compiler.jl:57 (sys.so+0x113b478)
#1 julia_pushNOT._1925 task.jl:868 (sys.so+0x113b478)
#2 julia_enq_work_1896 task.jl:969 (sys.so+0x5cd218)
#3 schedule task.jl:983 (sys.so+0x892294)
#4 macro expansion threadingconstructs.jl:522 (sys.so+0x892294)
#5 julia_start_profile_listener_60681 Base.jl:355 (sys.so+0x892294)
#6 julia___init___60641 Base.jl:392 (sys.so+0x1178dc)
#7 jfptr___init___60642 <null> (sys.so+0x118134)
#8 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
#9 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
#10 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0xbba74)
#11 jl_module_run_initializer /home/user/c/julia/src/toplevel.c:68:13 (libjulia-internal.so.1.13+0xbba74)
#12 _finish_jl_init_ /home/user/c/julia/src/init.c:632:13 (libjulia-internal.so.1.13+0x9c0fc)
#13 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
#14 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
#15 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
#16 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)
Previous write of size 8 at 0xffff86bec058 by thread T2:
#0 IntrusiveLinkedListSynchronized task.jl:863 (sys.so+0x78d220)
#1 macro expansion task.jl:932 (sys.so+0x78d220)
#2 macro expansion lock.jl:376 (sys.so+0x78d220)
#3 julia_workqueue_for_1933 task.jl:924 (sys.so+0x78d220)
#4 julia_wait_2048 task.jl:1204 (sys.so+0x6255ac)
#5 julia_task_done_hook_49205 task.jl:839 (sys.so+0x128fdc0)
#6 jfptr_task_done_hook_49206 <null> (sys.so+0x902218)
#7 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
#8 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
#9 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9c79c)
#10 jl_finish_task /home/user/c/julia/src/task.c:345:13 (libjulia-internal.so.1.13+0x9c79c)
#11 jl_threadfun /home/user/c/julia/src/scheduler.c:122:5 (libjulia-internal.so.1.13+0xe7db8)
Thread T2 (tid=6824, running) created by main thread at:
#0 pthread_create <null> (julia+0x85f88)
#1 uv_thread_create_ex /workspace/srcdir/libuv/src/unix/thread.c:172 (libjulia-internal.so.1.13+0x1a8d70)
#2 _finish_jl_init_ /home/user/c/julia/src/init.c:618:5 (libjulia-internal.so.1.13+0x9c010)
#3 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
#4 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
#5 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
#6 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)
SUMMARY: ThreadSanitizer: data race Base_compiler.jl:57 in getproperty
==================
```
N5N3
pushed a commit
that referenced
this pull request
Feb 14, 2026
This still needs some cleanup, but after a lot of discussion and some experimentation it dawned on me that if we always initialize after late gc lowering runs then there's no need for tricks to hide this from LLVM. There are some caveats (that exist today so this is not a regression). Codegen currently always emits stores of GC tracked fields as release atomics, but doesn't do that for any of it's null initialization. For reasons of LLVM never touching atomics, that stops all optimizations of the null pointer stores. This causes things like ```llvm %"new::RefValue" = call noalias nonnull align 8 dereferenceable(16) ptr @ijl_gc_small_alloc(ptr %ptls_load3, i32 360, i32 16, i64 140078842429264) #5, !dbg !25 %"new::RefValue.tag_addr" = getelementptr inbounds i8, ptr %"new::RefValue", i64 -8, !dbg !25 store atomic i64 140078842429264, ptr %"new::RefValue.tag_addr" unordered, align 8, !dbg !25, !tbaa !30 store ptr null, ptr %"new::RefValue", align 8, !dbg !25, !tbaa !33, !alias.scope !36, !noalias !39 store atomic ptr %"x::Array", ptr %"new::RefValue" release, align 8, !dbg !25, !tbaa !33, !alias.scope !36, !noalias !39 ret ptr %"new::RefValue", !dbg !25 ``` for something like ` @code_llvm raw=true dump_module=true Ref([])` AI PART Move GC pointer field zeroing from codegen to late-gc-lowering to prevent optimization passes from sinking the null stores past safepoints. This ensures the GC never sees uninitialized pointer values. The implementation uses LLVM operand bundles on the gc_alloc_obj call: - `julia.gc_alloc_ptr_offsets(i64 off1, i64 off2, ...)`: specifies byte offsets of GC pointer fields that need null initialization - `julia.gc_alloc_zeroinit(i64 offset, i64 size)`: specifies a contiguous region to zero (for GenericMemory with boxed elements) Late-gc-lowering processes these bundles after lowering the allocation to gc_alloc_bytes and emits the appropriate null stores or memset calls. Since this happens after optimization passes, the zeroing cannot be incorrectly sunk past safepoints. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
N5N3
pushed a commit
that referenced
this pull request
Apr 5, 2026
# Overview This PR overhauls the way linking works in Julia, both in the JIT and AOT. The point is to enable us to generate LLVM IR that depends only on the source IR, eliminating both nondeterminism and statefulness. This serves two purposes. First, if the IR is predictable, we can cache compile objects using the bitcode hash as a key, like how the ThinLTO cache works. JuliaLang#58592 was an early experiment along these lines. Second, we can reuse work that was done in a previous session, like pkgimages, but for the JIT. We accomplish this by generating names that are unique only within the current LLVM module, removing most uses of the `globalUniqueGeneratedNames` counter. The replacement for `jl_codegen_params_t`, `jl_codegen_output_t`, represents a Julia "translation unit", and tracks the information we'll need to link the compiled module into the running session. When linking, we manipulate the JITLink [LinkGraph](https://llvm.org/docs/JITLink.html#linkgraph) (after compilation) instead of renaming functions in the LLVM IR (before). ## Example ``` julia> @noinline foo(x) = x + 2.0 baz(x) = foo(foo(x)) code_llvm(baz, (Int64,); dump_module=true, optimize=false) ``` Nightly: ```llvm [...] @"+Core.Float64#774" = private unnamed_addr constant ptr @"+Core.Float64#774.jit" @"+Core.Float64#774.jit" = private alias ptr, inttoptr (i64 4797624416 to ptr) ; Function Signature: baz(Int64) ; @ REPL[1]:2 within `baz` define double @julia_baz_772(i64 signext %"x::Int64") #0 { top: %pgcstack = call ptr @julia.get_pgcstack() %0 = call double @j_foo_775(i64 signext %"x::Int64") %1 = call double @j_foo_776(double %0) ret double %1 } ; Function Attrs: noinline optnone define nonnull ptr @jfptr_baz_773(ptr %"function::Core.Function", ptr noalias nocapture noundef readonly %"args::Any[]", i32 %"nargs::UInt32") #1 { top: %pgcstack = call ptr @julia.get_pgcstack() %0 = getelementptr inbounds i8, ptr %"args::Any[]", i32 0 %1 = load ptr, ptr %0, align 8 %.unbox = load i64, ptr %1, align 8 %2 = call double @julia_baz_772(i64 signext %.unbox) %"+Core.Float64#774" = load ptr, ptr @"+Core.Float64#774", align 8 %Float64 = ptrtoint ptr %"+Core.Float64#774" to i64 %3 = inttoptr i64 %Float64 to ptr %current_task = getelementptr inbounds i8, ptr %pgcstack, i32 -152 %"box::Float64" = call noalias nonnull align 8 dereferenceable(8) ptr @julia.gc_alloc_obj(ptr %current_task, i64 8, ptr %3) #5 store double %2, ptr %"box::Float64", align 8 ret ptr %"box::Float64" } [...] ``` Diff after this PR. Notice how each symbol gets the lowest possible integer suffix that will make it unique to the module, and how the two specializations for `foo` get different names: ```diff @@ -4,18 +4,18 @@ target triple = "arm64-apple-darwin24.6.0" -@"+Core.Float64#774" = external global ptr +@"+Core.Float64#_0" = external global ptr ; Function Signature: baz(Int64) ; @ REPL[1]:2 within `baz` -define double @julia_baz_772(i64 signext %"x::Int64") #0 { +define double @julia_baz_0(i64 signext %"x::Int64") #0 { top: %pgcstack = call ptr @julia.get_pgcstack() - %0 = call double @j_foo_775(i64 signext %"x::Int64") - %1 = call double @j_foo_776(double %0) + %0 = call double @j_foo_0(i64 signext %"x::Int64") + %1 = call double @j_foo_1(double %0) ret double %1 } ; Function Attrs: noinline optnone -define nonnull ptr @jfptr_baz_773(ptr %"function::Core.Function", ptr noalias nocapture noundef readonly %"args::Any[]", i32 %"nargs::UInt32") #1 { +define nonnull ptr @jfptr_baz_0(ptr %"function::Core.Function", ptr noalias nocapture noundef readonly %"args::Any[]", i32 %"nargs::UInt32") #1 { top: %pgcstack = call ptr @julia.get_pgcstack() @@ -23,7 +23,7 @@ %1 = load ptr, ptr %0, align 8 %.unbox = load i64, ptr %1, align 8 - %2 = call double @julia_baz_772(i64 signext %.unbox) - %"+Core.Float64#774" = load ptr, ptr @"+Core.Float64#774", align 8 - %Float64 = ptrtoint ptr %"+Core.Float64#774" to i64 + %2 = call double @julia_baz_0(i64 signext %.unbox) + %"+Core.Float64#_0" = load ptr, ptr @"+Core.Float64#_0", align 8 + %Float64 = ptrtoint ptr %"+Core.Float64#_0" to i64 %3 = inttoptr i64 %Float64 to ptr %current_task = getelementptr inbounds i8, ptr %pgcstack, i32 -152 @@ -39,8 +39,8 @@ ; Function Signature: foo(Int64) -declare double @j_foo_775(i64 signext) #3 +declare double @j_foo_0(i64 signext) #3 ; Function Signature: foo(Float64) -declare double @j_foo_776(double) #4 +declare double @j_foo_1(double) #4 attributes #0 = { "frame-pointer"="all" "julia.fsig"="baz(Int64)" "probe-stack"="inline-asm" } ``` ## List of changes - Many sources of statefulness and nondeterminism in the emitted LLVM IR have been eliminated, namely: - Function symbols defined for CodeInstances - Global symbols referring to data on the Julia heap - Undefined function symbols referring to invoked external CodeInstances - `jl_codeinst_params_t` has become `jl_codegen_output_t`. It now represents one Julia "translation unit". More than one CodeInstance can be emitted to the same `jl_codegen_output_t`, if desired, though in the JIT every CI gets its own right now. One motivation behind this is to allow us to emit code on multiple threads and avoid the bitcode serialize/deserialize step we currently do, if that proves worthwhile. When we are done emitting to a `jl_codegen_output_t`, we call `.finish()`, which discards the intermediate state and returns only the LLVM module and the info needed for linking (`jl_linker_info_t`). - The new `JLMaterializationUnit` wraps emitting Julia LLVM modules and the associated `jl_linker_info_t`. It informs ORC that we can materialize symbols for the CIs defined by that output, and picks globally unique names for them. When it is materialized, it resolves all the call targets and generates trampolines for CodeInstances that are invoked but have the wrong calling convention, or are not yet compiled. - We now postpone linking decisions to after codegen whenever possible. For example, `emit_invoke` no longer tries to find a compiled version of the CodeInstance, and it no longer generates trampolines to adapt calling conventions. `jl_analyze_workqueue`'s job has been absorbed into `JuliaOJIT::linkOutput`. - Some `image_codegen` differences have been removed: - Codegen no longer cares if a compiled CodeInstance came from an image. During ahead-of-time linking, we generate thunk functions that load the address from the fvars table. - In `jl_emit_native_impl`, emit every CodeInstance into one `jl_codegen_output_t`. We now defer the creation of the `llvm::Linker` for llvmcalls, which has construction cost that grows with the size of the destination module, until the very end. - RTDyld is removed completely, since we cannot control linking like we can with JITLink. Since JuliaLang#60105, platforms that previous used the optimized memory manager now use the new one. ### General refactoring - Adapt the `jl_callingconv_t` enum from `staticdata.c` into `jl_invoke_api_t` and use it in more places. There is one enumerator for each special `jl_callptr_t` function that can go in a CodeInstance's `invoke` field, as well as one that indicates an invoke wrapper should be there. There is a convenience function for reading an invoke pointer and getting the API type, and vice versa. - Avoid using magic string values, and try to directly pass pointers to LLVM `Function *` or ORC string pool entries when possible. ## Future work - `DLSymOptimizer` should be mostly removed, in favour of emitting raw ccalls and redirecting them to the appropriate target during linking. - We should support ahead-of-time linking multiple `jl_codegen_output_t`s together, in order to parallelize LLVM IR emission when compiling a system image. - We still pass strings to `emit_call_specfun_other`, even though the prototype for the function is now created by `jl_codegen_output_t::get_call_target`. We should hold on to the calling convention info so it doesn't have to be recomputed.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.