Memoize cwstring when used for env lookup / modification on Windows#51371
Memoize cwstring when used for env lookup / modification on Windows#51371DilumAluthge merged 9 commits intoJuliaLang:masterfrom
cwstring when used for env lookup / modification on Windows#51371Conversation
IanButterworth
left a comment
There was a problem hiding this comment.
Seems like an improvement, even if there are further things that can be done
|
Can we backport this to LTS (1.6) as well? |
|
And can we also backport to 1.9? |
|
I am really no a fan of this change. It seems odd that every environment access now does a string comparison against one particular string? |
|
|
|
Okay, so I tried changing this to use a I'm not sure if the access pattern I did was a good idea or not, I wanted to use |
vtjnash
left a comment
There was a problem hiding this comment.
LGTM. Can you pull this out into a helper so that _hasenv, _setenv, and _unsetenv can do this for their key too
The implementation of this PR has changed since this review
|
@MasonProtter Can you update the PR description to reflect the final implementation? |
@debug performance on Windowscwstring when used for env lookup / modification on Windows
cwstring when used for env lookup / modification on Windowscwstring when used for env lookup / modification on Windows
Co-authored-by: Jameson Nash <vtjnash@gmail.com>
…#51371) ~This is just me proposing a suggestion from @KristofferC in https://discourse.julialang.org/t/debug-has-massive-performance-impact/103974/22, it's all his code / idea, not mine.~ This PR makes it so that on windows, we pre-allocate an `IdDict` and every time someone looks up environment variables (motivating example here is `@debug` statements), we store the result of `cwstring(::String)` in that `IdDict` so that it doesn't need to be re-computed for future uses. The idea behind this is that people have observed that [using `@debug` is significantly more costly on Windows than other platforms](https://discourse.julialang.org/t/debug-has-massive-performance-impact/103974), even though we have documented in that manual that it should be a really cheap operation. @KristofferC suggests this is due to the fact that [checking environment variables in Windows is more costly](https://discourse.julialang.org/t/debug-has-massive-performance-impact/103974/18). ~The idea here is that we preallocate a `Cwstring` on Windows that just holds the text `"JULIA_DEBUG"`, so that if `access_env(f, "JULIA_DEBUG")` gets called, we don't need to create a new `Cwstring` and then throw it away each time.~ --------- Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com> Co-authored-by: Jameson Nash <vtjnash@gmail.com> (cherry picked from commit 9dcedaa)
Backported PRs: - [x] #51213 <!-- Wait for other threads to finish compiling before exiting --> - [x] #51520 <!-- Make allocopt respect the GC verifier rules with non usual address spaces --> - [x] #51598 <!-- Use a simple error when reporting sysimg load failures. --> - [x] #51757 <!-- fix parallel peakflop usage --> - [x] #51781 <!-- Don't make pkgimages global editable --> - [x] #51848 <!-- allow finalizers to take any locks and yield during exit --> - [x] #51847 <!-- add missing wait during Timer and AsyncCondition close --> - [x] #50824 <!-- Add some aliasing warnings to docstrings for mutating functions in Base --> - [x] #51885 <!-- remove chmodding the pkgimages --> - [x] #50207 <!-- [devdocs] Improve documentation about building external forks of LLVM --> - [x] #51967 <!-- further fix to the new promoting method for AbstractDateTime subtraction --> - [x] #51980 <!-- macroexpand: handle const/atomic struct fields correctly --> - [x] #51995 <!-- [Artifacts] Pass artifacts dictionary to `ensure_artifact_installed` dispatch --> - [x] #52098 <!-- Fix errors in `sort` docstring --> - [x] #52136 <!-- Bump JuliaSyntax to 0.4.7 --> - [x] #52140 <!-- Make c func `abspath` consistent on Windows. Fix tracking path conversion. --> - [x] #52009 <!-- fix completion that resulted in startpos of 0 for `\\ --> - [x] #52192 <!-- cap the number of GC threads to number of cpu cores --> - [x] #52206 <!-- Make have_fma consistent between interpreter and compiled --> - [x] #52027 <!-- fix Unicode.julia_chartransform for Julia 1.10 --> - [x] #52217 <!-- More helpful error message for empty `cpu_target` in `Base.julia_cmd` --> - [x] #51371 <!-- Memoize `cwstring` when used for env lookup / modification on Windows --> - [x] #52214 <!-- Turn Method Overwritten Error into a PrecompileError -- turning off caching --> - [x] #51895 <!-- Devdocs on fixing precompile hangs, take 2 --> - [x] #51596 <!-- Reland "Don't mark nonlocal symbols as hidden"" --> - [x] #51834 <!-- [REPLCompletions] allow symbol completions within incomplete macrocall expression --> - [x] #52010 <!-- Revert "Support sorting iterators (#46104)" --> - [x] #51430 <!-- add support for async backtraces of Tasks on any thread --> - [x] #51471 <!-- Fix segfault if root task is NULL --> - [x] #52194 <!-- Fix multiversioning issues caused by the parallel llvm work --> - [x] #51035 <!-- refactor GC scanning code to reflect jl_binding_t are now first class --> - [x] #52030 <!-- Bump Statistics --> - [x] #52189 <!-- codegen: ensure i1 bool is widened to i8 before storing --> - [x] #52228 <!-- Widen diagonal var during `Type` unwrapping in `instanceof_tfunc` --> - [x] #52182 <!-- jitlayers: replace sharedbytes intern pool with one that respects alignment --> Contains multiple commits, manual intervention needed: - [ ] #51092 <!-- inference: fix bad effects for recursion --> Non-merged PRs with backport label: - [ ] #52196 <!-- Fix creating custom log level macros --> - [ ] #52170 <!-- fix invalidations related to `ismutable` --> - [ ] #51479 <!-- prevent code loading from lookin in the versioned environment when building Julia -->
…JuliaLang#51371) ~This is just me proposing a suggestion from @KristofferC in https://discourse.julialang.org/t/debug-has-massive-performance-impact/103974/22, it's all his code / idea, not mine.~ This PR makes it so that on windows, we pre-allocate an `IdDict` and every time someone looks up environment variables (motivating example here is `@debug` statements), we store the result of `cwstring(::String)` in that `IdDict` so that it doesn't need to be re-computed for future uses. The idea behind this is that people have observed that [using `@debug` is significantly more costly on Windows than other platforms](https://discourse.julialang.org/t/debug-has-massive-performance-impact/103974), even though we have documented in that manual that it should be a really cheap operation. @KristofferC suggests this is due to the fact that [checking environment variables in Windows is more costly](https://discourse.julialang.org/t/debug-has-massive-performance-impact/103974/18). ~The idea here is that we preallocate a `Cwstring` on Windows that just holds the text `"JULIA_DEBUG"`, so that if `access_env(f, "JULIA_DEBUG")` gets called, we don't need to create a new `Cwstring` and then throw it away each time.~ --------- Co-authored-by: Ian Butterworth <i.r.butterworth@gmail.com> Co-authored-by: Jameson Nash <vtjnash@gmail.com>
…dows env variables (#52758) Should fix #52711. My analysis of the invalidation is as follows: We added code to cache the conversion to `cwstring` in env handling on Windows (#51371): ```julia const env_dict = IdDict{String, Vector{UInt16}}() function memoized_env_lookup(str::AbstractString) ... env_dict[str] = cwstring(str) ... end function access_env(onError::Function, str::AbstractString) var = memoized_env_lookup(str) ... end ``` Since `IdDict` has `@nospecialize` on `setindex!` we compile this method: ```julia setindex!(::IdDict{String, Vector{UInt16}}, ::Any, ::Any) ``` which has an edge to: ```julia convert(Type{Vector{Int64}}, Any}) ``` But then StaticArrays comes along and adds a method ```julia convert(::Type{Array{T, N}}, ::StaticArray) ``` which invalidates the `setindex!` (due to the edge to `convert`) which invalidates the whole env handling on Windows which causes 4k other methods downstream to be invalidated, in particular, the artifact string macro which causes a significant delay in the next jll package you load after loading StaticArrays. There should be no performance penalty to this since strings already does a hash for their `objectid`.
…dows env variables (#52758) Should fix #52711. My analysis of the invalidation is as follows: We added code to cache the conversion to `cwstring` in env handling on Windows (#51371): ```julia const env_dict = IdDict{String, Vector{UInt16}}() function memoized_env_lookup(str::AbstractString) ... env_dict[str] = cwstring(str) ... end function access_env(onError::Function, str::AbstractString) var = memoized_env_lookup(str) ... end ``` Since `IdDict` has `@nospecialize` on `setindex!` we compile this method: ```julia setindex!(::IdDict{String, Vector{UInt16}}, ::Any, ::Any) ``` which has an edge to: ```julia convert(Type{Vector{Int64}}, Any}) ``` But then StaticArrays comes along and adds a method ```julia convert(::Type{Array{T, N}}, ::StaticArray) ``` which invalidates the `setindex!` (due to the edge to `convert`) which invalidates the whole env handling on Windows which causes 4k other methods downstream to be invalidated, in particular, the artifact string macro which causes a significant delay in the next jll package you load after loading StaticArrays. There should be no performance penalty to this since strings already does a hash for their `objectid`. (cherry picked from commit b7c24ed)
…dows env variables (JuliaLang#52758) Should fix JuliaLang#52711. My analysis of the invalidation is as follows: We added code to cache the conversion to `cwstring` in env handling on Windows (JuliaLang#51371): ```julia const env_dict = IdDict{String, Vector{UInt16}}() function memoized_env_lookup(str::AbstractString) ... env_dict[str] = cwstring(str) ... end function access_env(onError::Function, str::AbstractString) var = memoized_env_lookup(str) ... end ``` Since `IdDict` has `@nospecialize` on `setindex!` we compile this method: ```julia setindex!(::IdDict{String, Vector{UInt16}}, ::Any, ::Any) ``` which has an edge to: ```julia convert(Type{Vector{Int64}}, Any}) ``` But then StaticArrays comes along and adds a method ```julia convert(::Type{Array{T, N}}, ::StaticArray) ``` which invalidates the `setindex!` (due to the edge to `convert`) which invalidates the whole env handling on Windows which causes 4k other methods downstream to be invalidated, in particular, the artifact string macro which causes a significant delay in the next jll package you load after loading StaticArrays. There should be no performance penalty to this since strings already does a hash for their `objectid`. (cherry picked from commit b7c24ed)
This is just me proposing a suggestion from @KristofferC in https://discourse.julialang.org/t/debug-has-massive-performance-impact/103974/22, it's all his code / idea, not mine.This PR makes it so that on windows, we pre-allocate an
IdDictand every time someone looks up environment variables (motivating example here is@debugstatements), we store the result ofcwstring(::String)in thatIdDictso that it doesn't need to be re-computed for future uses.The idea behind this is that people have observed that using
@debugis significantly more costly on Windows than other platforms, even though we have documented in that manual that it should be a really cheap operation. @KristofferC suggests this is due to the fact that checking environment variables in Windows is more costly.The idea here is that we preallocate aCwstringon Windows that just holds the text"JULIA_DEBUG", so that ifaccess_env(f, "JULIA_DEBUG")gets called, we don't need to create a newCwstringand then throw it away each time.