Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

16x slowdown for CondaPkg v0.2.23 on 1.11.0-rc3, and therefore JuliaCall much slower to start #145

Open
PallHaraldsson opened this issue Sep 3, 2024 · 9 comments

Comments

@PallHaraldsson
Copy link

PallHaraldsson commented Sep 3, 2024

$ julia +1.11

julia> @time using CondaPkg
  3.171187 seconds (1.41 M allocations: 81.186 MiB, 5.03% gc time, 67.35% compilation time)

I'm assuming all dependencies are the same, well except Julia. I think config files are the same (though didn't check) or small, since I just installed on 1.11.

It's in part because of Pkg (it's now 251x slower in 1.11, since it was intentionally dropped from the sysimage, so any way do avoid using it? Use lazily?), and Parsers, and can be mitigated with:

$ julia +1.11 -O1

julia> @time using CondaPkg
  1.904249 seconds (1.41 M allocations: 81.188 MiB, 8.60% gc time, 51.51% compilation time)

$ julia +1.11 --compile=min

julia> @time using CondaPkg
  1.210366 seconds (640.18 k allocations: 37.992 MiB, 8.73% gc time, 3.46% compilation time)

I could make a PR to enable either, since this package is hardly-speed critical, except for its startup?

julia> @time_imports using CondaPkg
      0.7 ms  Printf
     33.5 ms  Dates
      0.3 ms  UUIDs
      0.5 ms  TOML
    124.1 ms  Preferences
      0.4 ms  PrecompileTools
               ┌ 0.1 ms Parsers.__init__() 
    195.1 ms  Parsers 80.05% compilation time
      0.7 ms  Mmap
      7.0 ms  StructTypes
     52.0 ms  JSON3
      0.4 ms  Pidfile
      1.3 ms  Base64
               ┌ 0.0 ms NetworkOptions.__init__() 
      3.2 ms  NetworkOptions
               ┌ 3.5 ms MbedTLS_jll.__init__() 
      7.0 ms  MbedTLS_jll
               ┌ 0.4 ms LibSSH2_jll.__init__() 
      4.7 ms  LibSSH2_jll
               ┌ 0.7 ms LibGit2_jll.__init__() 
      3.9 ms  LibGit2_jll
     12.4 ms  LibGit2
     16.7 ms  ArgTools
               ┌ 0.2 ms nghttp2_jll.__init__() 
      3.7 ms  nghttp2_jll
               ┌ 0.3 ms LibCURL_jll.__init__() 
      3.7 ms  LibCURL_jll
               ┌ 0.0 ms MozillaCACerts_jll.__init__() 
      3.8 ms  MozillaCACerts_jll
               ┌ 0.0 ms LibCURL.__init__() 
      1.6 ms  LibCURL
               ┌ 0.2 ms Downloads.Curl.__init__() 
     33.1 ms  Downloads
      1.3 ms  Tar
               ┌ 0.1 ms p7zip_jll.__init__() 
      6.0 ms  p7zip_jll
      0.3 ms  Logging
      5.5 ms  Markdown
               ┌ 0.0 ms Pkg.__init__() 
    444.5 ms  Pkg
      0.4 ms  Scratch
      0.2 ms  LazyArtifacts
      0.4 ms  JLLWrappers
               ┌ 5.7 ms micromamba_jll.__init__() 95.49% compilation time
      6.1 ms  micromamba_jll 88.94% compilation time
      0.4 ms  MicroMamba
               ┌ 2199.7 ms CondaPkg.PkgREPL.__init__() 99.92% compilation time
   2204.0 ms  CondaPkg 99.73% compilation time
@PallHaraldsson
Copy link
Author

PallHaraldsson commented Sep 3, 2024

FYI: This older version is actually much faster on 1.11:

julia> @time using CondaPkg
  1.177992 seconds (656.20 k allocations: 42.494 MiB, 8.20% gc time, 21.20% compilation time)

(@v1.11) pkg> st
Status `~/.julia/environments/v1.11/Project.toml`
  [992eb4ea] CondaPkg v0.2.14 `~/.julia/dev/CondaPkg`
  [f878e3a2] OpenCV v4.5.3
  [69de0a69] Parsers v2.8.1
  [91a5bcdd] Plots v1.40.8
  [438e738f] PyCall v1.96.4
⌃ [6099a3de] PythonCall v0.9.9
  [33b9d88c] OpenCV_jll v4.6.0+2
⌅ [3eaa8342] libcxxwrap_julia_jll v0.9.7+3
  [0dad84c5] ArgTools v1.1.2

Note before I did:

(@v1.11) pkg> dev CondaPkg
   Resolving package versions...
   Installed PythonCall ─ v0.9.9
   Installed Plots ────── v1.40.8
    Updating `~/.julia/environments/v1.11/Project.toml`
  [992eb4ea] ~ CondaPkg v0.2.23 ⇒ v0.2.14 `~/.julia/dev/CondaPkg`
  [91a5bcdd] ↑ Plots v1.40.4 ⇒ v1.40.8
⌃ [6099a3de] ↓ PythonCall v0.9.23 ⇒ v0.9.9
    Updating `~/.julia/environments/v1.11/Manifest.toml`
  [992eb4ea] ~ CondaPkg v0.2.23 ⇒ v0.2.14 `~/.julia/dev/CondaPkg`
  [91a5bcdd] ↑ Plots v1.40.4 ⇒ v1.40.8
⌃ [6099a3de] ↓ PythonCall v0.9.23 ⇒ v0.9.9
  [66db9d55] + SnoopPrecompile v1.0.3

and I was thinking, why downgraded, I think because I had been deving an old version before, no major(?) changes there.

With that dev version (if it actually works...), NetworkOptions is prominent, so this could be even faster:

julia> @time_imports using CondaPkg
      0.7 ms  Printf
     38.4 ms  Dates
      0.7 ms  TOML
     15.3 ms  Preferences
      0.4 ms  SnoopPrecompile
      0.4 ms  Scratch
      2.1 ms  Base64
               ┌ 0.0 ms NetworkOptions.__init__() 
    296.9 ms  NetworkOptions 98.61% compilation time
               ┌ 0.5 ms MbedTLS_jll.__init__() 
      4.5 ms  MbedTLS_jll
               ┌ 0.2 ms LibSSH2_jll.__init__() 
      3.7 ms  LibSSH2_jll
               ┌ 0.2 ms LibGit2_jll.__init__() 
      3.3 ms  LibGit2_jll
     11.7 ms  LibGit2
     15.9 ms  ArgTools
               ┌ 0.1 ms nghttp2_jll.__init__() 
      3.9 ms  nghttp2_jll
               ┌ 0.2 ms LibCURL_jll.__init__() 
      3.5 ms  LibCURL_jll
               ┌ 0.0 ms MozillaCACerts_jll.__init__() 
      4.4 ms  MozillaCACerts_jll
               ┌ 0.0 ms LibCURL.__init__() 
      1.6 ms  LibCURL
               ┌ 0.2 ms Downloads.Curl.__init__() 
     31.6 ms  Downloads
      1.1 ms  Tar
               ┌ 0.1 ms p7zip_jll.__init__() 
      6.0 ms  p7zip_jll
      0.3 ms  UUIDs
      0.1 ms  Logging
      4.5 ms  Markdown
               ┌ 0.0 ms Pkg.__init__() 
    446.7 ms  Pkg
      0.3 ms  LazyArtifacts
      0.4 ms  JLLWrappers
               ┌ 5.6 ms micromamba_jll.__init__() 95.02% compilation time
      6.0 ms  micromamba_jll 88.82% compilation time
      0.4 ms  MicroMamba
      0.2 ms  PrecompileTools
               ┌ 0.0 ms Parsers.__init__() 
     40.1 ms  Parsers
      0.6 ms  Mmap
      7.5 ms  StructTypes
     57.5 ms  JSON3
      0.5 ms  Pidfile
               ┌ 256.5 ms CondaPkg.PkgREPL.__init__() 99.52% compilation time
    297.5 ms  CondaPkg 85.81% compilation time

@PallHaraldsson
Copy link
Author

PallHaraldsson commented Sep 3, 2024

Note this is not just to blame for slow (in 1.11):

julia> @time using PythonCall
  5.788240 seconds (3.24 M allocations: 176.412 MiB, 5.91% gc time, 79.11% compilation time: 16% of which was recompilation)

(@v1.11) pkg> st
Status `~/.julia/environments/v1.11/Project.toml`
  [992eb4ea] CondaPkg v0.2.23
  [69de0a69] Parsers v2.8.1
  [91a5bcdd] Plots v1.40.8
  [438e738f] PyCall v1.96.4
  [6099a3de] PythonCall v0.9.23
  [33b9d88c] OpenCV_jll v4.6.0+2
  [3eaa8342] libcxxwrap_julia_jll v0.13.2+0
  [0dad84c5] ArgTools v1.1.2

julia> @time_imports using PythonCall
...
               ┌ 2192.7 ms CondaPkg.PkgREPL.__init__() 94.23% compilation time
   2196.9 ms  CondaPkg 94.05% compilation time
...
               ┌ 1839.1 ms PythonCall.C.__init__() 90.91% compilation time  # why this slow, though not much slower?
               ├ 0.0 ms PythonCall.GC.__init__() 
               ├ 771.2 ms PythonCall.Core.__init__() 96.96% compilation time (85% recompilation)  # why this 23x slower?
               ├ 0.2 ms PythonCall.Convert.__init__() 
               ├ 0.0 ms PythonCall.Wrap.__init__() 
               ├ 0.0 ms PythonCall.JlWrap.Cjl.__init__() 
               ├ 151.8 ms PythonCall.JlWrap.__init__() 94.66% compilation time
               ├ 0.7 ms PythonCall.Compat.__init__() 
   2884.7 ms  PythonCall 88.86% compilation time (25% recompilation)

vs in 1.10:
...
      7.4 ms  micromamba_jll 81.39% compilation time
      0.4 ms  MicroMamba
               ┌ 21.5 ms CondaPkg.PkgREPL.__init__() 90.92% compilation time
     28.8 ms  CondaPkg 67.88% compilation time
...
               ┌ 1639.9 ms PythonCall.C.__init__() 84.32% compilation time
               ├ 0.0 ms PythonCall.GC.__init__() 
               ├ 33.2 ms PythonCall.Core.__init__() 29.75% compilation time
               ├ 0.2 ms PythonCall.Convert.__init__() 
               ├ 0.0 ms PythonCall.Wrap.__init__() 
               ├ 0.0 ms PythonCall.JlWrap.Cjl.__init__() 
               ├ 14.4 ms PythonCall.JlWrap.__init__() 45.59% compilation time
               ├ 0.7 ms PythonCall.Compat.__init__() 
   1810.0 ms  PythonCall 77.31% compilation time

I think the differences are only explained by Julia versions, my 1.10 env is messy, some things held back, but I think none of the dependencies, i.e. I think all unrelated, such as:
⌅ [1f15a43c] CxxWrap v0.12.1
⌅ [0ee61d77] Clang_jll v15.0.7+10
⌅ [6b5019fb] jlqml_jll v0.3.0+1
⌅ [3eaa8342] libcxxwrap_julia_jll v0.9.7+3

@cjdoris
Copy link
Collaborator

cjdoris commented Sep 5, 2024

Thanks. The slowness in CondaPkg is entirely down to setting up the PkgREPL mode. Some things we could do:

  • Use SnoopCompile to precompile this init function.
  • Put this into a package extension so it only loads when using Pkg.
  • Only set up the mode when in an interactive session (this would be mildly breaking).

Though note that PythonCall and JuliaCall both depend on Pkg so putting the code into an extension wouldn't help if you're using those. It's probably possible to remove these dependencies on Pkg though.

@PallHaraldsson
Copy link
Author

The slowness in CondaPkg is entirely down to setting up the PkgREPL mode. Some things we could do:

  • Use SnoopCompile to precompile this init function.

Yes, only because of Pkg.REPLMode.gen_help(), and I know the/a fix, but I think it has to be in Julia 1.11 Pkg, so Snoopcompile will not help, at least for any of your packages.

I think it would be best if PythonCall doesn't depend on CondaPkg, or only lazily loads it, and thus delays messing with Pkg.REPLMode. I'm not sure, most of the time the python/Conda environment is up to date, can't you just have a file saying when and only load CondaPkg, if any of the files change and thus are newer? This is I think sort of what you mean by the 3rd option, just not breaking?

@IanButterworth
Copy link

IanButterworth commented Sep 5, 2024

We can precompile Pkg.REPLMode.gen_help() in Pkg, but I'm curious as to why it's called during init here?

@cjdoris
Copy link
Collaborator

cjdoris commented Sep 5, 2024

CondaPkg adds new commands to the PkgREPL so we need to regenerate the help otherwise they don't show up in help.

@PallHaraldsson
Copy link
Author

We can precompile Pkg.REPLMode.gen_help() in Pkg

No, need rather here the root cause making Pkg slow: JuliaLang/julia#55706

@cjdoris
Copy link
Collaborator

cjdoris commented Sep 9, 2024

What system did you get your timings on? On my Windows machine I don't observe a slow-down - loading CondaPkg takes about 20ms on 1.10 or 1.11-rc3 for me. However on nightly (1.12) I do get the slow-down (about 1300ms for me) but I'm not too fussed about nightly :)

IanButterworth pushed a commit to JuliaLang/julia that referenced this issue Sep 9, 2024
Fixes #55706 that is seemingly a 4472x regression, not just 16x (was my
first guess, based on CondaPkg, also fixes or greatly mitigates
JuliaPy/CondaPkg.jl#145), and large part of 3x
regression for PythonCall.

---------

Co-authored-by: Kristoffer Carlsson <[email protected]>
@PallHaraldsson
Copy link
Author

PallHaraldsson commented Sep 9, 2024

I use Linux with rc3, and except same slowdown on Windows. Note, you must not enter Pkg mode before timing, also if you use a script, not the REPL the timing is even worse.

[My just merged PR for Markdown should help, for part of the latency, when it hits nightly, and I guess for rc4. But best to not depend on Pkg, as explained elsewhere, or at least my PythonCall.]

KristofferC pushed a commit to JuliaLang/julia that referenced this issue Sep 11, 2024
Fixes #55706 that is seemingly a 4472x regression, not just 16x (was my
first guess, based on CondaPkg, also fixes or greatly mitigates
JuliaPy/CondaPkg.jl#145), and large part of 3x
regression for PythonCall.

---------

Co-authored-by: Kristoffer Carlsson <[email protected]>
(cherry picked from commit 1463c99)
kshyatt pushed a commit to JuliaLang/julia that referenced this issue Sep 12, 2024
Fixes #55706 that is seemingly a 4472x regression, not just 16x (was my
first guess, based on CondaPkg, also fixes or greatly mitigates
JuliaPy/CondaPkg.jl#145), and large part of 3x
regression for PythonCall.

---------

Co-authored-by: Kristoffer Carlsson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants