update torch base environment#9191
Conversation
@@ -1,3 +1,4 @@
astunparse
Any particular reason this is included?
it's an indirect dependency that can cause runtime errors if the user has tensorflow installed, as new transformers will try to check it and then complain that astunparse is missing even if tensorflow is not used.
File "/home/vlado/.local/lib/python3.10/site-packages/transformers/utils/generic.py", line 33, in <module>
import tensorflow as tf
...
from tensorflow.python.autograph.pyct import parser
File "/home/vlado/.local/lib/python3.10/site-packages/tensorflow/python/autograph/pyct/parser.py", line 29, in <module>
import astunparse
ModuleNotFoundError: No module named 'astunparse'
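The failure above is the classic optional-import chain: transformers only touches tensorflow when it is installed, but that import then pulls in tensorflow's own dependencies. A small sketch of the shape of the problem (the helper name is hypothetical; it only probes availability instead of importing):

```python
import importlib.util

def tf_probe_would_crash() -> bool:
    """Sketch of the failure mode: if tensorflow is present but one of its own
    dependencies (e.g. astunparse) is missing, transformers' probe import of
    tensorflow raises ModuleNotFoundError even though tensorflow is never used."""
    if importlib.util.find_spec("tensorflow") is None:
        return False  # no tensorflow: the fragile code path is never reached
    return importlib.util.find_spec("astunparse") is None
```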
The TensorFlow pip packages should automatically install astunparse as a required dependency, as listed in their package metadata, and the same applies to setup.py when installing from source.
Metadata-Version: 2.1
Name: tensorflow
Version: 2.12.0
...
Requires-Dist: astunparse (>=1.6.0)
...
Metadata-Version: 2.1
Name: tensorflow-intel
Version: 2.12.0
...
Requires-Dist: astunparse (>=1.6.0)
...
So this seems redundant when the webui base itself doesn't even install TensorFlow, only some extensions do. Nonetheless, if you feel this is really needed, it would be wise to pin the same version range as TensorFlow itself.
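The Requires-Dist claim quoted above can be checked programmatically against whatever is actually installed, using only the standard library (the helper name is illustrative):

```python
from importlib.metadata import requires, PackageNotFoundError

def declared_deps(package: str) -> list:
    """Return the Requires-Dist entries of an installed distribution,
    or an empty list if it is not installed (or declares nothing)."""
    try:
        return requires(package) or []
    except PackageNotFoundError:
        return []

# e.g. declared_deps("tensorflow") should include 'astunparse (>=1.6.0)'
# on a machine where tensorflow 2.12 is installed
```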
The TensorFlow pip packages should automatically install astunparse as a required dependency as listed in their package metadata and same with [setup.py]
yup, but there was a buggy installer in tf 2.11, and 2.12 only came out very recently.
it would be wise to pin the same version range as TensorFlow itself.
i hate specifying a version range where it's absolutely not needed - the last version of astunparse is from 2019, it's unlikely that a brand new breaking version is going to pop up out of the blue.
So this seems redundant when the webui base itself doesn't even install TensorFlow, only some extensions do.
very true. i'm just trying to make it as least error prone as possible for average users as i've seen this happen on multiple systems.
if you feel this is really needed
i don't - i'm just trying to make it error-proof - but i'm open to suggestions
(and removing astunparse from dependencies if desired).
there was a buggy installer in tf 2.11
If this is the case, then maybe astunparse could be moved to launch.py and be conditional on tensorflow being installed? Something like:

if is_installed("tensorflow") or is_installed("tensorflow-intel") or is_installed("tensorflow-gpu"):
    if not is_installed("astunparse"):
        run_pip("install astunparse", "astunparse")
That said, astunparse is a tiny package which hasn't been updated since 2019, so it probably isn't a big deal if it gets mistakenly installed when not needed. I'll leave judgement up to you and automatic, since I have no firm opinion about this.
Testing these changes out - things seem to work "out of the box",

it's not "included", it's just not necessary given the new … the remaining message comes from an external repo
have you tested these changes on unix? runpod?

linux yes. runpod no. there are thousands of gpu cloud providers, cannot test each one like that.
Yes, at some point we will have to migrate to torch 2.0 since newer xformers wheels require it.
ok list me 20 :) anyway, i am just saying that covering as many widely used scenarios as possible is good

correct, and i solved this problem by downloading and uploading the torch 1 version of the wheel (0.0.18dev489). they are also still compiling them, thankfully. i think automatic1111 can do it the same way. the wheel and such things can be hosted on hugging face, i think. currently they removed all 0.0.14 and 0.0.17 wheels for torch 1 from pip installation.

This is true only for wheels posted to pypi. You can find a wide range of pre-built xformers wheel builds in their Github action artifacts, if you still need a wheel for older torch. Not as simple as keeping up to date via pypi, but useful in a pinch. Just keep in mind you need to be logged into Github to download artifacts.
sorry, can we use discussions for this and keep pr comments as pr comments? i'd love to collect/implement anything that's required, but this is not pr related at all.
- cudatoolkit=11.8
- pytorch=2.0
- torchvision=0.15
- numpy=1.23
@vladmandic Didn't you say Torch 2.0 requires Numpy 1.24+ instead of 1.23?
Beta did, but they relaxed it for GA.
And some other dependencies require 1.23 and are not compatible with 1.24, so it's not that clean to move to 1.24 just yet. Thus my recommendation of the latest 1.23.x.
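The "latest from 1.23" recommendation amounts to a `>=1.23,<1.24` constraint. A toy check to make the window explicit (illustrative only; real resolution should go through pip or packaging.specifiers):

```python
def fits_numpy_recommendation(version: str) -> bool:
    """True for versions in the >=1.23,<1.24 window discussed above (sketch)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) == (1, 23)
```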
@EfourC Very strange problem. Since the recent update is not stable, you can try to see if this problem is reproduced on the old version, for example

I did some more permutations of testing, especially to see if … What I found out is that the problem is actually … Both of these startup args work ok: … Using … I haven't looked at (or used previously) any of the other performance optimization switches, but it's probably worth people trying them out on different types of systems (since I blundered into an issue with this one).
Honestly, if there is going to be a move to Torch 2.0.0, it should wait until after Torch 2.0.1 is released, as there is currently a major bug that made it into GA that breaks compatibility with WebUI when using torch.compile.

I'm aware of that issue, but WebUI does not use torch.compile on its own, and anyone experienced enough to use it would hand-pick the torch version manually anyhow. Torch 2.1 has no benefits for the normal WebUI user. And the existing Torch 1.13 is showing its teeth with quite a few install issues lately. The whole point of the PR is not to enable experimental use, but to make it simpler for normal users.
fyi, i've initially updated |
It isn't 2.1 - we aren't waiting a whole major release; Torch 2.0.1 came out of phase 0 yesterday. I still believe that Torch 2.0 should not be merged until the blocking issue upstream is resolved in the next minor update, as I believe PyTorch botched the initial GA release of 2.0, and we shouldn't be running that version of PyTorch until it is more mature.

and then we'd have to wait for
As it stands right now, the only people you are claiming are affected are people using cloud setups, which most likely have already done the manual work to support PyTorch 2.0.0. There is no reason for PyTorch to be upgraded to 2.0.0 when it is very clearly NOT stable. It is not worth risking adding even more bugs to the code base as it currently stands.
that is a very strong statement. can you substantiate this? all errors i've seen so far have been related to …

on the other hand, there are hundreds of users using torch 2.0 with webui without issues.
To name a few: …

And not only that, I disagree with moving to 2.0.0 on principle, as .0.0 software is generally never stable. Waiting for 2.0.1 has no downsides, whereas 2.0.0 is an unstable mess that they are still trying to get stable. The last thing this repo needs is more instability, which causes more issues to flood in.
Bugs relevant to WebUI are what matters - why list random things? this is going in the wrong direction.
That is a question of personal preference and risk vs reward. The issue is that Torch 1.13 wheels are getting obsoleted in many packages and/or environments, causing failures to install. So what's the solution? Ignore current issues until some unknown time in the future? The PR stands as-is, and I've been using Torch 2.0 on my branch for a while now (and users on my branch are not reporting issues relevant to WebUI). We can agree to disagree here.
Convolutions being broken for Cuda 11.8 builds specifically affects users who use Pytorch 2.0 and --opt-channelslast. It basically negates any possible performance benefits from that option. |
What is the change?
If there is none, why is this here?
It's incredibly confusing.
(BTW: I am a relatively new user, please go easy on me)
Under certain conditions, there could be a cache.json.lock file in addition to cache.json.
So this change (appending an asterisk) will cover both files.
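The effect of the appended asterisk can be seen with a glob-style match (file names taken from the comment above; `other.json` added just as a non-matching control):

```python
import fnmatch

files = ["cache.json", "cache.json.lock", "other.json"]

# the bare name matches only the cache file...
assert fnmatch.filter(files, "cache.json") == ["cache.json"]

# ...while the trailing asterisk also picks up the companion lock file
assert fnmatch.filter(files, "cache.json*") == ["cache.json", "cache.json.lock"]
```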
Exactly! And you're fast :)
DGdev91 left a comment
The code for MacOS uses the --extra-index-url which was meant for nvidia cards
Are you sure "--extra-index-url https://download.pytorch.org/whl/cu118" is actually ok for mac os?
I don't know why that code was using version 1.12.1 instead of a newer one (probably it was just never updated), but probably the right way to install it on mac os is by just using "pip install torch torchvision" without the --extra-index-url, as mentioned on the official website https://pytorch.org/get-started/locally/
Recently PyTorch changed its install command; it now uses --index-url instead of --extra-index-url, as mentioned in #9483. Also, i noticed your code doesn't cover AMD cards (in that case TORCH_COMMAND is set in webui.sh)
Been having a lot of issues trying to get things working with my 5700XT. The new torch 2.0 version failed to generate any images. Using the latest rocm as below failed to generate images or any console output: …

Using the torch command from the pr works: …

At least it is working on Commit hash: a9fed7c
rocm 5.6.0 alpha is out and it brings torch 2.0 compatibility, i'd be curious if that works.
5.6.0? maybe that will support my 7900xtx

Really? Where? I see 5.4.3 as the last release on https://github.com/RadeonOpenCompute/ROCm/releases

https://rocmdocs.amd.com/projects/alpha/en/develop/deploy/install.html
uhm... it doesn't seem that's publicly available

There was indeed a docker image for rocm 5.6.0 with 7900xtx support around, but it's now offline, so i guess that code was intended for internal testing and not supposed to be released yet. Anyway, there was a discussion here #9591. I'm not sure if that works on other gpus like the 5700xt too, but i wouldn't be surprised if pytorch 2.0 starts to work when the next rocm version is released. I guess for 5700xt users the better choice is sticking to the old 1.13.1 version and waiting for an official rocm release.
I'm pretty sure

According to PyTorch's website it should be just







this pr is a single-step update of the `pytorch` base environment:

- `torch` 1.13.1 with `cuda` 11.7 and `cudnn` 8.5.0 → `torch` 2.0.0 with `cuda` 11.8 and `cudnn` 8.7.0

this allows usage of `sdp` cross-optimization, better multi-gpu support with `accelerate`, and avoids a large number of performance issues due to broken `cudnn` in some environments

it updates all required packages, but avoids any prereleases:

- `torchvision` (plus silence future deprecation warning)
- `xformers` (update follows torch)
- `accelerate` (required to support new torch)
- `numpy` (update of numpy is required by new accelerate)

note: `accelerate` changed the format of the config file; to avoid warnings (non-critical), run `accelerate config` once

yes, updating torch is a major step, but it will have to be done sooner or later as there are more and more reports of issues installing the old torch version