Automatic setup for AMD GPUs on Linux #6709
This commit adds a few lines that detect whether the system has an AMD GPU and set an environment variable needed for torch to recognize the GPU.
This commit adds a script that detects which GPU is currently in use on Windows and Linux.
This commit allows the launch script to automatically download ROCm's torch version for AMD GPUs using an external GPU detection script. It also prints the operating system and GPU in use.
This fixes the script on macOS.
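A rough Python sketch of what these commits describe: spot an AMD GPU, export the variable torch needs, and choose the ROCm build of torch. It is not the code from this pull request. It assumes detection by scanning `lspci` output for AMD vendor strings, the function names are hypothetical, `10.3.0` is the value discussed in this thread, and the ROCm wheel index URL is left as a parameter because the thread does not state it.

```python
import os
import subprocess
from typing import Dict, List, Optional

# Assumption: vendor strings that lspci prints for AMD display adapters.
AMD_MARKERS = ("Advanced Micro Devices", "[AMD/ATI]", "[AMD]")
# Assumption: PCI class names that lspci uses for GPUs.
GPU_CLASSES = ("VGA compatible controller", "3D controller", "Display controller")


def read_lspci() -> str:
    """Return lspci output, or an empty string if lspci is not installed."""
    try:
        result = subprocess.run(["lspci"], capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return ""
    return result.stdout


def gpu_lines(lspci_output: str) -> List[str]:
    """Keep only the lspci lines that describe a display adapter."""
    return [line for line in lspci_output.splitlines()
            if any(cls in line for cls in GPU_CLASSES)]


def has_amd_gpu(lspci_output: str) -> bool:
    """True if any display adapter line carries an AMD vendor string."""
    return any(marker in line
               for line in gpu_lines(lspci_output)
               for marker in AMD_MARKERS)


def amd_environment(lspci_output: str,
                    base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and add the override when an AMD GPU is present."""
    env = dict(os.environ if base_env is None else base_env)
    if has_amd_gpu(lspci_output):
        # Value taken from this thread; a user-provided value is left untouched.
        env.setdefault("HSA_OVERRIDE_GFX_VERSION", "10.3.0")
    return env


def torch_install_args(amd: bool, rocm_index_url: str) -> List[str]:
    """Build pip arguments; the ROCm index URL is supplied by the caller."""
    args = ["install", "torch", "torchvision"]
    if amd:
        args += ["--extra-index-url", rocm_index_url]
    return args
```

Taking the `lspci` text as an argument, with `read_lspci()` kept separate, lets the detection logic be exercised on machines without an AMD card.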
If those two lines are enough to check for AMD, why all the added code? You can set
I tried to create a universal way to detect the GPU on both Windows and Linux to prepare for possible future compatibility with DirectML. I can make another commit that deletes the detection code and installs the ROCm packages directly from the webui script to make everything smaller, and deal with the detection part only when it's needed.
The only manual step needed to run this on an AMD system is adding the --precision full and --no-half arguments, at least on the RX 5700 XT, but those can now be added to the webui-user script. After this change the wiki could be updated with the simplified setup process.
About HSA_OVERRIDE_GFX_VERSION=10.3.0: I know that it's needed on the 5700 XT, but I'm not sure whether that's the case for other cards as well, and I have no way to check. If it turns out to be harmless on the other AMD cards, I guess it would be OK to leave it.
I found this:
So it seems to be safe to leave the variable for now.
It should have only been added for Navi/RDNA. Using
Do you know whether the variable causes issues, or whether it is necessary, on cards other than the 5000 series? I can't find any more info about it. If it causes issues on any card other than the 5000s I can add a check for it; I just don't know if the variable is needed for the 6000 series too.
I do have an RX 5700 I could test it out on, but I'd need, realistically, an afternoon and an ounce. Maybe a few bars. And more seriously, a 2nd drive to try dual-booting from. This was my original frame of reference:
mentioned for pre-Vega cards, so I guess that'd include my Polaris cards as well, and:
mentioned for "newer" cards, whatever that means.
https://forum.garudalinux.org/t/trying-to-run-stable-diffusion-with-rx-590/22898/12 mentions that forcing
people using it to run on gfx1032, RX 6700S - seems to work for RX 5500 XT as well
openSUSE wiki, "force same generation"
sample list of supported architectures from rocBLAS:
so forcing the 10.3.0 override is useful for e.g. gfx1031, gfx1032, which aren't officially supported, but don't seem to complain about running with the flag. gfx90X are the Vega-based cards, gfx101X Navi, gfx803 is Polaris which they technically dropped but kinda sorta not really. GG AMD.
I'm not sure why you needed the override for your RX 5700 XT?
AMD Radeon RX 5700 XT Specs | TechPowerUp GPU Database
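The grouping in this comment can be written down as a small lookup. The Python sketch below is illustrative only: the function names are hypothetical, the prefix matching mirrors the comment's rough "gfx90X / gfx101X" shorthand and is not an official AMD table, and the `gfx103` entry (RDNA2) is added here to cover the gfx1031/gfx1032 targets the comment mentions.

```python
from typing import Optional

# Targets the comment names as unsupported but working with the override.
OVERRIDE_CANDIDATES = {"gfx1031", "gfx1032"}


def gfx_family(target: str) -> str:
    """Map a gfx target name to the family named in the comment above."""
    t = target.strip().lower()
    if t == "gfx803":
        return "Polaris"
    if t.startswith("gfx90"):
        return "Vega"
    if t.startswith("gfx101"):
        return "Navi"
    if t.startswith("gfx103"):
        return "Navi 2x"  # RDNA2; not listed in the comment, added for this sketch
    return "unknown"


def suggested_override(target: str) -> Optional[str]:
    """Return the override value the comment suggests, or None if none applies."""
    if target.strip().lower() in OVERRIDE_CANDIDATES:
        return "10.3.0"
    return None
```

For example, `suggested_override("gfx1032")` gives `"10.3.0"`, while a Polaris card (`gfx803`) gets no override from this lookup.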
I can try later to add a check to just include the variable on RDNA and RDNA2 cards then. I’ve also seen that my GPU should be supported natively but in practice the program without the variable just doesn’t detect the card.
I just started a pull request that adds a check for the GPU: the environment variable is now set only if the card is part of the Navi family.
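A check of this kind might look like the Python sketch below. It is a guess at the shape of the logic, not the code from that pull request: it assumes the card's `lspci` description contains a "Navi NN" code name, and the helper names are hypothetical.

```python
import re
from typing import Dict, Optional


def navi_generation(lspci_line: str) -> Optional[int]:
    """Return 1 for 'Navi 1x', 2 for 'Navi 2x', or None if the card is not Navi."""
    match = re.search(r"\bNavi (\d)", lspci_line)
    return int(match.group(1)) if match else None


def override_env(lspci_line: str) -> Dict[str, str]:
    """Return the extra environment variables to export for this card."""
    if navi_generation(lspci_line) is None:
        return {}  # non-Navi cards (e.g. Polaris, Vega) are left alone
    return {"HSA_OVERRIDE_GFX_VERSION": "10.3.0"}
```

With this shape, a Polaris card such as the RX 580, which `lspci` reports under the "Ellesmere" code name, gets no override.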
So would this mean that cards like the RX 580 8 GB (which is the one I own) would not be supported at all, since they are gfx803? I've been trying for days to get this to work, but I keep going in circles. Each time I start from scratch, the first thing I hit is the CUDA error, so I add the "--skip-torch-cuda-test" flag, and then I get various core dumps while trying the other flags that are supposed to help with AMD GPUs (no half and full precision).
Describe what this pull request is trying to achieve.
This pull request automates the setup process for AMD users on Linux.
Environment this was tested in