Nov. 26, 2023, 11:20 p.m. wschaub
I recently got very interested in open source AI and wanted to build out a server with a modern GPU to do more experiments on. The problem was that I had the money for a modern GPU but not enough to build out a new system to put it in.
I got my hands on a decommissioned workstation from about 2012 for the low price of free and stuck the GPU in that as a temporary solution. I named it GLaDOS since I'm stuffing a very new and very expensive RTX 3090 into a system so old it might as well be a potato. It's from the Sandy Bridge era and does not support AVX2 instructions. (It's otherwise very powerful for its age, being a dual 8-core Xeon with 64GB (soon to be 128GB) of memory.)
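If you want to check whether your own machine has the same limitation, the CPU feature flags live in /proc/cpuinfo on Linux. A quick sketch (the avx2 flag is the one that matters here):

```shell
# Look for the avx2 flag; on my Sandy Bridge Xeons this prints "no avx2".
flag=$(grep -m1 -o avx2 /proc/cpuinfo || echo "no avx2")
echo "$flag"
```

If this prints avx2, you don't need any of the workarounds in this post.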
PyTorch has some components that assume you have AVX2, and this blog post describes how I recompiled PyTorch to work with the potato that I put my new GPU into.
I had been going through the Hugging Face tutorials for the transformers library just fine until I decided to try out a project called glados-tts, which uses a text to speech model trained to talk like GLaDOS from Portal 2. It would initialize the NNPACK library and give me a warning about unsupported hardware, and immediately after that I would get an illegal instruction exception. It turns out this was from one of the compiled libraries in PyTorch trying to use AVX2 instructions.
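For context, an "illegal instruction" exception is the kernel delivering SIGILL (signal 4) when the CPU hits an opcode it doesn't implement, and a shell reports a process killed this way as exit status 128 + 4 = 132. A quick way to see the signature without needing an AVX2 crash (this just sends the signal to a throwaway shell):

```shell
# Simulate a process dying from an illegal instruction (SIGILL).
sh -c 'kill -s ILL $$' || status=$?
echo "exit status: $status"   # 132 = 128 + SIGILL(4)
```

That 132 status is a handy way to tell a SIGILL crash apart from an ordinary Python exception when you're debugging this kind of thing.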
I'm no stranger to working with C compilers and build systems of all descriptions, so let's kick it old school like it's the early 2000s and we're tweaking compiler flags in Gentoo.
I initially tried to change the NNPACK code to use a different backend, which I still do in this article, but the real culprit turned out to be FBGEMM (Facebook General Matrix Multiplication), which requires AVX2 to work at all. I found this out by running Python under gdb and doing a backtrace. It also turns out that you can opt not to use FBGEMM when building PyTorch, and this, along with setting NNPACK_BACKEND=psimd, was the solution.
You're going to need the following things on your system before attempting to build PyTorch. I'm building on Ubuntu 22.04. This is not an exhaustive list, but I will mention that all my NVIDIA software comes from the NVIDIA developer package repo: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#network-repo-installation-for-ubuntu
I set up a virtual environment with the system Python for my build, but it's probably even easier to use conda. I'm still learning how to use conda, so I will describe my method. I think the general build steps will work just as well in a conda environment, if not better. I'm used to working with python -mvenv envnamehere on Django projects, so it's kind of been my default choice.
python3 -mvenv blog_test
source blog_test/bin/activate
pip install -U pip setuptools wheel
pip install cmake ninja
mkdir src
cd src
git lfs install
git clone --recursive https://github.com/pytorch/pytorch --branch v2.1.1
cd pytorch
git submodule sync
git submodule update --init --recursive
pip install -r requirements.txt
export USE_FBGEMM=0
export PYTORCH_BUILD_NUMBER=1
export PYTORCH_BUILD_VERSION=2.1.1
export MAX_JOBS=32
python setup.py --cmake-only install
At this point we want to use ccmake to change some options.
ccmake build
set DISABLE_AVX2=ON
set NNPACK_BACKEND=psimd
make sure USE_FBGEMM is set to OFF
Select configure from the menu, press e to exit the output screen when it finishes, and then quit ccmake.
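If you'd rather not drive the curses interface at all, the same three options can be set non-interactively against the build directory. A sketch, assuming the build directory that the --cmake-only step created is named build (I used ccmake myself, so treat this as the untested equivalent):

```shell
# Set the same cache variables without the interactive ccmake screen.
cmake -DDISABLE_AVX2=ON -DNNPACK_BACKEND=psimd -DUSE_FBGEMM=OFF build
```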
This step will take a long time. On my dual Xeon E5-2670 with 64GB of RAM it takes just under an hour. I'm using the MAX_JOBS environment variable to limit the build to 32 compilers going at the same time. This can take up quite a bit of memory, and without increasing my swap file to 32GB in size it would hang the machine at the later stages of the build.
You will need to tune MAX_JOBS and the size of your swap for your system.
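One way to pick a starting value is to derive MAX_JOBS from installed RAM. A sketch using my rough rule of thumb (roughly 2GB per concurrent compile job is my own estimate from watching this build, not an official figure):

```shell
# Derive MAX_JOBS from total RAM, assuming ~2GB per compile job.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
jobs=$(( mem_kb / (2 * 1024 * 1024) ))
if [ "$jobs" -lt 1 ]; then jobs=1; fi
export MAX_JOBS=$jobs
echo "MAX_JOBS=$MAX_JOBS"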
python setup.py install
When this is finished, PyTorch should be installed into your current virtual environment.
I was trying to use the glados-tts project when I ran into this particular problem, so let's test that out. If it runs on your machine that lacks AVX2 instructions, you've built PyTorch successfully.
Using the same virtual environment you've installed PyTorch into, clone glados-tts and try to run glados.py:
cd $HOME/src
git clone https://github.com/nerdaxic/glados-tts
cd glados-tts
pip install phonemizer inflect unidecode scipy
Now run glados.py. Type in a sentence and it should generate output.wav and try to play it with aplay. You can scp the file somewhere else to confirm it produced audio correctly if you don't have local sound. The session should look something like this:
(blog_test) wschaub@GLaDOS:~/src/glados-tts$ python glados.py
Initializing TTS Engine...
Input: Would you like some cake?
Forward Tacotron took 151.5219211578369ms
HiFiGAN took 212.4173641204834ms
Playing WAVE './output.wav' : Signed 16 bit Little Endian, Rate 22050 Hz, Mono
Input:
Now that we know it works, let's package it up so we can install it into all our other virtual environments easily. We should still be inside the blog_test virtual environment at this point and still have the same environment variables set as before when we did the build.
cd $HOME/src/pytorch
python setup.py bdist_wheel
Inside $HOME/src/pytorch/dist you should find a file that looks something like torch-2.1.1-cp310-cp310-linux_x86_64.whl
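The pieces of that filename are meaningful: package, version, Python tag, ABI tag, and platform tag, which is why this wheel will only install on CPython 3.10 on x86-64 Linux. A throwaway sketch pulling the name apart (pure string handling, nothing PyTorch-specific):

```shell
# Split a wheel filename into its tag fields.
whl="torch-2.1.1-cp310-cp310-linux_x86_64.whl"
IFS=- read -r pkg version pytag abitag plat <<EOF
${whl%.whl}
EOF
echo "package=$pkg version=$version python=$pytag abi=$abitag platform=$plat"
# prints: package=torch version=2.1.1 python=cp310 abi=cp310 platform=linux_x86_64
```

If you rebuild this wheel against a different Python version, those tags (and so the filename) will change to match.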
Copy it to your home directory and try installing it into a fresh Python virtualenv, then re-test with glados-tts:
deactivate
cp dist/torch-2.1.1-cp310-cp310-linux_x86_64.whl $HOME/
cd $HOME
python3 -m venv new-venv
source new-venv/bin/activate
pip install -U pip setuptools wheel
pip install torchvision torchaudio $HOME/torch-2.1.1-cp310-cp310-linux_x86_64.whl
pip install phonemizer inflect unidecode scipy
cd $HOME/src/glados-tts/
python glados.py
If all goes well, you should be set for a while; just install torch from your .whl file.