
Apple M1 - autotrain setup warning - The installed version of bitsandbytes was compiled without GPU support. #278


Closed
neoneye opened this issue Sep 23, 2023 · 15 comments

@neoneye

neoneye commented Sep 23, 2023

I'm getting a warning during installation that worries me: will autotrain be able to fine-tune Llama without GPU acceleration?

I investigated how to compile bitsandbytes with GPU acceleration for M1, and it's not yet supported; see issue 252.

PROMPT> pip install autotrain-advanced
PROMPT> autotrain setup --update-torch
/opt/homebrew/lib/python3.11/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
'NoneType' object has no attribute 'cadam32bit_grad_fp32'
> INFO    Installing latest transformers@main
> INFO    Successfully installed latest transformers
> INFO    Installing latest peft@main
> INFO    Successfully installed latest peft
> INFO    Installing latest diffusers@main
> INFO    Successfully installed latest diffusers
> INFO    Installing latest trl@main
> INFO    Successfully installed latest trl
> INFO    Installing latest xformers
> INFO    Successfully installed latest xformers
> INFO    Installing latest PyTorch
> INFO    Successfully installed latest PyTorch
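
For what it's worth, the bitsandbytes warning only says that bitsandbytes itself (which is CUDA-only) has no GPU backend here; it says nothing about whether PyTorch can see the Apple GPU. A minimal sketch for checking the latter via PyTorch's MPS backend:

import torch

# bitsandbytes is CUDA-only, so its warning is expected on Apple Silicon.
# PyTorch's Metal (MPS) backend is what would provide GPU acceleration here.
print("MPS built:    ", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())

if torch.backends.mps.is_available():
    x = torch.ones(3, device="mps")  # allocate a tensor on the Apple GPU
    print(x.device)                  # -> mps:0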

Ideas for improvement

  • Print a summary at the bottom: is the lack of GPU support on Mac a known problem, or am I the only one experiencing it?
  • Update the install guide in the readme to note that Mac isn't yet supported: https://github.com/huggingface/autotrain-advanced
  • Add an install guide for Mac, to reassure newcomers that autotrain does work on Mac with GPU acceleration.
@QueryType

I would also like to know about the possibilities of using a Mac M2 for autotrain. Thanks.

@Satyam7166-tech

@neoneye, @QueryType, did you try running it on CPU only, though?
I have access to an M2 Ultra Mac Studio and was searching for ways to fine-tune an LLM.

@QueryType

> @neoneye, @QueryType, did you try running it on CPU only, though? I have access to an M2 Ultra Mac Studio and was searching for ways to fine-tune an LLM.

I received a response from Abhishek Thakur; he says M2 is not yet supported, so I'm hoping it comes through. I do not know exactly how to run it only on the CPU.
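
For what it's worth, plain Transformers (outside autotrain) keeps a model on the CPU unless you ask otherwise; a minimal sketch, with the model name as a placeholder:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM works

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Models load onto the CPU by default; just avoid device_map="auto" and
# avoid calling .to("mps") / .to("cuda").
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float32)
print(next(model.parameters()).device)  # -> cpu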

@neoneye
Author

neoneye commented Oct 4, 2023

> @neoneye, @QueryType, did you try running it on CPU only, though? I have access to an M2 Ultra Mac Studio and was searching for ways to fine-tune an LLM.

I didn't continue with autotrain on macOS. Instead, I ended up using axolotl for training on a hefty GPU in the cloud.

@Satyam7166-tech

@neoneye, @QueryType, thanks for your prompt reply.

Is there any way to do fine-tuning on a Mac? It's OK for me if the GPU is not utilised.

@neoneye, the place that I work at has a lot of confidential data, and they are not willing to give it to cloud providers. Any idea how I can work with this?

Thank you

@Satyam7166-tech

Also, can we use the llama2.c repo somehow to train on a Mac?

@abhishekkrthakur
Contributor

Why would you want to use autotrain on a Mac? To finetune LLMs, or something else?

@abhishekkrthakur
Contributor

> @neoneye, @QueryType, did you try running it on CPU only, though? I have access to an M2 Ultra Mac Studio and was searching for ways to fine-tune an LLM.

> I didn't continue with autotrain on macOS. Instead, I ended up using axolotl for training on a hefty GPU in the cloud.

You can do the same, and maybe better, using autotrain; everything lies within the Hugging Face ecosystem.

@abhishekkrthakur
Contributor

abhishekkrthakur commented Oct 5, 2023

LLM training on M1/M2 is available from version 0.6.35+. Please update.
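
Upgrading follows the same pattern as the original install, e.g.:

PROMPT> pip install --upgrade autotrain-advanced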

@QueryType

> LLM training on M1/M2 is available from version 0.6.35+. Please update.

This is cool, thanks; let me try. I agree, @abhishekkrthakur, it is not a good idea to train locally. However, my organisation currently requires everything to be "local" and nothing on the "internet" due to IPR etc. We can break our heads against a wall but cannot explain the logic to Legal. :)

@abhishekkrthakur
Contributor

The problem is, it will take ages (provided it works); there is no int4, int8, or fp16 on M1/M2 yet.
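
A quick, non-authoritative way to probe which dtypes the MPS backend accepts on a given PyTorch build (support has varied across releases):

import torch

# Try to allocate a tiny tensor in each dtype on the Apple GPU and report
# whether the backend accepts it; treat the output as machine-specific.
if torch.backends.mps.is_available():
    for dtype in (torch.float32, torch.float16, torch.bfloat16):
        try:
            torch.ones(1, dtype=dtype, device="mps")
            print(dtype, "supported")
        except (TypeError, RuntimeError) as exc:
            print(dtype, "unsupported:", exc)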

@Satyam7166-tech

@abhishekkrthakur, thank you so much for replying. I am a big fan of your work, especially your YouTube videos; they are very well explained.

As @QueryType explained, I can't use cloud solutions for the same reason. It actually feels good to know someone else is going through the same headache xD

I'll try it out and let you know, @abhishekkrthakur.

@abhishekkrthakur
Contributor

Thank you for your kind words. What I'm saying is that using M1/M2 will take ages to train, if it works at all. If cloud isn't an option, it would be better to move to a local Ubuntu machine with several GPUs instead.

@Satyam7166-tech

Satyam7166-tech commented Oct 5, 2023

> Thank you for your kind words. What I'm saying is that using M1/M2 will take ages to train, if it works at all. If cloud isn't an option, it would be better to move to a local Ubuntu machine with several GPUs instead.

Ah, I see. Yes, I am trying to borrow a gaming laptop from a friend; buying GPUs is not an option for now.

Although, I do have access to an M2 Ultra with 128 GB of memory, a 24-core CPU, and a 76-core GPU. Is that also not viable?

@Satyam7166-tech

@abhishekkrthakur I ran autotrain-advanced on the Mac, and it seems to have worked. I didn't load it in 8-bit, though.

However, I am getting these warnings:

> /opt/homebrew/Caskroom/miniforge/base/envs/testFine/lib/python3.10/site-packages/torch/utils/data/dataloader.py:645: UserWarning: Length of IterableDataset <trl.trainer.utils.ConstantLengthDataset object at 0x4cd9da740> was reported to be 100 (when accessing len(dataloader)), but 143 samples have been fetched. 
  warnings.warn(warn_msg)
{'loss': 4.2429, 'learning_rate': 0.0, 'epoch': 2.14}                                                                                                                                                            
{'train_runtime': 619.2563, 'train_samples_per_second': 0.484, 'train_steps_per_second': 0.484, 'train_loss': 4.507806447347005, 'epoch': 2.14}                                                                  
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [10:19<00:00,  2.06s/it]
> INFO    Finished training, saving model...

I had a question, though: these files got generated. What is the next step?

README.md               adapter_model.bin       checkpoint-300          special_tokens_map.json tokenizer.model         training_args.bin
adapter_config.json     added_tokens.json       runs                    tokenizer.json          tokenizer_config.json   training_params.json
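
Those files (adapter_config.json, adapter_model.bin) are a PEFT/LoRA adapter plus tokenizer files, so the usual next step is to load the adapter on top of the base model for inference; a minimal sketch, where the base model name and output directory are placeholders:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_name = "meta-llama/Llama-2-7b-hf"  # placeholder: the model you tuned
output_dir = "."                              # directory with adapter_model.bin

tokenizer = AutoTokenizer.from_pretrained(output_dir)  # tokenizer files are in the output dir
base = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(base, output_dir)    # attach the LoRA adapter

inputs = tokenizer("Hello", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))

If you want a standalone checkpoint, model.merge_and_unload() folds the LoRA weights back into the base model so it can be saved and served without peft.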
