doctoboggan a day ago

Maybe I misunderstand, but it seems like they are using LoRA, which is a fine-tuning technique. That requires an already existing pre-trained LLM. If that's true, I think the title of this submission is inaccurate, since this doesn't let you train a model from scratch with 2 consumer GPUs.
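
The distinction in one sketch (my illustration, not the article's code): in LoRA the pretrained weight W stays frozen, and only the small low-rank factors A and B are trained. Without an already-trained W there is nothing to adapt.

```python
# Minimal LoRA forward pass in plain Python (illustrative values, tiny
# matrices). W is the frozen pretrained weight; A (d x r) and B (r x d)
# are the only trainable parameters.

def matmul(X, Y):
    """Naive matrix multiply for small illustrative matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_forward(x, W, A, B, alpha, r):
    """y = x @ W + (alpha / r) * (x @ A @ B) -- W frozen, A and B trainable."""
    base = matmul(x, W)
    delta = matmul(matmul(x, A), B)
    scale = alpha / r
    return [[base[i][j] + scale * delta[i][j]
             for j in range(len(base[0]))] for i in range(len(base))]

# Frozen pretrained 2x2 weight; rank-1 adapter with B initialized to zero,
# so at the start of fine-tuning the model equals the base model exactly.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5], [0.5]]      # d x r, random init in practice
B = [[0.0, 0.0]]        # r x d, zero init -> delta starts at zero
x = [[2.0, 3.0]]

print(lora_forward(x, W, A, B, alpha=1.0, r=1))  # -> [[2.0, 3.0]]
```

Only A and B (2·d·r values) get gradients, versus d² for full fine-tuning, which is why it fits on consumer GPUs at all.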

  • botro a day ago

    Yes, they put this in footnote 1: "Throughout this article “training” can refer to either pre-training, or fine-tuning." But the article is just talking about fine-tuning.

    • oceanplexian a day ago

      "The thing the word actually means isn't the way we're using it" isn't how I would use a footnote.

underlines a day ago

I hand-curate github.com/underlines/awesome-ml, so I read a ton about the latest trends in this space. When I started reading the article, a lot of the information felt weirdly familiar and almost outdated.

The space is moving fast, after all. They just seem to be explaining QLoRA fine-tuning (yes, a great achievement, and all the folks involved are heroes), but for a trending article on HN it felt off.

Turns out I was too dumb to check the date: 2024. And the title is mixing up quantized-adapter fine-tuning with base-model training. Thanks lol
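
For anyone unfamiliar, the "Q" in QLoRA is roughly this (my simplified sketch, not the article's method): base weights are stored in a low-bit format and dequantized on the fly, while the LoRA adapter weights stay in full precision and are the only trained parameters. Here is naive symmetric absmax quantization as an illustration; the real QLoRA uses a fancier NF4 format.

```python
# Toy symmetric absmax quantization to signed 4-bit integers -- a simplified
# stand-in for the low-bit storage of frozen base weights in QLoRA.

def quantize_absmax(weights, bits=4):
    """Map floats to signed ints in [-(2^(bits-1)-1), 2^(bits-1)-1]."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from ints plus one scale factor."""
    return [qi * scale for qi in q]

w = [0.7, -0.35, 0.1, -0.7]
q, s = quantize_absmax(w)
w_hat = dequantize(q, s)
print(q)      # -> [7, -4, 1, -7]
print(w_hat)  # approximate reconstruction of w, error bounded by scale/2
```

Storing each frozen weight in ~4 bits instead of 16 is what lets a big base model plus small full-precision adapters squeeze into consumer VRAM.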

darkbatman a day ago

Would be nice to see some benchmarks.

Also, from my experience, you need more compute to get significant results. Fine-tuning mostly works when the base model is already very close to what you are trying to achieve; otherwise you won't be happy with the results.

Context length also becomes an issue when you're trying to fit training onto GPUs with less RAM.
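
Back-of-envelope illustration of that last point (my numbers, assuming a hypothetical Llama-2-7B-like shape): the KV cache alone grows linearly with context length, before you even count weights, gradients, and optimizer state.

```python
# Rough KV-cache size estimate for an assumed 7B-class decoder:
# 32 layers, 32 KV heads, head_dim 128, fp16 (2 bytes per element).

def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128,
                   bytes_per_elem=2):
    """2x for the separate K and V tensors, one entry per layer/head/token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * seq_len

for ctx in (2048, 4096, 8192):
    gib = kv_cache_bytes(ctx) / 2**30
    print(f"context {ctx:5d}: ~{gib:.1f} GiB of KV cache (fp16)")
```

At 8192 tokens that's ~4 GiB just for the cache on these assumed dimensions, which is a big slice of a 24 GB consumer card.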

lostmsu a day ago

Clickbait. They fine tune. Still sounds potentially useful.

nvtop a day ago

March 2024

mawadev a day ago

[flagged]

  • WesolyKubeczek a day ago

    Better do it in winter, when you could use extra heat anyway.

    • mawadev a day ago

      Thank you for your advice, I will take it into account when I train my 70B language model at home in the winter days

      • WesolyKubeczek a day ago

        Everyone trains their 70B language model at home, even if they won't admit it. It's our little dirty secret.

    • smnplk a day ago

      winter is coming