Why we should pay more attention to tokenization and why it is more important than you think.
Thanks for your write-up, super informative!
If math is important, why isn't it common practice to force the tokenizer to split numbers into single digits? That should make math-related tasks much easier to handle. 🤔
Yeah, I think that is what should ideally be done. It's just too inefficient right now, since every digit becomes its own token and sequences get longer. LLMs currently treat math more like English text.
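For anyone curious, here is a minimal sketch of what digit splitting could look like as a pre-tokenization step. It assumes a plain regex pass that runs before any subword tokenizer; `split_digits` is a hypothetical helper written for illustration, not any library's API.

```python
import re

def split_digits(text: str) -> list[str]:
    """Hypothetical pre-tokenizer: every digit becomes its own piece,
    while runs of other non-space characters stay together."""
    # \d        -> exactly one digit per piece
    # [^\d\s]+  -> a run of characters that are neither digits nor whitespace
    return re.findall(r"\d|[^\d\s]+", text)

pieces = split_digits("12+345=357")
print(pieces)       # each digit is a separate piece
print(len(pieces))  # piece count grows with the number of digits
```

The trade-off mentioned above shows up directly: a 7-digit number always costs 7 pieces here, whereas a learned vocabulary might cover it in fewer tokens.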