Abu Dhabi-based technology group G42 on Tuesday said it had upgraded its flagship Llama-3-Nanda large language model to 87 billion parameters. NANDA 87B has been trained on a curated Hindi-English dataset with over 65 billion Hindi tokens, the company said.
“A custom Hindi-centric tokenizer boosts efficiency, reducing both training and inference time. This breakthrough makes it the largest and one of the most capable Hindi-centric models available in open weights,” the company said in a statement.
The updated LLM, G42 said, is engineered for real-world use, including casual speech, and works well even in Hinglish, a colloquial mixture of Hindi and English.
