Abu Dhabi's G42 group upgrades Llama-3-Nanda LLM with Hindi-English dataset
The updated LLM, G42 said, is engineered for real-world use, including casual speech, and works well even in Hinglish, a colloquial mixture of Hindi and English
Last Updated: Dec 16, 2025 | 8:03 PM IST
Abu Dhabi-based technology group G42 on Tuesday said it had released an upgraded version of its flagship Llama-3-Nanda large language model (LLM) with 87 billion parameters. The upgraded model, NANDA 87B, has been trained on a curated Hindi-English dataset with over 65 billion Hindi tokens, the company said.
“A custom Hindi-centric tokenizer boosts efficiency, reducing both training and inference time. This breakthrough makes it the largest and one of the most capable Hindi-centric models available in open weights,” the company said in a statement.
The updated LLM, G42 said, is engineered for real-world use, including casual speech, and works well even in Hinglish, a colloquial mixture of Hindi and English.
“It delivers strong performance across translation, summarisation, instruction-following, and transliteration tasks. Safety and cultural alignment are core to its design, enabling NANDA to generate context-aware, responsible responses,” G42 said.
“As we continue to scale our operations across the country, this model opens doors for more inclusive innovation in education, entertainment, enterprise, and beyond. This upgrade reflects G42’s deep commitment to building AI solutions that serve India’s vibrant AI ecosystem,” Manu Jain, chief executive officer of G42 India, said.
In September last year, G42 launched Nanda, a 13-billion-parameter LLM trained on approximately 2.13 trillion tokens from language datasets, including Hindi. Earlier, in August 2023, the company had launched JAIS, the world’s first open-source Arabic LLM.