While India is working to develop its own large language models (LLMs), OpenAI has just announced IndQA, a benchmark designed to test not only language processing but also a model's grasp of India's unique cultural nuances.
IndQA includes 2,278 questions spanning 12 languages and 10 cultural domains, compiled in collaboration with 261 experts across India.
Topics range from art, culture, history, religion, and cuisine to media, sports, and everyday life, in widely spoken languages such as Hindi, Tamil, Bengali, Marathi, Telugu, and Hinglish (a mix of Hindi and English), which are typical forms of communication in the country.
OpenAI said IndQA's goal is not to create rankings, but to measure how AI models improve over time, helping research groups better understand a model's ability to grasp cultural context.
The scoring system is based on criteria set by the experts, with a scale that reflects the completeness and appropriateness of each answer.
Notably, the questions in IndQA were carefully filtered: only those that OpenAI's advanced models, such as GPT-4o, o3, and GPT-4.5, failed to answer well were retained.
According to OpenAI, this leaves headroom for future models to improve while keeping the benchmark from becoming too easy.
During development, the IndQA team also drew on journalists, scholars, artists, poets, and chess players, reflecting the diversity and depth of Indian cultural knowledge. Each question comes with an English translation and sample answers to help standardize the assessment.
The initiative comes as India is becoming ChatGPT's second largest market, after the US.
OpenAI is also stepping up its presence in the country of more than 1 billion people, with events such as DevDay Exchange in Bengaluru and a year of free ChatGPT Go for early subscribers.
However, experts say IndQA is only a first step in the journey toward local AI benchmarks.
Several Indian startups, such as Sarvam and Shunya Labs, have complained that global benchmarks are skewed toward English, making it difficult for Indian-language models to compete.
Recently, Shunya Labs said its Pingala speech model achieved an error rate of only 3.1%, better than Nvidia's 5.6%, yet it was still not rated highly because of a lack of transparency in the evaluation process.
IndQA is therefore seen as an important effort to create a fairer playing field for Indian-language models, paving the way for an AI ecosystem with the distinct identity and standards of this South Asian country.