
LLM-based tools such as OpenAI's ChatGPT can work in multiple languages, but the data they are trained on is primarily in English. Project Indus aims to initially support 40 Hindi dialects and other languages. The model has the capacity to serve 25 percent of the world's population, which would benefit a large number of non-English speakers.
The biggest challenge for the project is the lack of datasets in local languages. Most of the datasets that do exist are either untranslated or incomplete. Hindi datasets are also mostly fragmented, making them difficult to process. The company has started a campaign, “Bhasha Daan”, which aims to collect samples of local dialects and languages. Its portal allows speakers of a particular dialect to record information that will be used later.