India Artificial Intelligence Mission (AI, LLM, ML): News, Updates & Discussions

The above spec is true for the highest-capability variant.

I have the DeepSeek 8B variant running perfectly fine on my 16 GB RAM VM, occupying just 5 GB of disk space.
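For context, a ~5 GB disk footprint is consistent with an 8B-parameter model quantized to roughly 4 to 5 bits per weight. A quick back-of-the-envelope check (my own illustration, not from the post above):

```python
# Rough on-disk size estimate for a quantized language model.
def model_size_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size in GB, ignoring metadata and tokenizer overhead."""
    return params * bits_per_weight / 8 / 1e9

fp16 = model_size_gb(8e9, 16)   # unquantized half precision
q4 = model_size_gb(8e9, 4.5)    # typical 4-bit quantization with overhead

print(f"fp16:  {fp16:.1f} GB")  # ~16 GB: would not fit comfortably
print(f"4-bit: {q4:.1f} GB")    # ~4.5 GB: matches the ~5 GB observed
```

This is why the 8B variant fits on a modest VM while the full-size model does not.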

So it depends on which variant you want to use.

Is that where the command-line screenshot is from?

How does the lite version compare to the full one?
 
India Needs a Stake in the $500 Billion “Stargate” U.S. AI Project

One of the key policy decisions President Trump made early in his term was to prioritize Artificial Intelligence (AI) development on an unprecedented scale. The ambitious objective of this initiative is to enable machines to replicate human cognitive functions, an area where progress, despite breakthroughs, remains in its infancy. While innovations from pioneers like ChatGPT and bargain-priced newcomers like DeepSeek have advanced the field, they have only scratched the surface of mimicking the complexities of human intelligence.

The $500 billion “Stargate” project, a bold, multi-year endeavor spearheaded by President Trump, seeks to overcome these challenges and push the boundaries of AI research and development. As the world watches the United States chart this transformative path, India must position itself as a pivotal player in this global initiative.

Why India Matters

Silicon Valley is abuzz with excitement over the ‘Stargate’ project. Major players like OpenAI under Sam Altman, Tesla led by Elon Musk, and tech giants like Microsoft have all expressed interest in leading this monumental effort. However, as the U.S. moves forward, the project’s success hinges on access to a highly skilled workforce and the infrastructure needed to sustain such an ambitious undertaking.

Developing and operating these advanced AI systems will require massive computational resources, storage capacity, reliable power generation, and real estate for data centres, all of which the U.S. has. Crucially, it will also demand a skilled workforce capable of developing, refining, and managing these systems around the clock. This is where India can step in.

The Workforce Solution

The U.S. is likely to face a shortage of trained personnel needed to build, operate, and evolve the AI systems under Stargate. Moreover, as AI systems learn and grow, continuous refinement and development will be essential, requiring tens of thousands of highly skilled technical experts. With its vast talent pool of engineers, scientists, and IT professionals, India is uniquely positioned to meet this demand.

However, minor obstacles, such as visa restrictions, must be addressed to incentivize Indian talent to work in the U.S. Additionally, the growing camaraderie between President Trump and Prime Minister Modi provides a strong foundation for facilitating this collaboration.

Expanding AI Infrastructure

India’s role extends beyond supplying a skilled workforce. The U.S. may need to expand its data center infrastructure beyond its borders to support the exponential growth in AI research. India, with its burgeoning tech industry and cost-effective infrastructure, is an ideal partner to house such facilities. Its strategic location and expertise in managing large-scale IT operations further strengthen its case as a key player in the Stargate project.

AI: The Flag Bearer of India’s Digital Economy

AI is already at the forefront of India’s digital revolution. With a billion Indians connected to the digital economy, the country is rapidly embracing AI-driven innovations that are transforming sectors like banking, healthcare, defense, agriculture, and consumer services. India’s simultaneous advancements in AI mirror the rapid progress being made in the U.S., positioning it as a natural ally in global AI development. The Ambanis have announced a large data centre in Gujarat, India, built in collaboration with Nvidia, to capitalize on the AI push in India and worldwide.

Furthermore, India’s accelerated push to develop its semiconductor industry will bolster its readiness to contribute to future AI advancements, ensuring it remains a critical partner in the AI ecosystem.

Conclusion

The Stargate project is poised to redefine the future of AI, with the potential to shape every aspect of life. For India, participating in this transformative initiative is not just an opportunity but a necessity. By leveraging its skilled workforce, infrastructure, and growing technological capabilities, India can cement its position as an indispensable partner in one of the most ambitious AI undertakings in history.
 

View: https://twitter.com/bookwormengr/status/1884176350652801212?s=19

Writing this as an Indian who works on AI in a leadership role for one of the largest companies in the world (though this is strictly my personal opinion, it is based on verifiable data).


You heard it first here:
—————————-
First, some more shocks:
You have heard of DeepSeek.
Wait till you hear about Qwen (Alibaba), MiniMax, Kimi, and Doubao (ByteDance), all from China.
Within China, DeepSeek is not unique, and its competition is close behind (not far behind).
IMHO, China has 10 labs comparable to OpenAI/Anthropic and another 50 tier-2 labs.
The world will discover them in the coming weeks in awe and shock.


AI is not hard (I am not high)
————————————
Ignore Sam Altman.
Many teams that built foundation models have fewer than 50 people (e.g., Mistral AI).
In AI, the LLM science part is actually quite easy.
All these models are “Transformer decoder-only models,” an architecture introduced in 2017.
There have been improvements since then (FlashAttention, RoPE, MoE, PPO/DPO/GRPO), but they are relatively minor, open source, and easy to implement.
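To make “decoder-only” concrete, here is a toy causal self-attention step in plain Python. This is an illustrative sketch of the core mechanism, not production code; real implementations run on GPU tensor libraries with tricks like FlashAttention:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.
    q, k, v: lists of T vectors of dimension d. Token t may only attend
    to tokens 0..t; that restriction is the 'decoder-only' constraint."""
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # scores against all positions up to and including t
        scores = [sum(qi * ki for qi, ki in zip(q[t], k[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        weights = softmax(scores)
        # weighted mix of the visible value vectors
        out.append([sum(w * v[s][j] for s, w in enumerate(weights))
                    for j in range(d)])
    return out

# Three tokens, dimension 2. The first output can only see token 0,
# so it equals v[0] exactly.
q = k = v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(causal_attention(q, k, v)[0])
```

Stacking this attention step with feed-forward layers, dozens of times, is essentially the whole architecture the post is referring to.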


Since building foundation models is easy, and Nvidia is there to help you (if not directly, then by sharing software like “Megatron,” an assembly line for building AI models), there are many foundation models built by Chinese labs as well as global labs.


It is machines that learn by themselves…if you give them data and compute. This is unlike writing operating-system or database software. Also, everyone trains on the same data for the first stage, called “pre-training”: internet archives, books, and GitHub code.


What part is hard, then?
———————————-
It is the parallel and distributed computing needed to run AI training jobs across thousands of GPUs that is hard. DeepSeek did a lot of innovation here to save on FLOPs and network calls. They used an innovative architecture called Mixture of Experts (MoE) and a new approach called GRPO with verifiable rewards, both of which entered the open domain through 2024.
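A hypothetical sketch of the Mixture-of-Experts idea mentioned above: a router sends each token to only the top-k experts, so most of the network’s parameters stay idle per token, which is where the FLOP savings come from. This is my own toy illustration; DeepSeek’s actual routing is considerably more sophisticated:

```python
def top_k_route(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: -gate_scores[i])
    chosen = ranked[:k]
    total = sum(gate_scores[i] for i in chosen)
    return {i: gate_scores[i] / total for i in chosen}

def moe_layer(token, experts, gate_scores, k=2):
    """Run ONLY the chosen experts and combine their outputs by gate weight."""
    routing = top_k_route(gate_scores, k)
    return sum(w * experts[i](token) for i, w in routing.items())

# Four 'experts' as simple scalar functions; only two run per token.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x - 1, lambda x: x * x]
# Experts 1 and 3 win the gate (0.6 and 0.2), renormalized to 0.75 and 0.25:
# 0.75 * (2*3) + 0.25 * (3*3) = 6.75
print(moe_layer(3.0, experts, gate_scores=[0.1, 0.6, 0.1, 0.2], k=2))
```

With hundreds of experts and k of 2 or so, each token touches only a small slice of the total parameters, which is how a very large model can be cheap per token to train and serve.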


Also, there is a lot of data curation needed, particularly for “post-training”: teaching the model the proper style of answering (SFT/DPO) or teaching it to reason (GRPO with verifiable rewards). SFT/DPO is where “stealing” from existing models to save the cost of manual labor may happen.


LLM building is nothing that Indian engineers living in India cannot pull off. Don’t worry about the Indians who have left; there are plenty in the country as of today.


Then why does India not have foundation models?
———————
It is for the same reason India does not have Google or Facebook of its own.


You need to be able to walk before you can run.


There is no protected market in which to practice your craft in the early days. You will get replaced by American service providers, as they are cheaper and better, every single time. That is not the case with the Chinese players: they have a protected market and leadership that treats this skillset as existential due to geopolitics.


So, even if Chinese models are not good in the early days, they will continue to get funding from their conglomerates as well as provincial governments. Darwinian competition ensures the best rise to the top.


Recall that DeepSeek took 2 years to get here without much revenue. They were funded by their parent. Also, most of their engineers are not PhDs.


There is nothing that the engineers who built Ola/Swiggy/Flipkart cannot build. Remember, these services are second to none when you compare them to their Bay Area counterparts. Also, don’t trivialize those services; there is brilliant engineering behind making them work at the price points at which they work.


An Indian DARPA with 3B USD in funding over 3 years
———————-
What we need is a mentality that treats this skillset as existential. We need a national fund that will fund such teams, where the only expected output is benchmark performance, with the benchmarks becoming harder every 6 months. No revenue would be needed to survive for the first 3 years.


That money would be loose change for the GOI and the world’s richest men living in India.


@protosphinx @balajis @vikramchandra @naval


View: https://twitter.com/bookwormengr/status/1884461105965351010?s=19

Protectionism vs. the Indian National AI Muscle-building Program
--------------------------------
Thanks for bringing up this very important question, Sadanand; I should have been clearer in my essay. The idea is NOT to restrict customers’ AI technology choices: that would be counterproductive. Customers are free to choose what they want (subject to data-locality needs).
Then how will these Indian AI labs make money?


Recall how DARPA works:
---------------------------
The Indian AI labs will be labs, with no obligation to sell (if they still sell successfully, they keep the upside). They work on research projects. Their funding is decided purely by the outcomes of foundation-model quality. They have to open-source their research and models.
They are given progressively harder challenges to meet or exceed benchmarks, so that they reach SOTA in no more than 3 years. Again, SOTA is a moving target; it needs to be reviewed from time to time to adjust the pace.


This is how OpenAI, Anthropic, DeepSeek (funded single-handedly by High-Flyer), and Qwen (funded single-handedly by Alibaba) operated in their early days. This is how DARPA helps teams chase crazy ideas without revenue goals, but with “measurable outcomes and short timelines.” Below are notes about DARPA funding.


This is also how Chinese provincial governments encourage EV, solar-panel, and semiconductor startups. Research in the early days is funded by the government with clear goals.


Focus Area:
-------------
1. LLMs are not the only foundation models. There are vision, voice, and robotics foundation models, and then there are multiple multi-modal models.
2. The program also needs to focus on India-specific datasets (already happening, thanks to the IITs and startups like Sarvam), but moreover, we need to build evaluation sets, which are sorely lacking for low-resource languages.
3. We also need to focus on hardware: think about being able to work with multiple supply chains, and learn to build models at various sizes. Start with Nvidia in the early days. India also has a lot of chip-design firms and an emerging manufacturing ecosystem (though it will take time).
4. We also need to work with defense, security, weather, and healthcare services to see whether their needs can be met, or whether they can sponsor challenges, provide datasets, and establish benchmarks (this should ideally be done in year 2).


Be a Vishwa-Vidyarthi first (one who learns from everyone).
------------------------------------------
Initially there will be a lot of “reinventing the wheel” in terms of setting up pipelines or recreating things already created elsewhere; but that is how you learn: by copying the masters.


That is how Sony became good at making transistors, and then they gifted the Walkman to the world.


The Chinese have no shame in doing it; Indians should not either. Remember: walk before you run.


Why must India be good at building foundation models when there are open-source ones?
------------------------------------
India is a glorious civilisation that cannot choose to depend on hostile nations. At the same time, India can contribute to local and global prosperity by building this technology.


Analogy
----------
This program is like a morning jog or a gym session. You do it to build lung and heart capacity and muscle. It doesn’t instantly give you money, but it enables you to earn more by keeping you energetic and vibrant throughout the day.


This shall be India’s National AI Muscle-building Program. It will spin up a high-tech ecosystem through spin-offs, the way it works in Israel, China, and the USA.


3 billion USD over 3 years is a small price to pay to avoid missing this industrial revolution. I will write later about how I arrived at this number.


Thanks to Sadanand for posing this question, and to everyone who supported me through their retweets. I am new to writing on Twitter.

@svembu @TheJaggi @AbhijitChavda @vikramchandra @RajeevRC_X @AshwiniVaishnaw @priyankac19 @TVMohandasPai @sanjeevsanyal @srinivasiyc @anandmahindra @nikhilkamathcio @narendramodi @PMOIndia
 