AI-generated text in Indian government documents
December 10, 2025
Why would you lie about how much coal you have?
Why would you lie about something dumb like that?
Why would you lie about anything at all?
I came across this excellent Wikipedia article on signs of AI/LLM writing and decided to search for such text artifacts on Indian government websites and in official documents. I’ve been reading tenders and RFPs as research for another project, and they tend to be extremely verbose and lengthy.
Maybe there was a cottage industry in manufacturing filler text for these documents before 2023, but it can safely be assumed that like many smaller industries of artificial fluff, this one has also been made more efficient through LLMs. Here are some of my finds.
Note: In case the original links stop working, they have been archived on Wayback Machine, linked at the bottom of this page.
“As of my last knowledge update…”
A quick search for site:*.gov.in "as of my last training" OR "as of my last knowledge update" surfaced a “Fact Finding report” on farmer suicides from the Maharashtra State Human Rights Commission (MSHRC), which contains this disclaimer.
“As of my last knowledge update in January 2022, here are some key initiatives and regulations in India related to farmer suicides.”
And another one from the Kerala State Higher Education Council, in a handbook for master trainers.
“As of my last knowledge update in September 2021, the University Grants Commission (UGC) in India had proposed the concept of an Academic Bank of Credits (ABC)…”
“It wasn’t just…”
Another common AI writing pattern is the phrase “It wasn’t just [mundane thing], it was [another mundane thing]”. A search for site:*.gov.in OR site:*.nic.in "It wasn't just" filetype:pdf shows the phrase turning up in various official documents.
Removing the filetype:pdf filter reveals even more, especially from the Press Information Bureau (PIB), which seems to use this construction frequently in summaries of speeches.
utm_source=chatgpt
A utm_source tag is used for tracking where traffic comes from. Seeing utm_source=chatgpt.com in a URL tells us the link was likely copied directly from a ChatGPT session.
My search was site:.gov.in OR site:.nic.in "utm_source=chatgpt.com".
It led me to this PIB press release which has a link containing the ChatGPT tracker, which points back to another PIB press release.
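Checking a link for this tracker is easy to script. The following is a minimal sketch, not the method used for the searches above: the function names and the example URL are made up for illustration, and it assumes the tracker appears as an ordinary query parameter.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def utm_source(url: str) -> Optional[str]:
    """Return the value of the utm_source query parameter, or None if absent."""
    query = urlsplit(url).query
    values = parse_qs(query).get("utm_source")
    return values[0] if values else None


def looks_copied_from_chatgpt(url: str) -> bool:
    """True when the link carries the utm_source=chatgpt.com tracker."""
    source = utm_source(url)
    return source is not None and source.lower() == "chatgpt.com"


# Hypothetical URL, invented for this example.
example = "https://example.gov.in/release?id=123&utm_source=chatgpt.com"
print(utm_source(example))
print(looks_copied_from_chatgpt(example))
```

A hit only says the link passed through ChatGPT at some point; it says nothing about who wrote the text around it.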
Copy-pasted as received
Sometimes, the copy-paste job includes the chatbot’s sign-off. A document from a pharmacovigilance committee at Lakhimpur Medical College and Hospital, Assam, ends with “Let me know if you need any modifications”.
Here is “I hope this helps! Let me know if you have any other questions :)” in a presentation on “LLMs for Defense” on the Indian Navy website. It seems to have been delivered by an external speaker, but frankly, the provenance doesn’t matter to me because it exists on a gov.in domain.
Another common mistake is copying the entire text that an LLM returns, with the UI labels. A user adding stuff to myscheme.gov.in seems to have accidentally copied the button text from the AI interface right into the document. This description for a scheme ends with a stray “regenerate response”. The text itself is also suspiciously generic and repetitive.
And it’s not a one-off. Here is another one on the same portal.
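If you already have the text of a document, the leftovers quoted so far can be checked for with a plain substring scan. This is a rough sketch under my own assumptions: the phrase list is just the examples from this article, and ARTIFACT_PHRASES and find_artifacts are names invented here.

```python
import re
from typing import List

# Phrases taken from the examples in this article; extend as needed.
ARTIFACT_PHRASES = [
    "as of my last knowledge update",
    "as of my last training",
    "let me know if you need any modifications",
    "i hope this helps",
    "regenerate response",
]


def find_artifacts(text: str) -> List[str]:
    """Return the known chatbot leftovers that occur in the text."""
    # Normalize curly apostrophes, collapse whitespace, and lowercase
    # so line breaks and capitalization do not hide a match.
    normalized = re.sub(r"\s+", " ", text.replace("\u2019", "'")).lower()
    return [phrase for phrase in ARTIFACT_PHRASES if phrase in normalized]


sample = "This scheme supports beneficiaries across the state.\nRegenerate   response"
print(find_artifacts(sample))
```

A match is a hint, not proof. A document about chatbots could quote these phrases on purpose.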
Rich Tapestry
“Tapestry” (especially the rich, vibrant and intricately woven kind) is a purple word that LLMs are particularly fond of. A search for (site:.gov.in OR site:.nic.in) ("intricate tapestry" OR "vibrant tapestry") after:2022-12-01 shows its presence everywhere.
Then again, since LLMs are trained on public and non-public data, it is entirely possible that the government was already in the habit of describing everything as a rich tapestry, and that we are the ones who affected the training data.
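The searches in this article all have the same shape: a group of sites, a group of quoted phrases, and optional filters. Here is a small sketch of a helper that assembles such a query string. build_dork is a name I made up, and the output is only text to paste into a search box; it does not call any search API.

```python
from typing import Optional, Sequence


def build_dork(
    sites: Sequence[str],
    phrases: Sequence[str],
    filetype: Optional[str] = None,
    after: Optional[str] = None,
) -> str:
    """Assemble a search query from site filters, exact phrases and optional filters."""
    site_part = " OR ".join(f"site:{site}" for site in sites)
    phrase_part = " OR ".join(f'"{phrase}"' for phrase in phrases)
    # Parenthesize each OR group so the groups combine as intended.
    parts = [f"({site_part})", f"({phrase_part})"]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if after:
        parts.append(f"after:{after}")
    return " ".join(parts)


query = build_dork(
    [".gov.in", ".nic.in"],
    ["intricate tapestry", "vibrant tapestry"],
    after="2022-12-01",
)
print(query)
```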
“It’s not just about…”
A slight variation on the earlier theme, “it’s not just about” is another go-to phrase for LLMs trying to add depth. It appears in this transcript of ‘Mann Ki Baat’ from the Ministry of Information and Broadcasting and multiple other places.
Bonus: AI-Generated Images
AI-generated images are also making their way into official documents. The ones in this Delhi SCERT document on “Nurturing and Design Thinking” are riddled with spelling mistakes like “Survvey Form”, and the illustrations depict European people, which feels out of place. The document is full of such weird, lifeless and waxy illustrations.
Unnecessary Emojis
LLMs have a tendency to sprinkle emojis everywhere. This page on cybersecurity from AIIMS Bhubaneswar is an example.
Markdown Leftovers
This PDF from the Government of Odisha has some stray markdown.
Full documents
So far, this article has covered what I could find through dork searches, but the full scale of the problem cannot be measured this way because LLMs have gotten better at removing these kinds of artifacts. The same garbage feel remains, though, if you read articles in full instead of relying on keyword searches. Walls of generic, absolute garbage. As I find more, I will keep adding to this section, but for now, this page by the Comptroller and Auditor General (CAG) of India has an article about Generative AI written, obviously as a piece of knowing satire, entirely with an LLM.
Maybe this is just as bad as it was before.
The perverse incentive structure that equates length with rigor, which exists not just in the government but in our schooling and education as well, means that this was inevitable.















