A new study by scientists affiliated with Cornell University and the University of California, Los Angeles is raising serious concern in the academic community: nearly 150,000 fake citations generated by artificial intelligence (AI) have appeared in scientific research papers.
According to the study, about 146,900 non-existent references were found in more than 2.5 million scientific papers hosted on four major research repositories: arXiv, bioRxiv, SSRN and PubMed Central.
The researchers believe the main cause is that many authors use AI chatbots such as ChatGPT or Gemini to help write their articles but do not verify the sources the tools supply.
Today's large language models (LLMs) can produce very convincing text, but they also have a serious limitation known as "AI hallucination": the system generates information that sounds plausible but is completely untrue.
In an academic setting this is particularly dangerous, because scientific articles depend on the accuracy and authenticity of the works they cite.
The research team analyzed about 111 million citations in the scientific literature to find references that could not be matched to any existing publication.
Although some of the mismatches stemmed from typing errors or mixed-up details, the research team still determined that a large number of the citations were generated entirely by AI and point to works that do not exist.
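The distinction between a mistyped citation and one that matches nothing at all can be illustrated with a small sketch. This is not the study's method, which the article does not describe in detail; it is a minimal illustration that assumes citations are compared by title against a list of known publications. The function names, the 0.9 similarity threshold and the sample titles are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase a title and keep only letters, digits and single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())


def classify_citation(cited_title: str, known_titles: list[str],
                      threshold: float = 0.9) -> str:
    """Label a cited title as 'match', 'near_match' or 'unmatched'.

    The threshold is an assumed cutoff for illustration, not a value
    taken from the study.
    """
    target = normalize(cited_title)
    best = 0.0
    for known in known_titles:
        score = SequenceMatcher(None, target, normalize(known)).ratio()
        best = max(best, score)
    if best == 1.0:
        return "match"        # identical after normalization
    if best >= threshold:
        return "near_match"   # likely a typo or minor confusion
    return "unmatched"        # candidate non-existent reference


# Hypothetical index of known publication titles.
known_titles = [
    "A Survey of Graph Algorithms",
    "Methods for Protein Structure Prediction",
]

for cited in [
    "A survey of graph algorithms",
    "A Survey of Graph Algoritms",
    "Quantum Folding of Citation Networks in Deep Oceans",
]:
    print(cited, "->", classify_citation(cited, known_titles))
```

In this sketch only the "unmatched" titles would be flagged for closer inspection, while near matches are treated as probable typing errors. A real pipeline would also need to compare authors, years and identifiers such as DOIs, and would search a far larger index.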
Notably, the researchers say the number of non-existent references has risen sharply since 2023, when AI chatbots came into widespread use around the world.
This suggests that many scientists and graduate students have become overly dependent on AI tools when writing their papers.
According to the authors, the fake citations are not confined to a few isolated studies but are scattered across many different papers.
This is seen as a sign that the problem has spread widely through the academic community.
Usha Haley, a professor of management at Wichita State University (USA), said the rise in fake citations is a serious warning for modern science.
Ms. Haley believes that AI-generated references are weakening the foundation of academic trust, which depends on peer review and the accumulation of knowledge across generations of research.
"What is worrying is that this skepticism now originates from within the academic community," Ms. Haley said.
Preprint repositories such as arXiv and bioRxiv play a very important role in the research community. Many works are posted on these platforms before they appear in peer-reviewed journals, so that the global scientific community can access and critique them.
Faced with the risk of AI "polluting" academic data, arXiv recently announced that it will ban research papers containing fabricated citations or signs of unverified AI content.
Steinn Sigurdsson, scientific director of arXiv, warned that the body of scientific knowledge is being diluted by poor-quality or misleading research generated by AI.
According to Mr. Sigurdsson, this not only makes accurate information harder to find but also risks leading researchers in the wrong direction in the future.