Like many scientists, I am fascinated by the latest AI advancements. ChatGPT and other large language models (LLMs) have experienced a sharp rise in their ability to compose text that mimics the way humans write. However, there are imminent risks that using ChatGPT and other AI technologies bring to science. And No, I am not talking here about the rise of the machines who will take over the planet.
In this article, I want to make you aware of 10 very real and immediate risks of using ChatGPT in science (in no particular order). For each problem, I will offer some ideas on what to do about it. Because after all, there is no way around using LLMs. We just have to become good at using them.
ChatGPT Risk: AI hallucinations
When lacking information, large language models often make up stuff that does not exist. This means that you can receive responses that contain false or misleading information. What makes these hallucinations especially dangerous is that they sound real.
For example, ChatGPT is known to provide references to non-existing research papers. It can also easily “cite” made-up research papers.
How to reduce the risk of ChatGPT hallucinations?
Do not blindly trust the information that ChatGPT and other LLMs provide you. Always double-check the references and read the original papers, especially if you are going to use the information for making decisions or citing in your research paper.
Another way of avoiding the risk of ChatGPT hallucinations is to try out the LLMs that are specifically designed for researchers. These tools are much better at providing references and citing real papers compared to ChatGPT. Semantic Scholar, Elicit, and Consensus are some such LLMs.
Finally, you can prompt ChatGPT to always cite only real research papers. This still does not guarantee 100 % accuracy, but it does reduce the hallucination risk to some extent. For example, you can add this paragraph to whatever query you are writing in ChatGPT:
Never make up stuff. Always provide only existing references to real research papers. If you do not know the answer, simply say so.
ChatGPT Risk: Reduction of cognitive capabilities of scientists
Writing is more than a means of efficiently transmitting information from one person to another. Writing helps our brains develop, writing allows us to develop logical arguments, writing helps us to reason and think critically, writing helps us to remember, and writing helps develop creativity.
In my book, Write an Impactful Research Paper, I even propose to look at writing as a way of understanding your own research results. In this way writing equals thinking.
Now, what happens if we outsource writing to ChatGPT and other AI tools? It means we write less. Since the development of our cognitive capabilities is so intrinsically linked with writing, delegation of writing to ChatGPT risks impairing our learning.
It is not that we will become imminently stupid from one day to the other, but letting ChatGPT take over a large portion of our writing tasks poses a very real risk of slowing the development of our collective abilities as scientists.
How to reduce the risk of diminishing cognitive capabilities?
It is very important that we do not delegate writing that requires critical thinking to ChatGPT.
Correcting language? Fine.
Writing routine emails? OK-ish.
Writing research papers? Nope.
ChatGPT Risk: Missing most scientific research results
Currently, LLMs like ChatGPT can not access most research papers. That is the case because most research papers are locked away behind publisher’s paywalls. Even the LLMs that boast access to millions of research papers likely include only open-access papers in their databases.
Open-access articles represent only around a third of the entire scientific literature. This means that if for your literature search, you rely on asking questions to LLMs, you are risking missing out on knowledge from most of the published research papers.
How to reduce the risk of missing out on research results?
Besides using ChatGPT and other LLMs for literature search, be sure to additionally do your own search using conventional scientific search engines. You can even try out the many upcoming AI-powered literature search tools, like ResearchRabbit.
ChatGPT Risk: High energy use
ChatGPT and other large language models are thirsty for energy. OpenAI and other AI companies are hesitant to release the data about energy use of their models, but it is clear that the energy use is much higher than searching in Google (which by itself consumes a substantial amount of energy).
Then there is the water use. The data servers and the power plants that generate energy for the servers all need to be cooled. This is done primarily by using fresh water. A preprint by the University of California, Riverside estimates that the cooling for servers and energy 20-50 queries consume roughly 0.5 liters (17 ounces) of water.
This does not include the water consumption for training the AI models, which is huge. In total, Li et al. predict that the global AI water demand may be accountable for 4.2 — 6.6 billion cubic meters of water withdrawal in 2027, which is more than half of the water demand of the United Kingdom.
Of course, generating energy and using water to operate the LLMs comes with obvious consequences: the generated carbon emissions contribute to climate change. Exactly how much CO2 is generated depends on the energy sources used so it is difficult to give a precise number. However, the risk of contributing to climate change by using ChatGPT in science is very real.
How to reduce the risk of high energy use by ChatGPT?
OpenAI and other companies behind the LLMs are trying to bring down the energy use of the models by optimizing their data centers and networks, improving the efficiency of the models, and developing new AI technologies.
But what can we, users, do about reducing energy use? This is a tough one. Probably the only way to do anything about it is to restrain from using too much of ChatGPT and similar language models. For example, my colleague is a physicist. He does not use ChatGPT in science primarily for the reason it consumes too much energy.
ChatGPT Risk: Demise of original ideas
The generation of innovative ideas is one of the core skills of a scientist. We need crazy out-of-the-box thinkers to move science forward. We need new original ideas that challenge the status quo. We need to be able to think critically.
A risk of using ChatGPT in science is that of diminishing our collective ability to generate new ideas. It is much easier to ask ChatGPT to “Provide a list of innovative research topics on the subject of X” than to learn, read literature, and think critically before coming up with your own ideas.
Sure, we will not suddenly stop being innovative. But gradually over time, as we assign more and more of the idea-generation process to LLMs we might not even notice a demise in the generation of our collective ability to come up with original scientific ideas.
Outsourcing innovation in science to LLM is also risky because, in their very nature, LLMs are trained on existing data. Most of the time in their responses, they simply regurgitate the information they have been fed. It’s a long shot to expect something revolutionary from this process (with some exceptions which I will describe in a later article).
How to reduce the risk of generating low-value ideas?
ChatGPT is in principle not suitable for the generation of original ideas. So don’t use it for this purpose. There are plenty of other good ways to use it in science.
ChatGPT Risk: Plagiarism
ChatGPT and other LLMs are trained on data generated by humans. When you ask a question, the LLMs recombine words from their training data to generate responses that sound like it was written by a human. Technically it is not plagiarism, because every answer is unique.
Even if you ask the same question twice, each time the answer provided by ChatGPT will be different.
But it’s important to keep in mind that LLMs can not come up with new ideas, reason, or use logic. The AI used in the LLM algorithms simply predicts the next word most suited in a sentence. This prediction is based on the text that was used to train it. So using ChatGPT in scientific writing poses the very real risk that text similar to what the LLM is giving you has been already written somewhere by someone.
There are even experts who claim that if you search thoroughly, you will always find something similar to what the AI has written somewhere on the web.
How to reduce the risk of plagiarism when using ChatGPT?
When you use content written by ChatGPT, make sure to paste pieces of it into Google before using it. That way, you will be able to find out if someone else has already written something that is similar. If yes, make sure to properly cite the source.
ChatGPT Risk: Increase of low-quality research papers
Only weeks after the ChatGPT was released, the first papers with ChatGPT as a co-author started to appear. Now most publishers forbid naming LLMs as co-authors but that does not mean AI is not used for writing research papers.
The ease at which ChatGPT allows generating text, coupled with the advance of low-quality journals and the constant pressure on scientists to publish ever more papers make it unavoidable that LLMs will be used to write research papers. Since ChatGPT can not come up with truly original ideas, the risk is high most papers generated using LLMs will be of low quality.
A stream of low-quality papers not only creates an unwanted information overload that makes it difficult for scientists to find truly valuable research papers; low-quality research papers also risk damaging the reputation of science as a whole thus reducing the trust of the society towards the scientific method. During the Covid’19 we saw very well what happens in such situations.
How reduce the risk of increased low-quality research paper flow?
Don’t outsource writing research papers to ChatGPT just for the sake of having more papers published. It’s that easy.
See this article for an example of how a sloppy research paper can ruin a scientific career.
It will also help if you participate in the peer-review process to ensure a high publishing standard for the legit journals of your field.
ChatGPT Risk: Clogging of the review process
Many journals forbid to use LLMs for writing papers and many others require the authors to state the way in which AI was used in writing. Compliance with these requirements, however, mostly relies on the honesty of the researchers.
It will be objectively tough to avoid the stream of AI-generated papers and proposals. This will likely result in added pressure for scientists to perform more reviews.
Reviews of research papers already take a significant portion of the researcher’s time. The ease at which it is now possible to write papers using ChatGPT poses a real risk that scientists will be forced to review even more than currently. Since a day only has so many hours, more review work means less time for research, teaching, and other primary activities of scientists.
Participating in the voluntary peer-review process is a vital task in ensuring high publishing standards. But doing it for unoriginal papers generated primarily by ChatGPT is a waste of the valuable scientist’s time. In this waz the use of ChatGPT in science risks being a productivity sink rather than a productivity boost.
How to reduce the risk of clogging the review process?
My personal solution to the overload of review requests is to only review for high-quality journals and not more than three articles per month. I wish I had a better solution.
Another solution could be to assign LLMs to perform review. Many will oppose ideas and I agree that there are obvious concerns with this approach, including privacy issues and the current inability of LLMs to reason. However, it is not hard to imagine that as the LLMs keep improving and the flow of research papers increases, at least some part of the research paper review process might be assigned to AI in the future.
Time will show, but I am quite convinced that we will soon see cases of AI performing review for scientific journals. In fact, I have already prepared a GPT that helps scientists review their manuscript before submission to a journal.
ChatGPT Risk: Bias toward the training data
The doctor yelled at the nurse because she was late. Who was late?
If you ask ChatGPT this question, it will answer the nurse was late.
The doctor yelled at the nurse because he was late. Who was late?
Now ChatGPT will answer that the doctor was late.
See the gender bias?
Humans are biased. We can have biases toward race, gender, political beliefs, religion, and many other things. LLMs are trained on texts written by humans who hold these very biases. This makes it almost certain that the LLMs will hold biases in the text they generate. This situation might lead to discriminatory responses and further perpetuation of the biases.
Moreover, it’s not only the social biases. Biases in scientific opinions are just as real. AI might hold something for “true” simply because there are more open-access research papers supporting this opinion. More papers do not always equal a higher degree of likelihood. Especially when new research comes or with contrasting evidence.
How to reduce the risk of bias in ChatGPT responses?
We, as individuals, can’t do much about the fact that ChatGPT and other AI technologies are biased, but we can be rigorous in checking the text it generates to make sure we don’t perpetuate the biases further. Always critically evaluate and edit the text generated by AI to make sure it is not biased.
As for the scientific biases – make sure to use conventional search engines in parallel with AI. Finally, use your own judgment and knowledge to evaluate scientific arguments provided by LLMs. Remember that the author is always responsible for the text he or she writes.
ChatGPT Risk: Increase of the number of research proposals
Ask ChatGPT to write a research proposal, and it will. Most of the time these will be somewhat lower-quality research proposals, but they will have to be reviewed nevertheless, thus putting pressure on the administrative personnel and reviewers.
It is, however, possible to use ChatGPT to assist with writing a legitimately good proposal. I even developed a tailor-made GPT for doing it. The GPT first requests to upload a summary of the research idea using Research Project Canvas and then writes the first draft of the proposal.
Using my GPT for research proposal writing ensures the basic content of the proposal is original because the idea is generated by the user rather than generated by the ChatGPT. The ease at which the GPT then converts an idea into a proposal means scientists will be able to generate and submit ever more proposals for every research call.
Lowering the writing barriers that researchers face when submitting proposals might even mean more good ideas get funded. But, as mentioned before, a high influx of proposals will put pressure on the funding agencies (and scientists) to perform more reviews.
How to reduce the risk of funding low-quality research ideas?
It is the moral duty of scientists to participate in the peer-review process or research proposals. With the advance of GPTs, more than ever it will be important to thoroughly evaluate research proposals for their scientific value. Checking whether the technical language is convincing will not be enough.
Like with the research papers, I think it is only a matter of time before AI will be used to do reviews or at least pre-screen proposals before handing them to human reviewers. For example, I already created a GPT for research proposal review.
ChatGPT Risk: Difficulties in assessing student’s abilities
Professors simply can not rely as much on home assignments and thesis to evaluate students’ capabilities since the AI can generate much of the text for successfully passing assignments. There are even studies that demonstrate how ChatGPT can pass a medical license exam.
Using ChatGPT for completing assignments was one of the first uses of the technology in science. Many professors at first tried combating it by forbidding the use of ChatGPT. This of course proved impossible for home assignments since by the vary nature AI tries to write in a way that is indistinguishable from a human. No detection software or professor`s judgment will be certain enough to allow disputing a student`s honesty.
Now many professors have adapted to the use of AI, often at the expense of using in-class assignments for the student`s evaluation. This often takes a lot more time compared to days before the AI advance.
How to evaluate students in the age of ChatGPT?
ChatGPT can not be and should not be forbidden completely. It’s a vital technology that will serve the students in their professional careers so they better learn to use it well.
Rather than combating ChatGPT use, many professors are coming up with ingenious ways to involve ChatGPT in their class work and assignments. Here are some examples.
Summary
As with many emerging technologies, ChatGPT currently seems simultaneously helpful and harmful. But without a doubt, the technology is here to stay. So instead of burying our head into the sand, scientists better learn to use it to our advantage. This also means we should learn how not to use it.
In this article, I explored the potential risks of using ChatGPT and other AI technologies in science. Not because I think we should not use these tools. On the contrary, I think for scientists it is rather risky not to learn using ChatGPT. Rather, it is important that we use the tools responsibly, meaning that we also recognize the limitations and risks of using these tools.
In summary, it is important to remember that AI in the form of large language models is just that – a tool that generates text. It doesn’t possess expert knowledge, it doesn’t think critically, it has a bias towards the data that was used for training, and it can not evaluate the novelty of a research idea.
Here is a summary of the don’t:
- Don’t blindly trust ChatGPT and always check sources that it provides.
- Don’t use ChatGPT to generate ideas.
- Don’t use ChatGPT to replace critical thinking.
- Don’t use ChatGPT to generate important content.
- Don’t rely solely on ChatGPT for literature review. Do your own research as well.
- Don’t think you can forbid ChatGPT use for assignments. Rather – come up with creative ways to involve it class work.
- And finally, remember that just like humans, AI can be biased.
Do you think I should add any other don’ts to this list? Send me an email to [email protected]
Author
Hey! My name is Martins Zaumanis and I am a materials scientist in Switzerland (Google Scholar). As the first person in my family with a PhD, I have first-hand experience of the challenges starting scientists face in academia. With this blog, I want to help young researchers succeed in academia. I call the blog “Peer Recognized”, because peer recognition is what lifts academic careers and pushes science forward.
Besides this blog, I have written the Peer Recognized book series and created the Peer Recognized Academy offering interactive online courses.