The Death of Critical Thinking: In a world where easy answers are a click away, why even question it?
Hinson Chan, Program in Writing and Rhetoric, Stanford University
PWR 1NF: Language 2.0: Investigating the Rhetoric of Digital Language
June 9, 2025
I promise, no AI was used in the writing of this paper.
It wasn’t long ago that this sentence would’ve been utter madness, right? It’s been only two years—TWO YEARS—since generative AI took over the world. Now, it’s literally everywhere; every other company, from startup to megacorporation, seems to be powered by it. That dual character—the way it instills both delusional techno-optimism and widespread anxiety—has made generative AI the most polarizing technology of the last few years. It is undoubtedly remarkable how fast generative AI reached ubiquity. Most people wouldn’t even have known what a GPT was five years ago: “Generative Pre-trained Transformer? Like the robot guys?” I probably would’ve said. Yet while it took Google several years to hit 100 million users on an already-understood technology, ChatGPT took the world by storm, hitting the same mark in only two months [1].
Before ChatGPT, the closest thing we had to an “all-knowing answer machine” was the search engine. Have a question? Just ‘Google it.’ Want to find out about the causes of World War II? Your search engine will take you to Wikipedia, where you can read to your heart’s content. But critically, no matter what you wanted to do, search engines weren’t there to answer your questions directly; they simply pointed you to places where you could find information from real people. In fact, back in 2010, Eric Schmidt, Google’s CEO, declared his vision that “most people don't want Google to answer their questions. They want Google to tell them what they should be doing next [2].”
ChatGPT, on the other hand, flipped the whole idea on its head. Its landing page puts it bluntly: “Get answers. Find inspiration. Be more productive,” while further down, it promises to “help with writing, learning, brainstorming, and more [3].” On the surface, things sounded great; rather than needing to click through extra links and read websites that sometimes didn’t really address your query, ChatGPT’s seemingly unique opinions and even potential “intelligence” drew many to rely on the new tool for quick answers. Even Google’s search team seemed to echo the new generative AI stance, with then-Vice President of Search Elizabeth Reid offering a new perspective: “We’re now taking more of the work out of searching, so you’ll be able to understand a topic faster, uncover new viewpoints and insights, and get things done more easily [4].”
But what makes generative AI in particular such a danger to critical thinking? Well, for starters, Google and similar search engines never intended to answer questions for you completely. Of course, there are millions of websites that will offer their perspective, but the point was for the user to critically analyze all of those perspectives and form an opinion. In contrast, generative AI tools promise a fully polished response to any query you might have, doing the research, summarizing, and opinion-forming all at once for you. Rather than combing through sources, compiling evidence, and writing a full report, ChatGPT can do it in just a few seconds. Yet, while companies pledge to create such a perfect tool, generative AI’s promised productivity boost is not what it seems. Worse, generative AI becomes an ever-present “cheat code” for many common tasks, a slippery slope toward giving up altogether.
In this paper, I argue that any use of generative AI simply offloads the ‘critical thinking’ onto verifying and building upon the result rather than coming up with original opinions, and that people invest varying levels of effort in doing so [5]. Three factors influence that degree of effort: trust in the generative AI model to execute the task, an internal judgment of a given task’s importance, and a willingness to invest effort that depends on external circumstances. As a result, the degree of generative AI usage and of critical thinking on top of it is a sliding scale: there are times when users choose to ignore the generated response entirely; other times, they rewrite the response and build on it [5].
Unfortunately, on the extreme end, academic research shows that some workers today are already confident entrusting generative AI with whole portions of their jobs, depending on how menial the tasks are; if the prompter no longer believes that verifying the quality of the AI result is worth it, critical thinking becomes entirely unnecessary [6]. Finally, the implications of this are most severe for the most susceptible users of AI: young learners. Through conversations with both learners and teachers, I observed a certain ‘breaking point’ when external circumstances such as stress or difficulty exceed the patience a student has for a task [7]; while students initially attempt to complete a task themselves, beyond that breaking point they depend almost entirely on generative AI for answers.
The importance of understanding how people can effectively work alongside generative AI cannot be overstated. The technology clearly isn’t going away; most likely, as tech companies continue to bet billions on AI, we will see it used in ever more aspects of our lives. Understanding how to avoid becoming reliant on AI, and instead build upon its results, will become just as critical as developing the technology itself. For the past two decades, search engines have enabled us to find answers quicker than ever; yet it is ChatGPT that seems to threaten to replace our jobs. But why? What specifically is it about generative AI that makes it so hard to keep putting in our own effort, when tools like Google have existed for decades?
ASKING HARD QUESTIONS IN AN ANSWER-BASED WORLD
Traditional search engines relied heavily on the ability to rank websites by their relation, relevance, and importance; when a user submits a search query, Google uses its site ranking to suggest websites that users might find worth looking into themselves. In other words, search engines were built for information retrieval, a task that people eventually realized could be useful beyond finding the documents themselves [8]. Today, that may seem obvious; we seemingly use Google for everything these days. However, information retrieval has several well-defined tasks that allow us to more clearly categorize and contextualize how we use these tools (a rough sketch of what such retrieval and ranking looks like follows the list):
- Fact finding - locating a specific item of information
- Learning - developing an understanding of a topic
- Gathering - finding material relevant to a new problem not explicitly stated
- Exploring - browsing material without a specific goal in mind, where the aim shifts as more information is discovered.
- Known item searching - already knowing what item to find, but using the web to find it, such as finding a paper given its name [9].
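To make that distinction concrete, here is a minimal, hypothetical sketch of what information retrieval looks like in code: a toy corpus, a crude TF-IDF-style relevance score, and a search function that returns ranked documents rather than an answer. The corpus, the scoring formula, and the function names are illustrative assumptions of mine, not how Google actually ranks pages.

```python
# A minimal sketch (not Google's actual algorithm) of how a search engine
# ranks documents by relevance to a query instead of answering it directly.
# The toy corpus and the scoring formula here are illustrative assumptions.
import math
from collections import Counter

corpus = {
    "causes-of-wwii": "causes of world war ii treaty of versailles economic depression rise of fascism",
    "newtons-laws": "newton laws of motion force mass acceleration inertia",
    "sf-hotels": "san francisco hotel union square fisherman wharf booking rates",
}

def score(query: str, doc: str) -> float:
    """Score a document by summed TF-IDF-like weights of the query terms."""
    doc_terms = Counter(doc.split())
    n_docs = len(corpus)
    total = 0.0
    for term in query.lower().split():
        tf = doc_terms[term]  # term frequency in this document
        df = sum(1 for d in corpus.values() if term in d.split())  # document frequency
        idf = math.log((n_docs + 1) / (df + 1)) + 1  # smoothed inverse document frequency
        total += tf * idf
    return total

def search(query: str, k: int = 3) -> list[str]:
    """Return the top-k document names for a query: ranked results, never an answer."""
    return sorted(corpus, key=lambda name: score(query, corpus[name]), reverse=True)[:k]

print(search("causes of world war ii"))
# The engine only ranks; reading the results and forming an opinion is still on you.
```

Note that nothing in this sketch composes an answer: the output is an ordered list of places to look, and every task in the list above is something the user still does with those results.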
In this context, we start to see how Google took on many new tasks beyond the original capabilities it was built for. Booking a hotel in San Francisco? You’d probably learn by reading a few sources about the area before finding a specific hotel you want to stay at. How about an answer to a physics question? If the question is well known enough, that might be a known item search; if not, you’d still learn by reading articles about similar questions. Regardless, it is clear that Google was not built to complete these tasks directly. Rather, it provided a centralized place where all of these tasks could be carried out, and let each person figure out how to make use of the sources. Moreover, there was clarity: these tools were built to work alongside you, providing a multitude of opinions from which you could make your own decision.
However, in the past few years, a new competitor to search engines has appeared: generative AI. Specifically, large language models, a subset of generative artificial intelligence that outputs continuous streams of text, have become popular due to their fascinating ability to seemingly ‘think.’ Of course, merely five years ago, these models were barely functional; the best of them could scarcely string together a pseudo-sentence or two. But as the models got larger and more money was spent, a capability to reason seemed to appear out of nowhere; suddenly, we had models that could not only write entire paragraphs, but tell you all about the history of essay-writing. To be clear, the models didn’t suddenly learn what ‘logic’ or ‘facts’ were; instead, the massive datasets the models were trained on were meant to encapsulate all the information the model would end up knowing—at least, that was the hope. In reality, while humans came to view these model outputs as increasingly valid, these “stochastic parrots” were only imitating what their training data had to say [10].
Still, this was a revolution in comparison to the old link-surfing days. Generative AI checked all the information retrieval boxes, and more. Now, rather than asking Google to find articles about Newton’s laws of motion, Google Gemini could just tell me all about them from its pretrained memory, perfectly packaged in a way that directly answered my question. For many queries, what the user included in their model ‘prompt’ was enough; if the query was simple enough, the generative AI most likely had training data on it. But suddenly, that clarity was gone: who would take responsibility for the generated result? Surely not the user, and the company that trained the model certainly wouldn’t. As plain language models grew larger and larger, their behavior also became increasingly hard to understand—an issue known as the “black box problem [11].” Thus, to get at least some level of clarity on the AI’s response, we return to citations. Retrieval augmented generation, or RAG, became a standard for almost all generative AI tools, from Google’s AI overview to Perplexity, a generative AI search engine [12]. Rather than generating a result blindly, the user’s query is first fed into a traditional, link-based search engine, after which the relevant pages are scraped of text and fed back into the language model as context, and a final result is shown to the user.
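As a rough illustration, here is a minimal Python sketch of the retrieval-augmented generation loop just described. The helper functions (web_search, fetch_text, llm_complete) are hypothetical stand-ins for a real search API, a page scraper, and a model call, not functions from any actual library; production systems such as Perplexity or Google’s AI Overviews are considerably more elaborate.

```python
# A minimal, hypothetical sketch of retrieval-augmented generation (RAG).
# The three helpers below are toy stand-ins so the sketch runs end to end;
# a real system would call a search API, a scraper, and a language model.

def web_search(query: str, k: int = 3) -> list[str]:
    """Stand-in for a traditional, link-based search engine: return top-k URLs."""
    return ["https://example.com/source-a", "https://example.com/source-b"][:k]

def fetch_text(url: str) -> str:
    """Stand-in for scraping a page down to plain text."""
    return f"(text scraped from {url})"

def llm_complete(prompt: str) -> str:
    """Stand-in for a language-model call that returns generated text."""
    return "(generated answer, with citations back to the scraped sources)"

def rag_answer(query: str) -> str:
    # 1. Feed the user's query into a link-based search engine first.
    urls = web_search(query)
    # 2. Scrape the relevant pages for text to use as context.
    context = "\n\n".join(fetch_text(u) for u in urls)
    # 3. Hand the query plus the retrieved context to the language model.
    prompt = (
        "Answer the question using only the sources below, and cite them.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )
    # 4. Return the generated, citation-backed result to the user.
    return llm_complete(prompt)

print(rag_answer("What are Newton's laws of motion?"))
```

Even in this toy form, notice that the loop ends the moment the generated text is returned; nothing in it forces anyone to actually read the cited sources.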
But wait; did you notice what happened there? With Google, we had a question, Google would give us the links, we’d go through them, and form an opinion ourselves. Now, with ChatGPT, you can ask a question, and you’ll get a response followed by a bunch of links. Then, if you’re not sure, you can…check the citations…and form an opinion yourself. Wait, what? Isn’t this just the same thing with extra steps? And this isn’t just a problem with information retrieval. Have an essay to write? You’ll still need to check that everything’s actually correct, and doesn’t just ‘sound like AI.’ What about a tough physics problem? Surely it’ll be better than Google at adapting to a dynamic problem, right? Good luck making sure it’s doing the math correctly. Okay, come on. Health advice. At least it can help me sort through the dozens of internet pages and help me live a healthier life, right [13]?
Okay, never mind; AI just cited The Onion. See, the problem with generative AI is that critical thinking has not disappeared; rather, it has become entirely optional. Given how advanced models have become today, generative AI will create a polished output no matter how inaccurate the results are. The task of compiling results into an opinion has already been done. Instead, it has fallen on the human prompter to actively question those answers, to check those sources, and to make sure the generated results truly meet the standard. Unfortunately, taking that extra step isn’t exactly a natural human instinct. After all, imagine you were stressed, had an essay due the next day, and had a way of magically creating a perfectly written paper out of thin air; such a tool would be irresistibly tempting to use. Would you really spend hours putting thoughts to paper? Or would you just take it and leave?
JUST CHATGPT IT.
Defining ‘critical thinking’ is not a trivial task on its own. The Oxford dictionary defines it as “the objective analysis and evaluation of an issue in order to form a judgment.” However, critical thinking in the context of finding answers and solving problems is more than that: not just making sure an argument is sourced, but questioning the source itself; carefully analyzing an argument from multiple perspectives and proposing a balance between them. In that case, given how AI has already shown itself capable of answering any question or producing any output, in what specific ways were people critically thinking about a task [14]? For example, how did people approach the task, query the AI tool, or analyze the final response? And if the AI response was already ‘polished,’ to what extent, if at all, do people choose to use GenAI tools? What motivates people to rely on their own work, or on the flip side, to rely on AI?
Research has found that while generative AI promises to save time and effort in completing tasks, critical thinking has only diffused into “new cognitive tasks for knowledge workers [15].” For starters, it may seem that information retrieval has been significantly reduced; generative AI tools “automate the process of fetching and curating task relevant information.” Rather than needing to search through articles, documents, and online forums, generative AI compiles all these resources into one place. However, entirely new tasks have sprung up, such as response integration, where users now have to put in extra effort thinking about how they might rewrite or apply a generated response in their own work. Similarly, several tasks that didn’t require much time before now demand far more effort. For example, prompt engineering and optimization now take much longer; where users previously typed a few keywords into Google, they report that they must first analyze how the AI might respond to their prompt and whether they need to include more context. Even analyzing the task itself, such as figuring out which parts of it AI might fit into, required more effort: “I had to first learn what I was going to use in order to make progress [16].”
But by far the task that required the most critical thinking was quality assurance: checking that the GenAI result matched the user’s standards. There were obviously objective criteria that workers needed to verify—making sure code compiled and actually functioned, or that it did everything the user requested, for example. However, there were also subjective standards that were more difficult to express, such as the feasibility or logic of the AI’s response. Finally, there was the problem of source verification. The biggest problem facing generative AI has always been hallucinations [17], the phenomenon where the probabilistic model generates a response that simply isn’t true or doesn’t make any sense. As mentioned previously, the most common way companies attempt to mitigate this is by providing the model with as much context as possible through citations, whether from internet searches or uploaded documents. Most users surveyed knew about the potential for hallucinations, and spent significant time checking that the referenced source existed, was “real and reputable,” and was correctly used within the generated output. They also spent more time checking against other external sources, making sure the output wasn’t just the result of inherent biases within the AI and its information retrieval, but was actually representative of sources they could find themselves. All of this is to say: if you demand quality from generative AI, the results are still far from perfect. Yet, rather than spending time writing, gathering, and researching a result, people are putting in the same if not more effort prompting, verifying, and rewriting those AI responses.
With that being said, the original question arises: why not “just ChatGPT it” and leave it there? All of this sounds like a lot of extra effort on top of a result that was already created, and a ‘polished’ one at that. Not every task demands quality, after all. What drives people to even make the effort to critically think, whether by verifying or building on the generated result, if the work has already been done? There are three main influences on the extent to which someone puts effort beyond the generated output. First, trust was a major factor; if users believed that the AI could complete a task, they put in their own work less often. For example, workers believed that “with straightforward factual information, ChatGPT usually gives good answers [18].” However, people’s assessments of generative AI’s capabilities were often incorrect; some believed that “the information provided by GenAI tools was always truthful and of high quality, while others assumed the outputs would consistently and accurately reflect referenced data sources.” This trust in AI’s abilities led some users to overrely on the AI’s answers, something particularly common in conjunction with the second factor: stakes. This factor is highly dependent on the task; for example, when drafting legal documents, workers felt an intrinsic need to verify for the sake of the task. Similarly, users looking up medical symptoms understood the importance of being correct, opting to analyze the references in extra detail; as a pharmacist reports: “the entry is to be submitted for review so I would to double check to be sure otherwise I might have to face suspension [19].” On the other hand, tasks judged “lower stakes,” such as grammar checking and summarization, saw much higher usage of generative AI. Finally, there was motivation. A surprisingly significant factor in prioritizing critical thinking was the circumstances under which a task was required. A commonly cited reason for turning to generative AI was a lack of time; workers put under pressure to complete a task quickly found themselves “us[ing] AI to save time [without] much room to ponder over the result [20].” If a task wasn’t important to them specifically, people also found it unnecessary to put extra effort into ensuring a quality result.
A BREAKING POINT
All of this raises significant concerns for how younger generations will learn to adapt to generative AI. Even for adults, evidence has already shown that using GenAI tools appeared to “reduce the perceived effort of critical thinking while also encouraging over-reliance on AI, with confidence in the tool often diminishing independent problem-solving.” In other words, there was a vicious cycle in which the feeling that a task required less effort encouraged a person to rely on AI even more. At this point, I reached out to two high school teachers for evidence of changing times in the classroom. “AI usage has steadily become more widespread among students,” L, a physics teacher, replied. “In my experience, every student has a personal threshold for how much effort they are willing to invest in completing a task. When an assignment exceeds this threshold—whether due to its complexity or time demands—many turn to ChatGPT to complete the remainder of the work [21].” W, a history teacher, echoed the sentiment that AI has quickly grown in prevalence: “The use of AI as a conscious tool, like Wikipedia, like Google, like phones, is changing the way people behave and I have made conscious effort to provide students the best guidance I can on its use and developed clear guidelines for what it should be used to do…I don't hide it from students or deny it to students, but encourage students and make rules around its use [22].” When I asked them how, specifically, AI has changed students’ work, both found clear evidence that students were not putting the same effort into learning as before. W revealed that “students who use AI without fact checking will be found out when they can't answer questions about their topic.” When grading physics lab reports, L discovered that “some student submissions include terminology, methodology, or analysis that go beyond what was taught in class or even what is typically expected at the high school level.” L also runs an online physics quiz website that he assigns his students as homework every week; looking through its analytics, he found that:
“In 2022, the average time spent on the assignment was 179.5 minutes.
In 2025, after AI tools became more commonly used, the average dropped to 106.6 minutes.
One particularly revealing example is Question 10, a challenging item on the assignment:
- In 2022, 0 out of 20 students who answered it correctly did so in under 2 minutes.
- In 2025, 25 out of 43 correct responses were submitted in under 2 minutes.
Interestingly, earlier questions in the assignment (such as Q1 and Q4) showed slightly increased average times in 2025. This supports my theory that students will work through an assignment on their own until they reach their ‘breaking point,’ at which they turn to AI to finish more difficult or time-consuming problems [23].”
All of this evidence places the research-backed account of when critical thinking is applied alongside generative AI in an understandable context. L’s comment about a “breaking point,” where students transition from personal effort to fully AI-assisted responses, makes perfect sense. Students at first make an effort to learn when the task is simple enough and they see a personal benefit to completing it. When enough time has elapsed and the problems become too difficult, however, the ChatGPT escape route becomes too tempting to ignore. Once a student sees how easy it is for GenAI to complete simple physics tasks, they begin to trust the AI’s abilities and become overreliant. I also reached out to six high school and university students who have personally used generative AI in their work, and their insights echoed both teachers’ sentiments. One anonymous student commented:
“I was spending 10-15 hours on weekly homework at some point and I was getting maybe 40 or 50% on the assignment. That got really demoralizing at a point, so at some point I half assed the questions and asked AI to correct the work (my work was completely wrong most of the time) and my grades improved. I realized that was really bad and I ended up having to spend so many hours teaching myself the whole course before the final though.”
As I talked to more students, it became clear that the third criterion for additional critical thinking effort—motivation—was what encouraged many students to critically analyze the AI results, rather than using generative AI for everything. One student, committed to studying physics, joked that “if I think the task itself is actually beneficial to me, or if the process seems beneficial, I’ll do it myself. If it's like English, I don’t really care.” Another student, in computer science, commented that, “for homework that I barely care about, or has little interest for me, I just turn to AI right away even though I know I can probably do it better myself. But in a way, I see effort complemented by AI to have diminishing returns; it's easy to get to the 80% with AI, and more often than not that's already more than good enough for the homework that I doubt the lecturers even check. But the rest of the 20% depends on how engaged I am with it or how important it is for me.”
SO, WHAT NOW?
Admittedly, this paper has been quite critical of generative AI and its implications for our future. Of course, the technology has barely existed for three years, and it has already undoubtedly transformed how we all work. It will take many decades more before we truly understand the potential and impacts of generative AI on our generation. Yet if anything, being critical is the point, isn’t it? If any lesson is to be gained from studying the rise of this ubiquitous technology, it’s that simply accepting that “solving artificial general intelligence will solve all the world’s problems” isn’t really that simple at all. So what if AI solves all our problems for us? Will we give up on solving problems ourselves? At the end of every research paper I read and conversation I had, everyone came back to one point: understanding [24]. Not just by the developers who design the technologies, or the experts who wield them day to day, but by everyone. Students need to understand the ethics and consequences of offloading critical thinking to generative AI; adults need to carefully understand the limits of these technologies to avoid overreliance and misinformation. But at the end of the day, I’m left inspired by a rare conversation with a young student who figured it out: “It’s not the hard problems I ask AI to do for me. I want to do the hard stuff myself.”
Let’s ask those hard questions.
BIBLIOGRAPHY
1. UBS Chief Investment Office. (n.d.). Daily: Has the AI rally gone too far? UBS Wealth Management Insights. https://www.ubs.com/global/en/wealthmanagement/insights/chief-investment-office/house-view/daily/2023/latest-25052023.html
2. Google and the search for the future. (2010, August 14). The Wall Street Journal. https://www.wsj.com/articles/SB10001424052748704901104575423294099527212
3. OpenAI. (n.d.). ChatGPT. https://openai.com/chatgpt/overview/
4. Reid, E. (2023, May 10). Supercharging search with generative AI. Google. https://blog.google/products/search/generative-ai-search/
5. Lee, H.-P., Sarkar, A., Tankelevitch, L., Drosos, I., Rintel, S., Banks, R., & Wilson, N. (2025). The impact of generative AI on critical thinking: Self-reported reductions in cognitive effort and confidence effects from a survey of knowledge workers. In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (Article 1121, pp. 1--22). Association for Computing Machinery. https://doi.org/10.1145/3706598.3713778
6. [Same as #5 - Lee et al. (2025)]
7. [Same as #5 - Lee et al. (2025)]
8. Google for Developers. (n.d.). In-depth guide to how Google Search works. Google Search Central Documentation. https://developers.google.com/search/docs/fundamentals/how-search-works
9. Hersh, W. (2024). Search still matters: Information retrieval in the era of generative AI. Journal of the American Medical Informatics Association, 31(9), 2159--2161. https://doi.org/10.1093/jamia/ocae014
10. Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? 🦜 Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21), 610--623. https://doi.org/10.1145/3442188.3445922
11. Gonsalves, C. (2024). Generative AI's impact on critical thinking: Revisiting Bloom's taxonomy. Journal of Marketing Education, 0(0). https://doi.org/10.1177/02734753241305980
12. Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., Dai, Y., Sun, J., Wang, M., & Wang, H. (2024). Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997. https://doi.org/10.48550/arXiv.2312.10997
13. Ray, S. (2024, July 2). Google's AI overview appears to produce misleading answers. Forbes. https://www.forbes.com/sites/siladityaray/2024/05/24/googles-ai-overview-appears-to-produce-misleading-answers/
14. Skjuve, M., Brandtzaeg, P. B., & Følstad, A. (2024). Why do people use ChatGPT? Exploring user motivations for generative conversational AI. First Monday, 29(1). https://doi.org/10.5210/fm.v29i1.13541
15. [Same as #5 - Lee et al. (2025)]
16. [Same as #5 - Lee et al. (2025)]
17. [References hallucinations concept - likely from Bender et al. or similar AI literature]
18. [Same as #5 - Lee et al. (2025)]
19. [Same as #5 - Lee et al. (2025)]
20. [Same as #5 - Lee et al. (2025)]
21. Lam, M. (2025, June 8). Personal communication.
22. Wightman, A. (2025, June 8). Personal communication.
23. [Same as #21 - Lam personal communication]
24. Bali, M. (2023, May 2). What I mean when I say critical AI literacy. Reflecting Allowed. https://blog.mahabali.me/educational-technology-2/what-i-mean-when-i-say-critical-ai-literacy/