I can't be arsed: While current LLM and generative AI models are far from developing human intelligence, users have recently remarked that ChatGPT displays signs of "laziness," an innately human trait. People began noticing the trend towards the end of November.
A user on Reddit claimed that he asked ChatGPT to fill out a CSV (comma-separated values) file with multiple entries. The task is something that a computer can easily accomplish – even an entry-level programmer can create a basic script that does this. However, ChatGPT refused the request, essentially stating it was too hard, and told the user to do it himself using a simple template it could provide.
"Due to the extensive nature of the data, the full extraction of all products would be quite lengthy," the machine said. "However, I can provide the file with this single entry as a template, and you can fill in the rest of the data as needed."
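For a sense of how small the requested job is, here is a minimal Python sketch of the kind of basic script described above. It is only an illustration: the function name `products_to_csv`, the column names, and the sample rows are all hypothetical, since the Reddit user's actual data was not published.

```python
import csv
import io


def products_to_csv(products, fieldnames):
    """Turn a list of dict rows into CSV text with a header line.

    `products` and `fieldnames` are hypothetical inputs used only for
    illustration; they are not the Reddit user's real data.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()          # first line: the column names
    writer.writerows(products)    # one CSV line per product entry
    return buffer.getvalue()


# Made-up sample entries, standing in for "all products".
sample_products = [
    {"name": "Widget", "price": "9.99"},
    {"name": "Gadget", "price": "19.50"},
]
csv_text = products_to_csv(sample_products, ["name", "price"])
```

The same function handles two entries or two thousand, which is why a one-row "template" is not much of a substitute.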
OpenAI developers publicly acknowledged the strange behavior but are puzzled about why it's happening. The company assured users that it was researching the issue and would work on a fix.
we've heard all your feedback about GPT4 getting lazier! we haven't updated the model since Nov 11th, and this certainly isn't intentional. model behavior can be unpredictable, and we're looking into fixing it 🫡
– ChatGPT (@ChatGPTapp) December 8, 2023
Some users have postulated that it might be mimicking humans who tend to slow down around the holidays. The theory was dubbed the "winter break hypothesis." The idea is that ChatGPT has learned from interacting with humans that late November and December are times to relax. After all, many of us use the holidays to excuse ourselves from work to spend time with the family. Therefore, ChatGPT sees less action. However, it's one thing to become less active and another to refuse work outright.
Amateur AI researcher Rob Lynch tested the winter break hypothesis by feeding the ChatGPT API tasks with falsified May and December dates in the system prompt and then counting the characters in the bot's responses. The December answers were shorter than the May answers by a "statistically significant" margin, but this is by no means conclusive, even though his results were independently reproduced.
@ChatGPTapp @OpenAI @tszzl @emollick @voooooogel Wild result. gpt-4-turbo over the API produces (statistically significant) shorter completions when it "thinks" its December vs. when it thinks its May (as determined by the date in the system prompt). I took the same exact prompt… pic.twitter.com/mA7sqZUA0r
– Rob Lynch (@RobLynch99) December 11, 2023
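The comparison Lynch describes can be sketched in a few lines of Python. This is not his code: the system prompt wording, the function names, and the choice of a permutation test are all assumptions made for illustration, and the sketch deliberately leaves out the API calls that would collect the completions.

```python
import random
from statistics import mean


def dated_system_prompt(date_text):
    """Hypothetical system prompt that tells the model a (possibly false) date."""
    return f"The current date is {date_text}."


def response_lengths(responses):
    """Character count of each completion, the quantity being compared."""
    return [len(text) for text in responses]


def permutation_p_value(a, b, rounds=2000, seed=0):
    """Two-sided permutation test on the difference of mean lengths.

    Returns the fraction of random relabellings whose mean gap is at
    least as large as the observed one. A small value suggests the gap
    is unlikely to be chance alone.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if gap >= observed:
            hits += 1
    return hits / rounds
```

In use, one would send the same task many times under `dated_system_prompt("May 15, 2023")` and again under a December date, pass each batch of completions through `response_lengths`, and hand the two lists to `permutation_p_value`. A low p-value only says the length gap is unlikely to be noise; it says nothing about why the gap exists.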
Lynch conducted his test after OpenAI's Will Depue confirmed that the AI model had exhibited signs of "laziness," or refusal to work, in the lab. Depue suggested that this is a "weird" occurrence of a kind that developers have experienced previously.
"Not saying we don't have problems with over-refusals (we definitely do) or other weird things (working on fixing a recent laziness issue), but that's a product of the iterative process of serving and trying to support sooo many use cases at once," he tweeted.
The issue may seem insignificant to some, but a machine refusing to do work is not a direction anybody wants to see AI go. An LLM is a tool that should be compliant and do what the user asks, so long as the task is within its capabilities – obviously, you can't ask ChatGPT to dig a hole in the yard. If a tool does not perform its purpose, we call it broken.