Recently I came across an article published by LRT Capital Management on the investment site Seeking Alpha with the provocative title "AI: Fakes, False Promises, Scams." The authors clearly believe that the new generation of generative AI is overhyped, and they cite several examples in which demonstrations of AI capabilities were staged or faked. I have looked into some of these examples in detail, and the article's account appears to be accurate. To give you a sense of what they say, I will quote a few excerpts here:
In 2023, Google was under immense pressure to deliver impressive innovations in the AI race. In response, the company released Google Gemini as an answer to OpenAI's ChatGPT. Gemini's announcement in December 2023 featured a video showcasing its capabilities, particularly what was billed as multimodal AI: the model appeared to listen to a person, respond to spoken queries, and analyze and describe images in real time. The breakthrough was widely acclaimed. However, it was later revealed that the video had been staged and edited and did not represent Gemini's actual capabilities.
…OpenAI, the company behind the groundbreaking ChatGPT, has a history of questionable demos and overpromises. The company boasted that its GPT-4 model could achieve a 90th-percentile score on the Uniform Bar Exam, but when researchers looked closely at this claim, they found that it did not perform as well as advertised. (10) In fact, OpenAI had manipulated the comparison, and when the results were independently replicated, GPT-4 achieved only a 15th-percentile score on the Uniform Bar Exam.
…Amazon has also entered the fray. You may remember Amazon Go, the AI-powered shopping initiative that promised you could pick up items in a store and simply walk out; the stores were equipped with cameras, machine-learning algorithms, and AI that would detect what you put in your bag and charge it to your Amazon account. Unfortunately, Amazon Go also recently turned out to be a scam. Because the computer-vision models did not work reliably, this so-called AI depended on thousands of remote workers in India reviewing shoppers' behavior.
…Facebook introduced M, an assistant touted as AI-powered. It later emerged that about 70% of requests were actually handled by remote human workers. The cost of maintaining the program was too high, and the company was forced to discontinue the assistant.
… If your question doesn’t fit any known examples, ChatGPT will still generate an answer and confidently explain it, even if it’s wrong.
For example, ask it "How many stones should I eat?" and it will oblige with a confident answer.
…Proponents of AI and large language models argue that while some of these demos may be fake, the overall quality of AI systems is continually improving. Unfortunately, I have to share some disappointing news with you: the performance of large language models appears to have plateaued. This stands in stark contrast to the dramatic leap OpenAI made between GPT-2 and GPT-3. Today, larger, more complex, and more expensive models are being developed, but the improvements they deliver are minimal. Moreover, we face a major challenge: the supply of data available to train these models is running out. The most advanced models have already been trained on essentially all available internet data, yet the demand for more training data is insatiable. One proposal is to have AI models generate synthetic data and use it to train ever more capable models indefinitely. However, a recent study in the journal Nature revealed that models trained recursively on synthetic data come to produce inaccurate and nonsensical responses, a phenomenon known as "model collapse."
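To make the feedback loop behind model collapse concrete, here is a minimal toy sketch of my own; it is not the Nature paper's LLM experiment. The "model" is just a Gaussian fit, and each generation is trained solely on samples drawn from the previous generation's fit. The sample size and generation count are arbitrary choices for illustration.

```python
# Toy illustration of the "model collapse" feedback loop: each generation's
# model is fit only to synthetic data sampled from the previous generation.
# Here the "model" is a Gaussian (mean + standard deviation) rather than an
# LLM, so the effect is easy to see: estimation error compounds, and the
# distribution's spread (its "tails") tends to wither away.
import random
import statistics

random.seed(42)

N_SAMPLES = 50      # small training sets make the compounding error visible
N_GENERATIONS = 30

# Generation 0 trains on "real" data: a standard normal distribution.
data = [random.gauss(0.0, 1.0) for _ in range(N_SAMPLES)]

for gen in range(N_GENERATIONS):
    mu = statistics.fmean(data)    # "train": fit the mean and spread
    sigma = statistics.stdev(data)
    if gen % 5 == 0:
        print(f"generation {gen:2d}: mean={mu:+.3f}  std={sigma:.3f}")
    # The next generation never sees real data, only the fitted model's output.
    data = [random.gauss(mu, sigma) for _ in range(N_SAMPLES)]
```

Run it a few times: the estimated spread drifts from one generation to the next and, over enough generations, tends to shrink, because variance lost in one fitting step can never be recovered by a later one. It is a caricature of the Nature result, but it captures why purely synthetic training data is not a free lunch.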
Okay, enough already. These authors have an interesting perspective. The truth probably lies somewhere between their extreme skepticism and the breathless hype we've been hearing for the past two years. I suspect the most practical near-term use of AI will be narrower, behind-the-scenes data mining for business applications, rather than anything that truly mimics how humans think.