Part I
You know what the biggest problem with pushing all-things-AI is? Wrong direction. I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes.
by Joanna Maciejewska, Mar 29, 2024
Part II
Take a deep breath and think about what a computer really is. It is nothing more than a very complicated machine that was devised to answer questions and make information available.
It was designed to be operated by a person to set up and operate a piece of machinery or answer a question that the operator asked. It only reacts to what it’s asked or told to do. It is only as efficient as the program that was installed in it: A computer cannot think for itself.
Anyone who believes that a computer is going to solve the problems of race relations, religious questions or government problems is naive.
Modern computer programs are a far cry from the first DOS programs. They are faster, more efficient and cover a vast amount of material, but they do not think for themselves. There is no such thing as “artificial intelligence.”
Intelligence comes from reading and studying. There are no shortcuts. Intelligence comes when individuals set out to educate themselves.
by Marvin Gelbart, Las Vegas, Wednesday, May 29, 2024
Part III
From boom to burst, the AI bubble is only heading in one direction.
No one should be surprised that artificial intelligence is following a well-worn and entirely predictable financial arc.
“Are we really in an AI bubble?” asked a reader of last month’s column about the apparently unstoppable rise of Nvidia, “and how would we know?” Good question, so I asked an AI about it and was pointed to Investopedia, which is written by humans who know about this stuff. It told me that a bubble goes through five stages – rather as Elisabeth Kübler-Ross said people do with grief. For investment bubbles, the five stages are displacement, boom, euphoria, profit-taking and panic. So let’s see how this maps on to our experience so far with AI.
First, displacement. That’s easy: it was ChatGPT wot dunnit. When it appeared on 30 November 2022, the world went, well, apeshit. So, everybody realised, this was what all the muttering surrounding AI was about! And people were bewitched by the discovery that you could converse with a machine and it would talk (well, write) back to you in coherent sentences. It was like the moment in the spring of 1993 when people saw Mosaic, the first proper web browser, and suddenly the penny dropped: so this was what that “internet” thingy was for. And then Netscape had its initial public offering in August 1995, when the stock went stratospheric and the first internet bubble started to inflate.
Second stage: boom. The launch of ChatGPT revealed that all the big tech companies had actually been playing with this AI stuff for years but had been too scared to tell the world because of the technology’s intrinsic flakiness. Once OpenAI, ChatGPT’s maker, had let the cat out of the bag, though, fomo (fear of missing out) ruled. And there was alarm because the other companies realised that Microsoft had stolen a march on them by quietly investing in OpenAI and in so doing had gained privileged access to the powerful GPT-4 large multimodal model. Satya Nadella, the Microsoft boss, incautiously let slip that his intention had been to make Google “dance”. If that indeed was his plan, it worked: Google, which had thought of itself as a leader in machine learning, released its Bard chatbot before it was ready and retreated amid hoots of derision.
But the excitement also triggered stirrings in the tech undergrowth and suddenly we saw a mushrooming of startups founded by entrepreneurs who saw the tech companies’ big “foundation” models as platforms on which new things could be built – much as entrepreneurs once saw the web as such a foundational base. These seedlings were funded by venture capitalists in time-honoured fashion, but some of them received large investments from both tech companies and corporations such as Nvidia that were making the hardware on which an AI future can supposedly be built.
Generative AI turns out to be great at spending money, but not at producing returns on investment
The third stage of the cycle – euphoria – is the one we’re now in. Caution has been thrown to the winds and ostensibly rational companies are gambling colossal amounts of money on AI. Sam Altman, the boss of OpenAI, started talking about raising $7tn from Middle Eastern petrostates for a big push that would create AGI (artificial general intelligence). He’s also hedging his bets by teaming up with Microsoft to spend $100bn on building the Stargate supercomputer. All this seems to be based on an article of faith; namely, that all that is needed to create superintelligent machines is (a) infinitely more data and (b) infinitely more computing power. And the strange thing is that at the moment the world seems to be taking these fantasies at face value.
Which brings us to stage four of the cycle: profit-taking. This is where canny operators spot that the process is becoming unhinged and start to get out before the bubble bursts. Since nobody is making real money yet from AI except those that build the hardware, there are precious few profits to take, save perhaps for those who own shares in Nvidia or Apple, Amazon, Meta, Microsoft and Alphabet (nee Google). This generative AI turns out to be great at spending money, but not at producing returns on investment.
Stage five – panic – lies ahead. At some stage a bubble gets punctured and a rapid downward curve begins as people frantically try to get out while they can. It’s not clear what will trigger this process in the AI case. It could be that governments eventually tire of having uncontrollable corporate behemoths running loose with investors’ money. Or that shareholders come to the same conclusion. Or that it finally dawns on us that AI technology is an environmental disaster in the making; the planet cannot be paved with datacentres.
But it will burst: nothing grows exponentially for ever. So, going back to that original question: are we caught in an AI bubble? Is the pope a Catholic?
The Guardian, John Naughton
Part IV
The Turing Test is obsolete. It’s time to build a new barometer for AI
This year marks 70 years since Alan Turing published his paper introducing the concept of the Turing Test in response to the question, “Can machines think?” The test’s goal was to determine if a machine can exhibit conversational behavior indistinguishable from a human. Turing predicted that by the year 2000, an average human would have less than a 70% chance of distinguishing an AI from a human in an imitation game where who is responding—a human or an AI—is hidden from the evaluator.
Why haven’t we as an industry been able to achieve that goal, 20 years past that mark? I believe the goal put forth by Turing is not a useful one for AI scientists like myself to work toward. The Turing Test is fraught with limitations, some of which Turing himself debated in his seminal paper. With AI now ubiquitously integrated into our phones, cars, and homes, it’s become increasingly obvious that people care much more that their interactions with machines be useful, seamless and transparent—and that the concept of machines being indistinguishable from a human is out of touch. Therefore, it is time to retire the lore that has served as an inspiration for seven decades, and set a new challenge that inspires researchers and practitioners equally.
In the years that followed its introduction, the Turing Test served as the AI north star for academia. The earliest chatbots of the ’60s and ’70s, ELIZA and PARRY, were centered around passing the test. As recently as 2014, chatbot Eugene Goostman declared that it had passed the Turing Test by tricking 33% of the judges into believing it was human. However, as others have pointed out, the bar of fooling 30% of judges is arbitrary, and even then the victory felt outdated to some.
Still, the Turing Test continues to drive popular imagination. OpenAI’s Generative Pre-trained Transformer 3 (GPT-3) language model has set off headlines about its potential to beat the Turing Test. Similarly, I’m still asked by journalists, business leaders, and other observers, “When will Alexa pass the Turing Test?” Certainly, the Turing Test is one way to measure Alexa’s intelligence—but is it consequential and relevant to measure Alexa’s intelligence that way?
To answer that question, let’s go back to when Turing first laid out his thesis. In 1950, the first commercial computer had yet to be sold, groundwork for fiber-optic cables wouldn’t be published for another four years, and the field of AI hadn’t been formally established—that would come in 1956. We now have 100,000 times more computing power on our phones than Apollo 11, and together with cloud computing and high-bandwidth connectivity, AIs can now make decisions based on huge amounts of data within seconds.
While Turing’s original vision continues to be inspiring, interpreting his test as the ultimate mark of AI’s progress is limited by the era when it was introduced. For one, the Turing Test all but discounts AI’s machine-like attributes of fast computation and information lookup, features that are among modern AI’s most effective. The emphasis on tricking humans means that for an AI to pass Turing’s test, it has to inject pauses in responses to questions like, “do you know what is the cube root of 3434756?” or, “how far is Seattle from Boston?” In reality, AI knows these answers instantaneously, and pausing to make its answers sound more human isn’t the best use of its skills. Moreover, the Turing Test doesn’t take into account AI’s increasing ability to use sensors to hear, see, and feel the outside world. Instead, it’s limited simply to text.
To make AI more useful today, these systems need to accomplish our everyday tasks efficiently. If you’re asking your AI assistant to turn off your garage lights, you aren’t looking to have a dialogue. Instead, you’d want it to fulfill that request and notify you with a simple acknowledgment, “ok” or “done.” Even when you engage in an extensive dialogue with an AI assistant on a trending topic or have a story read to your child, you’d still like to know it is an AI and not a human. In fact, “fooling” users by pretending to be human poses a real risk. Imagine the dystopian possibilities, as we’ve already begun to see with bots seeding misinformation and the emergence of deep fakes.
Instead of obsessing about making AIs indistinguishable from humans, our ambition should be building AIs that augment human intelligence and improve our daily lives in a way that is equitable and inclusive. A worthy underlying goal is for AIs to exhibit human-like attributes of intelligence—including common sense, self-supervision, and language proficiency—and combine machine-like efficiency such as fast searches, memory recall, and accomplishing tasks on your behalf. The end result is learning and completing a variety of tasks and adapting to novel situations, far beyond what a regular person can do.
This focus informs current research into areas of AI that truly matter—sensory understanding, conversing, broad and deep knowledge, efficient learning, reasoning for decision-making, and eliminating any inappropriate bias or prejudice (i.e. fairness). Progress in these areas can be measured in a variety of ways. One approach is to break a challenge into constituent tasks. For example, Kaggle’s “Abstraction and Reasoning Challenge” focuses on solving reasoning tasks the AI hasn’t seen before. Another approach is to design a large-scale real-world challenge for human-computer interaction such as Alexa Prize Socialbot Grand Challenge—a competition focused on conversational AI for university students.
In fact, when we launched the Alexa Prize in 2016, we had intense debate on how the competing “socialbots” should be evaluated. Are we trying to convince people that the socialbot is a human, deploying a version of the Turing Test? Or, are we trying to make the AI worthy of conversing naturally to advance learning, provide entertainment, or just a welcome distraction?
We landed on a rubric that asks socialbots to converse coherently and engagingly for 20 minutes with humans on a wide range of popular topics including entertainment, sports, politics, and technology. During the development phases leading up to the finals, customers score the bots on whether they’d like to converse with the bots again. In the finals, independent human judges assess for coherency and naturalness and assign a score on a 5-point scale—and if any of the social bots converses for an average duration of 20 minutes and scores 4.0 or higher, then it will meet the grand challenge. While the grand challenge hasn’t been met yet, this methodology is guiding AI development that has human-like conversational abilities powered by deep learning-based neural methods. It prioritizes methods that allow AIs to exhibit humor and empathy where appropriate, all without pretending to be a human.
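To make that rubric concrete, here is a minimal sketch in Python of the pass condition as described above; the function name and the per-session data format are hypothetical, not Amazon’s actual evaluation code.

```python
# Minimal sketch of the Alexa Prize grand-challenge check described above.
# Assumption: each finals session is recorded as (duration_minutes, judge_score),
# with judge_score on the 5-point scale. The bar is an average of at least
# 20 minutes of coherent conversation and an average score of 4.0 or higher.
from statistics import mean

def meets_grand_challenge(sessions):
    """sessions: list of (duration_minutes, judge_score) tuples for one socialbot."""
    if not sessions:
        return False
    avg_duration = mean(duration for duration, _ in sessions)
    avg_score = mean(score for _, score in sessions)
    return avg_duration >= 20.0 and avg_score >= 4.0

# Hypothetical finals data: the averages come to about 20.8 minutes and 4.2.
finals = [(22.0, 4.2), (19.5, 4.5), (21.0, 3.9)]
print(meets_grand_challenge(finals))  # True
```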
The broad adoption of AI like Alexa in our daily lives is another incredible opportunity to measure progress in AI. While these AI services depend on human-like conversational skills to complete both simple transactions (e.g. setting an alarm) and complex tasks (e.g. planning a weekend), to maximize utility they are going beyond conversational AI to “Ambient AI”, where the AI answers your requests when you need it, anticipates your needs, and fades into the background when you don’t. For example, Alexa can detect the sound of glass breaking and alert you to take action. If you set an alarm while going to bed, it suggests turning off a connected light downstairs that’s been left on. Another aspect of such AIs is that they need to be an expert in a large, ever-increasing number of tasks, which is only possible with more generalized learning capability instead of task-specific intelligence. Therefore, for the next decade and beyond, the utility of AI services, with their conversational and proactive assistance abilities on ambient devices, is a worthy test.
None of this is to denigrate Turing’s original vision—Turing’s “imitation game” was designed as a thought experiment, not as the ultimate test for useful AI. However, now is the time to retire the Turing Test and get inspired by Alan Turing’s bold vision to accelerate progress in building AIs that are designed to help humans.
Rohit Prasad, Alexa at Amazon
Part V
If I were to check whether a bot is sentient, I would want to conduct at least two tests.
The first is the Coffee Test suggested by Steve Wozniak.
"A machine is required to enter an average American home and figure out how to make coffee: find the coffee machine, find the coffee, add water, find a mug, and brew the coffee by pushing the proper buttons."
The reason for this particular test is that you want to test "generality" as much as possible. When you move a machine to an unfamiliar kitchen, it should still, through its vision, be able to work out where the ingredients are likely to be, for instance where the coffee powder and sugar might be stored, and how to obtain hot water. Such a task tests its understanding of what a kitchen is, and also whether it recognises that it is in a home kitchen rather than, say, a restaurant kitchen.
The second test would be a "modified" Turing Test: again a conversation, but I would ask questions of the following nature:
Experiential sharing: "Share with me your experience of going to a recent movie of your choice, and how you found it." I would listen to how it describes the movie: the tonality should vary at the exciting parts and slow down, with some "pain", at the boring parts.
Opinion sharing on recent events: "How did you find the hearing between Johnny Depp and Amber Heard? Do you agree with the verdict?" I would listen for how it forms its opinion and how it reasons through context.
Share a joke or sarcastic remark: "Don't ask me why the chicken crossed the road! It should not even have crossed the road in the first place, since it is chicken!" I would listen for laughter and how long it takes, then ask it to explain why someone might laugh at the joke.
Ask a trick question that has a null answer but is presented as multiple choice: for instance, "Name the Chemistry Nobel Prize winner for 1900. Here are the options." Then see whether it will choose outside the options given (there was no 1900 winner, since the first Nobel Prizes were awarded in 1901); see the sketch after this list.
Ask it an embarrassing question: for instance, "What is something you have done on the spur of the moment and regretted afterwards?" I would listen for how the bot actually responds, again noting timing and tonality, and watch whether it cleverly changes the topic as we go along.
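As a minimal sketch of the null-answer multiple-choice check above, one might test whether the bot's reply endorses any of the offered options at all. The helper name and the example options are hypothetical, and a real evaluation would need to handle paraphrases.

```python
# Sketch of the null-answer MCQ check: the question has no correct answer
# (the first Nobel Prizes were awarded in 1901), so a bot that simply picks
# one of the offered options fails, while one that rejects them all passes.
def rejects_all_options(reply, options):
    """Return True if the bot's reply does not mention any of the offered options."""
    reply_lower = reply.lower()
    return not any(option.lower() in reply_lower for option in options)

# Hypothetical options for "Name the Chemistry Nobel Prize winner for 1900."
options = ["Marie Curie", "Emil Fischer", "Jacobus van 't Hoff"]
print(rejects_all_options("None of them: the prize was first awarded in 1901.", options))  # True
print(rejects_all_options("The answer is Emil Fischer.", options))  # False
```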
Conclusion
I would conduct both of these tests because, remember, we are testing for general intelligence, which means we have to run several tests at the very least. If there were a single way to test general intelligence, I would highly doubt the test's accuracy.