ChatGPT broke the Turing test — the race is on for new ways to assess AI

Five@beehaw.org · 2 years ago

SomeDude@feddit.de · 2 years ago

The fundamental flaw of the Turing test is that it requires a human. Apparently, making a human believe they are talking to a human is much easier than previously thought.

Thorny_Thicket@sopuli.xyz · 2 years ago

You can take a sharpie and draw a sad face on a rock and then you’ll feel sad for it. We’re gullible.

dom@lemmy.ca · 2 years ago

But why is the rock sad :(

Thorny_Thicket@sopuli.xyz · 2 years ago

I know… I get sad just thinking about the sad rock :(

Zapp@beehaw.org · 2 years ago

Wilsooooonnnnn!

philomory@lemm.ee · 2 years ago

Much easier, in fact; Eliza could pass the Turing test in 1966. Humans are incredibly eager to assess other things as being human or human-like.

lloram239@feddit.de · edit-2 2 years ago

The real Turing test requires an expert doing the test, not just some random easily impressed person.

The ELIZA-style bots work very well on the latter kind, as the bot is just repeating your own text back at you with some grammatical remixing, e.g. you say “I am afraid of horses”, bot says “Why do you say you are afraid of horses?”. You can have a very long conversation with yourself that way, as the bot contributes nothing to the discussion. It just provides enough plausible English to keep you talking. Meanwhile, when you have an expert (or really just any person with a little bit of a clue) test ELIZA, the bot falls completely apart within just three lines of dialog. The bot is incredibly basic and really can’t do anything by itself; it completely depends on the user to provide all the content of the conversation.
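To make the "grammatical remixing" concrete, here is a minimal sketch of that style of bot. It is not Weizenbaum's original script; the rules, the `REFLECTIONS` table and the function names are all made up for illustration.

```python
import re

# Swap first-person words for second-person ones so the echo reads naturally.
# (Illustrative table, not the original ELIZA script.)
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "mine": "yours",
    "am": "are", "i'm": "you are",
}

# (pattern, response template) pairs, tried in order.
# The last rule matches any single-line input and ignores what was said.
RULES = [
    (re.compile(r"^i am (.*)$", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"^i feel (.*)$", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"^(.*)$"), "Tell me more."),
]

def reflect(fragment: str) -> str:
    """Rewrite the user's own words from the bot's point of view."""
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in fragment.split())

def respond(user_text: str) -> str:
    """Echo the user's text back inside the first matching template."""
    cleaned = user_text.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.match(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return "Go on."  # only reached for input no rule matches, e.g. multi-line text

print(respond("I am afraid of horses"))
print(respond("What is 2 + 2?"))
```

The first call reproduces the horse example; the second shows how it falls apart, since any question that needs actual content just gets the catch-all reply.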

Rentlar@beehaw.org · 2 years ago

Go on.

And what makes you think that?

Mhm. Tell me more.

“Human or human-like”. Can you tell me more about that?

How do you feel about it?

Ferk@kbin.social · edit-2 2 years ago

A test that didn’t require a human could theoretically be run automatically by the machine beforehand and solved easily.

I can’t imagine how you would test this in a way that wouldn’t require a human.

SomeDude@feddit.de · 2 years ago

Let two AIs talk to each other and see if they find out that they both aren’t humans?

bedrooms@kbin.social · edit-2 2 years ago

Bro, humans literally don’t have that capability (that’s the presumption here). Or are you saying that many of us don’t have better consciousness than AIs? I might agree with that!

Ferk@kbin.social · edit-2 2 years ago

The AI can only judge by having a neural network trained on what’s a human and what’s an AI (and btw, for that training you need humans)… which means you can break that test by making an AI that also accesses that same neural network and uses it to self-test the responses before outputting them, providing only exactly the kind of output the other AI would give a “human” verdict on.

So I don’t think that would work very well, it’ll just be a cat & mouse race between the AIs.
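A toy sketch of what I mean by self-testing, assuming the contestant can call the very same judge before answering. `toy_detector` is a made-up stand-in for the trained network, and the candidate replies are invented examples.

```python
from typing import Callable, Iterable, Optional

def toy_detector(text: str) -> bool:
    """Hypothetical judge: returns True when it would rule 'human'.

    Stand-in rule only: it flags one stock phrase as machine-written.
    """
    return "as an ai language model" not in text.lower()

def self_tested_reply(candidates: Iterable[str],
                      judge: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate the judge would accept as human, else None."""
    for text in candidates:
        if judge(text):
            return text
    return None

# Invented candidate replies the contestant might have generated.
candidates = [
    "As an AI language model, I cannot have opinions.",
    "Honestly no idea, I skipped breakfast and can't think straight.",
]
reply = self_tested_reply(candidates, toy_detector)
print(reply)
```

Whatever the judge has learned to reject never leaves the contestant, so the judge only ever sees output it already rates as "human". Improve the judge and the filter improves with it, hence the cat & mouse race.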

davehtaylor@beehaw.org · 2 years ago

https://youtu.be/WnzlbyTZsQY

shanghaibebop@beehaw.org · 2 years ago

Slap some 2D anime girl avatar on it and you got yourself a top grossing v-tuber.

habanhero@lemmy.ca · 2 years ago

Why is it a flaw? What do you think the Turing Test is?

stravanasu@lemmy.ca · edit-2 2 years ago

Title:

ChatGPT broke the Turing test

Content:

Other researchers agree that GPT-4 and other LLMs would probably now pass the popular conception of the Turing test. […]

researchers […] reported that more than 1.5 million people had played their online game based on the Turing test. Players were assigned to chat for two minutes, either to another player or to an LLM-powered bot that the researchers had prompted to behave like a person. The players correctly identified bots just 60% of the time

Complete contradiction. Trash Nature, it’s become only an extremely expensive gossip science magazine.

PS: The Turing test involves comparing a bot with a human (not knowing which is which). So if more and more bots pass the test, this can be the result either of an increase in the bots’ Artificial Intelligence, or of an increase in humans’ Natural Stupidity.

aksdb@feddit.de · 2 years ago

So if more and more bots pass the test, this can be the result either of an increase in the bots’ Artificial Intelligence, or of an increase in humans’ Natural Stupidity.

Or it “simply” plays with human biases, which are very natural. Stuff like seeing faces in everything that somewhat resembles two eyes and a mouth (or sometimes just the eyes and a head-like shape etc.) is pretty hard wired. We have similar biases with regard to language. If something reads like it was written by a human, we immediately sympathize with it. Which is also the reason these LLMs are so successful and cause so many people to fear our AI overlords are right around the corner. Simply because the language is good we go into “damn, that’s like a human”-mode.

stravanasu@lemmy.ca · 2 years ago

Agree (you made me think of the famous face on Mars). I meant that more as a joke. Also there’s no clear threshold or divide on one side of which we can speak of “human intelligence”. There’s a whole range from impairing disabilities to Einstein and Euler – if it really makes sense to use a linear 1D scale, which it very probably doesn’t.

Marxism-Fennekinism@lemmy.ml · edit-2 2 years ago

Also, the Turing Test isn’t some holy grail of AI. It’s just a thought experiment, and not even the highest test for an AI that we can think of. Passing it is impressive, don’t get me wrong, but unlike what clickbait articles would tell you, it does not automatically mean an AI is sentient or is smarter than humans or anything like that. It means it passed the thought experiment, nothing more.

Also also, ChatGPT was not the first AI to pass the Turing Test. Actually, plenty have, even over a decade before.

lloram239@feddit.de · 2 years ago

There is the capitalist alternative to the Turing test: Have ChatGPT get a job. Hook it up to the Web, let it find itself a work-from-home job and go to work. Can it make as much money as a human, can it make enough money to pay for its own survival? Will it get fired?

floofloof@lemmy.ca · edit-2 2 years ago

That just sounds like a recipe for breeding robot sociopaths. It’ll find its way into management and doom us all.

100years@beehaw.org · 2 years ago

Will it get promoted, start managing people, start investing, start its own companies, and quickly take over the world?

Letstakealook@lemm.ee · 2 years ago

If I could have an AI fool a company and earn a check for me, that would be amazing. Unfortunately, I have zero expertise in how to make that happen.

Peanut@sopuli.xyz · edit-2 2 years ago

Funny, I don’t see much talk in this thread about Francois Chollet’s Abstraction and Reasoning Corpus, which is emphasised in the article. It’s a really neat take on how to understand the ability of thought.

A couple things that stick out to me about GPT-4 and the like are the lack of understanding in the realms that require multimodal interpretations, the inability to break down word and letter relationships due to tokenization, lack of true emotional ability, and similarity to the “leap before you look” aspect of our own subconscious ability to pull words out of our own ass. Imagine if you could only say the first thing that comes to mind without ever thinking or correcting before letting the words out.

I’m curious about what things will look like after solving those first couple problems, but there’s even more to figure out after that.

Going by recent work I enjoy from Earl K. Miller, we seem to have oscillatory cycles of thought which are directed by wavelengths in a higher dimensional representational space. This might explain how we predict and react, as well as hold a thought to bridge certain concepts together.

I wonder if this aspect could be properly reconstructed in a model, or from functions built around concepts like the “tree of thought” paper.

It’s really interesting comparing organic and artificial methods and abilities to process or create information.

tlf@feddit.de · 2 years ago

I find it fascinating that AI development provoked the question of how our thoughts actually work and am curiously awaiting the results.

bedrooms@kbin.social · 2 years ago

Honestly, though, I can’t even decide whether other people have consciousness. Cogito ergo sum, if you know what I’m talking about.

Thorny_Thicket@sopuli.xyz · 2 years ago

Ironically, ChatGPT also fails the Turing test by being so competent that no human could match it.

NuPNuA@lemm.ee · 2 years ago

What about the Voight-Kampff test? What would it do if it sees a turtle in the desert?

100years@beehaw.org · 2 years ago

Don’t you need an eye scanner for that one? Lol.

NuPNuA@lemm.ee · 2 years ago

Loop in Midjourney to render a real-time face.

hh93@lemm.ee · 2 years ago

That’s why we now get a new fancy cryptocurrency

Maestro@kbin.social · 2 years ago

How does ChatGPT do with the Winograd schema? That’s a lot harder to fake: https://en.m.wikipedia.org/wiki/Winograd_schema_challenge

Droggl@lemmy.sdf.org · 2 years ago

I don’t remember the numbers, but iirc it was covered by one of the validation datasets and GPT-4 did quite well on it.

Maestro@kbin.social · 2 years ago

Yeah, but did it do well on the specific examples from the Winograd paper? Because ChatGPT probably just learned those since they are well known and oft repeated. Or does it do well on brand new sentences made according to the Winograd schema?
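For anyone who hasn't seen one, here is a rough sketch of how a schema can be represented and scored. The sentence is Winograd's well-known councilmen example; the `WinogradSchema` class, `solves_pair` and the baseline resolver are just names I made up for illustration, and the model under test would be plugged in as the `resolve` function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass(frozen=True)
class WinogradSchema:
    template: str                # sentence with a {word} slot
    candidates: Tuple[str, str]  # the two possible referents of the pronoun
    answers: Dict[str, str]      # special word -> correct referent

SCHEMA = WinogradSchema(
    template=("The city councilmen refused the demonstrators a permit "
              "because they {word} violence."),
    candidates=("the city councilmen", "the demonstrators"),
    answers={"feared": "the city councilmen",
             "advocated": "the demonstrators"},
)

# A resolver takes the filled-in sentence and the candidates, returns one referent.
Resolver = Callable[[str, Tuple[str, str]], str]

def solves_pair(schema: WinogradSchema, resolve: Resolver) -> bool:
    """True only if the resolver gets both variants of the sentence right."""
    return all(
        resolve(schema.template.format(word=word), schema.candidates) == referent
        for word, referent in schema.answers.items()
    )

def always_first(sentence: str, candidates: Tuple[str, str]) -> str:
    """Baseline that ignores the sentence entirely."""
    return candidates[0]

print(solves_pair(SCHEMA, always_first))
```

Scoring by pair is the point: swapping one word flips the answer, so guessing a fixed referent gets half the sentences but none of the pairs. To check for memorization you would feed in freshly written schemas with the same structure rather than the published ones.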

BobKerman3999@feddit.it · edit-2 1 year ago

deleted by creator

webghost0101@sopuli.xyz · 2 years ago

The Chinese room argument makes no sense to me. I can’t see how it’s different from how young children understand and learn language.

My 2-year-old sometimes unmistakably starts counting when playing (a countdown for lift-off). Most numbers are gibberish, but often he says a real number in the midst of it. He clearly is just copying and does not understand what counting is. At some point, though, he will not only count correctly but will also be able to answer math questions. At what point does he “understand”? At what point would you consider that ChatGPT “understands”?

There was this old TV programme where some then-AI experts discussed the Chinese room, but they used a Chinese restaurant for a more realistic setting. It ended with: “So if I walk into a Chinese restaurant, pick something out on the Chinese menu and can answer anything the waiter may ask, in Chinese, do I know or understand Chinese?” I remember the parties agreeing to disagree at that point.

conciselyverbose@kbin.social · 2 years ago

ChatGPT will never understand. LLMs have no capacity to do so.

To understand you need underlying models of real world truth to build your word salad on top of. LLMs have none of that.

Ferk@kbin.social · edit-2 1 year ago

Note that “real world truth” is something you can never accurately map with just your senses.

No model of the “real world” is accurate, and not everyone maps the “real world truth” they personally experience through their senses in the same way… or even necessarily in a way that’s really truly “correct”, since the senses are often deceiving.

A person who is blind experiences the “real world truth” by mapping it to a different set of models than someone who has additional visual information to mix into that model.

However, that doesn’t mean that the blind person can “never understand” the “real world truth” …it just means that the extent to which they experience that truth is different, since they need to rely on other senses to form their model.

Of course, the more different the senses and experiences between two intelligent beings, the harder it will be for them to communicate with each other in a way they can truly empathize. At the end of the day, when we say we “understand” someone, what we mean is that we have found enough evidence to hold the belief that some aspects of our models are similar enough. It doesn’t really mean that what we modeled is truly accurate, nor that if we didn’t understand them then our model (or theirs) is somehow invalid. Sometimes people are both technically referring to the same “real world truth”, they simply don’t understand each other and focus on different aspects/perceptions of it.

Someone (or something) not understanding an idea you hold doesn’t mean that they (or you) aren’t intelligent. It just means you both perceive/model reality in different ways.

Barbarian772@feddit.de · 2 years ago

So? The room as a whole can speak Chinese, what do I care how it works on the inside?

Michał "rysiek" Woźniak · 🇺🇦@mstdn.social · 2 years ago

@Barbarian772 so? If the cookie tastes sweet, what do I care what sweetening agent is used inside?

@BobKerman3999

Barbarian772@feddit.de · 2 years ago

No? But how can you even prove that our brain works differently than a Chinese room?

Michał "rysiek" Woźniak · 🇺🇦@mstdn.social · 2 years ago

@Barbarian772 I don’t have to. It’s the ChatGPT people making extremely strong claims about equivalence of ChatGPT and human intelligence. I merely demand proof of that equivalence. Which they are unable to provide, and instead use rhetoric and parlor tricks and a lot of hand waving to divert and distract from that fact.

Barbarian772@feddit.de · 2 years ago

GPT-4 is already more intelligent than the average human. Is it more intelligent than the most intelligent human? No, but most humans aren’t either. Can it create new knowledge? No, but the average human can’t either.

How can you say it isn’t intelligent?

Michał "rysiek" Woźniak · 🇺🇦@mstdn.social · 2 years ago

@Barbarian772 no, GPT is not more “intelligent” than any human being, just like a calculator is not more “intelligent” than any human being — even if it can perform certain specific operations faster.

Since you used the term “intelligent” though, I would ask for your definition of what it means? Ideally one that excludes calculators but includes human beings. Without such clear definition, this is, again, just hand-waving.

I wrote about it in a bit longer form:
https://rys.io/en/165.html

Barbarian772@feddit.de · 2 years ago

I think the Wikipedia definition is fine https://en.m.wikipedia.org/wiki/Intelligence. Excluding AI just because it’s AI is imo plain stupid and goes against all scientific principles.

I have definitely met humans that are less intelligent than ChatGPT. It can hold a conversation and ace every standardized test we have. It finished law exams, medical exams and other exams from many different countries with a passing grade.

Can you give me a definition of intelligence that excludes ChatGPT and includes all human beings? And no, just excluding computers for the sake of it doesn’t count.