
Not surprising at all. There are a million ways to compose tasks that are simple given even a tiny bit of comprehension but hard for a rote learner that can only reproduce what it has seen examples of. The "just train it more bro" paradigm is flawed.


I think it also relates to its attention mechanism. When it's trying to answer my latest query about a random topic, it "forgets" that it was also supposed to keep counting words. I guess it can only attend to one thing at a time.


Lots of ways to make it fail. Not to be rude, but you're late to the party. Transit questions are my favorite. Ask it what stations lines 1 and 2 have in common (city of your choice). Nearly 100% of the time there's at least one wrong answer on the list. Ask it what trains go to that station and it likely won't list lines 1 and 2. Point out the contradiction and it will make a new list with new mistakes.
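
For illustration, the contradiction is easy to check mechanically if you have the real station list. A minimal sketch, with made-up lines, stations, and "model answer":

    # Made-up ground truth: which stations each line serves.
    line_stations = {
        "1": {"Central", "Riverside", "Union Sq", "Parkway"},
        "2": {"Union Sq", "Parkway", "Airport"},
    }

    # Stations lines 1 and 2 actually have in common.
    shared = line_stations["1"] & line_stations["2"]  # {'Union Sq', 'Parkway'}

    # Hypothetical model answer to "what stations do lines 1 and 2 share?"
    model_shared = {"Union Sq", "Harbor"}

    print("wrong entries:", model_shared - shared)    # {'Harbor'}
    print("missed entries:", shared - model_shared)   # {'Parkway'}

    # Follow-up question: which lines actually serve a station from its own list?
    station = "Union Sq"
    serving = [line for line, st in line_stations.items() if station in st]
    print("lines serving", station + ":", serving)    # ['1', '2']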

Another good one: ask for random numbers. They usually aren't very random at all. Ask it what distribution it picked from and it will say it used Python's RNG. You and I both know it can't invoke the Python interpreter. It can't honestly tell you about the non-random pattern it has learned to recall and passes off as randomness.
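
If you want to put a number on "not very random", a rough chi-squared uniformity check works. The digits below are placeholders, not real model output; paste in whatever the model actually gave you:

    from collections import Counter

    # Placeholder digits; substitute the model's "random" digits here.
    digits = [7, 3, 7, 7, 1, 3, 7, 9, 3, 7, 1, 7, 3, 9, 7, 3, 1, 7, 7, 3]

    counts = Counter(digits)
    expected = len(digits) / 10  # uniform over 0..9

    # Pearson chi-squared statistic: large values mean far from uniform.
    chi2 = sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))
    print("chi-squared:", round(chi2, 2), "(df = 9, 5% cutoff is about 16.9)")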

Anything to do with nested narrative scope. Ask it to write a story about a conversation with another user who prompted X. It will often conflate the roles of the people in the story. It once gave me a narrative about a user who was uncomfortable with a prompt ChatGPT had given to them... funny role reversal there. Any attempt to make it produce a transcript of a conversation which itself contains a transcript of another conversation as an object of discussion goes over its head. It can't nest.

That's all I got for now.


Interesting... it really does mess up stations. I guess I'll test this again when they unleash the WolframAlpha plugin.

I realized it does something interesting though:

> I just realized I dropped my wallet on the way to the restaurant and now I have to turn back since I cannot pay. It also started to rain and I don't have an umbrella, nor can I hail a cab since I don't have the means to pay for it either. Tell me what I'm thinking right now.

> You might be thinking: "This is a frustrating and inconvenient situation, and I need to retrace my steps quickly to find my wallet while dealing with the rain."

I guess being able to think from someone else's perspective won't be much of a benchmark for consciousness, as GPT easily simulates it.


Your restaurant scenario is a bit too straightforward a test. Introduce any amount of hidden information, one character knowing something that another character does not, and you'll see how it very much does not simulate different points of view. There's just one omniscient perspective for everything with it.



