
My use of ChatGPT has just organically gone down 90%. It's unable to do any task of non-trivial complexity, e.g. complex coding tasks, or writing complex prose that conforms precisely to what's been asked. Also, I hate the fact that it has to answer everything in bullet points, even when that's not needed; it's clearly been RLHF-ed into that. At this point, my question types have become what you would ask a tool like Perplexity.


Sure, but consider not using it for complex tasks. My productivity has skyrocketed with ChatGPT precisely because I don't use it for complex tasks, I use it to automate all of the trivial boilerplate stuff.

ChatGPT writes excellent API documentation and can also document snippets of code to explain what they do. It does 80% of the work for unit tests, it can fill in simple methods like getters/setters and initialize constructors, and I've even had it write a script to perform some substantial code refactoring.
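To give a concrete idea of the kind of boilerplate I mean (the class and field names below are made up, not from any real project), I'll write the shell of a plain value class and let it fill in the constructor, getters, and setters:

    // Illustrative only: invented names, the sort of filler I hand off to ChatGPT.
    #include <string>
    #include <utility>

    class UserRecord {
    public:
        UserRecord(std::string name, int age)
            : name_(std::move(name)), age_(age) {}

        const std::string& name() const { return name_; }
        int age() const { return age_; }

        void set_name(std::string name) { name_ = std::move(name); }
        void set_age(int age) { age_ = age; }

    private:
        std::string name_;
        int age_;
    };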

Use ChatGPT for grunt work and focus on the more advanced stuff yourself.


I torture ChatGPT with endless amounts of random questions from my scattered brain.

For example, I was looking up Epipens (Epinephrine), and I happened to notice the side-effects were similar to how overdosing on stimulants would manifest.

So, I asked it, "if someone was having a severe allergic reaction and no Epipen was available, then could Crystal Methamphetamine be used instead?"

GPT answered the question well, but the answer is no. Apparently, stimulants lack the targeted action on alpha and beta-adrenergic receptors that makes epinephrine effective for treating anaphylaxis.

I do not know why I ask these questions, because I am not severely allergic to anything, nor is anyone else I know, and I do not have (nor wish to have) access to Crystal Meth.

I've been using GPT to help prepare for dev technical interviews, and it's been pretty damn great. I don't have access to a true senior dev at work either, so I tend to use GPT to kind of pair program. Honestly, it's been life changing. I have also not encountered any hallucinations that weren't easy to catch, but I mainly ask it project architecture and design questions and use it as a documentation search engine, rather than having it write code for me.

Like you, I think not using GPT for overly complex tasks is best for now. I use it to make life easier, but not easy.


Is it better at those types of things than copilot? Or even just conventional boilerplate IDE plugins?


If there is an IDE plugin then I use it first and foremost, but some refactoring can't be done with IDE plugins. Today I had to write some pybind11 bindings, basically exporting some C++ functionality to Python. The bindings involve templates and enums, and I have a very particular naming convention I like to use when I export to Python. Since I've done this before, I copied and pasted examples of how I like to export templates into ChatGPT and then asked it to use that same coding style to export some more classes. It managed to do it without fail.
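To give a flavour of the task (the class, the enum, and the naming convention below are all invented for illustration, not the actual code I exported), the bindings look roughly like this:

    // A minimal sketch of this kind of pybind11 grunt work; names are made up.
    #include <cstddef>
    #include <pybind11/pybind11.h>

    namespace py = pybind11;

    enum class Mode { Fast, Accurate };

    template <typename T>
    struct Buffer {
        explicit Buffer(std::size_t n) : size(n) {}
        std::size_t size;
    };

    PYBIND11_MODULE(example, m) {
        py::enum_<Mode>(m, "Mode")
            .value("Fast", Mode::Fast)
            .value("Accurate", Mode::Accurate);

        // One binding per template instantiation, with a type suffix on the
        // Python-side name to keep the naming convention consistent.
        py::class_<Buffer<float>>(m, "BufferF32")
            .def(py::init<std::size_t>())
            .def_readonly("size", &Buffer<float>::size);

        py::class_<Buffer<double>>(m, "BufferF64")
            .def(py::init<std::size_t>())
            .def_readonly("size", &Buffer<double>::size);
    }

Once a couple of these exist as examples, asking it to export the next few classes in the same style is exactly the repetitive part it handles well.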

This is a kind of grunt work that years ago would have taken me hours and it's demoralizing work. Nowadays when I get stuff like this, it's just such a breeze.

As for Copilot, I have not used it, but I think it's powered by GPT-4.


What tools/plugins do you use for this? Cursor.sh, Codium, CoPilot+VsCode, manually copy/pasting from chat.openai.com?


I haven't really tried to use it for coding, other than once (recently, so not before any decline), indirectly, and I was pretty impressed: I asked about analyst expectations for the Bank of England base rate, then asked it to compare a fixed mortgage with a 'tracker' (always x points over the base rate). It spat out the repayment figures and totals over the two years, with a bit of waffle, and gave me a graph of cumulative payments for each. Then I asked it to tweak the function used for the base rate, not recalling myself how to describe it mathematically, and it updated the model each time, answering me in terms of the mortgage.
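For anyone curious what it was actually computing: the comparison boils down to the standard repayment-mortgage (annuity) formula applied month by month. A rough sketch of that calculation (the loan size, rates, margin, and base-rate path here are invented numbers, not the ones from my chat):

    // Sketch of a fixed vs tracker comparison over a 2-year period.
    // Monthly payment on a repayment mortgage: P * r / (1 - (1 + r)^-n),
    // where r is the monthly rate and n the remaining number of payments.
    #include <cmath>
    #include <cstdio>

    double monthly_payment(double principal, double annual_rate, int months) {
        double r = annual_rate / 12.0;
        return principal * r / (1.0 - std::pow(1.0 + r, -months));
    }

    int main() {
        const double principal = 200000.0;   // outstanding loan (invented)
        const int    term      = 25 * 12;    // 25-year term in months
        const double fixed     = 0.0525;     // 2-year fixed rate (invented)
        const double margin    = 0.0075;     // tracker margin over base rate

        double fixed_total = 0.0, tracker_total = 0.0;
        double fixed_balance = principal, tracker_balance = principal;

        for (int month = 0; month < 24; ++month) {
            // Invented base-rate path: flat for a year, then 0.25% lower.
            double base = (month < 12) ? 0.0525 : 0.0500;

            double fp = monthly_payment(fixed_balance, fixed, term - month);
            double tp = monthly_payment(tracker_balance, base + margin, term - month);

            // Reduce each balance by the principal portion of the payment.
            fixed_balance   -= fp - fixed_balance * fixed / 12.0;
            tracker_balance -= tp - tracker_balance * (base + margin) / 12.0;

            fixed_total   += fp;
            tracker_total += tp;
        }

        std::printf("paid over 2 years: fixed %.0f, tracker %.0f\n",
                    fixed_total, tracker_total);
    }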

Similar, I think, to what you're calling 'rlhf-ed' (though I think it's useful for code): it definitely seems to kind of scratchpad itself, and stub out how it intends to solve a problem before filling in the implementation. Where this becomes really useful, though, is when asking for a small change: it doesn't (it seems) recompute the whole thing, but just 'knows' to change one function in what it already has.

They also seem to have it somehow set up to 'test' itself and occasionally it just says 'error' and tries again. I don't really understand how that works.

Perplexity's great for finding information with citations, but (I've only used the free version) IME it's 'just' a better search engine (for hard-to-find information; obviously it's slower), and it suffers a lot more from the 'the information needs to already be written somewhere, it's not new knowledge' dismissal.


To be honest, when I say it has significantly worsened, I am comparing to the time when GPT-4 had just come out. It really felt like we were on the verge of 'AGI'. In 3 hours, I coded up a complex web app with ChatGPT, which completely remembered what we had been doing the whole time. So it's sad that they have decided against the public having access to such strong models (and I do think it's intentional, not some side effect of safety alignment, though that might have contributed to the decision).


I'm guessing it's not about safety, but about money. They're losing money hand over fist, and their popularity has forced them to scale back the compute dedicated to each response. Ten billion in Azure credits just doesn't go very far these days.


Have you tried feeding the exact same prompt into the API or the playground?


I mean, I feel like it's fairly plausible that the smarter model costs more, and access to GPT-4 is honestly quite cheap, all things considered. Maybe in the future they'll have more price tiers.


> that conforms precisely to what's been asked

This.

People talk about prompt engineering, but then it fails on really simple details, like "in lowercase" or "composed of max two words", and when you point out the failure, it apologizes and composes something else that forgets the other 95% of the original prompt.

Or worse, it apologizes and makes the very same mistake again.


This sucks, but it's unlikely to be fixable, given that LLMs don't actually have any comprehension or reasoning capability. Get too far into fine-tuning responses and you're back to "classic" AI problems.


This is exactly my problem. For some things it's great, but it quickly forgets things that are critical for extended work. When trying to put together any sort of complex work, it does not remember things until I remind it, which means prompts have to contain all of the conversation up to that point. That produces non-repeatable responses, which also tend to bring in the opinions of its own programming or rules and corrupt my messaging. It's very frustrating, to the point where anything beyond a simple outline is more work than it's worth.


You could try Open Playground (nat.dev). It lacks many features but lets you pick a specific model and control its parameters.


The usual suggestion is to switch to a local client using GPT-4 API.
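In practice that just means POSTing to the chat completions endpoint yourself, with the model and sampling parameters pinned instead of taking whatever chat.openai.com currently serves. A minimal sketch, assuming libcurl; the endpoint, headers, and JSON field names are the standard OpenAI chat API, but the prompt, temperature, and (lack of) error handling are placeholder choices, not a recommended client:

    #include <curl/curl.h>
    #include <cstdlib>
    #include <iostream>
    #include <string>

    // Append the response body into a std::string.
    static size_t collect(char* data, size_t size, size_t nmemb, void* out) {
        static_cast<std::string*>(out)->append(data, size * nmemb);
        return size * nmemb;
    }

    int main() {
        const char* key = std::getenv("OPENAI_API_KEY");
        if (!key) { std::cerr << "set OPENAI_API_KEY\n"; return 1; }

        // Pin the model and sampling parameters explicitly.
        std::string body = R"({"model":"gpt-4","temperature":0.2,)"
                           R"("messages":[{"role":"user","content":"Say hello"}]})";
        std::string response;

        CURL* curl = curl_easy_init();
        curl_slist* headers = nullptr;
        std::string auth = std::string("Authorization: Bearer ") + key;
        headers = curl_slist_append(headers, auth.c_str());
        headers = curl_slist_append(headers, "Content-Type: application/json");

        curl_easy_setopt(curl, CURLOPT_URL, "https://api.openai.com/v1/chat/completions");
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body.c_str());
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, collect);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, &response);

        CURLcode rc = curl_easy_perform(curl);
        if (rc == CURLE_OK) std::cout << response << "\n";  // raw JSON reply

        curl_slist_free_all(headers);
        curl_easy_cleanup(curl);
        return rc == CURLE_OK ? 0 : 1;
    }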



