The prompts aren't very useful on their own. You'd see essentially the same prompts on every ticket from me.
Prompt 1: "Research <X> domain, think deeply, and record a full analysis in /docs/TICKET-123-NOTES.md"
Prompt 2: "Based on our research, read TICKET-123 and begin formulating solutions. Let's think this problem through and come up with multiple potential solutions. Document our solutions in TICKET-123-SOLUTIONS.md"
Prompt 3: "Based on Solution X, let's formulate a complete plan to implement. Break the work into medium-sized tasks that a human could complete in 5-10 hours. Write our plan in TICKET-123-PLAN.md"
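To underline how boilerplate these prompts are, here's a minimal sketch that templates the three-step sequence per ticket. The function name and signature are my own illustration, not part of any real tool:

```python
def ticket_prompts(ticket: str, domain: str) -> list[str]:
    """Build the research -> solutions -> plan prompt sequence for one ticket."""
    return [
        # Step 1: research notes
        f"Research {domain} domain, think deeply, and record a full analysis "
        f"in /docs/{ticket}-NOTES.md",
        # Step 2: candidate solutions
        f"Based on our research, read {ticket} and begin formulating solutions. "
        f"Let's think this problem through and come up with multiple potential "
        f"solutions. Document our solutions in {ticket}-SOLUTIONS.md",
        # Step 3: implementation plan
        f"Based on Solution X, let's formulate a complete plan to implement. "
        f"Break the work into medium-sized tasks that a human could complete "
        f"in 5-10 hours. Write our plan in {ticket}-PLAN.md",
    ]

for p in ticket_prompts("TICKET-123", "<X>"):
    print(p)
```

Only the ticket ID and domain change from task to task; everything of value lands in the generated artifacts, not the prompts.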
I've often thought that some of these artifacts, such as the research, the solutioning, and the plan, could be shared; I think they're valuable for code review. I've also translated these artifacts into other developer-documentation paradigms.
But the prompts? You're not getting a lot of value there.
> Prompt 1: "Research <X> domain, think deeply, and record a full analysis in /docs/TICKET-123-NOTES.md"
> Prompt 2: "Based on our research, read TICKET-123 and begin formulating solutions. Let's think this problem through and come up with multiple potential solutions. Document our solutions in TICKET-123-SOLUTIONS.md"
> Prompt 3: "Based on Solution X, let's formulate a complete plan to implement. Break the work into medium-sized tasks that a human could complete in 5-10 hours. Write our plan in TICKET-123-PLAN.md"
Sounds to me like all these 10x-100x "engineers" can be removed from the loop.
Almost! We are certainly on the precipice of the vast majority of white-collar work being removed from the loop.
However, what each domain (engineering included) will tell you is that AI doesn't understand the full context of what you're doing: the point of the business, where to spend effort, and where to cut corners. There is definitely still room for competent engineers to iterate on the solutioning and plans, refining the AI's work into something sturdier.
Though this is only true in domains where code quality truly matters. A lot of consumer software without SLAs is just vibe-coded at full speed now: no code review, AI writing 100% of the code.
What a utopia, where code quality matters in all domains!
In my opinion, nearly the opposite is true: modern business solves for "minimum viable quality". What is the absolute lowest quality the software can be without tanking the business?
If you could prove what "minimum viable quality" actually was, this would be true. We have standards and procedures precisely because it is unknowable. One engineer's idea of "good enough" might bankrupt the business.
> What a utopia, where code quality matters in all domains!
It does. The degree may not, though.
"We have a threshold of at least 5 hours total uptime every 24 hours" is still a quality bar, even if it is different to "We have a threshold of 99.99% uptime per year".
Maybe you're different, but I prefer to write code that at least attempts to be performant, tidy, and readable, and that works at least 90% of the time. I may not achieve perfection, but I try to care about the quality of what I write.
You seem pretty smart. If suddenly, after over a decade, schizophrenic artifacts appear in one single isolated subject (a subject well known and documented, with equal and greater concerns raised by highly credible sources), does that perhaps imply the subject itself may be inducing schizophrenia? Maybe a pathological system is inducing pathological effects? Strangely, I feel fine.
Regardless, gaslight as you will; the public will see the implication, which is that questioning LLMs is, to some (you?), symptomatic of psychological pathology. In my opinion, that level of trust, or faith, is naive toward such a novel but powerful technology.
And the basic premise seems to be: the user questions sensitive system attributes, so pathologize the user. Imply the system is infallible and that any doubt suggests mental incapacitation. Point out every possible flaw in the user while deflecting any attention from the system.
That's tried and true. I wish you luck. Meanwhile, the message becomes clearer and clearer.
Autopilot isn't full self-driving (FSD); most cars these days ship with smart cruise control, which is basically what Autopilot is. Do you have fatality statistics for FSD?
If we are just talking about smart cruise control, most cars use cameras and radar, not lidar yet. Tesla is unusual in that it doesn't even use radar for its smart cruise control implementation, which could make it less safe than other new cars with smart cruise control. But Autopilot was never competing with Waymo.
I find it interesting that you latched onto their jailor metaphor, but had nothing to say about their core goal: protecting my privacy.
I'm okay with the people in charge of building on top of my private information being jailed by very strict, mean sounding, actually-higher-than-you people whose only goal is protecting my information.
Quite frankly, if you changed any word of that, they'd probably be impotent and my data would be toast.
Isn't that the vast majority of products? By making things easier, they change the scale at which things are accomplished. Farming wasn't impossible before the tractor.
People seemingly have some very odd views on products when it comes to AI.
It's actually a fair question. There are software projects I wouldn't have taken on without an LLM. Not because I couldn't make it. But because of the time needed to create it.
I could have taken the time to do the math to figure out what the rewards structure is for my Wawa points and compare it to my car's fuel tank to discover I should strictly buy sandwiches and never gas.
People have been making nude celebrity photos for decades now with just Photoshop.
Some activities have gotten a speedup, but so far it was all possible before, just not always feasible.
This conversation is naive and reduces technologies to "does it achieve something you otherwise couldn't?"
The answer is that chatgpt allows you to do things more efficiently than before. Efficiency doesn’t sound sexy but this is what adds up to higher prosperity.
Arguments like this can be used against the internet. What does it allow you to do that you couldn't do before?
The answer might be: "Oh, I don't know, it allows me to search and index information, talk to friends."
It doesn't sound that sexy. You can still visit a library. You can still phone your friends. But the ease of doing so adds up and creates a whole ecosystem that brings so many things.
No. I'm just stating that a huge portion of these comments carry their own emotional investment and are confusing OUGHT with IS. On top of that, their arguments aren't particularly sound, and if they were applied to any other technology we worship here in the church of HN, they would look like an advanced form of hypocrisy.
...generate piles of low-quality content for almost free.
AI is fascinating technology with undoubtedly fantastic applications in the future, but LLMs mostly seem to be doing two things: providing a small speedup to high-quality work, and a massive speedup to low-quality work.
I don't think it's comparable to the plow or the phone in its impact on society, unless that impact will be drowning us in slop.
There is a particular problem with your line of thinking, and it's one AI will never be able to solve. In fact, it's not a solved human problem either.
And that is that slop work is always easier and cheaper than doing something right. We can already make perfectly good products, yet we find Shein and Temu filled with crap. That's not related to AI; humans drown themselves in trash whenever we gain the technological capability to do so.
To put this another way: you cannot get a 10x speedup in high-quality work without also getting a 1000x speedup in low-quality work. We'd pretty much have to kill any further technological advancement if that's a showstopper for you.
Why is 70% an end-of-life threshold? Considering that most major models are sold in configurations where the entry-level trim starts at under 70% of the "Long Range" model's capacity, clearly 70% is a perfectly fine level of battery for some users.
I myself have an 11-year-old Nissan Leaf with pretty significant battery degradation (the guessometer says 70 mi of range, but I wouldn't count on more than 35-40), and it's fine for probably 95% of my driving.
If I were to buy an electric car with 300-350 miles of range today, I could easily see myself finding a ton of value in it in 20 or even 30 years. It's still more range than my current one! Lol.
Battery degradation is non-linear, and past a certain point of degradation the pack can become unstable. This has led to 80% being the traditionally considered EOL point for a Li-ion pack. However, this is a rule of thumb, and the data is evolving with the technology.
"When the battery degrades to a certain point, for instance, if a battery can only retain 80% of its initial capacity [9, 10, 11], the battery should be retired to ensure the safety and reliability of the battery-powered systems."
SME and IC are functionally different: the SME informs, the IC creates. Often, ICs aren't SMEs in the space they're developing in, because they're SMEs of the technology instead of the business.
It's fine to do that, but it's kind of pointless: everyone is then an "SME" in their own job space, and the term becomes useless. So just replace every mention of SME outside of your company with "business SME" rather than "technology SME" and you'll understand what we're talking about.
Or, if you truly don't need anyone but a "technologist" to deliver product, you must work in a pretty simple business space! I work in healthcare, and our PhDs and MDs have a very, very different knowledge space than I do, and I deeply respect their contributions.
This whole thing reminds me why I never wanna work for someone again. From what I saw at Google, it all just ends up being classist top-down BS about who isn't allowed at the big kids' table, or bottom-up BS insisting they aren't the SME, just the IC, and we can't do anything until the XYZ PM SME TL and/or manager approves.
It is unparsable Dilbert nonsense to anyone outside of specific scenarios. And it causes interminable discontent. Because what if the SME is the PM because they know business and tech but the SME is actually the IC because they know the tech and its tech but what if the manager is actually the SME because they're running the tech and may need to redelegate if the IC needs vacation, blah blah blah.
(job history: college dropout waiter => my own startup, sold => Google for 8 years => my own startup)
I'm sorry you've had a bad experience working with other people, but in my experience as a developer, having multiple SMEs available is indispensable to real alignment and fast development. I've primarily worked in startups, not big companies, and have often worked in healthcare. In healthcare, you get beyond your "I'm a big smart engineer" ego BS and you are willing to listen to the PhDs and MDs who help inform clinical workflows. From my perspective, I would never ask a clinical researcher or a doctor to understand our React app, and they aren't going to ask me to have a deep understanding of medical details and clinical workflows. We work together to deliver high-quality, useful software quickly.
My PM SME validated my workflows and I found Jesus in them then my MBA TL PhD…bla bla bla.
A human being who avoided corporate brainrot just writes “I worked with John and he was indispensable because (insert reasons you wrote here)”
I'm 37 and have never heard of this acronym. That's the entry-level version of my point, not that other people hurt me or that people knowing things is actually bad.
I have a half-dozen language-learning apps on my phone and have vibe-coded a few concepts as well, and while spaced repetition is amazing, it still suffers from the Duolingo "vocabulary is not a language" problem.
IMO the way around users feeling like spaced repetition isn't progression is to redefine progression away from memorizing vocabulary and toward becoming proficient in conversation, both listening and speaking. If spaced-repetition vocab is just one feature of a holistic experience, users will judge their progression holistically.
I'm really waiting for that one app that finally connects ChatGPT Advanced Voice Mode or Gemini Live to a context-aware and well trained language tutor. I can already have impromptu practice sessions with both in Mandarin and English but they quickly lose the plot regarding their role as a tutor. I'd love to have them available as part of a learning journey. I can study vocab and flash cards all day but the second that voice starts speaking sentences and I need to understand in real time, I freeze up. The real progress is conversing!
I’ve pointed this out before on HN, but if you want to use ChatGPT as a language partner, you must provide a topic. Expecting it to behave as a proactive teacher is a recipe for disappointment.
Here’s what I typically do:
- Create a custom GPT (mine is called Polly the Glot) with a system prompt instructing it to act as a language partner that responds only in Chinese or your target language of choice. Further specify that the user will paste a story or topic before beginning practice, and that this should guide the discussion.
- Start a new chat.
- Paste in an article from AP/Reuters.
- Turn on Voice Chat.
At that point, I’ll head out to walk my dog and can usually get about 30 minutes to an hour of solid language practice in.
Fair warning: you'll likely need to be at least an intermediate student by this point, otherwise it'll probably be over your head.
Caveat: you could include a markdown file of your known vocabulary as a knowledge attachment in the custom GPT, but I've no idea how well that would work in practice.
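For the curious, the kind of custom-GPT instructions described above can be sketched in code. The persona name "Polly the Glot" comes from the comment; the exact wording below is my own illustration, not a known-good prompt:

```python
def language_partner_prompt(target_language: str = "Chinese") -> str:
    """Assemble a system prompt for a voice-chat language partner."""
    return (
        f"You are Polly the Glot, a language partner. "
        f"Respond only in {target_language}. "
        "Before practice begins, the user will paste a story or news "
        "article; use it as the topic that guides the whole discussion. "
        "Keep replies short and conversational so the user can respond "
        "by voice."
    )

print(language_partner_prompt())
```

Paste the result into the custom GPT's instructions, then follow the steps above: start a chat, paste an article, and turn on Voice Chat.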
I have played around pretty significantly with the markdown context idea but managing it by hand is pretty tough.
- I take Chinese tutoring lessons on italki with a tutor who uses Notion (easy to copy/paste as markdown)
- I copy/paste our Notion notes, in markdown, into a repo for storage
- I use AI to summarize lessons and to keep general context on progress
- I use AI to generate a voice AI lesson plan, such as 10 words to focus on, reviewing a specific human tutoring session, or some conversational focus area.
- I start the advanced voice AI with the context
Unfortunately the AI still loses the plot pretty quickly and devolves into free form conversation. It struggles significantly to enforce any kind of structure that would be helpful for structured learning. I haven't tried this in a few months though, maybe newer models are improving.
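The "start the voice AI with the context" step above amounts to stitching the stored notes and focus words into one structured briefing. A minimal sketch, assuming the notes are already summarized as strings (the function name and plan wording are illustrative):

```python
def build_lesson_context(recent_lessons: list[str], focus_words: list[str]) -> str:
    """Assemble a structured-lesson briefing for a voice AI session."""
    return (
        "You are a structured language tutor. Stick to this plan:\n"
        f"1. Drill these focus words: {', '.join(focus_words)}\n"
        "2. Review the lesson summaries below before conversing.\n"
        "3. Do not drift into free-form conversation.\n\n"
        "--- RECENT LESSONS ---\n"
        # Keep only the last three lessons so the context stays small.
        + "\n\n".join(recent_lessons[-3:])
    )
```

In my experience this is exactly where things break down: the model accepts the plan, then drifts anyway.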
If you've vibe coded a few language learning apps, maybe we should collab. Check my bio to see what I've done in this space. I'm trying to make the best free language-learning app on the planet
>but the second that voice starts speaking sentences and I need to understand in real time, I freeze up. The real progress is conversing!
What helped me a lot was doing a lot of listening exercises. Start by concentrating on what you can recognize, not on what you can't. Then listen again and again and again, trying to recognize more and more.
That's what I do and it helps and many apps let me do just that. Repeating it, reading the hanzi, reading the pinyin, and it all makes sense.
But there's something about the "conversation" with a real human or an AI voice mode where you're not on rails. It's real time and you have to lock in and understand. That's where the magic happens!
While it is true that model makers are increasingly trying to game benchmarks, it's also true that benchmark-chasing is lowering model quality. GPT-5, 5.1, and 5.2 have been nearly universally panned by almost every class of user, despite being benchmark monsters. In fact, the more OpenAI tries to benchmark-max, the worse their models seem to get.