I do some electrical drafting work for construction and throw basic tasks at LLMs.
I gave it a shitty harness and it almost one-shotted laying out outlets in a room based on a shitty PDF. I think if I gave it better control it could do a huge portion of my coworkers' jobs very soon.
I just can't imagine we are close to letting LLMs do electrical work.
What I notice that I don't see talked about much is how "steerable" the output is.
I think this is a big reason one-shots are used as examples.
Once you get past 1 shots, so much of the output is dependent on the context the previous prompts have created.
Instead of one-shots, try something that requires 3 different prompts on a subject with uncertainty involved. Do 4 or 5 iterations and often you will get wildly different results.
It doesn't seem like we have a word for this. A "hallucination" is when we know what the output should be and it is just wrong. This is like the user steers the model towards an answer but there is a lot of uncertainty in what the right answer even would be.
To me this always comes back to the problem that the models are not grounded in reality.
Letting LLMs do electric work without grounding in reality would be insane. No pun intended.
You'd have to limit context by making subagents call tools, and give each subagent only the tools it needs, with explicit instructions.
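Something like this is what I mean, as a rough sketch (the tool names are made up for illustration, not from any real harness):

    # Sketch only: a "subagent" is a fresh context that gets a whitelisted
    # subset of tools plus explicit instructions, nothing else.
    # All tool names here are invented for illustration.
    TOOL_REGISTRY = {
        "scan_walls": lambda room_id: [],           # would return wall endpoints
        "place_outlet": lambda wall_id, offset_ft: None,
        "get_view_scale": lambda: 48,
        "delete_element": lambda element_id: None,  # deliberately NOT handed out
    }

    def make_subagent(allowed_tools, instructions):
        # the subagent only ever sees this subset, so its context stays small
        tools = {name: TOOL_REGISTRY[name] for name in allowed_tools}
        return {"tools": tools, "system_prompt": instructions}

    outlet_agent = make_subagent(
        allowed_tools=["scan_walls", "place_outlet", "get_view_scale"],
        instructions="Place receptacles along the scanned walls. Ask before guessing.",
    )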
I think they'll never be great at switchgear rooms but apartment outlet circuitry? Why not?
I have a very rigid workflow with what I want as outputs, so if I shape the inputs using an LLM it's promising. You don't need to automate everything; high level choices should be done by a human.
The most promising aspect for machine learning in electrical and electronic systems is the quantity of precise and correct training data we already have, which keeps growing. This is excellent for tasks such as ASIC/FPGA/general chip design, PCB design, electrical systems design, AOI (automated optical inspection), etc.
The main task of existing tools is rule-based checks and flagging errors for attention (like a compiler), because there is simply too much for a human to think about. The rules are based on physics and manufacturing constraints--precise known quantities--leading to output accuracy which can be verified up to 100%. The output is a known-functioning solution and/or simulation (unless the tool is flawed).
Most of these design tools include auto-design (chips)/auto-routing (PCBs) features, but they are notoriously poor due to being too heavily rule-based. Similar to the Photoshop "Content Aware Fill" feature (released 15 years ago!), where the algorithm tries to fill in a selection by guessing values based on the pixels surrounding it. It can work exceptionally well, until it doesn't, due to lacking correct context, at which point the work needs to be done manually (by someone knowledgeable).
"Hallucinogenic" or diffusion-based AI (LLM) algorithms do not readily learn or repeat procedures with high accuracy, but instead look at the problem holistically, much like a human; weights of neural nets almost light up with possible solutions. Any rules are loose, context-based, interconnected, often invisible, and all based on experience.
LLM tools as features on the design side could be very promising, as the existing rule-based algorithms could be integrated into the design-feedback loop to ground the model in reality and refresh its context. Combined with precise rule-based checking and excellent-quality training data, that is a more promising path than in most fields, since the final output can still be rule-checked with existing algorithms.
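A rough sketch of the design-loop feedback I mean, where the two helpers stand in for whatever LLM call and rule checker (DRC/ERC) a given tool already has; nothing here is a real API:

    # LLM proposes, deterministic rules verify; the checker grounds the loop.
    # Both helpers are placeholders for illustration.
    def generate_candidate(context):
        # Placeholder: LLM proposes a netlist/layout given the spec and past violations.
        raise NotImplementedError

    def run_rule_check(candidate):
        # Placeholder: existing DRC/ERC returns a list of violations (empty = clean).
        raise NotImplementedError

    def design_with_feedback(spec, max_iterations=5):
        context = [{"spec": spec}]
        for _ in range(max_iterations):
            candidate = generate_candidate(context)
            violations = run_rule_check(candidate)   # physics/manufacturing rules, precise
            if not violations:
                return candidate                     # rule-clean; hand off to simulation/review
            # feed the exact violations back so the next proposal is grounded
            context.append({"candidate": candidate, "violations": violations})
        raise RuntimeError("no rule-clean design found; escalate to a human")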
In the near future I expect basic designs can be created with minimal knowledge. EEs and electrical designer "experts" will only be needed to design and manufacture the tools, to verify designs, and to implement complex/critical projects.
In a sane world, this knowledge-barrier drop should encourage and grow the entire field, as worldwide costs for new systems and upgrades decreases. It has the potential to boost global standards of living. We shouldn't have to be worrying about losing jobs, nor weighing up extortionately priced tools vs. selling our data.
I've been using pyRevit inside Revit, so I just threw a basic loop in there. There's already a building model and the coworkers are just placing and wiring outlets, switches, etc. The harness wasn't impressive enough to share (it also contains a vibe-coded UI since I didn't want to learn XAML stuff on a Friday night). Nothing fancy; I'm not very skilled (I work in construction).
I gave it some custom methods it could call, including "get_available_families", "place family instance", "scan_geometry" (reads the model's walls into the LLM as wall endpoints), and "get_view_scale".
The task is basically to copy the building engineer's layout onto the architect's model by placing my families. It requires reading the symbol list, and you give it a PDF that contains the room.
Notably, it even used a GFCI family when it noticed it was a bathroom (I had told it to check NEC code, implying outlet spacing).
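The loop itself is roughly this shape, heavily simplified and with the Revit calls stubbed out (this is the pattern, not the actual harness; tool names are normalized to snake_case):

    # Simplified sketch of the tool loop; each "tool" would wrap a pyRevit/Revit
    # API call, but here they're stubbed with fake return values.
    import json

    TOOLS = {
        "get_available_families": lambda: ["Duplex Receptacle", "GFCI Receptacle", "Switch"],
        "get_view_scale": lambda: 48,
        "scan_geometry": lambda: [{"wall_id": 1, "start": [0, 0], "end": [12, 0]}],
        "place_family_instance": lambda family, x, y: "placed %s at (%s, %s)" % (family, x, y),
    }

    def run_agent(llm_step, task_prompt, max_turns=20):
        # llm_step(messages) -> {"tool": name, "args": {...}} or {"done": summary}
        messages = [{"role": "user", "content": task_prompt}]
        for _ in range(max_turns):
            action = llm_step(messages)
            if "done" in action:
                return action["done"]
            result = TOOLS[action["tool"]](**action.get("args", {}))
            messages.append({"role": "tool", "content": json.dumps(result, default=str)})
        return "hit turn limit"

In this sketch the PDF and symbol list would just ride along with the task prompt, and the model picks families from get_available_families and places them against the scanned wall endpoints.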
I'm going to try to get it to generate extrusions in Revit based on images of floor plans. I've tried doing this in a bunch of models without success so far.
You might want to give it some guidance based on edge centers? It'll have a hard time reasoning about wall thickness, so have it draw points if you're trying to copy floor plans.
For clarity, now that I'm rereading: it understands vectors a lot better than areas. Encoding the geometry that way seems to work better for me.
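E.g. instead of handing it a room polygon or a filled region, something like this per room seems to land better (numbers and field names are made up):

    # "Vectors, not areas": wall centerlines as endpoint pairs, thickness stated
    # explicitly so the model never has to infer it from an area. Example data only.
    room_as_vectors = {
        "units": "feet",                  # coordinate units
        "wall_thickness_inches": 5.5,
        "walls": [
            {"id": "W1", "start": [0.0, 0.0], "end": [14.0, 0.0]},    # south wall
            {"id": "W2", "start": [14.0, 0.0], "end": [14.0, 10.0]},  # east wall
            {"id": "W3", "start": [14.0, 10.0], "end": [0.0, 10.0]},  # north wall
            {"id": "W4", "start": [0.0, 10.0], "end": [0.0, 0.0]},    # west wall
        ],
    }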
I would really love a magic wand to make things like AVEVA and AutoCAD not so painful to use. You know who should be using tools to make these tools less awful? AVEVA and AutoCAD. Engineers shouldn't be having to take on risk by deferring some level of trust to third party accelerators with poor track records.
I think that, much like LLMs are specifically trained to be good at coding and good at being agents, we’re going to need better benchmarks for CAD and spatial reasoning so the AI labs can grind on them.
A good start would be getting image generators to understand instructions like “move the table three feet to the left.”
You twisted one "goalpost" into a tangential thing in your first "example", and it still wasn't true, so idk what you're going for. "Using a wrench vs preliminary layout draft" is even worse.
If one attempted to make a productive observation of the past few years of AI Discourse, it might be that "AI" capabilities are shaped in a very odd way that does not cleanly overlap/occupy the conceptual spaces we normally think of as demonstrations of "human intelligence". Like taking a 2-dimensional cross-section of the overlap of two twisty pool tubes and trying to prove a Point with it. Yet people continue to do so, because such myopic snapshots are a goldmine of contradictory venn diagrams, and if Discourse in general for the past decade has proven anything, it's that nuance is for losers.
The problem is how we use it. A human sees not a photo but a video, and has long context before and after, not just that instant; we can also change position. An LLM can't do that at all.
> Remember when the Turing test was a thing? No one seems to remember it was considered serious in 2020
To be clear, it's only ever been a pop science belief that the Turing test was proposed as a literal benchmark. E.g. Chomsky in 1995 wrote:
> The question “Can machines think?” is not a question of fact but one of language, and Turing himself observed that the question is 'too meaningless to deserve discussion'.
The Turing test is a literal benchmark. Its purpose was to replace an ill-posed question (what does it mean to ask if a machine could "think", when we don't know ourselves what this means- and given that the subjective experience of the machine is unknowable in any case) with a question about the product of this process we call "thinking". That is, if a machine can satisfactorily imitate the output of a human brain, then what it does is at least equivalent to thinking.
"I believe that in about fifty years'
time it will be possible, to programme computers, with a storage capacity of about 10^9, to
make them play the imitation game so well that an average interrogator will not have
more than 70 per cent chance of making the right identification after five minutes of
questioning. The original question, "Can machines think?" I believe to be too
meaningless to deserve discussion. Nevertheless I believe that at the end of the century
the use of words and general educated opinion will have altered so much that one will be
able to speak of machines thinking without expecting to be contradicted."
Turing seems to be saying several things. He writes:
> If the meaning of the words "machine" and "think" are to be found by examining how they are commonly used it is difficult to escape the conclusion that the meaning and the answer to the question, "Can machines think?" is to be sought in a statistical survey such as a Gallup poll. But this is absurd.
This anticipates the very modern social media discussion where someone has nothing substantive to say on the topic but delights in showing off their preferred definition of a word.
For example someone shows up in a discussion of LLMs to say:
"Humans and machines both use tokens".
This would be true as long as you choose a sufficiently broad definition of "token" but tells us nothing substantive about either Humans or LLMs.
The Turing test is still a thing. No LLM could pass for a person for more than a couple minutes of chatting. That’s a world of difference compared to a decade ago, but I would emphatically not call that “passing the Turing test”.
Also, none of the other things you mentioned have actually happened. Don’t really know why I bother responding to this stuff
Ironically, the main tell of LLMs is that they are too smart and write too well. No human can discuss topics at the depth they can, and no human writes like an author/journalist all the time.
i.e. the tell that it's not human is that it is too perfectly human.
However if we could transport people from 2012 to today to run the test on them, none would guess the LLM output was from a computer.
That’s not the Turing Test; it’s just vaguely related. The Turing Test is an interactive party game of persuasion and deception, sort of like playing a werewolves versus villagers game. Almost nobody actually plays the game.
Also, the skill of the human opponents matters. There’s a difference between testing a chess bot against randomly selected college undergrads versus chess grandmasters.
Just like jailbreaks are not hard to find, figuring out exploits to get LLMs to reveal themselves probably wouldn’t be that hard? But to even play the game at all, someone would need to train LLMs that don’t immediately admit that they’re bots.
Yesterday I stumbled onto a well-written comment on Reddit; it was a bit contrarian, but good. Then I was curious, looked at their comment history, and found it was a one-month-old account with many comments of similar length and structure. I put an LLM on reading that feed and it spotted the LLM writing. The argument? It was displaying too broad a knowledge across topics. Yes, it gave itself up by being too smart. Does that count as a Turing test fail?
> No LLM could pass for a person for more than a couple minutes of chatting
I strongly doubt this. If you gave it an appropriate system prompt with instructions and examples on how to speak in a certain way (something different from typical slop, like the way a teenager chats on discord or something), I'm quite sure it could fool the majority of people
I still haven't witnessed a serious attempt at passing the Turing test. Are we just assuming it's been beaten, or have people tried?
Like if you put someone in an online chat and ask them to identify if the person they're talking to is a bot or not, you're telling me your average joe honestly can't tell?
A blog post or a random HN comment, sure, it can be hard to tell, but if you allow some back and forth... I think we can still sniff out the AIs.
A couple of months ago I saw a paper (can't remember if published or just on arxiv) in which Turing's original 3-player Imitation Game was played with a human interrogator trying to discern which of a human responder and an LLM was the human. When the LLM was a recent ChatGPT version, the human interrogator guessed it to be the human over 70% of the time; when the LLM was weaker (I think Llama 2), the human interrogator guessed it to be the human something like 54% of the time.
I'm double replying to you since the replies are in disparate subthreads. This is the necessary step so the robots that can turn wrenches know how to turn them. Those are near useless without perfect automated models.
Anything like this will have trouble getting adopted, since you'd need these to work with imperfect humans, which becomes way harder. You could bankroll a whole team of subcontractors (e.g. all trades) using that, but you would have one big liability.
The upper end of the complexity is similar to EDA in difficulty, imo. Complete with "use other layers for routing" problems.
I feel safer here than in programming. The senior guys won't be automated out any time soon, but I worry for Indian drafting firms without trade knowledge; the handholding I give them might go to an LLM soon.
To all of these I can only say: in the hands of a domain-expert user, AI tools really shine.
For example, artists can create incredible art, and so can AI artists. But me, I just can't do it. Whatever art I have generated will never have the creative spark. It will always be slop.
The goalposts haven't moved at all. However, the narrative would rather not deal with that.