In every discussion of AI eliminating or dramatically reducing the compensation for <some large double digit percentage> of “white collar” jobs (and probably “blue collar” too), it’s unclear to me what the end state is - the vast majority of the economy works on volume. You need large numbers of people with enough money to buy your product/service. As wealth concentrates there are fewer potential buyers, and economies of scale start working against producers. (And governments need people with money to tax…)
The economy becomes a palace economy, where the money fountains are owned by a few and the loot slowly flows through rings of gatekeepers, while the outer rings are plagued by desperation and poverty despite living in the shadow of abundance. These are common around the world and throughout history. It's the Star Wars fate.
I’ve been looking for tooling that would evaluate my prompts and give feedback on how to improve them. I can get somewhere with custom system prompts (“before responding ensure…”) but it seems like someone is probably already working on this? Ideally it would run outside the actual thread to keep the context clean. There are some options popping up on Google, but I’m curious if anyone has a first-hand anecdote to share?
1. A prompt ‘enhancement’ pass that assembles the gathered context into a rewritten, complete prompt (the code below).
2. A full context-deficiency analysis and multi-question interview system to bounds-check and restructure your prompt into your ‘goal’.
3. Realizing that what looks like a good human prompt is not the same as what functions as a good ‘next token’ prompt.
If you just want #1:
import dspy

class EnhancePrompt(dspy.Signature):
    """Assemble the final enhanced prompt from all gathered context."""
    essential_context: str = dspy.InputField(desc="All essential context and requirements")
    original_request: str = dspy.InputField(desc="The user's original request")
    enhanced: str = dspy.OutputField(desc="Complete, detailed, unambiguous prompt. Omit politeness markers. You must limit all numbered lists to a maximum of 3 items.")
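And if you want to actually run that signature, a minimal sketch (the model name and field values are placeholders I made up; assumes a recent DSPy where dspy.LM and dspy.Predict work as shown):

lm = dspy.LM("openai/gpt-4o-mini")  # placeholder model; use whatever LM you have configured
dspy.configure(lm=lm)

enhance = dspy.Predict(EnhancePrompt)
result = enhance(
    essential_context="Audience: senior engineers. Output: a markdown runbook.",
    original_request="write something about retry logic",
)
print(result.enhanced)  # the rewritten, detailed prompt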
One challenge with hiring a nanny is that if they need to take a sick day (or if they quit!) you can end up in a tough spot. In contrast, a day care center usually has backups built in, so you don’t end up scrambling.
The food pyramid was taught when I was in school, but that was before 2011 (as mentioned by another commenter). My own children are in school now, and their school lunches align with more modern ideas (veggies and proteins). They could certainly still be improved, but I recognize that cost, scale, and delivery constraints, plus allergy considerations, make this non-trivial.
This seems like it should be a killer feature: Copilot having access to configuration and logs and being able to identify where a failure is coming from. This stuff is tedious manually, since I basically run through a checklist of where the failure could occur, and there’s no great way to automate that; plus sometimes there are subtle typo-type issues. Copilot can generate the checklist reasonably well but can’t execute on it, even from Copilot within Azure. Why not??
I have had the experience of approaching or completing something potentially dangerous (merging onto a busy street, for example) and thinking I should “save”, subconsciously visualizing doing so internally. It’s a very fleeting sensation and doesn’t happen consistently at all, but it’s interesting when I notice it.
I don't think this is unique to churches, or even non-profits. Plenty of non-church non-profits rely on a few large donors for much of their funding (in fact, plenty are designed that way out of the gate - they're founded by one very wealthy individual to work on the projects they care about), and plenty of for-profit businesses rely on a few large-dollar clients for much of their revenue. Both could be seen as extensions of the same economic system that concentrates wealth at the top, for individuals and businesses alike.
We don't really have a single test that means "if we pass this test, we have AGI", but we have a variety of tests (like ARC) that we believe any true AGI would be able to pass. It's a "necessary but not sufficient" situation. This also ties directly to the challenge of defining what AGI really means. You see a lot of discussion of "moving the goal posts" around AGI, but as I see it we've never had goal posts; we've just got a bunch of lines we'd expect to cross before reaching them.
I don't think we actually even have a good definition of "This is what AGI is, and here are the stationary goal posts that, when these thresholds are met, then we will have AGI".
If you judged human intelligence by our AI standards, then would humans even pass as Natural General Intelligence? Human intelligence tests are constantly changing, being invalidated, and rerolled as well.
I maintain that today's modern LLMs would pass sufficiently for AGI, and would also come very close to passing a Turing Test, if measured in 1950 when the test was proposed.
>I don't think we actually even have a good definition of "This is what AGI is, and here are the stationary goal posts that, when these thresholds are met, then we will have AGI".
Not only do we not have that, I don't think it's possible to have it.
Philosophers have known about this problem for centuries. Wittgenstein recognized that most concepts don't have precise definitions but instead behave more like family resemblances. When we look at a family we recognize that they share physical characteristics, even if there's no single characteristic shared by all of them. They don't need to unanimously share hair color, skin complexion, mannerisms, etc. in order to have a family resemblance.
Outside of a few well-defined things in logic and mathematics, concepts operate in the same way. Intelligence isn't a well-defined concept, but that doesn't mean we can't talk about different types of human intelligence, non-human animal intelligence, or machine intelligence in terms of family resemblances.
Benchmarks are useful tools for assessing relative progress on well-defined tasks. But the decision of what counts as AGI will always come down to fuzzy comparisons and qualitative judgments.
The current definition and goal of AGI is “Artificial intelligence good enough to replace every employee for cheaper” and much of the difficulty people have in defining it is cognitive dissonance about the goal.
I’d remove the “for cheaper” part? (And also, only necessary for the employees whose jobs are “cognitive tasks”, not ones that are based on their bodies. So like, doesn’t need to be able to lift boxes or have a nice smile.)
If something would be better at every cognitive task than every human if it ran a trillion times faster, I would consider that to be AGI, even if it isn’t that useful at its actual speed.
Because an important part of being a Natural General Intelligence is having a body and interacting with the world. Data from Star Trek is a good example of an AGI.
The Turing test is not really that meaningful anymore, because you can always detect the AI by text and timing patterns rather than actual intelligence. In fact, the most reliable way to test for AI is probably to ask trivia questions on various niche topics; I don't think any human has as much breadth of general knowledge as current AIs.
> you can always detect the AI by text and timing patterns
I see no reason why an AI couldn't be trained on human data to fake all of that.
If no one has bothered so far, that's because pretty much all commercial applications of this would be illegal, or would at least lead to major reputational damage when exposed.
One of the very first slides of François’ presentation is about defining AGI. Do you have anything that opposes his synthesis of the two (50-year-old) takes on this definition?
I graduated with a degree in software engineering and I am bilingual (Bulgarian and English). Currently, AI is better than me at everything except adding big numbers or writing code on really niche topics - for example, code-golfing a Brainfuck interpreter or writing a Rubik's cube solver.
I believe AGI has been here for at least a year now.
I suggest you try letting the AI think through race-condition scenarios in asynchronous programs; it is not that good at these abstract reasoning tasks.
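To make that concrete, here's a toy sketch of the kind of scenario I mean (my own illustrative example, not from any real codebase): 100 asyncio tasks each do a read-modify-write on a shared counter with an await in between, so the increments get lost. Asking a model what this prints, and why, is a decent probe of that abstract reasoning:

import asyncio

counter = 0

async def increment():
    global counter
    current = counter        # read
    await asyncio.sleep(0)   # yield to the event loop; other tasks run here
    counter = current + 1    # write back a possibly stale value

async def main():
    await asyncio.gather(*(increment() for _ in range(100)))
    print(counter)  # prints 1, not 100: every task reads 0 before any writes back

asyncio.run(main())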
Can the AI wash your dishes, fold your laundry, take out your trash, meet a friend for dinner, or do any of the other thousand things you might do in an average day when you're not interacting with text on a screen?
You know, stuff that humans have done way before there were computers and screens.
Yeah, I'm convinced that the biggest difference between the current generation of AIs and humans is that AIs don't have the range of tool use and interaction with the physical environment that humans do. And that's what's actually holding AGI back, not access to more data.
Related - there need to be individuals and businesses that want/need and can afford upgrades and repairs. If office workers get replaced with AI, we don't need to build and maintain offices and the ecosystems that support them (see also WFH/Covid), and those workers won't have the income to pay for plumbers, electricians, roofers, etc. for their personal property. A worst-case AI workforce revolution would attack the trades from both the supply and demand sides.
It's worth noting that for those edge cases all the productivity monitoring in the world won't make that employee any more effective, and you won't need those tools to see that they're not cutting it (assuming you're engaged with your team as the other commenter describes). You'll likely lose more by annoying the rest of your team and burning your own cycles on surveillance than you'll gain from it.
> It's worth noting that for those edge cases all the productivity monitoring in the world won't make that employee any more effective, and you won't need those tools to see that they're not cutting it (assuming you're engaged with your team as the other commenter describes).
The main purpose of tracking the “edge cases” is basically insurance in the event of a lawsuit.
Yes, it irritates the folks with good intentions, but a good manager will keep the tracking tax as light as possible for the folks who are actually working.
It saves quite a bit of headache when the lawsuit, or the threat of one, comes around.