Pretty similar except maybe you’ll get lots more nulls judging by the other comments! Cheaper but nulls. Will need to work on the recall a bit. But also potentially based on use case feedback maybe look at other niches and features
I'm always amazed at these relatively tiny projects that "launch" with a "customers" list that reads like they've spent 10 years doing hard outbound enterprise sales: Google, Intel, Apple, Amazon, Deloitte, IBM, Ford, Meta, Uber, Tencent, etc.
When it comes to the evals for this kind of thing, is there a standard set of test data out there that one can work with to benchmark against? I.e., a collection of documents with questions that should result in particular documents or chunks being cited as the most relevant match.
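To make the shape of such a benchmark concrete, here is a minimal sketch of scoring a retriever against labeled question-to-chunk pairs. The chunk IDs, the sample questions, and the toy keyword retriever are all made up for illustration; a real setup would plug in its own retriever and labeled data.

```python
from typing import Callable, Dict, List


def hit_rate_at_k(
    labeled_queries: Dict[str, str],
    retrieve: Callable[[str], List[str]],
    k: int = 3,
) -> float:
    """Fraction of questions whose expected chunk ID shows up in the top-k results."""
    if not labeled_queries:
        return 0.0
    hits = 0
    for question, expected_id in labeled_queries.items():
        if expected_id in retrieve(question)[:k]:
            hits += 1
    return hits / len(labeled_queries)


# Hypothetical documentation chunks, keyed by chunk ID.
CHUNKS = {
    "install": "run the installer and restart the service",
    "auth": "create an api key and pass it in the authorization header",
    "limits": "requests are rate limited per api key",
}


def keyword_retrieve(question: str) -> List[str]:
    """Toy retriever: rank chunk IDs by how many words they share with the question."""
    words = set(question.lower().split())
    return sorted(
        CHUNKS,
        key=lambda chunk_id: len(words & set(CHUNKS[chunk_id].split())),
        reverse=True,
    )


# Hypothetical labeled set: question -> chunk ID that should be retrieved.
LABELED = {
    "how do i pass the authorization header": "auth",
    "are requests rate limited": "limits",
    "how do i restart the service": "install",
}

score = hit_rate_at_k(LABELED, keyword_retrieve, k=1)
```

Swapping `keyword_retrieve` for a real retriever and rerunning the same labeled set is what lets you compare chunking or parsing changes against each other.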
Docs bots like these are deceptively hard to get right in production. Retrieval is super sensitive to how you chunk/parse documentation and how you end up structuring documentation in the first place (see frontpage post from a few weeks ago: https://news.ycombinator.com/item?id=44311217).
You want grounded RAG systems like Shopify's here to rely strongly on the underlying documents, but also still sprinkle in a bit of the magic of the latent LLM knowledge too. The only way to get that balance right is evals. Lots of them. It gets even harder when you are dealing with GraphQL schemas like Shopify has, since most models struggle with that syntax more so than with REST APIs.
FYI I'm biased: Founder of kapa.ai here (we build docs AI assistants for 200+ companies incl. Sentry, Grafana, Docker, the largest Apache projects, etc.).
Does this mean when they grow up, their own offspring will also have this defect and require a correction? And, if so, does this mean it is now introducing this defective gene into our gene pool?
I know this is an issue with caesarean section. It is becoming more prevalent because those who require it are surviving, making it more likely to happen in their offspring.
Cohere provides an API to access and finetune large language models (generative models like GPT and text representation/embedding models like BERT). These types of language models empower the majority of the latest developments in natural language understanding and generation.
Your feedback is well taken, we'll work to make these more reachable from the homepage.
As others have already said, think about what you're doing when you use this.
If you connect a non-self-hosted LLM to this, you're effectively uploading chat messages with other people to a third-party server. The people you chat with have an expectation of privacy, so this would probably be illegal in many jurisdictions.
I dove into using LLMs together with MCP servers for the first time this weekend. Absolutely incredible.
In addition to the code assistant, I configured Grafana's MCP server with Cline, so that I can chat with an LLM while having real-time metrics and logs.
For context, I self-host Grafana in addition to a bunch of services on a Raspberry Pi. Simple prompts such as "why has CPU been increasing this week?" resulted in a deep analysis of logs/metrics that uncovered correlations I had never been aware of.
Incredible. I can only imagine what this will all look like in a few years
My pursuit of happiness. I'm afraid of quitting my current job and going on a working holiday to Australia. I'm excited while still trying to overcome the fear of not having a stable, well-paying job, because I don't find any joy in this job anymore. So I am working on mentally getting out of this; I want to truly let go of the "money is more important than my happiness" idea.
Busy building some personal infrastructure around information. Think a mix of scraping and filtering, scoring and summarization.
Clearly all of the info space is getting ever more polluted and I just don’t trust anyone else (with their own agenda) to manage and filter that for me. If one is to abdicate that sort of responsibility to a system wholesale then I think it has to be fully under one’s control: own data, own design, self-hosted, etc.
I am working on a versatile and powerful platform for ingesting, transforming, and searching through structured, unstructured, and semi-structured data. It allows for interactive searches, dashboards, alerts, and more.
We are in open Alpha at the moment, but plan on offering affordable plans while keeping the source code available. While in open Alpha and during the upcoming Beta, it is free to use for any purpose.
We will be selling it under a Fair-Source license, meaning that the source code will be released under MIT 2 years after release.
I'm working on pure.md[1], which lets your scripts, APIs, apps, agents, etc reliably access web content in markdown format. Simply prefix any URL with `pure.md/` and you get the unblocked markdown content of that webpage. It avoids bot detection and renders JavaScript-heavy websites, and can convert HTML, PDFs, images, and more into pure markdown.
pure.md acts as a global caching layer between LLMs and web content. I like to think of it like a CDN for LLMs, similar to how Cloudinary is a CDN for images.
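The prefixing described above could be wrapped in a small helper like this. It is only a sketch: the HTTPS scheme, appending the target URL unchanged, and the UTF-8 decoding are my assumptions rather than documented behavior, and both function names are made up.

```python
import urllib.request


def pure_md_url(target_url: str) -> str:
    """Build the prefixed URL by putting pure.md/ in front of the target.

    Assumption: the service is reached over HTTPS and accepts the target
    URL appended as-is.
    """
    target = target_url.strip()
    if not target:
        raise ValueError("target_url must not be empty")
    return "https://pure.md/" + target


def fetch_markdown(target_url: str, timeout: float = 30.0) -> str:
    """Fetch the markdown rendering of a page (performs a network request).

    Assumption: the response body is UTF-8 text.
    """
    with urllib.request.urlopen(pure_md_url(target_url), timeout=timeout) as response:
        return response.read().decode("utf-8")


prefixed = pure_md_url("https://example.com/docs")
```

Only `pure_md_url` runs here; `fetch_markdown` is left uncalled so the snippet does not hit the network on import.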
I become more and more convinced with each of these tweets/blogs/threads that using LLMs well is a skill set akin to using Search well.
It’s been a common mantra - at least in my bubble of technologists - that a good majority of the software engineering skill set is knowing how to search well. Knowing when search is the right tool, how to format a query, how to peruse the results and find the useful ones, what results indicate a bad query you should adjust… these all sort of become second nature the longer you’ve been using Search, but I also have noticed them as an obvious difference between people that are tech-adept vs not.
LLMs seem to have a very similar usability pattern. They’re not always the right tool, and are crippled by bad prompting. Even with good prompting, you need to know how to notice good results vs bad, how to cherry-pick and refine the useful bits, and have a sense for when to start over with a fresh prompt. And none of this is really _hard_ - just like Search, none of us need to go take a course on prompting - IMO folks just need to engage with LLMs as a non-perfect tool they are learning how to wield.
The fact that we have to learn a tool doesn’t make it a bad one. The fact that a tool doesn’t always get it 100% on the first try doesn’t make it useless. I strip a lot of screws with my screwdriver, but I don’t blame the screwdriver.
Anyone who wants to demystify ML should read: The StatQuest Illustrated Guide to Machine Learning [0] by Josh Starmer.
To this day I haven't found a teacher who could express complex ideas as clearly and concisely as Starmer does. It's written in an almost children's-book-like format that is very easy to read and understand. He also just published a book on NN that is just as good. Highly recommend even if you are already an expert, as it will give you great ways to teach and communicate complex ideas in ML.
It delves into theoretical underpinnings of probability theory and ML, IMO better than any other course I have seen. (Yeah, Andrew Ng is legendary, but his course demands some mathematical familiarity with linear algebra topics.)
"A Brazilian judge tells Elon that he has to block certain users of X. Elon says no. The judge says that he will then put X's legal representative in Brazil in jail. Elon closes the offices in Brazil. The judge says that he has to have a legal representative in Brazil, that is what the law says. Elon says "if I name another representative you will put him in jail". Then the judge orders X to be blocked in Brazil. And he threatens to fine those who try to use X in Brazil through VPNs. In other words, users who easily use X in Brazil to see memes become potential criminals when they did nothing illegal.
It's crazy. It's an abuse of authority. Because let's suppose that Carlinho Da Souza calls for burning all the kids alive, the one who commits a crime (let's suppose) is Carlinho, and Justice should be focused on him, not on the company that provides its platform without knowing beforehand that Carlinho is an idiot, and even knowing it later from his posts. And you shouldn't demand that the company prevent Carlinho from exposing his stupidity, that would be like telling the cell phone company not to let me talk on the phone because I threatened to break someone's face. And then, since the company says no, it won't prevent me from talking on the phone, then it blocks the cell phone signal throughout the country, for everyone.
Freedom of expression is being able to say what you want and take responsibility for the consequences. But the consequences are for the alleged offender, not for people who have nothing to do with it.
To make another cheap analogy, if someone stabs a neighbor, you can't ban knives and force butchers to cut meat with their teeth."
Learning that some folks can produce so much value with crappy code.
I've seen entire teams burn so much money by overcomplicating projects. Bikeshedding about how to implement DDD, Hexagonal Architecture, design patterns, complex queues that would maybe one day be required if the company scaled 1000x, unnecessary eventual consistency that required so much machinery and man hours to keep data integrity under control. Some of these projects were so far behind their deadlines that they had to be cancelled.
And then I've seen one-man projects copy-pasting spaghetti code around like there's no tomorrow that had a working system within 1/10th of the budget.
Now I admire those who can just produce value without worrying too much about what's under the hood. Very important mindset for most startups. And a very humbling realization.
I'm incredibly lucky in that I live 5 minutes away from the office. I work 7.5 hours a day, Monday to Friday. Every day I got a ton of time to dedicate to things I like... and yet, many times I feel drained from crunching code every day... It's been like 8 months since I worked on a personal project... lately all I do is play Breath of the Wild... But I feel pretty good about it, I think I relate to this fuck-being-productive culture more and more every day... or maybe I'm just depressed... don't know.
Honestly, I do feel somewhat similar. I work a normal 8-hour day and I am not obsessed with productivity, nor am I some kind of anti-work activist, but it just feels like such a waste of time. The only reason I go to work is for money. I don't care about the products we build for someone else (why should I?), nor the technologies used (each of which brings its own challenges and frustrations). If I didn't need to go to work, I could dedicate more time to reading, writing, learning new skills, working on my own side projects, getting enough sleep, exercising, cooking, etc. Work just sucks the very soul out of me, and at the end of the day, I don't really want to do anything. Only on weekends and holidays do I feel much more energetic and motivated to do the things I listed previously, which evaporates by Monday.
I've been using ChatGPT pretty consistently during the workday and have found it useful for open ended programming questions, "cleaning up" rough bullet points into a coherent paragraph of text, etc. $20/month useful is questionable though, especially with all the filters. My "in between" solution has been to configure BetterTouchTool (Mac App) with a hotkey for "Transform & Replace Selection with Javascript". This is intended for doing text transforms, but putting an API call instead seems to work fine. I highlight some text, usually just an open ended "prompt" I typed in the IDE, or Notes app, or an email body, hit the hotkey, and ~1s later it adds the answer underneath. This works...surprisingly well. It feels almost native to the OS. And it's cheaper than $20/month, assuming you aren't feeding it massive documents worth of text or expecting paragraphs in response. I've been averaging like 2-10c a day, depending on use.
Here is the JavaScript if anyone wants to do something similar. I don't really know JS, so I'm sure it could be improved. But it seems to work fine. You can even add your own hard-coded prompt if you want.
Contribute to LAION, Eleuther or any of the image/media generation open source notebooks and you'll get an interview pretty quick, hired dozens that way.
I recently also developed an app to help me figure out my IBS and digestion problems. Basically I looked at all the apps in the app store but did not find a simple-to-use app. Entering your meals should really be as simple as it possibly can be, otherwise you will end up not using it at all.
The app works in German and English. I haven't proofread every English entry, however, so feedback is very much welcome. I can offer lifetime premium access in return :)
Okay enough advertising, sorry but the topic is quite interesting and everyone with food problems can just benefit from this.
I remember being a 13yo kid sitting on IRC doing exactly this for fun years ago back when IP addresses were cheap and easy to come by. But spoofing military IPs in the traceroute was more fun.
Ask to join a standup, ideally with the team you'd be on, if they do them. If there are five teams working on unrelated projects all in the same standup and everyone's checked out and they're all clearly only talking to their manager and listing every little thing they did and trouble they overcame yesterday, to justify why they should keep their job... run away.
1. For a Linux user, you can already build such a system yourself quite trivially by streaming from /dev/urandom, mapping to cardinal directions and simulating a random walk on the keyboard.
2. It doesn't actually replace an email autoresponder. Most people I know will put things like their return date in the auto-response.
3. It does not seem very "scalable" or income-generating. I know this is premature at this point, but it seems that down the road this may require a lot of horses.
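For the curious, point 1 can be sketched in a few lines. The key grid, the starting key, and the byte-to-direction mapping are all arbitrary choices of mine, and `os.urandom` stands in for reading /dev/urandom directly.

```python
import os

# Hypothetical key grid: three QWERTY rows padded to the same width.
KEY_ROWS = ["qwertyuiop", "asdfghjkl;", "zxcvbnm,./"]

# Map a random byte (mod 4) to a cardinal step as (row delta, column delta).
MOVES = {
    0: (-1, 0),  # north
    1: (0, 1),   # east
    2: (1, 0),   # south
    3: (0, -1),  # west
}


def random_walk_keys(random_bytes: bytes, start: tuple = (1, 4)) -> str:
    """Walk the key grid one step per byte and return the keys stepped on.

    Steps that would leave the grid are clamped to its edge.
    """
    row, col = start
    typed = []
    for value in random_bytes:
        d_row, d_col = MOVES[value % 4]
        row = min(max(row + d_row, 0), len(KEY_ROWS) - 1)
        col = min(max(col + d_col, 0), len(KEY_ROWS[0]) - 1)
        typed.append(KEY_ROWS[row][col])
    return "".join(typed)


# 16 steps driven by the operating system's random source.
message = random_walk_keys(os.urandom(16))
```

With a fixed input such as `bytes([0, 1, 2, 3])` the walk starts on "g", goes north, east, south, west, and types "tyhg".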
1. For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.
2. It doesn't actually replace a USB drive. Most people I know e-mail files to themselves or host them somewhere online to be able to perform presentations, but they still carry a USB drive in case there are connectivity problems. This does not solve the connectivity issue.
3. It does not seem very "viral" or income-generating. I know this is premature at this point, but without charging users for the service, is it reasonable to expect to make money off of this?
This is my favourite feature of gitignores.
Every time I need "drafts", sample code, etc., in a repo, I create a folder in that repo, but then I have to remember not to add it to commits, and I don't want to add it to the .gitignore that is versioned, so I do "mkdir drafts && echo '*' > ./drafts/.gitignore", and it ignores my drafts without having to add a new ignored dir in the versioned .gitignores.
And obviously, it also "ignores" the .gitignore itself because it matches "*", while still taking it into account, which is what I need.
Despite living in Hong Kong and traveling frequently to Shenzhen at the time, I found out from a high school buddy I grew up with in Ohio, who lived in the USA and had no real connection to China besides this one business trip.
It appears someone who couldn't read English at all had used Baidu Translate to translate the menu. It turned out Baidu Translate was translating "扒饭" - assorted grilled meats - to my Twitter (HN, etc) username for YEARS.