Hacker News | nomand's comments

a bird's eye view :)


As a "sailor in tech", "web standards" do not belong anywhere near an ocean going vessel.


"web standards" have extremely optimized, ubiquitous implementations with extensive tooling. I strongly disagree with you. CANbus on boats isn't exactly bulletproof; people mess up their N2K networks all the time by just staring at them wrong.


I'm pretty skeptical of proprietary protocols too. And a lot of pro marine stuff is starting to go wireless, which is scary to think about.


It's all Bluetooth low energy afaict, they're not exactly in the field of inventing new RF comms.


BLE? I'd stay the hell away from that. It's bad enough dealing with it in a consumer context, but in something safety critical?!


All these gizmos are not safety critical. Anemometer? Wind on your face. Depth? Plenty of people go around with broken ones. SOW paddle? They're always clogged. Wind angle? Wind on your face, check the telltales, check the sails, check the wind index. The BLE stuff in particular is nice-to-have and a pain to physically wire (masthead instruments).

The only "safety critical" thing in this tech is the chart plotter (and GPS), and weather reports. Marine chartplotters are glorified consumer grade computers, and this OpenPlotter thing with OpenCPN is 1000% more reliable and not actively trying to kill you with proprietary licensing crap. Your phone is a great chartplotter and plenty of people cross oceans using nothing more than an iPad.


I wouldn't use HTTP/json for any path that needs to close a control loop at hundreds of Hz, but to loosely couple a bunch of systems in a plant? No problem.

If you have a specific concern, explain it, instead of just exuding judgment.


The parent comment is likely evoking a bit of a https://xkcd.com/2030/ response, aka humour.

SignalK is neat. As you note, not all protocols are suitable for all use cases, but in marine sensor networks, where most signals are in the realm of 1-10 Hz, it definitely has its place.
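
For anyone who hasn't seen it: SignalK's wire format is a JSON "delta", a list of path/value updates. A minimal sketch in Python; the path vocabulary follows SignalK conventions, but treat the exact fields here as illustrative rather than normative:

```python
import json

# A minimal SignalK-style "delta" update: path/value pairs describing
# sensor readings, in SI units. Shape follows the SignalK delta format;
# field details are illustrative.
delta = {
    "context": "vessels.self",
    "updates": [
        {
            "timestamp": "2023-07-20T12:00:00Z",
            "values": [
                {"path": "environment.wind.speedApparent", "value": 7.2},  # m/s
                {"path": "navigation.speedOverGround", "value": 3.1},      # m/s
            ],
        }
    ],
}

# Round-trips through plain JSON, so any language with a JSON parser can consume it.
encoded = json.dumps(delta)
decoded = json.loads(encoded)
```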


Anything that needs to run a server to communicate with anything else, especially if wireless is involved (which is often the case with more modern forms of radar, GPS, etc. because of ease of wiring and installation), is a recipe for disaster if you're relying on interfacing with existing instruments for navigation.

Simply firing up cli for troubleshooting and debugging of any kind should not be "expected behavior" at sea.


> If anything needs to run a server to communicate with anything else,

What's a server? Do you have servers in a NMEA2k environment?

> Simply firing up cli for troubleshooting and debugging of any kind should not be "expected behavior" at sea.

No one said that arcane troubleshooting and debugging should be needed after integration of a system. I mean, NMEA2K certainly never makes one fiddle around and play with things aimlessly to make stuff work /s

> especially if wireless is involved

Most common way wireless is involved here is to get the information off-boat or to a tablet for convenient review of trends.


This kind of attitude is toxic. If you have complaints about specific protocols, formats, etc, please share them.


Your labeling is only a reflection of the shape of the lens you have on the world. I'm not interested in feeding that.


It’s your labeling that put “web standards” into one bucket. I assure you many of those web standards are far more secure and well tested for not only failure cases but also adversaries than what exists in marine systems.


You're right. #notallwebstandards


You know... I've been working at an interesting combined AgTech and Aviation company (drones, but big ones) for a while now, and using JSON over Websockets for IPC is one of the best decisions we've made. We don't use it for everything, mind you, there's lower-level protocols that we use to talk to embedded hardware devices, but when we can we do. And while it's a draft standard, we basically riffed on a variant of this for most of it: https://datatracker.ietf.org/doc/html/draft-inadarei-api-hea...

The reason I love it so much is that it's just so straightforward to make a server or client that can talk to it. All of our embedded Linux systems are written in C++ right now and they have absolutely no problem publishing and consuming messages in our standard format. One of the original driving factors for this is that we do have some web-based and Electron-based UIs, and any protocol that we made that wasn't HTTP-based or Websocket-based would require them to do twice as much work: first, connecting to whatever service from a "backend" server and implementing whatever protocol it needed, and second, exposing that backend service to the frontend over a Websocket (generally... since it needed live updates). By standardizing on our in-flight services just exposing everything as Websockets natively we pretty much eliminated a whole tier of complicated logic. The frontends have a single generic piece of code that has standardized reconnect/timeout/etc logic in it, and the backends just have to #include <WSServer.h> and instantiate an object to be able to publish to listeners.
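
A sketch of what such a message envelope might look like, riffing on the health-check draft linked above. The helper name and everything beyond the status/checks shape are my assumptions, not the poster's actual schema; the point is just that the payload is plain JSON any WebSocket peer can parse:

```python
import json
import time

# Hypothetical status message in the spirit of the draft-inadarei
# health-check format: a top-level "status" plus per-component "checks".
def make_status(component: str, observed: float, unit: str, ok: bool) -> str:
    msg = {
        "status": "pass" if ok else "fail",
        "checks": {
            component: [{
                "observedValue": observed,
                "observedUnit": unit,
                "status": "pass" if ok else "fail",
                "time": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            }]
        },
    }
    return json.dumps(msg)

# A browser frontend, an Electron UI, or an embedded C++ node can all
# consume this identically over a WebSocket.
payload = make_status("gps:fix", 11.0, "satellites", ok=True)
```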

I definitely didn't start there. And I 100% understand where your opinion comes from... from so many different angles a lot of the "modern" web systems shouldn't come within a mile of a safety critical system. Websockets though? They're great! And while JSON isn't necessarily the most efficient encoding, it sure does make debugging easy. We run everything on a closed network that usually doesn't have an Internet connection, so we don't run TLS in between the ground and air systems. If we need to figure out what's going on and an interface is acting up, we can just tcpdump it and have human-readable traffic to inspect.

The flight critical stuff is isolated from all of this and spits out a serial telemetry feed (Mavlink). We do send that directly to the ground station over a dedicated radio, but we also have an airborne service that cooks that into Websockets and in many cases the Websocket-over-very-special-WiFi connection has been more robust than the 915MHz serial link.

And it's not as if existing protocols like NMEA are all that good either.


Thanks for sharing that! Very interesting. There is even less margin for error in the air compared to at sea. At least at sea you can still float if the power goes out, and past that point a sewing machine is all you need for most critical problems.


We've actually leaned in pretty hard on using "standard" protocols as much as we can:

- We have a flight planning module that takes multiple polygons as input and returns a (large) list of waypoints for covering the regions that the polygons cover. When I was trying to work out the request/response format I decided to use GeoJSON with some extra properties added. You submit the GeoJSON boundaries with a POST request, the planner does a bunch of computational geometry and graph algorithms, and returns a GeoJSON. If you want to, you can just load the flight plan up in QGIS or ArcGIS or whatever and inspect it directly.

- We also accumulate quite a bit of geospatial data that we need to post-process. We used SQLite with the Spatialite extension to store that. Same story as the flight plans... you can really easily load it into QGIS or Geopandas or whatever you want and do your analysis.

- We need to stream video down to the ground station and ended up using RTSP, h.264, and GStreamer to do that. You can connect to the video feed using our ground station software if you want, but you can also just connect to it using VLC. And internally this meant that if we wanted to do hardware-accelerated encoding it was just a matter of changing the GStreamer pipeline. Or... if I get my way over the next month or so, we'll be adding a HUD with extra telemetry right into the video feed, again using GStreamer plugins.
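
As a sketch of the first point: a request body in that spirit is just standard GeoJSON (RFC 7946) with extra properties. The property names below are invented for illustration; only the GeoJSON structure itself is standard:

```python
import json

# Hypothetical flight-plan request: a plain GeoJSON FeatureCollection
# carrying a boundary polygon, with planner parameters tucked into
# "properties" (names invented for illustration).
request_body = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {
            "type": "Polygon",
            "coordinates": [[  # one exterior ring of [lon, lat] pairs, closed
                [-122.40, 37.78], [-122.39, 37.78],
                [-122.39, 37.79], [-122.40, 37.79],
                [-122.40, 37.78],
            ]],
        },
        "properties": {"swath_m": 20, "altitude_m": 60},
    }],
}

# Being plain GeoJSON, the same payload loads in QGIS or geopandas unchanged.
encoded = json.dumps(request_body)
```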


Why not, specifically?


Ollama for Mac + https://continue.dev/. Otherwise c.d has hooks for other types of installs.


Is it possible for such a local install to retain conversation history, so that if, for example, you're working on a project and using it as your assistant across many days, you can continue conversations and the model keeps track of what you and it already know?


My LLM command line tool can do that - it logs everything to a SQLite database and has an option to continue a conversation: https://llm.datasette.io


There is no fully built solution, only bits and pieces. I noticed that llama outputs tend to degrade with the amount of text: the text becomes too repetitive and focused, and you have to raise the temperature to break the model out of loops.
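
The temperature trick mentioned here is just softmax rescaling: dividing the logits by a larger temperature flattens the distribution, so low-probability tokens get sampled more often and loops break. A toy sketch with made-up logits:

```python
import math

# Softmax with temperature: T < 1 sharpens the distribution (more
# repetition), T > 1 flattens it (more diversity). Logits are made up.
def softmax_with_temperature(logits, temperature):
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0]
cold = softmax_with_temperature(logits, 0.5)  # top token dominates
hot = softmax_with_temperature(logits, 2.0)   # probability mass spreads out
```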


Does that mean you can only ask questions and get answers in a single step, and that having a long discussion where refinement of the output is arrived at through conversation isn't possible?


My understanding is that at a high level you can look at this model as a black box which accepts a string and outputs a string.

If you want it to “remember” things you do that by appending all the previous conversations together and supply it in the input string.

In an ideal world this would work perfectly. It would read through the whole conversation and would provide the right output you expect, exactly as if it “remembered” the conversation. In reality there are all kinds of issues which can crop up as the input grows longer and longer. One is that it takes more and more processing power and time for it to “read through” everything previously said. And, as jmiskovic said, the output quality can also degrade in perhaps unexpected ways.
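
The append-everything behavior described above can be sketched in a few lines. The turn separators below are illustrative; real chat models use their own special formats:

```python
# Each turn, the full history is re-serialized into one input string;
# the model "remembers" only because earlier turns are literally present.
def build_prompt(history, new_message):
    turns = [f"{speaker}: {text}" for speaker, text in history]
    turns.append(f"User: {new_message}")
    turns.append("Assistant:")           # cue the model to continue here
    return "\n".join(turns)

history = [
    ("User", "What's a monad?"),
    ("Assistant", "A structure for chaining computations."),
]
prompt = build_prompt(history, "Give me an example.")
```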

But that also doesn’t mean that “refinement of output is arrived at through conversation isn't possible”. It is not that black and white, just that you can run into trouble as the length of the discussion grows.

I don’t have direct experience with long conversations so I can’t tell you how long is definitely too long, and how long is still safe. Plus probably there are some tricks one can do to work around this. Probably there are things one can do if one unpacks that “black box” understanding of the process. But even without that you could imagine a “consolidation” process where the AI is instructed to write short notes about a given length of conversation, and then those shorter notes would be copied into the next input instead of the full previous conversation. All of these are possible, but you won’t have a turn-key solution for it just yet.


The limit here is the "context window" length of the model, measured in tokens, which will quickly become too short to contain all of your previous conversations, which will mean it has to answer questions without access to all of that text. And within a single conversation, it will mean that it starts forgetting the text from the start of the conversation, once the [conversation + new prompt] reaches the context length.
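What "forgetting the start of the conversation" looks like mechanically, sketched with characters standing in for tokens (real systems count tokens with the model's tokenizer, not characters):

```python
# Keep only the most recent turns whose combined size fits the context
# budget; everything older silently falls out of the window.
def trim_to_window(turns, budget_chars):
    kept, used = [], 0
    for turn in reversed(turns):          # walk newest-first
        if used + len(turn) > budget_chars:
            break                          # older turns are "forgotten"
        kept.append(turn)
        used += len(turn)
    return list(reversed(kept))            # restore chronological order

turns = ["turn-1 " * 10, "turn-2 " * 10, "turn-3 " * 10]  # 70 chars each
window = trim_to_window(turns, budget_chars=150)           # fits only two
```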

The kind of hacks that work around this are to train the model on the past conversations, and then rely on similarity in tensor space to pull the right (lossy) data back out of the model (or a separate database) later, based on its similarity to your question, and include it (or a summary of it, since summaries are smaller) within the context window for your new conversation, combined with your prompt. This is what people are talking about when they use the term "embeddings".
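
The retrieval half of that hack, sketched with made-up 3-d vectors standing in for real embeddings (which are typically hundreds of dimensions and produced by a model):

```python
import math

# Score stored snippets by cosine similarity to the query vector and
# pull back the closest one to stuff into the context window.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

store = {
    "we chose tabs over spaces": [0.9, 0.1, 0.0],
    "the deploy script lives in ci/": [0.1, 0.9, 0.2],
}
query_vec = [0.85, 0.15, 0.05]  # pretend embedding of "what indentation did we pick?"

best = max(store, key=lambda text: cosine(store[text], query_vec))
```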


My benchmark is having a pair programming session spanning days and dozens of queries with ChatGPT where we co-created a custom static site generator that works really well for my requirements. It was able to hold context for a while and not "forget" what code it provided me dozens of messages earlier, it was able to "remember" corrections and refactors that I gave it, and overall it was incredibly useful for working out things like recurrence for folder hierarchies and building data trees. This and similar use-cases, where memory is important, are when the model is used as a genuine assistant.


Excellent! That sounds like a very useful personal benchmark then. You could test llama v2 by copying in different lengths of snippets from that conversation and checking how useful you find its outputs.


llama is just an input/output engine. It takes a big string as input, and gives a big string of output.

Save your outputs if you want, you can copy/paste them into any editor. Or make a shell script that mirrors outputs to a file and use that as your main interface. It's up to the user.
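
For the shell-script suggestion, `tee` is the usual tool. `model` below is a stand-in function for whatever llama binary you actually run; the point is just mirroring stdout to a log while still seeing it:

```shell
# Stand-in for the real model binary: any command writing to stdout works.
model() { echo "model output for: $1"; }

# tee -a shows the output on the terminal AND appends it to session.log,
# giving you a persistent transcript you can paste back in later.
model "your prompt here" | tee -a session.log
```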


Is there a coherent resource (not a scattered 'just google it' series of guides from all over the place) that encapsulates some of the concepts and workflows you're describing? What would be the best learning site/resource for arriving at understanding how to integrate and manipulate SD with precision like that? Thanks


I have found http://stable-diffusion-art.com to be an absolutely invaluable (and coherent) resource. It's highly ranked on Google for most "how to do X with stable diffusion" style searches, too.


> What would be the best learning site/resource for arriving at understanding how to integrate and manipulate SD with precision like that?

Honestly? Probably YouTube tutorials.


Jaysus.

I'm going to sound like an entitled whiny old guy shouting at clouds, but - what the hell; with all the knowledge being either locked and churned on Discord, or released in form of YouTube videos with no transcript and extremely low content density - how is anyone with a job supposed to keep up with this? Or is that a new form of gatekeeping - if you can't afford to burn a lot of time and attention as if in some kind of Proof of Work scheme, you're not allowed to play with the newest toys?

I mean, Discord I can sort of get - chit-chatting and shitposting is easier than writing articles or maintaining wikis, and it kind of grows organically from there. But YouTube? Surely making a video takes 10-100x the effort and cost, compared to writing an article with some screenshots, while also being 10x more costly to consume (in terms of wasted time and strained attention). How does that even work?


I've been playing with SD for a few months now and have only watched 20-30m of YT videos about it. There's only a few worth spending any time watching, and they're on specific workflows or techniques.

Best just to dive in if you're interested IMO. Otherwise you'll get lost in all the new jargon and ideas. Great place to start is the A1111 repo, lot of community resources available and batteries included.


How does anyone keep up with anything? It's a visual thing. A lot of people are learning drawing, modeling, animation etc in the exact same way - by watching YouTube (a bit) and experimenting (a lot).


Picking images from generated sets is a visual thing. Tweaking ControlNet might be too (IDK, I've never got a chance to use it - partly because of what I'm whining about here). However, writing prompts, fine-tuning models, assembling pipelines, renting GPUs, figuring out which software to use for what, where to get the weights, etc. - none of this is visual. It's pretty much programming and devops.

I can't see how covering this on YouTube, instead of (vs. in addition to) writing text + some screenshots and diagrams, makes any kind of sense.


This isn't for Stable Diffusion, but I wanted to provide a supplemental to my comment: https://kaiokendev.github.io/til

This is the level we're generally working at - first or second party to the authors of the research papers illustrating implementations of concepts, struggling with the Gradio interface, things going straight from commit to production.

It's way less frustrating to follow all of the authors in the citations of the projects you're interested in than wasting your attention sorting through blogspam, SEO, and YT trash just to find out they don't really understand anything, either.


Thank you. I was reluctant to chase after and track first-party research directly, or work directly derived from it, as my limited prior experience told me it's not the most efficient thing unless I want to go into that field of research myself. You're changing my mind about this; from now, I'll try sticking close to source.


There's a relatively thin layer between the papers and implementations, which is another way of saying this stuff is still for researchers and assumes a requisite level of background with them. It sounds like you'd benefit from seeking out the first party sources.

This is where video demonstrations come in handy. Since many concepts are novel, it's uncommon to find anyone who deeply understands them, but it's very easy to find people who have picked up on some tricks of the interfaces, which they're happy to click through. I think gradio/automatic1111 makes learning harder than it needs to be by hiding what it's doing behind its UI, while e.g. ComfyUI has a higher initial learning curve but provides a more representational view of process and pipelines.


Take a moment and go scroll through the examples at civitai.com. Do most of them strike you as the work of people with jobs? Most of them are pretty juvenile, with pretty women and various anime girls.


Are you under the impression that people with jobs don't like pretty women and anime girls?


Of course not, but it looks like a teenage boy's room.


An operative word here is people.... the set "people with jobs" contains a far higher fraction of folks who like attractive men than is represented here....


The difference being that YouTube videos can make more money for the author. Anyway, it's all open source, so feel free to make a wiki.


I would if I could keep up with the videos :).


I think it would have been convenient for me as well if the AI tool that has access to YouTube videos had been able to answer queries. But it takes 5 minutes to reply, and I forget its name. It was on the front page recently.


I mostly agree, but in this case it can be genuinely useful to actually see the process of someone using the tool effectively.


You're describing Arrietty by Ghibli; even "Borrower" is the name the English translation uses for the little beings :)


Well, no, obviously not. The Ghibli movie is a takeoff on the Borrowers series by Mary Norton; there is no reason to believe tetris11 had the movie in mind rather than the books he referred to by name.


There's a story called "The Borrowers" from the '50s. I'm not sure which predates the other.


Arrietty is based on the book.


The thing you touch and interact with every day had better be beautiful and bring you joy, whether through aesthetics, tactile feel, personalized function, or whatever else; hence "custom" is the keyword here. It has nothing to do with the quality of your typing or the work. You don't opt for the cheapest possible decor because it "does the job": your environment affects the way you feel, and this is just one way to express your own.


My team has been developing Remotely (https://meetings.remotelyhq.com/) with this very thing in mind: customizable avatars, voice-driven animation and expressions, animated emoticons, cinematic rooms, interactables. It's still very early stages, but the main observation is that the scope of problems people have with communication is too large. Some people want to feel less anxious in meetings, some want accountability, some productivity, some want "fun" and to customize stuff, others want video, screensharing, etc. It's hard to pinpoint "the problem" that's tangible and solvable outside of the base must-have features. "Feeling more like part of the team while working remotely" is not an easy problem to tackle.

