
I feel like both this comment and the parent comment highlight how RL has been going through another cycle of misunderstanding, brought on by its latest popularity boom from being used to train LLMs

care to correct the misunderstanding?

I mean, for one, DPO, PPO, and GRPO all use losses that are different from what’s used in SFT.
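
To make that concrete, here’s a rough PyTorch-style sketch of the difference (a minimal illustration, not any particular library’s API): SFT minimizes cross-entropy against reference tokens, while PPO maximizes a clipped surrogate objective weighted by an advantage estimate.

    import torch
    import torch.nn.functional as F

    def sft_loss(logits, target_ids):
        # SFT: plain next-token cross-entropy against reference text.
        return F.cross_entropy(logits.flatten(0, 1), target_ids.flatten())

    def ppo_loss(logp_new, logp_old, advantages, clip_eps=0.2):
        # PPO: clipped surrogate objective. Sampled tokens are reweighted
        # by their advantage, and the probability ratio is clipped so the
        # policy can't move too far in a single update.
        ratio = torch.exp(logp_new - logp_old)
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
        return -torch.min(unclipped, clipped).mean()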

They also force exploration as a part of the algorithm.
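
E.g., rollouts are sampled from the current policy rather than replayed from a fixed dataset, and PPO-style setups often add an entropy bonus so the policy keeps exploring. A hedged sketch of that bonus term:

    import torch

    def entropy_bonus(logits):
        # Higher entropy = more exploration; subtracting this bonus from
        # the loss discourages the policy from collapsing onto one answer.
        probs = torch.softmax(logits, dim=-1)
        logp = torch.log_softmax(logits, dim=-1)
        return -(probs * logp).sum(dim=-1).mean()

    # And during rollouts, tokens are sampled, not read from a dataset:
    # next_token = torch.multinomial(probs, num_samples=1)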

They can be used for synthetic data generation once the reward model is good enough.
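
That last point is basically best-of-n rejection sampling: generate candidates, score them with the reward model, and keep the winners as new training data. A minimal sketch, where generate() and reward_model() are hypothetical stand-ins for whatever sampler and scorer you actually use:

    def synthesize(prompts, generate, reward_model, n=8, threshold=0.8):
        # Best-of-n filtering: keep only high-reward completions as
        # synthetic training data.
        dataset = []
        for prompt in prompts:
            candidates = [generate(prompt) for _ in range(n)]
            best = max(candidates, key=lambda c: reward_model(prompt, c))
            if reward_model(prompt, best) >= threshold:
                dataset.append((prompt, best))
        return dataset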


It’s reductive, but also roughly correct.

While collecting data according to the current policy is part of RL, 'reductive' is an understatement. It's like saying algebra is all about scalar products. Well, yes, that's maybe 1% of it.

I really enjoyed it. Very Hitchhiker’s Guide-like. It’s been a while since I’ve read something like that. Thanks for sharing


I used to follow advancements in RL pretty closely, starting back in 2016; it’s cool to see how far methodologies and algorithms have come, to the point of handling complex tasks completely offline. Dreamer 4 is another big leap in the Dreamer series


Along the same lines of missing good junior engineers at work, we occasionally interview stellar engineers who’ve inflated their resumes a bit to get an interview, but we end up rejecting them for not having all the specific experience our manager wants them to have, even though they’re generally great and could clearly upskill where necessary. No wonder we can’t grow the team when we’re out here looking for unicorns


Employers need to get back to expecting to have to train new employees. This used to be pretty common. My first two employers after college hired a lot of people with no programming experience into programming roles. They just looked for smart people who were willing to learn, and they taught them to program.


Not sure about other hospital systems, but the one I work at is developing CV systems to help fill workforce gaps in places where there aren’t as many trained professionals, or even the resources to train professionals


No, that is the norm. Radiologists speak with their colleagues the most, and with patients rarely


This article is pretty good. My current work is transitioning CV models in a large, local hospital system to a more unified deployment system, and much of the content aligns with conversations we have with providers, operations, etc.

I think the part that says models will reduce time to complete tasks and allow providers to focus on other tasks is particularly on point. For one CV task, we’re only saving on average <30 min of work per study, so it isn’t a massive savings from a single provider’s perspective. But scaled across the whole hospital, the savings are huge
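
Back-of-envelope, with made-up volume numbers purely to illustrate the scaling:

    # Hypothetical numbers, just to show the scaling argument.
    minutes_saved_per_study = 20   # consistent with "<30 min" above
    studies_per_day = 500          # across the whole hospital system
    hours_per_year = minutes_saved_per_study * studies_per_day * 365 / 60
    print(f"{hours_per_year:,.0f} provider-hours saved per year")  # ~60,833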


>reduce time to complete tasks and allow providers to focus on other tasks

Or, far more likely, to cut costs and increase profits.


I didn’t see this before, but I work at a non-profit, government hospital system. So increasing profits, although probably a good answer most of the time, isn’t as applicable in this case lol


It’s ironic that HN threads about Palantir, on arguably one of the forums where most users should understand a tech company and its tech, always devolve into weird, speculative, conspiracy-like discussion. Palantir’s docs are pretty open too; it’s not like it’s a black box that you can only see if you have a contract with them. So one would think the HN crowd would know something and have an interesting discussion on how it compares to what they’ve seen, etc. But it somehow always turns mostly political and less about the tech


It's because we understand technology very well and how it can be used to further control or surveil you. The tech itself isn't complicated: at best you have a unified protocol that seamlessly integrates with all data sources; at worst, that part is done manually. The rest of the tech isn't new as a concept, so there's really nothing to discuss technology-wise.

However, as technical people, we can see how something can be used in a bad way, now or sometime in the future given the current trend, and it's necessary to discuss such implications even if the discussion is political. For example, when a messaging app requires a phone number to activate, it's essential to highlight that it could be exploited in a SIM-swap attack (thus the user should not trust it), or that it could leak the number and expose the person's real identity. And in this case, having so much information collected, shared, and easily accessed by one centralized entity is never a good sign.

It's also ironic that the people who used to (still?) attack China and other countries for being surveillance-state Orwellian dystopias, while virtue signaling about democracy and freedom, are now okay with this kind of data collection, processing, and potential red-flagging for things as simple as social media posts.


I’ve used Diesel for a bit now but haven’t had issues wrangling the type system. Can you give an example of an issue you’ve encountered?


Is it common to see Metaflow used alongside MLflow if a team wants to track experiment data?


Metaflow tracks all artifacts and allows you to build dashboards with them, so there’s no need to use MLflow per se. There are Metaflow integrations for Weights & Biases, Comet ML, etc., if you want pretty off-the-shelf dashboards
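
For what it’s worth, artifact tracking in Metaflow is just attribute assignment inside a step; anything assigned to self is persisted and queryable later through the Client API. A minimal sketch (the flow name and artifact are made up):

    from metaflow import FlowSpec, step

    class TrainFlow(FlowSpec):
        @step
        def start(self):
            self.accuracy = 0.93  # persisted automatically as an artifact
            self.next(self.end)

        @step
        def end(self):
            pass

    if __name__ == "__main__":
        TrainFlow()

    # Later, e.g. in a notebook or dashboard:
    # from metaflow import Flow
    # print(Flow("TrainFlow").latest_run.data.accuracy)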

