
I feel like both this comment and the parent comment highlight how RL has been going through another cycle of misunderstanding, brought on by its latest popularity boom from being used to train LLMs

care to correct the misunderstanding?

I mean, for one, DPO, PPO, and GRPO all use losses that are different from what’s used in SFT.
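
To make that concrete, here’s a rough PyTorch-style sketch of the difference (a minimal illustration, not any particular library’s API): SFT minimizes cross-entropy against reference tokens, while PPO maximizes a clipped surrogate objective weighted by an advantage estimate.

    import torch
    import torch.nn.functional as F

    def sft_loss(logits, target_ids):
        # SFT: plain next-token cross-entropy against reference text.
        return F.cross_entropy(logits.flatten(0, 1), target_ids.flatten())

    def ppo_loss(logp_new, logp_old, advantages, clip_eps=0.2):
        # PPO: clipped surrogate objective. Sampled tokens are reweighted
        # by their advantage, and the probability ratio is clipped so the
        # policy can't move too far in a single update.
        ratio = torch.exp(logp_new - logp_old)
        unclipped = ratio * advantages
        clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
        return -torch.min(unclipped, clipped).mean()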

They also force exploration as a part of the algorithm.
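
E.g., rollouts are sampled from the current policy rather than replayed from a fixed dataset, and PPO-style setups often add an entropy bonus so the policy keeps exploring. A hedged sketch of that bonus term:

    import torch

    def entropy_bonus(logits):
        # Higher entropy = more exploration; subtracting this bonus from
        # the loss discourages the policy from collapsing onto one answer.
        probs = torch.softmax(logits, dim=-1)
        logp = torch.log_softmax(logits, dim=-1)
        return -(probs * logp).sum(dim=-1).mean()

    # And during rollouts, tokens are sampled, not read from a dataset:
    # next_token = torch.multinomial(probs, num_samples=1)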

They can be used for synthetic data generation once the reward model is good enough.
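
That last point is basically best-of-n rejection sampling: generate candidates, score them with the reward model, and keep the winners as new training data. A minimal sketch, where generate() and reward_model() are hypothetical stand-ins for whatever sampler and scorer you actually use:

    def synthesize(prompts, generate, reward_model, n=8, threshold=0.8):
        # Best-of-n filtering: keep only high-reward completions as
        # synthetic training data.
        dataset = []
        for prompt in prompts:
            candidates = [generate(prompt) for _ in range(n)]
            best = max(candidates, key=lambda c: reward_model(prompt, c))
            if reward_model(prompt, best) >= threshold:
                dataset.append((prompt, best))
        return dataset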


It’s reductive, but also roughly correct.

While collecting data according to the current policy is part of RL, 'reductive' is an understatement. It's like saying algebra is all about scalar products. Well, yes, that's maybe 1% of it.

I really enjoyed it. Very Hitchhiker’s Guide-like. It’s been a while since I’ve read something like that. Thanks for sharing


I used to follow advancements in RL pretty closely, starting back in 2016; it’s cool to see how far methodologies and algorithms have come, to the point of handling complex tasks completely offline. Dreamer 4 is another big leap in the Dreamer series


Along the same lines of missing good junior engineers at work, we occasionally interview stellar engineers who’ve inflated their resumes a bit to get an interview, but we end up rejecting them for not having all the specific experience our manager wants them to have, even though they’re generally great and could clearly upskill where necessary. No wonder we can’t grow the team when we’re out here looking for unicorns


Employers need to get back to expecting to have to train new employees. This used to be pretty common. My first two employers after college hired a lot of people with no programming experience into programming roles. They just looked for smart people who were willing to learn, and they taught them to program.


Not sure about other hospital systems, but the one I work at is developing CV systems to help fill workforce gaps in places where there aren’t as many trained professionals, or even the resources to train professionals


No, that is the norm. Radiologists speak with their colleagues the most, and with patients rarely


This article is pretty good. My current work is transitioning CV models in a large, local hospital system to a more unified deployment system, and much of the content aligns with conversations we have with providers, operations, etc.

I think the part that says models will reduce time to complete tasks and allow providers to focus on other tasks is particularly on point. For one CV task, we’re only saving on average <30 min of work per study, so it isn’t a massive savings from a single provider’s perspective. But scaled across the whole hospital, the savings are huge
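
Back-of-envelope, with made-up volume numbers purely to illustrate the scaling:

    # Hypothetical numbers, just to show the scaling argument.
    minutes_saved_per_study = 20   # consistent with "<30 min" above
    studies_per_day = 500          # across the whole hospital system
    hours_per_year = minutes_saved_per_study * studies_per_day * 365 / 60
    print(f"{hours_per_year:,.0f} provider-hours saved per year")  # ~60,833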


>reduce time to complete tasks and allow providers to focus on other tasks

Or, far more likely, to cut costs and increase profits.


I didn’t see this before, but I work at a non-profit, government hospital system. So increasing profits, although probably a good answer most of the time, isn’t as applicable in this case lol


It’s ironic that HN threads about Palantir, on arguably one of the forums where most users should understand a tech company and its tech, always devolve into weird, speculative, conspiracy-like discussion. Palantir’s docs are pretty open too; it’s not like it’s a black box that you can only see if you have a contract with them. So one would think the HN crowd would know something and have an interesting discussion on how it compares to what they’ve seen, etc. But it somehow always turns mostly political and less about the tech


It's because we understand technology very well and how it can be used to further control or surveil you. The tech itself isn't complicated: at best you have a unified protocol that seamlessly integrates with all data sources; at worst, that part is done manually. The rest of the tech isn't new as a concept, so there's really nothing to discuss technology-wise.

However, as technical people, we can see how something can be used in a bad way, now or sometime in the future given the current trend, and it's necessary to discuss such implications even if the discussion is political. For example, when a messaging app requires a phone number to activate, it's essential to highlight that it could be exploited in a SIM-swap attack (thus the user should not trust it), or that it could leak the number and expose the person's real identity. And in this case, having so much information collected, shared, and easily accessed by one centralized entity is never a good sign.

It's also ironic that the people who used to (still?) attack China and other countries for being surveillance-state Orwellian dystopias, while virtue signaling about democracy and freedom, are now okay with this kind of data collection, processing, and potential red-flagging for things as simple as social media posts.


I’ve used Diesel for a bit now but haven’t had issues wrangling the type system. Can you give an example of an issue you’ve encountered?


Is it common to see Metaflow used alongside MLflow if a team wants to track experiment data?


Metaflow tracks all artifacts and allows you to build dashboards with them, so there’s no need to use MLflow per se. There are Metaflow integrations for Weights & Biases, Comet ML, etc., if you want pretty off-the-shelf dashboards
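
For what it’s worth, artifact tracking in Metaflow is just attribute assignment inside a step; anything assigned to self is persisted and queryable later through the Client API. A minimal sketch (the flow name and artifact are made up):

    from metaflow import FlowSpec, step

    class TrainFlow(FlowSpec):
        @step
        def start(self):
            self.accuracy = 0.93  # persisted automatically as an artifact
            self.next(self.end)

        @step
        def end(self):
            pass

    if __name__ == "__main__":
        TrainFlow()

    # Later, e.g. in a notebook or dashboard:
    # from metaflow import Flow
    # print(Flow("TrainFlow").latest_run.data.accuracy)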

