Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Sounds like a Data Scientist job?


This is a large problem in industry: defining away some of the most important parts of a job or role as (should be) someone else's.

There is a lot of toil and unnecessary toil in the whole data field, but if you define away all of the "yucky" parts, you might find that all of those "someone elses" will end up eating your lunch.


It's not about "yucky" so much as specialization and only having a limited time in life to learn everything.

Should your reseacher have to manage nvidia drivers and infiniband networking? Should your operations engineer need to understand the math behind transformers? Does your researcher really gain any value from understanding the intricacies of docker layer caching?

I've seen what it looks like when a company hires mostly researchers and ignores other expertise, versus what happens when a company hires diverse talent sets to build a cross domain team. The second option works way better.


My answer is yes to both of those

If other peoples work is reliant on yours then you should know how their part of the system transforms your inputs

Similarly you should fully understand how all the inputs to your part of the system are generated

No matter your coupling pattern, if you have more than 1 person product, knowing at least one level above and below your stack is a baseline expectation

This is true with personnel leadership too, I should be able to troubleshoot one level above and below me to some level of capacity.


The parent comment had three examples...


2/3 is close enough in ML world


> I've seen what it looks like when a company hires mostly researchers and ignores other expertise, versus what happens when a company hires diverse talent sets to build a cross domain team. The second option works way better.

I've seen these too, and you aren't wrong. Division into specializations can work "way better" (i.e. the overall potential is higher), but in practice the differentiating factors that matter will come down to organizational and ultimately human-factors. The anecdotal cases I draw my observations from organizations operating at the scale of 1-10 people, as well as 1,000s working in this field.

> Should your reseacher have to manage nvidia drivers and infiniband networking? Should your operations engineer need to understand the math behind transformers? Does your researcher really gain any value from understanding the intricacies of docker layer caching?

To realize the higher potential mentioned above, what they need to be doing is appreciating the value of what those things are and those who do those things beyond: these are the people that do the things I don't want to do or don't want to understand. That appreciation usually comes from having done and understanding that work.

When specializations are used, they tend to also manifest into organizational structures and dynamics which are ultimately comprised of humans. Conway's Law is worth mentioning here because the interfaces between these specializations become the bottleneck of your system in realizing that "higher potential."

As another commenter mentions, the effectiveness of these interfaces, corresponding bottlenecking effects, and ultimately the entire people-driven system is very much driven by how the parties on each side understand each other's work/methods/priorities/needs/constraints/etc, and having an appreciation for how they affect (i.e. complement) each other and the larger system.


> There is a lot of toil and unnecessary toil in the whole data field, but if you define away all of the "yucky" parts, you might find that all of those "someone elses" will end up eating your lunch.

See: the use of "devops" to encapsulate "everything besides feature development"


Used to do this job once upon a time - can't overstate the importance of just being knee-deep in the data all day long.

If you outsource that to somebody else, you'll miss out on all the pattern-matching eureka moments, and will never know the answers to questions you never think to ask.


My partner is a data engineer, from what I’ve gathered the departments are often very small or one person so the roles end up blending together a lot.


A good DS can double as an MLE.


And sometimes, a good MLE can double as a DS.

Personally I think we calcified the roles around data a little too soon but that's probably because there was such demand and the space is wide.


“Scientist”? Is this like Software Engineer?


I guess it means "someone who has or is about to have a PhD".


Sounds like a data engineer job to me




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: