
I made this same transition from data science to data engineering about 18 months ago and I've never looked back.

I hated working with bad code and dealing with arrogant PhDs who don't value good code. I've seen so many terrible Jupyter notebooks just copied and pasted into VS Code, with the data scientist washing their hands of it and calling it "production ready." Here's a conversation I've had multiple times:

Me: Have you ever considered not making every variable global?

Them: That's just software engineering. We do machine learning.

Me: If it's just software engineering, then why can't you do it?

Meanwhile, automated data science tools are getting halfway decent. If you know what algorithm to pick and you don't need to run millions of records through the model every minute, your standard business analyst could probably get a solid model going, at least as good as what most data scientists would build, for all the reasons the article mentions.

And I like that I know I can do data engineering. With data science you can never really know whether you can hit your target metrics given the data you have, so data scientists end up encouraged to fudge their results or make sloppy decisions. With data engineering I can say "yes, this is doable" or "no, that's not" and people believe me.

My prediction: there's value in the massive volume of data, but most of it can be had through standard dashboards, some summary statistics, a graph network, or maybe a linear/logistic regression. Most data science is BS and companies aren't getting the return they need to pay for these guys. (And good God, you almost certainly don't need a neural network.) Meanwhile, data engineering will get integrated into software development, and machine learning, by virtue of its proliferation through academia, will just become another tool for software developers. Data scientists won't get laid off en masse, but they will go the way of the webmaster: either pick up new skills and evolve, or move on until they end up with new titles.



This resonates with me so much. I stumbled into data science out of university a decade ago, left it to do SWE, and came back to it in the last 3 years.

So many data scientists are full of themselves thinking they are magicians and software developers are blacksmiths who are beneath them.

At my company, the SWEs have incrementally automated so much of the data scientists' workflow that they end up just as you describe: using the tooling and being relegated to becoming analysts.

After 3 years back in this field, I see the writing on the wall: in the '90s most models were created by software developers, and in the 2030s most models will be created by software developers.


I don’t get it. As a data engineer, aren’t you the one who has to deal with the DS code? Whereas as a DS, someone else has to deal with your code?



