
The main difference that jumps out at me: I think it schedules on a single node, whereas Airflow can orchestrate across multiple nodes. Airflow is also starting to implement more enterprise-y features, like role-based access control, which should be coming in the next release.


(author here)

That's absolutely correct. Mara uses Python's multiprocessing [1] to parallelize pipeline execution [2] on a single node, so it doesn't need a distributed task queue. Beyond that (and visualization) it can't do much. In fact, it doesn't even have a scheduler (you can use Jenkins or cron for that).

[1] https://docs.python.org/3.6/library/multiprocessing.html

[2] https://github.com/mara/data-integration/blob/master/data_in...
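
To make the single-node idea concrete, here is a minimal sketch of running pipeline tasks in parallel with Python's multiprocessing. It is not Mara's actual code: the names run_task and run_pipeline and the "stages" layout are made up for illustration, and each task just returns a status string instead of doing real work.

```python
import multiprocessing


def run_task(name):
    # Hypothetical stand-in for real work (e.g. running a SQL script).
    return name, f"{name} done"


def run_pipeline(stages, max_workers=2):
    """Run stages in order; tasks inside one stage run in parallel.

    `stages` is a list of lists of task names (an assumed, simplified
    layout, not a real pipeline definition format).
    """
    results = {}
    with multiprocessing.Pool(processes=max_workers) as pool:
        for stage in stages:
            # pool.map blocks until every task in the stage has finished,
            # so the next stage only starts after the current one is done.
            for name, status in pool.map(run_task, stage):
                results[name] = status
    return results


if __name__ == "__main__":
    stages = [["extract_a", "extract_b"], ["transform"], ["load"]]
    print(run_pipeline(stages))
```

All workers are local processes on one machine, so there is no broker or distributed queue involved. Kicking off run_pipeline on a timetable would still be left to something external like cron or Jenkins.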



