
I think one reason would be fault tolerance. Is there a fault tolerance layer in GNU parallel? Last time I checked their homepage (a few minutes ago), there was no reference to fault tolerance.

Another reason is, perhaps, scheduling.



What fault tolerance does Spark give you in this scheme? It cannot look into TF's progress and checkpoint all of its state. Using Spark with TF seems like overkill: you need to install and manage two frameworks for what should ideally be a 200-line Python wrapper, or a small Mesos framework at most.


Does --retries count as fault tolerance?
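
For reference, retry-only fault tolerance can be sketched in a few lines of Python. This is a rough illustration, not how GNU parallel implements --retries: the function name `run_with_retries` and its `retries` parameter are made up here, and it simply re-runs a failed command from scratch. It does not checkpoint any state, so a long job that dies near the end starts over.

```python
import subprocess


def run_with_retries(cmd, retries=3):
    """Run cmd up to `retries` times, stopping at the first success.

    Hypothetical helper for illustration only.
    Returns (last exit code, number of attempts made).
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    returncode = None
    attempts = 0
    for attempts in range(1, retries + 1):
        # Each attempt restarts the command from scratch; no state is saved.
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            break
    return returncode, attempts
```

Calling `run_with_retries(["python", "train.py"], retries=3)` (with `train.py` standing in for whatever job you run) would try the job up to three times. Whether that counts as fault tolerance depends on whether restarting from zero is acceptable for the workload.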



