However, for the example in this post, I would recommend using the logistic regression implementation provided by MLlib to scale up.