
It's a matter of habit more than code readability.

Five years ago, I would have found the second form easier to read. These days, not only do I find the first form much clearer but I find the second one a bit smelly because it mutates data.

It's actually surprising how quickly you get used to the first form of code once the language you use supports it.



In this case, it's unfortunate that there's no "will this name create a valid person object?" predicate. Much, much better to filter the names, then make the objects.

In this case, as long as append is O(1), I think the imperative version has a big benefit: it avoids building a name-sized list of persons. If you've got a billion names and 2 valid person objects, the imperative version is a big win. Of course, the predicate I mentioned is the right way to go, though.
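To make the comparison concrete, here's a hypothetical Python sketch of the two shapes being discussed; `Person` and `is_valid_name` are invented names, not from the original discussion:

```python
# Hypothetical sketch: construct-then-filter vs filter-then-construct.
# Person and is_valid_name are stand-ins for whatever the real code uses.

class Person:
    def __init__(self, name):
        self.name = name

def is_valid_name(name):
    # stand-in validity predicate
    return bool(name) and name.isalpha()

names = ["alice", "", "bob", "123"]

# construct-then-filter: builds a Person for every name, valid or not
people_a = [p for p in (Person(n) for n in names) if is_valid_name(p.name)]

# filter-then-construct: only valid names ever become Persons
people_b = [Person(n) for n in names if is_valid_name(n)]

assert [p.name for p in people_a] == [p.name for p in people_b] == ["alice", "bob"]
```

With a billion names and two valid ones, the second shape constructs two objects instead of a billion, which is the win the predicate would buy.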

I think the right way to do it is with a fold. But that's not built in, so it's hard to expect of novices.

I see what you're saying about mutation; I guess I have a higher tolerance for mutating stuff that you have the only reference to. I'm not really sure doing a += [validPerson] is much of a win (but I think it would be the right answer in a fold).


> In this case, it's unfortunate that there's no "will this name create a valid person object?" predicate. Much, much better to filter the names, then make the objects.

No "object" is made if the name can't create a "valid person object"; it'll return a stack-allocated null/nothing value. That pattern is also much better for avoiding concurrency issues.

> Much, much better to filter the names, then make the objects.

You're just duplicating the validation logic (or worse, the object creation assumes it's given valid names and does no checks).


> You're just duplicating the validation logic (or worse, the object creation assumes it's given valid names and does no checks)

I think the right way to do it is with a fold.


In Haskell you'd use `mapMaybe :: (a -> Maybe b) -> [a] -> [b]`[0]. It's a combined map+filter+map specialised for Maybe (option types). Rust has Iterator::filter_map[1], which does more or less the same thing.

[0] http://hackage.haskell.org/package/base-4.9.0.0/docs/Data-Ma...

[1] https://doc.rust-lang.org/std/iter/trait.Iterator.html#metho...
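A rough Python analogue of `mapMaybe`/`filter_map`, using `None` where Haskell would use `Nothing`; the `filter_map` and `parse_int` names here are just for illustration:

```python
def filter_map(f, xs):
    """Apply f to each item, keeping only non-None results (map+filter in one pass)."""
    for x in xs:
        y = f(x)
        if y is not None:
            yield y

def parse_int(s):
    # returns None instead of raising, playing the role of Maybe/Option
    try:
        return int(s)
    except ValueError:
        return None

assert list(filter_map(parse_int, ["1", "x", "2"])) == [1, 2]
```

Because it's a generator, no intermediate list of wrapped/unwrapped values is materialised.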


Most languages implement map and filter in terms of lazy sequences, so they would not allocate an intermediate list (Scala is an exception, but I believe you can request laziness).
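Python's generator expressions are one example of this kind of laziness; a small sketch, with a side-effecting `make` used only to count how many elements actually get constructed:

```python
constructed = []

def make(n):
    # record every construction so we can see how lazy the pipeline is
    constructed.append(n)
    return n * 2

# generator pipeline: map then filter, with no intermediate list allocated
pipeline = (y for y in (make(n) for n in range(1_000_000)) if y % 4 == 0)

first = next(pipeline)  # pulls only as many elements as needed
assert first == 0
assert len(constructed) == 1  # only one element was ever constructed
```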


Hmm, I don't think laziness is enough. For example, in the billion-names case, say the first and last elements are valid. At the first step you'll wind up with something like

    validPerson : thunk

after n steps, where n < 1e9,

    validPerson : invalid : invalid : invalid : ... : thunk

then finally, at n = 1e9,

    validPerson : invalid : ... : validPerson

because the intermediate calculations still need to happen.

Getting that first answer is cheap and quick. Getting the second answer requires evaluating those billion - 1 thunks.

Maybe you're thinking of list fusion? AFAIK, only Haskell does that.

Perhaps I'm misunderstanding your point. But even then, Lisp, Scheme, Smalltalk, Python and Perl don't implement map and filter in a lazy way. Of course, I'm thinking in terms of Haskell-lazy; perhaps, again, I'm misunderstanding and you mean something else.

In retrospect, I think I'll revise my opinion and say map.filter is probably a bigger code smell than referentially transparent mutation.


I understood from your comment that the imperative version "avoids building a name-sized list of persons" to mean you thought the declarative version would construct an intermediate List[Person] the same size as the source list of names. Most modern languages (e.g. C#, F#, Clojure, Rust) implement map and filter using lazy sequences rather than eagerly constructing intermediate collections (admittedly I don't have much experience with the languages you list).

The use of laziness will avoid the need to construct a large intermediate collection of Person instances. Obviously you'll need to iterate the entire source collection to find all valid Persons, but the same applies to the imperative version. The lazy version composes better, however, and avoids extra work if some downstream caller only requires the first n valid Persons.
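The early-exit point can be sketched in Python with `itertools.islice`; the names, validity rule, and `valid_person` helper are invented for the demonstration:

```python
from itertools import islice

checked = []

def valid_person(name):
    checked.append(name)  # record how far down the source the pipeline ran
    return name if name.startswith("a") else None

names = ["ann", "amy", "bob"] + ["junk"] * 1_000_000

# lazy pipeline: nothing runs until a downstream caller consumes it
people = (p for p in (valid_person(n) for n in names) if p is not None)

first_two = list(islice(people, 2))  # caller only wants the first 2 valid Persons
assert first_two == ["ann", "amy"]
assert len(checked) == 2  # the million junk names were never examined
```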


Oh! Yes, we're on the same page as far as laziness goes. The lazy version will only construct as much of the intermediate list as you need to get the next answer, and the garbage collector can potentially reclaim the parts of the intermediate list that have already gone past.

But there's already a good, clean, functional, composable answer to this kind of thing: fold.

Fold takes an operation (like map does), a list (like map does), and an accumulator for building up results, and it returns the resulting accumulator.

    foldl(op, names, acc) {
        if (names.empty)
            return acc
        else
            return foldl(op, names.tail, op(names.head, acc))
    }
So with tail recursion, you get no stack growth. (I mean head as the first element of names, and tail as everything else.)

The op would look something like

    op(name, acc) {
        let p = Person.init(name)
        if (p.isValid)
            return acc += [p]
        else
            return acc
    }
So fold trundles down the list of names, the op checks each name to see if it's good, and only adds the good ones to the final list. Whenever fold notices it's out of names to try, it just returns whatever the current accumulator might be.
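The same fold can be sketched as runnable Python with `functools.reduce` (note its op takes the accumulator first); `make_person` and the validity rule are stand-ins for the `Person.init`/`isValid` pair above:

```python
from functools import reduce

def make_person(name):
    # stand-in for Person.init: the "person" if valid, None otherwise
    return {"name": name} if name.isalpha() else None

def op(acc, name):
    # reduce passes (accumulator, element), the reverse of the pseudocode above
    p = make_person(name)
    return acc + [p] if p is not None else acc

names = ["alice", "42", "bob"]
people = reduce(op, names, [])
assert [p["name"] for p in people] == ["alice", "bob"]
```

One pass, no intermediate name-sized list, and no mutation visible outside the fold.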

Map forces you to build the intermediate representation, which the imperative version avoids.


Even Java manages to get your example to work efficiently with map/filter. It's not rocket science. Just because some library designers screwed it up doesn't invalidate the whole concept.


But Java provides collect, which is, like, the right way to do it.

I'm just sort of mystified that people are defending map.filter over filter.map (which isn't crazy, as name.isValidPerson isn't provided). But, you know, fold is a thing.


> so they would not allocate an intermediate list

Not quite. Lazy sequences make it so that thinking about fusion/deforestation makes sense. This is an optimization the compiler can make. Naively, you still end up with as many allocations as before (just in a different order).

In Haskell, this sort of optimization is actually user-customizable using REWRITE RULES. And you get it for lists out of the box.



