I don't want to say "advantage", so much as preference. But a few things come to...

rahimnathwani · on Jan 17, 2022

I can see how this is more intuitive. In pandas I'd assign the output of groupby to a variable, and then add the new column in a separate statement.

(The below is off topic, but I don't use R so I'd love to know whether I'm reading the code correctly)

"Here I'm filtering participants in an mturk study by those who have completed more than 40 trials at least six times across multiple sessions."

A user with this pattern of trials seems like they would fit the above definition:

Session 1: 82 trials
Session 2: 82 trials
Session 3: 82 trials

But the code seems to want 6 distinct sessions with >40 trials each. Have I misunderstood?

Also, is 'mutate' necessary before 'filter' or is that just to make the intent of the code clearer to your future self?

jonnycomputer · on Jan 17, 2022

My initial wording was sloppy.

There were 50 trials in each session; so I counted a session completed if they did more than 40 in that session. They needed to have completed at least six sessions.

The mutate is unnecessary. I forget why I did that.
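For readers following the exchange above, the clarified rule (a session counts as completed when more than 40 of its 50 trials were done, and a participant needs at least six completed sessions) can be sketched in plain Python. This is only an illustration, not the original R code: the record layout, the participant IDs, and the function name `eligible_participants` are all hypothetical, and it assumes one record per participant per session.

```python
from collections import Counter

# Hypothetical records: (participant_id, session_id, trials_completed).
# Assumes exactly one record per participant per session.
records = (
    [("a", s, 45) for s in range(1, 7)]    # six sessions, all above 40 trials
    + [("b", s, 50) for s in range(1, 4)]  # many trials, but only three sessions
    + [("c", s, 45) for s in range(1, 6)]  # five sessions above 40 trials...
    + [("c", s, 30) for s in range(6, 8)]  # ...plus two that fall short
)


def eligible_participants(records, trials_threshold=40, min_sessions=6):
    """Return the IDs of participants with enough completed sessions."""
    # Count, per participant, the sessions with more than `trials_threshold` trials.
    completed = Counter(
        pid for pid, _session, trials in records if trials > trials_threshold
    )
    # Keep participants whose completed-session count reaches `min_sessions`.
    return {pid for pid, n in completed.items() if n >= min_sessions}


print(eligible_participants(records))
```

Under this reading, a participant with three long sessions is excluded no matter how many trials they logged in total, which matches the "six distinct sessions" interpretation raised in the question.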

jonnycomputer · on Jan 17, 2022

What it would take to recreate dplyr in Python:

https://mchow.com/posts/2020-02-11-dplyr-in-python/

melling · on Jan 17, 2022

Didn’t R introduce the native pipe operator?

%>% is now simply |>

jonnycomputer · on Jan 17, 2022

They did. I just haven't gotten around to using it yet!