
Some doubt might be cast on the general validity of the results by the following brief description of the experiment being reported here:

"In the study, 40 wallets were sent out in each photograph category as well as 40 containing a card suggesting that the owner had recently made a contribution to charity. A control batch contained no additional item.

All of the wallets were stuffed with the same set of everyday items, including raffle tickets, discount vouchers, and membership cards. None of them contained money, however."

(Added italics mine.)



I think that since there was nothing valuable inside, finders might have thought it was not worth the trouble of sending the wallets back, whereas a baby picture is valuable to its owner, so it was worth returning. If I had found such a wallet, that's how I would have acted.


I was actually thinking the complete opposite. Since they found nothing of value in the wallets, they were easy to part with and send back. If they had found $300 in cash, the finder might have felt more inclined to keep the wallet, or at least the money. A good follow-up test would be to see how often the money comes back with the wallet.


I was not trying to speculate about what people would do if they found a wallet full of money. I was just trying to interpret why they sent back a wallet with a baby picture but did not send back a wallet with nothing of value inside.


This may be true. I lost my wallet today, and a very honest man found it and returned it to me. The wallet contained a large sum of money.


I sense a grant application for a follow-up study coming!


"In the study, 40 wallets..."

That is such a small sample size that the results could be masking the underlying probabilities worldwide. This study certainly has no bearing on places like Saudi Arabia, where it is illegal to touch a dropped wallet unless you are the security staff of the building, the police, or an agent of the owner. Imagine if they did the exact same experiment there.


Note that that's 40 wallets in each of 6 categories; the total number was 240.

If half of all wallets get returned, then the variance of the number returned is on the order of 40 * 1/2 * (1 - 1/2) = 10, which means a standard deviation on the order of 3 wallets returned, or about 8 percentage points in the return rate. So the "puppy" and "family" figures might be out by about that much.

The cute-baby category had a measured return rate of 88%, which means a variance of something like 40 * 0.88 * 0.12 ~= 4.2, for a standard deviation of about 2 wallets, or about 5 percentage points in the return rate.

So if these results are unlucky to the tune of two 2-sigma errors pointing in the "right" direction, the puppy category might really be as good as 69%, and the baby category might really be as bad as 78%.

So, at least as far as simple sampling error goes, the "baby beats puppies" result seems pretty robust.

(No need to tell me about all the oversimplifications in the above. I know.)
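For anyone who wants to check the arithmetic, here is a minimal sketch of the same back-of-envelope binomial calculation. The 88% baby figure is from the article; the 53% puppy figure is not quoted above and is only inferred from the 69% bound (69% minus two 8-point sigmas), so treat it as an assumption. The function name is made up for illustration.

```python
import math

def return_rate_sd(n, p):
    """Standard deviation of the observed return rate when each of
    n wallets is returned independently with probability p."""
    sd_count = math.sqrt(n * p * (1 - p))  # binomial sd of the count
    return sd_count / n                    # convert to a rate

n = 40  # wallets per category

# Rough sd for categories near a 50% return rate (puppy, family).
sd_half = return_rate_sd(n, 0.5)

# sd for the baby category at its measured 88% return rate.
baby_rate = 0.88
sd_baby = return_rate_sd(n, baby_rate)

# ASSUMPTION: 53% is inferred from the comment's numbers, not quoted.
puppy_rate = 0.53

# Two 2-sigma errors, each pointing toward closing the gap.
puppy_high = puppy_rate + 2 * sd_half
baby_low = baby_rate - 2 * sd_baby

print(f"sd near 50%: {sd_half:.3f}, sd at 88%: {sd_baby:.3f}")
print(f"puppy could be as high as {puppy_high:.2f}")
print(f"baby could be as low as {baby_low:.2f}")
```

Even in that unlucky case the puppy bound stays below the baby bound, which is all the argument above claims. It still ignores everything except simple sampling error.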


I sympathize with what you are trying to do, but you are right when you say oversimplifications abound. I do Market/Business Intelligence for a living. I repeatedly see the results of split tests change drastically after 100 results per option come in, and I have even seen them change after 1000 samples per option. Human beings don't always fit into a nice standard deviation. A holiday, or the weather, or unknown variable X will just go ahead and mess everything up for you. Sure, if I had to make a decision based JUST on this data I would keep a baby picture instead of a puppy picture, but I wouldn't be nearly as confident as the article writer is.


How do statisticians decide on what a good sample size is?

Certainly they can't use the science of statistics to determine the sample size, since a good statistical study would require using a good sample size.



