verginer's comments

verginer · 2026-04-13T14:22:24 1776090144

Like others have pointed out, it reads like a horoscope. The example images give a reasonable approximation of what I'd profile them as too, but after trying a few of my own picture it's clearly BS. Garbage in, garbage out.

This "use LLMs as psychometric/political polling substitutes" idea seems to have jumpstarted a weird cottage industry of "synthetic" surveys. The model is pattern-matching on superficial visual cues and dressing it up as insight (I have a long beared and hence I vote for the green party).

Nate Silver put it well recently: [AI polls are fake polls][1].

An LLM inferring personality from a photo is even further down that chain of abstraction. That's not profiling, it's stereotyping with extra steps.

[1]: https://www.natesilver.net/p/ai-polls-are-fake-polls

verginer · 2026-01-24T19:31:28 1769283088

I finally found a job for my Raspberry Pi 1 Model B from 2012. It’s been sitting in a drawer for years, but about a 2 years ago added it to my Tailscale network as an exit node.

It’s a single-core 700MHz ARMv6 chip with 512MB of RAM. It's a fossil—a Pi 5 is 600x faster (according to the video). But for the 'low-bandwidth' task of routing some banking traffic or running a few changedetection watches via a Hetzner VPS (where the actual docker image runs), it’s rock solid. There’s something deeply satisfying about giving 'e-waste' a second life as a weekend project.

zikduruqe · 2026-01-24T20:49:41 1769287781

As a fun weekend project in 2013, I stood up a weather station using Weewx and my RPI 2 with 1 GB RAM. I told myself if it ever crashes or the SD card gets corrupted, I'll just tear everything down.

Well, it's still running today on the original SD card. At noon today it processed its 1,055,425th record in the database.

Still, if it ever crashes, I'll just tear it down. :)

varispeed · 2026-01-25T01:24:27 1769304267

Sounds like you want it to crash really badly :)

zikduruqe · 2026-01-25T15:54:30 1769356470

Indeed. I'm not going to do anything to cause its demise, but when it does go, it is gone.

bevr1337 · 2026-01-24T19:36:03 1769283363

They'll run CUPS too! My B modernized some old, commercial Brother laser printers I was running.

thisislife2 · 2026-01-24T19:44:53 1769283893

That's a great idea - if I understood you right, you mean you used it to make a printer "wireless / wifi enabled" with it, right? Is there any guide you can recommend for that?

jffry · 2026-01-24T21:23:10 1769289790

I've done the same thing - making a USB-only printer available on my LAN - following this guide: https://pimylifeup.com/raspberry-pi-print-server/

One nice thing is I can print to the CUPS server even if the printer is off

zbendefy · 2026-01-24T21:27:17 1769290037

I did the same with an rpi3, not sure if I used this guide but it seems good:

https://www.raspberrypi.com/news/printing-at-home-from-your-...

justinsaccount · 2026-01-24T19:53:37 1769284417

I have a few older models lying around too, there's some other minor benefits as well:

  * They have full sized HDMI ports 
  * They will happily run using any random old USB charger and not overheat.

moffkalast · 2026-01-24T19:54:07 1769284447

Well on the other hand, at which point does it become wasteful to run something when it gets less and less power efficient compared to newer devices? According to OP's benchmarks, the Pi 1 burns 2W constant to do essentially zero work and running that on a more modern device that's already running would use almost no extra power.

Then again we use a kW or two to microwave things for minutes on a daily basis so who really gives a shit.

horsawlarway · 2026-01-24T20:06:29 1769285189

Yeah... 2W is just not that much energy.

Enough energy to run that thing for an entire year in under 1/2 a gallon of gasoline.

When you can pretty easily offset the entire yearly energy use by skipping a mow of your yard once, or even just driving slightly more conservatively for a few days... I'm not so worried about the power use.

In my region - it's about $3.50 in yearly power costs.

01HNNWZ0MV43FF · 2026-01-24T21:22:03 1769289723

I did unplug my GPU to save 30 watts, but... 2 watts is equivalent to driving a Prius Prime 0.155 miles per day on battery power. So there's that

3eb7988a1663 · 2026-01-24T21:42:20 1769290940

That seems an impossible range.

This site[0] claims a Prius Prime XSE gets 1.42 miles/kWh. Or (1.42 miles /1000Wh)*2 = 0.0028 miles. Which is ~14 feet, which is significantly more in line with my expectations (though still high)

[0] https://www.motortrend.com/reviews/2024-toyota-prius-prime-x...

tzs · 2026-01-24T23:33:33 1769297613

You are missing a factor of 24, which comes in because they said "0.155 miles per day on battery power".

The easiest way to do the calculation would be, assuming a Prius Prime can do M mi/kWh on battery power, is to calculate 0.155 mi/day x 1/M kWh/mi x 1 day/24h = 0.0065 kW = 6.5/M W. That gives us W which can directly be compared with the 2 W he gave.

Also, 1.42 mi/kWh seems way low for battery power operation. I'm pretty sure that is for mixed gas/electric operation, expressed in MPG-e (47.9) and mi/kWh for convenient comparison to pure EVs. (You can convert between MPG-e and mi/kWh used the conversion factor for 33.7 kWh/gal.

It has a 13.6 kWh battery and a 39 mile all electric range, which suggests M = 2.9 mi/kWh. Plugging that into 6.5/M W gives 2.2 W.

M is probably actually a little higher because the car probably doesn't let the battery actually use 100% of its capacity. Most sites I see seem to say 3.1-3.5 mi/kWh.

On the other hand there are some losses when charging. On my EV during times I've the year when I do not need to use the heating or AC the car is reporting 4.1 or higher mi/kWh, but it is measuring what is coming out of the battery.

When calculated based on what is coming out of my charger it works out to 3.9 mi/kWh. This is with level 2 charging (240 V, 48 A). Level 1 charging is not as efficient as level 2.

If we go with 3.1-3.5 mi/kWh, and assume that is measured on the battery output side and that the loses during charging are about 8%, we get 2.9-3.2 mi/kWh on the "this is what I've getting billed for" side. If we use the average of that and plug into 6.5/M W we get 2.1 W.

fc417fc802 · 2026-01-25T05:45:49 1769319949

I thought the same thing but then I realized that it's 48 W hr for a full day.

TacticalCoder · 2026-01-24T19:46:18 1769283978

> I finally found a job for my Raspberry Pi 1 Model B from 2012.

Nice! Even though I've got a Proxmox serve at home running on a real PC (but it's not on 24/7), I do run my DNS, unbound, on a Pi 2. It's on 24/7 and it's been doing its job just fine since years.

1vuio0pswjnm7 · 2026-01-25T14:15:49 1769350549

I run customised NetBSD on the Model B. Fits in 512MB

I never use a wall plug as I can easily power it from other computers' USB ports

It came with a WiFi dongle but I prefer to connect it to a WiFi travel router via Ethernet if I need wireless

hypercube33 · 2026-01-24T22:11:12 1769292672

I mean in theory and practice a Pentium 2 300 could do full 1gpbs routing with Vyatta and I used that and other distros to do that for years

verginer · on Dec 23, 2023

The author notes that they used "cheats". Depending on what these do the iid assumption of the samples being independent could be violated. If it is akin to snowball sampling it could have an "excessive" success rate thereby inflating the numbers.

> Jason found a couple of cheats that makes the method roughly 32,000 times as efficient, meaning our “phone call” connects lots more often

Woberto · on Dec 25, 2023

Just keep reading TFA

> it was discovered by Jia Zhou et. al. in 2011, and it’s far more efficient than our naïve method. (You generate a five character string where one character is a dash – YouTube will autocomplete those URLs and spit out a matching video if one exists.)

bsdetector · on Dec 23, 2023

There's probably a checksum in the URL so that typos can be detected without actually trying to access the video.

If you don't know how the checksum is created you can still try all values of it for one sample of the actual ID space.

oh_sigh · on Dec 23, 2023

I assume the cheat is something like using the playlist API that returns individual results for whether a video exists or not.

So you issue an API to create a playlist with video IDs x, x+1, x+2, ..., and then when you retrieve the list, only x+2 is in it since it is the assigned ID.

stravant · on Dec 23, 2023

The data probably wouldn't look so clean if it were skewed. If Google were doing something interesting it probably wouldn't be skewed only by a little bit.

verginer · on Dec 23, 2023

Admittedly, I did not read the paper linked. But my point is not about google doing something funny. Even if we assume that ids are truly random and uniformly distributed this does not mean that the sampling method doesn't have to be iid. This problem is similar to density estimation where Rejection sampling is super inefficient but converges to the correct solution, but MCMC type approaches might need to run multiple times to be sure to have found the solution.

mocamoca · on Dec 23, 2023

I agree.

Proving that using cheats and auto complete does not break sample independence and keeps sampling as random as possible would be needed here for stats beginners such as me!

Drunk dialing but having a human operator that each time tries to help you connect with someone, even if you mistyped the number... Doesn't look random to me.

However I did not read the 85 pages paper... Maybe it's addressed there.

Mogzol · on Dec 23, 2023

Page 9 & 10 of the paper [1] go into some detail:

> By constructing a search query that joins together 32 randomly generated identifiers using the OR operator, the efficiency of each search increases by a factor of 32. To further increase search efficiency, randomly generated identifiers can take advantage of case insensitivity in YouTube’s search engine. A search for either "DQW4W9WGXCQ” or “dqw4w9wgxcq” will return an extant video with the ID “dQw4w9WgXcQ”. In effect, YouTube will search for every upper- and lowercase permutation of the search query, returning all matches. Each alphabetical character in positions 1 to 10 increases search efficiency by a factor of 2. Video identifiers with only alphabetical characters in positions 1 to 10 (valid characters for position 11 do not benefit from case-insensitivity) will maximize search efficiency, increasing search efficiency by a factor of 1024. By constructing search queries with 32 randomly generated alphabetical identifiers, each search can effectively search 32,768 valid video identifiers.

They also mention some caveats to this method, namely, that it only includes publicly listed videos:

> As our method uses YouTube search, our random set only includes public videos. While an alternative brute force method, involving entering video IDs directly without the case sensitivity shortcut that requires the search engine, would include unlisted videos, too, it still would not include private videos. If our method did include unlisted videos, we would have omitted them for ethical reasons anyway to respect users’ privacy through obscurity (Selinger & Hartzog, 2018). In addition to this limitation, there are considerations inherent in our use of the case insensitivity shortcut, which trusts the YouTube search engine to provide all matching results, and which oversamples IDs with letters, rather than numbers or symbols, in their first ten characters. We do not believe these factors meaningfully affect the quality of our data, and as noted above a more direct “brute force” method - even for the purpose of generating a purely random sample to compare to our sample - would not be computationally realistic.

[1]: https://journalqd.org/article/view/4066

crazygringo · on Dec 23, 2023

> case insensitivity in YouTube’s search engine.

That's very clever. Presumably the video ID in the URL is case-sensitive, but then YouTube went out of their way to index a video's ID for text search, which made this possible.

verginer · on Dec 23, 2023

Good observation, but they also acknowledge: > there are considerations inherent in our use of the case insensitivity shortcut, which trusts the YouTube search engine to provide all matching results, and which oversamples IDs with letters, rather than numbers or symbols, in their first ten characters. We do not believe these factors meaningfully affect the quality of our data, and as noted above a more direct “brute force” method - even for the purpose of generating a purely random sample

In short I do believe that the sample is valuable, but it is not a true random sample in the spirit that the post is written, there is a heuristic to have "more hits"

verginer · on June 5, 2021

What is meant by the disclaimer at the end,

> This is not an official Google product, it is just code that happens to be owned by Google.

Is the code developed by an external group and google just happened to be where first prototype was developed?

cjohansson · on June 5, 2021

No I think the author works at Google and have a couple of hours weekly or monthly to do "experiments", but the employment terms forces the ownership of the "experiments" to Google but since the projects may not align with Googles image or reputation they wish not to be officially associated with it. They own it but do not "stand behind it" officially

verginer · on May 20, 2018

There is a typo in the title you might want to fix it from asteriod to asteroid, which I assume is the correct name given the domain.

sctb · on May 20, 2018

Thanks! Updated.