Hacker News | j1elo's comments

> went closed source and started injecting adware into checkout pages ... [and] geolocation tracking.

Maybe we should resort to publicly blaming and shaming this sort of action. DDoS their servers, fill their inbox with spam, review-bomb anything they do. Public court justice a la 4chan trolling. Selling out is a lawful decision, of course, but there is no reason it shouldn't come with the price tag of becoming publicly hated. In fact, it might (very ironically) help people who are on the verge to stay on the ethical side of things.

I'm just kinda joking (but I wouldn't hate it if I were rugpulled and the person who did it got such treatment)


Calm down; just spreading the word that the extension is adware and getting everyone to uninstall it is sufficient to demonstrate that this move was a mistake. Trying to ruin someone's life is going completely overboard. Repercussions should be proportionate: you don't shoot people for stealing a candy bar.

Agreed. Times are tough. Open source is under-appreciated. People are going to crack and slip up like this. We’re only human.

Why not? Sincere question. As a very superficial idea, if we go back to the drawing board, we could for example define our cool new address concept as an IPv4 address plus a hex suffix, maybe at the expense of not having a humongous address space.

So 10.20.30.40 would be an IPv4 address, and 10.20.30.40:fa:be:4c:9d could be an IPv6 address. With the :00:00:00:00 suffix being equivalent to the IPv4 version.

I just made this up, so I'm sure that a couple years of deep thought by a council of scientists and engineers could come up with something even better.
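For concreteness, the made-up scheme above could be sketched like this (purely illustrative; the function name and encoding are hypothetical, not any real protocol):

```python
import ipaddress

def parse_extended(addr: str) -> bytes:
    """Parse the hypothetical 'IPv4 + hex suffix' address into 8 bytes.

    '10.20.30.40' and '10.20.30.40:00:00:00:00' map to the same value,
    mirroring the suggested backward-compatible encoding.
    """
    parts = addr.split(":")
    v4 = ipaddress.IPv4Address(parts[0]).packed          # 4 bytes
    suffix = bytes(int(p, 16) for p in parts[1:]) or b"\x00" * 4
    if len(suffix) != 4:
        raise ValueError("suffix must be exactly 4 hex bytes")
    return v4 + suffix

# The bare IPv4 form and the all-zero-suffix form are equivalent:
assert parse_extended("10.20.30.40") == parse_extended("10.20.30.40:00:00:00:00")
```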


The header of an IPv4 packet has the source and destination addresses, both as 32-bit values. These fields are adjacent, and there's other stuff next to them. If you appended more bytes to the source address, routers would think that those new bytes are the destination address. This would not be backward compatible.
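The fixed offsets can be seen by packing a minimal header with Python's stdlib (a sketch; real headers carry checksums, options, etc.):

```python
import struct
import socket

# Minimal 20-byte IPv4 header: version/IHL, TOS, total length, ID,
# flags/fragment, TTL, protocol, checksum, source addr, destination addr.
header = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0, 20, 0, 0, 64, 6, 0,
                     socket.inet_aton("10.0.0.1"),    # source at bytes 12..15
                     socket.inet_aton("192.0.2.7"))   # destination at bytes 16..19

src = socket.inet_ntoa(header[12:16])
dst = socket.inet_ntoa(header[16:20])
# A legacy router parses these fixed offsets blindly; any extra bytes
# appended after the source field would be misread as the destination.
```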

Interestingly, what you're describing really is similar to how many languages represent an IPv4 address internally. Go embeds IPv4 addresses inside of IPv6 structs as ::ffff:{IPv4 address}: https://cs.opensource.google/go/go/+/go1.26.2:src/net/ip.go;...


That's not a language-specific thing, but is actually part of the IPv6 RFCs as IPv4-mapped IPv6 addresses: [1], [2]
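For instance, Python's stdlib `ipaddress` module exposes the mapped form directly, and the byte layout matches the RFC description:

```python
import ipaddress

mapped = ipaddress.IPv6Address("::ffff:192.0.2.128")
print(mapped.ipv4_mapped)     # the embedded IPv4 address: 192.0.2.128
print(mapped.packed[:10])     # 80-bit zero prefix
print(mapped.packed[10:12])   # 16 one-bits: b'\xff\xff'
print(mapped.packed[12:])     # the 32-bit IPv4 address itself
```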

This is super useful because (at least on Linux) IPv6 sockets are dual-stack by default and bind to both IPv4 and IPv6 (unless you use the IPV6_V6ONLY sockopt or a sysctl), so you don't need to open and handle IPv4 and IPv6 sockets separately (well, maybe some extra code for logging/checking properly with the actual IPv4 address).

That is also documented in ipv6(7):

  IPv4 connections can be handled with the v6 API by using 
  v4-mapped-on-v6 address type; thus a program needs to support only
  this API type to support both protocols.  This is handled
  transparently by the address handling functions in the C library.
  
  IPv4 and IPv6 share the local port space.  When you get an IPv4
  connection or packet to an IPv6 socket, its source address will be
  mapped to v6.
[1]: https://datatracker.ietf.org/doc/html/rfc5156#section-2.2

[2]: https://datatracker.ietf.org/doc/html/rfc4291#section-2.5.5....
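The dual-stack behavior described above can be sketched in a few lines (assumes a Linux host with an IPv6-capable loopback; some containers disable IPv6 entirely):

```python
import socket

# Bind one IPv6 socket dual-stack (IPV6_V6ONLY off); an IPv4 client
# connecting to it shows up with a v4-mapped source address.
srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
srv.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
srv.bind(("::", 0))
srv.listen(1)
port = srv.getsockname()[1]

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", port))
conn, peer = srv.accept()
print(peer[0])  # e.g. '::ffff:127.0.0.1'

cli.close(); conn.close(); srv.close()
```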

Programmers really like to focus on things like:

- How they would format the display of the bits

- Where in the bit pattern IPv4 mapped addresses should go

- Coming up with some variation of NAT64, NAT464, or similar concepts to communicate between/over IPv4 and IPv6 networks

- Blaming the optional extensions/features of IPv6 for being too complex, then inventing something with 90% of the same parts, which turn out to actually be required

It's even easy to get distracted in a world of "what you can do with IPv6" instead of just using the basics. The things that actually make IPv6 adoption slow are:

- A change in the size of the address field, which requires special changes and configuration in network gear, operating systems, and apps, because until the migration is 100% complete there isn't just one protocol whose transport you have to think about.

If IPv4 were more painfully broken then the switch would have happened long ago. People just don't care to move fast because they don't need to. IPv6 itself is fine though and, ironically, it's the ones getting the most value out of the optional extensions (such as cellular providers) who actually started to drive IPv6 adoption.


How would you get someone that only knows about IPv4 addresses like 10.20.30.40 to send a packet to someone with an address 10.20.30.40:fa:be:4c:9d?

How do you squeeze that in IPv4 packet? Especially in a way that won't get mangled by random ossified devices in between?

In IPv4 you only need to transmit IPv4 addresses. If the "cannot be" in the parent post refers to the exact byte layout in packets, then, coming at it the other way around, I agree. The only reason a UTF-8 character can pretend to be ASCII is that ASCII didn't use all 8 bits of a byte to begin with. The only way to have something similar here would be if IPv4 hadn't used all of its allocated address bits... which is not the case.

What I argued was that IPv4 could be embedded into IPv6 address space if they had designed for it. But I agree, that the actual packet header layouts would need to look at least a bit different.


> What I argued was that IPv4 could be embedded into IPv6 address space if they had designed for it.

Like:

> Addresses in this group consist of an 80-bit prefix of zeros, the next 16 bits are ones, and the remaining, least-significant 32 bits contain the IPv4 address. For example, ::ffff:192.0.2.128 represents the IPv4 address 192.0.2.128. A previous format, called "IPv4-compatible IPv6 address", was ::192.0.2.128; however, this method is deprecated.[5]

* https://en.wikipedia.org/wiki/IPv6#IPv4-mapped_IPv6_addresse...


They did that. Problem is that an ipv4 only host can't talk to ipv6. Adding more bits to ipv4 creates a new protocol just like ipv6 and has the same transition issues.

https://datatracker.ietf.org/doc/html/rfc4291#section-2.5.5....

& the following section for the follow-up embedding.


The protocol field in the ipv4 header seems like a reasonable choice. A value would be associated with ipv6, and if that value is chosen then additional header data follows the ipv4 header.

Perhaps you could use 41, the value already associated with doing this.

(What's up with people constantly suggesting that v6 should do things that it already does?)


That's similar to, but not exactly what we were discussing. In particular 6in4 has a full ipv6 header after the ipv4 header, but here the suggestion was instead that supplementary information would follow. For example, the most significant address bits could be stored in the ipv4 header and the least significant ones in the new part.

That's not meaningfully different. It would just amount to a slightly less redundant representation of the same data -- the steps needed to deploy it would be the same, and you'd still have all the same issues of v4 hosts not understanding your format.

Not really reasonable. That would 1) make routing inefficient, because routers have to parse an additional, non-adjacent, non-contiguous header to get the source and destination addresses; 2) break compatibility, because there would exist "routers" that do not understand ipv6 headers: they'd receive your ipv4-with-v6 packet and send it somewhere else.

The result is basically the same situation we are in today, except much more hacky. You'd still have to do a bunch of upgrades.


> So 10.20.30.40 would be an IPv4 address, and 10.20.30.40:fa:be:4c:9d could be an IPv6 address. With the :00:00:00:00 suffix being equivalent to the IPv4 version.

Like

> Addresses in this group consist of an 80-bit prefix of zeros, the next 16 bits are ones, and the remaining, least-significant 32 bits contain the IPv4 address. For example, ::ffff:192.0.2.128 represents the IPv4 address 192.0.2.128. A previous format, called "IPv4-compatible IPv6 address", was ::192.0.2.128; however, this method is deprecated.[5]

* https://en.wikipedia.org/wiki/IPv6#IPv4-mapped_IPv6_addresse...

Or:

> For any 32-bit global IPv4 address that is assigned to a host, a 48-bit 6to4 IPv6 prefix can be constructed for use by that host (and if applicable the network behind it) by appending the IPv4 address to 2002::/16.

> For example, the global IPv4 address 192.0.2.4 has the corresponding 6to4 prefix 2002:c000:0204::/48. This gives a prefix length of 48 bits, which leaves room for a 16-bit subnet field and 64 bit host addresses within the subnets.

* https://en.wikipedia.org/wiki/6to4
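The quoted 6to4 construction can be reproduced with Python's `ipaddress` module (a quick illustrative check, not part of any real tooling):

```python
import ipaddress

def to_6to4_prefix(v4: str) -> ipaddress.IPv6Network:
    """Build the 2002::/16-based 6to4 /48 prefix for a global IPv4 address."""
    packed = ipaddress.IPv4Address(v4).packed  # 4 bytes
    # 16 bits of 2002: + 32 bits of IPv4 + 80 zero bits = 128-bit address
    return ipaddress.IPv6Network((b"\x20\x02" + packed + b"\x00" * 10, 48))

print(to_6to4_prefix("192.0.2.4"))  # 2002:c000:204::/48
```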

So you have to ship new code to every 'network element' to support your "IPv4+" plan. Just like with IPv6.

So you have to update DNS to create new resource record types ("A" is hard-coded to 32 bits) to support the new longer addresses, and have all user-land code start asking for, using, and understanding the new record replies. Just like with IPv6. (A lot of legacy code did not have room in data structures for multiple reply types: sure you'd get the "A", but unless you updated the code to get the "A+" record (for "IPv4+" addresses) you could never get the longer address… just like IPv6 needed code updates to recognize AAAA, otherwise you were A-only.)

You need to update socket APIs to hold new data structures for longer addresses so your app can tell the kernel to send packets to the new addresses. Just like with IPv6. In any 'address extension' plan the legacy code cannot use the new address space; you have to:

* update the IP stack (like with IPv6)

* tell applications about new DNS records (like IPv6)

* set up translation layers for legacy-only code to reach extended-only destination (like IPv6 with DNS64/NAT64, CLAT, etc)

You're updating the exact same code paths in both the "IPv4+" and IPv6 scenarios: dual-stack, DNS, socket address structures, dealing with legacy-only code that is never touched to deal with the larger address space.

Deploying the new "IPv4+" code will take time, and there will be partial deployment. Partial deployment of IPv4+ is no different from partial deployment of IPv6: you have islands of it and have to fall back to the legacy plain-IPv4 protocol when the new protocol fails to connect:

* https://en.wikipedia.org/wiki/Happy_Eyeballs

"Just adding more bits" means updating a whole bunch of code (routers, firewalls, DNS, APIs, userland, etc) to handle the new data structures. There is no "just": it's the same work for IPv6 as with any other idea.

(This idea of "just add more addresses" comes up in every discussion of IPv6, and people do not bother thinking about what needs to change to "just" do it.)

> If IPv4 were more painfully broken then the switch would have happened long ago.

IPv4 is very painful for people not in the US or Western Europe that (a) were not there early enough to get in on the IPv4 address land rush, or (b) don't have enough money to buy as many IPv4 addresses as they need (assuming someone wants to sell them).

So a lot of areas of the world have switched, it's just that you're perhaps in a privileged demographic and are blind to it.


> IPv4 is very painful for people not in the US or Western Europe that (a) were not there early enough to get in on the IPv4 address land rush, or (b) don't have enough money to buy as many IPv4 addresses as they need (assuming someone wants to sell them).

The lack of pain is not really about the US & Western Europe having plenty of addresses, or anything of that nature; it's that alternative answers such as NAT and CG-NAT (i.e. double NAT, where the carrier uses non-public ranges for consumer connections) are still growing faster in those regions than IPv6 adoption, when excluding cellular networks (which have been pretty good about adopting IPv6 and are where most of the IPv6 traffic in those regions comes from).


I think your summary is really great. One of the better refutations I've seen about the "what about v4 but longer??" question.

However, I think people do get tripped up by the paradigm shift from DHCP -> SLAAC. That's not something that is an inevitable consequence of increasing address size. And compared to other details (e.g. the switch to multicasting, NDP, etc.), it's a change that's very visible to all operators and really changes how things work at a conceptual level.


The real friction with SLAAC was that certain people (particularly some at Google) tried to force it as the only option on users, not that IPv6 ever forced it as the only option. The same kind of thing would likely occur with any new IP version rolling out.

For comparison IPv4 had:

  - Static (1980 - original spec)
  - RARP   (1984 - standalone spec)
  - BOOTP  (1985 - standalone spec)
  - DHCP   (1993 - standalone spec)
And for IPv6:

  - Static (1995 - pre, 1998 final spec)
  - SLAAC  (1996 - pre standalone, 1998 final standalone)
  - DHCPv6 (2003 - standalone)
Some of these have had subsequent minor updates, e.g. DHCP was updated in 1997 and so on.

SLAAC isn't something that is an inevitable consequence of increasing address size, it's something that is a useful advantage of increasing address size. Almost no one had big enough blocks in IPv4 where "just choose a random address and, as long as no one else seems to be currently claiming it, it is yours" was a viable strategy for assigning an address.
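A back-of-the-envelope birthday-bound calculation shows why this strategy only becomes viable at IPv6 scale (a sketch; the function is just the standard approximation, not anything from the RFCs):

```python
import math

def collision_probability(hosts: int, bits: int) -> float:
    """Birthday-bound estimate that at least two hosts pick the same
    random identifier out of 2**bits possibilities."""
    return -math.expm1(-hosts * (hosts - 1) / (2 * 2 ** bits))

# A /64 leaves 64 bits for the interface ID: even a million hosts on one
# link collide with negligible probability. With only 8 free bits (a small
# IPv4 subnet), a couple hundred hosts collide almost certainly.
print(collision_probability(1_000_000, 64))  # ~2.7e-8
print(collision_probability(200, 8))         # ~1.0
```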

There are some nice benefits of SLAAC over DHCP such as modest privacy: if device addresses are randomized they become harder to guess/scan; if there's not a central server with a registration list of every device, even more so (the first S, Stateless). That's a great potential win for general consumers and a far better privacy strategy than NAT44's accidental (and somewhat broken) privacy screening.

It's at odds with corporate device management strategies where top-down assignment "needs to be the rule" and device privacy is potentially a risk, but that doesn't make SLAAC a bad idea. It just reinforces the obvious realization that consumer needs and big corporate needs are very different styles of sub-networks of the internet, and they conflict a bit. (Also, those conflicting interests are why consumer equipment is leading the vanguard to IPv6 and corporate equipment is languishing behind in command-and-control IPv4 enclaves.)


DHCPv6 now exists and every OS except Android supports it.

> except Android

That alone is significant.

Furthermore, DHCPv6 holds you back from various desirable things like privacy addresses and (arguably even more importantly) IPv6 Mostly.


> Furthermore, DHCPv6 holds you back from various desirable things like privacy addresses and (arguably even more importantly) IPv6 Mostly.

Why would DHCPv6 hold back privacy addresses? Can't DHCPv6 servers generate random host address bits and assign them in DHCP Offer packets? Couldn't clients generate random addresses and put them in Request packets?

See perhaps OPTION_IA_TA (Temporary Address):

* https://datatracker.ietf.org/doc/html/rfc8415#section-21.5

* https://en.wikipedia.org/wiki/DHCPv6#Option_Codes

    DHCPv6 temporary addresses have the same properties as SLAAC
    temporary addresses (see Section 4.6).  On the other hand, the
    properties of DHCPv6 non-temporary addresses typically depend on the
    specific DHCPv6 server software being employed.  Recent releases of
    most popular DHCPv6 server software typically lease random addresses
    with a similar lease time as that of IPv4.  Thus, these addresses can
    be considered to be "stable, semantically opaque".  [DHCPv6-IID]
    specifies an algorithm that can be employed by DHCPv6 servers to
    generate "stable, semantically opaque" addresses.
* https://datatracker.ietf.org/doc/html/rfc7721#section-4.7

How does DHCPv6 hold back IPv6-mostly? First, most clients will send out a DHCPv4 request in case IPv4 is the only option, in which case IPv6-mostly can be signalled:

* https://datatracker.ietf.org/doc/html/rfc8925

And hosts would also have to send out an IPv6 RS, and the RA can signal IPv6-mostly:

* https://datatracker.ietf.org/doc/html/rfc8781

* https://datatracker.ietf.org/doc/html/draft-ietf-v6ops-6mops...


> See perhaps OPTION_IA_TA (Temporary Address):

I was unaware of this, so thanks. Sounds like it addresses (pun intended) my concern.

> How does DHCPv6 hold back IPv6-mostly? First, most clients will send out a DHCPv4 request in case IPv4 is the only option, in which case IPv6-mostly can be signalled

It's not the signalling that's the problem--it's the configuration of the CLAT which requires SLAAC, afaiu. This is in fact the subject of the latest IPv6 Buzz podcast episode: https://packetpushers.net/podcasts/ipv6-buzz/ipb197-slaac-an...


> It's not the signalling that's the problem--it's the configuration of the CLAT which requires SLAAC, afaiu.

This operational difficulty has been recognized and alternatives are being put forward:

* https://datatracker.ietf.org/doc/html/draft-ietf-v6ops-clato...


We don't need to be communicative at all times. But don't romanticize it either; we did what you say because we had to, whether we wanted to or not. Not having any chance of correcting course or being more flexible is not a cool thing of the past, it's a limitation of how things were. That you find comfort in it is a different thing from it being better or worse... it just was.

As if it would make sense that spending 2 hrs relaxing on the beach or gardening your orchids would cost you $400. Money not made is not money spent. If you were doing a hobby project for learning, you weren't going to be working during that time anyway, so your hourly rate doesn't matter.

What a messy and, frankly, absurd situation to be left in. To fork a project in order to provide a tool through PyPI, only to then stop updating it on a broken version. That's more a disservice than a service to the community... If you're going to stay stuck, better to drop the broken release and stay stuck on the previous working one.

A nitpick to your nitpick: they said "memory location". And yes, a pointer always points to a memory location. Notwithstanding that each particular region of memory locations could be mapped either to real physical memory or any other assortment of hardware.

No. Neither in the language (NULL exists) nor necessarily on real CPUs.

NULL exists on real CPUs. Maybe you meant nullptr which is a very different thing, don't confuse the two.

I don't agree. Null is an artefact of the type system and the type system evaporates at runtime. Even C's NULL macro just expands to zero which is defined in the type system as the null pointer.

Address zero exists in the CPU, but that's not the null pointer, that's an embarrassment if you happen to need to talk about address zero in a language where that has the same spelling as a null pointer because you can't say what you meant.


NULL doesn't expand to zero on some weird systems. These days zero is special on most hardware, so having zero and nullptr be the same is important - even though on some of them zero is also a legal address.

Historically C's null pointer literal, provided as the pre-processor constant NULL, is the integer literal 0 (optionally cast to a void pointer in newer standards) even though the hardware representation may not be the zero address.

It's OK that you didn't know this if you mostly write C++, and somewhat OK even if you mostly write C but stick to pre-defined stuff like that NULL constant; but if you write important tools in or for C, this was a pretty important gap in your understanding.

In C23 the committee gave C the C++ nullptr constant, and the associated nullptr_t type, and basically rewrote history to make this entire mess, in reality the fault of C++ now "because it's for compatibility with C". This is a pretty routine outcome, you can see that WG14 members who are sick of this tend to just walk away from the committee because fighting it is largely futile and they could just retire and write in C89 or even K&R C without thinking about Bjarne at all.


You can point to a register which is certainly not memory.

Whenever you have this kind of impressions on some development, here are my 2 cents: just think "I'm not the target audience". And that's fine.

The difference between 2ms and 0.2ms might sound unneeded, or even silly, to you. But somebody, somewhere, is doing stream processing of TB-sized JSON objects, and they will care. This news is for them.
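The shape of that kind of workload can be sketched with stdlib tools: process records one at a time so memory stays flat, and the per-record parse cost becomes the whole game (a toy sketch over hypothetical newline-delimited JSON; real pipelines use jq, Spark, etc.):

```python
import io
import json

# Hypothetical NDJSON stream; at TB scale this would be a file or socket.
stream = io.StringIO(
    '{"id": 1, "ok": true}\n'
    '{"id": 2, "ok": false}\n'
    '{"id": 3, "ok": true}\n'
)

# One json.loads per record: with billions of records, shaving the
# per-parse cost from 2ms to 0.2ms is the difference of many CPU-days.
matching = [rec["id"] for line in stream
            if (rec := json.loads(line)).get("ok")]
print(matching)  # [1, 3]
```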


I remember when I was coming up on the command line and I'd browse the forums at unix.com. Someone would ask how to do a thing and CFAJohnson would come in with a far less readable solution that was more performant (probably replacing calls to external tools with Bash internals, but I didn't know enough then to speak intelligently about it now).

People would say, "Why use this when it's harder to read and only saves N ms?" He'd reply that you'd care about those ms when you had to read a database from 500 remote servers (I'm paraphrasing. He probably had a much better example.)

Turns out, he wrote a book that I later purchased. It appears to have been taken over by a different author, but the first release was all him and I bought it immediately when I recognized the name / unix.com handle. Though it was over my head when I first bought it, I later learned enough to love it. I hope he's on HN and knows that someone loved his posts / book.

https://www.amazon.com/Pro-Bash-Programming-Scripting-Expert...


Wow that takes me back. I used to lurk on unix.com when I was starting with bash and perl and would see CFAJohnson's terse one-liners all the time. I enjoyed trying my own approaches to compare performance, conciseness and readability - mainly for learning. Some of the awk stuff was quite illuminating in my understanding of how powerful awk could be. I remember trying different approaches to process large files at first with awk and then with Perl. Then we discovered Oracle's external tables which turned out to be clear winner. We have a lot more options now with fantastic performance.


Why are half the forum posts on there all about AI? Yikes


Also, as someone who looks at latency charts too much: what happens is that a request does a lot in series, and any little ms you can knock off adds up. You save 10ms by saving 10 x 1ms. And if you are a proxy-ish service, then you are 10ms in a chain that might be taking 200 or 300ms. It is like saving money: you have to cut lots of small expenses to make an impact (unless you move, etc., but once you've done that it's the numerous small things that add up).

Also, performance improvements on heavily used systems unlock:

- Cost savings

- Stability

- Higher reliability

- Higher throughput

- Fewer incidents

- Lower scaling-out requirements


Wait, what? I don't get why performance improvement implies reliability and incident improvement.

For example, doing dangerous things might be faster (no bounds checks, weaker consistency guarantees, etc.), but that clearly tends to be a reliability regression.


First, if a performance optimization is a reliability regression, it was done wrong. A bounds check is removed because something somewhere else is supposed to already guarantee it won't be violated, not just in a vacuum. If the guarantee stands, removing the extra check makes your program faster and there is no reliability regression whatsoever.

And how does performance improve reliability? Well, a more performant service is harder to overwhelm with a flood of requests.


"Removing an extra check", so there is a check, so the check is not removed?


It does not need to be an explicit check (i.e. a condition checking that your index is not out of bounds). You may structure your code in such a way that it becomes a mathematical impossibility to exceed the bounds. For a dumb trivial example, you have an array of 500 bytes and are accessing it with an 8-bit unsigned index - there's no explicit bounds check, but you can never exceed its bounds, because the index may only be 0-255.

Of course this is a very artificial and almost nonsensical example, but that is how you optimize bounds checks away - you just make it impossible for the bounds to be exceeded through means other than explicitly checking.
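The parent's "index type can't exceed the bounds" example can be mirrored even in Python (an illustrative sketch: each element of a `bytes` object is guaranteed to be 0..255 by construction):

```python
# A 256-entry table indexed by bytes: the index range is a property of the
# type, so the lookup can never go out of bounds, with no explicit bounds
# check anywhere in the loop.
table = [n * n for n in range(256)]

def checksum(data: bytes) -> int:
    total = 0
    for b in data:          # b is provably in range(256)
        total += table[b]   # in-bounds by construction
    return total

print(checksum(b"\x00\x01\x02\xff"))  # 0 + 1 + 4 + 65025 = 65030
```

An optimizing compiler for a statically typed language can exploit exactly this kind of invariant to elide the check entirely.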


Some come directly, as other commenters touch on: you're less likely to saturate the CPU quickly, and a lower cost to run means you can afford more headroom.

But also the stuff you tend to do to make it fast makes it more reliable.

Local caches reduce network traffic. Memory is more reliable than network IO so it improves reliability.

Reducing lookup calls to other services (e.g. by supplying context earlier in the dependency chain) makes it faster and more reliable.

Your code will probably branch less and become more predictable too.

And often the code is simpler (sometimes not when a performance hack is used)


Fewer OOMs, fewer timeouts, fewer noisy-neighbor problems affecting other apps


But even in this example, the 2ms vs 0.2ms is irrelevant - what matters is the timings for TB-sized objects.

So why not compare that case directly? We'd also want to see the performance of the assumed overheads, i.e. how it scales.


Which is fine, but the vast majority of the things that get presented aren’t bothering to benchmark against my use (for a whole lotta mes). They come from someone scratching an itch and solving it for a target audience of one and then extrapolating and bolting on some benchmarks. And at the sizes you’re talking about, how many tooling authors have the computing power on hand to test that?


> "somebody, somewhere, is doing stream processing of TB-sized JSON objects"

That's crazy to think about. My JSON files can be measured in bytes. :-D


Well, obviously that would happen mostly only at the biggest business scales, or maybe in academic research. One example from Nvidia showcases Apache Spark with GPU acceleration processing "tens of terabytes of JSON data":

https://developer.nvidia.com/blog/accelerating-json-processi...


All files can be measured in bytes. :)


You, sir or ma'am, are a first class smarty pants.


Who is the target audience? I truly wonder who will process TB-sized data using jq? Either it's in a database already, in which case you're using the database to process the data, or you're putting it in a database.

Either way, I have really big doubts that there will be ever a significant amount of people who'd choose jq for that.


There was a thread yesterday where a company rewrote a similar JSON processing library in Go because they were spending $100,000s on serving costs using it to filter vast amounts of data: https://news.ycombinator.com/item?id=47536712


That's a really great perspective. Thanks for sharing!


For that you need a very centralized VCS, not a decentralized one. Perforce allows you to lock a file so everybody else cannot make edits to it. If they implemented more fine-grained locking within files, or added warnings to other users trying to check them out for edits, they'd be just where you want a VCS to be.

How, or better yet, why would Git warn you about a potential conflict beforehand, when the use case is that everyone has a local clone of the repo and might be driving it towards different directions? You are just supposed to pull commits from someone's local branch or push towards one, hence the wording. The fact that it makes sense to cooperate and work on the same direction, to avoid friction and pain, is just a natural accident that grows from the humans using it, but is not something ingrained in the design of the tool.

We're collectively just using Git for the silliest and simplest subset of its possibilities (a VCS with a central source of truth), while bearing the burden of complexity that comes with a tool designed for distributed workflows.


It's fully caused by management mindset. There are companies that are investing hard in the AI trend, but the message is clear: all code pushed is your ultimate responsibility, and if it lacks quality or causes problems, you're on the hook for it; using AI hasn't changed that.

So if Spotify had a modicum of AI usage hygiene, plus accountability expectations for code quality, this would still mean a bad performance review for whoever introduced this issue (person or team; poor results and mistakes are never something that come from a single source)


Spotify has no performance review process or any sort of performance management. I never heard of anyone getting PIPed there in the many years I was there.


Well, thanks. That small web page just taught me, in a very concise way, a thing or two about bicycle braking technique!

