> the reference implementation of language is slow
Despite its content, this blogpost also pushes this exact "language slow" thinking in its preamble. I don't think nearly enough people read past introductions for that to be a responsible choice or a good idea.
The only thing worse than this is when Python specifically is outright taught (!) as an "interpreted language", as if an implementation detail like that were somehow a language property. So grating.
While I sympathize (and have said similar in the past), language design can (and in Python's case certainly does) hinder optimization quite a bit. The techniques that are purely "use a better implementation" get you not much further than PyPy. Further benefits come from cross-compilation that requires restricting access to language features (and a system that can statically be convinced that those features weren't used!), or indeed straight up using code written in a different language through an FFI.
But yes, the very terminology "interpreted language" was designed for a different era and is somewhere between misleading and incomprehensible in context. (Not unlike "pass by value".)
Absolutely, no doubt about that. I just find it a terrible angle to approach from, both in general and specifically in this case: swapping out CPython for PyPy, GraalPy, Taichi, etc. - as per the post - requires no code changes, yet results in leaps-and-bounds faster performance.
If switching runtimes yields, say, 10x perf, and switching languages yields, say, 100x, then the language on its own was "just" a 10x penalty. Yet the presentation is "language is 100x slower". That's my gripe. And these are apparently conservative estimates as per the tables in the OP.
Not that measuring "language performance" with numbers would be a super meaningful exercise to begin with, but still. The fact that most people just go with CPython does not escape me either. I do wonder, though, if people would shop for alternative runtimes more if the common culture were more explicitly and dominantly concerned with the performance of implementations rather than of languages.
The problem is a lot of software is written to be run by people other than the developer, and usually you don't get a say in the implementation in those cases.
"Reason" is an "in the eye of the beholder" type human thing. They're taking it in the most tortured sense, because under sufficient pressure that's "exactly" what happens anyways. It sounds silly until everything you touch is 20 indirections away.
I don't know about ants, but after a refresher on people's favorite fruit fly, I'd be hard pressed to be so dismissive - 200K seems to be plenty: https://news.ycombinator.com/item?id=47302051
I encourage you to look up what is known about fruit flies' behavior.
The reason it's probably nevertheless not as messed up as people might assume is specifically that it's an organoid, not an actual brain. Which is to say, it has the numbers but not the performance, not by a long shot.
> Surely it makes no difference
It absolutely should, though specifically with organoids, I guess it might not. Ironically, I would expect the ethics angle to be actually worse with small animals. The size of the organoid will be closer to the real thing comparatively, after all, so more chances of it gaining whatever level of sentience the actual organism has.
But then this will be heavily muddled by what people believe consciousness is and whether or how humans are special, I suppose.
Yes *, and in the real world. The question then is if you rate that to be an equivalent existential horror to being a varyingly maldeveloped, malnourished, disembodied version of those mice, forced to live out life in a low-fidelity version of the Matrix [0], potentially in constant or recurring agony. You get a potential match or approximate match in cognitive ability and operation, but with a very different set of circumstances.
* They kinda do have a problem with that too, that's why ethics committees exist, and why the term "animal testing" pops up in the news cycle every so often.
> We know this is only 200,000 neurons. Dogs have 500 million. Humans have billions. But where is the line for sentience, awareness?
Check out the venerable fruit fly (Drosophila melanogaster) and its known lifecycle and behavioral traits. They're a high-profile neuroscience research target, I believe; their connectome being fully mapped made the news pretty hard a few years ago.
Fruit flies have ~140,000 neurons.
The catch is that these brain-on-a-substrate organoids are nothing like actual structured, developed brains. They're more like randomly wired-together transistors than a proper circuit, to use an analogy.
So even though by the numbers they'd definitely have the potential to be your nightmare fuel, I'd be surprised if they're anywhere close in actuality.
Anyone here using semantic diffing tools in their daily work? How good are they?
I use some for e.g. YAML [0] and JSON [1], and they're nice [2], but these are comparatively simple languages.
I'm particularly curious because just plain diffing ASTs is more on the "syntax-aware diffing" side rather than the "semantic diffing" side, yet most semantic tooling descriptions stop at saying they use ASTs.
ASTs are not necessarily in a minimal / optimized form by construction, I believe, so I'm pretty sure you'll have situations where a "semantic" differ will report a difference even though a compiler would still compile the given translation unit to the same machine code after all the optimization passes of later stages. Not even necessarily for target-platform-dependent reasons.
But maybe this doesn't matter much or would be more confusing than helpful?
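To make that gap concrete with Python itself: CPython constant-folds on the AST during compilation, so two sources whose ASTs differ can still compile to identical bytecode. An AST-level "semantic" differ would flag a change here that the compiler erases:

```python
import ast

# Two sources that mean the same thing to the compiler:
src_a, src_b = "x = 1 + 1", "x = 2"

# Their ASTs differ (a BinOp node vs. a plain Constant)...
assert ast.dump(ast.parse(src_a)) != ast.dump(ast.parse(src_b))

# ...yet CPython folds 1 + 1 at compile time, so the emitted
# bytecode and constants are identical.
code_a = compile(src_a, "<a>", "exec")
code_b = compile(src_b, "<b>", "exec")
assert code_a.co_code == code_b.co_code
assert code_a.co_consts == code_b.co_consts
```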
[2] they allow me to ignore ordering differences within arrays (arrays are ordered in YAML and JSON as per the standard), which I found to be a surprisingly rare and useful capability; the programs that consume the YAMLs and JSONs I use these on are not sensitive to these ordering differences
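For illustration, the ignore-array-ordering mode from footnote [2] can be sketched in a few lines of Python (`normalize` is a hypothetical helper for the idea, not how any particular tool implements it):

```python
import json

def normalize(value):
    """Recursively sort arrays so element order is ignored.
    NOTE: this deliberately breaks the JSON/YAML spec (arrays are
    ordered); it's only safe when the downstream consumers are
    known to be order-invariant."""
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items()}
    if isinstance(value, list):
        # canonical ordering via a stable serialized form
        return sorted((normalize(v) for v in value),
                      key=lambda v: json.dumps(v, sort_keys=True))
    return value

a = json.loads('{"n": 1, "tags": ["b", "a"]}')
b = json.loads('{"tags": ["a", "b"], "n": 1}')
assert normalize(a) == normalize(b)  # "no diff" under this mode
```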
Fair point on AST vs semantic. sem sits somewhere in between. It doesn't go as far as checking compiled output equivalence, but it does normalize the AST before hashing (we call it structural_hash), so purely cosmetic changes like reformatting or renaming a local variable won't show as a diff. The goal isn't "would the compiler produce the same binary" but "did the developer change the behavior of this entity." For most practical cases that's the useful boundary. The YAML/JSON ordering point is interesting, we handle JSON keys as entities so reordering doesn't conflict during merges.
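For anyone curious what normalize-then-hash looks like in practice, here's a toy Python version (illustrative only, not sem's actual structural_hash): formatting and local-name choices are erased before hashing, so only structural changes alter the digest:

```python
import ast
import hashlib

def structural_hash(src: str) -> str:
    """Hash a normalized AST: whitespace vanishes at parse time,
    and identifiers are canonicalized by order of first appearance,
    so renaming a local variable doesn't change the digest."""
    tree = ast.parse(src)
    names = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            node.id = names.setdefault(node.id, f"v{len(names)}")
        elif isinstance(node, ast.arg):
            node.arg = names.setdefault(node.arg, f"v{len(names)}")
    return hashlib.sha256(ast.dump(tree).encode()).hexdigest()

# Cosmetic changes (rename + reformat) hash the same...
h1 = structural_hash("def f(x):\n    total = x + 1\n    return total")
h2 = structural_hash("def f(y):\n  acc = y + 1\n  return acc")
assert h1 == h2

# ...while a behavioral change does not.
assert h1 != structural_hash("def f(x):\n    return x + 2")
```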
Regarding the custom normalization step, that makes sense, and I don't really have much more to add either. Looked into it a bit further since; it seems that specifically with programming languages the topic gets pretty gnarly pretty quickly, for various language theory reasons. So the solution you settled on is understandable. I might spend some time comparing various semantic diffing tools; I'd imagine they probably aim for something similar.
> The YAML/JSON ordering point is interesting, we handle JSON keys as entities so reordering doesn't conflict during merges.
Just to clarify, I specifically meant the ordering of elements within arrays, not the ordering of keys within an object. The order of keys in an object is relaxed as per the spec, so normalizing across that is correct behavior. What I'm doing with these other tools is technically a spec violation, but since I know that downstream tooling is explicitly order invariant, it all still works out and helps a ton. It's pretty ironic too, I usually hammer on about not liking there being options, but in this case an option is exactly the right way to go about this; you would not want this as a default.
Ah right, array ordering not key ordering. That's a different beast. You're making a deliberate semantic choice because you know your consumers are order-invariant. We can't really do that at our level since function ordering in code is usually meaningful to the language. Your use case needs domain knowledge about the consumer, which is exactly why an option makes sense there.
If you do end up comparing semantic toolings I'd love to hear what you find. The space is weirdly fragmented between syntax-aware, normalized-AST, and domain-specific (dyff/jd). Everyone calls it "semantic" but they're solving pretty different problems.
I guess I don't understand the difference between semantic and syntax-aware, but I've been trying out difftastic which is a bit of an odd beast but does a great job at narrowing down diffs to the actual meaningful parts.
difftastic is solid. The difference is roughly: syntax-aware (difftastic) knows what changed in the tree, sem knows which entity changed and whether it actually matters. difftastic will show you that a node in the AST moved. sem will tell you "the function processOrder was modified, and 3 other functions across 2 files depend on it." difftastic is a better diff. sem is trying to be a different layer on top of git entirely.
Not all of us have cell plans with hotspots ($$$), hotspots often have data caps, cell is often slower or congested, and there are some areas without cell signal. It's also kind of silly from a wider perspective to shove everyone onto the cellular network when most businesses have perfectly decent fiber internet nowadays.
Sure, I'm usually on hotspot, but I personally appreciate when businesses have wifi. Either way, there are always going to be shared networks somewhere.
What we should actually be doing is WiFi using SIM cards as authentication.
Have it count against your data cap (but make it much cheaper than cellular data). Pay part of that revenue to hotspot-owning businesses. If something bad happens, use the logs that telecoms are already required to keep.
It's very strange to me that we don't have something like this already.
How about we don't? We really don't need to tie even more things to SIM cards and phone numbers.
Criminals have more than enough ways to still get anonymous SIM cards (at least until every country on the planet makes KYC mandatory for prepaid SIMs), and legitimate users are greatly inconvenienced by this.
> Pay part of that revenue to hotspot-owning businesses.
To subsidize a network connection they probably already need for their business operations, e.g. their payment terminal or POS? Why should I? The marginal cost of an incremental byte on wired Internet connections is basically zero, these days. It's literally too cheap to meter, so why bother?
Besides the centralization and tracking concerns, not nearly every device has a SIM card. Why does my laptop not deserve to access a coffee shop Wi-Fi, my Kindle to use an in-flight connection, or my smartwatch to use the gym's network for podcasts?
It's very strange to me that people keep trying to willingly ruin the open Internet.
I live in a country that has mandatory SIM registration, and it's stopping exactly zero organized criminals – these can just pay a tiny bit more and buy burner phones and use out-of-country SIM cards – while it's making life more complicated and expensive for the average citizen.
Expensive because KYC isn't cheap, and guess who pays for that in the end... And that is assuming that your form of ID is even accepted as a foreigner. In a different country, I literally just spent two days sending back and forth selfies holding my passport(!) to little success. And I guess the customer support reps could now just use the same photos to impersonate me elsewhere, since passport photos provide absolutely zero domain binding and are just about the dumbest thing still seeing widespread adoption.
I don't often use registration-free public Wi-Fis, but I love that they exist, and I would hate if they'd be taken away too. I also just transited at an airport that requires passport scans for Wi-Fi usage, and it feels so backwards.
Thanks for being honest about this, though. I was always wondering who all these people were that are seriously in favor of all this dystopian stuff. Would love to hear why you think that it's a net positive for society.
> What an incredibly short-sighted, dystopian view.
You do recognize that the person I kept replying to was not asking these questions in earnest, right? They were all carefully directed questions, specifically designed to confirm their world view. I played into it, because I think they're pitiful and hilarious. Serves them right. Their latest question about government criticisms completes the caricature perfectly. All they're missing is referencing or quoting Orwell.
> I live in a country that has mandatory SIM registration, and it's stopping exactly zero organized criminals – these can just pay a tiny bit more and buy burner phones and use out-of-country SIM cards – while it's making life more complicated and expensive for the average citizen.
Pretty much the same here, to my understanding. There's no credible evidence I'm aware of that'd suggest the criminal use of phone networks decreased significantly thanks to these measures. It might have improved the exhaustion rate of the numbering pool, but I don't think we were particularly close to exhausting it anyways. The most benefit I can think of is a chance at traceability, but how well realized vs abused that is, no idea. Just like with the IP leasing described in the article above, enlisting the help of SIM mules has a long-standing tradition, after all.
Any addressing system that relies on non-cryptographic identifiers will be prone to all kinds of mass misuse. There's no amount of lawmaking, honest or not, that could be implemented to counteract these. It's just like email.
> Thanks for being honest about this, though.
Except I really wasn't, and I find it both remarkably funny but also extremely concerning how on board you guys are with it. Propaganda and culture sure are powerful.
The current ways of identity verification are broken, and are prone to enable surveillance: this is something I fully recognize. What I refuse to recognize however is that the concept of identity verification would be wrong wholesale. There was another thread on here a few days ago that I did comment on, but the bottom line is, in my understanding there's no mathematical reason that things would have to be this way. Its shortcomings, including its enablement of mass surveillance, are an implementation issue, not something fundamental to the idea per se.
Being able to trust that a stranger you're talking to is
- an actual specific person
- is actually a stranger
are bottom-of-the-barrel human expectations that communications technology has completely shattered. Technologically guaranteeing these, to the extent the analog hole problem allows for it, does not require dystopian practices. I'm confident that the lack of these guarantees is the root of many societal problems we see at large today. For better or worse, a lot of people live a lot of their lives on the internet these days, but the internet is no hospitable place for them, not least for these exact reasons.
Accountability is a good thing. I refuse to let it be monkey paw-d by people who mean unwell into being recognized as a tool for evil, and I think you should too. Trust being abused by a centralized system does not mean trust is wrong. It means there are abusers at the wheel. The solution is not mistrust, or even systems that require less trust necessarily, although both can be useful. The solution is reworking the system to get more trustworthy people into the leading positions, and to make it so that those who have demonstrated to be not deserving are thrown out more readily. It is most unfortunate that this listing is ordered exactly by difficulty, from easiest to hardest. Trust is easily broken, and human systems are impossibly hard to get right. I don't think this justifies giving up though.
My profile is not blank. You can page through all my comments, posts, and favorites to your liking.
Did you actually bother to understand what I said by the way? Are you able to formulate a post that isn't just a bare minimum asinine rhetorical question?
> The current ways of identity verification are broken, and are prone to enable surveillance: this is something I fully recognize. What I refuse to recognize however is that the concept of identity verification would be wrong wholesale. There was another thread on here a few days ago that I did comment on, but the bottom line is, in my understanding there's no mathematical reason that things would have to be this way. Its shortcomings, including its enablement of mass surveillance, are an implementation issue, not something fundamental to the idea per se.
Put into more exact terms, your way of wanting to verify my identity is the same one you criticize governments and businesses for doing. It is not one I think is a good idea either, despite how you're trying to present this. I just retain the opportunity for there being other, better ways, whereas you don't.
Mind you, there's no reason to think that those who do publish such information do it because they're here to champion accountability. Note the type of forum this was originally supposed to be. It's in part a place for self-advertising. Many contact details you find on bios are visibly and explicitly HN specific.
Is even that needed? Nothing e2ee about the emails you receive normally, they could just read them right away if they really wanted to. And that is to say nothing about the metadata.
I keep reading about how IoT / wearables / smart home devices are routinely both vulnerable and exploited, if not even come with malware preinstalled, so I was curious to finally go through a primary source like this.
After skimming through the attacks performed in this research, and checking every mention of the word "internet", all I got was a section with a hypothetical scenario where the watch has a publicly reachable IPv4 address. Suffice it to say, that is really quite unlikely, certainly in my experience at least.
It did also talk about bundled malware, so I guess that's bad enough, but is all IoT research like this? Always sounded to me like you kinda need to already have a foot in the door for these, and this paper didn't dispel that notion for me at all.
Many of the great hacks have involved breaking through 2 layers of supposed security. You break into the 3D printer, which lets you send packets on the local network. Then you use that to break into the exercise bike, which has a camera because it's based on a generic tablet.
Either vendor might see the flaw as low-severity. So what if someone can send packets? So what if someone already on the local network can hack the camera? But combine them and you're pwned.
"You're safe as long as every device on the network you're on is safe" isn't safe.
In theory I should be able to take a modern browser/device over a completely compromised router and either be safe, or have my device tell me "holy shit, something is wrong".
The days of local trust should be long gone by now.
> a hypothetical scenario where the watch has a publicly reachable IPv4 address
Or one of your other IoT / smart home devices / malware on your PC is doing local network reconnaissance? Connecting this device to a public wifi? Or just a bad neighbour who hijacks your SSID? This smells of "I'm secure because I'm behind a NAT" which conveniently ignores the couple dozen other paths an adversary could take.
I can materialize that smell for you, you're indeed more secure because you're behind NAT. Admitting this does not necessarily entail:
- suggesting that it's a good security solution
- suggesting that it's a security solution to begin with
- suggesting that it somehow prevents all avenues of remote exploitation
What it does do is make these stories sound a lot less dramatic. Because no, John Diddler is not going to be able to just hop on and get into your child's smartwatch to spy on them from the comfort of their home on the other side of the world at a whim, unlike these headlines and articles suggest at a glance. Not through the documented exploitation methods alone anyways, unless my skim reading didn't do the paper justice.
Remaining remote exploitation avenues do include however:
- the vendor getting compromised, and through it the devices pulling in a malicious payload, making them compromised (I guess this kinda either did happen or was simulated in the paper, but this is indirect and kind of benign anyways; you implicitly trust the vendor every time you apply a software update since it's closed source)
- the vendor being a massive (criminal?) doofus and just straight up providing a public or semi-public proxy endpoint, with zero or negligent auth, through which you can on-demand enumerate and reach all the devices (this is primarily the avenue I was expecting, as there was a car manufacturer I believe who did exactly this)
- peer to peer networking shenanigans: not sure what's possible there, can't imagine there not being any skeletons in the closet, would have been excited to learn more
List not guaranteed complete. But this is the kinda stuff I'd be expecting when I see these headlines.
Sure. Or you might step out the door and a fridge falls on you. Equally likely.
Yes, it's an exploit. It should be fixed. But the endless hyperventilating over fringe exploits mostly has the effect that people now ignore all security conversations.
The source site/paper won't load for me at this time, but if the device has a cellular modem in it for network connectivity, it will 100% be assigned an IPv4 address from the carrier. Unless this device is using an APN at the carrier level, or is using a SIM provider that provides some additional security.
Sure, but that’s increasingly likely to be a private IPv4 address as a result of carrier-grade NAT:

> Carrier-grade NAT (CGN or CGNAT), also known as large-scale NAT (LSN), is a type of network address translation (NAT) used by Internet service providers (ISPs) in IPv4 network design. With CGNAT, end sites, in particular residential networks, are configured with private network addresses that are translated to public IPv4 addresses by middlebox network address translator devices embedded in the network operator's network, permitting the sharing of small pools of public addresses among many end users. This essentially repeats the traditional customer-premises NAT function at the ISP level.
You must not be in the United States. Here, regular home cable/fiber internet ISPs usually assign a (dynamic) public IPv4 address to your router. Your cellular internet connection is usually behind CGNAT, both on your phone and on the new home wireless internet from the cellular providers, but regular home cable/fiber is the most common home internet type.
So I agree that the watch would likely be behind NAT (for IPv4), I just disagree with the statement that ISPs usually put their customers behind cgnat.