Hacker News | lmeyerov's comments

Speaking of embeddable, we just announced Cypher syntax for GFQL, making it the first OSS CPU/GPU Cypher query engine you can use on dataframes.

It's typically used alongside scale-out DBs like Databricks & Splunk for analytical apps: security/fraud/event/social data analysis pipelines, ML+AI embedding & enrichment pipelines, etc. We originally built it to fill the compute-tier gap for Graphistry users building embeddable interactive GPU graph viz apps and dashboards who didn't want to add an external graph DB phase to their interactive analytics flows.

A single GPU can do 1B+ edges/s, with no DB install needed, and it works straight on your dataframes / Apache Arrow / Parquet: https://pygraphistry.readthedocs.io/en/latest/gfql/benchmark...
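
For readers unfamiliar with the dataframe-native model: this isn't GFQL itself, just a plain-pandas sketch of the kind of traversal such an engine vectorizes, where a one-hop step reduces to a merge against the edge table:

```python
import pandas as pd

# Illustrative edge list as a plain dataframe (no DB install needed)
edges = pd.DataFrame({
    "src": ["a", "a", "b", "c"],
    "dst": ["b", "c", "c", "d"],
})
start = pd.DataFrame({"node": ["a"]})

# One forward hop from the start set is just an inner merge on the edge table
hop1 = edges.merge(start, left_on="src", right_on="node")["dst"].unique()
print(sorted(hop1))  # -> ['b', 'c']
```

A GPU engine runs the same relational plan over columnar (Arrow) buffers, which is why it can skip the separate graph-DB tier.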

We took a multilayer approach to the GPU & vectorization acceleration, including a more parallelism-friendly core algorithm. This makes fancy features pay-as-you-go instead of dragging everything down, as in most of the columnar engines now appearing. Our vectorized core already conforms to over half of the TCK, and we're adding the trickier bits at different layers now that the flow is established.

The core GFQL engine has been in production for a year or two now with a lot of analyst teams around the world (NATO, banks, US gov, ...) because it is part of Graphistry. The open-source Cypher support is our start at making it easy for others to use directly as well, including LLMs :)


*legal in the US

Apple and Google are facilitating the data sales

Specifically, these big companies revenue-share with app companies, who in turn increase monetization by selling your private information, especially via free apps. In exchange for their super-high app store rake, Apple et al. claim to run security vetting programs and ToS enforcement that vet who they do business with, and they tell users & courts that things are safe even when they know they're not.

It's not rocket science for phone OSes to figure out who these companies are and, since iOS/Android users already get tracked by Apple/Google/etc., triangulate which apps are participating.


I'm game for throwing rocks at Apple and Google, but I don't get this one.

> consumer apps embed ad SDKs → those SDKs feed location signals into RTB ad exchanges → surveillance-oriented firms sit in the RTB pipeline and harvest bid request data even without winning auctions

Would you ban ad supported apps? Assuming the comment you're responding to is realistic, I'm not sure how the OS is to blame.


Neither of the big players has sufficiently fine-grained permissions. This sets users up to give away more data than they realize.

One clear example: an app needs a permission once for setup, but the grant then remains persistent.

An easy demonstration is just looking at what GrapheneOS has done. It's open source, and you want to say Google can't protect its users better? Certainly Graphene has some advanced features, but not everything can be dismissed so easily. Besides, just put advanced features behind a hidden menu (which they already have!). There's no reason you can't make most users happy while also catering to power users (they'll always complain, but that's their job).

https://grapheneos.org/features


> Would you ban ad supported apps?

There's no need to ban ad-supported apps when you can just ban the practice of targeting ads at users based on individual characteristics.


You trust the adtech companies to pinky promise to totally not do that anymore?

How about jailing the CEOs of companies who do this?

I'm not sure that's how corporate blame works. The CEO signed off on the CIO's proposal to streamline data analytics logs via WeTotallyWontSiphonOffYourDataAndSellIt Incorporated, for user-improvement purposes, which happens to be owned by the CFO's brother-in-law. How were the CIO and CEO to know that a third party was selling off the data, and how was that third party to know that the sale of the data to another party who then onsold the data to the FBI would be illegal?

> How were the CIO and CEO to know that a third party was selling off the data, and how was that third party to know that the sale of the data to another party who then onsold the data to the FBI would be illegal?

Ask yourself the same question about personal health data and the answer reveals itself: the CEO and CIO know (or should know) that the vendor needs to be HIPAA-compliant or it's their necks (the CEO's and CIO's), so they look for a vendor who advertises as being HIPAA-compliant.

Pass legislation to the same effect for all PII and the CEO and CIO will then make requirements of the vendor. If the vendor lies, they get fired because the company hiring them is culpable. The vendor may also be subject to civil and/or criminal penalties. It seems simple, other than the fact that we have a federal legislature with no apparent interest in solving this problem, alongside a populace which either doesn't notice or doesn't care about that.

To answer the question more pithily: communication.


> I’m not sure that’s how corporate blame works.

In regulated industries, like finance and taxation, regulators deliberately assign responsibility to individuals, so misconduct doesn’t get lost inside the company or within its corporate stakeholder network. That removes a lot of friction once you want to hold someone liable.

I read the parent comment as an implicit proposal to establish similar structures in tech.


I would ban apps using unsafe ad platforms

If I were simultaneously also the owner of the ad platform, I'd fix it and knock out the bad players, or get ready to be sued for a decade+ of knowing malpractice.

And as a US citizen watching the companies involved get sued as monopolies for abusing their position, then crying "security" in court while knowingly doing this for a decade+, I'd feel frustrated by successive left and right US administrations & voters.


They are all unsafe. It’s a huge source of revenue for ad companies.

This is really simple to explain:

Apple does not let you restrict app network access[1]

You have no ability to know who your app is connecting to, and you cannot select or prevent it.

[1] except maybe the cellular data toggle


Settings > Privacy & Security > App Privacy Report will at least show domains contacted by each app.

But you cannot block them.

The only way I'm aware of is through Settings > Cellular, always using cellular data for internet on your phone.

You can trace the big players

If Google & Apple & friends refused to take a rake and opened up distribution, then I'd agree: net neutrality etc., not their problem.

But they own so much, so deep into the pipeline, and justify their fees to courts with "security"... and then don't investigate. They employ some of the best security analysts in the world and have $10-30B/yr in revenue tied to app store fees alone, so they could take a big bite out of this if they wanted to.


  > They employ some of the best security analysts in the world and have $10-30B/yr revenue
I'll never not be impressed by how many people will defend trillion dollar organizations and say that things are too expensive. Especially when open source projects (including forks!) implement such features.

I'm completely with you, they could do these things if they wanted to. They have the money. They have the manpower. It is just a matter of priority. And we need to be honest, they're spending larger amounts on slop than actual fixes or even making their products better (for the user).


“Priorities” is far too soft a term in this context. These are anti-priorities: not just things they choose not to work on, but things they’ll spend big money to prevent, up to and including bribing, uh I mean lobbying, lawmakers.

Ultimately the fact that ad sdks have such wide access to location information is a choice by the platforms. I've long wanted meaningful process isolation between the app and its ad sdks, but right now there's oodles of them that just squat on location data when the app requests it.

Apple supposedly does this with the privacy report cards.

However, I'd be shocked if a cursory audit comparing SDKs embedded in apps and disclosed data sales showed they were effectively enforcing anything at all.


> Would you ban ad supported apps?

Yes, I absolutely would. Advertisements are a scourge upon people's wellbeing on top of being ugly and intrusive.

If you want to build a free product, that's great. Build a free product.

If you want to make money from your product, then charge for your product.


>Yes, I absolutely would.

And then you will get fired by the end of the day.


Luckily I don't work for an ad-supported business.

How did your company and its customers find each other?

Do people really still think advertising has a legitimate function?

Really these days it's 95% psychological manipulation to get people to buy inferior quality stuff they don't need. And 5% of people actually finding what they're looking for.

Don't forget, most advertising can work fine in a "pull" mode: I need something, so I go out and look for it. These days that's something like Google (not ideal, since results are also manipulated by the highest bidder), or dedicated forums or a subreddit for real people's experiences. In the old days it would have been the yellow pages or asking a friend.


> I'm not sure how the OS is to blame.

Read the TOS.


If I have a free app that hits location services on the device and I sell this data, how do Apple and Google make money from me?

Apple doesn't even allow apps to know whose device they are running on without the user's explicit opt-in permission.

Just as importantly, apps aren't allowed to remove functionality if the user says no.

You need additional permissions to do things like access location data or scan local networks for device fingerprinting.


And Facebook/Meta. Their trackers are everywhere.

It's everyone. Especially Google, but all the big tech companies play in the same pool. Amazon, Google, Apple, Meta, etc. make money selling ads, which ultimately enables the tools that result in data harvesting from everyone across the internet. I wrote a little data investigation [1] (mostly finished) that showcases how every major news organization I scanned across the globe had some level of data collection integrated. This is just one industry, but an important one (it connects back to the incentive these media organizations have, which is to make money selling ads at any cost). The EFF also published an angle on how the ad-bidding process is itself a massive privacy nightmare [2].

[1] https://quickthoughts.ca/autotracko/ [2] https://www.eff.org/deeplinks/2026/03/targeted-advertising-g...


Cloudflare is more everywhere than Facebook.

Yeah, but unlike Facebook, they weren't just caught making videos of people having sex and then paying people to watch the videos.

Also, unlike Facebook, they weren't just caught running a dark-money lobbyist network with the goal of forcing more collection of minors' private information.


Facebook is evil for many different reasons, but for a government looking to spy on its own citizens, Cloudflare is a much more attractive target. That said, I have no doubt that they're collecting copious amounts of data from both companies, whether by sale or by force.

Not Experian, TransUnion, and Equifax?

Or for location, the cellular providers?


There are plenty of bad actors

The interesting part is that Google & Apple, when explaining to courts why their large app store fees are legitimate and not proof of monopoly position, hid behind the security argument that they need to be the clearinghouse for what software runs on the devices. Except... they've knowingly punted on this one for 10+ years.

I would 100% agree that losing privacy through any utility-level carrier (credit cards, phone, OS provider, etc) should be default disallowed, and any opt-ins have a clear transparency mode with easy opt-out. At least two areas the US can learn from the EU on digital policy is digital marketplaces and consumer privacy protection, and this topic is at the intersection of both.


Once my code exists and passes tests, I generally move on to having it iteratively hunt for bugs, security issues, and DRY code-reduction opportunities until it stops finding worthwhile ones.

This doesn't always work as well as I'd like, but it largely does well enough. Conversely, doing this as I go has been a waste of time.


The phenomenon you're describing is why COBOL programmers still exist and, simultaneously, why it's increasingly irrelevant to most programmers.

The killer feature is ecosystem: easily and reliably reusing other libraries and tools that work out of the box with other Python code written in the last few years. There are individually neat features motivating the effort of upgrading a widely used language & engine as well, but that kind of thinking unfortunately misses the forest for the trees.

It's a bit surprising to me, in the age of AI coding, that this is still a problem. Most features seem friendly to bootstrapping with automation (ex: f-strings that support ' not just "), and it's interesting if any don't fall in that camp. The main discussion seems to still be framed by the 2024 comments, before Claude Code etc. became widespread: https://github.com/orgs/pypy/discussions/5145
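
For context, the f-string quoting quirk mentioned in passing (a Python language fact; PEP 701 relaxed it in 3.12):

```python
data = {"key": "value"}
# Before Python 3.12, an f-string expression could not reuse the outer
# quote character, so nested quotes had to alternate:
print(f'{data["key"]}')  # -> value
# Python 3.12+ (PEP 701) lifts the restriction, so f"{data["key"]}"
# also parses there -- one of many small grammar changes an alternative
# interpreter has to keep chasing.
```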


The alternative is when you run a script you last used a few years ago and now need again for some reason (very common in research), and you end up spending way too much time making it work with your since-upgraded stack.

Sure, you could have (and should have) pinned dependencies, but that's a lot of overhead for a random script...
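
One low-overhead option, assuming a PEP 723-aware runner such as `uv run` or `pipx run` (both real tools), is inline script metadata, which pins dependencies inside the script file itself:

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "requests==2.31.0",  # pinned so the script still runs years later
# ]
# ///
# A PEP 723-aware runner reads the comment block above and builds a
# matching environment before executing the script body.
import sys

msg = "script runs on Python " + sys.version.split()[0]
print(msg)
```

That keeps the "random script" self-contained: no requirements.txt or lockfile to lose track of.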


Most programmers aren't writing scientific software, which you can tell by claims that nicer f-strings are a pressing concern.


We can play that game: items like GIL-free interpreters and memory views are pretty relevant to folks on the more demanding side of scientific computing. But my point is that this is a head-in-sand game when the community vastly outweighs any individual feature. My experience with the scientific computing community is that its non-PyPy portion is much bigger.

I'm not a PyPy maintainer, so my only horse in this race is believing CPython folks benefit from seeing the PyPy community prove Things Can Be Better. Part of that means I'd rather PyPy live on by avoiding unforced errors.


I liked that they did this work + its sister paper, but disliked how it was positioned as basically the opposite of the truth.

The good: it shows that, on one kind of benchmark, some flavors of agentically generated docs don't help on that task. So naively generating these, for one kind of task, doesn't work. Thank you, useful to know!

The bad: some people assume this means these don't work in general, or that automation can't generate useful ones.

The truth: instruction files help measurably, and just a bit of engineering lets you guarantee high scores for the typical cases. As soon as you have an objective function, you can flip it into an eval and set an AI coder to editing these files until they work.

Ex: We recently released https://github.com/graphistry/graphistry-skills for more easily using Graphistry via AI coding, and by having our authoring AI loop a bit with our evals, we jumped the scores from a 30-50% success rate to 90%+. As we encounter more scenarios (and mine them from our chats etc.), it's pretty straightforward to flip them into evals and ask Claude/Codex to loop until those work well too.

We do these kinds of eval-driven AI coding loops all the time, and IMO how to engineer these should be the message, not that they don't work on average. Deeper example near the middle/end of the talk here: https://media.ccc.de/v/39c3-breaking-bots-cheating-at-blue-t...
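
The loop can be sketched in a few lines; `run_agent_task`, `score`, and the cases here are hypothetical stand-ins for a real harness that would invoke the coding agent and grade its output:

```python
# Hypothetical stand-ins for a real eval harness.
def run_agent_task(instructions: str, case: dict) -> str:
    # A real harness would run Claude/Codex with the instruction file here;
    # this toy version just checks whether the needed hint is present.
    return "ok" if case["hint"] in instructions else "fail"

def score(instructions: str, cases: list[dict]) -> float:
    hits = [run_agent_task(instructions, c) == c["expected"] for c in cases]
    return sum(hits) / len(hits)

cases = [
    {"hint": "use chain()", "expected": "ok"},
    {"hint": "bind edges first", "expected": "ok"},
]

instructions = "use chain()"
# Eval-driven loop: keep revising the instruction file until the eval passes.
while score(instructions, cases) < 1.0:
    # Placeholder for "ask the authoring model to revise the skill file":
    instructions += "\nbind edges first"

print(f"final score: {score(instructions, cases):.0%}")  # -> final score: 100%
```

The key design point is that the objective function is reusable: each new failure scenario mined from chats becomes another entry in `cases`.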


We split our work:

* Specification extraction. We have security.md and policy.md, often per module: threat model, mechanisms, etc. This is collaborative and gets checked in, for ourselves and the AI. Policy is often tricky & malleable product/business/UX decision stuff, while security is the technical layers, more independent of that or of the broader threat model.

* Bug mining. Driven by the above. It's iterative: we keep running it to surface findings, adversarially analyze them, and prioritize them, repeating until diminishing returns wrt priority levels. It often leads to policy & security spec refinements. We use this pattern not just for security, but for general bugs and other iterative quality & performance improvement flows; it's just a simple skill file with tweaks like parallel subagents to make it fast and reliable.

This lets the AI drive itself more easily, and in ways you explicitly care about vs. noise.


In our evals for answering cybersecurity incident investigation questions and even autonomously doing the full investigation, gpt-5.2-codex with low reasoning was the clear winner over non-codex or higher reasoning. 2X+ faster, higher completion rates, etc.

It was generally smarter than pre-5.2, so strategically better; codex likewise wrote better database queries than non-codex, and since it needs to iteratively hunt down the answer, it didn't run out the clock by drowning in reasoning.

Video: https://media.ccc.de/v/39c3-breaking-bots-cheating-at-blue-t...

We'll be updating numbers for 5.3 and Claude, but it's basically the same story there. Early, but we were surprised to see Codex outperform Opus here.


I find it confusing in most directions.

Ex: For the above statement, if they're truly dishonest brokers who openly ignore inconvenient rules, they'd have zero problem agreeing to Anthropic's terms and then violating them. So what you say may be quite true, but there would still need to be more to the story for it to make sense.

Ex: DoW officials are stating that they were shocked their vendor checked in on whether signed contractual safety terms were violated: they require a vendor who won't do such a check. But that opens up other confusing oversight questions, e.g., instead of a backchannel check, would they have preferred going straight to the IG? Or the IG checking these things more aggressively unasked, so vendors don't have to? It's hard to imagine such an important and publicly visible negotiation being driven by internal regulatory politicking.

I wonder if there's a straighter line for all these things. Irrespective of whether folks like or dislike the administration, they love hardball negotiations and to make money. So as with most things in business and government, follow the money...


I have no idea exactly what Anthropic was offering the DoD, but if there were an LLM product, it's possible the existing guardrails prevented the model from executing on the DoD's vision.

"Find all of the terrorists in this photo", "Which targets should I bomb first?"

Even if the DoD wanted to ignore the legal terms, the model itself would not cooperate. DoD required a specially trained product without limitations.


Investors in a new funding round generally get seniority over those from older rounds.

But new money may allow buyouts of existing shares at that time, so the early team or investors can cash out a bit early.

And common stock doesn't cash out until an IPO or private-market equivalent; or yes, it gets screwed.
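
A toy waterfall makes the seniority point concrete (hypothetical numbers, assuming simple 1x non-participating preferences paid newest-first):

```python
# (round name, amount invested, seniority: higher = newer round)
rounds = [
    ("Series B", 30_000_000, 2),
    ("Series A", 10_000_000, 1),
]
exit_value = 35_000_000

# Pay preferences in seniority order; common gets whatever is left.
remaining = exit_value
payouts = {}
for name, invested, _ in sorted(rounds, key=lambda r: -r[2]):
    paid = min(invested, remaining)
    payouts[name] = paid
    remaining -= paid
payouts["Common"] = remaining

print(payouts)  # -> {'Series B': 30000000, 'Series A': 5000000, 'Common': 0}
```

In this modest-exit scenario the newest investors are made whole, the Series A takes a haircut, and common gets nothing, which is exactly the "screwed" case above.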


> About 200,000 people put money into the scheme, which offered a stake in the company, discounts and perks.

Hopefully they took advantage of the discounted beer.

