Hacker News | mwt's comments

This code is gibberish to me, but it appears the target is just parsing how many atoms are in a molecule string of some representation. That's cool, but to do just about anything useful in chemistry we need the bond graph (and often more - bond orders, stereochemistry, plus much more for biopolymers).
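To make the distinction concrete, here is a minimal sketch of what "counting atoms" amounts to. It assumes a plain molecular formula with no parentheses, charges, or isotopes; `count_atoms` is a hypothetical helper, not the linked project's code.

```python
import re
from collections import Counter

# One element symbol (capital letter plus optional lowercase letter)
# followed by an optional count. Anything else in the string is skipped.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def count_atoms(formula: str) -> Counter:
    """Count atoms per element in a simple formula such as 'C6H12O6'."""
    counts: Counter = Counter()
    for element, digits in _TOKEN.findall(formula):
        counts[element] += int(digits) if digits else 1
    return counts


print(count_atoms("C2H6O"))
```

Ethanol and dimethyl ether share the formula C2H6O, so they produce identical counts here even though their bond graphs differ, which is the information the comment above says is missing.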


That was my initial reaction too, but I suspect this has utility in applications other than what you and I are looking for. From context, I gather this may be for thermodynamic arithmetic, or reaction product arithmetic.


I'd be really interested to know of anybody making money with those topics (and who doesn't already have their own domain-specific practice for the problem)


Cheminformatics is such an example. Heavily used in computational drug discovery.


Computational biology/cheminformatics has probably been one of the most frustrating investments pharma companies have made in the past 20 years. There have been waves of optimism with many hires, then a slump after reality doesn't match optimistic expectations, and so on. This time it may actually be different, and I myself am in that camp. I'm particularly excited by the discoveries in sampling methods that aren't just molecular dynamics. And the cellular foundation models for pre-screening drug interactions - they aren't quite there yet, but give it time.


The cheminformatics I do (mostly drug discovery/biophysics) definitely requires bonds!


Much like the classic "our department in [big company] works a lot like a startup"


kinda mixed on this

- why not just contribute to the community tool?

- there's already a major split in Python type-checking tools, if there's a third that doesn't agree with either of them it'll be a mess for projects to deal with

- astral has been hiring like mad recently and has yet to communicate that they can actually make money ($5 million doesn't last forever)

- does it actually exist? is this currently a closed-source codebase, or is "we're building" future tense?


> - why not just contribute to the community tool?

what community tool? mypy is written in python, and is non-incremental, whereas astral's goal is to build a fast, incremental type checker in rust. there is no way to get from one to the other via code contributions, they fundamentally need to start from scratch with their own architecture.

as for the split in type checking tools, it is not as bad as you think; the syntax and to a large extent the semantics of the type system are defined via a community process and standardised upon, and the various type checkers largely differ in their implementation details but not in their interpretation of the code. so you can freely use several different type checkers without fear of disagreement.


nah actually we should keep the slow, subpar tooling we got now because it came first

those people producing fantastic tools that have transformed the ecosystem of linting and packaging should be working on it instead of working on transformative tech


> as for the split in type checking tools, it is not as bad as you think

I'm surprised you already know it's not as bad as I think - have you been able to use it? Working on a team that mixes mypy and pyright is pretty frustrating since they don't agree on everything (i.e. when one changeset passes on one and fails on the other) and I see no reason to believe the inconsistencies will become more rare when the number of opinions goes from two to three


i worked on pytype for several years, and occasionally had to support projects that wanted to use multiple type checkers. in my experience, there were definitely times when one type checker would pass some code and another one reject it, but not code that would strictly work on either one checker or another but not both. it could be a pain trying to keep the strictest type checker happy when the other ones passed your code, but usually once you did that you were fine.

also if you do find a case where two type checkers genuinely conflict on a piece of code, one or both of them would definitely like to see it as a bug report. if the underlying cause turns out to be undefined behaviour in the specs, the general typing community will work together to nail it down and the type checkers will all adapt. in general it is a very cooperative process that values the existence of a common set of standards, which is why I think having more type checkers will only improve the situation wrt nailing down corner cases in the specs.


I'm a little surprised to see you dismiss type-checkers disagreeing with each other - there are more than a few cases of mypy and pyright disagreeing and brushing them aside as rare enough to be irrelevant is in conflict with my experience over the years.

I'm happy you believe that the community can converge on unified standards - I wish I shared the optimism - but that's not where we're currently at and years-long efforts don't help me now. (For example, I'd love to use PEP 695 generics since I found a use case for them around 18 months ago but I can't until I can get away with not supporting 3.11, which is years out.) Maybe everything is perfect in 2028 or so - I'd be thrilled - but that doesn't help the pitch for annotations being a value add for people who are worried about the current jobs.


sorry, I didn't mean to sound dismissive of the problem, I just believe it is a problem that having more actively developed type checkers will make better rather than worse. with my type checker developer hat on, the end goal is not to have a single type checker and have the implementation be the spec, it's to have an evolving set of standards that (ideally) can express the patterns people want from python, and then have the type checkers work with those standards.

pep695 is a good example of how the spec is still evolving to try and make the explicit type system work better for python developers. dataclass transforms are perhaps an even better example - that's a pattern that is very specific to python and based on real world, non-typing-related ways in which people use the language, and which they wanted the annotation system to cover, and there was a collaborative effort to develop a way to do it.

also note that there are escape hatches like error suppression (e.g. if pyright likes some code but mypy wrongly thinks it's a type error, you can add a comment so that mypy ignores that one error while continuing to have pyright check it) and typing.cast (which is a directive all the type checkers honour, to say "trust me, this variable has this particular type whether or not you can verify it statically").

another thing to consider is that having type checking work interactively in the IDE is something a lot of people want, and pyright was developed specifically to support that feature. mypy, pytype, and pyre were all architected as batch type checkers, i.e. they need to run on a complete program as a standalone process, and it would have been anywhere from difficult to impossible to get them to support the type of incremental checking an IDE requires.
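A rough sketch of the two escape hatches mentioned above, in case it helps. The function names are made up for illustration, and the bracketed error code is mypy's convention; how another checker treats that comment depends on its configuration.

```python
from typing import Any, cast


def parse_port(config: dict[str, Any]) -> int:
    # typing.cast is a no-op at runtime: it returns its argument unchanged
    # and only tells the checker to treat the value as an int.
    return cast(int, config["port"])


def legacy_lookup(table: dict[str, int], key: str) -> int:
    value = table.get(key)  # statically this is "int | None"
    # Stand-in for a line where you have decided to overrule the checker.
    # The comment suppresses the error on this line only; the rest of the
    # file is still checked.
    return value  # type: ignore[return-value]


print(parse_port({"port": 8080}))
print(legacy_lookup({"a": 1}, "a"))
```

Neither mechanism validates anything at runtime: if the cast is wrong, the wrong value just flows through.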


> does it actually exist? is this currently a closed-source codebase, or is "we're building" future tense?

From the thread:

> We haven't publicized it to-date, but all of this work has been happening in the open, in the Ruff repository.


thanks, I didn't know it was a thread because of the login wall


np! I also posted a link to Charlie's Bluesky cross-post, which you can read without logging in to anything: https://news.ycombinator.com/item?id=42870359


- They usually do a pretty good job adhering to standards

- re: the money: https://news.ycombinator.com/item?id=42869358


Unfortunately that doesn't answer the question


> I wonder if there is any requirement for researchers to at least publish their data set for statistical analysis and further research.

Not generally, though the tide is slowly turning in the right direction. Unfortunately many laws/policies pushing for openness and transparency in research are sidestepped with the classic "data available upon request," a.k.a. "I promise I'll share the Excel files if you email me" (they will not).


I don't understand why they can use this as an excuse. If they can share the data upon request, why can't they just publish that as well? Is that related to some legal/privacy issue?


> Is that related to some legal/privacy issue?

Possibly in some medical or social science fields, I don't know. I know there is not such an issue in chemistry and materials science. There also may be some complications for collaborations with industry, but that's kinda a different situation. For people whose career development is not strongly tied to reproducibility of their work (a.k.a. everybody) it's just another step in the overly complex process of publishing in for-profit journals. Funding agencies generally aren't going to punish people for using this excuse and the watchdogs/groups concerned with reproducibility have no teeth.

Not an excuse, but journals don't make it easy to share files, as hard as that is to believe. Some will only take PDFs for supplemental information and many have garbage UIs, stupidly small file size limits, etc. Just uploading to a repo (or tagged release) on GitHub is common these days because there is much less friction.


> basically all images in scientific publications are not scrutinized for photoshop

This was (mostly) true back then, but it is definitely not true today. People tempted to commit fraud now have to be worried about people like Elisabeth Bik exposing them and ruining their careers. In my experience, the type of people known to lie in papers overlap strongly with those that are career-minded/money-driven. So having a few journalists with the skills to detect fraud is an obvious win. Some of the frauds will just get better at it, that's just how it goes, but it's not like there are no imaging experts in the field.


So the frauds of today are done in smarter ways.

Maybe they will be discovered with new tools developed in a few decades.


Hopefully so, but there will probably always be a back-and-forth between frauds and journalists in the same way security is always a competition. At least the easy frauds are more likely to be caught today.


It's pretty disrespectful to signal (without evidence or elaboration) that researchers are not credible (or worse, broadly lying) in order to keep their research grants flowing. A hypothesis that turns out to be wrong is something both industry and government are obviously not going to invest further in. The people working in the field have skills to transfer to other departments and projects; these aren't the sort of scientists and engineers that are out of ideas or work to do.

It's also plenty obvious that there is no single, monolithic "current research direction" or even that this researcher's work was of fundamental impact when it was published - not to mention the number of people that were highly skeptical from the beginning.


Based on the way that string theory keeps on going like a zombie in physics, I don't think it's disrespectful at all to consider allegations that are pretty much indistinguishable from those. Although I think that heavily entrenched groupthink and biases are all that is required. I don't think they're consciously thinking that "I need to lie in order to keep the research grants flowing" but it's instead that old adage that "It is difficult to get a man to understand something, when his salary depends on his not understanding it". To an external observer those are largely indistinguishable and have the same effect, but in the latter case the person has convinced themselves of the lie or the half-truth via financially inclined self-deception.


I think there is truth to the general principle you refer to, but I don't think it accurately describes what I saw skimming experts' comments in the linked thread. I'm an outsider to medical research but have experience in other parts of STEM research at universities. Here I saw plenty of nuance, documentation of historical skepticism, concern over broad perception, and plenty of disagreement over technical points. Far from a unified kool-aid drinker sort of situation. And I think there have been plenty of changes of opinion in the Alzheimer's field in recent years given the number of failed drugs - which goes against the idea that these scientists are following their career over the evidence.


But people who are motivated by a decade or two of having their salaries paid by the leading hypothesis aren't going to produce unified kool-aid drinker kinds of rationales to support it. They've had decades to internalize the arguments and they will be nuanced and multifaceted. They're experts and their defense of the old paradigm will look just like expert opinion. There can still be rotten core foundations at the bottom of it all.


I'm not claiming there are no overly stubborn PIs. But we should not be lazy and paint with so broad a brush that we view the field as a monolith.


Nobody is actually arguing that everyone in the field is goosestepping in perfect synchrony like a cartoon.

But human bias applied by groups is actually very powerful.


It's not disrespectful to point out conflicts of interest. Recusing yourself is standard practice for good reason.


Harboring skepticism of the work people did with a seemingly fraudulent researcher is a good idea. Dismissing everybody in a field whether or not their work is fraudulent is a disrespectful approach (not to mention useless).


It’s not even a good faith argument, people cannot prove they’re not biased.

It’s just your basic lazy conspiracy theory thinking dressed up as cynical aloofness.


Given everything we know about how research funding has been captured and shaky ethics in particular domains, skepticism is warranted, not disrespectful.


I'm not asking for people to turn off skepticism or blindly trust researchers. It's not disrespectful to be skeptical.

What is disrespectful is not bothering to read what people have to say before dismissing them as liars who are too vested in "the current research direction" and/or money for their perspectives to matter. It only takes reading a few comments to see that's not happening - for starters, people were skeptical of this group's work for a while now.


Also, “believe science” is the exact opposite of the scientific method.


“Hurl unfounded and baseless accusations of personal corruption” is not the scientific method either.

Trying to pass off conspiracy theorist “disprove the negative” arguments as reasonable discourse is intellectually lazy, at best.


Is that an accusation of bad faith or just a non sequitur?


There is that minor issue that we don't have nearly the same experience and resources as the scientists who say what they say. Are you implying we all become biologists and figure out the source of Alzheimer's ourselves? At some point you have to trust people and choose someone.


lol


[flagged]


My feeling is that the only reason HN doesn't have a guideline rule forbidding cites to "Upton's Law" is that there isn't space for it. It's tired, snarky, provides no insight, and never takes the conversation in any interesting direction.

The comment you're replying to observes that a bunch of researchers say that the fraudulent paper simply isn't that important in the field. You can contest that claim! Maybe they're totally wrong! But you can't do so with Upton Sinclair, because Upton knows nothing at all about how Alzheimers research works, and when you deploy that quote, you give the strong impression that you don't either.


> there isn't space for it

First they came for Upton's Law...

Could be squeezed in maybe:

Eschew flamebait. Avoid unrelated controversies, generic tangents and internet tropes

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...


If at first they don't come for Upton's Law, then you're the product.


That is, as Bill Clinton said to Christopher Buckley, goddamn funny.


Hey that's a good idea. Thanks! I'll add it, along with an Oxford comma.

Edit: done. But perhaps we should take out "internet" and just say "tropes"?


I don't think it flows as well as just "tropes." (And I'm allergic to Oxford commas, but so many of my thoughtful friends who care about language use them that I've started to think that's just my problem.)


Ok, we'll keep 'internet' in there.


I started with 'tropes' but then thought people might try to language-lawyer this (what kind of trope? not all tropes are negative! this isn't a trope! etc) so I scoured scripture for a representative variant and that sounded about right.


I like using the word 'internet' in moderation comments as a sort of mild pejorative—it nicely expresses the shared semi-embarrassment we all feel about whatever this is.

I think it helps take pressure off people personally—because even if you're being scolded, you know..."internet" - how high can the bar really get. It's scolding on a curve.


Variable Yield Scolding, as nuke people might call it.

https://en.wikipedia.org/wiki/Variable_yield


> ... corporate american enabling this will not be seen positively.

I don't think this is true in red states or for the minority views that determine what is and is not illegal in this country, or at least it won't play out like you wish. Certainly I don't see this playing out along this optimistic path. Politicians in Texas and Mississippi will frame it as companies helping dutiful law enforcement investigate evil crimes and ergo it's fine that Google was tracking everybody all along. Single-issue voters would be happy to see abortion providers or seekers behind bars and couldn't care less if the fourth amendment was violated along the way.

Just look at how conservatives (ok, well, a good chunk of all mainstream politicians) and voters viewed the FBI vs. Apple conflict from a few years ago - it was "we have to make sure the cops can get the bad guys so of course the cops should have access to whatever information they need" not "I have a right to privacy - time to stop using these services until they respect it"


> There's no actual storage density information, is there? Nothing in KJ/m^3 units.

Even if there was (to be honest, didn't read the journal article) this is something that can easily be hacked. Energy storage research papers regularly hack energy density numbers by reporting the kJ/cc values of a tiny (like order 1 g) fleck of nanoparticle dust, which totally misrepresents the physics that matters at scale (i.e. in an EV).

Scaling up stuff is hard, including when you're moving from micron scale to cm scale.


It's just diffusing the problem from urban centers to mid-sized cities elsewhere in the United States. Say for simplicity that the core result of the problem is that people not making astronomical tech/finance/etc. salaries can't live in NYC because people making those salaries are scooping up supply and driving up rent ($5,000/month and higher). Then say the solution is to let some amount of people move to smaller cities and work remotely, enough that NYC prices somehow magically drop 20%. Well, where are those jobs going to go / how much of a salary penalty are people going to take to move from NYC to STL, Austin, Nashville, Columbus, Denver, Ann Arbor, and Raleigh? Because they're not going to suddenly be making the same salaries that engineers already in those cities make, they're going to want more. And they're going to scoop up housing supply and put pressure on those housing markets. The same forces keeping NYC rent at $5,000 have caused places like Nashville to become unlivable for the lifers; your grocery store workers and baristas can't get by with $3,000/month rent there either.

It would be lovely if the solution for skyrocketing rents in the big 2-3 urban centers didn't simply shift the problem to every other city, but there's no indication at all that will happen.


> everyone wanted to play like a Karpov or a Kasparov.

I wonder if there's a modern analog to this with how super GM styles have mostly converged. Trying to play like Magnus is as silly as trying to play like an engine. Even the most aggressive players (Nepo, Shak?, Rapport) aren't so wildly different in style.

Maybe the 2010-2020s version of this is bandwagoning onto popular theory, like all the Najdorf lines I know I'll never understand.

