frostix's comments | Hacker News

Most likely due to some process resulting in separation by density. You’d probably also need a geologist who understands previous states of the earth to weigh in. I’d hazard a guess that most of it occurred over time when much of the earth was in a sort of hot, fluid state and things clumped together. Structures you see today might be due to the surrounding materials that also tended to clump, and to how things cooled over time (slowly, quickly, etc.). There’s also the fact that the earth still has active convection going on: heating things up, spinning them around, letting gravity pull them down, rising, cooling, plus more complex motions like plate tectonic movement, fracturing, etc. I suspect it would be pretty difficult to say exactly why any specific mineral deposit follows the structure we find it in, given how complex the process is, but someone may know.


I think we just need media literacy, at least for now while the noise is still manageable. It’s perfectly fine to rely on private news to spread the information, IMO; the issue is that people should independently verify said information.

After reading said announcement on Twitter, the first thing I’d do (if I cared about it) would be to head on over to sec.gov, or use a search engine to find the official SEC site, then navigate from there to find the official announcement. Any reputable news source should include a link to the official announcement in its coverage to save you this verification step.

At some point there may be so much targeted disinformation/misinformation out there that we need legislation to help protect against it, but I don’t think we’re there yet.


>They are also less productive when coding than the scientists because they care too much about the quality of their work and not enough about getting shit done.

Ultimately I’d say the core issue here is that research is complex and those environments are often resource-strapped relative to other environments. As such, this idea of “getting shit done” takes priority over everything. To some degree it’s not much different from startup environments that favor shipping features over writing maintainable and well (or even partially) documented code.

The difference in research that many fail to grasp is that the code is often as ephemeral as the specific exploratory path of research it’s tied to. Sometimes software in research is more general purpose, but more often it’s tightly coupled to a new idea rooted deep in some theory. Just as exploration paths into the unknown are rapidly explored and often discarded, much of the work around them is as well, including the software.

When you combine that understanding with an already resource-strapped environment, it shouldn’t be surprising at all that much of the work done around the science, be it a physical apparatus or something virtual like code, is duct-taped together and barely functional. To some degree that’s by design: it’s choosing where you focus your limited resources, which is on exploring and testing an idea.

Software very rarely is the end goal, just like in business. The exception with business is that if the software is viewed as a long-term asset, more time is spent trying to reduce long-term costs. In research and science, if something is very successful and becomes mature enough that it’s expected to stick around for a while, more mature code bases often emerge. Even then there’s not a lot of money out there to fund that work, but it does happen, though only after it’s proven to be worth the time investment.


>Ultimately I’d say the core issue here is that research is complex and those environments are often resource-strapped relative to other environments. As such, this idea of “getting shit done” takes priority over everything.

That conforms to my experience.


maintainable prototypes are overengineered


The rule of thumb of factoring code out only once you've written the same thing three times rarely gets a chance here, because as soon as you notice a regularity and think critically about it, your next experiment breaks that regularity.

It's tempting to create reusable modules, but for one-off exploratory code, for testing hypotheses, it's far more efficient to just write it.


Indeed, and whatever code is used to publish a paper is a prototype, and unlikely to be reused, ever. Sometimes it is, but rarely.


Are there any metrics which prove that writing maintainable code is slower? Because in my experience there is no difference.


I have tons of examples of code where I did the simplest thing to solve the problem. Then later I needed a change. I could refactor the entire thing to add the change or just hack it in. Refactoring the entire thing takes more work than the hack, so hack it is, unless I foresee that it's going to matter later. Usually it doesn't.


That’s just an anecdote, just like mine. Even a simple lack of experience or skill can cause that (as was definitely true in my case). Also, I’m quite sure that a terrific coder can create maintainable code faster than an average one can create bad code. That’s why I asked for statistical data on this.


Differential backups or any sort of versioning seemed like one of the most obvious culprits (that, and/or fully redundant storage to preserve the file), but the issue with all of this is that it’s entirely opaque.

Ultimately you’re increasingly tethered to some storage service that you pay for periodically based on total storage, yet you have little to no information on how to best optimize that storage if you want to operate in a fixed cost bracket or lower your storage/cost ratio. So as a consumer, do I just wave my hands and keep throwing more and more money at the problem, especially now that devices are increasingly pushing everything, including storage, as a subscription service to meet my actual functional needs (needs that realistically could be met by local storage if manufacturers didn’t have a vested interest in pushing me toward service-based storage)?

The modern business strategy in technology is simply hiding behind complexity. The cost is too complex for you to understand, it gives too much information away about our internals to competitors, and so on. Yet somehow these metrics get derived internally to assure the business is operating above cost, because when the rubber meets the road that must be done; when the consumer wants to understand, though, it’s suddenly too complex. The problem is that tech in many cases is growing to scales that really are too complex, and business managers know this, so it’s often a valid excuse to hide behind. Conveniently, that’s also where they focus their investment and pad their margins.


>So as a consumer, do I just wave my hands and keep throwing more and more money at the problem,

Yes.

I can go and buy 1TB of Microsoft OneDrive or 2TB of Google Drive for less than a Franklin a year, and most people won't even need 1TB let alone 2TB. Both Microsoft and Google also offer 100GB plans for a Jackson a year, which is what I purchase myself. The average person can get by paying a Washington per month to Apple for 50GB of iCloud.

The amount of money I would save from managing photos myself locally isn't worth the time spent nor the money spent on the hardware.

EDIT:

For the downvoters, consider this: If I were to manage all this myself, I would need at least three storage mediums with one being a different form factor to satisfy the 3-2-1 backup scheme. I would also need to procure arrangements for that third backup copy in the 3-2-1 scheme. And I would need to spend time managing it all.

That is going to cost me more than a Franklin per year. Life is short, my time is precious, and my money is ultimately expendable.


Presidents on US currency:

  - $100,000: Wilson
  - $1,000: Cleveland
  - $500: McKinley
  - $100: Franklin*
  - $50: Grant
  - $20: Jackson
  - $10: Hamilton*
  - $5: Lincoln
  - $2: Jefferson
  - $1: Washington
  
  * not a president

> I can go and buy 1TB of Microsoft OneDrive or 2TB of Google Drive for less than $100 a year, and most people won't even need 1TB let alone 2TB. Both Microsoft and Google also offer 100GB plans for $20 a year, which is what I purchase myself. The average person can get by paying $1 per month to Apple for 50GB of iCloud.

While we're at it, iCloud+ offers these monthly storage plans now:

  United States:
  50GB: $1
  200GB: $3
  2TB: $10
  6TB: $30
  12TB: $60

See everywhere in the world here:

https://support.apple.com/en-us/HT201238


Thank you, I had no idea what this person was talking about.


"640KB is more than anyone will ever need"

It's a little absurd to think people don't need more than 2TB - especially on HN. Gamers will likely have 2TB in games alone, videographers often have many TBs of videos and photos from weddings and events in their life, many that care about health may have a few TB in genomic data mirrored on their computers to analyze, etc.

I would imagine it's hard to find people who wouldn't have TBs of data if they were allowed to accumulate it. The reason many people don't have TBs of data is that they're limited by these exact companies you're claiming 'solve the problem' by offering limited storage.

It is notable, however, that having better tools to organize, deduplicate, and compress data would help reduce the size of the data many people have. Over the years I've noticed my family will have multiple tar.gz archives, zip archives, etc., which (after extraction/decryption) will share 20% of their files here, 10% there, a 4kb jpg that's the same as a 100MB PNG here and there, etc. So yes, those 10TB archives may end up being 5TB if someone spent the time to really comb over, understand, make good decisions about, and organize that data. But I have not seen anything that can scratch that surface yet, other than perhaps https://github.com/jjuliano/aifiles - but I won't use it until it's local-only and has guarantees not to destroy data without explicit permission. An overlay filesystem that shows compression/deduplication with LLM capability like aifiles is probably the best option here.
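For illustration, the exact-duplicate piece of that is simple; here's a minimal sketch (a hypothetical script, not an existing tool, and it only catches byte-identical files, so the jpg-vs-PNG case above would still need something content-aware):

  import hashlib
  import os
  import sys
  from collections import defaultdict

  def sha256(path, bufsize=1 << 20):
      # Stream the file so large archives don't blow up memory.
      h = hashlib.sha256()
      with open(path, "rb") as f:
          while chunk := f.read(bufsize):
              h.update(chunk)
      return h.hexdigest()

  # Group every file under the given directory by content hash.
  groups = defaultdict(list)
  for root, _, files in os.walk(sys.argv[1]):
      for name in files:
          path = os.path.join(root, name)
          groups[sha256(path)].append(path)

  # Any group with more than one path is a set of exact duplicates.
  for digest, paths in groups.items():
      if len(paths) > 1:
          print(digest[:12], *paths)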

However, even with all of this, I wouldn't expect most people's life data to fit in under 2TB - the 2TB figure is mostly an artificial constraint imposed by these companies.


> Gamers will likely have 2TB in games alone

If I were to install all the games on my steam account it would be many hundreds of terabytes of total storage. In the end I have about a terabyte of games on my computer. And of that 0 bytes are in my cloud storage.

I've been an amateur photographer for over 15 years. I tend to curate the photos I keep, largely because I don't need 20+ pictures of the same scene. It's more of a burden to casually flip through my photos if the majority of them are near duplicates. In the end my total collection is only several hundred gigs.

Most people aren't videographers.

Most people in my family have far less than even 50 gigs of actual data they care about. They maybe take a dozen compressed photos a week, maybe 30 minutes of videos a month. A lot of my friends take even fewer photos and videos.


You could argue all your games are in the cloud, since of the hundreds of terabytes of games you have, you're only keeping 1TB of them on the computer.


It's even more than that; easily half to three-quarters of my gaming these days is on cloud gaming, so I'm not even installing any content.

But in the end, that's not the same as my cloud storage where I'm being metered by my bytes.


Now, consider a world where you are making $42,000 a year. That's 20 dollars an hour, a very common wage. How does someone in that position handle the same costs when there are so many things competing for their Franklins?


bump this, and the implicit vendor lock-in that this ideology creates


This is the same argument behind paying for a streaming service. Could I "find" everything I want to watch somewhere else and maintain it myself on a Plex server? Sure, but the cost-benefit analysis just doesn't make sense to me.

Particularly the older I get the more I value my finite free time. Throwing $20 at something to remove a problem that would take me hours (not to mention a large startup cost) to do myself is just an obvious choice.


As someone who has a Proxmox cluster at home (storage on RAIDZ, hot backups with PBS, cold backups on external HDDs) I literally recommend cloud storage for most people that ask me about backup solutions as it's simply not worth it for the average user. Those people already have all their data in the cloud anyways and share it on Facebook et al and they don't really care about the privacy side.

Remember it's not just buying the equipment, it's maintaining and understanding it as well (e.g. I have to be familiar with how ZFS works, how to restore a failed node, how to write some scripts, etc.). And with every backup solution you also need to be familiar with the restoration process and test it occasionally to make sure it actually works as expected.


I've just upgraded from a Synology 4 disk setup to a custom-built Unraid system.

Wouldn't recommend it to Joe Normal. I've spent a good 5-6 days just rebuilding parity while switching disks. During which the server makes a godawful noise and the performance is degraded because of constant disk load.

It's great to have 40TB+ of storage right next to me though, the performance is great and I can run every self-hostable service imaginable with it.

But I still subscribe to Apple One for the whole family, it just works for every device (3-4 phones, 3 tablets, 2 laptops). I do have some backups running from those to the local system, but mostly it's for quick restores.

The only thing I'm adamant about is keeping your generated content for yourself. Don't trust FB, Instagram, YouTube or whatever to be the only storage of anything you create. Keep the master copy where you control it, and publish it to other services.


Yeah, I have always backed up to external drives and had cloud storage. But now I learn there is bit rot, so I have to either build a RAID and run refresh software that rewrites and validates, or just buy a new drive every 3 years and back up all over again. And that’s a simple setup.


Another danger of doing it yourself I have found is that if you give a dev a Proxmox, they are going to play around making toy Kubernetes setups instead of implementing the backup system they intended to make.


> For the downvoters, consider this: If I were to manage all this myself, I would need at least three storage mediums with one being a different form factor to satisfy the 3-2-1 backup scheme. I would also need to procure arrangements for that third backup copy in the 3-2-1 scheme. And I would need to spend time managing it all.

> That is going to cost me more than a Franklin per year. Life is short, my time is precious, and my money is ultimately expendable.

When you're looking at cloud services, you need to perform your own off-site backup. Apple, Google, Microsoft, etc. will maintain copies that they'll restore in the event of a hardware failure. But, if your account gets compromised or a buggy sync or bad API event happens, your data is gone. They're not going to go restore it from tape for you. This is a big part of why I do have an in-home NAS. Maybe you have everything sync'd with a laptop and that has you covered, but Apple's expanded storage options are outlandishly expensive so I doubt many with the 2TB+ plans are able to do that. (Yes, you could use external storage, but that's also rather inconvenient for a Photos.app library.)

We could both get what we want if these storage operations weren't wrapped up in proprietary APIs. If I use iCloud I get a seamless experience on macOS, but no access at all on Linux. If I use Dropbox I get access on Linux, but little more than photo sync on an iPhone. Given the decades of precedent with filesystems and I/O APIs, I suspect we could have an abstraction layer and an implementation layer that would allow for interoperability. Anyone that wants to pay for iCloud is free to do so; others could use their preferred storage engine. But, allowing access into the walled garden is far less profitable.
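To make that concrete, the abstraction layer could be as small as this (a purely hypothetical interface sketched for illustration, not any real API):

  from abc import ABC, abstractmethod
  from typing import Iterator

  class StorageBackend(ABC):
      # Hypothetical provider-neutral storage interface.
      @abstractmethod
      def put(self, path: str, data: bytes) -> None: ...

      @abstractmethod
      def get(self, path: str) -> bytes: ...

      @abstractmethod
      def list(self, prefix: str = "") -> Iterator[str]: ...

  # Apps would code against StorageBackend; iCloud, Dropbox, or a
  # local NAS would each ship an implementation, and the user would
  # pick which one backs Photos, Documents, and so on.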

For most people, storage needs are going to increase over time (more + higher resolution photos & videos, larger apps, document storage, etc.). 6TB for a family is not unreasonable and that's what? Three Franklins and three Jacksons per year + whatever for an external drive for your offsite backups. What comes after the 6TB option? Storage costs have decreased drastically over time, unless you're using a proprietary service; consumers are not benefiting at all from those gains in efficiency.


> Three Franklins and three Jacksons per year + whatever for an external drive for your offsite backups.

I know you're just following the theme set by the parent commenter, but there are a bunch of us folk on HN who aren't US residents, and have no idea how much those presidents mean in terms of currency.


I'm sorry. I use the currency and had to think about what the values were. I was trying to follow the theme by the previous author, but I can definitely see how that'd be hard for others to follow. It's $360 USD (Franklin = $100 USD, Jackson = $20 USD).


In addition to being confusing for non-Americans, it is also confusing for Americans, because $100 bills in American slang are "Benjamins," not "Franklins." Its most notable use is in the song "It's All About the Benjamins".


I've heard it both ways myself; I like Franklin better since the others are all last/family names too.


A Franklin is more or less two Turings and a Jackson is just less than a Turner.


>6TB for a family is not unreasonable and that's what?

Microsoft in particular has a 6TB for $100/year family plan, sharable with up to 5 other family members for a total of 6 persons each with 1TB. Google's plans can all also be shared with up to 5 other family members, though their bytes-per-dollar can't compete with that particular Microsoft family plan.

Basically: Local storage with personal management needs to be very easy, cheap, and carefree (which it isn't) to compete practically with cloud storage.

The only exception is if one's needs are niche and specific. I actually have a Synology NAS at home that I keep most of my data on, but that's because my data is mostly "bottle of rum" and "Linux ISO" in nature and thus not something I can throw on cloud storage in the first place.


> What comes after the 6TB option?

Well, 12TB.

And if you are head of household and share storage, you can combine storage plans. Mine currently shows “2.3TB of 14TB used”.


Thanks. I overlooked the 12 TB. I'm not sure doubling the capacity and doubling the cost is really ideal for many, but it's nice to know it's there.


Hackers don't understand something that regular people more readily do: It works, it's cheap, and I'm paying for the convenience. We can all have our ideals about how tech should be, but these choices are driven by practicality.


Partly you’re getting downvoted because you’re not accounting for the value of the content to the user. Even if the chance of a cloud provider cancelling an account or deleting content is 0.0001%, the value of a single photo to Microsoft will be orders of magnitude less than its value to the individual.

I imagine the other reason is because they’re not mutually exclusive: For instance, Synology makes it easy to have both an in-home NAS and cloud sync.


You still need to manage backups, and you’re trusting everything to the vendor. I run a Docker image weekly that pulls my Google Photos, copies them to the USB drive in my Pi, and also copies them to Backblaze B2.
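The gist is roughly this (a sketch of the idea only, assuming rclone remotes named "gphotos" and "b2" are already configured; the bucket name is made up and the real image does more):

  import subprocess

  STAGING = "/mnt/usb/photos"  # USB drive mounted on the Pi

  def rclone(*args):
      # Shell out to rclone; fail loudly if any step breaks.
      subprocess.run(["rclone", *args], check=True)

  # Cloud -> local USB (rclone's Google Photos backend exposes
  # all media under media/all).
  rclone("copy", "gphotos:media/all", STAGING)

  # Local -> Backblaze B2 for the second, offsite copy.
  rclone("copy", STAGING, "b2:my-photo-bucket/photos")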


got a link for this image?


And if you invest that $100 a year at 7%, it becomes, after x years (the last column is 4% of that balance per year):

  Years  End balance  4% per year
     10     1,578.36        63.13
     15     2,788.81       111.55
     30    10,207.30       408.29
     40    21,460.96       858.44
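Those figures line up with $100 invested up front plus $100 added at the end of each year, compounding at 7%; a quick check under that assumption:

  def end_balance(years, rate=0.07, amount=100.0):
      # $100 up front, then $100 more at the end of each year.
      balance = amount
      for _ in range(years):
          balance = balance * (1 + rate) + amount
      return balance

  for years in (10, 15, 30, 40):
      b = end_balance(years)
      print(years, round(b, 2), round(0.04 * b, 2))  # 10 -> 1578.36 63.13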


This is done across many disciplines to try to aid new discovery paths. Typically you’re limited in exactly what you can simulate, and oftentimes solution candidates may be found that are impractical, currently impossible, or perhaps actually impossible to produce. Sometimes you can add search constraints or tie simulations together to narrow down such false-positive solutions, but not always. Heck, in some cases it’s literally cheaper and more accurate to do the bench science, no matter how alluring virtualized renditions may be.

Most fields are still left with piles and piles of potential solutions to sort through. They often select candidates that are the cheapest and most practical to approach, or that they highly suspect will succeed, and pursue those. At the end of the day, we don’t have full universe simulators at every scale we’d want; we have simulators for very specific areas within very specific bounds. You have to go out and empirically test these things.

But this has already been going on for decades across most disciplines I’ve interacted with; they just weren’t using DNNs or LLMs at the time. Domains are adopting these as well, leveraging them where feasible in the search process.

I work with a variety of people interested in leveraging simulation, and everyone wants to take the successes they see in LLMs, or say RL from AlphaStar or AlphaGo, and apply them in their domain. It’s alluring, I get it. The issue is that we often lack enough real understanding in these domains and the science isn’t as airtight as people think it is; it’s too general or too narrow. Or in some cases we have a good suspicion of how to build better, more accurate simulations, but there’s not enough compute power or energy in the world to make them currently practical, so we need to accept some tradeoffs and live with less accurate and detailed simulation, which leads to inaccurate representations of reality and ultimately inaccurate solution candidates.


The issue with generative AI techniques in general is how low the barrier to entry is. Various forms of information that used to be difficult or resource-intensive to create have suddenly become approachable, even trivial, in terms of the investment required to create them.

Overall, in any sort of cost/benefit analysis, the cost is now so low that the benefits don’t have to be much of anything, if anything at all. Entertainment alone, boredom, or perhaps a passing curiosity are enough reason to create false or misleading information and push it out to the public, creating noise that has to be filtered through. There are plenty of other, far stronger motives that make the problem even worse.

Misinformation and disinformation were already becoming an increasingly large societal issue, IMHO. That is only going to get worse with wide access to generative AI. We already have a high degree of erosion in social trust, where we pretty much have to consider the motives and driving forces behind every transactional relationship we have these days, and we could at least use costs to help sort that mess out: why would someone bother investing the resources to do this? Does it cost a lot to present me with false information, and if so, is there enough potential motive behind it to make this information more likely to be false or misleading?

The answer to this is increasingly yes. It’s now far more difficult to start from a position of distrust and move to a point of trust, or likelihood of trust, and I think we’re going to see that in even more aspects of daily life. I now have to assume most pieces of information out there are targeting me and attempting to manipulate me in some way (more than before). I fear we’re moving toward a model of free speech that puts more weight on “authoritative” sources than in the recent past, in many cases because of the liabilities authorities face when presenting false, misleading, or inaccurate information. Liabilities that in many cases aren’t real, just perceived, granting authoritative information sources far more credit than is due.

