One way to think of it is that each point in your data follows your model, but with iid Gaussian noise shifting it away. The likelihood is then a product of Gaussians, mean-shifted and scaled by the variance. Maximizing the likelihood (i.e. minimizing the negative log-likelihood) then reduces to minimizing the sum of (x - mu)^2 over the points, which is exactly least squares.
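A minimal sketch of that equivalence (toy data and parameter values are mine, not from the thread): the Gaussian negative log-likelihood and the sum of squares differ only by a positive constant factor, so they share the same minimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: y = 2x + 1 with iid Gaussian noise (illustrative values).
x = np.linspace(0, 1, 50)
y = 2 * x + 1 + rng.normal(scale=0.1, size=x.size)

def neg_log_likelihood(params, sigma=0.1):
    a, b = params
    resid = y - (a * x + b)
    # Up to additive constants, the Gaussian NLL is sum(resid^2) / (2 sigma^2).
    return 0.5 * np.sum(resid**2) / sigma**2

def sum_of_squares(params):
    a, b = params
    return np.sum((y - (a * x + b)) ** 2)

# The two objectives differ only by the constant 1 / (2 sigma^2),
# so minimizing one minimizes the other.
p = (1.9, 1.1)
ratio = neg_log_likelihood(p) / sum_of_squares(p)
print(ratio)  # ≈ 50 (= 1 / (2 * 0.1**2))
```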
My question is how much of the computation in JAX here can be done in reduced precision and can therefore utilize training accelerators, i.e. TPUs. I've noticed a lot of research coming out in physics, where everything is simulated in at least double precision, being augmented with ML approaches where precision is traded for dynamic range.
The thing with reduced precision is that things may look fine at first, but eventually you notice unphysical features in your solution (like additional wave modes after very long simulation times, or energy conservation issues). So as a community we really don't know yet how far we can venture from float64, but it looks like float32 may be viable.
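The slow-creep failure mode has a simple toy analogue (illustrative numbers, not from any actual simulation): a long naive accumulation in float32 drifts visibly, while float64 stays essentially exact.

```python
import numpy as np

# Summing 0.1 a million times: a toy stand-in for a long time integration.
n = 1_000_000

acc32 = np.float32(0.0)
step32 = np.float32(0.1)
for _ in range(n):
    acc32 += step32          # float32 accumulator loses low-order bits

acc64 = 0.0
for _ in range(n):
    acc64 += 0.1             # float64 stays very close to the exact 100000

# The float32 total visibly drifts away from 100000; the float64
# error is orders of magnitude smaller.
print(float(acc32), acc64)
```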
Veros works OK on TPUs (about the same speed as a high-end GPU), but since you can't buy TPUs that's an immediate no for most academic users of climate models. Renting hardware doesn't really make sense when you keep it busy for months at a time and the HPC infrastructure is already in place.
Can't you fix a lot of the nonphysical issues by using better integration schemes? That might be hard in JAX though; from what I know, its options for better numerical stability are pretty limited.
No, in fact you want to go lower order with lower precision. The real answer is that if the solution is in the chaotic regime then maybe float16 is fine, because you'll be dominated by other numerical errors anyway (provided you also ensure adequate conservation so the solution doesn't explode in some way), but if you're not in the chaotic regime then even float32 is pushing it in many cases (i.e. it had better be non-stiff, since stiffness pretty much guarantees operations that span beyond float32's relative epsilon). So it's a case-dependent topic without an easy answer, though the case for float16 is rather small.
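The relative-epsilon point is easy to demonstrate (the 1e8 scale ratio below is just an illustrative stand-in for what stiffness produces): once two scales in the same operation are further apart than 1/eps, the small contribution is rounded away entirely.

```python
import numpy as np

# Machine epsilon: the smallest relative step each format can resolve.
for dt in (np.float16, np.float32, np.float64):
    print(dt.__name__, np.finfo(dt).eps)

# A stiff system mixes fast and slow scales. If their ratio exceeds
# 1/eps, adding the small term to the large one is a no-op.
fast, slow = 1.0e4, 1.0e-4   # scale ratio 1e8 > 1/eps for float32
print(np.float32(fast) + np.float32(slow) == np.float32(fast))  # True: slow term lost
print(np.float64(fast) + np.float64(slow) == np.float64(fast))  # False: still resolved
```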
(We had some small tests generating TPU ODE solver code from Julia and showcased some rather bizarre stuff back when Keno was working on it, but never wrote a post summarizing all of it)
A bit off-topic, but are there any Tailscale users here who consistently use it on mobile? How are the speed and battery life? I see that Tailscale offers a lot of nifty features like this that go beyond setting up a VPN, but if the overhead is too big it might not be worth running at all.
I don't use it 24/7 but I do use it for extended periods of time when testing a product I work on. It's fine.
It's generally pretty fast, about what I'd expect if I was connecting to the box directly (which you technically are, I believe). Battery drain is probably a thing, but I honestly haven't noticed; my phone battery pretty much only needs a charge by nighttime, same as it was pre-Tailscale.
Even more than that, the structure of these networks implies that targeting influencers may be ineffective. There's an interview [0] with Duncan Watts discussing this.
The article is a bit vague on these so I'll take a stab:
Random networks are generated randomly, i.e. drawn from some distribution. An example is the Erdos-Renyi (ER) network, where each possible edge between a pair of nodes is an independent Bernoulli draw. The simplest way to characterize a random network is its degree distribution; for ER networks this follows the Binomial distribution (Poisson in the limit N -> infinity with Np -> const). Since the edge probabilities are simple and drawn independently, the ER network can serve as a sort of "null model" of random networks.
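A minimal sketch of an ER draw and its degree statistics (the values n=2000, p=0.01 are just illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def erdos_renyi(n, p):
    """Sample an ER graph: each of the n*(n-1)/2 possible edges is an
    independent Bernoulli(p) draw. Returns a symmetric boolean adjacency matrix."""
    upper = rng.random((n, n)) < p
    adj = np.triu(upper, k=1)       # keep one draw per node pair, no self-loops
    return adj | adj.T

n, p = 2000, 0.01
adj = erdos_renyi(n, p)
degrees = adj.sum(axis=1)

# Each node's degree is Binomial(n - 1, p), so the mean degree
# should be close to (n - 1) * p ≈ 20.
print(degrees.mean())
```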
Scale-free networks, on the other hand, do not follow this degree distribution. Rather, they follow a power law, p(k) ~ k^-a. One process that explains this is preferential attachment: new nodes are more likely to connect to nodes that are already highly connected. Most of the discussion and the following arguments on this are in the article.
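A Barabasi-Albert-style growth sketch makes the mechanism concrete (parameters n=5000, m=2 are illustrative): each new node attaches to existing nodes with probability proportional to their current degree, and the result is a heavy-tailed degree distribution with hubs far above the mean, unlike an ER graph.

```python
import numpy as np

rng = np.random.default_rng(0)

def preferential_attachment(n, m=2):
    """Grow a network node by node; each new node adds m edges,
    choosing targets with probability proportional to their degree."""
    degree = np.zeros(n, dtype=int)
    degree[: m + 1] = m             # seed: a fully connected clique of m+1 nodes
    for new in range(m + 1, n):
        probs = degree[:new] / degree[:new].sum()
        targets = rng.choice(new, size=m, replace=False, p=probs)
        degree[new] = m
        degree[targets] += 1
    return degree

deg = preferential_attachment(5000)
# Mean degree is ~2m, but the best-connected hub ends up far above it.
print(deg.mean(), deg.max())
```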
If I remember correctly, Mathematica originated the notion of a notebook interface for working with code; Theodore Gray even holds the patent for it [0]. Even though the post reads like a long-form ad for Mathematica and the Wolfram Language, I do agree with the notion of code written to present an idea/solution as a story. Sadly, with Jupyter notebooks being free/open source and compatible with kernels for other languages, Mathematica and its notebook interface seem to be losing mindshare among researchers.
[0] Patent US 8407580 B2 - Method and system for presenting input expressions and evaluations of the input expressions on a workspace of a computational system
(https://www.google.com/patents/US8407580)
Mathematica is a great prototyping language for people who know mathematics. It is the equivalent of Microsoft Excel. The problem is that when you want to do something beyond that, you will probably rewrite it in something else. And that something else is increasingly Python.
I do like the term "computational essay". It describes a way of presenting information, whereas "notebook" feels to me more like a set of calculations that may or may not be commented and don't follow a set structure.
It would be neat if we had a standard notebook notation or system that would subsume all of these competing standards. Unfortunately it would probably end up like this https://xkcd.com/927/
>The problem is that when you want to do something beyond that, you will probably rewrite it in something else. And that something else is increasingly Python.
You aren't going to use your Python code in your Jupyter notebook in production, are you? Jupyter is for exploration and exposition. You're going to have to rewrite it anyway.
This is happening, as a direct consequence of Stephen Wolfram's ego, not because there's something inherently wrong with Mathematica, the language (now renamed to "Wolfram language"), or the ecosystem.
I'm not sure how different this is from Cython, since you can also write C/C++ code with it and hence use any C/C++ libraries. I'd say the benefit offered by the example is more about bringing the Rust ecosystem to Python than about solving performance issues.
Well, with Cython, either you write Python compiled to C, and you won't match Rust's performance or its type system, or you write C/C++ and bind it to Python, and you won't match Rust's safety.
Cython just transpiles to C. It's not "as fast as C"; it's as fast as the runtime support structures it uses, and the communication with Python, allow.
You can write very efficient Cython code, but it's true that in this case you tend to adopt a lower-level code style that is very close to C/C++. Basically, you need to think about the C/C++ code that Cython will generate.
C/C++ compilers might be able to generate more optimized native code than rustc does, though. Actually, this is a real question: how good is rustc with numerical / math-intensive code? For instance, does it implement loop unrolling and SIMD vectorization?
Most of the time when a developer needs loop unrolling, numpy will work best anyway. Why does everyone always bring this up when performance is mentioned?
For example, in my case I always need high-performance code to work with strings loaded from loads of CSV files. That includes merging strings, matching them, and comparing them. Loop unrolling/SIMD would not really help here, while the ability to write safe, checked code quickly would.
On the other hand I do need the pythonic dynamics, so that's what I stick to.
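To make the numpy point above concrete (toy reduction, my own example): the loop below is the kind you'd hand-unroll in C, but numpy dispatches it to compiled (often SIMD-vectorized) kernels, so no Python-level unrolling is needed.

```python
import numpy as np

x = np.arange(1000, dtype=np.float64)

# Explicit Python loop, the shape you'd hand-optimize in C:
total = 0.0
for v in x:
    total += v * v

# Same sum of squares as a single vectorized call, no Python-level loop.
# (All values here are exact integers in float64, so the two totals match.)
vec_total = float(np.dot(x, x))
print(total == vec_total)  # True
```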
Total layman here, but I thought one would use Rust because it’s easier to write “safe” code with it than C/C++, while maintaining an equivalent low-level speed advantage.