conjectureproof's comments

Why?

I briefly had an interest in learning Q, then looked at some code: https://github.com/KxSystems/cookbook/blob/master/start/buil...

Why not just build what you need with C/arrow/parquet?


Not that Q.

The one I am talking about hasn’t been released yet!!


You could consider offering to be hired as a consultant.

Pete Muller said of his quants, "I want their shower time because in the shower they are thinking about things that get them to solve the problems." Put another way, in certain roles >50% of the value is created in the 20% of office time spent crystallizing ideas developed outside the office.

If your team can capture that >50% value from 20% time at 20% comp, that's a huge win. This assumes your work is qualitatively different from a new hire's, not "more widgets with fewer defects in less time". If it's the latter, the company may reasonably prefer a new hire who can put in the 50-hour slog. And you will too; it's no fun working as a time-metered, widget-making FTE stand-in.


Interestingly, I came to the exact opposite conclusion from that quote: when people stop working full-time, the loss is greater than just the percentage of time given up, because you also lose that extra edge, the shower thoughts. That's natural when your job becomes less of your main focus.


Been there, done that, bought the t-shirt. It's feast or famine. Stability is found at (some) large corporations or in government.


Neat.

A simple way to track time spent on projects that is resilient to user forgetfulness, and much better than collecting timestamps from git commits. It could be interesting to merge this with git history and measure how productivity (some combo of bash activity, git activity, and lines-of-code/Kolmogorov-complexity) changes with time-of-day, season, weather, etc.

# store a timestamp with each history entry
export HISTTIMEFORMAT="%F %T "

# append to the history file at every prompt; timestamps are written
# as "#<epoch>" comment lines when HISTTIMEFORMAT is set
# (the ${VAR:+...} guard avoids a leading ";" when PROMPT_COMMAND is empty)
PROMPT_COMMAND="${PROMPT_COMMAND:+$PROMPT_COMMAND; }history -a"

# do not truncate .bash_history (empty HISTFILESIZE disables truncation)
export HISTFILESIZE=""

# append rather than overwrite (collect from multiple shells)
shopt -s histappend

# store multi-line commands as a single history record
shopt -s cmdhist
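
To turn the raw history file into activity data, here is a minimal parsing sketch (assuming the "#<epoch>" timestamp lines described above; the per-hour binning is just one illustrative choice):

# parse_history.py: count commands per hour-of-day from ~/.bash_history,
# using the "#<epoch>" timestamp comment lines bash writes when
# HISTTIMEFORMAT is set
import os
from collections import Counter
from datetime import datetime

histfile = os.path.expanduser("~/.bash_history")
by_hour = Counter()

with open(histfile, errors="replace") as f:
    for line in f:
        line = line.strip()
        # timestamp lines look like "#1699999999"
        if line.startswith("#") and line[1:].isdigit():
            ts = datetime.fromtimestamp(int(line[1:]))
            by_hour[ts.hour] += 1

for hour in sorted(by_hour):
    print(f"{hour:02d}:00  {by_hour[hour]}")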


+1 on Elements of Statistical Learning.

Here is how I used that book, starting with a solid foundation in linear algebra and calculus.

Learn statistics before moving on to more complex models (neural networks).

Start by learning OLS and logistic regression, cold. Cold means you can implement these models from scratch using only numpy ("I do not understand what I cannot build"). Then try to understand regularization (lasso, ridge, elastic net), where you will learn about the bias/variance tradeoff, cross-validation, and feature selection. These topics are explained well in ESL.
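
To make "from scratch" concrete, here is a minimal numpy sketch of both models (the synthetic data and learning rate are arbitrary choices, not prescriptions):

# from-scratch OLS and logistic regression with numpy only
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.c_[np.ones(n), rng.normal(size=(n, 3))]   # design matrix with intercept

# OLS: solve the normal equations (X'X) beta = X'y
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# logistic regression: gradient ascent on the mean log-likelihood
z = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta_true))))
w = np.zeros(X.shape[1])
for _ in range(5000):
    p_hat = 1 / (1 + np.exp(-(X @ w)))
    w += 0.5 * X.T @ (z - p_hat) / n   # gradient step, fixed learning rate

print("OLS estimate:     ", np.round(beta_ols, 2))
print("logistic estimate:", np.round(w, 2))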

For OLS and logistic regression I found it helpful to strike a 50/50 balance between theory (derivations and problems) and practice (coding). For later topics (regularization etc.) I found it helpful to tilt toward practice (20/80).

If some part of ESL is unclear, consult the statsmodels source code and docs (top preference) or scikit-learn (second preference; I believe it has rather more boilerplate, "mixin" classes and the like). Approach the code with curiosity. Ask questions like "why do they use np.linalg.pinv instead of np.linalg.inv?"
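
On that particular question, a small sketch of the answer: pinv keeps working when the design matrix is rank-deficient, while inv fails or returns rounding noise (the collinear column below is contrived for illustration):

# why np.linalg.pinv instead of np.linalg.inv: rank-deficient designs
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
X = np.c_[X, X[:, 0] + X[:, 1]]   # fourth column: exact linear combination
y = rng.normal(size=50)

# X'X is singular, so inv either raises or returns numerically
# meaningless values driven by rounding error
try:
    print("condition number of X'X:", np.linalg.cond(X.T @ X))
    np.linalg.inv(X.T @ X)
    print("inv 'succeeded' despite singularity")
except np.linalg.LinAlgError as err:
    print("inv failed:", err)

# pinv returns the minimum-norm least-squares solution regardless of rank
beta = np.linalg.pinv(X) @ y
print("pinv coefficients:", np.round(beta, 3))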

Spend a day or five really understanding covariance matrices and the singular value decomposition (and therefore PCA, which will give you a good foundation for other, more complicated dimension-reduction techniques).
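
To see the covariance/SVD/PCA connection concretely, a short sketch on synthetic data:

# PCA two ways: eigendecomposition of the covariance matrix,
# and SVD of the centered data matrix; the two routes agree
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))   # correlated features
Xc = X - X.mean(axis=0)                                   # center first

# route 1: eigenvectors/eigenvalues of the sample covariance matrix
cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]        # descending order

# route 2: SVD of the centered data matrix
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

print("variances via eig:", np.round(eigvals, 3))
print("variances via svd:", np.round(s**2 / (len(Xc) - 1), 3))   # identical

scores = Xc @ Vt.T   # principal component scores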

With that foundation, the best way to learn about neural architectures is to code them from scratch. Start with simpler models and work up from there. People much smarter than me have illustrated how that can go: https://gist.github.com/karpathy/d4dee566867f8291f086 https://nlp.seas.harvard.edu/2018/04/03/attention.html
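
For a flavor of the first rung of that ladder, here is a two-layer network trained on XOR in plain numpy (the architecture and hyperparameters are arbitrary):

# a tiny MLP from scratch: one hidden layer, trained on XOR
import numpy as np

rng = np.random.default_rng(3)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)
lr = 0.5

for step in range(5000):
    # forward pass
    h = np.tanh(X @ W1 + b1)
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))        # sigmoid output
    # backward pass (cross-entropy loss; gradients derived by hand)
    dlogits = (p - y) / len(X)
    dW2 = h.T @ dlogits; db2 = dlogits.sum(axis=0)
    dh = dlogits @ W2.T * (1 - h**2)            # tanh' = 1 - tanh^2
    dW1 = X.T @ dh; db1 = dh.sum(axis=0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= lr * grad                       # in-place SGD update

print(np.round(p.ravel(), 3))   # should approach [0, 1, 1, 0]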

While not an AI expert, I feel this path has left me reasonably prepared to understand new developments in AI and to separate hype from reality (which was my principal objective). In certain cases I am even able to identify new developments that are useful in practical applications I actually encounter (mostly using better text embeddings).

Good luck. This is a really fun field to explore!


Lenny Baum, Lloyd Welch, and their colleagues at IDA were using the EM algorithm for code cracking well before they were able to prove anything about its convergence.

EM worked in practice, so they spent a long time trying to prove convergence. Modern proofs are simpler.

Could be the case that this method also works in practice. I haven't the faintest idea whether it will.

