I want to love Julia. The core language is excellently designed and better than the alternatives, although the performance in practice is not nearly as good as the benchmarks would have you believe. String manipulation and building recursive data structures composed of small types (trees) are especially slow, considerably slower than in Python.
The libraries, on the other hand... I know it's new, but the DataFrames.jl package in particular gave me fits. Data frames are essential tools for statistics, and this implementation has many problems. When I last used it, it took several minutes to load modest 10 MB TSV matrices, and it segfaulted entirely on slightly larger ones. It doesn't support indexes on both axes, and the developers made the extremely questionable decision to require that index names be valid symbols. I could go on.
I think the core developers should exercise more control over the library ecosystem, at least for the packages that are crucial to the type of workflow they're building the language for.
FWIW, I think the DataFrames package and its dependencies have consistently operated at the boundaries of what we know how to do efficiently in Julia. The package has had lackluster performance in many contexts, primarily because it adopted many idioms from R and Python that were sharply at odds with Julia's type inference system. We're starting to clear those problems up, but there are still lots of open challenges we need to resolve.
If you have any ideas about how we should modify the basic data types and functions defined in DataFrames, those ideas would go a long way to making Julia a better language.
I fully appreciate that the type system imposes constraints that don't exist in Python or R. For my purposes in particular, and I think for many people's, I don't actually need a full-fledged data frame with heterogeneous types. What I actually want is a numeric matrix with labels on both axes and good methods for querying, group-by operations, etc. (and an equivalent numeric Series type). Big bonus for memory mapping and/or fast I/O.
I think this is an easier problem to solve, especially since factors and ordinals can be treated as a special kind of numeric type.
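To make that request concrete, here is a minimal Python sketch of the kind of structure I mean. Everything in it (`LabeledMatrix`, `encode_factor`, and the method names) is hypothetical and invented for illustration; it is not the DataFrames.jl API, and it ignores memory mapping and fast I/O entirely. It only assumes that labels are arbitrary hashable values (strings or integers) and that all cells are stored as floats.

```python
class LabeledMatrix:
    """Hypothetical sketch: a dense numeric matrix with labels on both axes."""

    def __init__(self, values, row_labels, col_labels):
        if len(values) != len(row_labels):
            raise ValueError("one row label per row is required")
        if any(len(row) != len(col_labels) for row in values):
            raise ValueError("one column label per column is required")
        # Every cell is stored as a float, so the matrix is homogeneous.
        self.values = [[float(x) for x in row] for row in values]
        self.row_labels = list(row_labels)
        self.col_labels = list(col_labels)
        # Labels are looked up through dictionaries, so any hashable value
        # works: strings with spaces, plain integers, and so on.
        self._row_pos = {label: i for i, label in enumerate(self.row_labels)}
        self._col_pos = {label: j for j, label in enumerate(self.col_labels)}
        if (len(self._row_pos) != len(self.row_labels)
                or len(self._col_pos) != len(self.col_labels)):
            raise ValueError("labels must be unique within an axis")

    def at(self, row_label, col_label):
        """Return the single cell addressed by a (row, column) label pair."""
        return self.values[self._row_pos[row_label]][self._col_pos[col_label]]

    def row(self, row_label):
        """Return one row as a {column label: value} mapping."""
        return dict(zip(self.col_labels, self.values[self._row_pos[row_label]]))

    def group_rows_mean(self, key):
        """Group rows by key(row_label) and average each column per group."""
        groups = {}
        for label, row in zip(self.row_labels, self.values):
            groups.setdefault(key(label), []).append(row)
        return {
            group: [sum(col) / len(rows) for col in zip(*rows)]
            for group, rows in groups.items()
        }


def encode_factor(values):
    """Map categorical values to float codes in order of first appearance."""
    level_code = {}
    codes = []
    for v in values:
        code = level_code.setdefault(v, len(level_code))
        codes.append(float(code))
    return codes, list(level_code)


# Row labels mix strings and an integer; nothing gets renamed.
m = LabeledMatrix(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    row_labels=["gene A", "gene B", 1000],
    col_labels=["ctrl", "treated"],
)
print(m.at(1000, "treated"))
print(m.group_rows_mean(lambda label: isinstance(label, str)))
print(encode_factor(["lo", "hi", "lo"]))
```

The point of `encode_factor` is only that a factor column can live in the same float matrix as everything else once its levels are mapped to codes; a real implementation would also have to keep the level table attached to the column.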
It has been too long since I've looked at the internal code structure of DataFrames.jl, but I think the biggest design flaws at the time were the requirement that index names be symbols (they should probably be either a flat String, or a choice between String and Int64) and the restriction of indexes to the column axis. I can only assume the symbol decision was made for performance, but you have surely worked with datasets from investigators who use all kinds of random conventions for index names that don't fit the constraints of a symbol. Not to mention the very common case of numeric index names. I find it very annoying to read such a file in R and get "X1000" or whatever as my index names.
I actually tried briefly to dive in and fix the I/O problems, but the code style was daunting: a few very large functions. If it hasn't been done already, I would suggest breaking them up a little.
Anyway, I didn't mean to be overly critical, and I think you're doing a very important task, but this is my honest assessment of why I, as a busy scientist, found Julia to be more trouble than it was worth.