I would discount the .c code. Rubinius's native code (excluding third-party code) is largely C++, so better to focus on .cpp and .hpp files.
The .rb count is way higher than what I got. The line counts I got (on master, not hydra):
vm / {cpp,hpp}: 162kloc
kernel / rb: 31kloc
lib / rb: 145kloc
Oddly enough the total ratio is roughly what you came up with, but I'm not sure where your numbers came from.
It's worth noting that the density of .rb source is considerably higher than that of .cpp source; a few lines of .rb can do what might take dozens or hundreds of lines of .cpp. I wouldn't fault them based on LOC; Rubinius is still implemented more in Ruby than any other implementation out there.
Let's say one line of .rb is conservatively worth 10 LOC of C++. That puts them at more like 10% "not Ruby", going by the amount of work they get done in each language. Even if you only consider kernel / rb versus vm / {cpp,hpp}, the 10X ratio means they're about 1/3 "not Ruby". That's pretty good.
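For anyone who wants to check that arithmetic, here is a rough sketch using the line counts quoted above. The 10X weight is the commenter's assumption, and `not_ruby_share` is just a name made up for this illustration.

```ruby
# Line counts quoted earlier in the thread, in thousands of lines (kloc).
CPP_VM_KLOC    = 162  # vm / {cpp,hpp}
RB_KERNEL_KLOC = 31   # kernel / rb
RB_LIB_KLOC    = 145  # lib / rb

# Hypothetical weighting: one line of Ruby counts as `weight` lines of C++.
# Returns the fraction of the weighted total that is C++ ("not Ruby").
def not_ruby_share(cpp_kloc, rb_kloc, weight)
  cpp_kloc.to_f / (cpp_kloc + rb_kloc * weight)
end

all_rb      = not_ruby_share(CPP_VM_KLOC, RB_KERNEL_KLOC + RB_LIB_KLOC, 10)
kernel_only = not_ruby_share(CPP_VM_KLOC, RB_KERNEL_KLOC, 10)

puts format("kernel + lib: %.1f%% not Ruby", all_rb * 100)      # 8.4%
puts format("kernel only:  %.1f%% not Ruby", kernel_only * 100) # 34.3%
```

With kernel and lib both counted, the weighted C++ share is a bit under 10%; with kernel alone it lands close to one third. Change the weight and both numbers move a lot, which is worth keeping in mind when reading the rest of this argument.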
I just counted everything including stdlib and whatnot (all that comes in the checkout). "Rubinius is still implemented more in Ruby than any other implementation out there." Well, that's probably true, but not really useful. If all Ruby implementations had 0 lines of Ruby and Rubinius had 1, would you call it "mostly written in Ruby"?
10X on LOC count is bogus, just because it's a single number. It does depend on what you do, it might end up being that, being more or being less, essentially [citation needed].
Anyway, even assuming this odd coefficient, I wouldn't say 1/3 "not Ruby" is "pretty good", although "pretty good" is not a precisely defined term, so we're free to agree to disagree on that. Given the same assumptions you can say CPython is written mostly in Python (10x Python-vs-C, including all the library code).
* stdlib is library code, not core VM code. It shouldn't be counted.
* Percentage of functionality of Rubinius's core implemented in Ruby is definitely higher than percentage of functionality implemented in C++, at least for the core Ruby classes. Hard to quantify that though.
* Rubinius's VM is basically all C++. I will concede that.
Awesome, the great thing about rubinius is playing around with the internals, since they're mostly written in ruby. The C++ core is quite tiny actually.
Obviously great news. It dampens my enthusiasm a bit to know the same company funds development of JRuby however.
EngineYard really should clarify their vision behind maintaining two VMs. Going beyond the obvious "JRuby is for integration with existing J2EE infrastructure" angle would be nice.
Under what circumstances would they actually stand behind Rubinius as the better choice? Is the MRI C extension API compatibility an official feature now? Is it that I could switch from REE to Rubinius without having to worry about finding alternatives for all the libs I'm using that carry C-extension baggage?
This is confusing. JRuby is a great and important project. It's how several very large Rails apps ship deployable versions of their product; it's the easy way to get a Rails app deployed in shops where the normal mechanism for getting deployed is "submit a war file"; and it's how you get Ruby access to a zillion Java libraries.
Rubinius is a native implementation of Ruby that seems, over the long term, to have a goal of being the best single implementation of the language all other things being equal.
Why would I ever want to penalize a company for funding both of these awesome projects?
Microsoft clearly differentiates versions of SQL Server. That's all I'm asking for.
If JRuby performance continues to match or exceed Rubinius performance, but JRuby also gives me deployment flexibility and Java OSS integration options Rubinius doesn't, why would I choose Rubinius for my next green-field project?
That's not an accusation. It's a serious question.
I think Evan's answer on the Ruby C API is a pretty compelling reason on its own. For businesses that don't have an easy JRuby migration path because of dependencies on multiple gems with C-extensions, maybe Rubinius is a much easier way to get into the next JRuby/Rubinius/YARV Ruby performance bracket.
That can be very compelling on its own. What are some other reasons I might consider Rubinius?
Given the down-votes I guess I'm in the minority, but I think it's a fair question.
I was an early supporter of Rubinius, and put my money where my mouth was on more than one occasion with cash donations to the project.
I get that it's very neat from an academic standpoint. No argument here. Is it something EngineYard is going to stand behind and just as importantly, start marketing so I get a good feel for the practical uses?
For my needs, jruby will never suffice as java isn't going to be installed on the servers I'm on. Nor to be honest would I want it, and JNI for c extensions is also a nonstarter. We have puppet at work for server deploys, and well, jruby may suffice for web/app server deploys that just run rails. But not all ruby users actually give a rats about rails (that's me, don't use it at all at work), nor do we have a need for java interop. Basically I'm in the exact opposite of your vantage.
I need c bindings for a few gems I made that need c ffi bindings, I don't need/care/use any java interop, and personally, I'd rather not have the jvm running on any of my servers.
I look at rubinius as the eventual YARV for the compiled version as YARV was to MRI, not as an academic project. But I admit, I seem to be a ruby outlier ever since rails has been around.
Your reluctance to install Java on the servers you're on means you're missing out on one of the best Ruby implementations available. You'd be well served by putting aside your prejudice and just giving JRuby a try.
That said, there are certainly domains where JRuby is not a great fit. C extension-heavy systems are one; because C extensions are so crippling to more modern VMs like the JVM or Rubinius, you're better off using the implementation the C API was designed for (MRI). Or if Rubinius suits your needs and C extensions work well enough, I'll recommend Rubinius over JRuby any day. I hate C extensions, or at least I hate them in the form they exist in MRI, since they're invasive and damaging to modern VM features like good GC and concurrent threads.
In any case, it's worth seeing if FFI (the Ruby API that Rubinius pioneered and JRuby made generally available) serves your purposes, because it works extremely well in JRuby and requires no C extension support. And I'd suggest you actually give JRuby a try; it's so much more than "Ruby for Java interop" and may work better as a "plain old Ruby impl" than you expect.
I just used JRuby FFI to implement a native x86-64 debugger on WinAPI; we had x86 working for a long time on Win32, Linux, and OSX using MRI, and JRuby was actually the only way to get around LP64 issues with MRI. JRuby has been a compat and native code win for us; it is a better platform for accessing C code than MRI Ruby is.
Huge fan of FFI (I use it to wrap/test all my shared libraries, mysql ext, etc..), but the problem of cross platform compilation still exists. I write a shared library and an FFI interface, but packaging it for use on Windows/*nix is still a pain (for compilation during install). Is there anything that solves for compilation across implementations and platforms? (FFI makes the interface simple after that)
I don't think we should have to sign up for coupling with a specific Ruby's runtime internals just to get cross-platform compilation of a shared library. For whatever it's worth, even autoconf can be made to work cross-platform.
Yeah, for that I don't have a solution. It seems like there ought to be a way to have FFI libraries bring along the lib they need and compile it right there, but portable C libs are still a bitch to manage across platforms no matter what you do. I hate to say it, but JVM libraries have a huge advantage here; most JRuby-specific libs that ship a jar file just work on any platform with no recompile and no portability issues. Hard to beat that :(
That's great to hear! Given that FFI is intended to work across all Ruby impls, I wish there were more attention paid to using FFI instead of C extensions. I will admit there are some usability problems, especially around cross-platform struct mapping, but the resulting libraries are vastly easier to deal with than a bunch of grotty Ruby C API code.
JRuby's FFI implementation is great, but I've had about as much success with MRI, so I test my FFI libs on MRI 1.8.7, 1.9.2 and the latest JRuby. I'm looking forward to the day when Rubinius' FFI converges with the other implementations.
I don't use Rails. I see the JVM as a tool that gives me access to code I might find useful, and web-servers that are (IMO) 1000X superior to anything available to MRI. That and much better performance than REE anyways.
If you need C-extensions, well, there you go. Rubinius as a better MRI I totally get/agree with.
I think we agree on that as a good reason for the season for Rubinius.
It'd just be nice for EY to put up a "Why Rubinius?" page. And further, a "Which VM best targets my project? JRuby or Rubinius?" page would make me very happy. I can't imagine it could hurt the adoption of either.
C ext support shipped in an "experimental" form in JRuby 1.6. The main problem is that C extensions suck in so many ways:
* Can't have multiple VMs in process, because C exts load all their state into C globals
* Can't run them concurrently, because they can't be trusted not to segfault
* Don't participate well in modern GCs, so they have significantly higher call overhead in JRuby or Rubinius than they do in MRI
FFI is a better option in almost every case, although it is admittedly a much trickier path to cross-platform library binding. The best thing the Ruby community could do to move Ruby forward is to either come up with a better C API or start using FFI for everything. C exts suck.
> What are some other reasons I might consider Rubinius?
JRuby uses UTF-16 as its internal string encoding, so Rubinius is going to use a lot less memory. It can also boot its VM more quickly since the client hotspot JVM has been largely neglected in favour of optimizing JIT for long-running processes. I would use JRuby for sure by default, but there may be places where these two things would disqualify it.
Rubinius development also feeds into JRuby--I believe JRuby uses some of Rubinius's standard library (or at least it used to), and rubyspec came from Rubinius as well.
JRuby uses exactly the same in-memory format as Ruby, byte[]. We also support the Encoding logic from Ruby 1.9, again atop byte[]. JRuby actually has better String performance and encoding support than Java itself in many cases.
Rubinius boots negligibly faster than JRuby; for simple scripts, JRuby takes around 0.5s on my system versus 0.35s for Rubinius. All implementations that JIT to get to full speed will have startup issues.
JRuby does not use any of Rubinius's standard library. We do both share most of MRI's standard library, however.
As for RubySpec; Rubinius started it, and JRuby is a very heavy contributor to the project. However JRuby also runs many other Ruby test suites, so we have not been as dependent on filling out RubySpec.
We do both share implementation ideas and advocate for each other in debates with ruby-core. Hopefully in the future we can work together on adding new standard APIs as well (rather than unilaterally adding them to one implementation or the other).
I shouldn't be surprised, but my knowledge of the ruby world seems to be about two years out of date. I suppose any remaining additional memory usage is probably due to inheriting space/speed tradeoffs straight from upstream hotspot.
It's great to hear that the startup time has improved so much.
Actually a lot of the process size in JRuby is the fact that we need to set JVM's max size high for rare cases, but then the JVM happily grows to fill much of that size. If you choke it down to a smaller size, we're competitive with at least generational-GC impls like Rubinius (but still much larger than conservative-GC impls like MRI).
I'm not sure if what I'm thinking is correct (or even technically possible), but couldn't it be that what they are envisioning is the Rubinius architecture being the one true Ruby with two backends: C-Rubinius or Java-Rubinius?
What it would mean is that the C-Rubinius and Java-Rubinius releases would be simultaneous: you can choose to pick one or the other based on your surrounding stack.
This would mean that they fund both JRuby and Rubinius until they can be merged.
> JRuby as a gateway drug to the Land of Ruby; _could_ end up being VM winner too.
I don't get the "could". JRuby is among the fastest, gives you access to a massive amount of OSS and infrastructure, and fixes Ruby's threading.
It's done these for years at this point.
In my mind the question is more along the lines of "why should I use anything _but_ JRuby on a new project?".
It's blasphemy I'm sure, but I just don't see the point in C-Ruby for new development at this point. It's legacy. JRuby, as the last-man-standing among the major Ruby VM implementations now that IronRuby is dead and everything else is vapor (Maglev) or niche (MacRuby) is the default modern Ruby implementation to my mind.
So let's assume Rubinius is great, and removing the GIL is clearly a major step forward. Is EngineYard's intent really to supplant YARV with Rubinius?
Because that sounds great to me.
I've never seen EY say that though (they might have and I just missed it). If that's the vision, I'd rather see them come out and say it. Having developers guess, and having any ambiguity at all around the future of these projects and the messaging surrounding them doesn't inspire confidence.
It's blasphemy in that your use of JRuby is inconsequential to rubinius. Not everyone uses JRuby, and the non JRuby using world would like the GIL thread locking abilities in their interpreters without java+JRuby.
Native c/c++ ruby isn't legacy in everyone's environments, just perhaps yours. Why does EngineYard need to justify both implementations? I don't use JRuby at all outside of occasional curiosity. But I'm also not calling it useless in the same way you're basically attacking any non JRuby ruby implementation, which apparently includes MRI. Which is arguably the "default" implementation. Language implementations aren't a zero sum game. Work on one doesn't mean we can't have another. I don't see the Python people complaining about PyPy and vice-versa, they both serve their purposes.
Well put. JRuby and Rubinius both have rich futures ahead of them, and they're going to be largely separate.
If you want to use C extensions, Rubinius will be a better fit since it matches the MRI process model better. Rubinius will also map more directly to underlying OS APIs and quirks than JRuby, since we have to either use the JVM's vanillified versions of APIs or do the hard work of routing around the JVM with our own native libraries. And if you're a Rubyist looking to hack on internals, Rubinius is certainly more approachable right now. We hope to implement more of JRuby in Ruby in the future, but we'd need a really good reason to throw out fast, working code to make a move right now.
If you want to run an entire Rails site in a single process, JRuby's going to be the best option for a long time. C extensions required for native Ruby impls to interface with databases, etc, will always force trade-offs in memory management, concurrency, and process models. Those trade-offs do not exist when running on JRuby and using Java equivalents of those libraries. As a rule of thumb, the more unmanaged code you have in your application, the more hassles you're going to have. Native impls do almost all their interaction with the outside world via unmanaged code (C exts for DB access or fast memcaching, C-accelerated servers like Thin or Unicorn, etc). That's a problem.
If you just want to run some Ruby code, both implementations are going to serve you well. You're going to need to evaluate all options before deciding, and both implementations are going to be excellent Ruby implementations first and differentially interesting platforms second. And therefore, as you say, it's not a zero sum game. We'll all bring different features and weaknesses to the table, and users will choose which impl fits their needs best.
> JRuby is among the fastest, gives you access to a massive amount of OSS and infrastructure, and fixes Ruby's threading.
Being faster than MRI on some tests doesn't say much because from a performance standpoint MRI is a piece of shit.
If you're looking for a good reason why Rubinius is important: JRuby cannot innovate, and as Ruby progresses, JRuby is bound to stagnate.
For example, continuations are really useful in many contexts, yet JRuby has no chance to actually implement continuations with decent performance overhead. Also, the Refinements proposal for Ruby 2.0 is a bitch to implement on JRuby. Even the Fibers in Ruby 1.9 come with a lot of overhead.
The reason is that JRuby inherits all the JVM's flaws. The JVM was designed for Java. The optimizations it makes function best in the context of Java. The type system is the one from Java and working around it is extremely painful. The garbage collector itself is designed for Java. It has rough limitations on what it allows and what it doesn't allow. And when I say Java, I mean Oracle's Java SE, or OpenJDK, not Android, not Java ME. And considering that Apache Harmony does not have the same JIT-ing characteristics as Oracle's Java SE, while I haven't seen any benchmarks I'm willing to bet the performance is much worse.
JRuby may shine among the current bunch of VMs, but that doesn't say much, because on top of the JVM in its current incarnation (including the invokedynamic additions) you'll never be able to achieve the performance Smalltalk was capable of.
If you care about Ruby as a language, caring about a VM like Rubinius is a no-brainer.
How can you say that JRuby can't innovate? JRuby is far from stagnating.
Continuations are almost never used, and in the rare cases where they are used they are almost always more straightforwardly replaced by non-continuation alternatives. FWIW, Rubinius doesn't support continuations either. It may some day...and so may the JVM.
Refinements will be easy to implement in JRuby. I've implemented prototypes of it in hours of effort. You don't know what you're talking about.
Fibers perform well in JRuby, but necessarily are implemented with native threads. There's work to add full VM-level coroutines in Java 8, and you can build versions of OpenJDK with coroutine support already. Once that's there, we will have substantially faster Fibers than Ruby 1.9.
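As a quick illustration of what is being implemented here, this is a minimal Fiber sketch using only the core `Fiber` class from Ruby 1.9 and later. The helper name `counter_fiber` is made up for the example. Each `resume` runs the block until the next `Fiber.yield`, and that suspend-and-continue behaviour is what a VM has to back with either a native thread or a real coroutine.

```ruby
# Hypothetical helper: builds a fiber that yields 1..limit, then finishes.
def counter_fiber(limit)
  Fiber.new do
    1.upto(limit) { |n| Fiber.yield n }  # suspend, handing n to the caller
    :done                                # block's value is the last resume result
  end
end

fiber  = counter_fiber(3)
values = 4.times.map { fiber.resume }
p values  # [1, 2, 3, :done]
```

The code is the same on every implementation; only the cost of each `resume` and `Fiber.yield` switch differs, which is the overhead being argued about.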
JRuby supports all of Ruby's class structures and features atop the JVM. The JVM's type model does not hinder us implementing Ruby's class model.
The JVM optimizes calls; Ruby makes lots of calls. The JVM does an outstanding job of optimizing Ruby code in JRuby, and invokedynamic is already making small Ruby benchmarks in JRuby run within a few times the speed of an equivalent Java implementation. That's incredible. I've actually sat and read the assembly code Hotspot outputs and worked steadily to remove as much overhead as possible, with or without invokedynamic. I guarantee you the JVM is going to optimize Ruby like gangbusters. You don't know what you're talking about.
I'm not sure what Smalltalk performance you're talking about. No Smalltalk implementation has managed to approach the performance of current JVMs, and JRuby is starting to approach the performance of Java when using invokedynamic.
If you care about Ruby as a language, supporting many implementations and caring about all of them is a no-brainer. Slagging off one implementation you obviously have no correct facts about shows you don't care about Ruby as much as you harbor bigotry against one implementation.
Headius, you're awesome and I'm honored that you replied to my stupid rant.
My reply was meant for people that questioned a project like Rubinius when there is JRuby around. And you're right, I only have a very high-level overview of the problems you're encountering, mostly being just educated (or maybe not) guesses.
The reason for my rant is that Rubinius is awesome too and I would rather see Rubinius replace MRI as THE reference implementation - since it's a more flexible environment built from scratch, people will be able to experiment with features more easily.
I do however object to one thing you said - continuations are awesome and I'm sad they got pulled from Rubinius.
The MRI C extension API compatibility is very much an official feature. Except for a few gems we can't make work (because they tie directly into 1.8's execution model) most work fine. If you have one that doesn't work, please just open an issue and we'll figure it out!
Been a long time coming, with a lot of blood sweat and tears on your part, but it looks like Rubinius2 is the project I was lusting after those years ago, back when it looked like Sydney was the fix for what ailed Ruby. ;-)
> Is the MRI C extension API compatibility an official feature now?
Yes, because there's lots of code out there that's targeting that API and you don't want to throw good, stable and tested code away.
> without having to worry about finding alternatives for all the libs I'm using that carry C-extension baggage
It's not really baggage that you don't want. It's only baggage in the sense that backwards-compatibility makes forward progress harder, but if you can come up with a way to maintain compatibility with an emulation layer of some sort, then that's the best of both worlds.
> EngineYard really should clarify their vision behind maintaining two VMs
It's not EngineYard's business to clarify their vision behind these 2 VMs, as both projects were started outside of EngineYard. This company is doing the Ruby community a favor by paying developers for working on these 2 projects, but the projects themselves would go on without EngineYard.
As a Ruby developer, it is important for me to have options on deployment and also support for using large C/C++ libraries (e.g., Redland RDF) or large Java libraries (e.g., machine learning, Sesame RDF, etc., etc.)