Notes from the fourth RISC-V workshop (lowrisc.org)
103 points by legulere on July 15, 2016 | 19 comments


It's cool to hear work is being done on a formal memory model. The model described by the current spec basically says "TODO it's going to be weak".
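For anyone unfamiliar with why this matters: here's the classic message-passing litmus test, written with C11 atomics. Under a weak memory model (which RISC-V permits), the relaxed version below can legally observe the flag without the data; a formal model pins down exactly which outcomes like this are allowed.

    #include <stdatomic.h>
    #include <pthread.h>
    #include <stdio.h>

    /* Message-passing litmus test. On a weakly ordered machine (as the
     * RISC-V spec permits), the relaxed accesses below may be reordered,
     * so the consumer can observe flag == 1 but data == 0. Replacing
     * relaxed with release (store) / acquire (load) forbids that outcome. */
    atomic_int data = 0;
    atomic_int flag = 0;

    void *producer(void *arg) {
        (void)arg;
        atomic_store_explicit(&data, 42, memory_order_relaxed);
        atomic_store_explicit(&flag, 1, memory_order_relaxed);
        return NULL;
    }

    void *consumer(void *arg) {
        (void)arg;
        while (!atomic_load_explicit(&flag, memory_order_relaxed))
            ;
        /* may legally print 0 on a weakly ordered machine */
        printf("data = %d\n", atomic_load_explicit(&data, memory_order_relaxed));
        return NULL;
    }

    int main(void) {
        pthread_t p, c;
        pthread_create(&c, NULL, consumer, NULL);
        pthread_create(&p, NULL, producer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }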


Start-up idea: a RISC-V ASIP design customized specifically for machine learning would be great.


I'm a co-founder of the lowRISC project (and author of the linked article). Our mission is precisely to make this sort of startup more feasible. We are a not-for-profit project aiming to be the "Linux of the hardware world". The hope is to increase the ability of startups, researchers, and others to innovate by taking our complete known working open source SoC as the starting point for their design. For instance, starting with the lowRISC design and adding a neural network accelerator.

An SoC capable of running Linux should be the starting point for these sorts of projects rather than something that requires substantial investment (time/money) to recreate and verify.


If you're the co-founder, how can I grab a datasheet on this chip? I'm getting into EE and this seems like a cool chip to mess with.


It definitely will be a cool chip to mess with but is not yet complete. At the workshop, Wei Song presented our latest work on adding trace debug features (which will be properly introduced on the blog soon). We expect to be RTL complete early next year and to tape out an initial test chip later in 2017.


Do you have any initial specs that you see on the horizon (whatever you're targeting)?

Some things I'd like to know about are temperature requirements, power requirements, I/O, whether there is an on-chip ADC & DAC, clock specs over temperature, and radiation hardness.

Getting even preliminary information on that stuff would be amazing!

Also, is the chip design going to be open source?


Target: 28nm, 4x 1GHz+ application cores, ~8 minion cores (see http://www.lowrisc.org/docs/memo-2014-001-tagged-memory-and-...), tagged memory, USB, LPDDR3. More detailed specs will depend on confirming those details and on the package design.

Our aim is to be open source down to the RTL. Open analog IP (e.g. the DDR PHY and USB PHY) is a much more difficult proposition: it's process-specific and reliant on NDAed/trade-secret details of the particular process, which makes redistribution difficult. Therefore we will use existing commercial PHYs, and it may make sense initially to use the standard proprietary controller each PHY has been verified against. Over the long term, as I say, we'd like all RTL to be open, but you've got to start somewhere. I think there's a parallel to the early days of GNU, where they worked on replacing UNIX components piece by piece. It would be great if we could share the placed-and-routed design at zero cost with anyone who has the appropriate agreement with the fab.


How does tagged memory interact with standard LPDDR3? Do you need two chips? Are the tags aliased somewhere in regular DDR memory?


We currently store tags in a separate region of physical memory, though you could imagine "borrowing" ECC bits. A tag cache (which, due to the small size of tags, can have a large reach) reduces the number of instances where you might see read or write amplification. We're about to start work on further optimisations here.
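To make the scheme concrete, here's a rough sketch of the address arithmetic it implies. The line size, tag width, and region addresses below are illustrative assumptions, not lowRISC's actual parameters:

    #include <stdint.h>

    /* Illustrative address arithmetic for tags stored in a reserved
     * region of physical memory. Line size, tag width, and base
     * addresses are assumptions for the example, not lowRISC's
     * actual parameters. */
    #define LINE_BYTES    64            /* bytes covered by one tag group */
    #define TAG_BITS      4             /* tag bits per line (assumed) */
    #define DRAM_BASE     0x80000000ULL /* start of normal DRAM (assumed) */
    #define TAG_BASE      0xC0000000ULL /* reserved tag region (assumed) */

    /* Byte in the tag region holding the tags for the line at paddr. */
    static uint64_t tag_byte_addr(uint64_t paddr) {
        uint64_t line = (paddr - DRAM_BASE) / LINE_BYTES;
        return TAG_BASE + (line * TAG_BITS) / 8;
    }

    /* Bit offset of that line's tags within the tag byte. */
    static unsigned tag_bit_shift(uint64_t paddr) {
        uint64_t line = (paddr - DRAM_BASE) / LINE_BYTES;
        return (unsigned)((line * TAG_BITS) % 8);
    }

At 4 tag bits per 64-byte line, the tag region is 1/128th the size of the memory it covers, which is why even a modest tag cache has such a large reach.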


The really cool thing is that existing RISC-V microprocessor implementations, like Berkeley's Rocket (https://github.com/ucb-bar/rocket) and BOOM (https://github.com/ucb-bar/riscv-boom), which lowRISC uses and modifies, have a dedicated accelerator socket.

We've had good luck developing software/hardware infrastructure for machine learning accelerators and interfacing an example multilayer perceptron backend with it (Linux integration is ongoing). This work (shameless plug: https://github.com/bu-icsg/xfiles-dana) may provide a starting point, examples, or general guidance for developing machine learning or other accelerators in this space.
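For a flavour of what "accelerator socket" means in practice: RoCC accelerators are driven by RISC-V custom-opcode instructions from the host core. A minimal sketch, assuming a GNU toolchain with .insn support and an accelerator actually wired into the socket (the funct7 value here is an arbitrary placeholder for whatever command your accelerator defines):

    #include <stdint.h>

    /* Sketch of issuing one command to a RoCC accelerator on the
     * custom-0 opcode (0x0b). funct3 = 0b111 flags rd, rs1, and rs2
     * as all in use; funct7 (0 here) is a placeholder for whatever
     * operation your accelerator defines. Without an accelerator
     * attached to the socket, this instruction traps. */
    static inline uint64_t rocc_cmd(uint64_t rs1, uint64_t rs2) {
        uint64_t rd;
        asm volatile (".insn r 0x0b, 0x7, 0x0, %0, %1, %2"
                      : "=r"(rd)
                      : "r"(rs1), "r"(rs2));
        return rd;
    }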

But I can't overemphasize how much the ongoing work at Berkeley and lowRISC has facilitated this process.


combining two buzzwords does not a business make...


How familiar are you with the current bubble?


It makes a pitch deck and possibly an exit.


Writing an assembler for AMD's GCN ISA would roughly accomplish the same task, except your processor would actually be fast.


This is incorrect. With a custom ASIC implementing ultra-wide, low-precision SIMD arithmetic (f16, i16, i8), it is possible to squeeze out an order of magnitude or more speedup over conventional GPGPUs in the same area and power budget. GPGPUs have to carry big f32 (and even f64!) FPUs, plus additional 3D-specific hardware overhead, which is why they are still suboptimal for deep learning. Google's TPU and Nervana's similar ASIC illustrate this point.
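To see where the density win comes from, here is what each lane of such a SIMD unit computes, written as scalar C. The area figure in the comment is an assumption based on multiplier area growing roughly quadratically with operand width:

    #include <stdint.h>
    #include <stddef.h>

    /* What one lane of a wide low-precision SIMD unit computes:
     * i8 x i8 products accumulated into i32 so partial sums can't
     * overflow. An 8-bit multiplier takes roughly (8/32)^2 = 1/16
     * the area of a 32-bit one (multiplier area grows roughly
     * quadratically with operand width), which is where the density
     * win over f32 ALUs comes from. */
    int32_t dot_i8(const int8_t *a, const int8_t *b, size_t n) {
        int32_t acc = 0;
        for (size_t i = 0; i < n; i++)
            acc += (int32_t)a[i] * (int32_t)b[i];
        return acc;
    }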

Also, you don't need the management cores to be fast; it suffices to have lots of 1GHz+ cores controlling ultra-wide SIMD units (as Intel's Xeon Phi has shown).

Arguably, one could get another order of magnitude speedup by implementing an XNOR-net (http://arxiv.org/abs/1603.05279) hardware accelerator, but that hasn't been done yet.
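The hardware appeal of XNOR-net is easy to see in code: once weights and activations are constrained to +1/-1 and bit-packed, a 64-element dot product collapses to one XNOR plus a popcount. A minimal sketch (encoding 1 = +1, 0 = -1; __builtin_popcountll is GCC/Clang-specific):

    #include <stdint.h>

    /* Core of an XNOR-net layer: with weights and activations
     * constrained to +1/-1 and bit-packed (1 = +1, 0 = -1), a
     * 64-element dot product collapses to one XNOR plus a popcount:
     * dot = #agreements - #disagreements = 2*popcount(~(a ^ w)) - 64. */
    static inline int xnor_dot64(uint64_t a, uint64_t w) {
        uint64_t agree = ~(a ^ w);   /* 1 where the signs match */
        return 2 * (int)__builtin_popcountll(agree) - 64;
    }

In hardware that's a row of XNOR gates feeding a popcount tree, with no multipliers at all, which is the basis for the claimed extra order of magnitude.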



That's a good idea; the TPU and Nervana show that there is large potential in such accelerators.


Does anyone know if there will be video, audio, or slides made available?


Yes, the RISC-V Foundation will be making slides and video available. Watch http://riscv.org for updates. Rick O'Connor (Executive Director of the Foundation) suggests we might see them by the end of next week or early the week after; slides may appear sooner.



