
Congratulations on the release. Well done!

A few questions:

1. Will secondary indices ever be supported? Range scans in an order other than the primary key's would be very welcome, e.g. a date range query.

2. Do you support conditional update? Or any kind of optimistic locking or versioning to coordinate concurrent updates from different clients?

3. Related to 2: how can loosely sequential IDs be generated using a table?

4. Will some transaction support be added? Don't need full ACID, just grouping updates (intra-table and/or inter-tables) in one shot would be nice. Should be feasible with MVCC already in place.

5. Do all the clients hit a central server to initiate queries, which then farms out the requests to different shards? Or does the client library know how to reach the different shards directly? The first case has a single point of failure and a scaling bottleneck.

6. Do you support automatic rebalancing of shard data (data migration) when new shards are added or old ones retired?

7. How are authentication and authorization done? Or can any client connect?

8. Internal detail. For out-of-date distributed queries on the slave replicas, is there a cost-based (or load-based) decision process to pick the most idle replica to do the sub-query?

9. Internal detail. Do you use Bloom filters to optimize distributed joins?



1. Yes. It's a matter of doing this right, which will take some time.

2. Yes. There is no special command; you just combine update and branch (http://www.rethinkdb.com/api/#py:control_structures-branch). Here's an example in Python:

  r.table('foo').get(5).update({ 'bar': r.branch(r['baz'] == 0, 1, 2)})

This will set attribute bar to 1 if baz is 0, or to 2 otherwise. Everything is atomic on that document.

3. Currently the server doesn't support sequential (or even loosely sequential) ID autogeneration. You'd have to do that on the clients, for example by using a timestamp.

4. I don't know yet how to do this really efficiently. It's relatively easy to do on a single shard, but cross-shard boundaries make this really hard.

5. Any client can connect to any server. The server will then parse and route the query. There is no central server; everything is peer-to-peer. The client library doesn't know about multiple servers right now, so the responsibility is on the user to hit a random server. Alternatively you can run "rethinkdb proxy" on localhost and connect the client to that. The proxy will then route queries to the proper nodes in the cluster.

6. In the web UI, if you click on the table and reshard, everything will be rebalanced. You don't even have to add or remove shards, it'll just rebalance data for the number of shards you have. The UI has a bar graph with shard distribution, so you can see how balanced things are.

7. Currently there is no authentication support - we expect users to use proper firewall/ssh tunneling precautions.

8. Yes, that's how queries get routed. Currently this isn't very smart, but it will get much better over time. If something breaks for you performance-wise, just reach out and we'll fix it.

9. No, not yet. If you run eq_join on a small subset of the data (99% of OLTP workloads) it will be very fast. Other joins work ok, but there's A LOT of room for optimization.

Phew!


Thanks for your and jdoliner's detailed answers! Hope I didn't ask too many questions. :) I'll respond to both here.

For 2 and 3, I think I didn't make it clear. Let me clarify. A common db problem with multiple clients is dealing with concurrent updates to the same piece of data. E.g. both client1 and client2 read D as D=15 at the same time. Client1 adds 1 to D, getting 16, and saves it. Then client2 adds 1 to its copy of D, also getting 16, and saves it as 16, which is wrong. It should be 17.

Conditional update is one feature a db usually provides to let clients deal with this problem, i.e. the update only goes through if a certain condition is met and aborts otherwise. For example: update D=16 if D==15. Client1 would succeed while client2 would fail, at which point it can retry the whole read-increment-update cycle with the newly read value.

The litmus test to see if a db system can handle this problem is to try to implement a sequential Id generation feature run by multiple clients at the same time.
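To make the litmus test concrete, here is a minimal in-memory sketch of sequential ID generation built on a conditional update. `Cell`, `update_if`, `next_id`, and `run_clients` are hypothetical names for illustration, not part of any RethinkDB API, and the lock only stands in for whatever atomicity the database provides on a single value.

```python
import threading


class Cell:
    """A single stored value with a conditional update (hypothetical, in-memory)."""

    def __init__(self, value=0):
        self._value = value
        # Stands in for the database's atomicity on one value.
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def update_if(self, expected, new):
        """Set the value to `new` only if it still equals `expected`."""
        with self._lock:
            if self._value != expected:
                return False  # someone else changed it; caller must retry
            self._value = new
            return True


def next_id(cell):
    """Read-increment-update cycle, retried until the conditional update succeeds."""
    while True:
        current = cell.read()
        if cell.update_if(current, current + 1):
            return current + 1


def run_clients(n_clients=8, ids_per_client=200):
    """Run several concurrent clients and return every ID they obtained, sorted."""
    cell = Cell(0)
    results = []
    results_lock = threading.Lock()

    def client():
        mine = [next_id(cell) for _ in range(ids_per_client)]
        with results_lock:
            results.extend(mine)

    threads = [threading.Thread(target=client) for _ in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```

If the conditional update behaves correctly, `run_clients(4, 50)` returns every integer from 1 to 200 exactly once, with no duplicates and no gaps.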

For 8, if the query is parsed into a query execution plan, you can ship the plan to all equivalent replicas and ask them to estimate the execution cost based on their current load. After they reply, pick the lowest-cost one and send it the execute command. Even a simple approach of asking all replicas for their machine load and picking the lowest one would give adaptive utilization of all the servers.
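The simple load-based variant could look roughly like this. It is only a sketch under assumptions: `Replica`, its `load` field, and the cost formula are invented for illustration and say nothing about how RethinkDB actually routes queries.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    load: float  # hypothetical metric: fraction of capacity currently in use

    def estimate_cost(self, plan_cost: float) -> float:
        """Estimate for running a plan here: base plan cost scaled up by current load."""
        return plan_cost * (1.0 + self.load)


def pick_replica(replicas, plan_cost):
    """Ask every replica for an estimate and return the one reporting the lowest cost."""
    if not replicas:
        raise ValueError("no replicas available")
    return min(replicas, key=lambda rep: rep.estimate_cost(plan_cost))
```

In a real cluster the estimates would arrive over the network and could be stale by the time the execute command is sent, so this only approximates the most idle replica.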

For 9, a Bloom filter is a relatively simple technique that can dramatically reduce the amount of data shipped across peers to do a join. You basically filter out the vast majority of the non-matching data before shipping.
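A rough sketch of the idea, assuming one peer builds a filter over its join keys and sends it to the other peer, which then ships only the rows that might match. `BloomFilter` and `prefilter_for_join` are hypothetical names, and the sizes (1024 bits, 3 hashes) are arbitrary.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive n_hashes bit positions by salting one hash function.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def prefilter_for_join(local_rows, join_key, remote_filter):
    """Keep only the rows whose join key might exist on the remote peer."""
    return [row for row in local_rows if remote_filter.might_contain(row[join_key])]
```

Rows that pass the filter still have to be joined properly on the other side, since a Bloom filter can report false positives; it only guarantees that no matching row is dropped.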

It's a good start. Good luck going forward!


Your example of conditional update can be addressed using an atomic update:

  r.table('tv_shows')
   .filter({ name: 'Star Trek TNG' })
   .update({ episodes: r('episodes').add(1) })
   .run()

http://www.rethinkdb.com/docs/advanced-faq/#atomic


I think the atomicity model here works like a transaction on the whole document, where all the changes to the attributes of a document are applied at once.

The scenario I described has to do with read consistency, where the value read by a client should not change between the time of the read and the time of the update. The usual way of handling it was to take a write lock for the duration to prevent updates from others, but that degrades concurrency. The other way is optimistic locking (or conditional update), which lets the client detect a change during that time and retry with the new value.


My point was that you don't have to do that with rethink because the entire query gets executed on the server. You don't have to take the value down to the client, make the change, and then send it back. The entire update gets evaluated on the server and the server handles atomicity in various ways (depending on the query).


That approach would only work if all the logic to compute the update can be expressed in the update query. It will break down if the read-eval-update cycle involves the client, and there are many scenarios that do.

E.g. the client reads a value, displays it to the user, gets input from the user which is based on the old value, and stores the updated value. If another user doing the same thing has already changed it, the client would like to know that and let the user retry with the new current value.


I think you, ww520, have a very good point here, and I'm also interested in what RethinkDB can offer for this usage scenario. From what I read in the ReQL command reference, it should be possible to do something like:

  r.table('foo').get(5).update({ 'bar': r.branch(r['baz'] == 0, "foo", r.error("invalid baz!"))})

I have not tested it, but this is how I understand it...


Do you think something like the following should work with RethinkDB?

  r.table('foo')
   .get(5)
   .update({
     '_rev': r.branch(r['_rev'] == 5,
       r('_rev').add(1),
       r.error("invalid revision")
     ),
     'name': "awesome name"
   })

The basic idea is that `name` should be updated to "awesome name" and `_rev` should be incremented by 1, but only if `_rev` is 5; otherwise an "invalid revision" error should be thrown.
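For anyone who wants to see the intended semantics without a server, here is a plain-Python model of that revision check. It does not use the RethinkDB driver; `StaleRevision` and `update_with_rev` are made-up names, and a dict stands in for the table, so the check and the write are only atomic here because the model is single-threaded.

```python
class StaleRevision(Exception):
    """Raised when the document changed since the client read it."""


def update_with_rev(table, key, expected_rev, changes):
    """Apply `changes` and bump `_rev`, but only if `_rev` still equals `expected_rev`."""
    doc = table[key]
    if doc["_rev"] != expected_rev:
        raise StaleRevision(
            f"invalid revision: expected {expected_rev}, found {doc['_rev']}"
        )
    doc.update(changes)
    doc["_rev"] = expected_rev + 1
    return doc


# A dict standing in for the table; document 5 starts at revision 5.
table = {5: {"_rev": 5, "name": "old name"}}

# First client read revision 5 and its update goes through.
update_with_rev(table, 5, 5, {"name": "awesome name"})

# Second client also read revision 5; its update is rejected so it can re-read and retry.
try:
    update_with_rev(table, 5, 5, {"name": "another name"})
except StaleRevision as err:
    print("retry needed:", err)
```

The rejected client would then re-read the document, show the user the current value, and try again with the new `_rev`.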


That would work. I didn't realize you can raise an error on the row. Good work!


Yes, this will work.


awesome, thanks!



Well, any product release is a huge effort, especially for a database product. Things got done and pushed out the door. Congratulations are well deserved.


> 1. Will secondary indices ever be supported? Range scans in an order other than the primary key's would be very welcome, e.g. a date range query.

Secondary indices are one of the most asked-for features, so they'll probably be added in the next release. No promises though: secondary indices are tough to do right and we won't ship them if they're not great.

> 2. Do you support conditional update? Or any kind of optimistic locking or versioning to coordinate concurrent updates from different clients?

Updates can be done with conditions on the row. For example: table.filter(lambda x: x['age'] > 25).update(lambda x: {"salary": x["salary"] + 25})

> 3. Related to 2: how can loosely sequential IDs be generated using a table?

Loosely-sequential IDs would have to be generated client side for now.

> 4. Will some transaction support be added? Don't need full ACID, just grouping updates (intra-table and/or inter-tables) in one shot would be nice. Should be feasible with MVCC already in place.

Eventually. No concrete timeline for this right now though.

> 5. Do all the clients hit a central server to initiate queries, which then farms out the requests to different shards? Or does the client library know how to reach the different shards directly? The first case has a single point of failure and a scaling bottleneck.

A client makes a connection to a specific server and all queries go through that server. However, every server can fill this role, so connections can be distributed and there's no single point of failure. An even better option is to run a proxy on the same machine as the client. For more info run:

rethinkdb --help proxy

> 6. Do you support automatic rebalancing of shard data (data migration) when new shards are added or old ones retired?

Right now sharding is a manual process. You tell the server how many shards you want and it handles figuring out how to evenly split the data, picking machines to host them and getting the data where it needs to go. What it doesn't do is readjust the split points when the data distribution changes. This will be a feature in RethinkDB 1.3.

> 7. How are authentication and authorization done? Or can any client connect?

RethinkDB has no authentication built in to it. You should not allow people you don't trust to have access to it.

> 8. Internal detail. For out-of-date distributed queries on the slave replicas, is there a cost-based (or load-based) decision process to pick the most idle replica to do the sub-query?

Right now we just select randomly. This is slated as a potential upgrade for 1.3, especially if it proves to be a problem for people. Thus far it hasn't been for us in profiling runs, but this is the type of problem that's more likely to show up in real-world workloads.

> 9. Internal detail. Do you use Bloom filters to optimize distributed joins?

We do not currently use bloom filters to optimize this.



