Our great sponsors
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
I agree. I used Disco MR to do amazing things. Trivial to use, like anyone could be productive in under an hour.
Erasure codes are awesome, but so is just having 3 copies. When you have skin in the game, simplicity is the most important driver of good outcomes. Look at the dimensions that Netezza optimized, they saw a technological window and they took it. Right now we have workstations that can push 100GB/s from from flash. We are talking about being able to sort 1TB of data in 20 seconds (from flash) the same machine could do it from ram in 10.
https://github.com/discoproject/disco
I need to give Ray and Dask a try.
I don't know where to put this comment so I'll put it here. DeWitt and Stonebraker are right, but also wrong. Everyone is talking past each other there. Both are geniuses, this essay wasn't super strong.
If I was their editor, I would say, reframe it as MapReduce is an implementation detail, we also need these other things for this to be usable by the masses. Their point about indexes proves my point about talking past each other. If you are scanning the data basically once, building an index is a waste.