crumsort
SHOGUN
crumsort | SHOGUN | |
---|---|---|
7 | 1 | |
314 | 3,006 | |
- | 0.2% | |
3.6 | 4.8 | |
2 months ago | 5 months ago | |
C | C++ | |
The Unlicense | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
crumsort
-
Blitsort: An ultra-fast in-place stable hybrid merge/quick sort
Blitsort is a hybrid quicksort, see title.
It is slower than it's unstable brother, aptly named crumsort. https://github.com/scandum/crumsort
- Crumsort: Introduction to a new unstable sorting algorithm faster than pdqsort
- 380 points in 6 hours
- Crumsort: Introduction to a new sorting algorithm faster than pdqsort
-
Go will use pdqsort in the next release
https://github.com/scandum/crumsort claims better performance than pdqsort
-
Changing std:sort at Google’s Scale and Beyond
Any chance you could comment on fluxsort[0], another fast quicksort? It's stable and uses a buffer about the size of the original array, which sounds like it puts it in a similar category as glidesort. Benchmarks against pdqsort at the end of that README; I can verify that it's faster on random data by 30% or so, and the stable partitioning should mean it's at least as adaptive (but the current implementation uses an initial analysis pass followed by adaptive mergesort rather than optimistic insertion sort to deal with nearly-sorted data, which IMO is fragile). There's an in-place effort called crumsort along similar lines, but it's not stable.
I've been doing a lot of work on sorting[2], in particular working to hybridize various approaches better. Very much looking forward to seeing how glidesort works.
[0] https://github.com/scandum/fluxsort
[1] https://github.com/scandum/crumsort
[2] https://mlochbaum.github.io/BQN/implementation/primitive/sor...
SHOGUN
-
Changing std:sort at Google’s Scale and Beyond
The function is trying to get the median, which is not defined for an empty set. With this particular implementation, there is an assert for that:
https://github.com/shogun-toolbox/shogun/blob/9b8d85/src/sho...
Unrelatedly, but from the same section:
> Fixes are trivial, access the nth element only after the call being made. Be careful.
Wouldn't the proper fix to do the nth_element for the larget element first (for those cases that don't do that already) and then adjust the end to be the begin + larger_n for the second nth_element call? Otherwise the second call will check [begin + larger_n, end) again for no reason at all.
What are some alternatives?
fluxsort - A fast branchless stable quicksort / mergesort hybrid that is highly adaptive.
mlpack - mlpack: a fast, header-only C++ machine learning library
awesome-algorithms - A curated list of awesome places to learn and/or practice algorithms.
Caffe - Caffe: a fast open framework for deep learning.
awesome-theoretical-computer
mxnet - Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
awesome-theoretical-computer-science - The interdicplinary of Mathematics and Computer Science, Distinguisehed by its emphasis on mathemtical technique and rigour.
Dlib - A toolkit for making real world machine learning and data analysis applications in C++
combsort.h - optimized combsort macro
vowpal_wabbit - Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
go - The Go programming language
xgboost - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow