Top 23 Python Benchmark Projects
-
Project mention: How to produce data visualizations like this? | reddit.com/r/computervision | 2022-01-05
-
Do you absolutely need A3C? A2C has become more widely used (see, e.g., the comment in https://github.com/ikostrikov/pytorch-a3c, and the fact that both https://github.com/thu-ml/tianshou and https://github.com/facebookresearch/salina have A2C implementations, but no A3C at first glance).
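For readers unfamiliar with the A2C/A3C distinction: the heart of A2C is weighting policy-gradient updates by an advantage estimate computed from a synchronous rollout. A minimal NumPy sketch of that computation (illustrative only, not tianshou's implementation):

```python
import numpy as np

def a2c_targets(rewards, values, last_value, gamma=0.99):
    """Compute discounted returns and advantages for one rollout.

    rewards: per-step rewards collected by the actor
    values:  critic's value estimates for the visited states
    last_value: critic's estimate for the state after the rollout
    """
    returns = np.zeros(len(rewards))
    running = last_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    advantages = returns - np.asarray(values)
    return returns, advantages

# The policy gradient weights log-probabilities by the advantage,
# while the critic regresses its value estimates toward the returns.
rets, advs = a2c_targets(rewards=[1.0, 0.0, 1.0],
                         values=[0.5, 0.4, 0.6],
                         last_value=0.0)
print(rets, advs)
```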
-
Project mention: Fastest way to calculate distance (drift) between vectors - at scale (billions) | reddit.com/r/mlops | 2022-05-11
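Questions like this usually come down to a fully vectorized distance kernel, the kind of operation such benchmarks measure. A minimal NumPy sketch of batched Euclidean distances (illustrative only):

```python
import numpy as np

def pairwise_euclidean(a, b):
    """Distances between every row of a and every row of b, without Python loops.

    Uses the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y so the heavy
    lifting is a single matrix multiply.
    """
    a_sq = (a ** 2).sum(axis=1)[:, None]
    b_sq = (b ** 2).sum(axis=1)[None, :]
    d2 = a_sq + b_sq - 2.0 * a @ b.T
    return np.sqrt(np.maximum(d2, 0.0))  # clamp tiny negatives from rounding

queries = np.random.rand(1_000, 128).astype(np.float32)
corpus = np.random.rand(10_000, 128).astype(np.float32)
print(pairwise_euclidean(queries, corpus).shape)  # (1000, 10000)
```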
-
Project mention: Textbook or blogs for video understanding | reddit.com/r/computervision | 2022-01-25
No book or blog, but a great framework: https://github.com/open-mmlab/mmaction2
-
Project mention: [P] Object detection framework : Detectron2 VS MMDetection | reddit.com/r/MachineLearning | 2021-09-29
The [MMLab key point detection](https://github.com/open-mmlab/mmpose) is in a separate repo from detection.
-
scikit-image, the project that commissioned this task, uses Airspeed Velocity, or asv, for its benchmark tests.
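An asv suite is a Python module of classes whose time_* and peakmem_* methods get discovered and timed automatically. A small illustrative module follows; the file name and workload are made up and are not scikit-image's actual suite:

```python
# benchmarks/bench_filters.py -- a hypothetical asv benchmark module.
# asv discovers methods by prefix: time_* are timed, peakmem_* track memory.
import numpy as np

class GaussianSuite:
    # asv runs each benchmark once per parameter value
    params = [64, 256, 1024]
    param_names = ["side"]

    def setup(self, side):
        # setup() runs before timing and is excluded from the measurement
        self.image = np.random.rand(side, side)

    def time_mean_filter(self, side):
        # stand-in workload; a real scikit-image suite would call the library
        np.convolve(self.image.ravel(), np.ones(9) / 9, mode="same")

    def peakmem_copy(self, side):
        self.image.copy()
```

Running `asv run` executes the suite against a project commit, and `asv publish` builds the timing history.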
-
Project mention: [D] Python plot package commonly used in RL papers | reddit.com/r/MachineLearning | 2021-08-09
-
Project mention: Hello, I created an interpreted dynamic programming language in C#. I use a bytecode compiler and a VM for interpretation. Right now I'm trying to optimise it. Any help would be great! | reddit.com/r/ProgrammingLanguages | 2022-03-19
There are some standard benchmarks like fannkuch, deltablue, and so on (see a bunch for Python here) that you can port to your VM. They have adjustable parameters that you can raise or lower to increase or decrease how long they take.
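As a concrete example, here is a naive, illustrative port of fannkuch in plain Python (the reference implementations in suites like pyperformance are more heavily optimized); the single parameter n is the knob that scales the running time:

```python
from itertools import permutations

def fannkuch(n):
    """Return the maximum number of 'pancake flips' over all permutations of 1..n.

    Raising n multiplies the work by roughly a factor of n (there are n!
    permutations), which is how these benchmarks scale their running time.
    """
    max_flips = 0
    for perm in permutations(range(1, n + 1)):
        p = list(perm)
        flips = 0
        while p[0] != 1:
            k = p[0]
            p[:k] = reversed(p[:k])  # flip the first k elements
            flips += 1
        max_flips = max(max_flips, flips)
    return max_flips

print(fannkuch(7))  # known value: 16
```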
-
tape
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. (by songlab-cal)
Project mention: ProteinBERT: A universal deep-learning model of protein sequence and function | reddit.com/r/bioinformatics | 2021-05-30
We evaluated based on downstream tasks (multiple supervised benchmarks, including 4 from TAPE), not the LM performance.
-
Your list excludes most of the well-known open-source AutoML tools, such as auto-sklearn, AutoGluon, LightAutoML, MLJarSupervised, etc. These tools have been extensively benchmarked by the OpenML AutoML Benchmark (https://github.com/openml/automlbenchmark) and have published papers, so they are well known to the AutoML community.
Regarding H2O.ai: Frankly, you don't seem to understand H2O.ai's AutoML offerings.
I'm the creator of H2O AutoML, which is open source, and there's no "enterprise version" of H2O AutoML. The interface is simple -- all you need to specify is the training data and target. We have included DNNs in our set of models since the first release of the tool in 2017. Read more here: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html We also offer full explainability for our models: https://docs.h2o.ai/h2o/latest-stable/h2o-docs/explain.html
H2O.ai develops another AutoML tool called Driverless AI, which is proprietary. You might be conflating the two. Neither of these tools need to be used on the H2O AI Cloud. Both tools pre-date our cloud by many years and can be used on a user's own laptop/server very easily.
Your Features & Roadmap list in the README indicates that your tool does not yet offer DNNs, so either you should update your post here or update your README if it's incorrect: https://github.com/blobcity/autoai/blob/main/README.md#featu...
Lastly, I thought I would mention that there's already an AutoML tool called "AutoAI" by IBM. Generally, it's not a good idea to have name collisions in a small space like the AutoML community. https://www.ibm.com/support/producthub/icpdata/docs/content/...
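For context on the "training data and target" interface described in the comment above, here is a minimal sketch following H2O's documented AutoML Python API; the file path and column name are placeholders:

```python
import h2o
from h2o.automl import H2OAutoML

h2o.init()  # starts (or connects to) a local H2O cluster

# Placeholder dataset; "response" stands in for whatever target column you have.
train = h2o.import_file("train.csv")

# Specify a budget, then hand over the training frame and target column.
aml = H2OAutoML(max_models=20, max_runtime_secs=600, seed=1)
aml.train(y="response", training_frame=train)

print(aml.leaderboard.head())   # ranked models found during the run
preds = aml.leader.predict(train)  # best model is exposed as aml.leader
```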
-
Project mention: Domain adaptation text recognition/OCR dataset (MSDA) and benchmark: Multi-source domain adaptation dataset for text recognition | reddit.com/r/u_Mammoth_Grade_6875 | 2021-09-27
Code address
-
CARLA
CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms (by carla-recourse)
Project mention: [R] CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms | reddit.com/r/MachineLearning | 2021-09-29
Abstract: Counterfactual explanations provide means for prescriptive model explanations by suggesting actionable feature changes (e.g., increase income) that allow individuals to achieve favourable outcomes in the future (e.g., insurance approval). Choosing an appropriate method is a crucial aspect for meaningful counterfactual explanations. As documented in recent reviews, there exists a quickly growing literature with available methods. Yet, in the absence of widely available open-source implementations, the decision in favour of certain models is primarily based on what is readily available. Going forward, to guarantee meaningful comparisons across explanation methods, we present CARLA (Counterfactual And Recourse Library), a Python library for benchmarking counterfactual explanation methods across both different data sets and different machine learning models. In summary, our work provides the following contributions: (i) an extensive benchmark of 11 popular counterfactual explanation methods, (ii) a benchmarking framework for research on future counterfactual explanation methods, and (iii) a standardized set of integrated evaluation measures and data sets for transparent and extensive comparisons of these methods. We have open sourced CARLA and our experimental results on GitHub, making them available as competitive baselines. We welcome contributions from other research groups and practitioners.
-
Project mention: [D] Synthetic data generation techniques for data privacy | reddit.com/r/MachineLearning | 2022-02-15
I would suggest starting with "differentially private synthetic data generation". These methods utilize differential privacy, mostly protect against membership inference attacks, and are very popular in the ML/DL community. I would also suggest reading up on privacy-preserving ML methods in general and adversarial attacks against them (membership inference, inversion, reconstruction, property inference), but if you're keen on reading some code, check out sd-gym: https://github.com/sdv-dev/SDGym. The authors have collected implementations for a lot of PPSDG methods. Also I strongly suggest reading McMahan's 2016 paper: https://arxiv.org/abs/1607.00133.
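The paper linked at the end introduces DP-SGD, whose core step is per-example gradient clipping followed by Gaussian noise. A small NumPy sketch of that aggregation step (illustrative only, not SDGym's API):

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One differentially private gradient aggregation step in the style of
    Abadi et al. (2016): clip each example's gradient to L2 norm <= clip_norm,
    sum, add Gaussian noise scaled by noise_multiplier * clip_norm, then average.
    """
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g / max(1.0, norm / clip_norm))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)

grads = [np.random.randn(10) for _ in range(32)]  # stand-in per-example gradients
print(dp_sgd_step(grads).shape)
```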
-
Project mention: We created the most comprehensive benchmark datasets for federated learning to date! | reddit.com/r/MachineLearning | 2021-11-19
-
fastero
Python timeit CLI for the 21st century! Colored output, multi-line input with syntax highlighting and autocompletion, and much more!
Project mention: Fastero – Python timeit CLI for the 21st century | news.ycombinator.com | 2022-04-29
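Fastero is a CLI, but the measurement it presents is the same one the standard library's timeit module exposes. A plain-stdlib sketch of comparing two snippets, for reference:

```python
import timeit

# Compare two candidate snippets the way a timeit-style tool would:
# repeat each measurement several times and report the best run.
setup = "data = list(range(1000))"
candidates = {
    "sorted copy": "sorted(data)",
    "in-place sort": "data.sort()",
}
for name, stmt in candidates.items():
    runs = timeit.repeat(stmt, setup=setup, number=10_000, repeat=5)
    print(f"{name}: best of 5 = {min(runs) / 10_000 * 1e6:.2f} µs per call")
```
-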
python-benchmark-harness
A micro/macro benchmark framework for the Python programming language that helps with optimizing your software.
Project mention: Sprucing up my read me for my Python project | reddit.com/r/learnpython | 2022-02-01
-
ModelNet40-C
Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296
Project mention: [R] NEW Robustness Benchmark for 3D Point Cloud Recognition, ModelNet40-C. "Benchmarking Robustness of 3D Point Cloud Recognition Against Common Corruptions" | reddit.com/r/MachineLearning | 2022-02-08
Code for https://arxiv.org/abs/2201.12296 found: https://github.com/jiachens/ModelNet40-C
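One common corruption style in point cloud robustness benchmarks is Gaussian jitter. A minimal NumPy sketch of that corruption (illustrative only, not the repository's implementation):

```python
import numpy as np

def jitter_point_cloud(points, sigma=0.01, clip=0.05, rng=None):
    """Add small, clipped Gaussian noise to every point in an (N, 3) cloud."""
    rng = rng or np.random.default_rng(0)
    noise = np.clip(rng.normal(0.0, sigma, size=points.shape), -clip, clip)
    return points + noise

cloud = np.random.rand(1024, 3).astype(np.float32)  # stand-in point cloud
corrupted = jitter_point_cloud(cloud, sigma=0.02)
```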
-
We spent a lot of time optimizing every benchmark. Here's what the creator of SQLAlchemy said about our code: https://github.com/edgedb/imdbench/pull/46 (and he then proposed improvements). We'll surely take a look at prefetch_related.
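For readers wondering what prefetch_related changes: it collapses the classic N+1 query pattern into two queries. A hypothetical Django sketch (schematic, not imdbench's actual models):

```python
# Hypothetical Django models for an IMDB-style schema (not imdbench's code).
from django.db import models

class Person(models.Model):
    name = models.CharField(max_length=200)

class Movie(models.Model):
    title = models.CharField(max_length=200)
    cast = models.ManyToManyField(Person, related_name="movies")

# Naive access: one query for the movies, plus one query per movie for its cast.
for movie in Movie.objects.all():
    names = [p.name for p in movie.cast.all()]

# prefetch_related: two queries total; Django stitches the results together in Python.
for movie in Movie.objects.prefetch_related("cast"):
    names = [p.name for p in movie.cast.all()]
```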
-
compiler-benchmark
Benchmarks compilation speeds of different combinations of languages and compilers.
One synthetic benchmark I saw recently: https://github.com/nordlow/compiler-benchmark
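compiler-benchmark generates and compiles synthetic source files; the measurement itself boils down to timing a compiler invocation. A generic stdlib sketch of that step (command and file names are placeholders):

```python
import subprocess
import time

def time_compile(cmd, repeats=5):
    """Run a compiler command several times and report the best wall-clock run.

    cmd is a full command line such as ["gcc", "-O2", "hello.c", "-o", "hello"];
    the file names here are placeholders.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        best = min(best, time.perf_counter() - start)
    return best

print(f"{time_compile(['gcc', '-O2', 'hello.c', '-o', 'hello']):.3f} s")
```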
-
ORBIT-Dataset
The ORBIT dataset is a collection of videos of objects in clean and cluttered scenes, recorded on mobile phones by people who are blind or have low vision. The dataset is presented with a teachable object recognition benchmark task that aims to drive few-shot learning on challenging real-world data.
-
Python Benchmark related posts
- Fastest way to calculate distance (drift) between vectors - at scale (billions)
- How slow are ORMs, really?
- IMDBench – Benchmarking ORMs with realistic queries
- Benchmarking TypeScript ORMs: Prisma vs Sequelize vs TypeORM vs EdgeDB
- Benchmarking Python and JavaScript ORMs: Django, SQLAlchemy, Prisma, TypeORM, Sequelize, EdgeDB
Index
What are some of the best open-source Benchmark projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | fashion-mnist | 10,016 |
2 | tianshou | 4,618 |
3 | ann-benchmarks | 2,888 |
4 | mmaction2 | 2,013 |
5 | mmpose | 1,934 |
6 | asv | 682 |
7 | smac | 668 |
8 | py-frameworks-bench | 639 |
9 | pyperformance | 540 |
10 | tape | 435 |
11 | automlbenchmark | 282 |
12 | Meta-SelfLearning | 173 |
13 | CARLA | 166 |
14 | SDGym | 160 |
15 | FedScale | 149 |
16 | fastero | 147 |
17 | python-benchmark-harness | 134 |
18 | freqbench | 113 |
19 | ModelNet40-C | 107 |
20 | imdbench | 94 |
21 | compiler-benchmark | 89 |
22 | ORBIT-Dataset | 58 |
23 | LFattNet | 37 |