Top 23 Python Machinelearning Projects
-
Project mention: [D] What is the recommended approach to training NN on big data set? | reddit.com/r/MachineLearning | 2022-12-08
And in case scaling is really important to you, may I suggest you look into Horovod?
-
-
vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Project mention: preprocessing millions of records - how to speed up the processing | reddit.com/r/datascience | 2022-06-03
Try vaex. With its lazy evaluation and parallel calculations, you should be fine.
-
clearml
ClearML - Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management
Project mention: Is there any workflow orchestrator that is Hydra friendly? | reddit.com/r/mlops | 2022-06-16
-
igel
a delightful machine learning tool that allows you to train, test, and use models without writing code
-
-
Project mention: From “iron manual” to “Iron Man” – augmenting GPT with a fast editable memory | news.ycombinator.com | 2023-02-08
-
nannyml
Detecting silent model failure. NannyML estimates performance for regression and classification models using tabular data. It alerts you when and why it changed. It is the only open-source library capable of fully capturing the impact of data drift on performance.
Project mention: [HIRING][Full Time, Part Time, Temporary, Internship, Freelance] Data Science Intern (Remote) | reddit.com/r/jobbit | 2022-05-20
Description: NannyML, creators of an Open Source Python library, are looking for multiple Data Science interns to help across research, prototyping, and product. Github: https://github.com/NannyML/nannyml About Us: NannyML is an Open Source Python lib …
-
deepsparse
Inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application
Project mention: [D] How to get the fastest PyTorch inference and what is the "best" model serving framework? | reddit.com/r/MachineLearning | 2022-10-28
For 1), what is the easiest way to speed up inference (assume only PyTorch and primarily GPU but also some CPU)? I have been using ONNX and Torchscript but there is a bit of a learning curve and sometimes it can be tricky to get the model to actually work. Is there anything else worth trying? I am enthused by things like TorchDynamo (although I have not tested it extensively) due to its apparent ease of use. I also saw the post yesterday about Kernl using (OpenAI) Triton kernels to speed up transformer models which also looks interesting. Are things like SageMaker Neo or NeuralMagic worth trying? My only reservation with some of these is they still seem to be pretty model/architecture specific. I am a little reluctant to put much time into these unless I know others have had some success first.
-
Project mention: Any suggestions for client side or API content moderation tools for image uploads | reddit.com/r/reactnative | 2023-01-03
I did some tests myself, and the results look very accurate. The model I use has 93% accuracy and has been trained for days on over 60 GB of data.
-
-
fal
do more with dbt. fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies and build machine learning models.
Project mention: Dbt-fal: a dbt Python adapter with local code execution | news.ycombinator.com | 2023-01-12
We built a dbt adapter that helps you run local Python code with your dbt project with any other data warehouse. You can see it here: https://github.com/fal-ai/fal/tree/main/adapter
This new adapter helps you run your dbt Python models with isolated Python environments using our open source library: https://github.com/fal-ai/isolate
-
retentioneering-tools
Retentioneering: product analytics, data-driven customer journey map optimization, marketing analytics, web analytics, transaction analytics, graph visualization, and behavioral segmentation with customer segments in Python. Opensource analytics, predictive analytics over clickstream, sentiment analysis, AB tests, machine learning, and Monte Carlo Markov Chain simulations, extending Pandas, Networkx and sklearn.
Project mention: My Favorite Off-the-Shelf Data Science Repos, What Are Yours? | news.ycombinator.com | 2022-06-22
Here are my top off-the-shelf data science models for Marketing. I would be interested to hear which other marketing data science tools you use.
Product Recommendation on Your Website with Metarank (https://github.com/metarank/metarank)
Metarank is a tool that helps you easily build an advanced recommendation engine for your products or content on your website. To get started you only need historical performance data of your products (e.g. number of clicks) and additional metadata like product rating, genre, ingredients or price. In a YAML file, you define the features and the model parameters (e.g. number of iterations, modeling technique). The API service integrates with Apache Flink and can be easily integrated into Kubernetes clusters.
User Journey Analysis on your Website with Retentioneering (https://github.com/retentioneering/retentioneering-tools)
Retentioneering helps you understand the user journey on your website. Retentioneering is a Python library that allows you to easily connect your Google Analytics data (in BigQuery). You define the user ID, event type and timestamp. From this input, a comprehensive graph network is created with gains and losses, as you know them from a customer journey. In addition, customer segments are created that share a similar customer journey. This reduces the complexity of a purely descriptive view of the data.
Marketing Mix Modeling with Robyn (https://github.com/facebookexperimental/Robyn)
Fewer third-party cookies mean fewer attribution models. The answer to this is Marketing Mix Modeling. Marketing mix models are regression models that estimate the effect size of marketing channels and other independent variables. The advantage is that business context can be modeled much more realistically. For example, Google searches for your own brand can be integrated to determine the share of revenue attributable to brand strength. Likewise, offline advertising measures can be modeled with other metrics in this context (e.g. offline advertising with GRPs). Robyn takes into account adstock effects, ROAS calculation and multicollinearity in the marketing channels. In addition, with simple functionality, budgets can be optimized using the predictions, and results from marketing tests can be integrated into the model for calibration.
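To make the adstock effect mentioned above concrete, here is a minimal sketch of a geometric adstock transform, one common way to model advertising carryover. It is an illustration only and not Robyn's implementation; the function name `geometric_adstock` and the `decay` parameter are assumptions made for this example.

```python
def geometric_adstock(spend, decay):
    """Carry a fraction `decay` of the accumulated effect into the next period.

    Hypothetical helper for illustration; not part of Robyn.
    """
    if not 0.0 <= decay < 1.0:
        raise ValueError("decay must be in [0, 1)")
    carried = 0.0
    result = []
    for x in spend:
        # This period's effect = fresh spend + decayed effect from earlier periods.
        carried = x + decay * carried
        result.append(carried)
    return result


weekly_spend = [100.0, 0.0, 0.0, 50.0]
adstocked = geometric_adstock(weekly_spend, decay=0.5)
print(adstocked)  # [100.0, 50.0, 25.0, 62.5]
```

The transformed series, rather than raw spend, would then be used as a regressor, so that a campaign keeps contributing to revenue in the weeks after it ran.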
-
Project mention: Quickly develop risk control algorithms in business scenarios based on MetaSpore | reddit.com/r/learnmachinelearning | 2022-06-15
The evaluation problems related to financial loans are mainly based on tabular data, so the importance of feature engineering is self-evident. The common features in the dataset include ID, categorical, and continuous numeric types, which require common data handling such as EDA, missing value completion, outlier processing, normalization, feature binning, and importance assessment. For the process, refer to the tianchi_loan instructions in the GitHub codebase: https://github.com/meta-soul/MetaSpore/blob/main/demo/dataset
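Three of the handling steps listed above (missing value completion, normalization, and feature binning) can be sketched in plain Python. This is an illustrative sketch only, not code from the MetaSpore demo; the helper names and the sample `incomes` column are hypothetical, and median imputation, min-max scaling and equal-width bins are just one possible choice for each step.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the known values."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equally wide intervals (0-based)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


incomes = [30.0, None, 50.0, 70.0, 110.0]  # hypothetical continuous feature
filled = impute_median(incomes)
scaled = min_max_normalize(filled)
bins = equal_width_bins(filled, n_bins=4)
print(filled)  # [30.0, 60.0, 50.0, 70.0, 110.0]
print(scaled)  # [0.0, 0.375, 0.25, 0.5, 1.0]
print(bins)    # [0, 1, 1, 2, 3]
```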
-
-
-
covalent
Pythonic tool for running data-science/high performance/quantum-computing workflows in heterogeneous environments. (by AgnostiqHQ)
Project mention: Show HN: Covalent – distributed computing for ML, HPC and Quantum (open source) | news.ycombinator.com | 2022-11-09
-
CodeRL
This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (NeurIPS22).
Project mention: [D] Most important AI Paper´s this year so far in my opinion + Proto AGI speculation at the end | reddit.com/r/MachineLearning | 2022-08-14
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning. Paper: https://arxiv.org/pdf/2207.01780.pdf Github: https://github.com/salesforce/CodeRL
-
-
zoofs
zoofs is a Python library for performing feature selection using a variety of nature-inspired wrapper algorithms. The algorithms range from swarm-intelligence to physics-based to evolutionary. It's an easy-to-use, flexible and powerful tool to reduce your feature size.
And as you described, considering you will end up with a lot of features: https://github.com/jaswinder9051998/zoofs for feature selection. zoofs is a wrapper-based feature selection library, so you'll be able to select features purely on performance if you have a healthy test set or if you perform cross-validation.
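The idea of wrapper-based selection, choosing features purely by how a model scores with them, can be sketched with a simple greedy forward search. This is not zoofs's API and not one of its nature-inspired algorithms; `forward_select` and `toy_score` are hypothetical names, and `toy_score` stands in for a real cross-validated model score.

```python
def forward_select(features, score, max_features=None):
    """Greedily add the feature that most improves score(subset).

    `score` is any callable returning a number (higher is better),
    for example a cross-validated metric of a model trained on the subset.
    """
    selected = []
    best = score(selected)
    remaining = list(features)
    while remaining and (max_features is None or len(selected) < max_features):
        # Evaluate every candidate extension of the current subset.
        candidates = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidates)
        if top_score <= best:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Hypothetical stand-in for a cross-validated score: two features help,
# every other feature slightly hurts.
USEFUL = {"income": 0.30, "age": 0.10}


def toy_score(subset):
    return 0.5 + sum(USEFUL.get(f, -0.01) for f in subset)


chosen, final_score = forward_select(["age", "noise", "income"], toy_score)
print(chosen)  # ['income', 'age']
```

Population-based searches such as the ones zoofs describes explore many subsets at once instead of growing a single one, but they rely on the same kind of performance-driven objective.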
-
Project mention: [Project] I built a minimal stateless ML project template built on my current favourite stack | reddit.com/r/MachineLearning | 2023-02-02
It provides mature configuration support via [Hydra-Zen](https://github.com/mit-ll-responsible-ai/hydra-zen) and automates configuration generation via [decorators](https://github.com/BayesWatch/minimal-ml-template/blob/af387e59472ea67552b4bb8972b39fe95952dd8a/mlproject/decorators.py#L10) implemented in this repo.
-
-
Project mention: fastchess VS Synergy-Chess - a user suggested alternative | libhunt.com/r/fastchess | 2022-06-18
Python Machinelearning related posts
- From “iron manual” to “Iron Man” – augmenting GPT with a fast editable memory
- Dbt-fal: a dbt Python adapter with local code execution
- I want to learn more about AI and Machine Learning
- Neural Network vs AI for predicting trends based on market info
- Blockchain will be great for MMORPGs, but the first few attempts will fail
- Deep learning is a growing trend in healthcare artificial intelligence, but what are the use cases for the various types of deep learning?
- Making a Dialogue Summarizer
Index
What are some of the best open-source Machinelearning projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | horovod | 12,981 |
2 | ludwig | 8,744 |
3 | vaex | 7,732 |
4 | clearml | 4,064 |
5 | igel | 3,024 |
6 | tslearn | 2,375 |
7 | marqo | 2,215 |
8 | nannyml | 1,373 |
9 | deepsparse | 1,257 |
10 | nsfw_model | 1,089 |
11 | pytorch2keras | 831 |
12 | fal | 658 |
13 | retentioneering-tools | 596 |
14 | MetaSpore | 585 |
15 | LiuAlgoTrader | 476 |
16 | deep-significance | 276 |
17 | covalent | 274 |
18 | CodeRL | 254 |
19 | yolo-hand-detection | 210 |
20 | zoofs | 170 |
21 | hydra-zen | 155 |
22 | sagemaker-explaining-credit-decisions | 85 |
23 | fastchess | 73 |