spark
ML.NET
| | spark | ML.NET |
|---|---|---|
| Mentions | 3 | 17 |
| Stars | 1,997 | 8,838 |
| Growth | 0.6% | 0.8% |
| Activity | 0.0 | 8.9 |
| Latest commit | 12 days ago | 4 days ago |
| Language | C# | C# |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
spark
- .NET for Apache Spark appears to be abandoned
- Does anyone actually use ML.NET?
Re: DataFrames, that's good to know. There is the DataFrame API that is part of the Microsoft.Data.Analysis NuGet package; that's the API the issue is tracking and the one shown in the sample notebook I shared. That API has no dependencies on other systems. The DataFrame you're referring to is part of the .NET for Apache Spark library, which depends on Apache Spark and requires some initial setup.
- What does the .NET ecosystem offer in terms of distributed data processing frameworks?
The data engineering ecosystem is new to me, but my first impression is that everything caters to the JVM. The only somewhat promising option I've found for building a data pipeline in .NET is github.com/dotnet/spark.
ML.NET
- ML.net image classification, poor GPU accuracy
You can direct your question to https://github.com/dotnet/machinelearning/issues. Perhaps it is already documented.
- Building a File Analysis Dataset with Python
Here I'm analyzing all projects in the src and test directories of the ML.NET repository. I chose to include these as separate paths because they represent two different groupings of projects in this repository.
- Extracting git repository data with PyDriller
Important Note: looping over repository commits takes a long time for large repositories. It took 52 minutes to analyze the ML.NET repository this code example refers to, which had 2,681 commits at the time of analysis on February 25th, 2023.
- Can we please be allowed to do machine learning object detection model training locally?
- ML.NET: can Microsoft's machine learning be trusted?
We checked the ML.NET 1.7.1 version. The source code of this project's version is available on GitHub.
- Stable Diffusion converted to ONNX (Demo usage, optimized to CPU)
- Why is there a lack of cool repos?
machine learning? https://github.com/dotnet/machinelearning
- what is the future of ML.NET?
You can follow some of our plans by taking a look at our roadmap which we'll be updating shortly to more accurately reflect the areas we're investing in.
- Does anyone actually use ML.NET?
Re: ONNX, if you run into similar issues in the future, feel free to reach out in our GitHub repo or the ONNX Runtime repo and we'd be happy to help!
- Requesting Senior Project Ideas
Good clarification. I think using something like ML.NET could be cool, but I also have some experience with Blazor that might be fun to use. Generally, performance monitoring or optimizing systems seems interesting to me, and I'm really open to other ideas as well. Let me know if any of that helps narrow my question down!
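
The "Building a File Analysis Dataset with Python" snippet above describes analyzing everything under the `src` and `test` directories as two separate groupings. The linked post's actual code is not shown here, so the following is only a minimal standard-library sketch of that idea. The function name `iter_source_files`, the record fields, and the choice to treat the first folder under each group as the "project" are all assumptions made for illustration.

```python
from __future__ import annotations

from pathlib import Path
from typing import Iterator


def iter_source_files(
    repo_root: Path, groups: tuple[str, ...] = ("src", "test")
) -> Iterator[dict]:
    """Yield one record per file found under each top-level group directory.

    Hypothetical helper: the field names below are illustrative, not taken
    from the original post.
    """
    for group in groups:
        group_dir = repo_root / group
        if not group_dir.is_dir():
            continue  # this checkout has no such directory; skip it
        for path in sorted(group_dir.rglob("*")):
            if not path.is_file():
                continue
            relative = path.relative_to(group_dir)
            yield {
                "group": group,
                # Assumption: the first folder under the group names the project.
                "project": relative.parts[0] if len(relative.parts) > 1 else "",
                "path": path.relative_to(repo_root).as_posix(),
                "extension": path.suffix.lower(),
                "bytes": path.stat().st_size,
            }
```

Calling `list(iter_source_files(Path("path/to/checkout")))` returns plain dictionaries that can be handed to a tabular library for further analysis; keeping `group` as its own field preserves the `src` versus `test` split the post mentions.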
What are some alternatives?
ParquetSharp.DataFrame - ParquetSharp.DataFrame is a .NET library for reading and writing Apache Parquet files into/from .NET DataFrames, using ParquetSharp
TensorFlow.NET - .NET Standard bindings for Google's TensorFlow for developing, training and deploying Machine Learning models in C# and F#.
dotnet-webassembly - Create, read, modify, write and execute WebAssembly (WASM) files from .NET-based applications.
Accord.NET
azure-event-hubs - ☁️ Cloud-scale telemetry ingestion from any stream of data with Azure Event Hubs
FaceRecognitionDotNet - The world's simplest facial recognition api for .NET on Windows, MacOS and Linux
AnyDiff - A CSharp (C#) diff library that allows you to diff two objects and get a list of the differences back.
OpenCvSharp - OpenCV wrapper for .NET
Mobius: C# API for Spark - C# and F# language binding and extensions to Apache Spark
Catalyst - 🚀 Catalyst is a C# Natural Language Processing library built for speed. Inspired by spaCy's design, it brings pre-trained models, out-of-the-box support for training word and document embeddings, and flexible entity recognition models.
jira-issue-analysis - Jira REST client, with emphasis on 'Time in Status' analysis and reporting
Deedle - Easy to use .NET library for data and time series manipulation and for scientific programming