Spark Alternatives
Similar projects and alternatives to spark
-
ParquetSharp.DataFrame
ParquetSharp.DataFrame is a .NET library for reading Apache Parquet files into .NET DataFrames and writing DataFrames back to Parquet, using ParquetSharp.
-
dotnet-webassembly
Create, read, modify, write and execute WebAssembly (WASM) files from .NET-based applications.
-
ML.NET
ML.NET is an open source and cross-platform machine learning framework for .NET.
-
azure-event-hubs
☁️ Cloud-scale telemetry ingestion from any stream of data with Azure Event Hubs
-
Mobius: C# API for Spark
C# and F# language binding and extensions to Apache Spark (by microsoft)
-
jira-issue-analysis
Jira REST client, with emphasis on 'Time in Status' analysis and reporting
-
AnyDiff
A CSharp (C#) diff library that allows you to diff two objects and get a list of the differences back.
-
Akka.net
Canonical actor model implementation for .NET with local + distributed actors in C# and F#.
-
TorchSharp
A .NET library that provides access to the library that powers PyTorch.
-
onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
spark reviews and mentions
-
Does anyone actually use ML.NET?
Re: DataFrames, that's good to know. There is the DataFrame API that is part of the Microsoft.Data.Analysis NuGet package; that's the API the issue is tracking and the one shown in the sample notebook I shared. That API has no dependencies on other systems. The DataFrame you're referring to is part of the .NET for Apache Spark library, which depends on Apache Spark and requires some initial setup.
-
What does the .NET ecosystem offer in terms of distributed data processing frameworks?
The data engineering ecosystem is new to me, but my first impression is that everything caters to the JVM. The only somewhat promising option I've found for building a data pipeline in .NET is github.com/dotnet/spark.
Stats
dotnet/spark is an open source project licensed under the MIT License, which is an OSI-approved license.
The primary programming language of spark is C#.