The preferred way to write Spark programs is the DataFrame API, which is untyped and essentially the same in Scala, C# and Python. It's a DSL that describes the AST of the computation, so the end result is the same regardless of language. There's a library called Frameless (https://github.com/typelevel/frameless) that implements a typed DataFrame API, but it is not in wide use, it looked dead for quite some time (though development now seems to continue), and it didn't play nice with IntelliJ IDEA last time I checked. Performance-wise there's no difference most of the time (since all the program does is build an AST), except when using UDFs: Python UDFs are significantly slower, and you can't write "proper" UDFs in Python, meaning ones that generate Java code.
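To make the "all the program does is create an AST" point concrete, here's a toy sketch in plain Python. It is not Spark's actual implementation; `Col`, `Lit`, `BinaryOp`, `to_expr` and `render` are hypothetical names made up for illustration. The operators compute nothing, they only return tree nodes, which is roughly why the host language stops mattering once the tree is handed to the engine.

```python
from dataclasses import dataclass
from typing import Any


class Expr:
    """Base class: operators return new tree nodes instead of computing values."""

    def __gt__(self, other: Any) -> "BinaryOp":
        return BinaryOp(">", self, to_expr(other))

    def __mul__(self, other: Any) -> "BinaryOp":
        return BinaryOp("*", self, to_expr(other))


@dataclass(frozen=True)
class Col(Expr):
    """Reference to a column by name (hypothetical, not Spark's Column)."""
    name: str


@dataclass(frozen=True)
class Lit(Expr):
    """A literal constant embedded in the tree."""
    value: Any


@dataclass(frozen=True)
class BinaryOp(Expr):
    """An operator applied to two sub-expressions."""
    op: str
    left: Expr
    right: Expr


def to_expr(value: Any) -> Expr:
    """Wrap plain Python values as literals; leave tree nodes untouched."""
    return value if isinstance(value, Expr) else Lit(value)


def render(expr: Expr) -> str:
    """Stand-in for the engine: walk the tree and print it as text."""
    if isinstance(expr, Col):
        return expr.name
    if isinstance(expr, Lit):
        return repr(expr.value)
    if isinstance(expr, BinaryOp):
        return f"({render(expr.left)} {expr.op} {render(expr.right)})"
    raise TypeError(f"unknown node: {expr!r}")


# No data is touched here; this line only builds a tree.
predicate = Col("price") * 2 > 100
print(render(predicate))  # prints ((price * 2) > 100)
```

A UDF is the case this sketch can't cover: an arbitrary Python function is opaque to the tree, so the engine has to call back into Python for each batch of rows instead of compiling the expression itself.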