Scio Alternatives
Similar projects and alternatives to Scio
- materialize
Real-time Data Integration and Transformation: use SQL to transform, deliver, and act on fast-changing data. (by MaterializeInc)
- dbt-fal
Discontinued. Do more with dbt. dbt-fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies, and build machine learning models.
- Reactive-kafka
Alpakka Kafka connector - Alpakka is a Reactive Enterprise Integration library for Java and Scala, based on Reactive Streams and Akka.
Scio discussion
Scio reviews and mentions
- Are there any openly available data engineering projects using Scala and Spark which follow industry conventions like proper folder/package structures and object oriented division of classes/concerns? Most examples I’ve seen have everything in one file without proper separation of concerns.
- For the DE's that choose Java over Python in new projects, why?
I doubt it is possible because I suspect that GIL would like a word. So I could spend nights trying to make it work in Python (and possibly, if not likely, fail). Or I could just use this ready made solution.
- what popular companies uses Scala?
Apache Beam API called Scio. They open sourced it https://spotify.github.io/scio/
- Scala or Python
Generally, Python is a lingua franca. I have never met a data engineer who doesn't know Python. Scala isn't used everywhere. Also, you should know that in Apache Beam (a data processing framework that's gaining popularity because it can handle both streaming and batch processing and runs on Spark), the language choices are Java, Python, Go, and Scala. So, even if you "only" know Java, you can get started with data engineering through Apache Beam.
- Wanting to move away from SQL
I agree 100%. I haven't used SQL that much in previous data engineering roles, and I refuse to consider jobs that mostly deal with SQL. One of my roles involved using a nice Scala API for Apache Beam called Scio, and it was great. Code was easy to write, maintain, and test. It also worked well with other services like Pub/Sub and Bigtable.
- ETL Pipelines with Airflow: The Good, the Bad and the Ugly
If you prefer Scala, then you can try Scio: https://github.com/spotify/scio.
- ELT, Data Pipeline
Stats
spotify/scio is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.
The primary programming language of Scio is Scala.