Python Databricks

Open-source Python projects categorized as Databricks

Top 13 Python Databricks Projects

  1. Redash

    Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

  2. Stream

    Stream - Scalable APIs for Chat, Feeds, Moderation, & Video. Stream helps developers build engaging apps that scale to millions with performant and flexible Chat, Feeds, Moderation, and Video APIs and SDKs powered by a global edge network and enterprise-grade infrastructure.

  3. dolly

    Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

  4. sqlglot

    Python SQL Parser and Transpiler

    Project mention: Show HN: SQL-tString a t-string SQL builder in Python | news.ycombinator.com | 2025-05-16

    https://github.com/tobymao/sqlglot :

    > SQLGlot is a no-dependency SQL parser, transpiler, optimizer, and engine [written in Python]. It can be used to format SQL or translate between 24 different dialects like DuckDB, Presto / Trino, Spark / Databricks, Snowflake, and BigQuery. It aims to read a wide variety of SQL inputs and output syntactically and semantically correct SQL in the targeted dialects.

  5. dbrx

    Code examples and resources for DBRX, a large language model developed by Databricks

  6. databricks-sdk-py

    Databricks SDK for Python (Beta)

  7. dbx

    🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.

  8. nutter

    Testing framework for Databricks notebooks

  9. InfluxDB

    InfluxDB – Built for High-Performance Time Series Workloads. InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now.

  10. dbt-databricks

    A dbt adapter for Databricks.

    Project mention: Platform Engineering Abstraction: How to Scale IaC for Enterprise | dev.to | 2024-10-02

    Vendors like Confluent, Snowflake, Databricks, and dbt are improving the developer experience with more automation and integrations, but they often operate independently. This fragmentation makes standardizing multi-directional integrations across identity and access management, data governance, security, and cost control even more challenging. Developing a standardized, secure, and scalable solution for multi-platform environments is now a fast evolving area for platform engineering teams.

  11. dlt-meta

    Metadata driven Databricks Delta Live Tables framework for bronze/silver pipelines

  12. databricks-nutter-repos-demo

    Demo of using the Nutter for testing of Databricks notebooks in the CI/CD pipeline

  13. pytester

    Python Testing for Databricks

    Project mention: PyTest Fixtures for Databricks Workspaces | news.ycombinator.com | 2024-09-23
  14. pyjaws

    PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows

  15. fastdbfs

    fastdbfs - An interactive command line client for Databricks DBFS.
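As an illustration of the dialect translation described in the sqlglot entry above, here is a minimal sketch. It assumes sqlglot is installed (for example via `pip install sqlglot`); the helper name `translate` and the sample query are hypothetical and only serve the example. The import is guarded so the snippet still loads where the package is missing.

```python
# Assumption: sqlglot is installed. The import is guarded so this sketch
# still loads (and fails with a clear message) when it is not.
try:
    import sqlglot
except ImportError:  # sqlglot is not available in this environment
    sqlglot = None


def translate(sql: str, source: str = "duckdb", target: str = "databricks") -> list[str]:
    """Translate SQL between dialects; returns one string per input statement.

    `translate` is a hypothetical wrapper around sqlglot.transpile.
    """
    if sqlglot is None:
        raise RuntimeError("sqlglot is not installed")
    return sqlglot.transpile(sql, read=source, write=target)


if __name__ == "__main__":
    if sqlglot is not None:
        # Sample query chosen for illustration only.
        for statement in translate("SELECT id, name FROM users WHERE id > 10"):
            print(statement)
```

The `read` and `write` arguments name the source and target dialects. Because `sqlglot.transpile` returns a list, a script containing several statements comes back as several strings.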

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Databricks projects in Python? This list will help you:

 #  Project                        Stars
 1  Redash                        27,541
 2  dolly                         10,810
 3  sqlglot                        7,981
 4  dbrx                           2,553
 5  databricks-sdk-py                458
 6  dbx                              449
 7  nutter                           306
 8  dbt-databricks                   283
 9  dlt-meta                         202
10  databricks-nutter-repos-demo     153
11  pytester                          88
12  pyjaws                            43
13  fastdbfs                           4
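
The ordering rule stated in the note (projects sorted by GitHub star count) can be reproduced with a short sketch. The star counts are copied from the index above and are a snapshot, not live data; the names `STARS` and `rank_by_stars` are hypothetical.

```python
# Star counts copied from the index table; a snapshot, not live data.
STARS = {
    "Redash": 27541,
    "dolly": 10810,
    "sqlglot": 7981,
    "dbrx": 2553,
    "databricks-sdk-py": 458,
    "dbx": 449,
    "nutter": 306,
    "dbt-databricks": 283,
    "dlt-meta": 202,
    "databricks-nutter-repos-demo": 153,
    "pytester": 88,
    "pyjaws": 43,
    "fastdbfs": 4,
}


def rank_by_stars(stars: dict[str, int]) -> list[tuple[str, int]]:
    """Return (project, stars) pairs sorted from most to fewest stars."""
    return sorted(stars.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for position, (name, count) in enumerate(rank_by_stars(STARS), start=1):
        print(f"{position:>2}  {name:<30}{count:>7,}")
```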
