Python Analytics

Open-source Python projects categorized as Analytics

Top 23 Python Analytics Projects

  • Redash

    Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

    Project mention: Recommend Django Great Projects | news.ycombinator.com | 2022-12-03

  • dbt-core

    dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

    Project mention: Unit testing with dbt | reddit.com/r/dataengineering | 2023-01-11

    Hey OP! There are packages like dbt-datamocktool or dbt-unit-testing that you can check out. You might want to check out this thread as well.

  • dagster

    An orchestration platform for the development, production, and observation of data assets.

    Project mention: dbt Cloud Alternatives? | reddit.com/r/dataengineering | 2023-01-23

    Dagster? https://dagster.io

  • Tautulli

    A Python based monitoring and tracking tool for Plex Media Server.

    Project mention: Is there a way to download my chosen artwork for my movies/TV shows from Plex? | reddit.com/r/PleX | 2023-01-31

    Tautulli's export feature allows you to export posters/cover art, including only exporting items that you've manually uploaded/selected.

  • Shynet

    Modern, privacy-friendly, and detailed web analytics that works without cookies or JS.

    Project mention: Why you should remove Google Analytics from your website | reddit.com/r/degoogle | 2022-07-07

    There's also Shynet.

  • rotki

    A portfolio tracking, analytics, accounting and tax reporting application that protects your privacy

    Project mention: Crypto: a service for calculating the taxes owed? | reddit.com/r/ItaliaPersonalFinance | 2023-01-25

  • WALKOFF

    A flexible, easy to use, automation framework allowing users to integrate their capabilities and devices to cut through the repetitive, tedious tasks slowing them down. #nsacyber

  • scikit-learn-intelex

    Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application

    Project mention: Machine Learning with PyTorch and Scikit-Learn – The *New* Python ML Book | news.ycombinator.com | 2022-02-25

  • metricflow

    MetricFlow allows you to define, build, and maintain metrics in code.

    Project mention: Show HN: MetricFlow – open-source metric framework | news.ycombinator.com | 2022-04-06

    Three things:

    First, MetricFlow does not currently support MySQL. We launched with support for BigQuery, Redshift, and Snowflake. I have opened an issue to add support for MySQL (and similar issues for other SQL engines are coming): https://github.com/transform-data/metricflow/issues/27

    Second, what we call a data source is more like a table in a database than the underlying database service itself. MetricFlow itself is useful when you're using a single SQL engine - indeed, that's all we support today - but it is most useful when you're in a world where joins are a thing. That said, if you have one big data table you might still find it useful to have declarative metric definitions defined in MetricFlow. Suppose, for example, you had a big NoSQL-style table filled with JSON objects. You might define a few data sources that normalize those JSON objects into top-level elements (identifiers, dimensions, aggregated measures) using the sql_query data source config attribute, and that would then allow you to support structured queries on the data consumption end while pushing unstructured blobs from your application layer. This will be slow at query time, and only as reliable as the level of discipline exerted in your application development workflow, but it's possible.

    Third, if we did support MySQL you'd basically connect to it via standard connection parameters - we have a config file where you can store the required information and then we'll manage the connections for you. However, I'm not familiar with uxwizz, and a quick perusal of their documentation did not turn up how one goes about connecting to the underlying DB. It's likely I just missed this, but at any rate I don't know how it is done. If they don't support standard MySQL client connections you'd need to write an adapter of some kind against whatever DB connection APIs they provide, in which case you'd likely need to roll a custom implementation of MetricFlow's SqlClient interface and initialize the MetricFlowEngine with that.
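
The normalization idea in the second point of the comment above can be illustrated with a small sketch. This is plain Python rather than MetricFlow configuration, and it does not use any MetricFlow API: the event shape, the field names (`user_id`, `country`, `order_total`) and the helper functions are all hypothetical, chosen only to show JSON blobs being flattened into an identifier, a dimension, and a measure before aggregation.

```python
import json
from collections import defaultdict

# Hypothetical raw rows: unstructured JSON blobs as an application might push them.
RAW_EVENTS = [
    '{"user": {"id": "u1", "country": "DE"}, "order": {"total": 20.0}}',
    '{"user": {"id": "u2", "country": "DE"}, "order": {"total": 5.5}}',
    '{"user": {"id": "u3", "country": "US"}, "order": {"total": 12.0}}',
    '{"user": {"id": "u4"}, "order": {}}',  # sloppy blob with missing fields
]


def normalize(blob):
    """Flatten one JSON blob into top-level elements.

    user_id     -> identifier
    country     -> dimension (falls back to "unknown" when absent)
    order_total -> measure (falls back to 0.0 when absent)
    """
    doc = json.loads(blob)
    user = doc.get("user", {})
    order = doc.get("order", {})
    return {
        "user_id": user.get("id"),
        "country": user.get("country", "unknown"),
        "order_total": float(order.get("total", 0.0)),
    }


def revenue_by_country(blobs):
    """Aggregate the measure (sum of order_total) grouped by the dimension."""
    totals = defaultdict(float)
    for blob in blobs:
        row = normalize(blob)
        totals[row["country"]] += row["order_total"]
    return dict(totals)


if __name__ == "__main__":
    print(revenue_by_country(RAW_EVENTS))
```

The fourth event shows the reliability caveat from the comment: when the application layer omits fields, the normalized rows silently fall back to defaults, so the resulting metric is only as trustworthy as the blobs that feed it.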

  • flask-profiler

    A Flask profiler that watches endpoint calls and tries to analyze them.

  • fal

    Do more with dbt. fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies and build machine learning models.

    Project mention: Dbt-fal: a dbt Python adapter with local code execution | news.ycombinator.com | 2023-01-12

    We built a dbt adapter that helps you run local Python code in your dbt project alongside any other data warehouse. You can see it here: https://github.com/fal-ai/fal/tree/main/adapter

    This new adapter helps you run your dbt Python models with isolated Python environments using our open source library: https://github.com/fal-ai/isolate

  • kube-opex-analytics

    🎨 Kubernetes Usage Analytics and Accounting for Cost Allocation and Capacity Planning - Hourly Trends, Daily and Monthly Accounting - Prometheus Exporter - Built-in & Grafana Dashboards.

    Project mention: GitHub - rchakode/kube-opex-analytics: 🎨 Kubernetes Usage Analytics and Accounting for Cost Allocation and Capacity Planning - Hourly Trends, Daily and Monthly Accounting - Prometheus Exporter - Built-in & Grafana Dashboards. | reddit.com/r/devopsish | 2022-03-27

  • objectiv-analytics

    Open-source product analytics infrastructure for data teams that want full control. Built for high quality data collection and ready to use for advanced analytics & ML.

    Project mention: Get tools to test, validate and debug your tracking instrumentation → Set up error-free user behavior tracking → No more missing/faulty data downstream. | reddit.com/r/u_objectiv_io | 2022-09-19

  • riptable

    64-bit multithreaded Python data analytics tools for NumPy arrays and datasets

    Project mention: Data-Oriented Programming in Python | news.ycombinator.com | 2022-11-27

    I'd like to plug riptable (https://github.com/rtosholdings/riptable), which is (more-or-less) a performance upgrade to pandas.

  • dbt-metabase

    Model synchronization from dbt to Metabase

    Project mention: A modern data stack for startups | dev.to | 2022-04-21

    So how do we get this into Metabase? There's a tool called dbt-metabase that can infer Metabase semantic type information from the dbt schema and push it into Metabase. We run this whenever we complete a dbt build, helping sync Metabase with whatever new fields we may have added.

  • versatile-data-kit

    Build, run and manage your data pipelines with Python or SQL on any cloud

    Project mention: What Orchestration Tool do you use for batch ETL/ELT? | reddit.com/r/dataengineering | 2023-01-31

    We use Versatile Data Kit for batch data job orchestration (https://github.com/vmware/versatile-data-kit)

  • WebHashcat

    Hashcat web interface

  • Data Flow Facilitator for Machine Learning (dffml)

    The easiest way to use Machine Learning. Mix and match underlying ML libraries and data set sources. Generate new datasets or modify existing ones with ease.

  • spectacles

    A continuous integration tool for Looker and LookML.

    Project mention: Track fields in view that are no longer present in the source table in database. | reddit.com/r/Looker | 2022-06-14

  • reddit-detective

    Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more

    Project mention: GitHub - umitkaanusta/reddit-detective: Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more | reddit.com/r/YoutubeFactory | 2022-05-27

  • openskill.py

    Multiplayer rating system. Better than Elo.

    Project mention: Show HN: Predict team ranks in sports and video games with openskill.py | news.ycombinator.com | 2022-12-11

  • dbt-data-reliability

    Data anomalies monitoring as dbt tests and dbt artifacts uploader.

    Project mention: How to store dbt run and test results in tables + code example | reddit.com/r/dataengineering | 2022-08-10

    The entire implementation is available in our open source dbt package.

  • nba-sql

    🏀 An application to build an NBA database backed by MySQL or Postgres.

    Project mention: Shitpost(?) From Nov to Dec, Braun was trusted with 50% more minutes per game, (when he checked in at all), and it resulted in a 50% bump in fg pct. | reddit.com/r/denvernuggets | 2023-01-31

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-01-31.

Index

What are some of the best open-source Analytics projects in Python? This list will help you:

 #  Project                                               Stars
 1  Redash                                               22,501
 2  dbt-core                                              6,397
 3  dagster                                               6,364
 4  Tautulli                                              4,709
 5  Shynet                                                2,129
 6  rotki                                                 2,023
 7  WALKOFF                                               1,130
 8  scikit-learn-intelex                                    875
 9  metricflow                                              734
10  flask-profiler                                          719
11  fal                                                     657
12  kube-opex-analytics                                     421
13  objectiv-analytics                                      417
14  riptable                                                332
15  dbt-metabase                                            273
16  versatile-data-kit                                      245
17  WebHashcat                                              207
18  Data Flow Facilitator for Machine Learning (dffml)      201
19  spectacles                                              184
20  reddit-detective                                        183
21  openskill.py                                            167
22  dbt-data-reliability                                    143
23  nba-sql                                                 130