Mito – Excel-like interface for Pandas dataframes in Jupyter notebook

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • datasette

    An open source multi-tool for exploring and publishing data

  • Looks like a Datasette[0] clone that runs on top of Jupyter, which in turn runs on top of Python (IPython). I would like to see how long it takes to open a massive dataset in Mito versus Datasette :P

    [0]: https://datasette.io/

  • mito

    The mitosheet package, trymito.io, and other public Mito code.

  • Mito is open source, but using Pro features does actually require a Pro or enterprise license. You can check out this callout in the license [1], as well as the restrictions on Mito Pro features here [2]. We're in the process of fixing up the upgrade to Pro process a bit... as you can tell... :)

    You can of course fork Mito and turn off telemetry as long as you open source your changes! Go for it - happy to hop on a call and help you get set up with the codebase, if you want. Yay open source!

    [1] https://github.com/mito-ds/monorepo/blob/974091b455950c6c50e...

  • qgrid

    An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks

  • I played around with many of these before:

    https://github.com/quantopian/qgrid

  • dtale

    Visualizer for pandas data structures

  • https://github.com/man-group/dtale

    I find that I'm actually a lot faster using basic Pandas methods to get the data I want in exactly the form I want it.

    If I really want to show everything, I just use:

    '''
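
The snippet in the comment above is cut off in this summary. As a stand-in, here is a minimal sketch of one common way to display a whole DataFrame in pandas, using the `pd.option_context` display options. It is an assumption about what "show everything" meant, not the commenter's code, and `full_repr` is a hypothetical helper name.

```python
import pandas as pd


def full_repr(df: pd.DataFrame) -> str:
    """Return the text rendering of df with row/column truncation disabled.

    Hypothetical helper: the display options are lifted only inside the
    `with` block, so the global pandas settings are left untouched.
    """
    with pd.option_context(
        "display.max_rows", None,      # None means "no row limit"
        "display.max_columns", None,   # None means "no column limit"
    ):
        return repr(df)


# Example data (made up for illustration): 100 rows, more than the
# default row limit, so the default repr would elide the middle rows.
df = pd.DataFrame({"x": range(100)})
text = full_repr(df)
print(text.splitlines()[0])   # header line
print(text.splitlines()[-1])  # last data row
```

In a notebook, calling `display(df)` inside the same `with` block should give the untruncated HTML table instead of plain text.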

  • pandas-profiling

    Create HTML profiling reports from pandas DataFrame objects. Discontinued; moved to https://github.com/ydataai/pandas-profiling (by pandas-profiling)

  • For those who are going through the thread finding new tools: pandas-profiling[0] is a library for automatic EDA (part of what bamboolib[1] does).

    [0]: https://github.com/pandas-profiling/pandas-profiling

  • vegafusion

    Serverside scaling for Vega and Altair visualizations

  • One cool library I saw recently for helping on the visualisation side is https://github.com/vegafusion/vegafusion

    It allows you to use Altair in Python for visualising data, but does the computation in the backend using Arrow DataFusion. Not for 15GB perhaps, but cool nonetheless.

  • lux

    Automatically visualize your pandas dataframe via a single print! 📊 💡 (by lux-org)

  • Altair

    Declarative statistical visualization library for Python

  • If you can write visualisations in Python itself, I am a big fan of Altair's syntax (https://github.com/altair-viz/altair), which is based on vega-lite. A while back, I wrote a brief guide and comparison of the main plotting libraries: https://datapane.com/reports/87NNEJ7/the-ultimate-guide-to-p...

    One benefit of having them in actual code is that you can programmatically automate the creation of things like dashboards and reports. For instance, you can schedule a script to share an interactive plot every Monday morning, or build a live dashboard that updates every 10 minutes. This opens up a lot of possibilities that would be impossible in a traditional drag-and-drop tool.
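
To make the "plots as code" point in the last comment concrete, here is a minimal sketch that builds a chart specification from data in a plain function. It targets Vega-Lite JSON directly (the format Altair is based on) rather than Altair's own API, so it runs with only the standard library. The function name, field names, and sample data are hypothetical.

```python
import json


def weekly_bar_spec(rows, title):
    """Build a Vega-Lite bar chart spec (a plain dict) from a list of records.

    Assumes each record has a "day" and a "signups" key; both names are
    made up for this example.
    """
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "title": title,
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": "day", "type": "nominal"},
            "y": {"field": "signups", "type": "quantitative"},
        },
    }


# Sample data (made up). In a scheduled job this would come from a query.
rows = [
    {"day": "Mon", "signups": 12},
    {"day": "Tue", "signups": 7},
    {"day": "Wed", "signups": 15},
]

spec = weekly_bar_spec(rows, "Signups this week")

# Serialize the spec; a scheduler could write this to a file or embed it
# in a report page every time the script runs.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Because the chart is just the return value of a function, regenerating it on a schedule amounts to calling that function again with fresh rows.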
