Python datasette

Open-source Python projects categorized as datasette

Top 23 Python datasette Projects

  • datasette

    An open source multi-tool for exploring and publishing data

    Project mention: Seeking Help to Preserve Rare WWII Database on Windows 98 | news.ycombinator.com | 2023-09-18

    Looks like it's an Access database. Perhaps convert it to SQLite and publish with something like https://datasette.io/?

    I think the problem is, thread author doesn't know how to rip an ISO of the CD or move the database out; looks like they are getting help already though.

  • sqlite-utils

    Python CLI utility and library for manipulating SQLite databases

    Project mention: Welcome to Datasette Cloud | news.ycombinator.com | 2023-08-20

    There are a few things you can do here.

    SQLite is great at JSON - so I often dump JSON structures in a TEXT column and query them using https://www.sqlite.org/json1.html

    I also have plugins for running jq() functions directly in SQL queries - https://datasette.io/plugins/datasette-jq and https://github.com/simonw/sqlite-utils-jq

    I've been trying to drive the cost of turning semi-structured data into structured SQL queries down as much as possible with https://sqlite-utils.datasette.io - see this tutorial for more: https://datasette.io/tutorials/clean-data

    This is also an area that I'm starting to explore with LLMs. I love the idea that you could take a bunch of messy data, tell Datasette Cloud "I want this imported into a table with this schema"... and it does that.

    I have a prototype of this working now, I hope to turn it into an open source plugin (and Datasette Cloud feature) pretty soon. It's using this trick: https://til.simonwillison.net/gpt3/openai-python-functions-d...
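
The "JSON in a TEXT column" approach from the comment above can be sketched with Python's standard `sqlite3` module. This is a minimal illustration, not code from the comment's author: the `events` table, the `load_events` and `names_by_city` helpers, and the sample records are all hypothetical, and it assumes the underlying SQLite build includes the JSON functions (most current builds do).

```python
import json
import sqlite3


def load_events(conn, events):
    """Store each dict as a JSON string in a plain TEXT column."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO events (body) VALUES (?)",
        [(json.dumps(event),) for event in events],
    )


def names_by_city(conn, city):
    """Filter and project on nested JSON fields using json_extract()."""
    rows = conn.execute(
        "SELECT json_extract(body, '$.name') FROM events "
        "WHERE json_extract(body, '$.address.city') = ? ORDER BY id",
        (city,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
load_events(
    conn,
    [
        {"name": "Ada", "address": {"city": "London"}},
        {"name": "Grace", "address": {"city": "New York"}},
        {"name": "Alan", "address": {"city": "London"}},
    ],
)
london = names_by_city(conn, "London")
```

Keeping the raw JSON in one column defers schema decisions; fields that turn out to be queried often can later be promoted to real columns.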

  • csvs-to-sqlite

    Convert CSV files into a SQLite database
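
The general idea of loading a CSV file into SQLite can be sketched with the standard library alone. This is not how csvs-to-sqlite itself is implemented; `csv_to_table` and `quote_identifier` are hypothetical helpers, every column is stored as TEXT, and the sketch assumes each row has the same number of fields as the header.

```python
import csv
import io
import sqlite3


def quote_identifier(name):
    """Quote a table or column name for SQLite, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def csv_to_table(conn, table, csv_file):
    """Create `table` from the CSV header row and insert the remaining rows."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f"{quote_identifier(name)} TEXT" for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE {quote_identifier(table)} ({columns})")
    conn.executemany(
        f"INSERT INTO {quote_identifier(table)} VALUES ({placeholders})", reader
    )
    return header


conn = sqlite3.connect(":memory:")
data = io.StringIO("name,stars\ndatasette,8272\nsqlite-utils,1316\n")
csv_to_table(conn, "projects", data)
rows = conn.execute("SELECT name, stars FROM projects ORDER BY name").fetchall()
```

Because every column is TEXT here, numeric values come back as strings; type detection is left out to keep the sketch short.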

  • twitter-to-sqlite

    Save data from Twitter to a SQLite database

  • github-to-sqlite

    Save data from GitHub to a SQLite database

    Project mention: Automating screenshots for the Datasette documentation using shot-scraper | news.ycombinator.com | 2022-10-15

    I have trouble answering this question myself, and I created it!

    The problem I have is that it can be applied to too many different problems.

    I personally have used it for the following (a truncated summary):

    - Publishing data online to allow other people to explore it, for example https://scotrail.datasette.io and https://russian-ira-facebook-ads.datasettes.com/

    - Building websites, by combining it with custom templates. https://datasette.io and https://www.niche-museums.com and https://til.simonwillison.net are three examples

    - Building my own combined search engine over a bunch of different data. https://github-to-sqlite.dogsheep.net is this for my GitHub issues and commits and issue comments across 100+ projects

    - Similarly, building a code search engine across multiple repos (partly to demonstrate how far you can go with custom plugins): https://ripgrep.datasette.io

    - Any time I have a CSV file I open it in the Datasette Desktop macOS app first to start exploring it: https://datasette.io/desktop

    - As a prototyping tool. It's the fastest way I know of to get from some data files (CSV or JSON) to a working JSON API - and a GraphQL API too using this plugin: https://datasette.io/plugins/datasette-graphql

    - Messing around with geospatial data - here's a write-up of my favourite experiment with that so far: https://simonwillison.net/2021/Jan/24/drawing-shapes-spatial...

    This is a bewilderingly wide array of things! And I keep on finding new problems I can apply it to.

    Of course, if all you have is a hammer, everything looks like a nail. But thanks to the plugin system (and the amazing flexibility of SQLite under the hood) I can reshape my hammer into all sorts of interesting shapes!

    I've been trying to capture some of this at https://datasette.io/for

    This is one of my biggest marketing challenges for the project though. If someone asks you for an elevator pitch you need to do better than spending 15 minutes talking through a wide ranging bulleted list!

  • dogsheep-beta

    Build a search index across content from multiple SQLite database tables and run faceted searches against it using Datasette

  • healthkit-to-sqlite

    Convert an Apple Healthkit export zip to a SQLite database

    Project mention: Coping strategies for the serial project hoarder | news.ycombinator.com | 2022-11-28

    > Technically I’m actively maintaining all of them, in that if someone reports a bug I’ll push out a fix.

    Ironically, I toned down my enthusiasm for this author's (many) projects after my initial perusing led me to something interesting[0] that didn't work, and the subsequent issues and minor (but linked to issues!) PRs I contributed went completely without response for the last few years. They're still open.

    To be clear, I'm grateful for the work the author is freely providing for me and the world! And I could certainly do a better job with some of the projects I help maintain as well. He's under no obligation to respond to issues if he doesn't have time or just doesn't want to. But it does speak to how difficult it can be to maintain over a hundred projects, even if you have a system.

    [0]: https://github.com/dogsheep/healthkit-to-sqlite

  • datasette-dashboards

    Datasette plugin providing data dashboards from metadata

  • pocket-to-sqlite

    Create a SQLite database containing data from your Pocket account

  • google-takeout-to-sqlite

    Save data from Google Takeout to a SQLite database

  • datasette-graphql

    Datasette plugin providing an automatic GraphQL API for your SQLite databases

    Project mention: Tuql: Automatically create a GraphQL server from a SQLite database | news.ycombinator.com | 2023-04-25

    Impressive how little code is involved here! This is really neat.

    The biggest feature I can see that's missing is pagination - it looks like this doesn't have a way to retrieve e.g. ten results, then pass a next token to get back the next set.

    Here's how I implemented pagination in my similar datasette-graphql plugin (which also gives you a GraphQL API for an existing SQLite database): https://github.com/simonw/datasette-graphql#pagination
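
The "retrieve ten results, then pass a next token" pattern described above can be sketched as keyset pagination over a SQLite table. This is a generic illustration and not the datasette-graphql implementation: the `items` table and the `fetch_page` helper are hypothetical, and the token is simply the last primary key seen.

```python
import sqlite3


def fetch_page(conn, after=None, page_size=10):
    """Return (rows, next_token); next_token is None on the last page."""
    # Ask for one extra row so we can tell whether another page exists.
    rows = conn.execute(
        "SELECT id, title FROM items "
        "WHERE (? IS NULL OR id > ?) ORDER BY id LIMIT ?",
        (after, after, page_size + 1),
    ).fetchall()
    has_more = len(rows) > page_size
    rows = rows[:page_size]
    next_token = rows[-1][0] if has_more else None
    return rows, next_token


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO items (title) VALUES (?)",
    [(f"item {n}",) for n in range(1, 26)],
)

# Walk every page by feeding each next_token back in.
pages = []
token = None
while True:
    rows, token = fetch_page(conn, after=token)
    pages.append(rows)
    if token is None:
        break
```

Filtering on `id > ?` instead of using `OFFSET` keeps each page lookup cheap on large tables, since SQLite can seek directly on the primary key.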

  • datasette-ripgrep

    Web interface for searching your code using ripgrep, built as a Datasette plugin

    Project mention: GitHub – GSA/code-gov: An informative repo for all Code.gov repos | news.ycombinator.com | 2023-09-09

    https://github.com/simonw/datasette-ripgrep

    Seeing as there's already a JSONLD @context (schema) for code.json, CSVW as JSONLD and/or YAMLLD would be an easy way to merge Linked Data graphs of tabular data:

  • covid-19-datasette

    Deploys a Datasette instance of COVID-19 data from Johns Hopkins CSSE and the New York Times

  • datasette-scraper

    Add website scraping abilities to Datasette

    Project mention: GitHub – GSA/code-gov: An informative repo for all Code.gov repos | news.ycombinator.com | 2023-09-09

    https://github.com/cldellow/datasette-scraper/#architecture

    TIL datasette-scraper parses HTML with selectolax, and selectolax with Modest or Lexbor is ~25x faster at HTML parsing than BeautifulSoup in the selectolax benchmark.

  • swarm-to-sqlite

    Create a SQLite database containing your checkin history from Foursquare Swarm

  • datasette-chatgpt-plugin

    A Datasette plugin that turns a Datasette instance into a ChatGPT plugin

    Project mention: Chat with your database using AI | news.ycombinator.com | 2023-04-09

    In my own experiments I've caught ChatGPT running the correct query but then hallucinating the results, because the response was too long for the token context window and got truncated!

    I have an open issue about that here: https://github.com/simonw/datasette-chatgpt-plugin/issues/2

    More about my explorations: https://simonwillison.net/2023/Mar/24/datasette-chatgpt-plug...
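
One possible way to make the truncation described above visible to the caller is to cut the result set deliberately and say so in the response. The sketch below is an assumption-laden illustration, not something taken from the plugin or the linked issue: `fit_rows_to_budget` is a hypothetical helper, and it uses character counts as a rough stand-in for tokens.

```python
import json


def fit_rows_to_budget(rows, max_chars=2000):
    """Keep as many leading rows as fit, and report whether any were dropped."""
    kept = []
    used = 0
    for row in rows:
        size = len(json.dumps(row))
        if used + size > max_chars:
            break
        kept.append(row)
        used += size
    return {
        "rows": kept,
        "total_rows": len(rows),
        "truncated": len(kept) < len(rows),
    }


rows = [{"id": n, "name": f"row {n}"} for n in range(100)]
result = fit_rows_to_budget(rows, max_chars=200)
```

Returning an explicit `truncated` flag and the total row count gives the consumer a signal that the data is incomplete, rather than leaving it to guess at the missing rows.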

  • hacker-news-to-sqlite

    Create a SQLite database containing data pulled from Hacker News

  • datasette-auth-github

    Datasette plugin that authenticates users against GitHub

  • timezones-api

    A Datasette-powered API for finding the time zone for a latitude/longitude point

  • datasette-auth-passwords

    Datasette plugin for authentication using passwords

    Project mention: Database design for project manager [noob] | /r/Database | 2023-01-25

    I did the bad thing and DID roll it myself - here's my most recent code for it (again, mostly copying what Django does which reduces the risk a bit): https://github.com/simonw/datasette-auth-passwords/blob/main/datasette_auth_passwords/utils.py
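
For readers unfamiliar with the Django-style approach the comment refers to, here is a minimal sketch of salted PBKDF2 password hashing using only the standard library. It is not the code from the linked `utils.py`: the function names, the `algorithm$iterations$salt$hash` layout as written here, and the iteration count are illustrative assumptions.

```python
import base64
import hashlib
import secrets

ALGORITHM = "pbkdf2_sha256"


def hash_password(password, salt=None, iterations=260000):
    """Return 'algorithm$iterations$salt$hash' for the given password."""
    if salt is None:
        salt = secrets.token_hex(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("utf-8"), iterations
    )
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{ALGORITHM}${iterations}${salt}${encoded}"


def verify_password(password, stored):
    """Re-hash with the stored salt and iterations, then compare in constant time."""
    if not stored or stored.count("$") != 3:
        return False
    algorithm, iterations, salt, _ = stored.split("$", 3)
    if algorithm != ALGORITHM or not iterations.isdigit():
        return False
    candidate = hash_password(password, salt, int(iterations))
    return secrets.compare_digest(stored, candidate)


stored = hash_password("correct horse")
ok = verify_password("correct horse", stored)
bad = verify_password("wrong guess", stored)
```

Storing the iteration count alongside the hash lets it be raised later without invalidating existing passwords.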

  • datasette-plugin

    Cookiecutter template for creating Datasette plugins

  • ibis-datasette

    An ibis backend for querying datasette

  • datasette-dateutil

    dateutil functions for Datasette

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-09-18.

Index

What are some of the best open-source datasette projects in Python? This list will help you:

 #  Project                    Stars
 1  datasette                  8,272
 2  sqlite-utils               1,316
 3  csvs-to-sqlite               779
 4  twitter-to-sqlite            389
 5  github-to-sqlite             328
 6  dogsheep-beta                172
 7  healthkit-to-sqlite          168
 8  datasette-dashboards         120
 9  pocket-to-sqlite              96
10  google-takeout-to-sqlite      85
11  datasette-graphql             83
12  datasette-ripgrep             70
13  covid-19-datasette            61
14  datasette-scraper             55
15  swarm-to-sqlite               55
16  datasette-chatgpt-plugin      51
17  hacker-news-to-sqlite         43
18  datasette-auth-github         40
19  timezones-api                 27
20  datasette-auth-passwords      20
21  datasette-plugin              16
22  ibis-datasette                11
23  datasette-dateutil             7