Top 20 Jupyter Notebook Data Projects
Data and code behind the articles and graphics at FiveThirtyEight
Project mention: [Effortpost] Advanced stats on which players are contributing the most to the Heat's playoff run. | reddit.com/r/heat | 2023-05-24
To answer these questions I decided to look at 538’s RAPTOR ratings. RAPTOR uses player tracking data to estimate how much each player contributes on the offensive and defensive ends. The total RAPTOR score should be something like the “number of points a player contributes to his team’s offense and defense per 100 possessions, relative to a league-average player.” Higher is better; the best mark during the regular season was Nikola Jokic’s +14. You can read more about it or play with an interactive tool on 538’s website. I don’t really care about the details of why it’s a good statistic, but it seems pretty helpful and, most importantly for my purposes, the data is free to download.
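As a rough illustration of the kind of ranking described above, here is a minimal Python sketch. It assumes a CSV with the columns `player_name`, `raptor_offense`, and `raptor_defense`, and that the total rating is the sum of the offensive and defensive components; both are assumptions, so check them against the headers of the file you actually download. The rows are made-up placeholders, not real ratings.

```python
import csv
import io

# Placeholder rows for illustration only; these are NOT real RAPTOR values.
# Column names are assumptions and may differ from the downloaded file.
SAMPLE_CSV = """player_name,raptor_offense,raptor_defense
Player A,2.0,1.5
Player B,-0.5,3.0
Player C,1.0,-2.0
"""


def rank_by_total(csv_text):
    """Return (player, total) pairs sorted from best to worst total rating."""
    rows = csv.DictReader(io.StringIO(csv_text))
    totals = [
        # Assumed definition: total = offensive part + defensive part.
        (row["player_name"], float(row["raptor_offense"]) + float(row["raptor_defense"]))
        for row in rows
    ]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, total in rank_by_total(SAMPLE_CSV):
        print(f"{name}: {total:+.1f}")
```

To use it on real data, read the downloaded file into a string (or pass the open file to `csv.DictReader` directly) and filter to the team and season you care about first.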
🎁 4,800,000+ Unsplash images made available for research and machine learning (by unsplash)
Project mention: Where can I get lots of clean open source data? | reddit.com/r/DataHoarder | 2022-12-14
Quilt is a data mesh for connecting people with actionable data
Query data on the command line with SQL-like SELECTs powered by Python expressions
Project mention: Command-line data analytics made easy with SPyQL | dev.to | 2022-11-06
SPyQL documentation: spyql.readthedocs.io
Easy pipelines for pandas DataFrames.
🌱 Join a community of developers at Microsoft Reactor and connect with people, skills, and technology to build your career or personal learning. We offer free livestreams, on-demand content, and hybrid/in-person events daily around the world.
Project mention: Michael Mumbauer speaks to a packed crowd at Microsoft Reactor SF during GDC2023 talking all things Ashfall - the multimedia AAA IP utilizing Hedera to unleash the full potential of web3 entertainment. I’ll post video when available. | reddit.com/r/Hedera | 2023-03-22
Data Quality assessment with one line of code
Open-Source Software, Tutorials, and Research on Data-Centric AI 🤖
Project mention: [Q] How to generate synthetic dataset for anomaly detection? | reddit.com/r/statistics | 2023-05-08
Maybe you can use a synthetic data generator and use your current dataset as input? I believe there are a lot of GAN-based models for this purpose out there. The ones listed on https://github.com/Data-Centric-AI-Community/awesome-data-centric-ai are mostly focused on structured data, but I'm sure there are similar packages for images.
The data behind uawardata.com
Project mention: I found an interactive map locating all Russian military forces. Can anyone verify how up-to-date it is? | reddit.com/r/UkrainianConflict | 2022-12-26
They provide dates with their data, but the data only goes up to September. The raw data is also available on Github, if you need that as well: https://github.com/simonhuwiler/uawardata
Jupyter Notebooks and Data Sets for Pandas Library (by TirendazAcademy)
Project mention: The Machine Learning Project Lifecycle | reddit.com/r/learnmachinelearning | 2022-11-05
German NER on Legal Data using BERT
A simple CLIP based project for combining images from multiple datasets.
A tool for recording telemetry from Assetto Corsa Competizione (on PC) for post-session analysis
Project mention: How to record FFB signal? | reddit.com/r/ACCompetizione | 2022-12-16
I coded this in Python to record shared memory data into a CSV - with some modification it should do what you need: https://github.com/ThomasBisset/ACC_Data_2
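The general idea of polling a data source and appending each reading to a CSV can be sketched as follows. This is not the code from the linked repository: `record_samples`, `read_sample`, and the field names are hypothetical, and the actual shared-memory read is left to the caller (on Windows it would typically go through Python's `mmap` module).

```python
import csv
import io
import time


def record_samples(read_sample, out, fields, n_samples, interval_s=0.0):
    """Poll read_sample() n_samples times and write one CSV row per reading.

    read_sample is a hypothetical callable returning a dict of telemetry
    values; how it obtains them (e.g. from shared memory) is up to the caller.
    """
    writer = csv.DictWriter(out, fieldnames=["t"] + list(fields))
    writer.writeheader()
    start = time.monotonic()
    for _ in range(n_samples):
        sample = read_sample()
        # Timestamp each row relative to the start of the recording.
        row = {"t": round(time.monotonic() - start, 3)}
        row.update({name: sample[name] for name in fields})
        writer.writerow(row)
        if interval_s > 0:
            time.sleep(interval_s)
    return n_samples


if __name__ == "__main__":
    # Stand-in reader with fixed placeholder values instead of real telemetry.
    fake = iter([{"steer": 0.1, "ffb": 0.5}, {"steer": 0.2, "ffb": 0.4}])
    buffer = io.StringIO()
    record_samples(lambda: next(fake), buffer, ["steer", "ffb"], n_samples=2)
    print(buffer.getvalue())
```

Passing the reader in as a callable keeps the CSV logic testable without the game running; to record to disk, open the output file with `newline=""` as the `csv` module recommends.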
The most common possible readings of the most frequently used Kanji characters.
Full-stack search engine built from YouTube videos obtained via web scraping
Project mention: Video search engine with OpenSearch and React | Part 3 | Cleaning and storing the data | dev.to | 2022-11-18
Walmart Coffee Exploratory Data Analysis. Data Extracted with SerpApi 🧡
Project mention: Web scraping Walmart Search with Nodejs | dev.to | 2023-01-26
📌 Note: To learn more about scraping Walmart, also see the SerpApi Python demo project, which extracts data from 500 Walmart stores and analyzes it.
This animated map shows the change in surface temperature around the world from 1970 to 2021, based on data from Kaggle.
Project mention: [OC] Animated Map of Global Temperature Changes from 1970 to 2021 | reddit.com/r/dataisbeautiful | 2023-02-23
GitHub Repo: DovarFalcone
Highlights Russian losses with predictions based on historic data from the Ministry of Defence of Ukraine
Project mention: [OC] 1 Year Russian Personnel Prediction Losses in Russo-Ukraine War | reddit.com/r/dataisbeautiful | 2022-10-13
There's also a dataset: https://www.kaggle.com/datasets/dimitryzub/russian-losses-in-russia-ukraine-war or https://github.com/dimitryzub/russo-ukraine-war-prediction-losses
Data extractor for the 2021 presidential election. This script collects all the data from the official ONPE page.
Jupyter Notebook Data related posts
[Effortpost] Advanced stats on which players are contributing the most to the Heat's playoff run.
1 project | reddit.com/r/heat | 24 May 2023
Consanguineous marriage percentage per country
1 project | reddit.com/r/dataisbeautiful | 23 May 2023
USMNT is a European club. How did they do this season?
1 project | reddit.com/r/ussoccer | 22 May 2023
Derrick White's WAR over the past season has been ~6.7 according to a composite of various metrics. Derrick White's WAR in the playoffs has been ~0.1 according to RAPTOR. The worst among the main Boston roster
1 project | reddit.com/r/nba | 19 May 2023
Nate Silver: Some personal news
2 projects | news.ycombinator.com | 2 May 2023
In honor of Sexual Assault Awareness Month, make sure neither you nor friends harbor any misconceptions about consent
1 project | reddit.com/r/MensLib | 30 Apr 2023
CMV: When selecting a movie to watch, the audience's rating is the only thing that matters and the critic's rating is entirely irrelevant.
1 project | reddit.com/r/changemyview | 29 Apr 2023
What are some of the best open-source Data projects in Jupyter Notebook? This list will help you: