awesome-public-datasets
tensorboard
Metric | awesome-public-datasets | tensorboard
---|---|---
Mentions | 78 | 11
Stars | 58,470 | 6,542
Growth | 0.8% | 0.4%
Activity | 5.1 | 9.4
Last commit | 16 days ago | 7 days ago
Language | - | TypeScript
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
awesome-public-datasets
- Awesome Public Datasets
- How to practice data analytics skills
Merry Christmas buddy.
You'll find a ton of public datasets on GitHub [1].
Maven Analytics offers a monthly data analytics challenge [2] that you can enter for free. See their past competitions for some interesting datasets.
As I'm based in Ireland I'll also recommend the Irish Data Portal [3].
[1] https://github.com/awesomedata/awesome-public-datasets
- Are there people out there who still like Sam Altman - AI IS AT DANGER
- The Data Engineering Docker-Compose Starter Kit
The "data.csv" file contains historical weather data from Tel Aviv, sourced from another article I wrote. If you wish, you can swap it with a public dataset, for example, from here.
- suggestions for personal GitHub projects in economics/econometrics
If you want something personal and fun, you will need lots of different data. As such, you can take a look at these publicly available datasets. Maybe you can find out some interesting relationships.
- Where do you get your data when you have an obscure idea for a dashboard?
Some others I use: https://data.world/search; this GitHub project with links: https://github.com/awesomedata/awesome-public-datasets; and Data.fivethirtyeight.com. r/datasets and similar subreddits can also be of help.
- Full Stack Data Science Project Ideas
There's a lot in here; it's a good start.
- Where to find big datasets?
- How do you keep track of useful datasets?
- Complete: D214 - MSDA Capstone
GitHub: Awesome Public Datasets. I didn't find much of use here, as much of it was either very specialized or very large datasets. But maybe you'll find something of use here.
tensorboard
- Tensorboard
- [D] Visualizing layer weights
Some form of 3D histograms? And then "discretized"/binned for each layer too. Apparently Tensorboard has them: https://github.com/tensorflow/tensorboard/blob/master/docs/r1/histograms.md
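To make the "binned for each layer" idea concrete, here is a minimal plain-Python sketch of per-layer histograms. It is not TensorBoard's implementation; the function name `layer_histogram`, the layer names, and the random weights are all hypothetical stand-ins for a real model's parameters.

```python
import random


def layer_histogram(weights, bins=10):
    """Bin a flat list of weights into equal-width buckets.

    Returns (edges, counts), where edges has bins + 1 entries.
    """
    lo, hi = min(weights), max(weights)
    if lo == hi:
        # All weights identical: widen the range so the bin width is non-zero.
        lo, hi = lo - 0.5, hi + 0.5
    width = (hi - lo) / bins
    counts = [0] * bins
    for w in weights:
        # Clamp so the maximum value lands in the last bin.
        index = min(int((w - lo) / width), bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


random.seed(0)
# Hypothetical model: layer name -> flat list of weights.
layers = {
    "dense_1": [random.gauss(0.0, 1.0) for _ in range(1000)],
    "dense_2": [random.gauss(0.0, 0.1) for _ in range(1000)],
}
histograms = {name: layer_histogram(w, bins=8) for name, w in layers.items()}
for name, (edges, counts) in histograms.items():
    print(name, counts)
```

Computing one such histogram per layer at each training step and stacking them over time gives the kind of "3D" view the comment describes.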
- I think I broke PIP
- [D] Unpopular Opinion: I hate the tensorboard Smoothing algorithm and always set the slider to 0.
Consider filing an issue? https://github.com/tensorflow/tensorboard/issues
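For context on the smoothing discussion: as far as I understand it, the smoothing slider on scalar charts applies an exponential moving average with a debiasing correction. The sketch below is a plain-Python approximation under that assumption, not TensorBoard's actual code; the function name `smooth` and the sample values are hypothetical.

```python
def smooth(values, weight):
    """Debiased exponential moving average over a list of scalars.

    weight is the smoothing factor in [0, 1); 0 means no smoothing.
    This is an assumed approximation of the slider's behaviour.
    """
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")
    smoothed = []
    last = 0.0
    for step, value in enumerate(values, start=1):
        last = last * weight + (1.0 - weight) * value
        # Correct the bias toward the zero starting value in early steps.
        debias = 1.0 - weight ** step
        smoothed.append(last / debias)
    return smoothed


raw = [1.0, 3.0, 2.0, 5.0]
print(smooth(raw, 0.0))  # weight 0 leaves the series unchanged
print(smooth(raw, 0.6))  # heavier smoothing lags behind the raw values
```

With a weight of 0 the function returns the raw series, which corresponds to setting the slider to 0 as the post title describes.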
- Parts of Tensorboard are being rewritten in Rust for a 100× to 400× speedup
The async code is in our server.rs and cli.rs, because this exposes a Tonic server and Tonic is all-in on async.
- [D] Comparison of experiment tracking tools
A quick google search is telling me that this is possible but very poorly documented / communicated: https://github.com/tensorflow/tensorboard/issues/767
- Making the Printed Links Clickable Using TensorFlow 2 Object Detection API
The cool part about TensorBoard is that we may run it directly in Google Colab. However, if you're running the notebook in your local installation of Jupyter, you may also install it as a Python package and launch it from the terminal.
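The two launch routes mentioned above can be sketched as follows. In a Colab or Jupyter cell, the usual route is the notebook magics `%load_ext tensorboard` followed by `%tensorboard --logdir logs`. For a local install (`pip install tensorboard`), the small Python wrapper below starts the command-line tool; the helper names and the `logs` directory are hypothetical, and 6006 is assumed as the port.

```python
import shutil
import subprocess
import sys


def tensorboard_command(logdir, port=6006):
    """Build the argument list for launching TensorBoard on a log directory."""
    return ["tensorboard", "--logdir", str(logdir), "--port", str(port)]


def launch_tensorboard(logdir, port=6006):
    """Start TensorBoard in the background.

    Returns the subprocess.Popen handle, or None if the CLI is not installed.
    """
    if shutil.which("tensorboard") is None:
        print("tensorboard not found; try: pip install tensorboard", file=sys.stderr)
        return None
    return subprocess.Popen(tensorboard_command(logdir, port))


# Example (not run here): process = launch_tensorboard("logs")
# Then open http://localhost:6006 in a browser; call process.terminate() to stop it.
```

Running `tensorboard --logdir logs` directly in a terminal does the same thing without the wrapper.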
What are some alternatives?
labelImg - LabelImg is now part of the Label Studio community. The popular image annotation tool created by Tzutalin is no longer actively being developed, but you can check out Label Studio, the open source data labeling tool for images, text, hypertext, audio, video and time-series data.
aim - Aim: An easy-to-use & supercharged open-source experiment tracker.
tidytuesday - Official repo for the #tidytuesday project
wandb - A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
reddit-top-2.5-million - This is a dataset of the all-time top 1,000 posts, from the top 2,500 subreddits by subscribers, pulled from reddit between August 15–20, 2013.
data - Data and code behind the articles and graphics at FiveThirtyEight
tesseract-ocr - Tesseract Open Source OCR Engine (main repository)
zsv - zsv+lib: tabular data swiss-army knife CLI + world's fastest (simd) CSV parser
models - Models and examples built with TensorFlow
quickdraw-dataset - Documentation on how to access and use the Quick, Draw! Dataset.
rustboard - just-for-fun reimplementation of TensorBoard backend in Rust