|6 days ago||about 2 months ago|
|GNU General Public License v3.0 or later||Apache License 2.0|
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Metadata Store - Which one to Choose? OpenMetadata vs Datahub?
5 projects | /r/dataengineering | 17 Nov 2022
We use Kubernetes as our deployment platform. Any feedback on one of these open source data catalogs?
- https://atlas.apache.org/#/
- https://opendatadiscovery.org/
- https://open-metadata.org/
- https://marquezproject.github.io/marquez/
- https://datahubproject.io/
- https://www.amundsen.io/
- https://ckan.org/
- https://magda.io/
How to start Data Science and Machine Learning Career?
2 projects | /r/ReviewNPrep | 24 Nov 2021
We are digitisers at the Natural History Museum in London, on a mission to digitise 80 million specimens and free their data to the world. Ask us anything!
4 projects | /r/datasets | 8 Mar 2021
We publish all our data on the [Data Portal](https://data.nhm.ac.uk), a Museum project that's been running since 2014. Instead of MediaWiki it runs on an open-source Python framework called [CKAN](https://ckan.org), which is designed for hosting datasets - though we've had to adapt it in various ways so that it can handle such large amounts of data.
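For readers who have not used CKAN, here is a minimal sketch of how a CKAN-hosted portal is typically queried through CKAN's Action API. It assumes the portal exposes the standard `/api/3/action/package_search` endpoint at its base URL. The helper names and the sample response are hypothetical and hand-written for illustration (not real Data Portal data), and the sketch makes no network call.

```python
import json
from urllib.parse import urlencode


def package_search_url(base_url, query, rows=5):
    """Build a URL for CKAN's package_search action (dataset search)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Pull dataset titles out of a package_search JSON response.

    Falls back to the dataset's machine name when no title is set.
    """
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return [pkg.get("title", pkg["name"]) for pkg in payload["result"]["results"]]


# Hand-written sample in the general shape CKAN returns; NOT real portal data.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "example-a", "title": "Example dataset A"},
            {"name": "example-b"},
        ],
    },
})

# Assumption: the portal serves the Action API under its base URL.
url = package_search_url("https://data.nhm.ac.uk", "specimens")
titles = dataset_titles(SAMPLE_RESPONSE)
```

To run this against a live portal, you would fetch `url` with any HTTP client and pass the response body to `dataset_titles`. A heavily customised deployment may return extra fields or restrict some actions.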
Where can neural networks take me? - Semi-existential crisis
4 projects | /r/neuralnetworks | 27 Feb 2023
What Can I Do With My Time as a Substitute for Strategy Computer Games?
2 projects | /r/slatestarcodex | 22 Jan 2023
You could try Kaggle competitions, or participate in forecasting markets (as you stated). You don't need any specific skill set to be a forecaster: the rules of the bet are stipulated, and from there it's just based on your ability to predict the outcome. You could also try your hand at investing in the stock market, or try to make money betting on sports games. If you're very good at this stuff I'm sure you can make a lot of money doing it. The thing to keep in mind is that generally video games are much, much easier than real life.
What is the best advanced professional certification for Data Science/ML/DL/MLOps?
2 projects | /r/datascience | 10 Dec 2022
As to the specifics of your projects, that's up to you. Try browsing Kaggle; check out some of the work we have on The Pudding; check out some journalism examples to see what you can try to build on or improve.
Suggestions for projects on kaggle for cv?
2 projects | /r/datascience | 3 Dec 2022
Me: I'm always going to code for fun, even after I start working! Also me:
3 projects | /r/ProgrammerHumor | 29 Oct 2022
There are literally hundreds... Google competitive programming
Machine Learning for detecting anomalies in chess
2 projects | /r/chess | 11 Oct 2022
Kaggle: a platform for machine learning research where competitions are often hosted. There are A LOT of previous competitions you can look at and the prize funds associated with them. Companies like Zillow have participated, and you sincerely get the best practitioners applying their skills here.
Hi! We are Dr. Amanda Martin and JJ Brosnan, Developer and Python data scientist at Deephaven. Ask us anything about getting started in the data science industry, working with large data sets, and working with streaming data in Python.
8 projects | /r/IAmA | 27 Apr 2022
You mentioned looking at Kaggle. Did you look at the competitions? Here's the webpage. There are many that are currently going on, and many from the past that you can draw inspiration from (or solve yourself).
Data Science Competition
15 projects | dev.to | 25 Mar 2022
I'm a yr12 and I think I chose the wrong subjects for what I want to do later. Pls help.
2 projects | /r/universityofauckland | 18 Mar 2022
Simply messing around at home with Raspberry Pi projects, doing Kaggle competitions for fun, working through Project Euler (seriously, a friend of mine did that and it led to him switching careers from working as a doctor to landing a software developer job!), and doing whatever similar kind of stuff takes your fancy (watch Fireship to get lots of 100 second overviews of topics) will put you waaaaay ahead of someone who does High School "computing".
2 projects | /r/egg_irl | 10 Mar 2022
I am into AI research/related competitions based on real world applications too though, and there I do prioritise maintaining proper logs, readability and all that jargon.
What are some alternatives?
ArchivesSpace - The ArchivesSpace archives management tool
ArchiveBox - 🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
Archivematica - Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.
Access to Memory (AtoM) - Open-source, web application for archival description and public access.
Collective Access: Providence - Cataloguing and data/media management application
Activeloop Hub - Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake]
CKAN-meta - Metadata files for the CKAN
kuwala - Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data for data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demographics data b) Points of Interest from OpenStreetMap c) Google Popular Times
stable-baselines - A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
stable-baselines3 - PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
dwc - Darwin Core standard for sharing of information about biological diversity.
datasci-ctf - A capture-the-flag exercise based on data analysis challenges