Metric | opendata.cern.ch | data |
---|---|---|
Mentions | 13 | 117 |
Stars | 635 | 16,635 |
Growth | 0.6% | 0.2% |
Activity | 9.2 | 8.5 |
Last commit | 11 days ago | about 2 months ago |
Language | Python | Jupyter Notebook |
License | GNU General Public License v3.0 only | Creative Commons Attribution 4.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
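As a rough illustration of the growth figure, the sketch below computes month-over-month growth and works backwards from a growth percentage to the number of stars gained. It assumes growth is the plain percentage change between two monthly star counts, which this page does not spell out; the function names are hypothetical.

```python
def month_over_month_growth(previous_stars: int, current_stars: int) -> float:
    """Percentage change in stars from one month to the next (assumed definition)."""
    if previous_stars <= 0:
        raise ValueError("previous_stars must be positive")
    return 100.0 * (current_stars - previous_stars) / previous_stars


def implied_stars_gained(current_stars: int, growth_pct: float) -> float:
    """Stars gained over the month, inferred from the current count and growth."""
    # Invert the growth formula: current = previous * (1 + growth / 100).
    previous = current_stars / (1.0 + growth_pct / 100.0)
    return current_stars - previous


if __name__ == "__main__":
    print(round(implied_stars_gained(635, 0.6)))
    print(round(implied_stars_gained(16635, 0.2)))
```

Under this assumption, 0.6% growth on 635 stars corresponds to roughly 4 new stars in a month, and 0.2% on 16,635 stars to roughly 33.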
opendata.cern.ch
-
Observable 2.0, a static site generator for data apps
I think the idea of Framework is really good, but static data limits the applications: it excludes monitoring and other cases in which the data changes constantly while the dashboard itself stays the same. For example, I'd love to see a revamped Framework version of the LHC beam monitor and related pages (see https://op-webtools.web.cern.ch/vistar/, but check again in 2 months or so, when the accelerator is running).
In high-energy physics, ROOT is /the/ toolkit for data analysis, and I guess jsROOT (https://root.cern.ch/js/) could also be used to load data to be shown in Framework dashboards. I thought the idea of Framework as a blogging engine with powerful data visualization built-in could be very interesting. Think, for example, about physicists pulling open data (https://opendata.cern.ch) and writing about their analysis or someone pulling data from https://ourworldindata.org/ in their own visualizations to support their case while writing about a particular subject, etc.
-
NFS > FUSE: Why We Built Our Own NFS Server in Rust
> XetHub has the world’s first natively cross-platform, user-mode filesystem implementation, allowing you to mount arbitrarily large datasets on your machine.
Not really world's first. CERN has developed EOS (https://eos-web.web.cern.ch/) for many years, and even though it's not available natively on Windows, it is available on Linux and macOS. EOS uses FUSE, though, not NFS.
> This enables you to, in just a few seconds, locally mount ~660 GB of Llama 2 models or write DuckDB queries to analyze large parquet files and scan just the data you need.
If you mount all instances of EOS at CERN on your machine with the FUSE client, that in principle mounts hundreds of PB of data from LHC experiments, although much of this data requires special permissions to be accessed. However, there's also a lot of open data. See https://opendata.cern.ch/.
- Are modern physicists dancing with the devil?
-
Good Series, Tutorial, or Book on Particle Physics Analysis using Python or Root for Undergraduates
CERN Open Data has lots of examples from various collaborations: https://opendata.cern.ch/
-
If you are in the process of building your data analytics project/portfolio, here's a useful video where you can find all the datasets you need
https://opendata.cern.ch/ - datasets from CERN if you're interested in particle physics. Lots of image data.
-
Why atheists behave so unscientific?
data from CERN
-
Is the collected data available to the public?
See: https://opendata.cern.ch/
-
Before There Was Effective Altruism, There Was Effective Philanthropy
Huh? CERN publishes their data: https://opendata.cern.ch/ CERN is also pretty big on open source in general: https://home.cern/science/computing/open-source-open-science Again, the attitude seems to be, "There are times where we may not want to be 100% open, so let's assume there are good reasons it won't work for EA." I'm not saying everyone needs to publish their bank account numbers, passwords, and a video stream of the office bathroom. You can use sense and still be open.
-
[P] Official Imagen Website by Google Brain
CERN actually releases their data publicly and you are free to analyze them for yourself.
- What is the largest free data set that you know of?
data
-
Mastering Dataset Acquisition: A Comprehensive Guide
FiveThirtyEight Datasets: Datasets related to articles and investigations published by FiveThirtyEight.
-
[USMNT] It only took 20 caps for Jesus Ferreira to get double-digit goals. The fastest in #USMNT history.
You of course already know this answer, but just to put it into more perspective. Here are the SPI ranking equivalents to what he did with these 11 goals in Scotland and Switzerland.
-
[Effortpost] Advanced stats on which players are contributing the most to the Heat's playoff run.
To answer these questions I decided to look at 538’s RAPTOR ratings. RAPTOR uses player tracking data to estimate how much each player contributes on the offensive and defensive ends. The total RAPTOR score should be something like the “number of points a player contributes to his team’s offense and defense per 100 possessions, relative to a league-average player.” Higher is better; the best during the regular season has been Nikola Jokic at +14. You can read more about it here or play with an interactive tool on their website here. I don’t really care about the details of why it’s a good statistic, but it seems pretty helpful and, most importantly for my purposes, you can download the data here for free.
-
Consanguineous marriage percentage per country
EDIT: I came to this data from this repository which has a nice csv collection for machine training.
-
USMNT is a European club. How did they do this season?
Looks like we may actually be collectively underrating our guys now. That's an interesting change. Based on SPI (rating = 72.4) we would be:
- Derrick White's WAR over the past season has been ~6.7 according to a composite of various metrics. Derrick White's WAR in the playoffs has been ~0.1 according to RAPTOR, the worst among the main Boston roster.
-
Nate Silver: Some personal news
Before Disney/ABC get any -ideas-, might be a good chance to get our hands on at least their data[0]!
[0]: https://data.fivethirtyeight.com/
-
In honor of Sexual Assault Awareness Month, make sure neither you nor friends harbor any misconceptions about consent
Most young women expect words to be involved when their partner seeks their consent. 43% of young men actually ask for verbal confirmation of consent. Overall, verbal indicators of consent or nonconsent are more common than nonverbal indicators. More open communication also increases the likelihood of orgasm for women.
- CMV: When selecting a movie to watch, the audience's rating is the only thing that matters and the critic's rating is entirely irrelevant.
-
Slight majority of people in WA want to leave state, poll finds
DHM does not use an equity sample. Of all polling operations they rank 250 out of 517. I'd like to see another pollster: https://github.com/fivethirtyeight/data/blob/master/pollster-ratings/pollster-ratings.csv
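To put a ranking like "250 out of 517" in context, the sketch below converts a rank into the share of pollsters at or above that position, and looks up a rank in CSV text. The column names `Pollster` and `Rank` and the sample rows are hypothetical placeholders, not the actual layout of pollster-ratings.csv, so adjust them to the real header before use.

```python
import csv
import io
from typing import Optional


def top_share(rank: int, total: int) -> float:
    """Percentage of pollsters ranked at or above `rank` (1 = best)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * rank / total


def find_rank(csv_text: str, pollster: str,
              name_col: str = "Pollster", rank_col: str = "Rank") -> Optional[int]:
    """Return the rank for `pollster`, or None if it is not listed.

    The default column names are assumptions, not the real file's header.
    """
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row[name_col] == pollster:
            return int(row[rank_col])
    return None


# Hypothetical sample rows, used only to show the lookup.
SAMPLE = "Pollster,Rank\nExample Pollster A,1\nExample Pollster B,2\n"

if __name__ == "__main__":
    print(find_rank(SAMPLE, "Example Pollster B"))
    print(round(top_share(250, 517), 1))
```

By this measure, a rank of 250 out of 517 sits just inside the top half of the list.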
What are some alternatives?
nfsserve - A Rust NFS Server implementation
uawardata - The data behind uawardata.com
awesome-public-datasets - A topic-centric list of HQ open datasets.
tidytuesday - Official repo for the #tidytuesday project
Mediawiki - 🌻 The collaborative editing software that runs Wikipedia. Mirror from https://gerrit.wikimedia.org/g/mediawiki/core. See https://mediawiki.org/wiki/Developer_access for contributing.
ydata-quality - Data Quality assessment with one line of code
wikdict-web - Web front end for WikDict dictionaries
quilt - Quilt is a data mesh for connecting people with actionable data
Herbie - Download numerical weather prediction datasets (HRRR, RAP, GFS, IFS, etc.) from NOMADS, NODD partners (Amazon, Google, Microsoft), ECMWF open data, and the University of Utah Pando Archive System.
CodeSearchNet - Datasets, tools, and benchmarks for representation learning of code.
file-system-stress-testing - A tool that can be used to stress test POSIX filesystems.
Video-Swin-Transformer - This is an official implementation for "Video Swin Transformers".