fma
FMA: A Dataset For Music Analysis (by mdeff)
TheVault
[EMNLP 2023] The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation (by FSoft-AI4Code)
Metric | fma | TheVault |
---|---|---|
Mentions | 1 | 4 |
Stars | 2,108 | 79 |
Growth | - | - |
Activity | 0.0 | 7.9 |
Last commit | over 1 year ago | 3 months ago |
Language | Jupyter Notebook | Jupyter Notebook |
License | MIT License | MIT License |
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
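The exact formula behind the activity number is not given here. As a rough illustration only, the sketch below assumes that each commit is weighted by an exponential decay on its age and that the resulting raw score is converted to a 0-10 value by percentile rank across all tracked projects. The function names and the 30-day half-life are hypothetical choices for this example, not the site's actual method.

```python
from bisect import bisect_left


def recency_weighted_commits(commit_ages_days, half_life_days=30.0):
    """Sum commit weights so that recent commits count more than older ones.

    Assumption: a commit loses half of its weight every `half_life_days`.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_scores(raw_scores):
    """Map each project's raw score to a 0-10 value by percentile rank.

    A project scoring 9.0 has a raw score higher than 90% of the tracked
    projects, i.e. it sits in the top 10%.
    """
    ordered = sorted(raw_scores.values())
    total = len(ordered)
    return {
        name: round(10.0 * bisect_left(ordered, score) / total, 1)
        for name, score in raw_scores.items()
    }


# Hypothetical commit ages (in days) for three made-up projects.
raw = {
    "stale-project": recency_weighted_commits([400, 420, 500]),
    "steady-project": recency_weighted_commits([5, 40, 90]),
    "busy-project": recency_weighted_commits([0, 1, 2, 3, 7]),
}
print(activity_scores(raw))
```

Under these assumptions, a project whose last commit is more than a year old ends up with a raw score close to zero and therefore ranks at the bottom, which is consistent with a low activity value next to an old last-commit date.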
fma
Posts with mentions or reviews of fma.
We have used some of these posts to build our list of alternatives
and similar projects.
- Analyzing music to determine subgenre?
This dataset seems worth looking into: https://github.com/mdeff/fma. I think you'll have a hard time identifying subgenres, since even people don't know what subgenre a song belongs to. It's a very subjective classification compared to distinguishing between main genres, e.g. rock, rap, and country. Also, from my work with the Spotify API, there are a lot of seemingly synonymous subgenres, which will make this task even more tedious (what is the difference between "pop dance" and "dance pop"?).
TheVault
Posts with mentions or reviews of TheVault.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-06-02.
- (2/2) May 2023
A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation (https://github.com/FSoft-AI4Code/TheVault)
- List of code generation datasets (open source)
TheVault
- [P] Fine-tuning LLaMA on TheVault by AI4Code
I essentially want to fine-tune LLaMA on a dataset that's geared towards code generation. After a bit of research I found TheVault which seems good enough for the job (let me know if there are better datasets tho).
- [R] Introducing The Vault: A new multilingual dataset for advancing code understanding and generation.
GitHub page: https://github.com/FSoft-AI4Code/TheVault
What are some alternatives?
When comparing fma and TheVault you can also consider the following projects:
mac-miller-lyrics-dataset - Dataset with lyrics from Mac Miller
DB-GPT - AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
SKAB - Skoltech Anomaly Benchmark. Time-series data for evaluating Anomaly Detection algorithms.
GirlfriendGPT - Girlfriend GPT is a Python project to build your own AI girlfriend using ChatGPT4.0