ffhq-dataset

Flickr-Faces-HQ Dataset (FFHQ) (by NVlabs)

ffhq-dataset Alternatives

Similar projects and alternatives to ffhq-dataset

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a better or more similar ffhq-dataset alternative.

ffhq-dataset reviews and mentions

Posts with mentions or reviews of ffhq-dataset. We have used some of these posts to build our list of alternatives and similar projects. The most recent was on 2023-08-30.
  • [SD 1.5] Swizz8-REAL is now available.
    1 project | /r/StableDiffusion | 30 Aug 2023
  • [R] How do paper authors deal with takedown requests?
    1 project | /r/MachineLearning | 26 Jul 2023
    Datasets like FFHQ consist of face images crawled from the Internet. While those images are published under CC licenses, the authors usually have not obtained consent from each person depicted. I guess that's why they accept takedown requests: people can ask to have their faces removed from the dataset.
  • Collecting dataset
    1 project | /r/StableDiffusion | 8 Jun 2023
  • Artificial faces are more likely to be perceived as real faces than real faces
    1 project | /r/science | 2 Jan 2023
    The real ones were taken from this dataset.
  • This sub is misrepresenting “Anti-AI” artists
    1 project | /r/StableDiffusion | 28 Dec 2022
    NVIDIA's FFHQ says "Only images under permissive licenses were collected." https://github.com/NVlabs/ffhq-dataset
  • Open image set of a non-celebrity that can be used for demoing Stable Diffusion tuning?
    1 project | /r/StableDiffusion | 21 Dec 2022
  • [D] Does anyone have a copy of the FFHQ 1024 scale images (90GB) ? and or a copy of the FFHQ Wild images (900GB) ?
    1 project | /r/MachineLearning | 13 Jun 2022
    The FFHQ dataset https://github.com/NVlabs/ffhq-dataset is a high quality, high resolution, and extremely well curated dataset that is used in many recent SOTA GAN papers and also has applications in many other areas.
  • [N] [P] Access 100+ image, video & audio datasets in seconds with one line of code & stream them while training ML models with Activeloop Hub (more at docs.activeloop.ai, description & links in the comments below)
    4 projects | /r/MachineLearning | 17 Apr 2022
  • [P] Training StyleGAN2 in Jax (FFHQ and Anime Faces)
    2 projects | /r/MachineLearning | 12 Sep 2021
    I trained on FFHQ and Danbooru2019 Portraits with resolution 512x512.
  • Facebook apology as AI labels black men 'primates'
    1 project | news.ycombinator.com | 6 Sep 2021
    > Which makes it an inexcusable mistake to make in 2021 - how are you not testing for this?

    They probably are, just not well enough. These things can be surprisingly hard to detect: post hoc the bias is easy to see, but it isn't so easy before you deploy the model.

    If we take racial connotations out of it then we could say that the algorithm is doing quite well because it got the larger hierarchical class correct, primate. The algorithm doesn't know the racial connotations, it just knows the data and what metric you were seeking. BUT considering the racial and historical context this is NOT an acceptable answer (not even close).

    I've made a few comments in the past about bias and how many machine learning people deploy models without understanding them. This is what happens when you don't try to understand statistics, particularly long-tail distributions. gumboshoes mentioned that Google simply removed the primate labels. That's a solution, but honestly not a great one (technically speaking), though it is far easier than actually fixing the problem (I'd wager that putting a strong loss penalty on misclassifying a Black person as an ape is not enough). If you follow the links from jcims, you might notice that a lot of those faces are white. Would it be all that surprising if Google trained on the FFHQ (Flickr) dataset?[0] A dataset known to have a strong bias towards white faces. We actually saw this when PULSE[1] turned Obama white (do note that if you didn't know the left picture was of a Black person, and who he was, the output is a decent (key word) representation). So it is pretty likely that _some_ problems could be fixed simply by better datasets (this was part of the LeCun controversy last year).
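    As a concrete sketch of the "strong loss penalty" idea mentioned above (the class layout and weights here are hypothetical illustrations, not anything Google actually uses), a per-class weight on cross-entropy looks like this:

```python
import math

def weighted_cross_entropy(probs, true_class, class_weights):
    """Cross-entropy loss with a per-class penalty weight.

    A larger weight on a class makes misclassifying examples of
    that class more expensive during training.
    """
    return -class_weights[true_class] * math.log(probs[true_class])

# Hypothetical 3-class output: [person, primate, other].
probs = [0.10, 0.85, 0.05]              # model confidently predicts "primate"
uniform = {0: 1.0, 1: 1.0, 2: 1.0}
penalized = {0: 10.0, 1: 1.0, 2: 1.0}   # 10x penalty for missing "person"

base_loss = weighted_cross_entropy(probs, 0, uniform)
heavy_loss = weighted_cross_entropy(probs, 0, penalized)
print(round(base_loss, 3), round(heavy_loss, 3))  # 2.303 23.026
```

    The commenter's point stands: reweighting only changes how hard the optimizer is pushed on errors it already sees; it cannot invent training signal for faces the dataset barely contains.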

    Datasets aren't the only problem here, though. ML can algorithmically highlight bias in datasets. Research papers are often metric hacking, i.e. going for the highest accuracy they can get[2]. This leaderboardism undermines real-world usage, and there is often a disconnect between researchers and people in production. With large, complex datasets we tend to chase leaderboard scores until accuracy is sufficient before anyone starts focusing on bias (or, more often, we sadly just move to a more complex dataset and start the whole process over again). There are not many people working on the bias aspects of ML systems (both data bias and algorithmic bias), but as more people put these tools into production we're running into walls. Many of these people are not thinking about how their models were trained or the bias they contain. They go to the leaderboard, pick the best pre-trained model, and hit go, maybe fine-tuning on their own dataset. Fine-tuning doesn't eliminate the bias from pre-training (it can actually amplify it!). ~~Money~~Scale is NOT all you need, as GAMF often tries to sell (and neither is augmentation, as others try to sell).
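    A low-tech first step toward the data-bias audits discussed above is simply tabulating attribute frequencies before training. The group labels below are invented for illustration; a real audit would use annotated metadata (e.g. skin-tone or age labels), which FFHQ does not ship with.

```python
from collections import Counter

# Hypothetical per-image group annotations for a tiny dataset.
labels = ["A", "A", "A", "A", "B", "A", "A", "B", "A", "C"]

counts = Counter(labels)
total = sum(counts.values())
shares = {group: count / total for group, count in counts.items()}

# Flag any group making up less than 20% of the data as a
# candidate long-tail class that needs more examples or scrutiny.
underrepresented = [g for g, s in shares.items() if s < 0.20]
print(shares)            # {'A': 0.7, 'B': 0.2, 'C': 0.1}
print(underrepresented)  # ['C']
```

    This catches only representation imbalance, not algorithmic bias, but it is exactly the kind of check that is skipped when teams grab the best pre-trained model off a leaderboard and hit go.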

    These problems won't be solved without significant research into both data and algorithmic bias. They won't be solved until those in production also understand these principles and robust testing methods are created to find these biases. Until people understand that a good ImageNet (or even JFT-300M) score doesn't mean your model will generalize well to real world data (though there is a correlation).

    So with that in mind, I'll make a prediction: rather than seeing fewer of these mistakes, we're going to see more (I'd actually argue that a lot of this is happening right now that you just don't see). The AI hype isn't dying down, and more people are entering the field who don't want to learn the math. "Throw a neural net at it" is not, and never will be, the answer. Anyone saying that is selling snake oil.

    I don't want people to think I'm anti-ML. In fact I'm a ML researcher. But there's a hard reality we need to face in our field. We've made a lot of progress in the last decade that is very exciting, but we've got a long way to go as well. We can't just have everyone focusing on leaderboard scores and expect to solve our problems.

    [0] https://github.com/NVlabs/ffhq-dataset

    [1] https://twitter.com/Chicken3gg/status/1274314622447820801

    [2] https://twitter.com/emilymbender/status/1434874728682901507


Stats

Basic ffhq-dataset repo stats
Mentions: 13
Stars: 3,447
Activity: 0.0
Last commit: over 1 year ago

NVlabs/ffhq-dataset is an open source project licensed under GNU General Public License v3.0 or later, which is an OSI-approved license.

The primary programming language of ffhq-dataset is Python.
