oemer VS google-research

Compare oemer vs google-research and see how they differ.


End-to-end OMR (Optical Music Recognition) system based on deep learning and machine learning techniques. Transcribes sheet music even from skewed, phone-taken photos. (by BreezeWhite)
                 oemer                 google-research
Mentions         4                     60
Stars            118                   26,143
Growth           -                     3.5%
Activity         5.7                   9.9
Latest commit    about 2 months ago    6 days ago
Language         Jupyter Notebook      Jupyter Notebook
License          MIT License           Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.


Posts with mentions or reviews of oemer. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-12-17.


Posts with mentions or reviews of google-research. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-11-04.
  • Nearest-neighbor search in high-dimensional spaces
    3 projects | reddit.com/r/compsci | 4 Nov 2022
    Don't roll your own solution, use ScaNN (https://github.com/google-research/google-research/tree/master/scann) or Faiss (https://github.com/facebookresearch/faiss). I used the internal version of ScaNN while I was at Google, and found it incredibly well put-together. Can't speak to the open-source version, but it should be similarly good. These might be a bit overkill given your set sizes, but it'll be easier than building your own fix.
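For intuition, this is the computation ScaNN and Faiss approximate: exact k-nearest-neighbor search over a vector set. A brute-force NumPy baseline (the array shapes and data here are illustrative, not from either library):

```python
import numpy as np

def nearest_neighbors(queries, database, k=5):
    """Exact (brute-force) k-NN by squared Euclidean distance.

    This is the O(Q * N * d) computation that ScaNN and Faiss
    approximate at much larger scale via quantization and pruning.
    """
    d2 = ((queries[:, None, :] - database[None, :, :]) ** 2).sum(axis=-1)
    return np.argsort(d2, axis=1)[:, :k]

rng = np.random.default_rng(0)
db = rng.normal(size=(1000, 64)).astype(np.float32)
# Queries are lightly perturbed copies of the first three database rows,
# so each query's nearest neighbor should be its source row.
q = db[:3] + 0.01 * rng.normal(size=(3, 64)).astype(np.float32)
idx = nearest_neighbors(q, db, k=1)
print(idx.ravel().tolist())  # [0, 1, 2]
```

The pairwise-distance matrix alone is Q x N, which is exactly why brute force stops scaling and the comment recommends a purpose-built library instead.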
  • The Vector Database Index: Who, what, why now, & how
    3 projects | news.ycombinator.com | 20 Sep 2022

    We use ScaNN for a large scale/performant neural search. Otherwise this all feels bloated.

  • Apprendre Python, de zéro (Learning Python, from scratch)
    2 projects | reddit.com/r/france | 19 Sep 2022
  • [D] Most important AI papers this year so far, in my opinion + Proto-AGI speculation at the end
    10 projects | reddit.com/r/MachineLearning | 14 Aug 2022
    An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems (Google, 2022; Pathways; Jeff Dean). The network grows with the number of tasks and the amount of data. Paper: https://arxiv.org/abs/2205.12755 GitHub: https://github.com/google-research/google-research/tree/master/muNet
  • [R] LocoProp: Enhancing BackProp via Local Loss Optimization (Google Brain, 2022)
    2 projects | reddit.com/r/MachineLearning | 2 Aug 2022
    Github: https://github.com/google-research/google-research/tree/master/locoprop
  • Some ML questions
    2 projects | reddit.com/r/DiscoDiffusion | 12 Jul 2022
    * Research on faster denoising diffusion seems quite active (https://github.com/google-research/google-research/tree/master/diffusion_distillation, https://github.com/NVlabs/denoising-diffusion-gan). Any chance of seeing those models in DD for quicker rendering?
  • 80 million sentence embeddings
    4 projects | reddit.com/r/LanguageTechnology | 3 Jun 2022
    Nearest neighbour search isn't O(N²), and neither is building the index. If you had a machine with enough RAM, I would recommend ScaNN, as it works well and is incredibly fast. I'm not sure whether it works with an on-disk file format, though that's what you would want.
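The sub-quadratic claim can be illustrated with a toy locality-sensitive-hashing index built from random hyperplanes. This is a deliberate simplification for intuition only (ScaNN's actual method is anisotropic vector quantization, not LSH); all names and sizes below are made up:

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_planes = 32, 8
# Random hyperplanes: a vector's sign pattern against them is its bucket key.
planes = rng.normal(size=(n_planes, dim))

def bucket(v):
    return tuple(int(s) for s in (planes @ v > 0))

# Building the index is O(N): one hash per vector, no pairwise distances.
data = rng.normal(size=(5000, dim))
index = {}
for i, vec in enumerate(data):
    index.setdefault(bucket(vec), []).append(i)

# A query scans only its own bucket (roughly N / 2**n_planes points),
# not the full database.
query = data[42]
candidates = index[bucket(query)]
print(42 in candidates, len(candidates) < len(data))
```

Real libraries probe several nearby buckets (or cells) and re-rank the candidates with exact distances, which is how they keep recall high while staying far below the brute-force cost.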
  • I don't trust papers out of “Top Labs” anymore
    2 projects | news.ycombinator.com | 28 May 2022
    Jeff Dean responded to OP:

    (The paper mentioned by OP is https://arxiv.org/abs/2205.12755, and I am one of the two authors, along with Andrea Gesmundo, who did the bulk of the work).

    The goal of the work was not to get a high-quality cifar10 model. Rather, it was to explore a setting where one can dynamically introduce new tasks into a running system and successfully get a high-quality model for each new task that reuses representations from the existing model and introduces new parameters somewhat sparingly, while avoiding many of the issues that often plague multi-task systems, such as catastrophic forgetting or negative transfer.

    The experiments in the paper show that one can introduce tasks dynamically with a stream of 69 distinct tasks from several separate visual task benchmark suites and end up with a multi-task system that can jointly produce high-quality solutions for all of these tasks. The resulting model is sparsely activated for any given task, and the system introduces fewer and fewer new parameters for new tasks the more tasks it has already encountered (see Figure 2 in the paper). The multi-task system introduces just 1.4% new parameters for incremental tasks at the end of this stream of tasks, and each task activates on average 2.3% of the total parameters of the model. There is considerable sharing of representations across tasks, and the evolutionary process helps figure out when that makes sense and when new trainable parameters should be introduced for a new task.

    You can see a couple of videos of the dynamic introduction of tasks and how the system responds here:



    I would also contend that the cost calculations by OP are off and mischaracterize things, given that the experiments were to train a multi-task model that jointly solves 69 tasks, not to train a model for cifar10. From Table 7, the compute used was a mix of TPUv3 cores and TPUv4 cores, so you can't just sum up the number of core hours, since they have different prices. Unless you think there's some particular urgency to train the cifar10+68-other-tasks model right now, this sort of research can very easily be done using preemptible instances, which are $0.97/TPUv4 chip/hour and $0.60/TPUv3 chip/hour (not the "you'd have to use on-demand pricing of $3.22/hour" cited by OP). With these assumptions, the public Cloud cost of the computation described in Table 7 in the paper is more like $13,960 (using the preemptible prices for 12861 TPUv4 chip hours and 2474.5 TPUv3 chip hours), or about $202 / task.
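The dollar figures in the comment above can be reproduced directly from the chip-hour counts and preemptible prices it quotes:

```python
# Preemptible Cloud TPU prices quoted in the comment, in $/chip/hour.
tpu_v4_price, tpu_v3_price = 0.97, 0.60
# Chip-hours from Table 7 of the paper, as cited in the comment.
tpu_v4_hours, tpu_v3_hours = 12861, 2474.5
n_tasks = 69

total = tpu_v4_hours * tpu_v4_price + tpu_v3_hours * tpu_v3_price
print(round(total))            # 13960  -- total public-Cloud cost in $
print(round(total / n_tasks))  # 202    -- cost per task in $
```

This matches the $13,960 total and roughly $202/task stated above; the point is that summing raw core-hours at on-demand rates mixes two chip generations and the wrong price tier.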

    I think that having sparsely-activated models is important, and that being able to introduce new tasks dynamically into an existing system that can share representations (when appropriate) and avoid catastrophic forgetting is at least worth exploring. The system also has the nice property that new tasks can be automatically incorporated into the system without deciding how to do so (that's what the evolutionary search process does), which seems a useful property for a continual learning system. Others are of course free to disagree that any of this is interesting.

    Edit: I should also point out that the code for the paper has been open-sourced at: https://github.com/google-research/google-research/tree/mast...

    We will be releasing the checkpoint from the experiments described in the paper soon (just waiting on two people to flip approval bits, and process for this was started before the reddit post by OP).


    source: https://old.reddit.com/r/MachineLearning/comments/uyratt/d_i...

  • stereodemo: compare several recent stereo depth estimation methods in the wild
    6 projects | reddit.com/r/computervision | 23 May 2022
    Hitnet: "Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching" (CVPR 2021)

What are some alternatives?

When comparing oemer and google-research you can also consider the following projects:

milvus - Vector database for scalable similarity search and AI applications.

qdrant - Vector search engine for the next generation of AI applications

struct2depth - Models and examples built with TensorFlow

torchsort - Fast, differentiable sorting and ranking in PyTorch

fast-soft-sort - Fast Differentiable Sorting and Ranking

CLIP - Contrastive Language-Image Pretraining

faiss - A library for efficient similarity search and clustering of dense vectors.

rmi - A learned index structure

ml-agents - The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.

ML-KWS-for-MCU - Keyword spotting on Arm Cortex-M Microcontrollers

haystack - Haystack is an open source NLP framework that leverages pre-trained Transformer models. It enables developers to quickly implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications.