dblink

Distributed Bayesian Entity Resolution in Apache Spark (by cleanzr)

Dblink Alternatives

Similar projects and alternatives to dblink

  1. mmlspark

    2 dblink VS mmlspark

    Discontinued. Simple and Distributed Machine Learning [Moved to: https://github.com/microsoft/SynapseML]

  2. CodeRabbit

    CodeRabbit: AI Code Reviews for Developers. Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.

  3. delight

    2 dblink VS delight

    Discontinued. A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.

  4. entity-embed

    2 dblink VS entity-embed

    PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

  5. reclin

    Probabilistic Record Linkage in R

  6. sparkMeasure

    This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.

  7. SynapseML

    18 dblink VS SynapseML

    Simple and Distributed Machine Learning

  8. InfluxDB

    InfluxDB high-performance time series database. Collect, organize, and act on massive volumes of high-resolution data to power real-time intelligent systems.

  9. Spark Tools

    Executable Apache Spark Tools: Format Converter & SQL Processor

NOTE: The number shown next to each project counts its mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better dblink alternative or higher similarity.

dblink reviews and mentions

Posts with mentions or reviews of dblink. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-03-10.
  • [D] Machine Learning and "Record Linkage"
    2 projects | /r/statistics | 10 Mar 2021
    Fellegi-Sunter is the baseline model in record linkage research. It is implemented in R in fastLink and RecordLinkage, but you will need training data. There are some other options, e.g. dblink, that use Bayesian methods and a latent variable setup so you don’t need training data.

Stats

Basic dblink repo stats
Mentions: 1
Stars: 57
Activity: 0.0
Last commit: almost 4 years ago

cleanzr/dblink is an open source project licensed under the GNU General Public License v3.0 or later, which is an OSI-approved license.

The primary programming language of dblink is Scala.

