dblink

Distributed Bayesian Entity Resolution in Apache Spark (by cleanzr)

Dblink Alternatives

Similar projects and alternatives to dblink based on common topics and language

  • entity-embed

    2 dblink VS entity-embed

    PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

  • mmlspark

    Discontinued. Simple and Distributed Machine Learning [Moved to: https://github.com/microsoft/SynapseML]

  • sparkMeasure

    This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.

  • reclin

    Probabilistic Record Linkage in R

  • delight

    A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.

  • SynapseML

    Simple and Distributed Machine Learning

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better dblink alternative or higher similarity.

dblink reviews and mentions

Posts with mentions or reviews of dblink. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-03-10.
  • [D] Machine Learning and "Record Linkage"
    2 projects | /r/statistics | 10 Mar 2021
    Fellegi-Sunter is the baseline model in record linkage research. It is implemented in R in fastLink and RecordLinkage, but you will need training data. There are some other options, e.g. dblink, that use Bayesian methods and a latent variable setup, so you don’t need training data.

Stats

Basic dblink repo stats
1
54
0.0
almost 3 years ago

cleanzr/dblink is an open source project licensed under the GNU General Public License v3.0 or later, which is an OSI-approved license.

The primary programming language of dblink is Scala.

