Transforming free-form geospatial directions into addresses - SOTA?

This page summarizes the projects mentioned and recommended in the original post on /r/LanguageTechnology.

  1. libpostal

    A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.

    I know of https://github.com/openvenues/libpostal which handles typos and omissions in addresses, but I am looking into a more fuzzy description of a location.

  3. duckling

    Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.

    To understand what relative distance and direction are indicated from the reference point, I'd look into something like Facebook & Wit.AI's Duckling, plus a custom classifier to identify whether the location is on the reference point ("corner of") or some distance from it ("200 meters southwest"). If you can parse out a distance and direction, then plotting the point is just logic.
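The "it's all logic" step can be sketched roughly as follows. This is a minimal illustration, not Duckling's API: a plain regular expression stands in for the distance/direction parser, and the names `parse_offset` and `destination_point`, the eight-point compass table, the unit table, and the spherical-Earth radius are all assumptions made for the example.

```python
import math
import re

# Assumed lookup tables for this sketch: an 8-point compass (degrees clockwise
# from north) and a handful of metric units converted to meters.
BEARINGS = {
    "north": 0.0, "northeast": 45.0, "east": 90.0, "southeast": 135.0,
    "south": 180.0, "southwest": 225.0, "west": 270.0, "northwest": 315.0,
}
UNIT_TO_METERS = {
    "m": 1.0, "meter": 1.0, "meters": 1.0, "metre": 1.0, "metres": 1.0,
    "km": 1000.0, "kilometer": 1000.0, "kilometers": 1000.0,
}
EARTH_RADIUS_M = 6371000.0  # mean radius; the Earth is treated as a sphere

# "<number> <unit> <direction>", e.g. "200 meters southwest"
OFFSET_RE = re.compile(r"(\d+(?:\.\d+)?)\s*([a-z]+)\s+([a-z]+)", re.IGNORECASE)


def parse_offset(text):
    """Return (distance_in_meters, bearing_in_degrees), or None if not found."""
    for value, unit, direction in OFFSET_RE.findall(text):
        unit, direction = unit.lower(), direction.lower()
        if unit in UNIT_TO_METERS and direction in BEARINGS:
            return float(value) * UNIT_TO_METERS[unit], BEARINGS[direction]
    return None


def destination_point(lat, lon, distance_m, bearing_deg):
    """Move distance_m from (lat, lon) along bearing_deg on a sphere."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    # Normalize longitude to [-180, 180)
    return math.degrees(phi2), (math.degrees(lam2) + 540.0) % 360.0 - 180.0


if __name__ == "__main__":
    offset = parse_offset("200 meters southwest of the station")
    if offset is not None:
        # The reference coordinates here are made up for illustration.
        print(destination_point(48.8584, 2.2945, *offset))
```

The reference point itself still has to come from a geocoder or from your own street data; this sketch only covers the offset arithmetic once a distance and direction have been extracted.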

  4. spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    If you've got a specific area you're looking at, and already have street data, you could:

    1. Follow the ArcGIS blog's directions, creating intersection features.
    2. Train a classifier (or a specific NER entity type; spaCy would be a good package for that) on the types of cross-street references you're finding in your text. You can see some of the relevant tokens in the examples you provided: "corner of", "along", and I'd imagine "intersection of", etc. Even simple string lookups could help you bootstrap the training data.
    3. Use some sort of embedding similarity to compare the hit terms to potential cross-streets.
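As a rough sketch of the "simple string lookups" idea, the snippet below pulls candidate cross-street pairs out of text using the cue phrases "corner of" and "intersection of", then matches each candidate against a list of known street names. Two things here are assumptions rather than anything from the thread: the helper names `find_cross_streets` and `match_street` are hypothetical, and `difflib` string similarity from the standard library is swapped in for the embedding similarity suggested above, purely to keep the example dependency-free.

```python
import re
from difflib import get_close_matches

# Cue phrases taken from the examples in the thread; "along" is left out
# because it introduces a single street rather than a pair.
CUE_RE = re.compile(
    r"\b(?:corner of|intersection of)\s+(?P<a>.+?)\s+(?:and|&)\s+(?P<b>.+?)(?=[,.;]|$)",
    re.IGNORECASE,
)


def find_cross_streets(text):
    """Return a list of (street_a, street_b) candidates found via cue phrases."""
    return [(m.group("a").strip(), m.group("b").strip())
            for m in CUE_RE.finditer(text)]


def match_street(candidate, known_streets, cutoff=0.6):
    """Return the closest known street name, or None if nothing is similar enough.

    Uses character-level similarity (difflib), not embeddings.
    """
    lowered = {name.lower(): name for name in known_streets}
    hits = get_close_matches(candidate.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


if __name__ == "__main__":
    # Made-up street list and sentence, for illustration only.
    streets = ["Main Street", "Fifth Avenue", "Elm Road"]
    sentence = "Crash at the corner of Main Stret and Fifth Avenue, near the bank."
    for a, b in find_cross_streets(sentence):
        print(match_street(a, streets), "x", match_street(b, streets))
```

Matches found this way could serve as weak labels for training a classifier or NER model, with the matched pair then looked up against the intersection features built from the street data.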

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
