RAG Using Structured Data: Overview and Important Questions

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • Graphormer

    Graphormer is a general-purpose deep learning backbone for molecular modeling.

  • Ok, using ChatGPT and Bard (the irony lol) I learned a bit more about GNNs:

    GNNs are learned, statistical models: they can be trained to learn representations of graph-structured data and to handle complex relationships, while classical graph algorithms are specialized for specific graph-analysis tasks and operate on predefined rules/steps.

    * Why is PyG called "Geometric" and not "Topological"?

    Properties like connectivity, neighborhoods, and even geodesic distances can all be considered topological features of a graph. These features remain unchanged under continuous deformations like stretching or bending, which is the defining characteristic of topological equivalence. In this sense, "PyTorch Topological" might be a more accurate reflection of the library's focus on analyzing the intrinsic structure and connections within graphs.

    However, the term "geometric" still has some merit in the context of PyG. While most GNN operations rely on topological principles, some do incorporate notions of Euclidean geometry, such as:

    - Node embeddings: Many GNNs learn a low-dimensional vector for each node, which can be interpreted as a point in a vector space, so geometric operations like distances and angles apply (a minimal PyG sketch follows below).

    - Spectral GNNs: These models leverage the eigenvalues and eigenvectors of the graph Laplacian, which encodes information about the geometric structure and distances between nodes.

    - Manifold learning: Certain graphs can be seen as discrete samples of an underlying manifold embedded in a higher-dimensional space. Applying GNNs in this context involves learning geometric properties of the manifold itself.

    Therefore, although topology plays a primary role in understanding and analyzing graphs, geometry can still be relevant in certain contexts and GNN operations.
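
    To make the "node embeddings as points" idea concrete, here is a minimal, illustrative sketch using PyTorch Geometric; the toy graph, feature dimensions, and two-layer GCN encoder are made up for the example and not taken from any of the linked projects.

      # Minimal sketch: node embeddings with PyTorch Geometric (torch_geometric).
      # The toy graph and dimensions below are illustrative, not from the thread.
      import torch
      import torch.nn.functional as F
      from torch_geometric.data import Data
      from torch_geometric.nn import GCNConv

      # Tiny undirected path graph 0-1-2-3 (edges listed in both directions).
      edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                                 [1, 0, 2, 1, 3, 2]], dtype=torch.long)
      x = torch.randn(4, 8)  # 8 random input features per node
      data = Data(x=x, edge_index=edge_index)

      class Encoder(torch.nn.Module):
          def __init__(self, in_dim, hidden_dim, out_dim):
              super().__init__()
              self.conv1 = GCNConv(in_dim, hidden_dim)
              self.conv2 = GCNConv(hidden_dim, out_dim)

          def forward(self, x, edge_index):
              h = F.relu(self.conv1(x, edge_index))
              return self.conv2(h, edge_index)  # low-dimensional node embeddings

      model = Encoder(in_dim=8, hidden_dim=16, out_dim=2)
      emb = model(data.x, data.edge_index)  # shape [4, 2]: one point per node

      # "Geometric" operations on the embeddings, e.g. cosine similarity of nodes 0 and 3.
      sim = F.cosine_similarity(emb[0], emb[3], dim=0)
      print(emb.shape, sim.item())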

    * Real-world applications:

    - HuggingFace has a few models [0] around things like computational chemistry [1] or weather forecasting.

    - PyGOD [2] can be used for outlier detection (anomaly detection).

    - Apparently ULTRA [3] can "infer" (in the knowledge-graph sense) that Michael Jackson released some disco music :-p (see the paper).

    - RGCN [4] can be used for knowledge graph link prediction (recovery of missing facts, i.e. subject-predicate-object triples) and entity classification (recovery of missing entity attributes).

    - GreatX [5] tackles removing inherent noise, "Distribution Shift", and "Adversarial Attacks" (e.g., noise purposely introduced to hide a node's presence) from networks. Apparently this is a thing, and the field is called "Graph Reliability" or "Reliable Deep Graph Learning". The author even has a bunch of "awesome"-style lists of links! [6]

    - Finally, this repo has a nice explanation of how/why to run machine learning algorithms "outside of the DB" (a rough sketch of the link-prediction idea follows after the links below):

    "Pytorch Geometric (PyG) has a whole arsenal of neural network layers and techniques to approach machine learning on graphs (aka graph representation learning, graph machine learning, deep graph learning) and has been used in this repo [7] to learn link patterns, also known as link or edge predictions."

    --

    0: https://huggingface.co/models?pipeline_tag=graph-ml&sort=tre...

    1: https://github.com/Microsoft/Graphormer

    2: https://github.com/pygod-team/pygod

    3: https://github.com/DeepGraphLearning/ULTRA

    4: https://huggingface.co/riship-nv/RGCN

    5: https://github.com/EdisonLeeeee/GreatX

    6: https://edisonleeeee.github.io/projects.html

    7: https://github.com/Orbifold/pyg-link-prediction
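
    As a rough illustration of the link-prediction idea behind [4] and [7], the usual recipe is: encode nodes with a GNN, then score a candidate edge by combining the embeddings of its two endpoints. The sketch below uses a plain GCN encoder and a dot-product decoder with negative sampling; the graph, dimensions, and training loop are illustrative and not taken from either repository.

      # Sketch: GNN encoder + dot-product decoder for link prediction (illustrative only).
      import torch
      import torch.nn.functional as F
      from torch_geometric.nn import GCNConv
      from torch_geometric.utils import negative_sampling

      class LinkPredictor(torch.nn.Module):
          def __init__(self, in_dim, hidden_dim):
              super().__init__()
              self.conv1 = GCNConv(in_dim, hidden_dim)
              self.conv2 = GCNConv(hidden_dim, hidden_dim)

          def encode(self, x, edge_index):
              return self.conv2(F.relu(self.conv1(x, edge_index)), edge_index)

          def decode(self, z, edge_pairs):
              # Score each candidate edge (src, dst) by the dot product of its endpoints.
              src, dst = edge_pairs
              return (z[src] * z[dst]).sum(dim=-1)

      # Toy data: 6 nodes with random features and a few known edges.
      x = torch.randn(6, 16)
      edge_index = torch.tensor([[0, 1, 2, 3, 4],
                                 [1, 2, 3, 4, 5]])

      model = LinkPredictor(16, 32)
      opt = torch.optim.Adam(model.parameters(), lr=0.01)

      for _ in range(100):
          opt.zero_grad()
          z = model.encode(x, edge_index)
          neg_edge_index = negative_sampling(edge_index, num_nodes=6,
                                             num_neg_samples=edge_index.size(1))
          pos_logits = model.decode(z, edge_index)      # existing edges -> label 1
          neg_logits = model.decode(z, neg_edge_index)  # sampled non-edges -> label 0
          logits = torch.cat([pos_logits, neg_logits])
          labels = torch.cat([torch.ones_like(pos_logits), torch.zeros_like(neg_logits)])
          loss = F.binary_cross_entropy_with_logits(logits, labels)
          loss.backward()
          opt.step()

      # Predicted probability that the missing edge (0, 5) exists.
      z = model.encode(x, edge_index)
      prob = torch.sigmoid(model.decode(z, torch.tensor([[0], [5]]))).item()
      print(prob)

    Knowledge-graph models like RGCN differ mainly in the encoder (relation-specific weights) and the decoder (e.g. DistMult scoring of subject-predicate-object triples), but the overall encode-then-score pattern is the same.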

  • pygod

    A Python Library for Graph Outlier Detection (Anomaly Detection)
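
    PyGOD's detectors appear to follow a scikit-learn-style fit/predict interface over PyG Data objects. The sketch below is a hedged example: the DOMINANT detector, its parameters, and the exact call signatures are assumptions based on the project README, so check the current PyGOD docs before relying on them.

      # Hedged sketch of graph anomaly detection with PyGOD; detector name and
      # signatures are assumed from the README -- verify against the current docs.
      import torch
      from torch_geometric.data import Data
      from pygod.detector import DOMINANT  # assumed import path (PyGOD >= 1.0)

      # Toy attributed graph: 5 nodes with 8-dimensional features.
      x = torch.randn(5, 8)
      edge_index = torch.tensor([[0, 1, 2, 3, 0],
                                 [1, 2, 3, 4, 4]])
      data = Data(x=x, edge_index=edge_index)

      detector = DOMINANT(epoch=20)      # reconstruction-based detector (assumed params)
      detector.fit(data)                 # unsupervised: no labels needed
      labels = detector.predict(data)    # 1 = flagged as outlier, 0 = inlier (assumed)
      scores = detector.decision_score_  # raw outlier scores from fit (assumed attribute)
      print(labels, scores)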

  • ULTRA

    A foundation model for knowledge graph reasoning (by DeepGraphLearning)

  • GreatX

    A graph reliability toolbox based on PyTorch and PyTorch Geometric (PyG).
