pygraphistry vs NetworkX

pygraphistry

PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer (by graphistry)

NetworkX

Network Analysis in Python (by networkx)

Science and Data analysis Python complex-networks graph-theory graph-algorithms graph-analysis graph-generation graph-visualization spec-0 spec-1 spec-4

networkx.org

pygraphistry	Project	NetworkX
9	Mentions	61
2,055	Stars	14,178
2.3%	Growth	1.6%
9.2	Activity	9.6
19 days ago	Latest Commit	4 days ago
Python	Language	Python
BSD 3-clause "New" or "Revised" License	License	BSD 3-clause "New" or "Revised" License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

pygraphistry

Posts with mentions or reviews of pygraphistry. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-05.

Graph Data Fits in Memory
1 project | news.ycombinator.com | 15 Apr 2024

Extra fun: We find most enterprise/gov graph analytics work only requires 1-2 attributes to go along with the graph index, and those attributes often are already numeric (time, $, ...) or can be dictionary-encoded as discussed here (categorical, ID, ...), so even 'tough' billion-scale graphs are fine on 1 GPU.
It's early, but that's been the basic thinking behind our new GFQL system: slice into the columns you want, and then do all the in-GPU traversals you want. In our V1, we keep things dataframe-native, including the in-GPU data representation, and are already working on the first extensions to support switching to more graph-native indexing for steps as needed.
Ex: https://github.com/graphistry/pygraphistry/blob/master/demos...
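The dictionary encoding mentioned in this post can be sketched in plain Python: each distinct categorical value is replaced by a small integer code, so the attribute column becomes numeric. This is only an illustration of the general idea, not how pygraphistry or GFQL implements it; the function name `dictionary_encode` and the sample column are hypothetical.

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer code.

    Returns (codes, dictionary), where dictionary[code] recovers the value.
    """
    code_of = {}     # value -> integer code, assigned in first-seen order
    dictionary = []  # integer code -> value
    codes = []
    for value in values:
        if value not in code_of:
            code_of[value] = len(dictionary)
            dictionary.append(value)
        codes.append(code_of[value])
    return codes, dictionary


# Hypothetical categorical edge attribute.
countries = ["US", "DE", "US", "FR", "DE", "US"]
codes, dictionary = dictionary_encode(countries)

# Decoding through the dictionary restores the original column.
decoded = [dictionary[c] for c in codes]
```

The integer column is what would travel with the graph index; the dictionary is only needed when values have to be shown to a person again.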
The "missing" graph datatype already exists. It was invented in the '70s
6 projects | news.ycombinator.com | 5 Mar 2024

If you enjoy this kind of thinking, we recently released GFQL for dataframe-native graph querying & compute.
Imagine Neo4j Cypher, except no need for a database -- just import it -- and it automatically vectorizes for significantly faster CPU+GPU performance. This is fundamentally similar to the kinds of implementations a datalog approach enables. (And indeed it was one of the alternative interfaces we were considering!)
We've run it on 100M+ edge graphs on some of the cheapest GPUs you can get, and are getting ready for the next rev with aggregate compute: https://github.com/graphistry/pygraphistry/blob/master/demos...
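To make "dataframe-native" traversal concrete, here is a rough sketch that keeps the edge list as two parallel columns and expresses a hop as a filter over those columns. It is plain Python, not the GFQL API, and the names `hop`, `k_hops`, and the sample edges are hypothetical.

```python
# Edge table stored column-wise, the way a dataframe would store it.
edges = {
    "src": ["a", "a", "b", "c", "d"],
    "dst": ["b", "c", "d", "d", "e"],
}


def hop(edges, frontier):
    """Return the nodes reachable in one forward hop from `frontier`."""
    return {
        dst
        for src, dst in zip(edges["src"], edges["dst"])
        if src in frontier
    }


def k_hops(edges, start, k):
    """Collect every node seen within k forward hops of the start nodes."""
    seen = set(start)
    frontier = set(start)
    for _ in range(k):
        # Only expand nodes that have not been visited yet.
        frontier = hop(edges, frontier) - seen
        seen |= frontier
    return seen


two_hop = k_hops(edges, {"a"}, 2)
```

Each hop scans whole columns rather than following per-node pointers, which is the shape of work that vectorized CPU or GPU dataframe engines tend to handle well.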
Displaying Content as a Graph
1 project | news.ycombinator.com | 1 Jan 2024

This is a great article and fun to see fundamental concepts get (re)discovered here!
A perspective that we can generalize from the hierarchy discussion is to think about tool-for-the-job: what is the 'content' job, and what 'jobs' will graphs do? We think about this a lot as we work on problems like how to make it easy to explore 100,000X+ more relationships on screen than they're showing: https://github.com/graphistry/pygraphistry .
First, what do graph visualizations do?
- They let us see the relationships in data. The article discusses hierarchy. But there is also progression, root cause, scope, and basically any correlation/causation relationship ML/AI figures out.
- They let us directly manipulate the nodes & edges, such as for drilling down, navigating, reclustering, etc.
- A useful 'aha' is thinking of modern information visualization as trying to optimize some sort of time-to-insight through a sequence of visual interactions. So each view must be information dense for visually revealing certain insights, and make it easy to get to the next set of visual Q&A.
- Ex: When the entities are the interesting thing wrt questions, being able to drill down into individual nodes/edges into great dedicated views becomes important, so graphs need to be multimodal. And if the relationship aspect is unimportant... then graph view hurts more than it helps.
- From an optimization perspective, it now makes sense to specialize for specific domains. Maybe what is needed is more of a small diagram, and not actually investigating a lot of relationships. Or a graph of subway stops, which has additional visual considerations. For a website, a sitemap navigation vs clickstream product analytics view would likewise need
A good analogy is a map. Sometimes exploring Google Maps is great, and you drill into a business inspector sidebar or down to a street view. But other times, it's better to have the map embedded into Yelp.com restaurant entry when you just need a quick view of mapping information as part of some broader context. Or you don't care about that map at all and can skip it.
Given all that.. it's interesting to revisit asking... what is the 'content' job to be solved? What kinds of content lean towards graph, and which don't?
NeurIPS 2023 Posters Cluster Visualization
1 project | news.ycombinator.com | 9 Dec 2023

We regularly use pygraphistry to generate or import and then visualize 100k+ entity embeddings, and it works fine on mobile: https://github.com/graphistry/pygraphistry
More fun: in UMAP mode, by default, it also shows the top-n similarity edges between each entity, so you get an interactive graph you can recluster, vs just the 2D scatter plot.
NetworkX – Network Analysis in Python
8 projects | news.ycombinator.com | 8 Dec 2023

We make it pretty easy to go from networkx or any other pydata (DF, csv, parquet, ...) to interactive GPU viz with all sorts of analytics built in: https://github.com/graphistry/pygraphistry#explore-any-data-...
How to pass any first-round interview (even in a terrible talent market)
1 project | news.ycombinator.com | 5 Jul 2023

I appreciate the good faith attempt:
https://github.com/graphistry/pygraphistry
And yes, we currently get used by data scientists and devs on problems like supply chain analysis, misinformation, cybersecurity, human trafficking. Seeing 100x+ more data than d3 and having a full env there makes their investigations easier. Our original tech helped lead to what is now Apache Arrow (we wrote the JS tier) and Nvidia RAPIDS (we wrote the precursor in js/opencl, and worked with Nvidia to restart for pydata), and are now focusing on the Nvidia Morpheus & graph AI sides for end-to-end GPU pipelines with our bigger customers (cyber, ...). To make this kind of tech easier for analysts, who are traditionally stuck with Splunk/Kibana/etc style UIs for investigations, we have been launching louie.ai with various customers.
Hopefully now it makes sense why we don't go far with candidates who can't have conversations on these things.
Handbook of Graph Drawing and Visualization
4 projects | news.ycombinator.com | 30 Dec 2021

This! We do it all the time in fraud, genomics, social media, security, etc
We do one more thing: connect the nearest neighbors to make an interactive similarity graph. Takes just a few lines in total: https://github.com/graphistry/pygraphistry/blob/master/demos...
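The "connect the nearest neighbors" step can be sketched as follows. The post does not say which method is used, so this stand-in uses brute-force cosine similarity in plain Python; the function names and the toy embeddings are hypothetical, and a real pipeline would use an approximate nearest-neighbor index at scale.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_edges(embeddings, k):
    """Link each entity to its k most similar neighbors (brute force, O(n^2))."""
    edges = []
    for name, vec in embeddings.items():
        scored = [
            (cosine_similarity(vec, other_vec), other)
            for other, other_vec in embeddings.items()
            if other != name
        ]
        scored.sort(reverse=True)  # highest similarity first
        for score, other in scored[:k]:
            edges.append((name, other, score))
    return edges


# Toy 2-D embeddings; real ones would have many more dimensions.
embeddings = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.9, 0.1],
    "doc_c": [0.0, 1.0],
    "doc_d": [0.1, 0.9],
}
edges = knn_edges(embeddings, k=1)
```

The resulting `(source, target, score)` triples form an ordinary edge list, so any graph tool can lay out and recluster the similarity graph.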
Don't Bring a Tree to a Mesh Fight
1 project | news.ycombinator.com | 23 Nov 2021

It's super useful in practice!
In the table -> hypergraph transform @ https://github.com/graphistry/pygraphistry , we do `hypergraph(multicolumn_table, direct=True | False)['graph'].plot()` , which renders a hypergraph as a regular graph and lets you pick how each hyperedge is represented. Consider exploring some logs of customer activity or security events:
A hyperedge becomes either:
- a node of a bipartite graph. Ex: each log event becomes a node connecting the various entity nodes it mentions (IPs, accounts, countries, ...)
- .. or a bunch of pairwise entity<>entity edges. Ex: connect each IP<>account<>country directly, and label each edge with the hyperedge it came from.
In both cases, you can now directly leverage a lot of traditional graph thinking, and in our case, GPU acceleration.
Other systems might render hyperedges as, say, circles encompassing their nodes, but that's trickier at even small/medium scales.
I increasingly just directly equate 'logs' with 'hypergraphs' and skip the relational step :)
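The two renderings described in this post can be sketched in plain Python. This is not pygraphistry's `hypergraph` implementation; `to_bipartite`, `to_pairwise`, and the sample log events are hypothetical names chosen for illustration.

```python
from itertools import combinations

# Hypothetical log events: each row is one hyperedge over several entities.
events = [
    {"ip": "10.0.0.1", "account": "alice", "country": "US"},
    {"ip": "10.0.0.2", "account": "alice", "country": "DE"},
]


def to_bipartite(events):
    """Each event becomes a node linked to every entity it mentions."""
    edges = []
    for i, event in enumerate(events):
        event_node = f"event:{i}"
        for column, value in event.items():
            edges.append((event_node, f"{column}:{value}"))
    return edges


def to_pairwise(events):
    """Each event becomes direct entity<>entity edges labeled with its id."""
    edges = []
    for i, event in enumerate(events):
        entities = [f"{column}:{value}" for column, value in event.items()]
        for a, b in combinations(entities, 2):
            edges.append((a, b, f"event:{i}"))
    return edges


bipartite_edges = to_bipartite(events)
pairwise_edges = to_pairwise(events)
```

Prefixing each value with its column name keeps an IP and an account with the same string from collapsing into one node. In both outputs the shared `account:alice` node is what ties the two events together.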
An Engineer's View of Venture Capitalists (2011)
2 projects | news.ycombinator.com | 11 Nov 2021

NetworkX

Posts with mentions or reviews of NetworkX. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-04.

Routes to LANL from 186 sites on the Internet
1 project | news.ycombinator.com | 4 Mar 2024

Built from this data... https://github.com/networkx/networkx/blob/main/examples/grap...
The Hunt for the Missing Data Type
10 projects | news.ycombinator.com | 4 Mar 2024

I think one of the elements that the author is missing here is that graphs are sparse matrices, and thus can be expressed with Linear Algebra. They mention adjacency matrices, but not sparse adjacency matrices, or incidence matrices (which can express multi and hypergraphs).
Linear Algebra is how almost all academic graph theory is expressed, and large chunks of machine learning and AI research are expressed in this language as well. There was a recent thread here about PageRank and how it's really an eigenvector problem over a matrix, and the reality is that all graphs are matrices; they're typically sparse ones.
One question you might ask is, why would I do this? Why not just write my graph algorithms as a function that traverses nodes and edges? And one of the big answers is, parallelism. How are you going to do it? Fork a thread at each edge? Use a thread pool? What if you want to do it on CUDA too? Now you have many problems. How do you know how to efficiently schedule work? By treating graph traversal as a matrix multiplication, you just say Ax = b, and let the library figure it out on the specific hardware you want to target.
Here, for example, is a recent question on the NetworkX repo about how to find the boundary of a triangular mesh; it's one single line of GraphBLAS if you consider the graph as a matrix:
https://github.com/networkx/networkx/discussions/7326
This brings a very powerful language to the table, Linear Algebra. A language spoken by every scientist, engineer, mathematician and researcher on the planet. By treating graphs like matrices, graph algorithms become expressible as mathematical formulas. For example, neural networks are graphs of adjacent layers, and the operation used to traverse from layer to layer is matrix multiplication. This generalizes to all matrices.
There is a lot of very new and powerful research and development going on around sparse graphs with linear algebra in the GraphBLAS API standard, and its best reference implementation, SuiteSparse:GraphBLAS:
https://github.com/DrTimothyAldenDavis/GraphBLAS
SuiteSparse provides a highly optimized, parallel and CPU/GPU supported sparse Matrix Multiplication. This is relevant because traversing graph edges IS matrix multiplication when you realize that graphs are matrices.
Recently NetworkX has grown the ability to have different "graph engine" backends, and one of the first to be developed uses the python-graphblas library that binds to SuiteSparse. I'm not a direct contributor to that particular work, but as I understand it there have been great results.
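The idea in this post, that traversing edges is a matrix product, can be sketched without any library. The code below is a plain-Python stand-in, not the GraphBLAS or python-graphblas API: the sparse adjacency matrix is a dict of nonzero columns per row, the frontier is a sparse boolean vector, and one breadth-first step is a vector-matrix product over the boolean (OR, AND) semiring. The names `vxm` and `bfs_levels` and the sample graph are hypothetical.

```python
# Sparse adjacency matrix A stored as {row: {col, ...}}; a column j in row i
# means A[i][j] = 1, i.e. there is an edge i -> j. Zeros are not stored.
A = {
    0: {1, 2},
    1: {3},
    2: {3},
    3: {4},
    4: set(),
}


def vxm(x, A):
    """Sparse product y = x A over the boolean (OR, AND) semiring.

    x is a sparse boolean vector stored as the set of its nonzero indices.
    """
    y = set()
    for i in x:
        y |= A.get(i, set())
    return y


def bfs_levels(A, source):
    """Breadth-first search written as repeated vector-matrix products."""
    level = {source: 0}
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        # Multiply, then mask out nodes that were already visited.
        frontier = vxm(frontier, A) - set(level)
        for node in frontier:
            level[node] = depth
    return level


levels = bfs_levels(A, 0)
```

The loop body never mentions threads or scheduling; a library that owns `vxm` is free to parallelize it on whatever hardware it targets, which is the argument the post makes.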
Build the dependency graph of your BigQuery pipelines at no cost: a Python implementation
2 projects | dev.to | 11 Jan 2024

In the project we used the Python library networkx and a DiGraph object (directed graph). To detect a table reference in a query, we use sqlglot, a SQL parser (among other things) that works well with BigQuery.
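A minimal sketch of such a dependency graph follows. To stay self-contained it swaps in the standard library's `graphlib` for networkx's DiGraph and uses a hand-written mapping instead of parsing SQL with sqlglot; the table names and the `downstream` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical result of parsing each pipeline's query:
# table -> set of tables that its query reads from.
references = {
    "raw_orders": set(),
    "raw_customers": set(),
    "clean_orders": {"raw_orders"},
    "orders_by_customer": {"clean_orders", "raw_customers"},
    "daily_report": {"orders_by_customer"},
}

# TopologicalSorter expects node -> predecessors, which matches the mapping,
# so every table appears after the tables it depends on.
run_order = list(TopologicalSorter(references).static_order())


def downstream(references, table):
    """Return every table that directly or transitively reads `table`."""
    affected = set()
    stack = [table]
    while stack:
        current = stack.pop()
        for target, sources in references.items():
            if current in sources and target not in affected:
                affected.add(target)
                stack.append(target)
    return affected


impacted = downstream(references, "raw_orders")
```

`run_order` gives a valid execution order for the pipelines, and `impacted` lists what has to be rebuilt when one source table changes.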
NetworkX – Network Analysis in Python
1 project | /r/patient_hackernews | 9 Dec 2023

1 project | /r/hackernews | 9 Dec 2023

1 project | /r/hypeurls | 8 Dec 2023

8 projects | news.ycombinator.com | 8 Dec 2023
Custom libraries and utility tools for challenges
1 project | /r/adventofcode | 5 Dec 2023

If you program in Python, you can use NetworkX for that. But it's probably a good idea to implement the basic algorithms yourself at least one time.
Google open-sources their graph mining library
7 projects | news.ycombinator.com | 3 Oct 2023

For those wanting to play with graphs and ML: I was browsing the arangodb docs recently and I saw that it includes integrations to various graph libraries and machine learning frameworks [1]. I also saw a few jupyter notebooks dealing with machine learning from graphs [2].
Integrations include:
* NetworkX -- https://networkx.org/
* DeepGraphLibrary -- https://www.dgl.ai/
* cuGraph (Rapids.ai Graph) -- https://docs.rapids.ai/api/cugraph/stable/
* PyG (PyTorch Geometric) -- https://pytorch-geometric.readthedocs.io/en/latest/
--
1: https://docs.arangodb.com/3.11/data-science/adapters/
2: https://github.com/arangodb/interactive_tutorials#machine-le...
org-roam-pygraph: Build a graph of your org-roam collection for use in Python
2 projects | /r/orgmode | 7 May 2023

org-roam-ui is a great interactive visualization tool, but its main use is visualization. The hope of this library is that it could be part of a larger graph analysis pipeline. The demo provides an example graph visualization, but what you choose to do with the resulting graph certainly isn't limited to that. See for example networkx.

What are some alternatives?

When comparing pygraphistry and NetworkX you can also consider the following projects:

Graphia - A visualisation tool for the creation and analysis of graphs

Numba - NumPy aware dynamic Python compiler using LLVM

cugraph - cuGraph - RAPIDS Graph Analytics Library

Dask - Parallel computing with task scheduling

reddit-detective - Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more

julia - The Julia Programming Language

cusim - Superfast CUDA implementation of Word2Vec and Latent Dirichlet Allocation (LDA)

RDKit - The official sources for the RDKit library

Gephi - Gephi - The Open Graph Viz Platform

snap - Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library.

chinese-whispers - An implementation of Chinese Whispers in Python.

SymPy - A computer algebra system written in pure Python
