-
Datashader is good for rendering large amounts of data; I'd start with that.
https://datashader.org/
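For a sense of the typical Datashader workflow, here's a minimal sketch (the DataFrame, column names, and sizes are illustrative; it assumes your nodes already have 2D coordinates to rasterize):

```python
# Minimal Datashader sketch: rasterize millions of pre-laid-out nodes into an image.
# Assumes a pandas DataFrame with 'x' and 'y' columns (names are illustrative).
import numpy as np
import pandas as pd
import datashader as ds
import datashader.transfer_functions as tf
from datashader.utils import export_image

# Fake layout positions for demonstration; replace with your real node coordinates.
n = 5_000_000
df = pd.DataFrame({
    "x": np.random.standard_normal(n),
    "y": np.random.standard_normal(n),
})

canvas = ds.Canvas(plot_width=1200, plot_height=800)
agg = canvas.points(df, "x", "y")                                # bin points into a 2D aggregate
img = tf.shade(agg, cmap=["lightblue", "darkblue"], how="log")   # map bin counts to color
export_image(img, "nodes", background="white")                   # writes nodes.png
```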
-
I haven't tried that many, but this was able to render 100s of millions for me in real time.
https://github.com/latentcat/graphpu
-
You can visualise a graph with 9 billion nodes on https://www.openstreetmap.org
You could copy their design, if you know how you want to project your nodes into 2D: essentially you divide the visualisation into a very large number of tiles, generated at 18 different zoom levels, and the 'slippy map' viewer then loads the tiles corresponding to the chosen field of view.
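To make the tiling arithmetic concrete, here's a rough sketch (the function names are mine, and it assumes node positions are already projected into normalized [0, 1) coordinates):

```python
# Sketch of the slippy-map tiling idea: at zoom z the plane is a 2^z x 2^z grid of tiles,
# and the viewer only fetches the tiles intersecting the current field of view.
# Assumes node positions have already been projected into [0, 1) x [0, 1).

def tile_for_point(x: float, y: float, zoom: int) -> tuple[int, int]:
    """Return the (tx, ty) tile index containing a normalized point at a given zoom."""
    n = 2 ** zoom
    return min(int(x * n), n - 1), min(int(y * n), n - 1)

def tiles_for_viewport(x0: float, y0: float, x1: float, y1: float, zoom: int):
    """Yield every tile index intersecting the viewport rectangle at a given zoom."""
    tx0, ty0 = tile_for_point(x0, y0, zoom)
    tx1, ty1 = tile_for_point(x1, y1, zoom)
    for ty in range(ty0, ty1 + 1):
        for tx in range(tx0, tx1 + 1):
            yield tx, ty

# Example: which tiles does a small viewport need at zoom 4?
print(list(tiles_for_viewport(0.40, 0.40, 0.60, 0.55, zoom=4)))
```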
-
Cytoscape JS[1] with canvas rendering. Probably won't be able to do a billion nodes, but the last time I compared graph rendering libraries it was the best one in terms of performance/customizability. If you need even more performance, there's VivaGraphJS[2], which uses webgl to render.
If you want other resources, I also have a list of graph-related libraries (visualizations etc.) on GitHub[3].
[1]: https://js.cytoscape.org/
[2]: https://github.com/anvaka/VivaGraphJS
[3]: https://github.com/stars/AlexW00/lists/graph-stuff
-
Mosaic is designed for scale
https://github.com/uwdata/mosaic
https://idl.uw.edu/mosaic/
-
1) 100B? Try a thousand. Of course context matters, but I think it is common to overestimate the amount of information that can be visually conveyed at once. But it is also common to make errors in aggregation, or errors in how one interprets aggregation.
2) You may be interested in the large body of open source HPC visualization work; LLNL and ORNL are the two dominant labs in that space. Your issue might also be I/O, since you can generate data faster than you can visualize it. One paradigm HPC people use is "in situ" visualization, where you visualize at runtime so that you don't hold back computation. At this scale, if you're not massively parallelizing your work, then it isn't the CPU that's the bottleneck, but the thing between the chair and keyboard. The downside of in situ is that you have to hope you're visualizing the right data at the right time. But the paradigm also includes pushing data to another machine that handles the processing/visualization or even storage (i.e. compute on the fast machine, push data to a machine with lots of memory that handles storage; or, more advanced, one stream to a visualization machine and another to storage). Check out ADIOS2 for the I/O side of this.
https://github.com/ornladios/ADIOS2
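As a toy illustration of the in situ idea (plain standard-library Python rather than ADIOS2; all names here are illustrative), the simulation hands each step to a separate consumer process that aggregates or visualizes it while the producer keeps computing:

```python
# Toy sketch of the "in situ" pattern using only the standard library (not ADIOS2):
# the simulation keeps computing while a separate process consumes each step
# and does the aggregation/visualization/storage work concurrently.
import multiprocessing as mp
import random

def consumer(queue):
    """Stand-in for the visualization/storage side: reduce each step as it arrives."""
    while True:
        step, values = queue.get()
        if step is None:                      # sentinel: producer is done
            break
        print(f"step {step}: mean={sum(values) / len(values):.4f}")

if __name__ == "__main__":
    q = mp.Queue(maxsize=4)                   # bounded so the producer can't run away
    viz = mp.Process(target=consumer, args=(q,))
    viz.start()
    for step in range(10):                    # stand-in for the expensive simulation loop
        data = [random.random() for _ in range(100_000)]
        q.put((step, data))                   # hand the step off; computation continues
    q.put((None, None))
    viz.join()
```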
-
Honestly, I've been away from the field for quite a long time, so I wouldn't be up to date. But if you want a good framing of the field, how it evolved, and how it's different from other kinds of visualization (like scientific visualization), maybe start here [0].
0 - https://www.cs.purdue.edu/homes/xmt/classes/slides/CS530/Inf...
There used to be a lively research field for information visualization that studied current visualization techniques and proposed new ones to solve specific challenges -- I remember when treemaps were first introduced, for example [1]. Large networks were a pretty big area of research at the time, with all kinds of centrality, clustering, and edge-minimization techniques.
1 - https://www.google.com/search?q=treemap+visualization&tbs=im...
A few teams even tried various kinds of hyperbolic representations [2,3], so that areas under local inspection were magnified under your cursor and the rest of the hairball was pushed off to the edges of the display. But with big graphs you run into quite a few big problems very quickly, like local vs. global visibility, layout challenges, etc.
2 - https://graphics.stanford.edu/papers/webviz/webviz/node2.htm...
3 - https://www.caida.org/catalog/software/walrus/
Not specifically graph related, but the best critical thinker I know of in the space is probably Edward Tufte [4]. I have some problems with a few bits of his thinking, and other than sparklines his contributions are mostly in terms of critically challenging what should be represented, why, how, and through which methods of interaction; still, his critical analysis has stayed up there as some of the best. He has a book set that's a really great collection of his thoughts.
4 - https://www.edwardtufte.com/tufte/
If you approach this problem critically, you end up at the inevitable conclusion that trying to globally visualize a massive graph is, in general, basically useless. Sure, there are specific topologies that can be abstracted into easier-to-display graphs, but the general case isn't amenable to it. It's also somewhat surprising how small a graph can be before visualizing it gets out of hand -- maybe a few dozen nodes and edges.
I remember the U.S. DoE did some really pioneering studies in the field and produced some underappreciated experts like Thomas, Cook and Risch [5,6]. I like Risch's concepts around visualizations as formal metaphors of data. I think he's successful in defining the rigorous atomic components of visualization that you can build up from.
5 - https://ils.unc.edu/courses/2017_fall/inls641_001/books/RD_A...
6 - https://arxiv.org/pdf/0809.0884v1
One interesting artifact from all of this is that most of the research was long ago captured and commoditized or made open source. There really isn't a market anymore for commercial visualization companies, or grant money for visualization research. D3.js [7] (and its derivatives) more or less took millions upon millions of dollars in R&D and commercial research and boiled it down into a free, open source library that captured pretty much all of the major findings in one place. It's objectively better than anything that was on the market or in labs at the time I was in the space, and it's free.
7 - https://d3js.org/
-
I had this question a few years back while working on a social network graph project and trying to render a multi-million-node graph. I tried Ogma and it worked quite well, but it became too slow when approaching a million nodes. I ended up writing my own renderer in C++ and then Rust. Code here: https://github.com/zdimension/graphrust
Tested it up to 5M nodes, renders above 60fps on my laptop's iGPU and on my Pixel 7 Pro. Turns out, drawing lots of points using shaders is fast.
Though, like everybody else here said, you probably don't want to draw that many nodes. Create a lower-LoD version of the graph and render it instead.
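One simple way to build that lower-LoD version, assuming you already have a 2D position per node (this is just one of many coarsening strategies, and the names are illustrative), is to bin nodes into grid cells and collapse the edges between cells:

```python
# Rough sketch of one LoD strategy: collapse nodes into grid cells (like map tiles)
# and aggregate the edges between cells, so the renderer only sees the coarse graph.
# Assumes a precomputed 2D position per node; 'positions' and 'edges' are illustrative.
from collections import defaultdict

def coarsen(positions, edges, grid=256):
    """positions: {node_id: (x, y)} with x, y in [0, 1); edges: iterable of (u, v)."""
    cell_of = {}                       # node -> coarse cell id
    cell_weight = defaultdict(int)     # how many original nodes each cell holds
    for node, (x, y) in positions.items():
        cell = (min(int(x * grid), grid - 1), min(int(y * grid), grid - 1))
        cell_of[node] = cell
        cell_weight[cell] += 1

    coarse_edges = defaultdict(int)    # (cell_a, cell_b) -> number of collapsed edges
    for u, v in edges:
        a, b = cell_of[u], cell_of[v]
        if a != b:
            coarse_edges[tuple(sorted((a, b)))] += 1
    return cell_weight, coarse_edges

# The renderer then draws one point per cell (sized by weight) and one line per coarse edge.
```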
-
My library (https://gojs.net) can do that easily. Give it a look, and if you think the price is acceptable for your project, contact us and we can make you a proof-of-concept.
-
Oh god I ran into this issue! Fewer nodes, but still.
I created an HTML page that used vis-network to create a force-directed node graph. I'd then just open it up and wait for it to settle.
The initial code is here; you should be able to dump it into an LLM to have it explained: https://github.com/HebeHH/skyrim-alchemy/blob/master/HTMLGra...
I later used d3 to do pretty much the same thing, but with a much larger graph (still only 100,000 nodes). That was pretty fragile though, so I added an `export to svg` button so you could load the graph, wait for it to settle, and then download the full thing. This kept good quality for zooming in and out.
However, my node graphs were both incredibly messy, with many, many connections going everywhere. That meant I couldn't find a library that could work out how to lay them out properly the first time, and I needed the force-directed simulation to spread them out. For your case of 1 billion nodes, force-directed layout may not be the way to go.
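For comparison, the same load-layout-settle-export workflow looks roughly like this offline in Python, with networkx and matplotlib standing in for vis-network/d3 (an analogous sketch, not the code linked above):

```python
# Offline analogue of the "run a force-directed layout, let it settle, export a vector file"
# workflow described above, using networkx + matplotlib instead of vis-network/d3.
import networkx as nx
import matplotlib.pyplot as plt

G = nx.random_geometric_graph(2000, 0.03)            # stand-in for the real graph
pos = nx.spring_layout(G, iterations=200, seed=42)   # force-directed layout, run until it settles

plt.figure(figsize=(12, 12))
nx.draw_networkx_edges(G, pos, alpha=0.2, width=0.3)
nx.draw_networkx_nodes(G, pos, node_size=4)
plt.axis("off")
plt.savefig("graph.svg")                              # vector output keeps zoom quality
```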