GraphScope: A One-Stop Large-Scale Graph Computing System

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • GraphScope

    🔨 🍇 💻 🚀 GraphScope: A One-Stop Large-Scale Graph Computing System from Alibaba | a one-stop graph computing system

  • Thanks for your interest in GraphScope!

    We do have a concrete plan for k8s-less deployment, and we already have an issue [1] tracking it. It will be available before the end of March 2021.

    To simplify environment setup, we will release a Docker image for end users. Running without Docker will also work, but requires building from source.

    GraphScope uses vineyard [2] as the storage layer for in-memory graph data structures. Currently, the graph type (a.k.a. ArrowPropertyFragment in GraphScope) uses a set of Arrow tables and arrays under the hood.

    GraphScope supports a `to_vineyard_dataframe` method on the computation context [3]. We also have a plan for integration between vineyard and dask (which may be delivered in March as well). Once that is in place, interop with dask should be straightforward.

    [1]: https://github.com/alibaba/GraphScope/discussions/113

    [2]: https://github.com/alibaba/libvineyard

    [3]: https://graphscope.io/docs/reference/context.html#graphscope...

  • libvineyard

    Discontinued vineyard (v6d): an in-memory immutable data manager. [Moved to: https://github.com/alibaba/v6d]

  • It makes sense to run such tasks on other machines/systems, so that they do not add too much burden to a graph DB and affect its quality of service.

    2. Full integration with Python makes it more flexible for data analytics. For example, you can use numpy, pandas and mars (https://github.com/mars-project/mars) alongside GraphScope with zero-copy data sharing, thanks to our storage engine vineyard (https://github.com/alibaba/libvineyard).

    3. Besides distributed processing, extra performance also comes from the efficient in-memory graph layout and from other optimizations at the compiler and runtime levels. GraphScope is ~100x faster on Gremlin queries, and even more on graph analytical algorithms like PageRank, compared with graph DBs like JanusGraph.

  • libgrape-lite

    🍇 A C++ library for parallel graph processing (GRAPE) 🍇

  • We don't have a benchmark comparing the analytical engine in GraphScope (a.k.a. GAE) with GraphX/Giraph. But we have evaluated the performance of GAE's underlying engine (libgrape-lite [1]) with the LDBC Graph Analytics Benchmark, and it achieves higher performance compared with state-of-the-art systems [2].

    [1]: https://github.com/alibaba/libgrape-lite

    [2]: https://github.com/alibaba/libgrape-lite/blob/master/Perform...

  • euler

    A distributed graph deep learning framework. (by alibaba)

  • https://github.com/alibaba/euler/wiki
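
Two ideas from the comments above can be illustrated together: a property graph kept as column-oriented tables, and an analytical algorithm such as PageRank scanning those columns. The following is only a minimal sketch in plain Python, not GraphScope, vineyard, or libgrape-lite code. The `vertices` and `edges` dicts are hypothetical stand-ins for Arrow tables, and `pagerank` is an illustrative helper with assumed defaults (damping 0.85, 50 iterations).

```python
from collections import defaultdict

# Hypothetical column-oriented property graph: each table is a dict of
# equal-length columns, standing in for the Arrow tables mentioned above.
vertices = {
    "id": [0, 1, 2, 3],
    "name": ["a", "b", "c", "d"],  # property column, unused by PageRank
}
edges = {
    "src": [0, 0, 1, 2, 3],
    "dst": [1, 2, 2, 0, 2],
}


def pagerank(vertices, edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over the columnar edge table."""
    ids = vertices["id"]
    n = len(ids)

    # Out-degree per vertex, computed from the "src" column alone.
    out_degree = defaultdict(int)
    for s in edges["src"]:
        out_degree[s] += 1

    rank = {v: 1.0 / n for v in ids}
    for _ in range(iterations):
        # Rank held by vertices without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in ids if out_degree[v] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        nxt = {v: base for v in ids}
        # Scan the two edge columns in lockstep.
        for s, d in zip(edges["src"], edges["dst"]):
            nxt[d] += damping * rank[s] / out_degree[s]
        rank = nxt
    return rank


ranks = pagerank(vertices, edges)
```

Here vertex 2 receives edges from every other vertex and ends up with the largest rank, while vertex 3 has no incoming edges and keeps only the teleport share. A real engine would partition the tables across workers and run the scan in parallel, which this single-process sketch does not attempt.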
