arrow-julia VS cylon

Compare arrow-julia and cylon and see how they differ.

arrow-julia

Official Julia implementation of Apache Arrow (by apache)

cylon

Cylon is a fast, scalable, distributed-memory, parallel runtime with a Pandas-like DataFrame. (by cylondata)
              arrow-julia                                  cylon
Mentions      4                                            3
Stars         277                                          293
Growth        1.8%                                         1.0%
Activity      6.2                                          4.8
Last commit   16 days ago                                  6 days ago
Language      Julia                                        C++
License       GNU General Public License v3.0 or later     Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

arrow-julia

Posts with mentions or reviews of arrow-julia. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-08-18.
  • Julia 1.8 has been released
    8 projects | news.ycombinator.com | 18 Aug 2022
    For some examples of people porting existing C++ or Fortran libraries to Julia, you should check out https://github.com/JuliaLinearAlgebra/Octavian.jl, https://github.com/dgleich/GenericArpack.jl, and https://github.com/apache/arrow-julia (just off the top of my head). These are all ports of C++ or Fortran libraries that match (or exceed) the performance of the original, and Arrow.jl in particular is faster, more general, and 10x less code.
  • How to adapt Arrow.Table columns (naturally per record batch basis) into CuArrays for GPU processing?
    1 project | /r/Julia | 2 Mar 2022
  • Reading HDF5 Files
    2 projects | /r/Julia | 9 Mar 2021
    I guess the currently preferred format is not Feather but Arrow: https://github.com/JuliaData/Arrow.jl
  • Apache Arrow 3.0.0 Release
    10 projects | news.ycombinator.com | 3 Feb 2021
    Excited to see this release's official inclusion of the pure Julia Arrow implementation [1]!

    It's so cool to be able to mmap Arrow memory and natively manipulate it from within Julia with virtually no performance overhead. Since the Julia compiler can specialize on the layout of Arrow-backed types at runtime (just as it can with any other type), the notion of needing to build/work with a separate "compiler for fast UDFs" is rendered obsolete.

    It feels pretty magical when two tools like this compose so well without either being designed with the other in mind - a testament to the thoughtful design of both :) mad props to Jacob Quinn for spearheading the effort to revive/restart Arrow.jl and get the package into this release.

    [1] https://github.com/JuliaData/Arrow.jl

cylon

Posts with mentions or reviews of cylon. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-01-25.
  • Data Parallel Pipeline/MapReduce in C++?
    2 projects | /r/cpp | 25 Jan 2023
    There's also https://cylondata.org/ which is more of a Pandas approach.
  • Cylon: DataFrames for MPI!
    1 project | /r/dataengineering | 21 Apr 2021
    I'd like to introduce Cylon, a fast, scalable, distributed-memory-parallel runtime. From the v0.4 release onward, Cylon introduces "Pandas-like DataFrames for MPI environments"! Cylon v0.4.0 is now available! :-) This is by far our most significant release.
  • Apache Arrow 3.0.0 Release
    10 projects | news.ycombinator.com | 3 Feb 2021
    cuDF and Cylon are two execution engines that natively support the Arrow format: https://github.com/rapidsai/cudf and https://github.com/cylondata/cylon

What are some alternatives?

When comparing arrow-julia and cylon, you can also consider the following projects:

perspective - A data visualization and analytics component, especially well-suited for large and/or streaming datasets.

Apache Arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing

arquero - Query processing and transformation of array-backed data tables.

vega-loader-arrow - Data loader for the Apache Arrow format.

ClickHouse - ClickHouse® is a free analytics DBMS for big data

TableIO.jl - A glue package for reading and writing tabular data. It aims to provide a uniform API for reading and writing tabular data from and to multiple sources.

go-py-arrow-bridge - Bridge between Go and Python to facilitate zero-copy using Apache Arrow