arquero VS falcon

Compare arquero vs falcon and see how they differ.

arquero

Query processing and transformation of array-backed data tables. (by uwdata)
              arquero                                      falcon
Mentions      8                                            2
Stars         1,191                                        924
Growth        1.5%                                         0.5%
Activity      4.6                                          7.8
Last commit   about 1 month ago                            about 1 month ago
Language      JavaScript                                   Jupyter Notebook
License       BSD 3-clause "New" or "Revised" License      GNU General Public License v3.0 or later
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

arquero

Posts with mentions or reviews of arquero. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-07.
  • Show HN: Matrices – explore, visualize, and share large datasets
    2 projects | news.ycombinator.com | 7 Dec 2023
    Hey HN, I'm excited to share a new side project I've been working on.

    The product is called Matrices. You can check it out here: https://matrices.com/.

    With Matrices, you can *explore*, *visualize*, and *share* large (100k rows) datasets–all without code. Filter data down to just what you want, visualize it with built-in charts, and share your results with one click.

    You can use it today (no login or waitlist or anything). Just copy and paste your data from a google sheet or CSV file.

    It's hard to describe the feeling of "gliding over data" you get with Matrices, so I'd rather *show* you how it works instead. This 75s video will give you a sense of how it works: https://www.youtube.com/watch?v=Rrh9_I3Ux8E.

    Data is stored locally in your browser until you publish it, though a small sample does go to the OpenAI APIs for AI-assisted features.

    I started building Matrices because I wanted a tool that made it easy to explore new datasets. When I'm first trying to dig into data, I'll have one question... that leads to another... that will invariably lead to five more questions. It's sort of a fractal process, and I couldn't find many good options that were fast, responsive, and visual.

    I figured this crowd would be interested in the tech stack as well: it's using arquero [1] bindings over Apache Arrow for in-memory analytics, and visx [2] for visualizations. I'd like to add duckdb-wasm support at some point to open up a wider set of databases. Data is serialized as Parquet to save a bit on bandwidth + storage.

    Give it a spin, and let me know what you think. This is my first 'serious frontend project' so I appreciate any and all feedback and bug reports. Feel free to comment here (I'll be around most of the day), or shoot me a note: [email protected]

    [1]: https://uwdata.github.io/arquero/

  • Goodbye, Node.js Buffer
    15 projects | news.ycombinator.com | 24 Oct 2023
    https://github.com/uwdata/arquero
  • Arquero is a JavaScript library for query processing and transformation of array-backed data tables
    1 project | /r/programming | 24 Jul 2022
  • Arquero – data tables wrangling in JavaScript
    1 project | news.ycombinator.com | 22 Jul 2022
  • Hal9: Data Science with JavaScript
    4 projects | /r/datascience | 9 Sep 2021
    Transformations: We found that JavaScript in combination with D3.js has a pretty decent set of data transformation functions; however, it comes nowhere near Pandas or dplyr. We found out about Tidy.js quite early, loved it, and adopted it. The combination of Tidy.js, D3.js, and Plot.js is absolutely amazing for visualizations and data wrangling with small datasets, say 10-100K rows. We were very happy with this for a while; however, once we moved away from visualizations into real-world data analysis, we found 100K rows restrictive, which gets worse with 100 or 1K columns. So we switched gears and started using Arquero.js, which happens to be columnar and enabled us to process 1M+ rows in the browser, a decent size for real-world data analysis.
  • Arquero – Query processing and transformation of array-backed data tables
    1 project | news.ycombinator.com | 16 Feb 2021
  • Apache Arrow 3.0.0 Release
    10 projects | news.ycombinator.com | 3 Feb 2021
    Take a look at the arquero library from a research group at University of Washington (the same group that D3 came out of). https://github.com/uwdata/arquero
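
The Hal9 post above credits a columnar layout for letting them process larger tables in the browser. As a rough illustration of what "columnar" means, the sketch below stores the same small table once as an array of row objects and once as one typed array per column, then runs the same filter-and-sum over both. All names here (makeRows, makeColumns, revenueRows, revenueColumnar) are hypothetical and are not part of Arquero's API; this is a minimal sketch of the general idea, not how Arquero is implemented.

```javascript
// Hypothetical illustration only: none of these names come from Arquero.

// Row-oriented layout: one object per record.
function makeRows(n) {
  const rows = [];
  for (let i = 0; i < n; i++) {
    rows.push({ id: i, price: (i % 100) + 0.5, qty: i % 7 });
  }
  return rows;
}

// Column-oriented layout: one typed array per column.
function makeColumns(n) {
  const price = new Float64Array(n);
  const qty = new Int32Array(n);
  for (let i = 0; i < n; i++) {
    price[i] = (i % 100) + 0.5;
    qty[i] = i % 7;
  }
  return { length: n, price, qty };
}

// Sum price * qty over records with qty >= minQty (row layout).
function revenueRows(rows, minQty) {
  let total = 0;
  for (const r of rows) {
    if (r.qty >= minQty) total += r.price * r.qty;
  }
  return total;
}

// Same query on the column layout: only the two needed columns are scanned.
function revenueColumnar(table, minQty) {
  const { price, qty, length } = table;
  let total = 0;
  for (let i = 0; i < length; i++) {
    if (qty[i] >= minQty) total += price[i] * qty[i];
  }
  return total;
}
```

Both functions return the same total. The column version touches only the price and qty arrays, each stored contiguously, which is one plausible reason a columnar library copes better with wide tables of 100 or more columns.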

falcon

Posts with mentions or reviews of falcon. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-10-24.
  • Goodbye, Node.js Buffer
    15 projects | news.ycombinator.com | 24 Oct 2023
  • Launch HN: Drifting in Space (YC W22) – A server process for every user
    5 projects | news.ycombinator.com | 28 Feb 2022
    Good questions!

    > Why do you need one process per user? / Wouldn't this "event loop" actually be more efficient than one user/process, as there would be less context switching cost from the OS?

    We're particularly interested in apps that are often CPU-bound, so a traditional event-loop would be blocked for long periods of time. A typical solution is to put the work into a thread, so there would still be a context switch, albeit a smaller one.

    The process-per-user approach makes the most sense when a significant amount of the data used by each user does not overlap with other users. VS Code (in client/server mode) is a good example of this -- the overhead of siloing each process is relatively low compared to the benefits it gives. We think more data-heavy apps will make the same trade-offs.

    > Can I just keep a map of (connection, thread_id) on my server, and spawn one thread per user on my own server?

    If you don't have to scale beyond one server, this approach works fine, but it makes scaling horizontally complicated because you suddenly can't just use a plain old load balancer. It's not just about routing requests to the right server; deciding which server to run the threads on becomes complicated because you ideally want to decide based on the server load of each. We started going down this path, realized we'd end up re-inventing Kubernetes, so decided to embrace it instead.

    > Could I just load up my server with many cores, and give each user a SQLite database which runs each query in its own thread? This way a multi GB database would not be loaded into RAM, the query would filter it down to a result set.

    If, for a particular use case, it's economical to keep the data ready in a database that supports the query pattern users will make, it's probably not a good fit for a session-lived backend. In database terms, where our architecture makes sense is when you need to create an index on a dataset (or subset of a dataset) during the runtime of an application. For example, if you have thousands of large parquet files in blob storage and you want a user to be able to load one and run [Falcon](https://github.com/vega/falcon)-type analysis on it.
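
The last reply above describes creating an index on a dataset while the application is running. One simple index of that kind is a binned histogram with prefix sums: it is built in a single pass over a column, after which the count of values inside any range of bins takes two array lookups. The sketch below assumes that approach; the function names and parameters (buildBinIndex, countInBins, binCount) are hypothetical, and this is not taken from Falcon's source code.

```javascript
// Hypothetical sketch of a runtime-built index; not Falcon's implementation.

// Build the index in one pass over the values of a single column.
function buildBinIndex(values, min, max, binCount) {
  const counts = new Uint32Array(binCount);
  const width = (max - min) / binCount;
  for (const v of values) {
    if (v < min || v > max) continue; // skip out-of-range values
    // Clamp so that v === max falls into the last bin.
    const bin = Math.min(binCount - 1, Math.floor((v - min) / width));
    counts[bin]++;
  }
  // prefix[i] holds the number of values in bins 0 .. i-1.
  const prefix = new Uint32Array(binCount + 1);
  for (let i = 0; i < binCount; i++) {
    prefix[i + 1] = prefix[i] + counts[i];
  }
  return { min, max, width, binCount, prefix };
}

// Count the values that fall in bins [startBin, endBin) with two lookups.
function countInBins(index, startBin, endBin) {
  const lo = Math.max(0, Math.min(index.binCount, startBin));
  const hi = Math.max(lo, Math.min(index.binCount, endBin));
  return index.prefix[hi] - index.prefix[lo];
}

// Usage: index a column once, then answer brush-style range queries cheaply.
const exampleIndex = buildBinIndex([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 0, 10, 5);
const inMiddleBins = countInBins(exampleIndex, 1, 3); // values in bins 1 and 2
```

The build step costs time proportional to the size of the column, which is the kind of work a session-lived backend could do once after loading a file; each later range query is independent of the number of rows.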

What are some alternatives?

When comparing arquero and falcon you can also consider the following projects:

perspective - A data visualization and analytics component, especially well-suited for large and/or streaming datasets.

stateroom - A lightweight framework for building WebSocket-based application backends.

Apache Arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing

nodejs-polars - nodejs front-end of polars

hal9ai - Hal9 — Data apps powered by code and LLMs [Moved to: https://github.com/hal9ai/hal9]

streams - Streams Standard

regression-js - Curve Fitting in JavaScript.

proposal-zero-copy-arraybuffer-list - A proposal for zero-copy ArrayBuffer lists

arrow-julia - Official Julia implementation of Apache Arrow

proposal-arraybuffer-base64 - TC39 proposal for Uint8Array<->base64/hex

cylon - Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame.

spawner - Session backend orchestrator for ambitious browser-based apps. [Moved to: https://github.com/drifting-in-space/plane]