arquero
hal9ai
DISCONTINUED
| | arquero | hal9ai |
|---|---|---|
| Mentions | 8 | 22 |
| Stars | 1,170 | 122 |
| Growth | 2.7% | - |
| Activity | 5.1 | -22.7 |
| Last commit | 13 days ago | 8 months ago |
| Language | JavaScript | TypeScript |
| License | BSD 3-clause "New" or "Revised" License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
arquero
-
Show HN: Matrices – explore, visualize, and share large datasets
Hey HN, I'm excited to share a new side project I've been working on.
The product is called Matrices. You can check it out here: https://matrices.com/.
With Matrices, you can *explore*, *visualize*, and *share* large (100k rows) datasets–all without code. Filter data down to just what you want, visualize it with built-in charts, and share your results with one click.
You can use it today (no login or waitlist or anything). Just copy and paste your data from a Google Sheet or CSV file.
It's hard to describe the feeling of "gliding over data" you get with Matrices, so I'd rather *show* you how it works instead. This 75s video will give you a sense of how it works: https://www.youtube.com/watch?v=Rrh9_I3Ux8E.
Data is stored locally in your browser until you publish it, though a small sample does go to the OpenAI APIs for AI-assisted features.
I started building Matrices because I wanted a tool that made it easy to explore new datasets. When I'm first trying to dig into data, I'll have one question... that leads to another... that will invariably lead to five more questions. It's sort of a fractal process, and I couldn't find many good options that were fast, responsive, and visual.
I figured this crowd would be interested in the tech stack as well: it's using arquero [1] bindings over Apache Arrow for in-memory analytics, and visx [2] for visualizations. I'd like to add DuckDB-Wasm support at some point to open up a wider set of databases. Data is serialized as Parquet to save a bit on bandwidth and storage.
Give it a spin, and let me know what you think. This is my first 'serious frontend project' so I appreciate any and all feedback and bug reports. Feel free to comment here (I'll be around most of the day), or shoot me a note: [email protected]
- Goodbye, Node.js Buffer
-
Hal9: Data Science with JavaScript
Transformations: We found out that JavaScript in combination with D3.js has a pretty decent set of data transformation functions; however, it comes nowhere near Pandas or dplyr. We found out about Tidy.js quite early, loved it, and adopted it. The combination of Tidy.js, D3.js, and Plot.js is absolutely amazing for visualizations and data wrangling with small datasets, say 10-100K rows. We were very happy with this for a while; however, once you move away from visualizations into real-world data analysis, we found 100K rows restrictive, and it gets worse with 100 or 1K columns. So we switched gears and started using Arquero.js, which happens to be columnar and enabled us to process 1M+ rows in the browser, a decent size for real-world data analysis.
-
Apache Arrow 3.0.0 Release
Take a look at the arquero library from a research group at University of Washington (the same group that D3 came out of). https://github.com/uwdata/arquero
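The Hal9 excerpt above credits Arquero's columnar design for handling 1M+ rows in the browser. As a rough illustration of what "columnar" means, here is a minimal TypeScript sketch that stores each field in its own array and aggregates by reading only the columns it needs. This is not Arquero's API or implementation; the `ColumnTable` type, the `toColumns` and `sumSalesByCity` helpers, and the sample data are all hypothetical names made up for this example.

```typescript
// Row-oriented record: one object per row.
type Row = { city: string; sales: number };

// Column-oriented table: one array per field (hypothetical layout,
// not Arquero's internal representation).
interface ColumnTable {
  city: string[];
  sales: Float64Array; // numeric column packed in a typed array
}

// Convert an array of row objects into per-field columns.
function toColumns(rows: Row[]): ColumnTable {
  return {
    city: rows.map((r) => r.city),
    sales: Float64Array.from(rows, (r) => r.sales),
  };
}

// Group by city and sum sales, reading only the two columns involved.
function sumSalesByCity(t: ColumnTable): Map<string, number> {
  const totals = new Map<string, number>();
  t.city.forEach((c, i) => {
    totals.set(c, (totals.get(c) ?? 0) + (t.sales[i] ?? 0));
  });
  return totals;
}

// Made-up sample data for illustration only.
const table = toColumns([
  { city: "Seattle", sales: 10 },
  { city: "Austin", sales: 5 },
  { city: "Seattle", sales: 2.5 },
]);
const totals = sumSalesByCity(table);
```

With many columns, an aggregation like this never touches the fields it does not use, which is one reason column-oriented tables tend to scale better for analysis than arrays of row objects.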
hal9ai
-
PyScript
At https://hal9.com, we built components for data science with native JavaScript to avoid the waiting times and download overhead of Pyodide. We found out the best tools for doing data science in the browser are a combination of Arquero, D3, and TensorFlow.js. At least for now.
We wrote our findings of this and many other libraries here: https://news.hal9.com/posts/data-science-with-javascript
We are not using libfortran nor gdpr; we are basically using whatever libraries are available for the web. Since most data scientists don't want to use JS per se, you can build the apps as blocks in the Hal9 site or using a soon-to-be-released Python/R package, see https://notebooks.hal9.com
Feel free to check out our repo as well, all the "primitives" / blocks code is in the scripts folder: https://github.com/hal9ai/hal9ai
-
Ask HN: Can you share websites that are pushing the utility of browsers forward?
https://hal9.com helps data scientists build faster web applications.
It uses WebGL and WebAssembly to process larger datasets, perform inference in the browser with TensorFlow.js, and enables running Python code with Pyodide.
-
Ask HN: What ML platform are you using?
If you want to build a web application on top of your ML project, give https://hal9.com a shot. We designed Hal9 with ease of use for deployment and maximum compatibility with web technologies that enable you to build ML apps with React, Vue, etc. We launched a couple months ago but could use some early feedback and users. Thank you!
-
Built data analysis platform optimized for web developers
BTW, if you are ever interested in helping us out, you can send PRs to our GitHub repo. For instance, the summarize and convert blocks are here: https://github.com/hal9ai/hal9ai/blob/main/scripts/transforms/summarize.txt.js and https://github.com/hal9ai/hal9ai/blob/main/scripts/transforms/convert.txt.js
In addition, you can also use Hal9 as a standalone JS library for data analysis and skip the UX, see https://github.com/hal9ai/hal9ai
Thanks! It's actually all Vue. I like Vue better than React, just personal preference; React is great and probably better for large applications. The trickiest part is the code that executes the pipelines, since it has to run dynamic JS code with dynamic parameters. We made that open source to make sure people are not stuck with the product if they ever have to leave or want to scale outside the product: https://github.com/hal9ai/hal9ai
You can find more about this project at https://hal9.com. We allow you to edit any block with JavaScript and to export the analysis as embeddable HTML. You can also use Python or NodeJS if you need more advanced functionality.
-
PyFlow – visual and modular block programming in Python
We are working on https://hal9.com, which is language agnostic and allows you to compose different programming languages; however, we are focused at the moment on 1D-graphs but have plans to support 2D-graphs in the coming weeks.
If you want a demo or just time to chat, I'm available at javier at hal9.ai.
-
Mlflow, fastapi, streamlit template Project
We would love to help out since this is a perfect use case for https://hal9.ai; we are about to release our beta version that makes this as easy as copy-pasting code. You can find me at javier at hal9.ai to find some time to chat and give you a walkthrough of our code-to-api functionality.
What are some alternatives?
perspective - A data visualization and analytics component, especially well-suited for large and/or streaming datasets.
Apache Arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
regression-js - Curve Fitting in JavaScript.
arrow-julia - Official Julia implementation of Apache Arrow
pyodide - Pyodide is a Python distribution for the browser and Node.js based on WebAssembly
cylon - Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame.
blockly - The web-based visual programming editor.
vega-loader-arrow - Data loader for the Apache Arrow format.
starboard-notebook - In-browser literate notebooks
gradio - Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
ml5-library - Friendly machine learning for the web! 🤖