Top 23 JavaScript Data Projects
-
SheetJS js-xlsx
SheetJS Spreadsheet Data Toolkit -- New home https://git.sheetjs.com/SheetJS/sheetjs
Project mention: Building an inventory management app: 'Invento' as a Beginner Developer | dev.to | 2024-07-24
XLSX: XLSX is a library for parsing and writing Excel spreadsheet files. It enables the application to export data to Excel, which is a common requirement for inventory management systems.
-
Project mention: Svelte Data Tables for 2024: A Comprehensive Feature Comparison | dev.to | 2024-09-25
Tabulator
-
Project mention: Show HN: Documind – Open-source AI tool to turn documents into structured data | news.ycombinator.com | 2024-11-18
From the source, Documind appears to:
1) Install tools like Ghostscript, GraphicsMagick, and LibreOffice with a JS script.
2) Convert document pages to Base64 PNGs and send them to OpenAI for data extraction.
3) Use Supabase for unclear reasons.
Some issues with this approach:
* OpenAI may retain and use your data for training, raising privacy concerns [1].
* Dependencies should be managed with Docker or package managers like Nix or Pixi, which are more robust. Example: a tool like Parsr [2] provides a Dockerized pdf-to-json solution, complete with OCR support and an HTTP api.
* GPT-4 vision seems like a costly, error-prone, and unreliable solution, not really suited for extracting data from sensitive docs like invoices, without review.
* Traditional methods (PDF parsers with OCR support) are cheaper, more reliable, and avoid retention risks for this particular use case. Although these tools do require some plumbing... probably LLMs can really help with that!
While there are plenty of tools for structured data extraction, I think there's still room for a streamlined, all-in-one solution. This gap likely explains the abundance of closed-source commercial options tackling this very challenge.
---
1: https://platform.openai.com/docs/models#how-we-use-your-data
2: https://github.com/axa-group/Parsr
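As a rough illustration of step 2 in the comment above (pages turned into Base64 PNGs), the encoding itself can be sketched in a few lines of Node.js. This is not Documind's code: `pngToDataUrl` is a hypothetical helper, and the 8-byte PNG signature stands in for a real rendered page.

```javascript
// Hypothetical helper (not taken from Documind): wrap raw PNG bytes
// in a Base64 data URL, the usual way to inline an image in a JSON payload.
function pngToDataUrl(pngBytes) {
  return "data:image/png;base64," + Buffer.from(pngBytes).toString("base64");
}

// The 8-byte PNG file signature stands in for a full page image here.
const signature = Uint8Array.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const dataUrl = pngToDataUrl(signature);

console.log(dataUrl); // data:image/png;base64,iVBORw0KGgo=
```

Base64 inflates the payload by roughly a third, which is one reason sending every page as an image can get expensive.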
-
Countly
Countly is a product analytics platform that helps teams track, analyze, and act on user actions and behaviour in mobile, web, and desktop applications.
-
gray-matter
Smarter YAML front matter parser, used by metalsmith, Gatsby, Netlify, Assemble, mapbox-gl, phenomic, vuejs vitepress, TinaCMS, Shopify Polaris, Ant Design, Astro, hashicorp, garden, slidev, saber, sourcegraph, and many others. Simple to use, and battle tested. Parses YAML by default but can also parse JSON Front Matter, Coffee Front Matter, TOML Front Matter, and has support for custom parsers. Please follow gray-matter's author: https://github.com/jonschlinkert
Next is shifting towards what they're calling App Router. The previous iteration, known as Pages Router, is not compatible with those shiny new React Server Components I mentioned earlier. The main difference to me was using simple fetch and async/await syntax to fetch the server-side props. In this case, I had a Node script that relied on fs to retrieve the markdown files and a library called gray-matter to retrieve their YAML metadata properties. Then all I had to do was transform my [slug] page into an async function and call the function that fetched the posts from the filesystem.
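To make the front matter idea concrete, here is a minimal sketch of what splitting metadata from a markdown file involves. It is not gray-matter's implementation: `parseFrontMatter` is a hypothetical function that handles only flat `key: value` lines, whereas gray-matter delegates to a real YAML parser. The returned `{ data, content }` shape is chosen to resemble what gray-matter returns.

```javascript
// Minimal sketch, NOT gray-matter itself: split a leading `---` block
// from the body and read flat `key: value` pairs from it.
// Nested YAML, lists, and quoting are deliberately not handled.
function parseFrontMatter(source) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) {
    // No front matter block: everything is content.
    return { data: {}, content: source };
  }
  const data = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip lines that are not `key: value`
    const key = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim();
    if (key) data[key] = value;
  }
  return { data, content: source.slice(match[0].length) };
}

const post = parseFrontMatter("---\ntitle: Hello\nslug: hello-world\n---\n# Body\n");
console.log(post.data.title, post.data.slug); // Hello hello-world
```

In a setup like the one described above, the string passed in would come from reading each markdown file with `fs`, and `data.slug` would feed the `[slug]` route.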
-
Project mention: Show HN: JAQT – JavaScript Queries and Transformations | news.ycombinator.com | 2024-09-16
In a similar vein is https://pbeshai.github.io/tidy/ which I've used for 3+ years. It's a really nice lightweight transformer.
I've also used https://github.com/uwdata/arquero once (better performance for large datasets).
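For readers unfamiliar with this kind of library, the core operation is usually a group-and-summarize pass over an array of row objects. The sketch below uses plain JavaScript and a hypothetical `groupSum` helper; it does not use the tidy or arquero APIs, which offer far richer and faster versions of the same idea.

```javascript
// Plain-JavaScript sketch of a group-by + sum; `groupSum` is a
// hypothetical helper, not part of tidy, arquero, or JAQT.
function groupSum(rows, keyField, valueField) {
  const totals = new Map(); // Map keeps first-seen key order
  for (const row of rows) {
    const key = row[keyField];
    totals.set(key, (totals.get(key) ?? 0) + row[valueField]);
  }
  return [...totals].map(([key, total]) => ({ [keyField]: key, total }));
}

const sales = [
  { region: "north", amount: 10 },
  { region: "south", amount: 4 },
  { region: "north", amount: 6 },
];
const totals = groupSum(sales, "region", "amount");
console.log(totals);
```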
-
covid19
JSON time-series of coronavirus cases (confirmed, deaths and recovered) per country - updated daily (by pomber)
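A short sketch of how such a per-country time series might be consumed. The data shape (country name mapped to an array of `{ date, confirmed, deaths, recovered }` entries) and the assumption that counts are cumulative are not stated in the description above, so treat both as assumptions; `Exampleland` and its numbers are made up for illustration.

```javascript
// Assumed shape: { [country]: [{ date, confirmed, deaths, recovered }, ...] }.
// "Exampleland" and its figures are invented sample data.
const timeseries = {
  Exampleland: [
    { date: "2020-1-22", confirmed: 2, deaths: 0, recovered: 0 },
    { date: "2020-1-23", confirmed: 3, deaths: 0, recovered: 1 },
    { date: "2020-1-24", confirmed: 7, deaths: 1, recovered: 1 },
  ],
};

// Assuming cumulative counts, derive new confirmed cases per day
// by differencing consecutive entries.
function dailyNewConfirmed(days) {
  return days.map((day, i) => ({
    date: day.date,
    newConfirmed: i === 0 ? day.confirmed : day.confirmed - days[i - 1].confirmed,
  }));
}

const daily = dailyNewConfirmed(timeseries.Exampleland);
console.log(daily.map((d) => d.newConfirmed)); // [ 2, 1, 4 ]
```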
-
Project mention: Embedding Atlas: a scalable way to explore text embeddings | news.ycombinator.com | 2025-05-14
There have been several projects over the past few years to make text embeddings visually explorable (notably Nomic.ai's Nomic Atlas). However, Apple's just released a tool that makes this kind of analysis super accessible and insanely interactive.
Under the hood it's powered by Mosaic[0], a dataviz library built on top of DuckDB that's designed to handle coordinated interactive plots over huge datasets, the kind of thing where you interact with one plot and the rest all respond, which requires going back to the database to recalculate all the aggregations.
I've been fanboying Mosaic for the past year but finally have this to point to as an illustration of what's possible with it.
[0]: https://idl.uw.edu/mosaic
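The coordination pattern described in this comment (interact with one plot, every other plot re-aggregates) can be sketched without any library. Everything below is hypothetical and in-memory: the `Selection` class and `countBy` helper are illustrative names, not Mosaic's API, and a real system would push the filter down to the database rather than filter an array.

```javascript
// Hypothetical sketch of coordinated views; not Mosaic's API.
// A shared selection holds the current filter and notifies subscribers.
class Selection {
  constructor() {
    this.listeners = [];
  }
  subscribe(listener) {
    this.listeners.push(listener);
  }
  update(predicate) {
    for (const listener of this.listeners) listener(predicate);
  }
}

// Aggregation that each "plot" recomputes when the selection changes.
function countBy(rows, field) {
  const counts = {};
  for (const row of rows) counts[row[field]] = (counts[row[field]] ?? 0) + 1;
  return counts;
}

const rows = [
  { category: "a", value: 1 },
  { category: "b", value: 5 },
  { category: "a", value: 9 },
];

const selection = new Selection();
let histogram = countBy(rows, "category"); // initial, unfiltered view
selection.subscribe((predicate) => {
  histogram = countBy(rows.filter(predicate), "category");
});

// Simulate brushing another plot to keep only rows with value > 3.
selection.update((row) => row.value > 3);
console.log(histogram);
```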
-
kuwala
Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations together in one intuitive interface built with React Flow. In addition we provide third-party data into data science models and products with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demograp
-
Project mention: Show HN: Codigo – The Programming Language Repository | news.ycombinator.com | 2025-05-10
-
Project mention: Here's how you can type check your CSS properties and values. | dev.to | 2025-05-15
TypeScript and Flow definitions for CSS, generated by data from MDN. It provides autocompletion and type checking for CSS properties and values.
-
minecraft-data
Language independent module providing minecraft data for minecraft clients, servers and libraries.
-
react-native-big-list
This is a high-performance list view for React Native with support for complex layouts. Its usage is similar to FlatList, which makes replacement easy. The implementation renders big lists with a recycler focused on performance and memory usage, so it can handle thousands of items.
-
genshin-db
npm package with searching functions for Genshin Impact data of all in-game languages. Data parsed/organized directly from GenshinData repo.
-
spamscanner
Spam Scanner is a Node.js anti-spam, email filtering, and phishing prevention tool and service. Built for @ladjs, @forwardemail, @cabinjs, @breejs, and @lassjs.
-
strapi-plugin-config-sync
CLI & GUI for continuous migration of config data across environments
JavaScript Data related posts
- Embedding Atlas: a scalable way to explore text embeddings
- Show HN: Codigo – The Programming Language Repository
- Drawdata: Draw Datasets from Within Jupyter
- I built an options trading analytics dashboard for Robinhood users: OptionScope
- Experimental web browser optimized for rabbit-holing
- Show HN: JAQT – JavaScript Queries and Transformations
- Can LLMs Generate Novel Research Ideas?
Index
What are some of the best open-source Data projects in JavaScript? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | SheetJS js-xlsx | 35,566 |
| 2 | Tabulator | 7,121 |
| 3 | Parsr | 5,930 |
| 4 | Countly | 5,674 |
| 5 | gray-matter | 4,096 |
| 6 | react-refetch | 3,426 |
| 7 | awesome-json-datasets | 3,423 |
| 8 | kea | 1,955 |
| 9 | arquero | 1,397 |
| 10 | drawdata | 1,396 |
| 11 | covid19_scenarios | 1,355 |
| 12 | covid19 | 1,227 |
| 13 | mosaic | 1,020 |
| 14 | kuwala | 795 |
| 15 | pldb | 762 |
| 16 | data | 745 |
| 17 | minecraft-data | 738 |
| 18 | panini | 595 |
| 19 | react-native-big-list | 536 |
| 20 | PostGUI | 449 |
| 21 | genshin-db | 383 |
| 22 | spamscanner | 311 |
| 23 | strapi-plugin-config-sync | 261 |