Top 12 tabular-data Open-Source Projects
React components for efficiently rendering large lists and tabular data
Project mention: Data driven Web Frontends....looking at React and beyond for CRUD | reddit.com/r/datascience | 2021-04-21
> GraphQL and React seem to be really popular
The Apollo + React combo works as long as you have fewer than 10K data points per request. Beyond that you have to invent ways to a) reduce bandwidth and b) optimize performance in the browser. React already has quite a few ways to deal with in-browser performance, e.g. https://github.com/bvaughn/react-virtualized. As for Apollo... I have had some epic troubles with it when there are many JOINs or a big payload, and ended up with Websockets in some cases and REST in others.
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualize and explore big tabular data at a billion rows per second 🚀
Project mention: I wrote one of the fastest DataFrame libraries | news.ycombinator.com | 2021-03-13
A terminal spreadsheet multitool for discovering and arranging data
Project mention: `uq` is a simple, user-friendly alternative to `sort | uniq`. | reddit.com/r/commandline | 2021-04-15
Run vd (VisiData) on the file, press Shift+F, and you instantly get the unique lines sorted by number of uses. Like sort | uniq -c | sort -n in one go.
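For readers who want to see what that pipeline computes, here is a minimal Python sketch of the same idea: count each distinct line, then list the lines in ascending order of count. It is not how VisiData implements its frequency view; the function name `count_unique_lines` and the sample data are made up for illustration, and ties are broken alphabetically only to keep the output deterministic.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def count_unique_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (count, line) pairs sorted by ascending count.

    Hypothetical helper that mirrors `sort | uniq -c | sort -n`.
    Ties are ordered alphabetically by line text.
    """
    # Strip the trailing newline so "a\n" and "a" count as the same line.
    counts = Counter(line.rstrip("\n") for line in lines)
    # Tuples sort by count first, then by the line text.
    return sorted((n, text) for text, n in counts.items())


# Illustrative sample input (not taken from the thread).
sample = ["b\n", "a\n", "b\n", "c\n", "b\n", "a\n"]
for n, text in count_unique_lines(sample):
    print(n, text)
```

To run it on a real file, pass an open file object instead of the sample list, since a file iterates line by line.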
Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
Project mention: Querying an XML, JSON, CSV, or RDF database | reddit.com/r/ItalyInformatica | 2021-03-31
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Project mention: Return 1 to N results from a large (19MM line) CSV | reddit.com/r/commandline | 2021-04-17
May well be overkill for your needs, but I'm a fan of tsv-utils. It's fast and enormously flexible, and seems to me a "best of breed" toolset for data mining CSV files (that is what it was written for). https://github.com/eBay/tsv-utils
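If a dedicated toolset is more than the task needs, the "return 1 to N results" part of the question can also be sketched with Python's standard library. This is not tsv-utils and makes no claim about its interface. The helper name `first_n_matches`, the column names, and the sample data are all assumptions for illustration. The point is that rows are streamed and reading stops as soon as N matches are found, so a 19MM-line file never has to fit in memory.

```python
import csv
import io
from itertools import islice
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def first_n_matches(
    rows: Iterable[Row], predicate: Callable[[Row], bool], n: int
) -> List[Row]:
    """Return at most n rows satisfying predicate (hypothetical helper).

    The generator is consumed lazily, so iteration stops after the
    n-th match instead of scanning the whole input.
    """
    return list(islice((row for row in rows if predicate(row)), n))


# Illustrative in-memory CSV; a real run would use open(path, newline="").
sample = io.StringIO("id,city\n1,Rome\n2,Oslo\n3,Rome\n4,Rome\n")
reader = csv.DictReader(sample)
for row in first_n_matches(reader, lambda r: r["city"] == "Rome", 2):
    print(row["id"], row["city"])
```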
In-memory tabular data in Julia
Project mention: Polars (Rust DataFrame library) join algorithm fastest in db-benchmark | reddit.com/r/rust | 2021-03-12
Looks like it's single threaded according to this open issue: https://github.com/JuliaData/DataFrames.jl/issues/2626
PyTorch implementation of TabNet paper: https://arxiv.org/pdf/1908.07442.pdf
Project mention: [D] Why Neural Networks for tabular data are bad? | reddit.com/r/MachineLearning | 2021-03-07
ktrain is a Python library that makes deep learning and AI more accessible and easier to apply
Project mention: Increase accuracy of NLP sentiment analysis? | reddit.com/r/datascience | 2020-12-29
Could be the last resort: pre-trained transformers through ktrain (ktrain has very detailed tutorials and examples)
Conditional GAN for generating synthetic tabular data.
Project mention: Weekly Entering & Transitioning Thread | 28 Mar 2021 - 04 Apr 2021 | reddit.com/r/datascience | 2021-03-29
A lightweight library for generating text tables.
AI Tool for querying natural language on tabular data.
Project mention: TableQA -Query your tabular data with natural language | dev.to | 2020-11-28
Multimodal model for text and tabular data with HuggingFace transformers as building block for text data
Project mention: Classification problem with text and numerical features | reddit.com/r/LanguageTechnology | 2021-04-15
A quick and dirty way of trying this is using this framework: https://github.com/georgian-io/Multimodal-Toolkit
What are some of the best open-source tabular-data projects? This list will help you: