table-transformer
ast-grep
Metric | table-transformer | ast-grep
---|---|---
Mentions | 9 | 37
Stars | 1,869 | 6,099
Growth | 8.3% | 7.2%
Activity | 6.1 | 9.9
Latest commit | 5 months ago | 2 days ago
Language | Python | Rust
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
table-transformer
-
Data extraction from pdf
Saw this last time but never played with it https://github.com/microsoft/table-transformer
- FLaNK Stack Weekly 11 Dec 2023
-
[P] OCR + Table Extraction Advice
Have you tried the SOTA on Table Detection and Extraction with out of the box model weights?
-
How do you parse tables in PDF with langchain? Especially, the context which is few lines above and below the table.
https://huggingface.co/blog/document-ai https://github.com/microsoft/table-transformer https://github.com/google-research/pix2struct https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/table/README.md
-
[D] Unimpressive improvement in training speed after upgrading from GTX 980 Ti to RTX 4090
GPU is at 100%, although a bit spiky (occasionally dropping to 50%); I expect that to be normal? I use the same configuration as the authors (https://github.com/microsoft/table-transformer/blob/main/src/structure_config.json), with num_workers set to 1. Data is on a separate SSD; the C: drive is an NVMe SSD.
- Microsoft TableTransformer
- DeepDoctection
ast-grep
-
An infinite canvas for code exploration
It's unclear what the superpowers would be? Video doesn't show anything I can't do with an IDE or decent code editor, and there I also have refactoring tools, metadata like indicators for usages that can be used for navigating and so on.
Reminds me of UML-like diagrams over relational databases, except that it's generated one piece at a time. In practice I generate diagrams showing cyclomatic complexity much more often, and for code exploration outside the IDE I'd use ast-grep.
https://ast-grep.github.io/
-
Migrate to React 19 with ast-grep
This article illustrates the usage of ast-grep, a tool designed to locate and substitute patterns in your codebase, towards easing your migration to React 19.
- AST-grep(sg) AST grep based on Treesitter
-
Show HN: GritQL, a Rust CLI for rewriting source code
This looks great, thanks for building and sharing it.
Interested folks may also want to check out ast-grep:
https://github.com/ast-grep/ast-grep
-
How I build a chatbot for my OSS project, for free, without code!
ast-grep is a command-line tool that lets you search and transform code written in many programming languages using abstract syntax trees (ASTs). ASTs are data structures that capture the syntactic and semantic structure of source code. With ast-grep, you can write patterns as if you were writing ordinary code, and it will match all code that has the same syntactic structure. If you need more power, you can use its YAML-based rule system to write more sophisticated linting rules or code modifications.
- FLaNK Stack Weekly 11 Dec 2023
-
AST-grep(sg) is a CLI tool for code structural search, lint, and rewriting
I really like this - it means the tool is available to people familiar with any of those four distribution mechanisms.
You can also download pre-built binaries from their releases page: https://github.com/ast-grep/ast-grep/releases/tag/0.14.2
On top of that, they offer API bindings for it in three different languages:
- Rust (not yet stable): https://docs.rs/ast-grep-core/latest/ast_grep_core/
- JavaScript/TypeScript: https://ast-grep.github.io/guide/api-usage/js-api.html
- Python: https://ast-grep.github.io/guide/api-usage/py-api.html
It's rare to see a tool/library offer this depth of language support out of the box.
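To make the "match by syntactic structure" idea from the ast-grep description above more concrete, here is a minimal sketch using only Python's standard `ast` module. It is not ast-grep's implementation or API (ast-grep is built on Tree-sitter and uses code-like patterns); the `find_calls` and `text_hits` helpers and the sample source are hypothetical names made up for this illustration. It only contrasts a plain text search with a search over the syntax tree.

```python
import ast

# Hypothetical sample source, used only for this illustration.
SOURCE = '''
import logging

logging.info("starting")
logging.info(
    "multi-line call",
    extra={"k": 1},
)
note = "logging.info(not a call)"  # text inside a string literal
'''


def find_calls(source: str, dotted_name: str) -> list[int]:
    """Return the line numbers of calls whose callee is exactly `dotted_name`."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        # ast.unparse turns the callee expression back into source text,
        # so an Attribute node for logging.info becomes "logging.info".
        if isinstance(node, ast.Call) and ast.unparse(node.func) == dotted_name:
            lines.append(node.lineno)
    return sorted(lines)


def text_hits(source: str, needle: str) -> list[int]:
    """Return the line numbers where `needle` appears as plain text."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if needle in line
    ]


if __name__ == "__main__":
    print("structural matches:", find_calls(SOURCE, "logging.info"))
    print("text matches:", text_hits(SOURCE, "logging.info("))
```

In this sample, the text search also reports the string literal on the last line, while the tree-based search reports only the two real calls, including the one spread over several lines. This sketch requires Python 3.9 or later for `ast.unparse`.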
What are some alternatives?
pix2struct
ssr.nvim - Treesitter based structural search and replace plugin for Neovim.
CascadeTabNet - This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"
helix - A post-modern modal text editor.
PaddleOCR - Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)
weggli - weggli is a fast and robust semantic search tool for C and C++ codebases. It is designed to help security researchers identify interesting functionality in large codebases.
FLaNK-Ice - Apache Iceberg - Cloud Data Lakehouse
git-repo-sync - Auto synchronization of remote Git repositories. Auto conflict solving. Network fail resilience. Linux & Windows support. And more.
llama - Inference code for Llama models
telescope.nvim - Find, Filter, Preview, Pick. All lua, all the time.
camelot - Camelot: PDF Table Extraction for Humans
telescope-sg - Ast-grep picker for telescope.nvim