| | textract | nimpy |
|---|---|---|
| Mentions | 4 | 38 |
| Stars | 3,784 | 1,416 |
| Growth | - | - |
| Activity | 3.5 | 5.8 |
| Last commit | 17 days ago | 3 months ago |
| Language | HTML | Nim |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
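The mapping from "top 10%" to a score of 9.0 can be illustrated with a percentile rank scaled to 0-10. This is only a sketch of one possible reading: the function name `relative_activity` and the idea that the score is a plain percentile rank are assumptions, and the site's actual formula (including how recent commits are weighted) is not given here.

```python
from bisect import bisect_left


def relative_activity(raw_scores, value):
    """Scale the percentile rank of `value` among `raw_scores` to 0-10.

    Hypothetical helper: assumes the published score is simply the share
    of tracked projects with a lower raw activity value, times ten.
    """
    ordered = sorted(raw_scores)
    below = bisect_left(ordered, value)  # projects strictly below `value`
    return round(10 * below / len(ordered), 1)


# 100 tracked projects with raw activity values 0..99.
tracked = list(range(100))
score = relative_activity(tracked, 90)  # 90 projects are less active
```

Under this reading, a project that is more active than 90 of the 100 tracked projects scores 9.0, which places it in the top 10%.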
textract
- How to give a file path to a file parser when you only have an HTTPRequest?
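A common workaround for path-only parsers is to write the uploaded bytes to a temporary file and hand over its path. The sketch below assumes the request body is already available as `bytes`; the helper name `with_temp_path` is hypothetical and not part of textract or any web framework.

```python
import os
import tempfile


def with_temp_path(data: bytes, suffix: str, parser):
    """Write `data` to a temporary file, call parser(path), then clean up."""
    # mkstemp returns an open file descriptor and the file's path.
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        # The file is closed here, so the parser can reopen it by path.
        return parser(path)
    finally:
        os.remove(path)
```

With textract installed, something like `with_temp_path(uploaded_bytes, ".pdf", textract.process)` should work, since `textract.process` takes a file path and picks its parser from the file extension; keeping the right suffix therefore matters.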
- pdf2doi: A Python library to retrieve the DOI (or other identifiers) from a PDF file
Scan the text inside the .pdf file, and check for any string that matches the pattern of a DOI or an arXiv ID. The text is extracted with PyPDF2 and textract.
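The pattern-matching step can be sketched with two regular expressions applied to text that has already been extracted. The patterns below are assumptions (a Crossref-style expression for modern DOIs and the new-style `arXiv:YYMM.NNNNN` form), not the expressions pdf2doi itself uses, and `find_identifiers` is a hypothetical helper.

```python
import re

# Crossref-style pattern for modern DOIs; an assumption, not pdf2doi's regex.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")
# New-style arXiv identifiers such as arXiv:2101.01234v2.
ARXIV_RE = re.compile(r"arXiv:(\d{4}\.\d{4,5}(?:v\d+)?)", re.IGNORECASE)


def find_identifiers(text):
    """Return the DOI and arXiv ID candidates found in already-extracted text."""
    # Strip trailing punctuation that belongs to the sentence, not the DOI.
    dois = [m.group(0).rstrip(".,;)") for m in DOI_RE.finditer(text)]
    arxiv_ids = [m.group(1) for m in ARXIV_RE.finditer(text)]
    return {"doi": dois, "arxiv": arxiv_ids}


sample = "See https://doi.org/10.1000/xyz123. Preprint: arXiv:2101.01234v2"
found = find_identifiers(sample)
```

In a real pipeline the `text` argument would come from the PDF extraction step (PyPDF2 or textract, as described above), and each candidate would still need to be validated.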
- I am a proficient Python coder whose learning has plateaued. Any really useful libraries I should look into learning? Taking recommendations.
And here are some libraries that might pique your interest, although they don't strictly answer your question:
  - tqdm for adding a progress bar to for loops (it comes with useful information like iterations per second and estimated time needed to finish)
  - alive_progress adds a progress bar like tqdm, but it works even with generators and while loops, which I don't think tqdm does
  - timebudget: with just a decorator, it prints the time a function took to execute as soon as the function completes
  - send2trash for sending files to the trash bin instead of permanently deleting them
  - keyboard for sending keyboard inputs or checking if a key is pressed
  - mouse: same as keyboard but with mouse buttons
  - textract for extracting text from many types of file with a single interface. It supports documents, PowerPoint presentations, CSV, Excel files, images, GIFs, audio, and many more
- Textract: Extract text from a large variety of file formats
Huh. Must have made a mistake posting the original link. Anyway, this is what I meant: https://textract.readthedocs.io
nimpy
- Mojo is now available on Mac
I mean honestly, the closest language to Mojo really is Nim. In the latest Lex Fridman interview [0], when he talks about his ideas behind Mojo, it pretty much sounds like he's describing Nim. Ok, fair, he wants Mojo to be a full superset of Python, but honestly with nimpy [1] our Python interop is about as seamless as it can really be (without being a superset, which Mojo clearly is not yet). Even the syntax of Mojo looks a damn lot like Nim imo. Anyway, I guess he has the ability to raise enough funds to hire enough people to write his own language within ~2 years, so as not to have to follow random people's whims about where to take the language. So I guess I can't blame him. But as someone who's pretty invested in the Nim community, it's quite a shame to see such a hyped language receive so much attention from people who should really check out Nim. ¯\_(ツ)_/¯
[0]: https://youtu.be/pdJQ8iVTwj8?si=LfPSNDq8UKKIsJd3
[1]: https://github.com/yglukhov/nimpy
- Show HN: Pip Imports in Deno
You can also do this in Nim, which basically means you can write any program you could in Python with libraries in Nim. https://github.com/yglukhov/nimpy
- Nim v2.0 Released
Ones that have not been mentioned so far:
nlvm is an unofficial LLVM backend: https://github.com/arnetheduck/nlvm
npeg lets you write PEGs inline in almost normal PEG notation: https://github.com/zevv/npeg
futhark provides for much more automatic C interop: https://github.com/PMunch/futhark
nimpy allows calling Python code from Nim and vice versa: https://github.com/yglukhov/nimpy
questionable provides a lot of syntax sugar surrounding Option/Result types: https://github.com/codex-storage/questionable
ratel is a framework for embedded programming: https://github.com/PMunch/ratel
cps allows arbitrary procedure rewriting to continuation passing style: https://github.com/nim-works/cps
chronos is an alternative async/await backend: https://github.com/status-im/nim-chronos
zero-functional fixes some inefficiencies when chaining list operations: https://github.com/zero-functional/zero-functional
owlkettle is a declarative macro-oriented library for GTK: https://github.com/can-lehmann/owlkettle
A longer list can be found at https://github.com/ringabout/awesome-nim.
- Prospects of utilising Nim in scientific computation?
I use Python daily for its massive momentum for scientific stuff, but I also use Nim for everything else. Nim compiles to C, and making Python native modules with Nim is easy with Nimpy.
- Can't run compiled nim code in Python
- Returning to Nim from Python and Rust
If you are a data scientist coming from Python, take a look at nimpy, a great way to just import Python libraries and use them! https://github.com/yglukhov/nimpy NumPy, pandas, and PyTorch are all usable in Nim.
Nim is the ultimate glue language, use libraries from anything: python, c, js, objc.
- Python's “Disappointing” Superpowers
I've come to really enjoy programming in Nim. Note that Nim is a very different language despite sharing a similar syntax. However, I feel it keeps a lot of the "feel" of the Python 2 days, when Python was a fairly simple, neat language, while letting you do things at compile time (like compile-time duck typing).
There's a good Python -> Nim bridge: https://github.com/yglukhov/nimpy
- Dunder methods in nimpy
See this nimpy issue about it: https://github.com/yglukhov/nimpy/issues/43
- What language to move to from python to speed up algo?
Nim has pretty good integration with Python: you can either keep your main code in Python and write small hot functions in Nim, importing them via nimporter, or use Python libraries from Nim via nimpy.
- ABI compatibility in Python: How hard could it be?
Related: Nimpy[0] provides an easy way to write Python extensions in Nim, which manages the ABI side very well.
Python 2 is now gone, but until it was, Nimpy was an easy way to write Python extension modules that only needed to be compiled once, and would work with any of your installed Python 2 and Python 3. Magic.
[0] https://github.com/yglukhov/nimpy
What are some alternatives?
PyPDF2 - A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
Nim - Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
newspaper - newspaper3k is a news, full-text, and article metadata extraction library for Python 3.
Box - Python dictionaries with advanced dot notation access
python-goose - HTML content / article extractor, web scraping lib in Python
nimporter - Compile Nim Extensions for Python On Import!
html2text - Convert HTML to Markdown-formatted text.
scinim - The core types and functions of the SciNim ecosystem
python-readability - fast python port of arc90's readability tool, updated to match latest readability.js!
nimpylib - Some python standard library functions ported to Nim
sumy - Module for automatic summarization of text documents and HTML pages.
nimskull - An in development statically typed systems programming language; with sustainability at its core. We, the community of users, maintain it.