Top 23 Python Parser Projects
Data parsing and validation using Python type hints
Project mention: Strict Python Function Parameters | news.ycombinator.com | 2022-01-23
Slightly off-topic, but everyone writing modern Python should be familiar with Pydantic and similar libraries that use type hints for validation and parsing:
We're using Pydantic for Robusta (https://github.com/robusta-dev/robusta) and absolutely love it. You get the best of traditional Python (rapid prototyping and no boilerplate) while still being able to scale your codebase and keep it maintainable. Robusta is the first large project I've written in Python where I'm not encountering type errors at runtime left and right.
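To make the idea of "type hints for validation and parsing" concrete, here is a minimal sketch that uses only the standard library. It is not Pydantic's API or implementation; `parse_into`, `ValidationError` and the `Alert` model are hypothetical names invented for this illustration, and the sketch only handles simple annotations such as `int` and `str`.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


class ValidationError(ValueError):
    """Raised when a field is missing or cannot be coerced (hypothetical name)."""


def parse_into(cls, raw: dict):
    """Build a dataclass instance, coercing each value to its annotated type.

    Assumption: every annotation is a plain callable type such as int or str.
    """
    hints = get_type_hints(cls)
    values = {}
    for field in fields(cls):
        if field.name not in raw:
            raise ValidationError(f"missing field: {field.name}")
        target = hints[field.name]
        try:
            values[field.name] = target(raw[field.name])
        except (TypeError, ValueError) as exc:
            raise ValidationError(
                f"{field.name}: expected {target.__name__}"
            ) from exc
    return cls(**values)


@dataclass
class Alert:  # hypothetical model, for illustration only
    name: str
    severity: int


# The string "3" is parsed into the int 3 because of the annotation.
alert = parse_into(Alert, {"name": "disk-full", "severity": "3"})
```

Passing `{"name": "x", "severity": "high"}` raises `ValidationError` at the boundary instead of letting a bad value travel further into the program, which is the kind of runtime type error the comment above describes avoiding.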
Community-maintained fork of pdfminer - we fathom PDF
Project mention: How should I go about extracting a dataframe from a PDF? | reddit.com/r/learnpython | 2021-11-09
Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
Project mention: Made a Programing language using python | reddit.com/r/Python | 2021-11-29
There's also lark, which is used by a plethora of projects (I haven't used it, but I heard about PreQL on a podcast where they talk for a bit about what it's like to develop a new language in lark)
Python port of Google's libphonenumber
Project mention: Does anyone know where I can find official docs for python-phonenumbers package? | reddit.com/r/learnprogramming | 2022-01-12
This is the GitHub repo for the package.
A non-validating SQL parser module for Python
Project mention: Open Source SQL Parsers | dev.to | 2021-10-08
Regular expressions are a popular way to extract information from SQL statements. However, they quickly become too complex to handle common features such as WITH clauses, sub-queries, window clauses, aliases and quotes. sqlparse is a popular Python package that uses regular expressions to parse SQL.
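A small sketch of why a single regular expression struggles with these features. The pattern and the `naive_tables` helper are hypothetical and written only for this illustration; they are not taken from sqlparse.

```python
import re

# Naive assumption: whatever follows FROM or JOIN is a table name.
TABLE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def naive_tables(sql: str) -> list:
    """Return every identifier that follows FROM or JOIN."""
    return TABLE_RE.findall(sql)


simple = "SELECT id FROM users JOIN orders ON users.id = orders.user_id"
with_cte = (
    "WITH recent AS (SELECT * FROM orders WHERE created > '2021-01-01') "
    "SELECT * FROM recent"
)
quoted = "SELECT 'copied from backup' AS note FROM users"

print(naive_tables(simple))    # ['users', 'orders']
print(naive_tables(with_cte))  # ['orders', 'recent']
print(naive_tables(quoted))    # ['backup', 'users']
```

The first query works. In the second, `recent` is a WITH alias rather than a real table, and in the third the word inside the string literal is reported as a table. Handling these cases needs knowledge of quoting and scoping, which is where a tokenizer or full parser comes in.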
Python m3u8 Parser for HTTP Live Streaming (HLS) Transmissions
Project mention: check if m3u8 link is active | reddit.com/r/learnpython | 2021-03-12
Snoop: a reconnaissance tool based on open data (OSINT world) (by snooppr)
Project mention: FOSS News International #2: November 8-14, 2021 | reddit.com/r/fossnews | 2021-11-15
A Python library that provides an easy way to identify devices like mobile phones, tablets and their capabilities by parsing (browser) user agent strings.
Type-safe YAML parser and validator.
IMDbPY is a Python package useful to retrieve and manage the data of the IMDb movie database about movies, people, characters and companies
Project mention: [OC]IMDB Top 30 movies: cast death rate | reddit.com/r/dataisbeautiful | 2022-01-17
A VBA parser and emulation engine to analyze malicious macros.
Project mention: De-obfuscation | reddit.com/r/Malware | 2021-06-02
Construct: Declarative data structures for Python that allow symmetric parsing and building
Project mention: Binary serialization library for at least C++17? | reddit.com/r/cpp_questions | 2021-10-10
I myself am looking for a binary serializer/deserializer that's like construct in python or construct-js, but obviously I wouldn't need some of the types that they have, since C++ already has them.
GuessIt is a Python library that extracts as much information as possible from a video filename.
Project mention: Small but fast open directory with movies, and Zappa documentary (personal interest) | reddit.com/r/opendirectories | 2021-01-31
Extract movie details (title, year) from the filename with guessit
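For a sense of what that extraction involves, here is a rough regex-based sketch. It is not how guessit works and does not use guessit's API; `FILENAME_RE` and `title_and_year` are hypothetical names, and the pattern assumes the title is followed by a four-digit year between 1900 and 2099.

```python
import re

# Assumption: title, then separators, then a 4-digit year, then a separator or end.
FILENAME_RE = re.compile(
    r"^(?P<title>.+?)[. _(\[-]+(?P<year>(?:19|20)\d{2})(?:[. _)\]-]|$)"
)


def title_and_year(filename: str):
    """Return (title, year) guessed from a video filename, or None."""
    stem = filename.rsplit(".", 1)[0]  # drop the extension
    match = FILENAME_RE.match(stem)
    if match is None:
        return None
    # Dots and underscores in the title are treated as word separators.
    title = re.sub(r"[._]+", " ", match.group("title")).strip()
    return title, int(match.group("year"))


print(title_and_year("The.Matrix.1999.1080p.BluRay.mkv"))  # ('The Matrix', 1999)
print(title_and_year("Blade Runner (1982).mp4"))           # ('Blade Runner', 1982)
print(title_and_year("notes.txt"))                         # None
```

A sketch like this only covers title and year, which is why a dedicated library is the better choice for real collections.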
Domain-Specific Languages and parsers in Python made easy http://textx.github.io/textX/
A simple Python module for parsing human names into their individual components
A Python parser for MediaWiki wikicode
Project mention: [Python] How can I clean up Wikipedia's XML backup dump to create dictionaries of commonly used words for multiple languages? | reddit.com/r/learnprogramming | 2021-10-12
In particular what you're looking at is not XML but wikitext. I found a discussion on stackoverflow about solving the same problem of getting text from wikitext. Seems like the most promising solution in Python since you already have the dump is to run each page through mwparserfromhell. According to the top stackoverflow answer you could use something like
A common base representation of Python source code for pylint and other projects (by PyCQA)
Project mention: Klara: Python automatic test generations and static analysis library | reddit.com/r/Python | 2021-09-13
It also provides inference for static analysis purposes, similar to astroid, with SMT support. E.g.
Exposing problems in JSON parsers of several programming languages.
Project mention: Parsing JSON is a Minefield 💣 (2018) | reddit.com/r/coding | 2021-10-11
The nginx default is 1MB, which gets you 512 uncompressed nested arrays. That's already beyond the nesting limit of many parsers (see the Results section of that repository README, which documents the limit on many different language libraries).
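The arithmetic behind a body-size limit and nesting depth can be checked with the standard library. This is a sketch assuming CPython's `json` module with the default recursion limit; the `parses` helper is a hypothetical name introduced for this illustration.

```python
import json

body_limit = 1024 * 1024      # a 1 MiB request body, in bytes
max_levels = body_limit // 2  # each nesting level costs "[" plus "]"
print(max_levels)             # 524288


def parses(depth: int) -> bool:
    """Return True if json.loads accepts `depth` nested empty arrays."""
    document = "[" * depth + "]" * depth
    try:
        json.loads(document)
    except RecursionError:
        return False
    return True


print(parses(10))       # True
print(parses(100_000))  # False: the parser gives up long before 524288 levels
```

A body made only of brackets therefore fits far more nesting levels under the size limit than this parser will accept, so a size cap alone does not protect a parser from deeply nested input.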
Python-based Hardware Design Processing Toolkit for Verilog HDL
Project mention: How to compare HDL simulation/implementation results to Matlab? | reddit.com/r/FPGA | 2021-06-01
Identify hardcoded secrets in static structured text (by Skyscanner)
Project mention: Skyscanner/whispers - Identify hardcoded secrets and dangerous behaviours | reddit.com/r/GithubSecurityTools | 2021-10-07
Wiktionary dump file parser and multilingual data extractor
Project mention: What are some of the best digital free dictionaries available online (even for commercial use)? | reddit.com/r/languagelearning | 2022-01-02
Many parsers are available. https://github.com/tatuylonen/wiktextract
Python Parser related posts
3 projects | news.ycombinator.com | 22 Jan 2022
Collection of tools for executable packing detection
6 projects | reddit.com/r/Malware | 15 Jan 2022
What type hint should I use for "some container type" in general but explicitly exclude the str type?
2 projects | reddit.com/r/learnpython | 13 Jan 2022
Does anyone know where I can find official docs for python-phonenumbers package?
1 project | reddit.com/r/learnprogramming | 12 Jan 2022
Master Dataclasses in Python Part 3 - Ordering of Dataclasses
1 project | reddit.com/r/Python | 2 Jan 2022
Don't let dicts spoil your code
1 project | dev.to | 27 Dec 2021
Attrs – The One Python Library Everyone Needs
5 projects | news.ycombinator.com | 24 Dec 2021
What are some of the best open-source Parser projects in Python? This list will help you: