Python Parsing

Open-source Python projects categorized as Parsing

Top 23 Python Parsing Projects

  • pydantic

    Data validation using Python type hints

    Project mention: [DISCUSSION] What's your favorite Python library, and how has it helped you in your projects? | /r/pythonhelp | 2023-04-22

    As for the most utilized and still loved library, that would probably be pydantic. It makes declaring types very convenient, be it DTOs, models, or just complex arguments, and it plays nicely with a bunch of other libraries from its own ecosystem.

  • maigret

    🕵️‍♂️ Collect a dossier on a person by username from thousands of sites

    Project mention: IWTL how to find and delete old online accounts that I've forgotten about | /r/IWantToLearn | 2023-04-17


  • Maya

    Datetimes for Humans™

  • dateutil

    Useful extensions to the standard Python datetime features

    Project mention: Using Openpyxl - keep min date, handle line breaks, handle duplicates | /r/learnpython | 2023-05-01

    Here is an example for a single cell (I'm using the dateutil package to parse the strings):

  • pyparsing

    Python library for creating PEG parsers

    Project mention: Need help developing an interpreter | /r/learnpython | 2023-03-07

    Look into "parser combinators" for building an interpreter. There are a few out there, but PyParsing is one I've seen around that looks pretty nifty.

  • plaso

    Super timeline all the things

    Project mention: Custom DFIR | /r/computerforensics | 2023-02-09

    However, what you are trying to do has already been done. For collections, look at Velociraptor's offline collector. For processing, check out Log2Timeline (plaso).

  • pydantic-core

    Core validation logic for pydantic, written in Rust

    Project mention: Investigating Pydantic v2's Bold Performance Claims | | 2023-05-17

    I encourage you to check out the official benchmarks for more realistic and detailed examples, and, as always, YMMV.

  • facexlib

    FaceXlib aims to provide ready-to-use face-related functions based on current SOTA open-source methods.

    Project mention: local Windows installation of GFP-GAN | /r/MLQuestions | 2022-07-16

    # Install facexlib -

  • socid-extractor

    ⛏️ Extract account info from personal pages on various sites for OSINT purposes

    Project mention: Looking for a good open source web scraping tool | /r/webscraping | 2023-01-14

    Check this for profiles:

  • FormatFuzzer

    FormatFuzzer is a framework for high-efficiency, high-quality generation and parsing of binary inputs.

  • pytago

    A source-to-source transpiler for Python to Go translation

    Project mention: Learning Go as a Python Developer: The Good and the Bad | | 2022-07-18

    Similarly helpful, pytago is a source-to-source transpiler for Python to Go

  • funcparserlib

    Recursive descent parsing library for Python based on functional combinators

  • py-pdf-parser

    A Python tool to help extract information from structured PDFs.

    Project mention: Need free/low-cost software that allows me to view the tags in a PDF. | /r/pdf | 2023-01-31

    Maybe look at this?

  • wikitextparser

    A Python library to parse MediaWiki WikiText

  • Whatsapp-Chat-Exporter

    A customizable Android and iPhone WhatsApp database parser that will give you the history of your WhatsApp conversations in HTML and JSON. Android Backup Crypt12, Crypt14, Crypt15, and new schema supported.

    Project mention: I am willing to pay hundreds of dollar to have my conversation with my parents on Whatsapp preserved, but there is no solution. No body other than me cares? | /r/DataHoarder | 2023-06-02

    Since you have the backup, this should be an option:

  • OpenSIEM-Logstash-Parsing

    SIEM Logstash parsing for more than a hundred technologies

  • yacv

    Yet Another Compiler Visualizer

  • tree-hugger

    A light-weight, extendable, high level, universal code parser built on top of tree-sitter

    Project mention: Tree-Hugger: Mine / Query source code | | 2022-10-02

  • arxiv-miner

    arxiv_miner is a toolkit for mining research papers on CS ArXiv.

  • htmldate

    Fast and robust date extraction from web pages, with Python or on the command-line

  • dataconf

    Simple dataclasses configuration management for Python with hocon/json/yaml/properties/env-vars/dict/cli support.

  • Robinhood-1099-Parser

    Parse Robinhood 1099 Tax Document from PDF into CSV

  • python-hslog

    Python module to parse Hearthstone Power.log files

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-06-02.
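
The "parser combinators" idea mentioned in the pyparsing entry can be illustrated with a small standard-library sketch. This is not PyParsing's API: every name below (char_where, many, many1, seq, mapped, evaluate) is hypothetical and exists only for this illustration. Each parser is a function from (text, position) to (value, new position), or None on failure, and larger parsers are built by combining smaller ones.

```python
from typing import Any, Callable, Optional, Tuple

# A parser takes (text, position) and returns (value, new_position),
# or None when it cannot match at that position.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]


def char_where(pred: Callable[[str], bool]) -> Parser:
    """Match a single character that satisfies pred."""
    def parse(text: str, pos: int):
        if pos < len(text) and pred(text[pos]):
            return text[pos], pos + 1
        return None
    return parse


def many(p: Parser) -> Parser:
    """Apply p zero or more times and collect the values in a list."""
    def parse(text: str, pos: int):
        values = []
        result = p(text, pos)
        while result is not None:
            value, pos = result
            values.append(value)
            result = p(text, pos)
        return values, pos
    return parse


def many1(p: Parser) -> Parser:
    """Like many, but fail unless p matched at least once."""
    repeated = many(p)
    def parse(text: str, pos: int):
        values, new_pos = repeated(text, pos)
        return (values, new_pos) if values else None
    return parse


def seq(*parsers: Parser) -> Parser:
    """Run parsers one after another; fail if any of them fails."""
    def parse(text: str, pos: int):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def mapped(p: Parser, fn: Callable[[Any], Any]) -> Parser:
    """Transform the value produced by p with fn."""
    def parse(text: str, pos: int):
        result = p(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return fn(value), new_pos
    return parse


# Grammar for this toy language:  sum := number ("+" number)*
number = mapped(many1(char_where(lambda c: c in "0123456789")),
                lambda digits: int("".join(digits)))
plus = char_where(lambda c: c == "+")
sum_expr = mapped(seq(number, many(seq(plus, number))),
                  lambda parts: parts[0] + sum(n for _, n in parts[1]))


def evaluate(text: str) -> int:
    """Parse and evaluate text, rejecting input with trailing garbage."""
    result = sum_expr(text, 0)
    if result is None or result[1] != len(text):
        raise ValueError(f"cannot parse {text!r}")
    return result[0]
```

Calling evaluate("1+20+300") parses the whole string and adds the three numbers, while evaluate("1+") raises ValueError because the trailing "+" is left unconsumed. A library such as PyParsing offers the same style of composition with far richer building blocks and error reporting.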

What are some of the best open-source Parsing projects in Python? This list will help you:

 #  Project                     Stars
 1  pydantic                   13,891
 2  maigret                     8,561
 3  Maya                        3,381
 4  dateutil                    2,071
 5  pyparsing                   1,843
 6  plaso                       1,455
 7  pydantic-core                 937
 8  facexlib                      469
 9  socid-extractor               436
10  FormatFuzzer                  349
11  pytago                        346
12  funcparserlib                 316
13  py-pdf-parser                 261
14  wikitextparser                230
15  Whatsapp-Chat-Exporter        229
16  OpenSIEM-Logstash-Parsing     155
17  yacv                          129
18  tree-hugger                   106
19  arxiv-miner                    98
20  htmldate                       72
21  dataconf                       65
22  Robinhood-1099-Parser          57
23  python-hslog                   49