Python Parsing

Open-source Python projects categorized as Parsing

Top 23 Python Parsing Projects

  • pydantic

    Data validation using Python type hints

  • Project mention: Advanced RAG with guided generation | dev.to | 2024-04-18

    First, note the method prefix_allowed_tokens_fn. This method applies a Pydantic model to constrain/guide how the LLM generates tokens. Next, see how that constraint can be applied to txtai's LLM pipeline.

  • maigret

    🕵️‍♂️ Collect a dossier on a person by username from thousands of sites

  • Maya

    Datetimes for Humans™

  • llmware

    Providing enterprise-grade LLM-based development framework, tools, and fine-tuned models.

  • Project mention: More Agents Is All You Need: LLMs performance scales with the number of agents | news.ycombinator.com | 2024-04-06

    I couldn't agree more. You should check out LLMWare's SLIM agents (https://github.com/llmware-ai/llmware/tree/main/examples/SLI...). It's focusing on pretty much exactly this and chaining multiple local LLMs together.

    A really good topic that ties in with this is the need for deterministic sampling (I may have the terminology a bit incorrect) depending on what the model is intended for. The LLMWare team did a good two-part video on this here as well (https://www.youtube.com/watch?v=7oMTGhSKuNY)

    I think dedicated miniature LLMs are the way forward.

    Disclaimer - Not affiliated with them in any way, just think it's a really cool project.

  • dateutil

    Useful extensions to the standard Python datetime features

  • Project mention: Using Openpyxl - keep min date, handle line breaks, handle duplicates | /r/learnpython | 2023-05-01

    Here is an example for a single cell (I'm using the dateutil package to parse the strings):
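
The example from that post is not reproduced on this page. As a rough sketch of the idea in the post title (keep the minimum date, handle line breaks, handle duplicates), the following assumes a cell's text holds one date string per line. The function name `earliest_date` and the sample cell text are hypothetical, and reading the value from a worksheet with openpyxl is left out.

```python
from dateutil import parser  # third-party package: python-dateutil


def earliest_date(cell_text):
    """Return the earliest date found in one cell's text, or None if empty.

    Assumes (hypothetically) that the cell holds one date string per line.
    """
    # Split on line breaks and drop blank lines.
    parts = [p.strip() for p in cell_text.splitlines() if p.strip()]
    # Parsing into a set removes duplicate dates.
    dates = {parser.parse(p) for p in parts}
    return min(dates) if dates else None


# Hypothetical cell content: three lines, one of them a duplicate.
cell_text = "2023-01-05\n2022-12-31\n2023-01-05"
print(earliest_date(cell_text))
```

ISO-style strings like these parse unambiguously. For formats such as 01/02/2023, `parser.parse` accepts a `dayfirst` argument to settle the day/month order.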

  • pyparsing

    Python library for creating PEG parsers

  • Project mention: Pyparsing 3.1.0 released | /r/pyparsing | 2023-06-19

    After over a year since the last release of pyparsing, I've bundled up all the bug-fixes and changes, and they are now released as pyparsing 3.1.0. Visit this link for the details.

  • plaso

    Super timeline all the things

  • pydantic-core

    Core validation logic for pydantic written in Rust

  • Project mention: Is there a pydantic.BaseSettings equivalent in rust? | /r/rust | 2023-06-05

    Funny that you ask... https://github.com/pydantic/pydantic-core Unfortunately it seems that the functionality you ask for is not (yet) part of this ...

  • facexlib

    FaceXlib aims at providing ready-to-use face-related functions based on current SOTA open-source methods.

  • Project mention: stable diffusion downloads something from github when making a image | /r/StableDiffusion | 2023-07-22

    "https://github.com/xinntao/facexlib/releases/download/v0.1.0/detection_Resnet50_Final.pth"

  • socid-extractor

    ⛏️ Extract accounts info from personal pages on various sites for OSINT purpose

  • WhatsApp-Chat-Exporter

    A customizable Android and iOS/iPadOS WhatsApp database parser that will give you the history of your WhatsApp conversations in HTML and JSON. Android Backup Crypt12, Crypt14, Crypt15, and new schema supported.

  • Project mention: Autogenerating a Book Series from Three Years of iMessages | news.ycombinator.com | 2024-03-07

    https://github.com/KnugiHK/WhatsApp-Chat-Exporter

    Just in case you missed my other comment.

    Not my repo.

  • FormatFuzzer

    FormatFuzzer is a framework for high-efficiency, high-quality generation and parsing of binary inputs.

  • pytago

    A source-to-source transpiler for Python to Go translation

  • funcparserlib

    Recursive descent parsing library for Python based on functional combinators

  • py-pdf-parser

    A Python tool to help extract information from structured PDFs.

  • wikitextparser

    A Python library to parse MediaWiki WikiText

  • OpenSIEM-Logstash-Parsing

    SIEM Logstash parsing for more than a hundred technologies

  • yacv

    Yet Another Compiler Visualizer

  • parglare

    A pure Python LR/GLR parser - http://www.igordejanovic.net/parglare/

  • Project mention: Parsing: The Solved Problem That Isn't (2011) | news.ycombinator.com | 2024-02-21

    These are not new, but my takeaways from https://tratt.net/laurie/blog/2020/which_parsing_approach.ht... and https://rust-analyzer.github.io/blog/2020/09/16/challeging-L... are to embrace various forms of LR parsing. https://github.com/igordejanovic/parglare is a very capable GLR parser, and I've been keeping a close eye on it for use in my projects.

  • tree-hugger

    A light-weight, extendable, high level, universal code parser built on top of tree-sitter

  • arxiv-miner

    arxiv_miner is a toolkit for mining research papers on CS ArXiv.

  • htmldate

    Fast and robust date extraction from web pages, with Python or on the command-line

  • dataconf

    Simple dataclasses configuration management for Python with hocon/json/yaml/properties/env-vars/dict/cli support.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Parsing projects in Python? This list will help you:

Rank  Project                     Stars
1     pydantic                    18,617
2     maigret                     9,606
3     Maya                        3,402
4     llmware                     3,086
5     dateutil                    2,247
6     pyparsing                   2,086
7     plaso                       1,618
8     pydantic-core               1,263
9     facexlib                    741
10    socid-extractor             581
11    WhatsApp-Chat-Exporter      446
12    FormatFuzzer                384
13    pytago                      371
14    funcparserlib               336
15    py-pdf-parser               335
16    wikitextparser              268
17    OpenSIEM-Logstash-Parsing   174
18    yacv                        132
19    parglare                    133
20    tree-hugger                 121
21    arxiv-miner                 111
22    htmldate                    106
23    dataconf                    79
