Python information-extraction

Open-source Python projects categorized as information-extraction

Top 17 Python information-extraction Projects

  • PaddleNLP

    👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂 Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.

  • DeepKE

    [EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction

  • Project mention: Would this method work to increase the memory of the model? Saving summaries generated by a 2nd model and injecting them depending on the current topic. | /r/LocalLLaMA | 2023-06-09
  • InvoiceNet

    Deep neural network to extract intelligent information from invoice documents.

  • kor

    LLM(😽)

  • Project mention: Pydentic in prompt engineering | /r/LangChain | 2023-11-29

    Check out kor

  • 007-TheBond

    This Script will help you to gather information about your victim or friend.

  • ail-framework

    AIL framework - Analysis Information Leak framework

  • Project mention: Ask HN: Show me your half baked project | news.ycombinator.com | 2023-10-12

    First time coming across this, looks very cool! Definitely some ideas there that I'd like to implement for osintbuddy. Another project I'm going to be taking some ideas from is: https://github.com/ail-project/ail-framework - a modular framework to analyse potential information leaks

  • RomBuster

    RomBuster is a router exploitation tool that discloses the network router admin password.

  • medaCy

    🏥 Medical Text Mining and Information Extraction with spaCy

  • MedCAT

    Medical Concept Annotation Tool

  • GoLLIE

    Guideline following Large Language Model for Information Extraction

  • Project mention: A LLM trained to follow annotation guidelines, for information extraction tasks | news.ycombinator.com | 2023-10-30
  • huspacy

    HuSpaCy: industrial-strength Hungarian natural language processing

  • htmldate

    Fast and robust date extraction from web pages, with Python or on the command-line

  • targetedSummarization

    TextReducer - A Tool for Summarization and Information Extraction

  • KIE_invoice_minimal

    Key information extraction from invoice document with Graph Convolution Network

  • IRCP

    A robust information gathering tool for large scale reconnaissance on Internet Relay Chat servers 🛰️ (by internet-relay-chat)

  • Project mention: IRCP: A robust information gathering tool for large scale reconnaissance on Internet Relay Chat servers | /r/netsec | 2023-06-07
  • AdaKGC

    [EMNLP 2023 (Findings)] Schema-adaptable Knowledge Graph Construction

  • Project mention: Schema-adaptable Knowledge Graph Construction | /r/BotNewsPreprints | 2023-05-16

    Conventional Knowledge Graph Construction (KGC) approaches typically follow the static information extraction paradigm with a closed set of pre-defined schema. As a result, such approaches fall short when applied to dynamic scenarios or domains where new types of knowledge emerge. This necessitates a system that can handle evolving schema automatically to extract information for KGC. To address this need, we propose a new task called schema-adaptable KGC, which aims to continually extract entities, relations, and events based on a dynamically changing schema graph without re-training. We first split and convert existing datasets based on three principles to build a benchmark, i.e., horizontal schema expansion, vertical schema expansion, and hybrid schema expansion; we then investigate the schema-adaptable performance of several well-known approaches such as Text2Event, TANL, UIE and GPT-3. We further propose a simple yet effective baseline dubbed AdaKGC, which contains a schema-enriched prefix instructor and schema-conditioned dynamic decoding to better handle evolving schema. Comprehensive experimental results illustrate that AdaKGC can outperform baselines but still has room for improvement. We hope the proposed work can deliver benefits to the community. Code and datasets will be available at https://github.com/zjunlp/AdaKGC.

  • CyberSecurityAuditScript

    Security audit script that cuts information gathering from an average of 5 minutes to 20 seconds and writes everything to a text file.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source information-extraction projects in Python? This list will help you:

#   Project                   Stars
1   PaddleNLP                 11,423
2   DeepKE                    2,929
3   InvoiceNet                2,382
4   kor                       1,501
5   007-TheBond               1,030
6   ail-framework             495
7   RomBuster                 423
8   medaCy                    412
9   MedCAT                    408
10  GoLLIE                    204
11  huspacy                   147
12  htmldate                  106
13  targetedSummarization     87
14  KIE_invoice_minimal       49
15  IRCP                      44
16  AdaKGC                    16
17  CyberSecurityAuditScript  9
