Python unstructured-data

Open-source Python projects categorized as unstructured-data

Top 3 Python unstructured-data Projects

  • towhee

    Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.

    Project mention: What Is DocArray? | 2022-10-02

    The description of this is kind of confusing, but I think the easiest way to understand it is as a data processing pipeline of sorts: take unstructured data and apply transformation and computation. A similar project is Towhee. That project tries to simplify unstructured data processing and provides pretrained models and pipelines from its hub.

  • docarray

    🧬 The data structure for multimodal data · Neural Search · Vector Search · Document Store

    Project mention: [P] Code super clean multi-modal PyTorch models and easily serve them through FastAPI, using DocArray | 2023-01-19

    And last but not least: The features above are part of DocArray v2, which is currently in alpha and can be found here:

  • bootcamp

    Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc. (by milvus-io)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-01-19.


What are some of the best open-source unstructured-data projects in Python? This list will help you:

#  Project   Stars
1  towhee    1,772
2  docarray  1,636
3  bootcamp  1,033