tika-python

Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community. (by chrismattmann)

Tika-python Alternatives

Similar projects and alternatives to tika-python

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better tika-python alternative or higher similarity.

tika-python reviews and mentions

Posts with mentions or reviews of tika-python. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-07-19.
  • Document Parsing - an unsolved problem?
    5 projects | /r/LanguageTechnology | 19 Jul 2022
    At my previous job we had the same problem, which we solved by using Tika. We called it on the server along with other stuff, but there is also a Python binding.
  • Extract text from PDF
    7 projects | /r/Python | 2 Nov 2021
    Tika is from Apache, so yes, its original code base is Java, but it has bindings in other languages. Check out Tika-Python!
  • Extract text from documents
    5 projects | dev.to | 27 Mar 2021
    The Textractor instance is the main entrypoint for extracting text. This method is backed by Apache Tika, a robust text extraction library written in Java. Apache Tika has support for a large number of file formats: PDF, Word, Excel, HTML and others. The Python Tika package automatically installs Tika and starts a local REST API instance used to read extracted data.
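
The text extraction described in the mentions above can be sketched with tika-python's `parser.from_file`, which returns a dictionary that includes `content` and `metadata` keys. This is a minimal sketch, not the method used in any of the quoted posts: the helper name `extract_text` and its `parse` parameter are hypothetical, added here so the function can be exercised without a running Tika server. On first real use the package may download the Tika server jar and start a local REST instance, which requires Java.

```python
from typing import Callable, Optional


def extract_text(path: str, parse: Optional[Callable[[str], dict]] = None) -> str:
    """Return the plain text Tika extracts from the file at `path`.

    `parse` is a hypothetical hook for illustration: any callable that takes
    a path and returns a dict shaped like tika's parser output.
    """
    if parse is None:
        # Imported lazily so this module loads even without tika installed.
        # The real call talks to a local Tika REST server (needs Java).
        from tika import parser
        parse = parser.from_file
    parsed = parse(path)
    # "content" can be None when Tika finds no text in the document.
    return (parsed.get("content") or "").strip()
```

Calling `extract_text("report.pdf")` (the file name is a placeholder) uses the real Tika backend, while passing a stub as `parse` keeps the function testable offline.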

Stats

Basic tika-python repo stats
4
1,411
2.2
10 days ago
