-
konfuzio-sdk
OCR, extract and classify documents. In addition, annotate documents and build your own NLP and Computer Vision models using Python by downloading the data. Find examples in our Colab Notebooks, e.g. how to fine-tune Flair.
The Flair model requires the training data to follow the BIO scheme and to be saved in a text file. To convert the visual annotations to the BIO scheme, we only need the start and end offsets of each annotation and its label. This conversion can be done with the get_text_in_bio_scheme() method of the Document class.
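To illustrate the idea behind that conversion, here is a minimal, SDK-independent sketch in plain Python. It is not the implementation of get_text_in_bio_scheme(); the helper names spans_to_bio and to_bio_lines, the sample text, and the labels "Company" and "Date" are hypothetical. It assumes whitespace tokenization and annotations given as (start offset, end offset, label) with an exclusive end offset.

```python
import re
from typing import List, Optional, Tuple

Span = Tuple[int, int, str]  # (start_offset, end_offset_exclusive, label)


def spans_to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Tag every whitespace-delimited token with a BIO label (hypothetical helper)."""
    tagged: List[Tuple[str, str]] = []
    previous_span: Optional[Span] = None
    for match in re.finditer(r"\S+", text):
        tag = "O"  # default: token lies outside every annotation
        current_span: Optional[Span] = None
        for span in spans:
            start, end, label = span
            # A token belongs to an annotation if their character ranges overlap.
            if match.start() < end and match.end() > start:
                current_span = span
                # First token of an annotation gets B-, following tokens get I-.
                prefix = "I" if span == previous_span else "B"
                tag = f"{prefix}-{label}"
                break
        tagged.append((match.group(), tag))
        previous_span = current_span
    return tagged


def to_bio_lines(tagged: List[Tuple[str, str]]) -> str:
    """Render one 'token tag' pair per line, ready to be written to a text file."""
    return "\n".join(f"{token} {tag}" for token, tag in tagged)


# Made-up example document and annotations (assumptions for illustration only).
text = "Invoice from Acme Corp dated 2021-03-01"
spans = [(13, 22, "Company"), (29, 39, "Date")]

tagged = spans_to_bio(text, spans)
print(to_bio_lines(tagged))
```

Here "Acme" is tagged B-Company and "Corp" I-Company because both tokens fall inside the same annotation span, while tokens outside any span are tagged O. The string returned by to_bio_lines can then be saved to a text file; whether its exact layout matches what the SDK method produces is an assumption to verify against the SDK source.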
Find the source code here: https://github.com/konfuzio-ai/document-ai-python-sdk/blob/b...
Many other file types are supported. Have a look at https://dev.konfuzio.com/web/api.html#supported-file-types