jc
CLI tool and python library that converts the output of popular command-line tools, file-types, and common strings to JSON, YAML, or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts.
Use the ccextractor CLI in -out=report mode. Of ffprobe, mediainfo, and ccextractor, ccextractor is the most comprehensive tool for caption analysis. It includes a caption debug mode and extraction/conversion capabilities, and the project accepts pull requests.
Unfortunately, ccextractor's output is not structured JSON like that of ffprobe or mediainfo, so you will have to parse the unstructured text output. Still, ccextractor does a really good job of caption analysis, and the ccextractor team is really responsive. If you want JSON output from ccextractor and have the appetite, jc is a cool project for writing and contributing custom parsers for unstructured CLI output. I have not seen a parser for ccextractor, but I believe jc lets you write your own custom parser for various tools. If you do want to write a JSON parser with jc (or whatever), the ccextractor project has an open ticket for JSON output.
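To give a rough idea of what parsing the text report might look like, here is a minimal Python sketch. It assumes the report is mostly made of "Key: Value" lines, which I have not verified against every ccextractor version, so treat the layout as an assumption. The function names (parse_report, report_to_json, run_report) are made up for illustration and are not part of ccextractor or jc. The -out=report flag is the one mentioned above; newer builds may spell it differently.

```python
import json
import subprocess


def parse_report(text):
    """Turn 'Key: Value' lines into a dict (assumed report layout).

    Lines without a colon, or with an empty key, are skipped.
    Only the first colon splits, so values like 00:01:02 survive.
    """
    report = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue
        report[key.strip()] = value.strip()
    return report


def report_to_json(text):
    """Serialize the parsed report as pretty-printed JSON."""
    return json.dumps(parse_report(text), indent=2)


def run_report(path, binary="ccextractor"):
    """Run ccextractor in report mode and parse its stdout.

    Assumes the binary is on PATH and accepts -out=report.
    """
    result = subprocess.run(
        [binary, "-out=report", path],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_report(result.stdout)
```

Something like run_report("sample.ts") would then give you a dict you can dump to JSON or pipe into jq. A real jc parser would need more care with nested sections and repeated keys, which this flat sketch simply overwrites.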