Hi HN - I made texify to convert equations to markdown/LaTeX for my project marker [1] then realized it could be generally useful.
Texify converts equations and surrounding text to Markdown, with embedded LaTeX (MathJax compatible).
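If you want to post-process that kind of output, here is a minimal sketch of pulling the math back out of the Markdown. It assumes the common MathJax-style delimiters, `$...$` for inline and `$$...$$` for block math. That delimiter choice and the `extract_math` helper are assumptions for illustration, not something taken from the texify code, so check real output before relying on it.

```python
import re

# Assumption: block math is wrapped in $$...$$ and inline math in $...$.
# The block alternative comes first so "$$" is never read as two inline delimiters.
MATH_PATTERN = re.compile(r"\$\$(.+?)\$\$|\$(.+?)\$", re.DOTALL)


def extract_math(markdown: str) -> list[tuple[str, str]]:
    """Return (kind, latex) pairs in document order; kind is 'block' or 'inline'."""
    found = []
    for match in MATH_PATTERN.finditer(markdown):
        block, inline = match.group(1), match.group(2)
        if block is not None:
            found.append(("block", block.strip()))
        else:
            found.append(("inline", inline.strip()))
    return found


# Hypothetical sample text, written by hand rather than produced by a model.
sample = "The area is $\\pi r^2$ and the identity is\n\n$$e^{i\\pi} + 1 = 0$$"

for kind, latex in extract_math(sample):
    print(kind, latex)
```

This sketch does not handle escaped dollar signs (`\$`) or dollar amounts in ordinary text, so it is only a starting point.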
You can either use a GUI to select equations (inline or block) from PDFs and images to convert, or use the CLI to batch convert images. It works on CPU, GPU, or MPS (Mac).
The closest open source comparisons are pix2tex and nougat - texify is more accurate than both of them for this task. However, nougat is designed for entire pages, and pix2tex is designed for block equations (not inline equations and text).
I trained texify for 2 days on 4x A6000 GPUs - I was pleasantly surprised how far I could get with limited GPU resources by reframing the problem to use small parameter counts/images.
Texify is licensed for commercial use, with the weights under CC-BY-SA 4.0. Find them here: https://huggingface.co/vikp/texify
See the texify repo for more details, benchmarks, how to install, etc.
[1] https://github.com/VikParuchuri/marker