-
Woodpecker
✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs. (by BradyFU)
I was casually sifting through the llama.cpp discussions when I found a particularly interesting conversation.
A core llama.cpp contributor, cmp-nct, stumbled upon what might be the next leap forward for vision/language models. CogVLM (which uses a Vicuna 7B language model combined with a 9B vision tower) excels at OCR (Optical Character Recognition) and detail detection, with minimal hallucinations. It effectively understands both handwritten and typed text, context, fine details, and background graphics. It even provides pixel coordinates for small visual targets. CogVLM surpasses other models like llava-1.5 and Qwen-VL in performance.
Woodpecker: Hallucination Correction for Multimodal Large Language Models https://github.com/BradyFU/Woodpecker