shining the spotlight on CogVLM

llama.cpp

773 57,463 10.0 C++

LLM inference in C/C++

I was casually sifting through the llama.cpp discussions when I found a particularly interesting conversation.

CogVLM

16 5,062 9.0 Python

a state-of-the-art-level open visual language model | multimodal pre-trained model

A core llama.cpp contributor, cmp-nct, stumbled upon what might be the next leap forward for vision/language models. CogVLM (which combines a Vicuna 7B language model with a 9B vision tower) excels at OCR (Optical Character Recognition) and detail detection, and it hallucinates very little. It handles both handwritten and typed text, and it picks up on context, fine details, and background graphics. It even provides pixel coordinates for small visual targets. CogVLM outperforms other models such as llava-1.5 and Qwen-VL.

Woodpecker

2 539 8.9 Python

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs. (by BradyFU)

Woodpecker: Hallucination Correction for Multimodal Large Language Models https://github.com/BradyFU/Woodpecker

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

IT Employment Grew by Just 700 Jobs in 2023, Down From 267,000 in 2022
2 projects | news.ycombinator.com | 8 Jan 2024

Show HN: I built an open source AI video search engine to learn more about AI
2 projects | news.ycombinator.com | 19 Dec 2023

CogAgent-18B – visual-based GUI Agent capabilities
2 projects | news.ycombinator.com | 16 Dec 2023

What do you think. When should we expect the next SDXL version?
1 project | /r/StableDiffusion | 10 Dec 2023

Gemini: Google's most capable AI model yet
2 projects | news.ycombinator.com | 6 Dec 2023

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA
Tags: cross-modality, hallucination, language-model, hallucinations, multi-modal
Post date: 9 Dec 2023
