Universal APIs for unstructured data. Sync documents from SaaS tools to a SQL or vector database, where they can be easily queried by AI applications [Moved to: https://github.com/psychic-api/psychic] (by ai-sidekick)
We’re in the process of making everything open source (there are some contractual issues we’re working through), but our client side code and basic infra is here: https://github.com/ai-sidekick/sidekick.
Providing technical support to developers has been expensive for companies because they need to hire skilled engineers to do it. We’ve seen community support channels with a 2000:1 ratio of developers to support engineers - there’s no way every question will get answered. We built Sidekick to make this much easier. It’s particularly helpful for open-source companies/projects because many OSS communities have a lot of people asking questions, but hardly anyone helping troubleshoot.
We integrate with Slack and Discord, since that’s where developer support is already happening. On the backend, we use Weaviate to index the data and OpenAI’s text-davinci-003 model to generate responses.
In addition to answering questions, Sidekick can also update .md files automatically with new information. When someone reacts to a message in Slack with the emoji, Sidekick will use Weaviate to find the part of your documentation that’s most related to the message, then use GPT-3 to merge the new info into the documentation. Finally, it will submit a pull request on GitHub with the changes. We saw that devrel teams are already making product announcements and helping users troubleshoot common issues in the community, so we built this feature to save them even more time.
We use GPT for generating the responses and new documentation, but are relying less and less on it after learning that you hit a ceiling on answer quality very quickly by using only GPT and prompt engineering techniques. Here’s some of what we learned trying to prevent hallucinations in our answers: https://medium.com/@jfan001/how-we-cut-the-rate-of-gpt-hallu...
What we found makes a much bigger difference is the breadth and quality of the content you can search through. That is why we now rely a lot more on cleaning and annotating data, which yields far better results when combined with prompt chaining. For example, instead of naively chunking data into 1000-token blocks, we parse the markdown into semantically meaningful sections (e.g. paragraphs, lists, code blocks) and tag each one with its header name and document name. That way a chunk is more likely to surface for searches that match the section it comes from, even if the query doesn’t exactly match the content of that chunk.
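To make the chunking idea concrete, here is a minimal sketch in Python. It is not Sidekick's actual parser: the `Chunk` class, the `chunk_markdown` function, and the `searchable_text` format are hypothetical names chosen for illustration, and it only handles ATX headings (`#`), fenced code blocks, and blank-line-separated blocks.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

FENCE = "`" * 3  # marker that opens and closes a fenced code block


@dataclass
class Chunk:
    document: str  # name of the source file (hypothetical field)
    header: str    # nearest heading above the block, "" if none
    kind: str      # "paragraph", "list", or "code"
    text: str

    def searchable_text(self) -> str:
        # Prepend the tags so a search can also match on the section name.
        return f"{self.document} > {self.header}\n{self.text}"


def chunk_markdown(document: str, markdown: str) -> List[Chunk]:
    """Split markdown into blocks and tag each with its document and header."""
    chunks: List[Chunk] = []
    header = ""
    block: List[str] = []
    in_code = False

    def flush(kind: Optional[str] = None) -> None:
        # Emit the lines collected so far as one chunk, if there are any.
        if not block:
            return
        if kind is None:
            is_list = re.match(r"\s*([-*+]|\d+\.)\s", block[0])
            kind = "list" if is_list else "paragraph"
        chunks.append(Chunk(document, header, kind, "\n".join(block)))
        block.clear()

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            if in_code:
                block.append(line)   # closing fence ends the code chunk
                flush("code")
                in_code = False
            else:
                flush()              # finish whatever came before the code
                block.append(line)
                in_code = True
        elif in_code:
            block.append(line)       # keep code lines verbatim
        elif (match := re.match(r"#{1,6}\s+(.*)", line)):
            flush()
            header = match.group(1).strip()
        elif not line.strip():
            flush()                  # a blank line ends a paragraph or list
        else:
            block.append(line)
    flush("code" if in_code else None)  # handle an unterminated code block
    return chunks
```

Indexing `searchable_text()` rather than the raw block text is one way to let the header and document name influence retrieval; whether the tags belong in the embedded text or in separate metadata fields is a design choice this sketch does not settle.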
One fun thing we also learned is that when Sidekick gets added to a #help channel, people who otherwise wouldn’t ask questions start using it. It turns out, there are a lot of “lurkers” who come to these channels to find answers, but don’t want to bother anyone with their issue. Adding a tool that they can get answers from instantly brings these people out into the community, giving founders and community managers an opportunity to reach out to them.
To summarize, Sidekick 1) saves support engineers time, 2) keeps the docs up to date and 3) helps engage developers in the community. Long-term we want to provide an analytics product on top of Sidekick so companies can understand how their product is being used and where there are opportunities to add more value to their customers (and charge more money for it).
We’d love to hear from the HN community about this product! Do you think using a tool to search through and update developer docs from Slack would save you time?
GPT-powered chat for documentation, chat with your documents
https://github.com/arc53/DocsGPT was on the front page just a month ago.
There are some other similar ones too, but it's good to see projects taking an OSS approach; there are only a few.
Simple, efficient background processing for Ruby
What's more, there is already a popular project named Sidekiq, much loved in the Ruby community - https://github.com/sidekiq/sidekiq.
Nice, I was playing with documentation Q&A with GPT in https://github.com/rmuller-ml/gpt_table_of_contents_doc_qa and found that if the documents you are doing semantic search on are structured, you can prompt GPT with the table of contents so GPT does the chunk prefiltering for you. I guess this is similar to your tagging explanation. Nice work
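As a rough illustration of the table-of-contents prefiltering described in this comment, here is a sketch in Python. It is not taken from the linked repository: the function names, the prompt wording, and the `ask_model` callable (a stand-in for whatever completion API you use) are all assumptions.

```python
import re
from typing import Callable, Dict, List


def build_toc_prompt(question: str, toc: List[str]) -> str:
    """Show the model a numbered table of contents and ask it to pick sections."""
    numbered = "\n".join(f"{i}. {title}" for i, title in enumerate(toc, start=1))
    return (
        "Here is the table of contents of the documentation:\n"
        f"{numbered}\n\n"
        f"Question: {question}\n"
        "Reply with the numbers of the sections most likely to contain "
        "the answer, separated by commas."
    )


def parse_section_numbers(reply: str, toc_size: int) -> List[int]:
    """Extract valid, de-duplicated section numbers from the model's reply."""
    picked: List[int] = []
    for token in re.findall(r"\d+", reply):
        number = int(token)
        if 1 <= number <= toc_size and number not in picked:
            picked.append(number)
    return picked


def prefilter_sections(
    question: str,
    sections: Dict[str, str],          # section title -> section text
    ask_model: Callable[[str], str],   # hypothetical wrapper around a model call
) -> Dict[str, str]:
    """Keep only the sections the model selected from the table of contents."""
    toc = list(sections)
    reply = ask_model(build_toc_prompt(question, toc))
    chosen = parse_section_numbers(reply, len(toc))
    return {toc[n - 1]: sections[toc[n - 1]] for n in chosen}
```

Semantic search would then run only over the returned sections. Passing the model call in as a function keeps the sketch independent of any particular client library and makes it easy to exercise with a canned reply.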
3 one-person million dollar online businesses
2 projects | /r/Business_Ideas | 4 Dec 2023
Exploring concurrent rate limiters, mutexes, semaphores
2 projects | dev.to | 11 Sep 2023
Sidekiq and managing resumable jobs?
2 projects | /r/rails | 24 May 2023
Organize Business Logic in Your Ruby on Rails Application
4 projects | dev.to | 17 May 2023
How to mitigate being rate limited by a third party API?
1 project | /r/rails | 5 May 2023