The Unreasonable Effectiveness of an LLM Agent Loop with Tool Use

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  1. nanoagent

    The nano framework for AI agents. Pure TypeScript, no dependencies, 100x smaller than langchain.

    Yes, agent loops are simple, except, as the article says, a bit of "pump and circumstance"!

    If anyone is interested, I tried to put together a minimal, dependency-free library for TypeScript: https://github.com/hbbio/nanoagent
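
    For reference, the core of such a loop really is small. Below is a minimal sketch in Python against an OpenAI-style chat-completions API (not nanoagent's actual TypeScript API); the model name and the single read_file tool are illustrative assumptions.

      # Minimal agent loop: call the model, run any tools it requests,
      # feed the results back, and stop once it answers without tool calls.
      import json
      from openai import OpenAI

      client = OpenAI()

      def read_file(path: str) -> str:
          """Illustrative tool: return the contents of a local text file."""
          with open(path, "r", encoding="utf-8") as f:
              return f.read()

      TOOLS = [{
          "type": "function",
          "function": {
              "name": "read_file",
              "description": "Read a UTF-8 text file from disk.",
              "parameters": {
                  "type": "object",
                  "properties": {"path": {"type": "string"}},
                  "required": ["path"],
              },
          },
      }]

      def agent_loop(task: str, model: str = "gpt-4o") -> str:
          messages = [{"role": "user", "content": task}]
          while True:
              reply = client.chat.completions.create(
                  model=model, messages=messages, tools=TOOLS
              ).choices[0].message
              messages.append(reply)
              if not reply.tool_calls:       # no tool requested: we're done
                  return reply.content
              for call in reply.tool_calls:  # execute each requested tool
                  args = json.loads(call.function.arguments)
                  messages.append({
                      "role": "tool",
                      "tool_call_id": call.id,
                      "content": read_file(**args),
                  })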

  2. apply-llm-changes

    A command-line tool that applies file changes from LLM output to your local filesystem.

    I avoid the cost of API access by using the chat UI instead, in my case Google Gemini 2.5 Pro with its large context window. I pack the whole repo with Repomix, paste it in with a standard prompt saying "return full source" (it tends to stop following this instruction after a few back-and-forths), and then apply the result back on top of the repo (I vibe-coded https://github.com/radekstepan/apply-llm-changes to help me with that).
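
    The "apply the result back on top of the repo" step boils down to parsing file blocks out of the model's reply and writing them to disk. A rough Python sketch of the idea (not apply-llm-changes' actual format or CLI), assuming the model labels each fenced block with a preceding "File: <path>" line:

      # Sketch: write files from an LLM reply in which every fenced code
      # block is preceded by a "File: <path>" line. That labeling convention
      # is an assumption for illustration, not the tool's real input format.
      import pathlib
      import re
      import sys

      BLOCK_RE = re.compile(
          r"File:\s*(?P<path>\S+)\s*\n```[^\n]*\n(?P<body>.*?)\n```",
          re.DOTALL,
      )

      def apply_changes(llm_reply: str, repo_root: str = ".") -> list[str]:
          written = []
          for match in BLOCK_RE.finditer(llm_reply):
              target = pathlib.Path(repo_root, match["path"])
              target.parent.mkdir(parents=True, exist_ok=True)
              target.write_text(match["body"] + "\n", encoding="utf-8")
              written.append(str(target))
          return written

      if __name__ == "__main__":
          # Usage: python apply_changes.py < reply.md
          print("\n".join(apply_changes(sys.stdin.read())))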

  3. litellm

    Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

    Confession time: litellm still doesn't support parallel tool calls with Gemini models [https://github.com/BerriAI/litellm/issues/9686], so we wrote our own "parallel tool calls" on top of Structured Output. It did take a few iterations on the prompt design, but all of it was "yeah, I can see why that was ambiguous" kinds of things; no real complaints.

    Gemini Pro 2.5 (GP2.5) does have a different flavor than Sonnet 3.7, but it's hard to say that one is better or worse than the other. GP2.5 is, I would say, a bit more aggressive at doing "speculative" tool execution in parallel with the architect, e.g. spawning multiple search-agent calls at the same time, which for Brokk is generally a good thing, but I could see use cases where you'd want to dial that back.
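
    A hedged sketch of that workaround: ask the model for a JSON object listing the tool calls it wants (via litellm's structured-output support) and fan the calls out with a thread pool. The model name, prompt, schema, and stub tools below are illustrative, not the commenter's actual implementation.

      # "Parallel tool calls" built on structured output rather than the
      # native tool-calling API. Assumes litellm is installed and the
      # relevant API key (e.g. GEMINI_API_KEY) is set.
      import json
      from concurrent.futures import ThreadPoolExecutor

      import litellm

      TOOLS = {  # stub tools for illustration
          "search_code": lambda query: f"results for {query!r}",
          "read_file": lambda path: open(path, encoding="utf-8").read(),
      }

      PROMPT = """You may request several tool calls at once.
      Reply ONLY with JSON of the form:
      {"calls": [{"tool": "<name>", "args": {...}}, ...]}
      Available tools: search_code(query), read_file(path).
      Task: %s"""

      def parallel_tool_calls(task: str, model: str = "gemini/gemini-2.5-pro"):
          response = litellm.completion(
              model=model,
              messages=[{"role": "user", "content": PROMPT % task}],
              response_format={"type": "json_object"},  # structured output
          )
          calls = json.loads(response.choices[0].message.content)["calls"]
          with ThreadPoolExecutor() as pool:  # execute every call concurrently
              futures = [pool.submit(TOOLS[c["tool"]], **c["args"]) for c in calls]
              return [f.result() for f in futures]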

  4. OpenHands

    🙌 OpenHands: Code Less, Make More

  5. PocketFlow-Tutorial-Cursor

    Pocket Flow Tutorial Project: Build Cursor with Cursor

    There's also this one, which uses PocketFlow, a graph-abstraction library, to create something similar [0]. I've been using it myself and love its simplicity.

    [0] https://github.com/The-Pocket/PocketFlow-Tutorial-Cursor/blo...
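
    For a feel of what a graph abstraction buys you over a bare while-loop, here is a generic sketch (deliberately not PocketFlow's real API): each node does one step and names the next node, so "plan -> act -> review" becomes data rather than hard-coded control flow.

      # Generic graph-of-nodes sketch: each node mutates shared state and
      # returns the name of the next node, or None to stop.
      from typing import Callable, Dict, Optional

      State = dict
      Node = Callable[[State], Optional[str]]

      def run_flow(nodes: Dict[str, Node], start: str, state: State) -> State:
          current: Optional[str] = start
          while current is not None:
              current = nodes[current](state)
          return state

      # Example wiring: a tiny plan -> act -> review loop.
      def plan(state):
          state["plan"] = f"steps for {state['task']}"
          return "act"

      def act(state):
          state.setdefault("done", []).append("did a step")
          return "review"

      def review(state):
          return None if len(state["done"]) >= 2 else "act"

      result = run_flow({"plan": plan, "act": act, "review": review},
                        start="plan", state={"task": "build Cursor with Cursor"})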

  6. llm-min.txt

    Min.js Style Compression of Tech Docs for LLM Context

    The default chat interface is the wrong tool for the job.

    The LLM needs context.

    https://github.com/marv1nnnnn/llm-min.txt

    The LLM is a problem solver, but not a repository of documentation. It still needs to look documentation up, just like human developers do.

    You could use o3 and ask it to search the web for documentation and read that first, but it's not efficient. Proper LLM coding-assistant tools manage the context for you.
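
    In practice, "manage the context" can be as simple as prepending a compressed documentation file to the system prompt. A minimal sketch, assuming you have already generated an llm-min.txt for the library in question (the OpenAI-style client call is a generic example, not part of llm-min.txt itself):

      # Sketch: feed pre-compressed documentation to the model as context
      # instead of relying on whatever it memorized during training.
      from pathlib import Path
      from openai import OpenAI

      client = OpenAI()
      docs = Path("llm-min.txt").read_text(encoding="utf-8")  # compressed docs

      answer = client.chat.completions.create(
          model="gpt-4o",  # illustrative model choice
          messages=[
              {"role": "system",
               "content": "Answer using the library documentation below.\n\n" + docs},
              {"role": "user", "content": "How do I initialize the client?"},
          ],
      )
      print(answer.choices[0].message.content)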

  7. clickclickclick

    A framework to enable autonomous android and computer use using any LLM (local or remote)

    I built android-use [1] using an LLM. It is pretty good at self-healing thanks to the "loop": it constantly checks whether the current step actually made progress or regressed, and then determines the next step. And the thing is, none of it is explicitly coded; it's just a nudge in the prompts.

    1. clickclickclick - A framework to let local LLMs control your android phone (https://github.com/BandarLabs/clickclickclick)
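
    The "progress or regress" check can itself just be another model call. A hedged sketch of that pattern (illustrative prompts and stubbed screen/tap helpers, not clickclickclick's actual code):

      # Self-healing loop sketch: after each action, ask the model whether
      # the last step made progress and what to do next.
      import json
      from openai import OpenAI

      client = OpenAI()

      def screen_state() -> str:
          return "stub: description of the current screen"  # e.g. UI dump / OCR

      def perform(action: dict) -> None:
          print("performing", action)  # e.g. an adb tap or swipe

      def run(goal: str, max_steps: int = 20) -> None:
          history = []
          for _ in range(max_steps):
              prompt = (
                  f"Goal: {goal}\nHistory: {json.dumps(history)}\n"
                  f"Screen: {screen_state()}\n"
                  'Did the last step make progress? Reply as JSON: '
                  '{"progress": <bool>, "done": <bool>, "next_action": {...}}'
              )
              verdict = json.loads(
                  client.chat.completions.create(
                      model="gpt-4o",  # illustrative; the repo also supports local LLMs
                      messages=[{"role": "user", "content": prompt}],
                      response_format={"type": "json_object"},
                  ).choices[0].message.content
              )
              if verdict.get("done"):
                  return
              perform(verdict["next_action"])
              history.append({"action": verdict["next_action"],
                              "progress": verdict["progress"]})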

  8. SWE-bench

    SWE-bench [Multimodal]: Can Language Models Resolve Real-world Github Issues?

    “Claude Code, better than Sourcegraph, better than Augment Code.”

    That’s a pretty bold claim; how come you are not at the top of this list, then? https://www.swebench.com/

    “Use frontier models like o3, Gemini Pro 2.5, Sonnet 3.7”

  9. claude-task-master

    An AI-powered task-management system you can drop into Cursor, Lovable, Windsurf, Roo, and others.

    No manual copy-paste; that is not a good use of time. I work in a git repo and point multiple LLMs at it.

    One LLM reviews existing code and the new requirement and then creates a PRD. I usually use Augment Code for this because it has a good index of all local code.

    I then ask Google Gemini to review the PRD, validate it, and find ways to improve it. I then ask Gemini to create a comprehensive implementation plan; it frequently produces a 13-step plan. It would usually take me a month to do this work.

    I then start a new session of Augment Code and feed it the PRD and one of the 13 tasks at a time. Whatever work it does, it checks into a feature branch with a detailed git commit message. I then ask Gemini to review the output of each task and provide feedback; it frequently finds issues with the implementation or areas for improvement.

    All of this is managed through git. I make the LLMs use git; I think I would go insane if I had to copy/paste this much stuff.

    I have a recipe of prompts that I copy/paste. I am trying to find ways to cut that down and am making slow progress. There are tools like "Task Master" (https://github.com/eyaltoledano/claude-task-master) that do a good job of automating this workflow. However, the tool doesn't allow much customization, e.g. having the LLMs review each other's work.

    But, maybe I can get LLMs to customize that part for me...
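
    The part of that workflow that is easiest to script is the inner loop: for each task in the plan, have one model implement on a feature branch, commit, then have a second model review the diff. A hedged sketch (the model names, prompts, and file-dump placeholder are assumptions, not Task Master's or Augment Code's behavior):

      # Per-task loop sketch: implement with one model, commit to a feature
      # branch, then review the resulting diff with another model.
      import subprocess
      from pathlib import Path

      import litellm

      def git(*args: str) -> str:
          return subprocess.run(["git", *args], check=True,
                                capture_output=True, text=True).stdout

      def ask(model: str, prompt: str) -> str:
          resp = litellm.completion(model=model,
                                    messages=[{"role": "user", "content": prompt}])
          return resp.choices[0].message.content

      def run_plan(prd: str, tasks: list[str]) -> None:
          git("checkout", "-b", "feature/llm-plan")
          for i, task in enumerate(tasks, start=1):
              edits = ask("anthropic/claude-3-7-sonnet-20250219",  # implementer (assumed id)
                          f"PRD:\n{prd}\n\nImplement task {i}: {task}")
              # Placeholder for actually applying the edits to the worktree;
              # here the raw model output is saved so there is something to commit.
              Path(f"task_{i:02d}_changes.md").write_text(edits, encoding="utf-8")
              git("add", "-A")
              git("commit", "-m", f"Task {i}: {task}")
              review = ask("gemini/gemini-2.5-pro",  # reviewer (assumed id)
                           f"Review this diff for task {i}:\n{git('show', 'HEAD')}")
              print(f"--- review of task {i} ---\n{review}")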

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Markdown for the AI Era

    1 project | news.ycombinator.com | 13 Jul 2025
  • Agentmark

    1 project | news.ycombinator.com | 10 Jul 2025
  • Show HN: A modern C++20 AI SDK (GPT‑4o, Claude 3.5, tool‑calling)

    4 projects | news.ycombinator.com | 29 Jun 2025
  • In-memory Semantic caching for LLMs written in Rust

    1 project | news.ycombinator.com | 16 Jun 2025
  • RubyLLM 1.3.0: Just When You Thought the Developer Experience Couldn't Get Any Better 🎉

    1 project | dev.to | 3 Jun 2025