Show HN: Running LLMs in one line of Python without Docker

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • leptonai

    A Pythonic framework to simplify AI service building

  • Hello Hacker News! We're Yangqing, Xiang and JJ from lepton.ai. We are building a platform to run any AI model as easily as writing local code, and to get your favorite models running in minutes. It's like containers for AI, but without the hassle of actually building a Docker image.

    We built and contributed to some of the world's most popular AI software - PyTorch 1.0, ONNX, Caffe, etcd, Kubernetes, etc. We also managed hundreds of thousands of computers in our previous jobs. Along the way we found that the AI software stack is usually unnecessarily complex, and we want to change that.

    Imagine you are a developer who sees a good model on GitHub or Hugging Face. To make it a production-ready service, the current solution usually requires you to build a Docker image. But think about it - all I have is a bit of Python code and a few Python dependencies. That sounds like a huge overhead, right?

    lepton.ai is really a Pythonic way to free you from such difficulties. You write a simple Python scaffold around your PyTorch / TensorFlow code, and Lepton launches it as a full-fledged service callable via Python, JavaScript, or any language that understands OpenAPI (see the scaffold sketch after the snippet below). We use containers under the hood, but you don't need to worry about all the infrastructure nuts and bolts.

    We have made the Python library open source at https://github.com/leptonai/leptonai/. With it, launching a common Hugging Face model is a one-liner. For example, if you have a GPU, Stable Diffusion XL is as simple as:

    ```
    # A minimal sketch based on the leptonai README; the exact CLI flags
    # may differ across versions.
    pip install -U leptonai
    lep photon runlocal -n sdxl -m hf:stabilityai/stable-diffusion-xl-base-1.0
    ```
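
    As a concrete illustration of the Python scaffold mentioned above, here is a counter photon in the style of the leptonai README; treat it as a sketch, since the decorator and method names reflect the library as documented and may have evolved:

    ```
    from leptonai.photon import Photon

    class Counter(Photon):
        # init() runs once when the service starts; put model loading
        # and other setup here.
        def init(self):
            self.counter = 0

        # Each handler becomes an HTTP endpoint; the type hints drive
        # the generated OpenAPI schema.
        @Photon.handler
        def add(self, x: int) -> int:
            self.counter += x
            return self.counter
    ```

    Once the photon is running locally, calling it from Python takes two lines (Client and local live in leptonai.client; handlers appear as methods on the client):

    ```
    from leptonai.client import Client, local

    c = Client(local())  # connect to the photon running on localhost
    print(c.add(x=3))    # invokes the "add" endpoint over HTTP
    ```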

  • examples

    Lepton Examples (by leptonai)

  • It's not only about building a Docker image, but also about maintaining multiple models, multiple environments, and a lot of users. Imagine a group of engineers, each needing to deploy their own model: one needs TensorFlow 1.x, one needs TensorFlow 2.x, one needs PyTorch, and one needs a very strange combination of dependencies (a sketch of how each photon pins its own dependencies follows below). Trust me, things get complex very easily:

    https://github.com/leptonai/examples/blob/main/advanced/whis...

    I definitely agree that for a fixed use case, building a Docker image once and for all is probably the simplest and best approach. However, things quickly get more complex and out of hand.

    Also, the basic plan is free for independent developers - isn't it worth it? :)
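
    To make the isolation point concrete, here is a minimal sketch of how a photon can pin its own dependencies, assuming the requirement_dependency attribute documented for leptonai photons (the service and the package pin are illustrative):

    ```
    from leptonai.photon import Photon

    class EchoService(Photon):
        # Each photon declares its own pip requirements, so services with
        # conflicting stacks (TensorFlow 1.x vs 2.x vs PyTorch) can coexist
        # without sharing one global environment. Attribute name per the
        # leptonai docs; the pin below is illustrative.
        requirement_dependency = ["transformers>=4.30"]

        @Photon.handler
        def echo(self, text: str) -> str:
            return text
    ```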
