Similar projects and alternatives to flower
Federated gradient boosted decision tree learning
Minimum docker/fastapi/celery/flower setup
An Industrial Grade Federated Learning Framework
FedScale is a scalable and extensible open-source federated learning (FL) platform.
A python framework for risk scoring
PyTorch for benchmarking communication-efficient distributed SGD optimization algorithms
Early-stage b-rep CAD kernel, written in the Rust programming language.
gpt4all: an ecosystem of open-source chatbots trained on a massive collection of clean assistant data including code, stories and dialogue
The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
FauxPilot - an open-source alternative to GitHub Copilot server
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
A list of totally open alternatives to ChatGPT
Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)
Universal APIs for unstructured data. Sync documents from SaaS tools to a SQL or vector database, where they can be easily queried by AI applications [Moved to: https://github.com/psychic-api/psychic] (by ai-sidekick)
AI-controlled Linux Containers
Pretrained remote sensing models for the rest of us. (by moonshinelabs-ai)
flower reviews and mentions
13 projects | /r/dailyainews | 23 May 2023
Flower, an open-source framework for training AI on distributed data. We move the model to the data instead of moving the data to the model. (https://flower.dev/)
20 projects | /r/dailyainews | 23 May 2023
22-Mar-2023
* Adobe unveils creative generative AI model, Firefly, to aid content creation
* Google has begun rolling out early access to its Bard chatbot in the US and UK
* Data Breach At ChatGPT? Users Report Seeing Unknown Conversations On Their Screens
* GPT-4 is available in preview in Azure OpenAI Service
* AI-powered coding assistance REPL that pairs GPT-4 (https://github.com/jiggy-ai/pair)
* Open source alternative to ChatGPT (https://github.com/nichtdax/awesome-totally-open-chatgpt)
* Run 100B+ language models at home, BitTorrent-style (https://petals.ml/)
* Find the most relevant piece of code context. Hover and highlight blocks of code, and the tool will point you to the most relevant pieces of information on git, messaging, and ticketing systems. Finally, it provides a summary with the power of GPT. (https://www.watermelontools.com/)
* Why AI Won't Replace Software Engineers (https://softwarecomplexity.com/why-ai-wont-replace-software-engineers)
23-Mar-2023
* 'The iPhone Moment of AI' Nvidia to Rent Out Supercomputers Behind ChatGPT to Businesses for $37,000 a Month
* Bill Gates calls AI revolutionary, says it can reduce some of the world’s worst inequities
* AI pics of Donald Trump's arrest by 'cop' Joe Biden go viral. “Will we no longer be able to tell what’s real vs what’s fake?” - Eluna AI
* New research shows we can only accurately identify AI writers about 50% of the time. (https://hai.stanford.edu/news/was-written-human-or-ai-tsu)
* FauxPilot - an open-source GitHub Copilot server (https://github.com/fauxpilot/fauxpilot)
* Flower, an open-source framework for training AI on distributed data. We move the model to the data instead of moving the data to the model. (https://flower.dev/)
* OpenAI-Integrated Microsoft Bing Outperforms Google in Page Visits (https://www.gadgets360.com/internet/news/openai-integrated-microsoft-bing-outperforms-google-page-visits-growth-3885069)
* GitHub Copilot X: GitHub Copilot is evolving to bring chat and voice interfaces, support pull requests, answer questions on docs, and adopt OpenAI’s GPT-4 for a more personalized developer experience. (https://github.blog/2023-03-22-github-copilot-x-the-ai-powered-developer-experience/)
* Moonshine - open-source, pretrained ML models for satellite (https://github.com/moonshinelabs-ai/moonshine)
* Mozilla.ai: A startup, and a community, that will build a trustworthy and independent open-source AI ecosystem. Mozilla.ai’s initial focus? Tools that make generative AI safer and more transparent. And, people-centric recommendation systems that don’t misinform or undermine our well-being. (https://blog.mozilla.org/en/mozilla/introducing-mozilla-ai-investing-in-trustworthy-ai/)
* OpenAI’s policies hinder reproducible research on language models (https://aisnakeoil.substack.com/p/openais-policies-hinder-reproducible)
24-Mar-2023
* Adobe has added AI features to Photoshop and Illustrator, while Nvidia has unveiled ‘Picasso’ AI image generation service.
* ChatGPT-owner OpenAI fixes 'significant issue' exposing user chat titles. A bug in an open-source library caused ChatGPT to leak user conversation titles.
* Graphic design platform Canva introduces new generative AI tools
* Gmail for Android, Google Messages to Soon Get Features for AI-Generated Texts
* Apple: Transformer architecture optimized for Apple Silicon (https://github.com/apple/ml-ane-transformers)
* ChatGPT plugins, join waitlist (https://openai.com/blog/chatgpt-plugins)
* Microsoft's paper on OpenAI's GPT-4 had hidden information (https://twitter.com/DV2559106965076/status/1638769434763608064)
* How to use LoRA to fine-tune LLaMA using Alpaca training data (https://replicate.com/blog/fine-tune-alpaca-with-lora)
* Helicone: one-line integration logs the prompts, completions, latencies, and costs of your OpenAI requests (https://github.com/Helicone/helicone)
* RWKV is an RNN with Transformer-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). (https://github.com/BlinkDL/RWKV-LM)
* Open-source retrieval plugin: The open-source retrieval plugin enables ChatGPT to access personal or organizational information sources (with permission). It allows users to obtain the most relevant document snippets from their data sources, such as files, notes, emails or public documentation, by asking questions or expressing needs in natural language. Security considerations: The retrieval plugin allows ChatGPT to search a vector database of content, and add the best results into the ChatGPT session. This means it doesn’t have any external effects, and the main risk is data authorization and privacy. Developers should only add content into their retrieval plugin that they are authorized to use and can share in users’ ChatGPT sessions. (https://github.com/openai/chatgpt-retrieval-plugin)
27-Mar-2023
* Autodoc: Toolkit for auto-generating codebase documentation using LLMs (https://github.com/context-labs/autodoc)
* March 20 ChatGPT outage: Here’s what happened (https://openai.com/blog/march-20-chatgpt-outage)
* Facebook is going after LLaMA repos with DMCA's (https://twitter.com/theshawwn/status/1638925249709240322)
* ChatGPT + Wolfram is INSANE! (https://old.reddit.com/r/ChatGPT/comments/1205omc/chatgpt_wolfram_is_insane/)
* Reproducing the Stanford Alpaca results using low-rank adaptation (LoRA) (https://github.com/chris-alexiuk/alpaca-lora)
* GOAT, a decentralized way to publish and download AI models. Powered by BitTorrent and Bitcoin. (https://ipfs.io/ipfs/QmYyucgBQVfs9JXZ2MtmkGPAhgUjNgyGE6rcJT1KybQHhp/index.html)
* Dolly from databricks (https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html)
* AI powered Developer Tools 2.0. (https://www.sequoiacap.com/article/ai-powered-developer-tools/)
* Turn your designs into production-ready front-end code for mobile apps and the web (https://www.locofy.ai/)
* Using ChatGPT Plugins with LLaMA (https://blog.lastmileai.dev/using-openais-retrieval-plugin-with-llama-d2e0b6732f14)
28-Mar-2023
* Bing AI now allows 20 prompts per session and can make images for you
* ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks (https://arxiv.org/abs/2303.15056)
* ChatGPT or Grammarly? Evaluating ChatGPT on Grammatical Error Correction Benchmark (https://arxiv.org/abs/2303.13648)
* AI-controlled Linux Containers (https://github.com/fafrd/aquarium)
* Microsoft reportedly orders AI chatbot rivals to stop using Bing’s search data (https://www.theverge.com/2023/3/25/23656336/microsoft-chatbot-rivals-stop-using-bing-search-index)
29-Mar-2023
* Text2Video-Zero Code and Weights Released by Picsart AI Research (12G VRAM). (https://github.com/Picsart-AI-Research/Text2Video-Zero)
* Pause Giant AI Experiments: An Open Letter.
* Huggingface's SF Open-Source AI Meetup officially has 2000 people registered.
* Cerebras open sources seven GPT-3 models from 111 million to 13 billion parameters. Trained using the Chinchilla formula, these models set new benchmarks for accuracy and compute efficiency. (https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/)
* Independent implementation of LLaMA that is fully open source under the Apache 2.0 license (https://github.com/Lightning-AI/lit-llama)
* Bootstrap knowledge of LLMs (https://gist.github.com/rain-1/eebd5e5eb2784feecf450324e3341c8d)
* OPENFLAMINGO: AN OPEN-SOURCE FRAMEWORK FOR TRAINING VISION-LANGUAGE MODELS WITH IN-CONTEXT LEARNING (https://laion.ai/blog/open-flamingo/)
* gpt4all: a chatbot trained on a massive collection of clean assistant data including code, stories and dialogue (https://github.com/nomic-ai/gpt4all)
30-Mar-2023
* Microsoft Security Copilot is a new GPT-4 AI assistant for cybersecurity (https://www.theverge.com/2023/3/28/23659711/microsoft-security-copilot-gpt-4-ai-tool-features)
* UK details ‘pro-innovation’ approach to AI regulation (https://www.artificialintelligence-news.com/2023/03/29/uk-details-pro-innovation-approach-ai-regulation/)
* Employees Are Feeding Sensitive Biz Data to ChatGPT, Raising Security Fears (https://www.darkreading.com/risk/employees-feeding-sensitive-business-data-chatgpt-raising-security-fears)
* In the Age of AI, Don't Let Your Skills Atrophy (https://www.cyberdemon.org/2023/03/29/age-of-ai-skill-atrophy.html)
* Now ChatGPT is being (mis)used to do #PeerReview (https://mstdn.science/@ukrio/110100752908161183)
* Bing Chat now has Ads! (https://twitter.com/debarghya_das/status/1640892791923572737)
* Cerebras-GPT vs LLaMA AI Model Comparison (https://www.lunasec.io/docs/blog/cerebras-gpt-vs-llama-ai-model-comparison/)
* Arthur C. Clarke about the future of AI, 21 September 1964 (https://twitter.com/Rainmaker1973/status/1640016339011076097)
* ColossalChat: An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline (https://medium.com/@yangyou_berkeley/colossalchat-an-open-source-solution-for-cloning-chatgpt-with-a-complete-rlhf-pipeline-5edf08fb538b)
* Create and Embed Custom AI Assistants with Libraria (https://libraria.dev/)
31-Mar-2023
* Deranged New AI Has No Guardrails Whatsoever, Proudly Praises Hitler (https://futurism.com/deranged-ai-no-guardrails)
* Midjourney Kills Free AI Image Generator Access After Explosion of Deep Fakes (https://decrypt.co/124972/midjourney-free-ai-image-generation-stopped-over-deepfakes)
* Judge asks ChatGPT to decide bail in murder trial (https://nypost.com/2023/03/29/judge-asks-chatgpt-for-decision-in-murder-trial/)
* Should you use OpenAI's embeddings? Probably not, and here's why. (https://iamnotarobot.substack.com/p/should-you-use-openais-embeddings)
* Visual Studio Code and GitHub Copilot (https://code.visualstudio.com/blogs/2023/03/30/vscode-copilot)
* Llama Hub (https://llamahub.ai/)
* Finetuning LLMs on a Single GPU Using Gradient Accumulation (https://lightning.ai/pages/blog/gradient-accumulation/)
* Open source ETL framework for retrieval augmented generation (RAG). Sync data from your SaaS tools to a vector store, where they can be easily queried by GPT apps (https://github.com/ai-sidekick/sidekick)
* HALTT4LLM - Hallucination Trivia Test for Large Language Models (https://github.com/manyoso/haltt4llm)
* Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality (https://vicuna.lmsys.org/)
* Iterate.ai Brings Generative AI Capabilities to Interplay, the Low-Code Platform Accelerating Customers’ Digital Innovation (https://www.indianweb2.com/2023/03/iterateai-brings-generative-ai.html)
* RFdiffusion is an open source method for structure generation, with or without conditional information (a motif, target etc). (https://github.com/RosettaCommons/RFdiffusion)
* Google denies training Bard on ChatGPT chats from ShareGPT
Transformer fine-tuning on decentralized data
2 projects | /r/learnmachinelearning | 29 Mar 2023
Large language models like GPT-3 have gained immense popularity recently, and, using Flower, it's easy to transform an existing Hugging Face workflow to train models on decentralized data. This example blog post will show how to fine-tune a pre-trained distilBERT model on the IMDB dataset for sequence classification (determining if a movie review is positive or not). You can also check out the associated Colab notebook and the code example from the Flower repo.
Launch HN: Flower (YC W23) – Train AI models on distributed or sensitive data
Hey HN - we're Daniel, Taner, and Nic, and we're building Flower (https://flower.dev/), an open-source framework for training AI on distributed data. We move the model to the data instead of moving the data to the model. This enables regulatory compliance (e.g. HIPAA) and ML use cases that are otherwise impossible. Our GitHub is at https://github.com/adap/flower, and we have a tutorial here: https://flower.dev/docs/tutorial/Flower-0-What-is-FL.html.
Flower lets you train ML models on data that is distributed across many user devices or “silos” (separate data sources) without having to move the data. This approach is called federated learning.
A silo can be anything from a single user device to the data of an entire organization. For example, your smartphone keyboard suggestions and auto-corrections can be driven by a personalized ML model learned from your own private keyboard data, as well as data from other smartphone users, without the data being transferred from anyone’s device.
Most of the famous AI breakthroughs — from ChatGPT and Google Translate to DALL·E and Stable Diffusion — were trained with public data from the web. When the data is all public, you can collect it in a central place for training. This “move the data to the computation” approach fails when the data is sensitive or distributed across organizational silos and user devices.
Many important use cases are affected by this limitation:
* Generative AI: Many scenarios require sensitive data that users or organizations are reluctant to upload to the cloud. For example, users might want to put themselves and friends into AI-generated images, but they don't want to upload and share all their photos.
* Healthcare: We could potentially train cancer detection models better than any doctor, but no single organization has enough data.
* Finance: Preventing financial fraud is hard because individual banks are subject to data regulations, and in isolation, they don't have enough fraud cases to train good models.
* Automotive: Autonomous driving would be awesome, but individual car makers struggle to gather the data to cover the long tail of possible edge cases.
* Personal computing: Users don't want certain kinds of data to be stored in the cloud, hence the recent success of privacy-enhancing alternatives like the Signal messenger or the Brave browser. Federated methods open the door to using sensitive data from personal devices while maintaining user privacy.
* Foundation models: These get better with more data, and more diverse data, to train them on. But again, most data is sensitive and thus can't be incorporated, even though these models continue to grow bigger and need more information.
Each of us has worked on ML projects in various settings (e.g., corporate environments, open-source projects, research labs). We’ve worked on AI use cases for companies like Samsung, Microsoft, Porsche, and Mercedes-Benz. One of our biggest challenges was getting the data to train AI while being compliant with regulations or company policies. Sometimes this was due to legal or organizational restrictions; other times, it was difficulties in physically moving large quantities of data or natural concerns over user privacy. Any of these problems can halt an ML project in its tracks.
We realized issues of this kind were making it too difficult for many ML projects to get off the ground, and would especially impact certain domains like healthcare and finance.
Federated learning offers an alternative — it doesn't require moving data in order to train models on it, and so has the potential to overcome many barriers for ML projects.
In early 2020, we began developing the open-source Flower framework to simplify federated learning and make it user-friendly. Last year, we experienced a surge in Flower's adoption among industry users, which led us to apply to YC. In the past, we funded our work through consulting projects, but looking ahead, we’re going to offer a managed version for enterprises and charge per deployment or federation. At the same time, we’ll continue to run Flower as an open-source project that everyone can continue to use and contribute to.
Federated learning can train AI models on distributed and sensitive data by moving the training to the data instead of moving the data to the training. Only the results of the learning process, such as updated model parameters, are collected, and the data stays where it is. Because the data never moves, we can train AI on sensitive data spread across organizational silos or user devices to improve models with data that could never be leveraged until now.
Here’s how it works: (0) Initialize the global model parameters on the server; (1) Send the model parameters to a number of organizations/devices (client nodes); (2) Train the model locally on the data of each organization/device (client node); (3) Return the updated model parameters to the server; (4) On the server, aggregate the model updates (e.g., by averaging them) into a new global model; (5) Repeat steps 1 to 4 until the model converges.
This, of course, is more challenging than centralized learning: we must move AI models to data silos or user devices, train locally, send updated models back, aggregate them, and repeat. Flower provides the open-source infrastructure to easily do this, as well as supporting other privacy-enhancing technologies (PETs). It is compatible with PyTorch, TensorFlow, JAX, Hugging Face, Fastai, Weights & Biases and all the other tools used in ML projects regularly. The only dependency on the server side is NumPy, but even that can be dropped if necessary. Flower uses gRPC under the hood, so a basic client can easily be auto-generated, even for most languages that are not supported today.
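The round structure in steps (0) to (5) can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Flower's API or implementation: the toy linear model, the function names (`local_train`, `aggregate`, `run_federated`), the learning rate, and the choice to weight the average by each client's number of examples are all assumptions made for this example.

```python
from typing import List, Tuple

Params = List[float]           # hypothetical model: [w, b] for y = w * x + b
Dataset = List[Tuple[float, float]]


def local_train(params: Params, data: Dataset, lr: float = 0.05, epochs: int = 5) -> Params:
    """Step 2: train on one client's private data with plain SGD."""
    w, b = params
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return [w, b]


def aggregate(updates: List[Tuple[Params, int]]) -> Params:
    """Step 4: average client parameters, weighted by example count (an assumption)."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(p[i] * n for p, n in updates) / total for i in range(dim)]


def run_federated(client_data: List[Dataset], rounds: int = 100) -> Params:
    global_params = [0.0, 0.0]                   # step 0: initialize on the server
    for _ in range(rounds):                      # step 5: repeat
        updates = []
        for data in client_data:                 # step 1: send parameters to each client
            local = local_train(list(global_params), data)
            updates.append((local, len(data)))   # step 3: only parameters are returned
        global_params = aggregate(updates)
    return global_params


def make_client(xs: List[float]) -> Dataset:
    """Toy private dataset drawn from y = 2x + 1; it never leaves the client loop."""
    return [(x, 2.0 * x + 1.0) for x in xs]


clients = [
    make_client([0.0, 0.5, 1.0]),
    make_client([1.5, 2.0]),
    make_client([-1.0, -0.5, 0.25, 0.75]),
]
final_params = run_federated(clients)
print(final_params)
```

In this sketch the server only ever sees parameter lists and example counts, never the `(x, y)` pairs themselves. A real deployment would replace the inner `for data in client_data` loop with network calls to remote clients, which is the part a framework like Flower is meant to handle.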
Flower is open-source (Apache 2.0 license) and can be run in all kinds of environments: on a personal workstation for development and simulation, on Google Colab, on a compute cluster for large-scale simulations or on a cluster of Raspberry Pis (or similar devices) to build research systems, or deployed on public cloud instances (AWS, Azure, GCP, others) or private on-prem hardware. We are happy to help users when deploying Flower systems and will soon make this even easier through our managed cloud service.
You can find PyTorch example code here: https://flower.dev#examples, and more at https://github.com/adap/flower/tree/main/examples.
We believe that AI technology must evolve to be more collaborative, open and distributed than it is today (https://flower.dev/blog/2023-03-08-flower-labs/). We’re eager to hear your feedback and your experiences with difficulties in training, data access, data regulation, privacy, and anything else related to federated (or related) learning methods!
looks like they've introduced some differential privacy wrappers, the changelog points to that: https://github.com/adap/flower/blob/94a1f942abfce5dff4e9aff2...
There are some similarities, but also some differences. Flower's take is that it wants to support the entire FL workflow from experimental research to large-scale production deployments and operation. Some other FL frameworks fall either in the "research" or "production deployment" bucket, but few have good support for both.
Flower does a lot under the hood to support these different usage scenarios: it has both a networked engine (gRPC, experimental support for REST, and the possibility to "bring your own communication stack") and a simulation engine to support both real deployment on edge devices/server and simulation of large-scale federations on single machines or compute clusters.
This is - to the best of our knowledge - one of the drivers of our large and active community. The community is very collaborative and there are many downstream projects in the ecosystem that build on top of Flower (GitHub lists 748 dependent projects: https://github.com/adap/flower/network/dependents).
PG: We can't all use AI. Someone has to generate the training data
2 projects | news.ycombinator.com | 14 Mar 2023
I agree that proprietary data will become more valuable. It is, even today, mostly not accessible for AI training and holds so much value. We are working on Flower (https://flower.dev), which enables training AI on private data without the data owner having to share it.
Call for Volunteers in Machine Learning User Study
2 projects | /r/BATProject | 6 Sep 2022
Flower framework: https://flower.dev/
Flower Team Releases Flower 0.18 With Cool New Updates For Federated Learning
2 projects | /r/learnmachinelearning | 29 Mar 2022
Flower is an end-to-end federated learning framework that allows for a smoother transition from simulation-based experimental research to system research on many real-world edge devices. Flower has individual strengths in both domains (i.e., simulation and real-world devices) and the capacity to switch back and forth between the two extremes as needed throughout exploration and development. The researchers present the use cases that drive their viewpoint, the design goals, the resulting framework architecture, and comparisons to other frameworks.
adap/flower is an open source project licensed under Apache License 2.0 which is an OSI approved license.
The primary programming language of flower is Python.