Top 12 Python Robustness Projects
-
safe-control-gym
PyBullet CartPole and Quadrotor environments—with CasADi symbolic a priori dynamics—for learning-based control and RL
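To show how such an environment is typically driven, here is a minimal random-rollout sketch. It assumes safe-control-gym exposes a Gym-style `make`/`reset`/`step` interface; the import path, the `"cartpole"` task name, and the exact return signatures are assumptions, so check the repo's examples for the real API:

```python
# Minimal sketch, assuming a Gym-style interface (not verified against the repo).
from safe_control_gym.utils.registration import make  # assumed entry point

env = make("cartpole")            # PyBullet CartPole with CasADi symbolic dynamics
obs, info = env.reset()           # some versions may return obs only
for _ in range(100):
    action = env.action_space.sample()          # random policy as a placeholder
    obs, reward, done, info = env.step(action)  # 4-tuple assumed (pre-gymnasium API)
    if done:
        obs, info = env.reset()
env.close()
```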
-
assembled-cnn
Tensorflow implementation of "Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network"
-
auto_LiRPA
An automatic linear relaxation based perturbation analysis library for neural networks and general computational graphs
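As a concrete illustration of what the library computes, here is a minimal sketch following auto_LiRPA's documented quickstart pattern: wrap a PyTorch model, declare an L-infinity perturbation around an input, and get certified bounds on every output logit. The toy network and the `eps` value are placeholders:

```python
# Minimal sketch of CROWN-style bound computation with auto_LiRPA.
import torch
import torch.nn as nn
from auto_LiRPA import BoundedModule, BoundedTensor, PerturbationLpNorm

# Placeholder classifier; any nn.Module traceable by auto_LiRPA works.
net = nn.Sequential(nn.Flatten(), nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))
x = torch.rand(1, 1, 28, 28)

model = BoundedModule(net, torch.empty_like(x))        # trace the computational graph
ptb = PerturbationLpNorm(norm=float("inf"), eps=0.03)  # L-inf ball of radius eps
bounded_x = BoundedTensor(x, ptb)

# Certified lower/upper bounds on each logit, valid for ALL inputs in the ball.
lb, ub = model.compute_bounds(x=(bounded_x,), method="backward")  # backward = CROWN
print(lb, ub)
```

If the lower bound of the true class stays above the upper bounds of all other classes, the prediction is provably robust within the ball; this is the primitive that verifiers like alpha-beta-CROWN build on.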
-
alpha-beta-CROWN
An efficient, scalable, and GPU-accelerated neural network verifier (winner of VNN-COMP 2021, 2022, and 2023)
-
ModelNet40-C
Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296
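To illustrate the kind of corruption such a benchmark applies, here is a generic sketch (not the repo's code) that jitters a point cloud with Gaussian noise at increasing severity levels, the standard recipe in common-corruption benchmarks; the sigma values are illustrative placeholders:

```python
# Generic corruption sketch; severity levels are illustrative, not ModelNet40-C's.
import numpy as np

def jitter(points: np.ndarray, severity: int) -> np.ndarray:
    """Add Gaussian noise to an (N, 3) point cloud; severity in 1..5."""
    sigma = [0.01, 0.02, 0.03, 0.04, 0.05][severity - 1]
    return points + np.random.normal(0.0, sigma, size=points.shape)

cloud = np.random.rand(1024, 3).astype(np.float32)  # placeholder point cloud
for s in range(1, 6):
    corrupted = jitter(cloud, s)
    # A benchmark would run the classifier on `corrupted` at each severity
    # and average the error over severities and corruption types.
```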
-
ViTs-vs-CNNs
[NeurIPS 2021]: Are Transformers More Robust Than CNNs? (Pytorch implementation & checkpoints)
Project mention: Show HN: Times faster LLM evaluation with Bayesian optimization | news.ycombinator.com | 2024-02-13
Fair question.
Evaluation refers to the phase after training that checks whether the trained model is any good.
Usually the flow goes training -> evaluation -> deployment (what you called inference). This project is aimed at evaluation. Evaluation can be slow (it might even be slower than training if you're finetuning on a small domain-specific subset)!
So there are [quite](https://github.com/microsoft/promptbench) [a](https://github.com/confident-ai/deepeval) [few](https://github.com/openai/evals) [frameworks](https://github.com/EleutherAI/lm-evaluation-harness) working on evaluation; however, all of them are quite slow, because LLMs are slow if you don't have infinite money. [This](https://github.com/open-compass/opencompass) one tries to speed things up by parallelizing across multiple machines, but none of them take advantage of the fact that many evaluation queries may be similar, and all of them evaluate on every given query. That's where this project might come in handy.
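To make the commenter's point concrete, here is an illustrative toy sketch of the idea (my own, not the project's actual method): fit a cheap surrogate over query embeddings, spend the expensive LLM calls only on the queries the surrogate is most uncertain about, and estimate the rest. `embed` and `expensive_eval` are hypothetical stand-ins for a real embedding model and a real LLM scoring call:

```python
# Illustrative sketch of uncertainty-driven subset evaluation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def embed(prompts):                      # hypothetical embedding step
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(prompts), 8))

def expensive_eval(prompt):              # hypothetical slow LLM metric
    return float(len(prompt) % 5)

prompts = [f"eval query {i}" for i in range(200)]
X = embed(prompts)

# Start from a small random subset, then repeatedly score the query the
# surrogate is least certain about (maximum predictive std).
scored = {0, 1, 2}
y = {i: expensive_eval(prompts[i]) for i in scored}
gp = GaussianProcessRegressor()

for _ in range(20):                      # evaluation budget << len(prompts)
    idx = sorted(scored)
    gp.fit(X[idx], [y[i] for i in idx])
    _, std = gp.predict(X, return_std=True)
    std[idx] = -np.inf                   # don't re-pick already-scored queries
    pick = int(np.argmax(std))
    scored.add(pick)
    y[pick] = expensive_eval(prompts[pick])

# Cheap estimate of the full-benchmark score from ~10% of the queries.
print(f"estimated benchmark score: {gp.predict(X).mean():.3f}")
```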
Project mention: [Online Leaderboard | Easy Evaluation] OpenOOD v1.5: Enhanced Benchmark for Out-of-Distribution Detection | /r/DeepLearningPapers | 2023-06-28
Open-sourced implementations of 40+ advanced methods (see our repo).
This is really cool!
I've been using this auditor tool that some friends at Fiddler created: https://github.com/fiddler-labs/fiddler-auditor
They went with a LangChain interface for custom evals, which I really like. I'm curious to hear whether anyone has tried both of these. What's been your key takeaway?
Python robustness-related posts
-
Show HN: Times faster LLM evaluation with Bayesian optimization
-
[D] Yet another case of plagiarism in ICCV. The ICCV 2021 paper "Learnable Boundary Guided Adversarial Training" (arxiv 2011.11164) overlaps with the BMVC 2020 paper "Adversarial Concurrent Training: Optimizing Robustness and Accuracy Trade-off of Deep Neural Networks" (arxiv 2008.07015)
-
[R] NEW Robustness Benchmark for 3D Point Cloud Recognition, ModelNet40-C. "Benchmarking Robustness of 3D Point Cloud Recognition Against Common Corruptions"
Index
What are some of the best open-source robustness projects in Python? This list will help you:
| # | Project | Stars |
|---|---------|-------|
| 1 | promptbench | 2,061 |
| 2 | OpenOOD | 751 |
| 3 | natural-adv-examples | 570 |
| 4 | safe-control-gym | 518 |
| 5 | assembled-cnn | 330 |
| 6 | auto_LiRPA | 263 |
| 7 | linqit | 245 |
| 8 | alpha-beta-CROWN | 206 |
| 9 | ModelNet40-C | 201 |
| 10 | ViTs-vs-CNNs | 171 |
| 11 | fiddler-auditor | 142 |
| 12 | LBGAT | 33 |