ethics vs moonwatcher
Metric | ethics | moonwatcher
---|---|---
Mentions | 1 | 1
Stars | 265 | 16
Growth | 0.0% | -
Activity | 0.0 | 4.9
Last commit | almost 2 years ago | 9 months ago
Language | Python | Python
License | MIT License | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ethics
[P] Request: Any datasets of morality stories?
Code for https://arxiv.org/abs/2008.02275 found: https://github.com/hendrycks/ethics
moonwatcher
Open-Source Evaluation and Testing Framework for Computer Vision Models
Hey,
For the past few weeks, we’ve been developing an open-source evaluation and testing framework for computer vision models. Today we released the first alpha version and would love to get your feedback and support.
GitHub: https://github.com/moonwatcher-ai/moonwatcher
*What problems are we solving?*
- *Manual, error-prone evaluation:* Assessing model quality is still a manual and error-prone process. Aggregate metrics exist, but they usually hide the fact that a model performs differently on different parts of the data.
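To illustrate the point about aggregate metrics, here is a minimal sketch in plain Python. It does not use moonwatcher's API; the function name `accuracy_by_slice`, the toy labels, and the "day"/"night" slices are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def accuracy_by_slice(labels, predictions, slices):
    """Return overall accuracy and a dict of accuracy per data slice."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, p, s in zip(labels, predictions, slices):
        total[s] += 1
        correct[s] += int(y == p)
    overall = sum(correct.values()) / sum(total.values())
    per_slice = {s: correct[s] / total[s] for s in total}
    return overall, per_slice


# Hypothetical toy data: 8 daytime images and 2 nighttime images.
labels      = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
predictions = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
slices      = ["day"] * 8 + ["night"] * 2

overall, per_slice = accuracy_by_slice(labels, predictions, slices)
print(overall)    # 0.8
print(per_slice)  # {'day': 1.0, 'night': 0.0}
```

In this toy case the overall accuracy of 0.8 looks acceptable, yet every nighttime example is misclassified, which is the kind of gap a per-slice breakdown makes visible.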
What are some alternatives?
ToolEmu - [ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use
feather - AI Testing Toolkit for AI applications
natural-adv-examples - A Harder ImageNet Test Set (CVPR 2021)
ModelNet40-C - Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296
ACE_Model_Implementation - A Python implementation of Dave Shap's ACE Model
langtest - Deliver safe & effective language models