Go macaroon Projects
It's valuable to examine the challenges in machine learning without assuming decentralization as a solution:
> High Cost and Resource Requirements
For training and local inferencing, quantization may help. The problem then becomes a trade-off between running a quantized model locally and using full-precision tensors remotely. A solution may involve distributed inferencing. Techniques like model distillation can also help create smaller, more efficient models for inferencing.
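To make the quantization trade-off concrete, here is a minimal sketch of symmetric per-tensor int8 quantization. It is an illustration only: the function names `quantize` and `dequantize` are made up for this example, and real inference stacks typically use finer-grained schemes (per-channel or block-wise scales).

```go
package main

import (
	"fmt"
	"math"
)

// quantize maps float32 weights to int8 using a single symmetric scale
// for the whole tensor (an assumption made to keep the sketch short).
func quantize(w []float32) ([]int8, float32) {
	var maxAbs float32
	for _, v := range w {
		if a := float32(math.Abs(float64(v))); a > maxAbs {
			maxAbs = a
		}
	}
	if maxAbs == 0 {
		// All-zero tensor: any non-zero scale works.
		return make([]int8, len(w)), 1
	}
	scale := maxAbs / 127
	q := make([]int8, len(w))
	for i, v := range w {
		q[i] = int8(math.Round(float64(v / scale)))
	}
	return q, scale
}

// dequantize recovers approximate float32 weights from the int8 values.
func dequantize(q []int8, scale float32) []float32 {
	w := make([]float32, len(q))
	for i, v := range q {
		w[i] = float32(v) * scale
	}
	return w
}

func main() {
	w := []float32{0.5, -1.0, 0.25, 0.0}
	q, scale := quantize(w)
	fmt.Println("int8 weights:", q)
	fmt.Println("scale:", scale)
	fmt.Println("restored:", dequantize(q, scale))
}
```

The int8 tensor takes a quarter of the memory of the float32 original, at the price of a rounding error of at most about half the scale per weight. That error is what the local, quantized side of the trade-off gives up.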
> Data Privacy
For training, some private datasets may be needed. For local inferencing, it may be useful to determine what needs to be inferenced locally and what can be run remotely. The problem then becomes mapping the privacy scope onto a marketplace in order to mitigate the high cost and resource requirements. Techniques like model explainability (versioning) and robustness testing can help build trust in AI systems.
Complying with data privacy regulations and ensuring that AI systems adhere to legal and ethical standards can be a challenge, especially in international contexts.
> Incentives
Instead of assuming the solution when considering the problem, we assume there is an incentive to either train a model or use one. The problem then becomes which incentive applies: financial rewards, data access agreements, or even altruistic motivations.
> Stale Data and Reproducibility
Both the code and the datasets for training the model need to be updated. Inferencing needs RAG, so the augmented reference data needs to be updated as well. Anything that is updated might need some type of revision control, especially if that data (or code) results in poor output. Labeling data and knowledge transfer are other problems that need revision control.
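One way to picture revision control across code, training data, and RAG reference data is a single content-derived revision ID. The sketch below is a hypothetical illustration, not a method proposed in the post: `revisionID` and its inputs are invented names, and a real setup would more likely rely on an existing versioning tool.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// revisionID derives one identifier from several artifacts (for example
// training code, a dataset snapshot, and RAG reference data). Each part is
// length-prefixed so that moving bytes between parts changes the ID.
func revisionID(parts ...[]byte) string {
	h := sha256.New()
	var lenBuf [8]byte
	for _, p := range parts {
		binary.BigEndian.PutUint64(lenBuf[:], uint64(len(p)))
		h.Write(lenBuf[:])
		h.Write(p)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	code := []byte("train.py contents")
	data := []byte("dataset snapshot")
	refs := []byte("RAG reference documents")

	before := revisionID(code, data, refs)
	after := revisionID(code, data, []byte("RAG reference documents, updated"))

	fmt.Println("before:", before)
	fmt.Println("after: ", after)
	fmt.Println("changed:", before != after)
}
```

Recording such an ID next to each model output makes it possible to trace a poor result back to the exact combination of code and data that produced it.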
> Interoperability
We can assume a marketplace for an ML training/inference platform is needed. We have HuggingFace, for example. The problem here likely stems from the tendency for datasets to be private, such as in the case of Llama 2. Models contain the "essence" of the dataset, but we still need RAG to ground the responses.
There is one technology that may help solve most of these issues without assuming full decentralization: the Lightning Network combined with the HTTP 402 (Payment Required) response code, which is reserved but has yet to be widely implemented: https://github.com/lightninglabs/aperture
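As a rough picture of how a 402-gated inference endpoint could behave, here is a minimal sketch using only Go's standard library. It does not use aperture's code or API. The `paywall`, `status`, and `demoHandler` names, the `/infer` path, the placeholder challenge string, and the string-comparison verifier are all assumptions for illustration; a real deployment would issue a Lightning invoice and verify a macaroon and payment proof, and the exact header syntax should be checked against the aperture documentation.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// paywall wraps next and answers 402 Payment Required until verify
// accepts the credential sent in the Authorization header.
func paywall(next http.Handler, challenge string, verify func(cred string) bool) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		auth := r.Header.Get("Authorization")
		if strings.HasPrefix(auth, "L402 ") && verify(strings.TrimPrefix(auth, "L402 ")) {
			next.ServeHTTP(w, r)
			return
		}
		// Tell the client what it must pay for (placeholder content).
		w.Header().Set("WWW-Authenticate", challenge)
		http.Error(w, "payment required", http.StatusPaymentRequired)
	})
}

// demoHandler builds a paywalled stand-in for an inference endpoint.
// The verifier is a hypothetical stub, not a real payment check.
func demoHandler() http.Handler {
	inference := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "model output")
	})
	challenge := `L402 macaroon="PLACEHOLDER", invoice="PLACEHOLDER"`
	verify := func(cred string) bool { return cred == "demo-token:demo-preimage" }
	return paywall(inference, challenge, verify)
}

// status sends one in-memory request through h and returns the status code.
func status(h http.Handler, auth string) int {
	req := httptest.NewRequest(http.MethodGet, "/infer", nil)
	if auth != "" {
		req.Header.Set("Authorization", auth)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := demoHandler()
	fmt.Println("without payment:", status(h, ""))
	fmt.Println("with credential:", status(h, "L402 demo-token:demo-preimage"))
}
```

The point of the pattern is that payment and access control ride on ordinary HTTP semantics, so a model or dataset provider can charge per request without either side adopting a fully decentralized stack.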
Index
# | Project | Stars |
---|---|---|
1 | aperture | 229 |