Would you be willing to elaborate on this paragraph? I found the GitHub pages for SparseGPT and WANDA and I'll read up on those methods (they're both new to me). I've seen a reference to task_arithmetic before in the code that the DARE authors produced for model merging, but it's also a new concept for me. I found this paper and this associated GitHub project. Do you recommend other reading or tools for task_arithmetic? Do you think DARE + TIES merging obviates the utility of sparsifying a model prior to merging the way you described?
I actually asked the creator of mergekit about this here. From his response, I learned how to use task_arithmetic to isolate the deltas. In theory, one could run WANDA on the model from the second example and then merge the result back into another model. However, that is well beyond what has been tried so far, so experimentation might be messy.
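To make the idea concrete, here is a rough toy sketch of "isolate the deltas, sparsify them, merge them into another model". It is not mergekit's implementation or API: the helper names and the tiny dict-of-lists "models" are made up for illustration, and the pruning step is plain magnitude pruning standing in for WANDA (which also scales each weight by input activation norms, something this sketch does not do).

```python
# Toy sketch only: hypothetical helpers, not mergekit or WANDA code.
# A "model" here is a dict mapping a parameter name to a flat list of weights.

def task_vector(finetuned, base):
    """Delta that fine-tuning added on top of the base weights."""
    return {k: [f - b for f, b in zip(finetuned[k], base[k])] for k in base}

def sparsify_by_magnitude(delta, keep_fraction):
    """Zero all but the largest-magnitude entries of each tensor.

    Plain magnitude pruning, used as a stand-in for WANDA.
    Entries tied with the threshold are all kept.
    """
    out = {}
    for name, values in delta.items():
        n_keep = max(1, round(len(values) * keep_fraction))
        threshold = sorted((abs(x) for x in values), reverse=True)[n_keep - 1]
        out[name] = [x if abs(x) >= threshold else 0.0 for x in values]
    return out

def apply_delta(target, delta, weight=1.0):
    """Add a (possibly sparsified) task vector onto another model."""
    return {k: [t + weight * d for t, d in zip(target[k], delta[k])] for k in target}

base = {"layer.weight": [0.0, 1.0, 2.0, 3.0]}
finetuned = {"layer.weight": [0.5, 1.0, 2.25, 1.0]}
other = {"layer.weight": [10.0, 10.0, 10.0, 10.0]}

delta = task_vector(finetuned, base)
sparse_delta = sparsify_by_magnitude(delta, keep_fraction=0.5)
merged = apply_delta(other, sparse_delta)
print(delta, sparse_delta, merged)
```

With real checkpoints the same three steps would run per tensor over the state dicts, and all models involved would need to share the same architecture and base for the subtraction and addition to be meaningful.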