The answer is an API, like NNAPI. AD is a frontend concern and doesn't really matter to accelerators.
For AD, I'm bullish on Enzyme, which performs AD directly on LLVM IR, avoiding deep per-language compiler integration: https://enzyme.mit.edu/
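To make the "AD is a frontend concern" point concrete: forward-mode AD is just a mechanical rewrite of arithmetic, which is why it can live at whatever level you like (source, IR like Enzyme does, or a tiny runtime shim). A minimal dual-number sketch in Python, purely illustrative and nothing like Enzyme's actual reverse-mode LLVM pass:

```python
from dataclasses import dataclass

@dataclass
class Dual:
    val: float  # primal value
    dot: float  # derivative w.r.t. the chosen input

    def _coerce(self, other):
        return other if isinstance(other, Dual) else Dual(float(other), 0.0)

    def __add__(self, other):
        other = self._coerce(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = self._coerce(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def deriv(f, x):
    """Derivative of f at x via one forward-mode pass."""
    return f(Dual(x, 1.0)).dot

# d/dx (x*x + 3x) at x = 2  ->  2*2 + 3 = 7
print(deriv(lambda x: x * x + 3 * x, 2.0))
```

The accelerator only ever sees the resulting arithmetic, which is the point upthread: the backend doesn't need to know AD happened at all.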
You might be interested in this for your M1 MBA: https://github.com/apple/tensorflow_macos
I'm not sure that's necessarily the domain of a low-level package like CUDA.jl, though (which I assume you're referring to). That kind of interface is more the domain of higher-level packages like https://github.com/JuliaParallel/Dagger.jl/ and, to a lesser extent, https://juliagpu.github.io/KernelAbstractions.jl/stable/. Moreover, the jury is still out on whether the built-in Distributed module is an ideal abstraction for every use case (clusters, heterogeneous compute, etc.).
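For anyone unfamiliar with the Dagger-style model: the higher-level abstraction is an explicit task graph that a scheduler can map onto whatever compute is available. A toy pure-Python resolver (not Dagger's API; node names and structure here are made up for illustration):

```python
def run_graph(graph, outputs):
    """Resolve a task graph of the form {key: (fn, *dep_keys)}.

    A real scheduler (Dagger.jl, Dask, etc.) would dispatch ready
    tasks to workers or GPUs; this sketch just memoizes and recurses.
    """
    cache = {}

    def resolve(key):
        if key not in cache:
            fn, *deps = graph[key]
            args = [resolve(d) for d in deps]
            cache[key] = fn(*args)
        return cache[key]

    return [resolve(k) for k in outputs]

# "c" depends on "a" and "b"; the scheduler is free to run a and b anywhere.
graph = {
    "a": (lambda: 2,),
    "b": (lambda: 3,),
    "c": (lambda x, y: x + y, "a", "b"),
}
print(run_graph(graph, ["c"]))  # [5]
```

The point is that placement decisions (which GPU, which node) belong to the graph scheduler, not to the low-level array package.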
WRT Nx, my biggest question is how they'll crack the problem of still needing big balls of C++ and the shims everywhere to get acceleration. Creating a compiler that generates efficient GPU or other accelerator code is a massive research project with no clear winners, never mind the challenge of reconciling the very mutation-heavy needs of GPU compute with a mostly immutable language model.
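The mutation-vs-immutability tension above is usually resolved by giving the language a functional update API and letting the compiler lower it to an in-place write when it can prove the buffer isn't shared (roughly what XLA does for JAX, and what Nx leans on). A hand-wavy Python sketch of the idea, with the `unique` flag standing in for the compiler's aliasing analysis:

```python
def functional_set(buf, i, v, *, unique=False):
    """Semantically: return a copy of `buf` with buf[i] replaced by v.

    When analysis proves `buf` is uniquely referenced (`unique=True`),
    the copy can be elided and the write done in place -- which is the
    form GPU compute actually wants.
    """
    if unique:
        buf[i] = v      # in-place fast path: zero-copy
        return buf
    out = list(buf)     # honor the immutable semantics with a copy
    out[i] = v
    return out

xs = [1, 2, 3]
ys = functional_set(xs, 0, 9)
print(xs, ys)  # [1, 2, 3] [9, 2, 3] -- original untouched
```

Whether a compiler can do this analysis reliably enough, at scale, is exactly the open research problem the comment is pointing at.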
Ah I see - I think we're pretty much on the same page in terms of timetables. Although if you include TPU, I think it's fair to say that custom accelerators are already a moderate success.
Updated my profile. I've been working on DL training platforms and distributed training benchmarking for a bit so I've gotten a nice view into the GPU/TPU battle.
Shameless plug: you should check out the open-source training platform we are building, Determined[1]. One of the goals is to take our hard-earned expertise on training infrastructure and build a tool where people don't need to have that infrastructure expertise themselves. We don't support TPUs, partially because of a lack of demand and TPU availability, and partially because our PyTorch TPU experiments were so unimpressive.
[1] GH: https://github.com/determined-ai/determined, Slack: https://join.slack.com/t/determined-community/shared_invite/...