Fortunately, literate programming is a thing, so there are still tools for that. If you want PDF/HTML output facilities, it looks like CodeBraid is the way to go. It uses the Pandoc framework, so you can do all kinds of neat things with it.
IMO the biggest strength (there are many) that machine learning has over stats is "pretraining", which basically means training a model on one task and then reusing it for other tasks. Google spent $10K-$100K training BERT on a large external text corpus (usually gigabytes of text data), then put it up for free download. You can then "fine-tune" BERT on your own dataset, which gives a more accurate model and is much cheaper, faster, and less data-intensive than training from scratch.
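To make the pretrain-then-fine-tune idea concrete, here is a toy sketch in plain Python. It is not BERT and uses no ML library: the "model" is just a line `y = w*x + b`, and the source task, target task, step counts, and learning rate are all made-up assumptions chosen for illustration. It only shows the mechanism: weights learned on a big related task give a much better starting point than zeros when you have little data and a small training budget.

```python
# Toy illustration of pretraining + fine-tuning (hypothetical tasks, not BERT).

def mse(w, b, data):
    """Mean squared error of the line y = w*x + b on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(w, b, data, steps, lr=0.1):
    """Plain full-batch gradient descent on the MSE, starting from (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# "Pretraining" data: plenty of examples from a related source task (y = 2x + 1).
source = [(i / 10, 2.0 * (i / 10) + 1.0) for i in range(-10, 11)]

# "Your own dataset": only five examples from a slightly different task (y = 2.2x + 0.8).
target = [(x, 2.2 * x + 0.8) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]

# Expensive step, done once: train for a long time on the source task.
pre_w, pre_b = train(0.0, 0.0, source, steps=500)

# Cheap step: a short training budget on the small target dataset,
# once starting from the pretrained weights and once from scratch.
ft_w, ft_b = train(pre_w, pre_b, target, steps=20)
sc_w, sc_b = train(0.0, 0.0, target, steps=20)

ft_loss = mse(ft_w, ft_b, target)
scratch_loss = mse(sc_w, sc_b, target)

print(f"fine-tuned loss:   {ft_loss:.6f}")
print(f"from-scratch loss: {scratch_loss:.6f}")
```

With the same 20-step budget, the fine-tuned run ends with a lower loss than the from-scratch run, because it only has to learn the small difference between the two tasks. Real fine-tuning works on far larger models and data, but the reasoning is the same.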