This can be done as a batch job. 350MB is not really that big, and it may be even smaller if you only need a subset of the columns. You would basically loop through the archive, process each file individually, and append the results, if I understood you correctly. My initial implementation would use a combination of the zipfile, StringIO (from io), and csv modules to process the zip file in memory, since it should fit comfortably in RAM; see the first sketch below.

The harder issue is having a fault-tolerant process that does this continuously and reliably. For that I would use a general-purpose scheduler. If you're stuck on Windows, I highly recommend Dagster, as it now comes with an awesome general-purpose scheduler that works on Windows; there's a rough sketch of that below too. Otherwise, I would look into Airflow or Prefect, with Prefect being easier to use than Airflow. Ideally you would use cloud resources, but this can all be done locally with a VM.

But more importantly, where do you intend the final resting place of the data to be? I would recommend a database.
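A minimal sketch of the in-memory batch step, assuming a local `data.zip` containing CSVs that share a schema; the file name and the `timestamp`/`value` columns are placeholders for whatever your actual subset is:

```python
import csv
import io
import zipfile


def process_zip(zip_path: str, wanted_columns: list[str]) -> list[dict]:
    """Loop through every CSV inside the zip in memory, keeping only the wanted columns."""
    rows = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue
            # archive.open() yields bytes; decode and wrap in StringIO so csv can parse it as text.
            with archive.open(name) as raw:
                text = io.StringIO(raw.read().decode("utf-8"))
                for record in csv.DictReader(text):
                    rows.append({col: record.get(col) for col in wanted_columns})
    return rows


if __name__ == "__main__":
    # Placeholder file and columns; swap in your real ones.
    subset = process_zip("data.zip", ["timestamp", "value"])
    print(f"collected {len(subset)} rows")
```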
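And a rough sketch of the scheduling side in Dagster, not a definitive setup: the op body, job name, and cron string are all placeholders, and the exact API may differ slightly depending on your Dagster version.

```python
from dagster import Definitions, ScheduleDefinition, job, op


@op
def ingest_zip_op():
    # Placeholder: call the batch logic from the sketch above, e.g.
    # process_zip("data.zip", ["timestamp", "value"]) and load it somewhere.
    ...


@job
def ingest_job():
    ingest_zip_op()


# Run every day at 2am; the Dagster daemon/webserver (e.g. `dagster dev`)
# picks this up and gives you run history and failure visibility.
defs = Definitions(
    jobs=[ingest_job],
    schedules=[ScheduleDefinition(job=ingest_job, cron_schedule="0 2 * * *")],
)
```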