viper
dython
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
How to interpret scatterplot regarding customer purchasing habits
Make a categorical heatmap instead (for an example, see https://github.com/shakedzy/dython/issues/2).
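As a rough illustration of that suggestion, the sketch below counts how often each pair of categories occurs, which is the table a categorical heatmap would colour. It uses plain pandas rather than dython, and the `segment` and `category` columns and their values are made up for the example, not taken from the original question.

```python
import pandas as pd

# Hypothetical purchase records; column names and values are invented.
purchases = pd.DataFrame({
    "segment": ["new", "new", "returning", "returning", "returning", "vip"],
    "category": ["books", "toys", "books", "books", "toys", "books"],
})

# Count how often each (segment, category) combination occurs.
table = pd.crosstab(purchases["segment"], purchases["category"])
print(table)
```

The resulting table can then be drawn with any heatmap function, for instance `seaborn.heatmap(table, annot=True)` if seaborn is installed.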
Time series prediction problem
To answer question one, try running a simple correlation matrix between your yearly figures and the averages of your daily figures for the years from 2012 on, when you have all four inputs. I frequently use the small convenience library Dython (on GitHub) for this. If your features are largely independent, you will not be able to fill in missing values from one another and will need to find other surrogates, such as “is my crop largely a fixed percentage of overall exports, and are overall exports available for the missing years?” If your features are highly dependent, you essentially don’t need them all. Both XGBoost and LightGBM handle missing values out of the box, so you can run them across all your data with the gaps left in; removing the low-impact features afterwards will drop all but one of each group of highly interdependent features.
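A minimal sketch of the correlation check described above, using plain pandas rather than Dython so that it needs no extra dependency. The column names, the values, and the 0.9 threshold are assumptions for illustration only; `highly_correlated_pairs` is a hypothetical helper, not part of any library.

```python
import pandas as pd

# Hypothetical yearly figures for 2012 onward; all names and numbers are invented.
df = pd.DataFrame({
    "year": [2012, 2013, 2014, 2015, 2016, 2017],
    "crop_exports": [10.0, 12.0, 11.0, 15.0, 16.0, 18.0],
    "total_exports": [100.0, 121.0, 109.0, 150.0, 161.0, 179.0],
    "avg_daily_price": [5.0, 3.0, 6.0, 2.0, 7.0, 4.0],
}).set_index("year")

# Pairwise Pearson correlation between all columns.
corr = df.corr()


def highly_correlated_pairs(corr, threshold=0.9):
    """Return (col_a, col_b, r) for each column pair with |r| >= threshold."""
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            r = float(corr.loc[a, b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


pairs = highly_correlated_pairs(corr)
print(corr.round(2))
print(pairs)
```

Pairs that show up here are candidates for acting as surrogates for one another in years where one of them is missing; columns that pair with nothing are the ones for which another source would be needed.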
What are some alternatives?
sketch - AI code-writing assistant that understands data content
RocketPy - Next generation High-Power Rocketry 6-DOF Trajectory Simulation
Mimesis - Mimesis is a powerful Python library that empowers developers to generate massive amounts of synthetic data efficiently.
flopy - A Python package to create, run, and post-process MODFLOW-based models.
Prefect - The easiest way to build, run, and monitor data pipelines at scale.
dash - Data Apps & Dashboards for Python. No JavaScript Required.
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Machine-Learning-for-Asset-Managers - Implementation of code snippets, exercises and application to live data from Machine Learning for Asset Managers (Elements in Quantitative Finance) written by Prof. Marcos López de Prado.
PyFunctional - Python library for creating data pipelines with chain functional programming