Extracting git repository data with PyDriller

This page summarizes the projects mentioned and recommended in the original post on dev.to

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • pydriller

    Python Framework to analyse Git repositories

  • PyDriller is an open-source Python library that allows you to "drill into" git repositories.

  • ML.NET

    ML.NET is an open source and cross-platform machine learning framework for .NET.

  • Important Note: looping over repository commits takes a long time for large repositories. It took 52 minutes to analyze the ML.NET repository this code example refers to, which had 2,681 commits at the time of analysis on February 25th, 2023.

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
  • Pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

  • I chose to do this by using the popular Pandas library for tabular data analysis tasks.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts