Ask HN: How do you deal with large Python code bases?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • Bazel

    a fast, scalable, multi-language and extensible build system

  • The larger your codebase gets, the more Bazel becomes a requirement. Bazel is really not negotiable for large Python code bases. The longer you put Bazel off, the more pain you will endure before you are eventually forced to adopt it. You will be forced to use Bazel or a system like it, because without one your good devs will eventually stop tolerating your codebase and leave.

    https://bazel.build/

    Beyond Bazel, you will have to start hacking away at dependency problems.

    No inline imports, no circular imports. Imports all sorted at the top of your file. You will have to start enforcing good hygiene with linters.
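
    As a rough illustration of the kind of check a linter performs for the "no inline imports" rule, here is a minimal sketch using only the standard library's ast module. The function name find_inline_imports and the SAMPLE source are made up for this example; in practice you would more likely rely on an existing linter than write your own.

```python
import ast


def find_inline_imports(source: str) -> list[int]:
    """Return line numbers of import statements that are not direct
    children of the module body (for example, inside a function)."""
    tree = ast.parse(source)
    top_level = {id(node) for node in tree.body}
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, (ast.Import, ast.ImportFrom))
        and id(node) not in top_level
    )


# Hypothetical module source used only to exercise the check.
SAMPLE = '''\
import os

def load():
    import json  # inline import: should be flagged
    return json.loads("{}")
'''

print(find_inline_imports(SAMPLE))  # [4]
```

    Note that this sketch is deliberately strict: it also flags imports nested in a module-level try or if block, which a real rule set may want to allow.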

    You will need to create warnings against using the global scope.

    You will need to construct the clients for all your dependencies in main().

    You will need to discourage calling non-trivial functions in constructors. (This property largely encourages dependency injection.)
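
    A minimal sketch of what these two rules can look like together. All names here (EmailClient, RecordingEmailClient, SignupService) are hypothetical, and the recording client stands in for whatever real client you would build in main().

```python
from dataclasses import dataclass
from typing import Protocol


class EmailClient(Protocol):
    def send(self, to: str, body: str) -> None: ...


class RecordingEmailClient:
    """Stand-in client that records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


@dataclass
class SignupService:
    # The generated constructor only stores the dependency: no I/O,
    # no non-trivial calls.
    email: EmailClient

    def register(self, address: str) -> None:
        self.email.send(address, "Welcome!")


def main() -> None:
    # Every client is constructed here, then passed down explicitly.
    client = RecordingEmailClient()
    service = SignupService(email=client)
    service.register("user@example.com")
    print(client.sent)


if __name__ == "__main__":
    main()
```

    Because SignupService never builds its own client, a test can hand it a fake without patching any module-level state.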

    There are exceptions to every rule, but if you are going to violate scope or skip dependency injection, those things need to be done very mindfully.

    As the structure of your code improves via good scoping and injected dependencies, it will become easier to change and easier to test.

    You will have to devote some serious consideration to how to quarantine business logic from server code. Generally, your product developers shouldn't be doing much outside of defining their data and altering business logic from within a route. If the place where business logic is executed is commingled with how data-stores are manipulated, you're going to have a bad time. Likewise if the place business logic is executed is commingled with the presentation of it to customers, you're going to have a bad time.
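
    One way to picture that quarantine, as a sketch only: the business rule is a pure function, the data store sits behind a small interface, and the route is thin glue between them. The Order type, the 10% discount rule, and the function names are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Order:
    order_id: int
    subtotal_cents: int


class OrderStore(Protocol):
    def get(self, order_id: int) -> Order: ...


class InMemoryOrderStore:
    """Stand-in for a real data store (an assumption for this sketch)."""

    def __init__(self, orders: dict[int, Order]) -> None:
        self._orders = orders

    def get(self, order_id: int) -> Order:
        return self._orders[order_id]


def total_with_discount(order: Order) -> int:
    """Business rule (hypothetical): 10% off orders of 100.00 or more.
    Pure function: no store access, no response formatting."""
    if order.subtotal_cents >= 10_000:
        return order.subtotal_cents * 90 // 100
    return order.subtotal_cents


def order_total_route(store: OrderStore, order_id: int) -> dict[str, int]:
    """Route-level glue: fetch data, apply the rule, shape the response."""
    order = store.get(order_id)
    return {"order_id": order.order_id, "total_cents": total_with_discount(order)}
```

    With this split, a product developer changing the discount only touches total_with_discount, and that function can be tested without a database or a web framework.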

    Python does not have a culture of dependency injection because it's so easy to import antigravity and fly away. This makes writing tests hard and promotes spaghetti code. Lack of dependency injection (which means violating scoping) is the entropic force that makes codebases miserable as time increases.

    Additionally, you will have to think hard about state. If you can't restart a process trivially, or balance traffic to a different machine trivially, you are going to make your operational people's lives hard. State belongs in state storage. Put it in an RDBMS, put it in Redis, put it in memcached, put it in anything but a Python process's memory. This means that any two requests should be able to go to any two machines. This is a deeply important property for scaling.
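
    To make the property concrete, here is a small sketch in which two "workers" (standing in for two processes or machines) keep no request state themselves and share an external store instead. FakeSharedStore is an in-memory stand-in written for this example; in a real deployment it would be replaced by a client for Redis, an RDBMS, or similar.

```python
from typing import Protocol


class CounterStore(Protocol):
    def incr(self, key: str) -> int: ...


class FakeSharedStore:
    """In-memory stand-in for an external store shared by all workers."""

    def __init__(self) -> None:
        self._data: dict[str, int] = {}

    def incr(self, key: str) -> int:
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


class Worker:
    """Represents one server process; holds no request state of its own."""

    def __init__(self, store: CounterStore) -> None:
        self._store = store

    def handle_visit(self, user_id: str) -> int:
        return self._store.incr(f"visits:{user_id}")


shared = FakeSharedStore()
worker_a, worker_b = Worker(shared), Worker(shared)
print(worker_a.handle_visit("u1"))  # 1
print(worker_b.handle_visit("u1"))  # 2
```

    Either worker can serve either request, and a worker can be discarded and replaced without losing the count, because the count never lived in the worker.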

    Lastly, if you do not have good answers for observability, in terms of time series data, log data, exception data, and event data (for observability only), you will have a bad time. These are generally the things it is OK to violate scope to use.
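
    For log and exception data, the usual Python form of this "allowed" scope violation is a module-level logger from the standard logging module. A minimal sketch, where the logger name "orders" and the process_order function are hypothetical:

```python
import logging

# Module-level logger: a deliberate, narrow use of global scope.
logger = logging.getLogger("orders")  # hypothetical logger name


def process_order(order_id: int) -> bool:
    """Log progress and failures without taking a logger as a parameter."""
    logger.info("processing order %s", order_id)
    try:
        if order_id < 0:
            raise ValueError("order_id must be non-negative")
    except ValueError:
        # logger.exception logs at ERROR level and attaches the traceback.
        logger.exception("order %s failed", order_id)
        return False
    return True
```

    Handlers and levels are still configured once at startup (for example in main()), so the global here is only the named logger, not its configuration.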
