pex
ideas4
pex
-
Our Plan for Python 3.13
We get (very) close to cross-environment reproducible builds for Python with https://github.com/pantsbuild/pex (via Pants). For instance, we build Linux x86-64 artifacts that run on AWS Lambda, and can build them natively on ARM macOS.
This is not raw requirements.txt, but isn’t too far off: Pants/PEX can consume one to produce a hash-pinned lock file.
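To make "hash-pinned" concrete: a lock entry records the expected digest of each artifact, and anything whose digest differs is refused. Below is a minimal sketch of that check using only the standard library. The `LOCK` dictionary, the file name and the `verify_artifact` helper are hypothetical illustrations, not the PEX or Pants lock file format.

```python
import hashlib

# Hypothetical lock data: artifact name -> expected SHA-256 hex digest.
# This is an illustration, not the PEX or Pants lock file format.
LOCK = {
    "example-1.0-py3-none-any.whl": hashlib.sha256(b"example wheel bytes").hexdigest(),
}


def verify_artifact(name: str, content: bytes, lock: dict[str, str]) -> bool:
    """Return True only if the content matches the digest pinned for name."""
    expected = lock.get(name)
    if expected is None:
        return False  # unpinned artifacts are rejected
    return hashlib.sha256(content).hexdigest() == expected
```

The point of pinning by hash rather than by version alone is that a re-uploaded or tampered artifact with the same version number fails the check.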
-
Is it possible pickle a function with its dependencies?
You should look into pex, or its parent build system, Pants. A PEX (Python EXecutable) file can package up all your code, including dependencies, and run on another machine with a similar OS as long as a compatible interpreter is available.
- Pex: Python EXecutable
-
security risks in python libs
For well-supported libraries, pip-audit might do the trick. Where I've worked, we have used a central build system with library version enforcement. The build system produces a deployable archive, like PEX or similar. Rock-solid tests and sandbox validation environments provide good paths for version upgrades. Restricting libraries to a small set, making sure those repos remain actively developed, performing audits and centralizing builds has helped organizations I've worked in keep on top of potential security issues.
- My latest blogpost, python packaging has moved forward, but we're still missing a crucial part - what do you think?
- PyBake: Create single file standalone Python scripts with builtin frozen file system
- I am frustrated with packaging python, please educate me.
-
A function decorator that rewrites the bytecode to enable goto in Python
Don't know if I agree about the goto thing, but there are actually a number of options now for delivering Python executables with varying degrees of self-containment.
When I evaluated the landscape a few years ago, I settled on PEX [1] as the solution that fit my use case best: it uses a system-provided Python + stdlib, but otherwise brings everything (including compiled modules) with it in a self-extracting executable. Other popular options include PyInstaller and cx_Freeze, which have different tradeoffs in size, speed, convenience, etc.
[1]: https://github.com/pantsbuild/pex
-
Mypyc: Compile type-annotated Python to C
Somewhat related, I had a devil of a time a little bit ago trying to ship a small Python app as a fully standalone environment runnable on "any Linux" (but for practical purposes, Ubuntu 16.04, 18.04, and 20.04). It turns out that if you don't want to use pip, and you don't want to build separate bundles for different OSes and Python versions, it can be surprisingly tricky to get this right. Just bundling the whole interpreter doesn't work either because it's tied to a particular stdlib which is then linked to specific versions of a bunch of system dependencies, so if you go that route, you basically end up taking an entire rootfs/container with you.
After evaluating a number of different solutions, I ended up being quite happy with pex: https://github.com/pantsbuild/pex
It basically bundles up the wheels for whatever your workspace needs, and then ships them in an archive with a bootstrap script that can recreate that environment on your target. Critically, it natively supports targeting multiple OS and Python versions; you just tell it explicitly which ones to include, e.g.:
--platform=manylinux2014_x86_64-cp-38-cp38 # 16.04
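As a rough illustration of how one `--platform` flag per target could be assembled, here is a small sketch that only builds an argument list and executes nothing. The requirement name, output file name and both helper functions are assumptions made up for the example; check `pex --help` for the flags your pex version actually accepts.

```python
import shlex


def platform_tag(platform: str, impl: str, pyver: str, abi: str) -> str:
    """Join the four parts of a platform string as in the example above."""
    return f"{platform}-{impl}-{pyver}-{abi}"


def pex_command(requirements: list[str], platforms: list[str], output: str) -> list[str]:
    """Build an argv list for pex; nothing is executed here."""
    argv = ["pex", *requirements]
    argv += [f"--platform={p}" for p in platforms]  # one flag per target
    argv += ["-o", output]
    return argv


cmd = pex_command(
    ["requests"],  # hypothetical requirement
    [platform_tag("manylinux2014_x86_64", "cp", "38", "cp38")],
    "app.pex",  # hypothetical output name
)
print(shlex.join(cmd))
```

Adding another target is then a matter of appending one more tag to the `platforms` list.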
ideas4
-
WTF is going on with R7RS Large?
https://github.com/samsquire/ideas4#334-knowledgegraph-progr...
-
Async rust – are we doing it all wrong?
How would you do control flow and scheduling and parallelism and async efficiently with this code?
`db.save()` and `download()` are IO intensive, whereas `document.query("a")` and `parse` are CPU intensive.
I think its work diagram looks like this: https://github.com/samsquire/dream-programming-language/blob...
I've tried to design a scalable multithreaded architecture that combines lightweight threads, thread pools for work, and control threads for IO epoll or liburing loops.
Here's the high level diagram:
https://github.com/samsquire/ideas5/blob/main/NonblockingRun...
The secret is modelling control flow as a data flow problem and having a simple but efficient scheduler.
I wrote about schedulers here and binpacking work into time:
https://github.com/samsquire/ideas4#196-binpacking-work-into...
I also have a 1:M:N lightweight thread scheduler/multiplexer:
https://github.com/samsquire/preemptible-thread
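One way to picture the split described above between IO-intensive and CPU-intensive steps is an event loop that awaits the IO steps and hands the CPU step to a worker pool. The sketch below uses Python's `asyncio` with stand-in functions; `download`, `parse`, `save`, `handle` and `crawl` are hypothetical and are not the code or the scheduler from the linked repositories.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def download(url: str) -> str:
    """Stand-in for an IO-intensive download; yields to the event loop."""
    await asyncio.sleep(0)
    return f"<a href='{url}/next'>next</a>"


def parse(html: str) -> list[str]:
    """Stand-in for CPU-intensive parsing; runs on a worker thread."""
    return [part.split("'")[0] for part in html.split("href='")[1:]]


async def save(links: list[str], store: list[str]) -> None:
    """Stand-in for an IO-intensive db.save()."""
    await asyncio.sleep(0)
    store.extend(links)


async def handle(url: str, pool: ThreadPoolExecutor, store: list[str]) -> None:
    loop = asyncio.get_running_loop()
    html = await download(url)                             # IO: event loop
    links = await loop.run_in_executor(pool, parse, html)  # CPU: worker pool
    await save(links, store)                               # IO: event loop


async def crawl(urls: list[str]) -> list[str]:
    store: list[str] = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        await asyncio.gather(*(handle(u, pool, store) for u in urls))
    return sorted(store)
```

Calling `asyncio.run(crawl([...]))` drives the whole pipeline. In CPython a thread pool keeps the event loop responsive but does not give true CPU parallelism because of the GIL; swapping in a `ProcessPoolExecutor` is the usual way to get that.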
-
It Took Me a Decade to Find the Perfect Personal Website Stack – Ghost+Fathom
My blogging/journalling setup is simple.
I just use GitHub. I just rely on the default repository view on GitHub.com
I create a README.md and add markdown headings to the bottom or to the top (bottom if it's a journal, top if it's a blog), and then when I get to 100-800 entries I create a new repository and repeat.
https://github.com/samsquire/ideas (2013)
https://github.com/samsquire/ideas4
https://github.com/samsquire/ideas3
https://github.com/samsquire/ideas2
-
Ask HN: Could you show your personal blog here?
Thanks for posting this Ask HN question.
I journal ideas and thoughts about computers and software. I am interested in software architecture, parallelism, async, coroutines, database internals, programming language implementation, software design and the web.
https://github.com/samsquire/ideas (2013)
https://github.com/samsquire/ideas2
https://github.com/samsquire/ideas3
https://github.com/samsquire/ideas4 <-- this is recent but needs editing
https://github.com/samsquire/ideas5 <-- this is what I'm working on now
https://github.com/samsquire/startups
https://github.com/samsquire/blog <-- thoughts I want to write about, but incomplete
I use README.md on GitHub and create a heading at the bottom for each entry. I use Typora on Windows or the GitHub web interface to edit.
-
Our Plan for Python 3.13
My deep interest is multithreaded code. I'm not sure that software engineers working on business software should spend much time debugging multithreading bugs: from my perspective, that is the wrong level of abstraction for business operations.
I'm looking for an approach to writing concurrent, parallel code that is elegant, easy to understand and makes it hard to introduce bugs. This requires alternative programming approaches and, in my view, alternative notations.
One such design uses monotonic state machines which can only move in one direction. I've designed a syntax and written a parser and very toy runtime for the notation.
https://github.com/samsquire/ideas5#56-stateful-circle-progr...
https://github.com/samsquire/ideas4#558-state-machine-formul...
The idea is inspired by LMAX Disruptor and queuing systems.
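A minimal sketch of what "can only move in one direction" might mean in code, assuming the states are totally ordered and a transition is valid only if it moves forward. The class and state names are hypothetical; this is not the syntax, parser or runtime from the linked entries.

```python
class MonotonicStateMachine:
    """State machine whose state index can only increase."""

    def __init__(self, states: list[str]) -> None:
        if not states:
            raise ValueError("at least one state is required")
        self._states = list(states)
        self._index = 0

    @property
    def state(self) -> str:
        return self._states[self._index]

    def advance_to(self, target: str) -> None:
        """Move forward to target; reject moves backwards or to the same state."""
        new_index = self._states.index(target)  # raises ValueError if unknown
        if new_index <= self._index:
            raise ValueError(f"cannot move from {self.state!r} to {target!r}")
        self._index = new_index
```

Because the index never decreases, a reader that has observed a given state knows that every earlier state has already been passed, which is part of what makes this shape attractive for concurrent code.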
-
io_uring support for libuv – 8x increase in throughput
This is really good. Thank you!
I've been studying how to create an asynchronous runtime that works across threads. My goal: neither CPU-bound nor IO-bound work should slow down the event loops.
I've only written two Rust programs, but in Rust you can presumably use Rayon (CPU scheduling) and Tokio (IO scheduling).
I wrote about using the LMAX Disruptor ringbuffer pattern between threads.
https://github.com/samsquire/ideas4#51-rewrite-synchronous-c...
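For readers unfamiliar with the pattern, here is a heavily simplified single-producer, single-consumer ring buffer with sequence counters, in the spirit of the Disruptor. It is a sketch under assumptions: it relies on CPython's GIL for memory visibility, whereas a real implementation uses memory barriers and cache-line padding, and the class and function names are hypothetical.

```python
import threading
import time


class SpscRingBuffer:
    """Single-producer, single-consumer ring buffer with sequence counters."""

    def __init__(self, capacity: int) -> None:
        self._slots = [None] * capacity
        self._capacity = capacity
        self._published = 0  # written only by the producer
        self._consumed = 0   # written only by the consumer

    def try_put(self, item) -> bool:
        if self._published - self._consumed == self._capacity:
            return False  # full: the producer must not overwrite unread slots
        self._slots[self._published % self._capacity] = item
        self._published += 1  # publish after the slot is written
        return True

    def try_get(self):
        if self._consumed == self._published:
            return False, None  # empty
        item = self._slots[self._consumed % self._capacity]
        self._consumed += 1  # release the slot back to the producer
        return True, item


def run_demo(count: int = 100) -> list[int]:
    ring = SpscRingBuffer(capacity=8)
    received: list[int] = []

    def producer() -> None:
        for i in range(count):
            while not ring.try_put(i):
                time.sleep(0)  # yield while the ring is full

    def consumer() -> None:
        while len(received) < count:
            ok, item = ring.try_get()
            if ok:
                received.append(item)
            else:
                time.sleep(0)  # yield while the ring is empty

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Each counter has exactly one writer, so no lock is needed: the producer only advances `_published` and the consumer only advances `_consumed`.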
I am designing a state machine formulation syntax that is thread safe and parallelises effectively. It looks like EBNF syntax or a bash pipeline. Parallel steps go in curly brackets. There is an implied interthread ringbuffer between pipes.
states = state1 | {state1a state1b state1c} {state2a state2b state2d} | state3
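To show how a line like the one above could be turned into a data structure, here is a small sketch that reads it as stages separated by `|`, where each stage is a sequence of groups and the names inside one pair of curly brackets form one parallel group. This is my reading of the notation, not the author's parser, and `parse_pipeline` is a hypothetical name.

```python
import re


def parse_pipeline(line: str) -> tuple[str, list[list[list[str]]]]:
    """Parse 'name = a | {b c} {d e} | f' into (name, stages).

    Each stage is a list of groups and each group is a list of state
    names; names inside one pair of curly brackets form one group.
    """
    name, _, body = line.partition("=")
    stages = []
    for stage_text in body.split("|"):
        groups = []
        # Match either a {...} group or a bare state name.
        for braced, bare in re.findall(r"\{([^}]*)\}|(\S+)", stage_text):
            groups.append([bare] if bare else braced.split())
        stages.append(groups)
    return name.strip(), stages
```

A runtime could then place a ring buffer between consecutive stages and run the members of each group in parallel, but that part is not sketched here.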
-
What Is Type-Level Programming?
This is very interesting and could lead to some futuristic programming technology.
I kind of want to plot the state space of a program to see all available states.
In my exploration of distributed systems, microservices and multithreaded systems, it is extremely helpful to try and see what potential states the system can be in. Global and local reasoning of these kinds of software is rather difficult.
I've written about value tracing but I've not heard of treating values as types. I would love to be able to see the trajectory of a value through different states.
https://github.com/samsquire/ideas4#571-value-calculus-varia...
I've never written a TLA+ specification and I'm a complete beginner to this space but I've been trying to understand the dining philosophers one. TLA+ Toolbox is aware of discrete states in the state space, which is absolutely awesome. Types can inform us about future possible valid states.
I began writing a visualisation of memory and animated the movement of memory around to try to reveal patterns.
https://replit.com/@Chronological/ProgrammingRTS#index.html
If we see types or values as positions, we can create animations of the state space unfolding in front of us. This is the dream.
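As a toy version of "seeing all available states", a breadth-first search over a transition function enumerates every reachable state and how many steps it takes to get there; those depths could serve as frames in an animation. The two-worker system below is a hypothetical example and has nothing to do with TLA+ Toolbox internals.

```python
from collections import deque
from typing import Callable, Hashable, Iterable


def reachable_states(
    initial: Hashable,
    successors: Callable[[Hashable], Iterable[Hashable]],
) -> dict[Hashable, int]:
    """Breadth-first search; maps each reachable state to its depth."""
    depth = {initial: 0}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            if nxt not in depth:
                depth[nxt] = depth[state] + 1
                queue.append(nxt)
    return depth


def two_workers(state: tuple[str, str]) -> list[tuple[str, str]]:
    """Hypothetical system: two workers that each go idle -> busy -> done."""
    order = {"idle": "busy", "busy": "done"}
    result = []
    for i, phase in enumerate(state):
        if phase in order:
            nxt = list(state)
            nxt[i] = order[phase]  # advance one worker by one step
            result.append(tuple(nxt))
    return result
```

Calling `reachable_states(("idle", "idle"), two_workers)` returns every interleaving of the two workers; real systems blow up quickly, which is why tools in this space put so much effort into pruning.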
-
Late Architecture with Functional Programming
Great comment!
>I think late architecture is orthogonal to functional, imperative
Absolutely. From a truly architectural view, procedural, functional, and method-oriented (current OO) are really only variations on the call/return architectural style. Good and sometimes important distinctions, but not really that far apart. They are very much about computing results from inputs. That is an appropriate architecture for fewer and fewer programs.
See Why Architecture Oriented Programming matters
https://blog.metaobject.com/2019/02/why-architecture-oriente...
and
Can Programmers Escape the Gentle Tyranny of call/return?
https://2020.programming-conference.org/details/salon-2020-p...
> its solution is higher level than even functional programming
Yes. Well, functional actually gets most of its utility from being lower level as far as paradigms go (less powerful). But yes.
> and more abstract
No. Well, yes, if expressed with current programming languages. But that's part of the problem set, not part of the solution set. We should be able to express our architectures less abstractly, more concretely, but for that we need linguistic support. Which is why I am working on that:
http://objective.st
> I want software architecture to be cheap and easy to change without breaking any existing behaviours. I don't know much research on this subject.
There was quite a bit of research at CMU, for example on packaging mismatch. See the famous paper "Architectural Mismatch: Why Reuse Is So Hard" and the follow-up in 2009, "Architectural Mismatch: Why Reuse Is Still So Hard".
https://repository.upenn.edu/cgi/viewcontent.cgi?article=107...*
Not much has changed since.
> https://github.com/samsquire/ideas4
> https://devops-pipeline.com
Will check those out. Dataflow is definitely a big part of it, with the extension of dataflow constraints (make, spreadsheets, "FRP"/"Rx"). But so is in-process REST with Storage Combinators!
And breaking down barriers between scripting and "real" programming.
-
Service Mesh Use Cases
Thanks for this.
I have never deployed or used a service mesh, but I am designing something similar at the code layer. It is designed to route between server components, that is, at the level of the architecture between threads in a multithreaded system.
The problem I want to solve is that I want architecture to be trivially easy to change with minimal code changes. This is the promise and allure of enterprise service buses and messaging queues.
I have managed RabbitMQ and I didn't enjoy it.
I want a system that can scale up and down, and where multiple instances of any system object can be introduced or removed without drastic rewrites.
I would like to decouple bottlenecks from code and turn them into runtime configuration.
My understanding of things such as Traefik and istio is that they are frustrating to set up.
Specifically I am working on designing interthread communication patterns for multithreaded software.
How do you design an architecture that is easy to change, scales and is flexible?
I am thinking of a message routing definition format that is extremely flexible and allows any topology to be created.
https://github.com/samsquire/ideas4#526-multiplexing-setting...
I think there is application of the same pattern to the network layer too.
Each communication event has associated with it an environment of keyvalues that look similar to this:
petsserver1
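One possible reading of routing on an environment of key-values is sketched below: each route lists the key-values it requires, and an event is delivered to every target whose requirements its environment satisfies. All keys, values and names here are hypothetical and are not taken from the linked definition format.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    """Deliver to `target` when every key in `match` equals the event's value."""
    match: dict[str, str]
    target: str


@dataclass
class Router:
    routes: list[Route] = field(default_factory=list)

    def targets_for(self, env: dict[str, str]) -> list[str]:
        """Return all targets whose match conditions are satisfied by env."""
        return [
            r.target
            for r in self.routes
            if all(env.get(k) == v for k, v in r.match.items())
        ]


# Hypothetical topology: keys and values are made up for illustration.
router = Router([
    Route({"kind": "request"}, "worker-pool"),
    Route({"kind": "request", "priority": "high"}, "fast-lane"),
    Route({"kind": "log"}, "log-writer"),
])
```

With this shape the topology lives in the route list, so changing who talks to whom means editing data rather than rewriting the components.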
-
Release engineering is exhausting so here's cargo-dist
Thanks for remembering me :-)
I would like things to run locally by default and then be deployed to the cloud where they run.
It should be easier to debug problems if I can get the code onto my machine and investigate issues with tools that my computer has, such as "strace", "perf" and the debug logging that I liberally apply to the build script.
In production we would have log aggregation and log search (such as the ELK stack), and it is a good habit to approach debugging production through tooling.
But CI/CD sits before that tooling in the pipeline. You could wire up your CI/CD to log to ELK, but I would prefer locally deployable software.
I think my focus on automating things means I want to be capable of seeing how the thing works without relying on a deployed black box in the cloud and using assumptions of how it works rather than direct investigation.
One of my journal entries is almost a lamentation of all the things that need to be done to release and use software.
This is that entry:
https://github.com/samsquire/ideas4#5-permanent-softwareplat...
I wonder if software could be deployed more like a URL that has all the information to configure a virtual machine. Docker over URL or something.
What are some alternatives?
mypyc - Compile type annotated Python to fast C extensions
preemptible-thread - How to preempt threads in user space
setup.py - 📦 A Human's Ultimate Guide to setup.py.
ideas2 - Another 85+ Ideas for Computing https://samsquire.github.io/ideas2/
python-goto - A function decorator, that rewrites the bytecode, to enable goto in Python
wg-async - Working group dedicated to improving the foundations of Async I/O in Rust
pyBake - Create single file standalone Python scripts with builtin frozen file system
ideas - a hundred ideas for computing - a record of ideas - https://samsquire.github.io/ideas/
plusplus - Enables increment operators in Python using a bytecode hack
saddle-data-graph - where does it come from, where does it go?
typed_python - An llvm-based framework for generating and calling into high-performance native code from Python.
periphery - A tool to identify unused code in Swift projects.