php-spx
babashka
| | php-spx | babashka |
|---|---|---|
| Mentions | 7 | 112 |
| Stars | 1,872 | 3,790 |
| Growth | - | 0.8% |
| Activity | 7.4 | 9.2 |
| Last commit | 3 months ago | 6 days ago |
| Language | C | Clojure |
| License | GNU General Public License v3.0 only | Eclipse Public License 1.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
php-spx
-
What are modern profiling tools?
I haven't used it in a while, but https://github.com/NoiseByNorthwest/php-spx is worth checking out.
-
How to profile your PHP applications with Xdebug
https://github.com/NoiseByNorthwest/php-spx
SPX can be loaded with docker-compose the way this article does for Xdebug. But if you already have a PHP environment, the easiest way to install it is to compile it yourself (sudo apt install php-dev && make && cp modules/spx.so /usr/lib/php/....).
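For reference, a minimal sketch of a from-source build using the standard PHP extension (phpize) workflow; the exact paths depend on your distribution, so treat this as an outline and check the php-spx README for the authoritative steps:

```shell
# Build and install the SPX extension from source (standard phpize workflow).
git clone https://github.com/NoiseByNorthwest/php-spx.git
cd php-spx
phpize            # prepare the build against the locally installed PHP
./configure
make
sudo make install # copies spx.so into the PHP extension directory

# Then enable it, e.g. in php.ini or a conf.d snippet:
#   extension=spx.so
```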
-
Looking for a PHP app profiler
php-spx
-
How to use xdebug to pinpoint PHP in a large application?
Looks like this one hasn't been mentioned yet: you can try SPX (https://github.com/NoiseByNorthwest/php-spx)
-
Crystal Lang 1.0 Release
(See also my other comment, which makes a totally different point that I decided to note separately because this got big and would have buried it)
Well, I have ADHD. I've found the most effective approach (on top of treatment) that helps me retain focus is reexec-on-save, a la `while :; do tput clear; $thing; inotifywait -q -e moved_to .; done`. I usually have a dozen of those in old shell histories (^R FTW). (Ha, my laptop actually has exactly 12, and my other machine has 23 - although ignoredups is off...)
$thing might be `bash ./script.sh` (because my text editor's atomic rename doesn't understand execute bits >.>), `php script.php` or `gcc -O0 script.c && ./script`. (Also, as an aside I used to use `-e close_write $file` until I realized watching even giant directories is equivalently efficient to watching a file.)
Shell scripts (the small kind that run few subprocesses) are typically fast. Likewise, small C programs of <1000-2000 lines compile just about instantly on modern hardware; and where modern hardware isn't available and what I'm trying to do doesn't leverage too many libraries or whatnot, tcc has been able to swing the balance firmly in my favor in the past, which has been great.
But for better or worse, PHP is currently the language I use the most, because it's faster than Python and Ruby.
A while back I wanted to do a bit of analysis on a dataset of information that was only published as a set of PDF documents... yayyy. But after timidly gunzipping the stream blocks and googling random bits of PDF's command language ("wat even is this"), I discovered to my complete surprise that it was trivial to interpret the text coordinate system and my first "haha let's see how bad this is" actually produced readable text on pretty much the first go. (To be pedantic, step #-1 was "draw little boxes", step #0 was "how to x,y correctly" and step #1 was "replace boxes with texWHAT it worked?!")
With rendering basically... viable (in IIRC 300-500 LOC O.o), the next step was the boring stir-the-soup-for-8-hours bespoke state machine that cross-correlated text coordinates with field meanings ("okay, that's a heading, and the next text instruction draws the field value underneath. OK, assert that the heading is bold, the value is not, and they're both exactly the same (floating-point) Y position.")
While that part took a while, it was mostly extremely easy, because I was pretty much linearly writing the script "from start to finish", ie just chipping away at the rock face of the task at hand until I processed an entire document, then the next document ("oh no"), then the next one ("ugh") and so forth ("wait, the edge cases are... decreasing? :D"). My workflow was pretty much founded entirely on the above-noted method.
Loading/gunzipping a given PDF and getting to the point where the little pipeline would crash would typically complete in the span of time it would take me to release the CTRL key after hitting CTRL+S. So while the process was objectively quite like stirring soup, it did not feel like that at all and I was able to kind of float a bit as my brain cohesively absorbed the mental model of the architecture I was building without any distractions, pauses or forced context switches getting jammed in the mental encoding process like so many wrenches.
Soon 15 documents were handled correctly, then 20, then 30, then 100 ("oooh, if all the items on the page add up exactly right it pushes line 2 of the summary heading down to the second page! Hmmm... how on earth to special-case that without refactoring to look at more than 1 page at a time..."), and then I hit some sort of threshold and it suddenly just started ticking through PDFs like crazy without asserting. Which was both awesome and a Problem™: the thing ran at something like ~60 PDFs/sec, and while jumping to just after the last successfully-processed PDF on restart worked great when the code crashed constantly, now I was sitting spinning for tens of seconds, getting distracted as I anticipated the next crash. ADHD(R)(TM).
I wasn't surprised to learn from htop that the script was disk-bound; for some reason my ZFS mirror setup will happily read sequentially at 200MB/s, but thousands-of-tiny-files situations are... suffice it to say apt unconditionally takes 60 seconds to install the smallest thing, unless the entire package db is in the FS cache. I'm not sure why. The PDFs were sharded sanely, but they were still in separate files. So I decided to pack them all into a giant blob, and since there weren't too many PDFs and they were numbered sequentially, I used a simple offset-based index at the front of the blob, where `fseek($data_start + ($n * 4)); $o = fread(4); fseek($o);` would give me random seeking.
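The offset-index idea can be sketched concretely. This is not the commenter's actual code; it's a hypothetical minimal version (function names `pack_blob`/`read_record` are invented) storing a 4-byte record count, then count+1 big-endian 32-bit offsets (the last one marking end-of-blob), then the concatenated records:

```php
<?php
// Sketch: pack records into one blob with a fixed-size offset table at the
// front, so any record can be reached by number with two fseek() calls.

function pack_blob(array $records): string {
    $count = count($records);
    $table = pack('N', $count);            // 4-byte record count
    $pos   = 4 + ($count + 1) * 4;         // data starts after the table
    foreach ($records as $r) {
        $table .= pack('N', $pos);         // big-endian 32-bit start offset
        $pos   += strlen($r);
    }
    $table .= pack('N', $pos);             // sentinel: total blob length
    return $table . implode('', $records);
}

function read_record($fh, int $n): string {
    fseek($fh, 4 + $n * 4);                        // jump into the offset table
    $u = unpack('N2o', fread($fh, 8));             // this record's start + next start
    fseek($fh, $u['o1']);
    return fread($fh, $u['o2'] - $u['o1']);
}

// Usage with an in-memory stream standing in for the blob file:
$fh = fopen('php://memory', 'r+');
fwrite($fh, pack_blob(["first pdf", "second", "third record"]));
```

Reading count+1 offsets (rather than storing lengths) keeps the table entries fixed-size, which is what makes the `4 + $n * 4` arithmetic possible.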
Reading the blob instead promptly pegged a single CPU core (yay!), and gave me IIRC ~150+ PDFs/sec. This was awesome. But I was still just a tiny bit curious, so after googling around for a profiler and having a small jawdrop moment about SPX (https://github.com/NoiseByNorthwest/php-spx), I had a tentative look at what was actually using the most CPU (via `SPX_ENABLED=1 php ./script.php`, which will automatically print a one-page profile trace to stdout at graceful exit or ^C).
Oh. The PDF stack machine interpreter is what's taking all the CPU time. That tiny 100 line function was the smallest in the whole script. lol
So, I moved that function to the preprocessor/packer, then (after some headscratching) serialized the array of tokenized commands/strings into the blob by prefixing commands with \xFF and elements with \xFF\xFE\xFF so I could explode() on \xFF and tell commands from strings by checking if the previous entry was \xFE (and just skip entries of '\xFE' when I found them) :D. Then I reran the preprocessor to regenerate the pack file.
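A hedged sketch of that framing scheme, as I read the description (the token-pair shape and function names are my assumptions, not the original code): commands are prefixed with `"\xFF"`, literal strings with `"\xFF\xFE\xFF"`, so a single `explode()` on `"\xFF"` yields a flat list in which a `"\xFE"` entry flags the *next* entry as a string rather than a command:

```php
<?php
// Sketch of \xFF / \xFE\xFF token framing: serialize a tokenized command
// stream into one string, then recover it with a single explode().

function serialize_tokens(array $tokens): string {
    $out = '';
    foreach ($tokens as [$kind, $value]) {   // [kind, value] pairs
        $out .= ($kind === 'str') ? "\xFF\xFE\xFF$value" : "\xFF$value";
    }
    return $out;
}

function parse_tokens(string $blob): array {
    $tokens = [];
    $isStr  = false;
    foreach (explode("\xFF", $blob) as $entry) {
        if ($entry === '') continue;                        // leading separator
        if ($entry === "\xFE") { $isStr = true; continue; } // marker: next entry is a string
        $tokens[] = [$isStr ? 'str' : 'cmd', $entry];
        $isStr = false;
    }
    return $tokens;
}

// Round-trip a tiny PDF-ish token stream:
$tokens    = [['cmd', 'BT'], ['str', 'Hello'], ['cmd', 'Tj'], ['cmd', 'ET']];
$roundtrip = parse_tokens(serialize_tokens($tokens));
```

Note the scheme relies on `\xFF`/`\xFE` never appearing inside the payload bytes, which holds for ASCII-range PDF operators and (presumably) the extracted text here.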
$ php convert_dlcache.php
-
Don't blindly trust profilers
I've written a bit about this issue in php-spx's README https://github.com/NoiseByNorthwest/php-spx#notes-on-accuracy
- A simple straight-to-the-point PHP profiling extension with its built-in web UI
babashka
-
A Tour of Lisps
It also gives you access to Babashka if you want Clojure for other use cases where start-up time is an issue.
- Babashka: Fast native Clojure scripting runtime
-
What's the value proposition of meta circular interpreters?
I've tried researching this myself and can't find too much. There's this project metaes which is an mci for JS, and there's the SCI module of the Clojure babashka project, but that's about it. I also saw Triska's video on mci but it was pretty theoretical.
-
Adding Dependencies on Clojure Project the Node Way: A Small Intro to neil CLI
Created by the same developer who created babashka, which is a way to write bash scripts, node scripts, and even AppleScripts using Clojure. He's a prolific and influential developer in the Clojure community. This is how borkdude's neil helps us:
- Babashka
-
Pure Bash Bible
Not what you asked for but there is Babashka for scripting in Clojure.
-
Critique of Lazy Sequences in Clojure
Clojure's lazy-by-default sequences are ergonomically wonderful, but the language provides many ways to use strict evaluation if you want to. They aren't really a hassle, either. I've been doing Clojure for the last few years and have a few grievances, but overall it's the most coherent, well-thought-out language I've used, and I can't recommend it enough.
There is the issue of startup time with the JVM, but you can also do AOT compilation now so that really isn't a problem. Here are some other cool projects to look at if you're interested:
Malli: https://github.com/metosin/malli
Babashka: https://github.com/babashka/babashka
-
Sharpscript: Lisp for Scripting
Being a Clojure addict, I guess I have to leave the obligatory link to Babashka too then: https://github.com/babashka/babashka (Native, fast starting Clojure interpreter for scripting)
-
Rash – The Reckless Racket Shell
which is now on hiatus. babashka: https://babashka.org
-
Are there any languages (that are in common use in companies) and higher-level that give you the same feeling of simplicity and standardization as C?
I've enjoyed babashka for scripting; it's close enough to Clojure to allow using some/many libraries, but (probably) not for embedding.
What are some alternatives?
PHPSpy - low-overhead sampling profiler for PHP 7+
janet - A dynamic language and bytecode vm
development - Docker based local development environment
malli - High-performance data-driven data specification library for Clojure/Script.
php-memory-profiler - Memory profiler for PHP. Helps finding memory leaks in PHP scripts.
joker - Small Clojure interpreter, linter and formatter.
clockwork - Clockwork - php dev tools in your browser - server-side component
nbb - Scripting in Clojure on Node.js using SCI
coz - Coz: Causal Profiling
clojure-lsp - Clojure & ClojureScript Language Server (LSP) implementation
Vrmac - Vrmac Graphics, a cross-platform graphics library for .NET. Supports 3D, 2D, and accelerated video playback. Works on Windows 10 and Raspberry Pi4.
racket - The Racket repository