| | smart_open | Nginx |
|---|---|---|
| Mentions | 6 | 99 |
| Stars | 3,091 | 20,211 |
| Growth | 0.7% | 0.7% |
| Activity | 8.3 | 8.9 |
| Last commit | 12 days ago | 9 days ago |
| Language | Python | C |
| License | MIT License | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
smart_open
- smart_open: Utils for streaming large files (S3, HDFS, gzip, bz2...)
- Use AWS to unzip all of Wikipedia in 10 minutes
We’re using smart_open, which is an amazing library that lets you open objects in S3 (and other cloud object stores) as if they’re files on your filesystem. It’s obviously critical that we’re able to seek to an arbitrary position in an S3 file without first downloading the whole thing. We’ll assume you’re using Poetry, but you should be able to follow along with any other package manager:
- Using AWS and Hyperscan to match regular expressions on 100GB of text
If you didn’t follow along with the first article in this series, you should be able to follow this article with your own dataset as long as you install smart_open and Meadowrun. smart_open is an amazing library that lets you open objects in S3 (and other cloud object stores) as if they’re files on your filesystem, and Meadowrun makes it easy to run your Python code on the cloud.
- Ask HN: Codebases with great, easy to read code?
I see that you're primarily looking into Python work, so I'd recommend `smart_open` as a nice, compact way to get started.
https://github.com/RaRe-Technologies/smart_open
- How to open an s3 binary file in lambda using python open() function?
You want smart_open. It gives you a (more complete) file-like interface to many different storage systems, including s3. You can read and seek as needed.
- Fsspec: Filesystem Interfaces for Python
See also smart_open: https://github.com/RaRe-Technologies/smart_open which might be more user-friendly? Never used it myself but it was on HN before. Discussion on their bugtracker: https://github.com/RaRe-Technologies/smart_open/issues/579
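Several of the posts above describe smart_open as giving a file-like interface with `read` and `seek` over S3 objects. A minimal sketch of that usage pattern follows; it assumes `smart_open.open(uri, mode)` is available when no opener is passed. The helper `read_slice`, its `opener` parameter, and the bucket name in the comment are hypothetical names for illustration only. The demonstration runs against a local temporary file with the built-in `open()`, so it does not need AWS credentials.

```python
import os
import tempfile


def read_slice(uri, offset, length, opener=None):
    """Return `length` bytes starting at byte `offset` of the object at `uri`.

    `read_slice` and `opener` are hypothetical names for this sketch;
    they are not part of smart_open.
    """
    if opener is None:
        # Imported lazily so the sketch also runs where smart_open is not
        # installed (pip install smart_open).
        from smart_open import open as opener
    with opener(uri, "rb") as f:
        f.seek(offset)  # move to the requested position before reading
        return f.read(length)


# Local demonstration using the built-in open(). With smart_open installed,
# an S3 URI such as "s3://my-bucket/my-key" (hypothetical) could be passed
# instead, leaving `opener` unset.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789abcdef")
demo_path = tmp.name
demo_slice = read_slice(demo_path, 10, 3, opener=open)
os.remove(demo_path)
print(demo_slice)
```

Passing the opener in as a parameter keeps the seeking logic identical for local and remote objects, which is the property the posts highlight.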
Nginx
- Nginx 1.26.0 Stable Released
Yeah, unless I'm looking at it wrong, there doesn't seem to be any meaningful difference between 1.25.5 and 1.26.0:
https://github.com/nginx/nginx/compare/release-1.25.5...rele...
- How to securely reverse-proxy ASP.NET Core web apps
However, it's very unlikely that .NET developers will directly expose their Kestrel-based web apps to the internet. Typically, we use other popular web servers like Nginx, Traefik, and Caddy to act as a reverse-proxy in front of Kestrel for various reasons:
- Ask HN: Is nginx.org (the domain-name itself) gone?
- Freenginx: Core Nginx Developer Announces Fork of Popular Web Server
> I actually don't understand why I am seeing arguments like this all the time.
Have a look at:
https://github.com/nginx/nginx/blob/master/src/http/modules/...
It's got the whole checklist: nginx idiosyncratic module system, inline parsing, custom utf conversion, buffer preallocation and adjustments, linked lists, comments about side effects of custom allocator, and probably other things.
It's not easy to deal with source like that and any serious improvement to that area would effectively be a rewrite anyway.
Since anything doing work in nginx is a module anyway, it wouldn't even have to be a full rewrite in one go.
- The Internet is Maintained by 1 Software Developer
According to this article, Nginx is being used to serve 34% of all websites in the world. I checked out who's contributing to Nginx, and just like I thought, the project has 8,208 commits, and 5,366 of those commits were made by 2 software developers: igorsoev and mdounin.
- [06/52] Accessible Kubernetes with Terraform and DigitalOcean
- Freenginx.org
- Performance benchmark of PHP runtimes
Nginx + Roadrunner (fcgi mode)
- Web CGI programs aren't particularly slow these days
Apache’s mod_fastcgi’s last commit was 2 weeks ago:
https://svn.apache.org/viewvc/httpd/httpd/trunk/
It’s a fork of what you linked (and was more popular afaik back when fastcgi was state of the art, and apache was the undisputed champion of web servers).
These days, nginx has more market share than apache, and its fastcgi module is one of the more recently updated ones in its source tree (5 months vs multiple years):
https://github.com/nginx/nginx/tree/master/src/http/modules
If I was going to build an embedded web server, I’d start with no_std Rust, though probably with axum + tokio, since that’s already memory safe-ish.
If I needed fastcgi for some reason (dynamically loadable endpoints, or os-level isolation), there are at least four implementations of fastcgi for it. No idea if any are decent though.
- Five Apache projects you probably didn't know about
APISIX is an API Gateway. It builds upon OpenResty, a Lua layer built on top of the famous nginx reverse-proxy. APISIX adds abstractions to the mix, e.g., Route, Service, Upstream, and offers a plugin-based architecture.
What are some alternatives?
- s3fs - Amazon S3 filesystem for PyFilesystem2
- Caddy - Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
- Streamz - Real-time stream processing for python
- envoy - Cloud-native high-performance edge/middle/service proxy
- s3path - s3path is a pathlib extension for AWS S3 Service
- Squid - Squid Web Proxy Cache
- PyFilesystem2 - Python's Filesystem abstraction layer
- nestjs-monorepo-microservices-proxy - Example of how to implement a Nestjs monorepo with no shared folder
- rxsci - ReactiveX for data science
- Hiawatha - Hiawatha is an open source webserver with security, easy to use and lightweight as the three key features. Hiawatha supports among others (Fast)CGI, IPv6, URL rewriting and reverse proxy. It has security features no other webserver has, like blocking SQL injections, XSS and CSRF attacks and exploit attempts. The built-in monitoring tool makes it perfect for large scale deployments.
- fluvio-client-python - The Fluvio Python Client!
- YARP - A toolkit for developing high-performance HTTP reverse proxy applications.