-
The article talks about multipart/form-data in particular.
Another thing one might run across is multipart/x-mixed-replace. I wrote a crate for that. [1] I didn't see a spec for it, but someone has since pointed out to me that it's probably identical to multipart/mixed. Seeing an example in the multer README, it now clicks that I should have looked at RFC 1341, which says this:
> All subtypes of "multipart" share a common syntax, defined in this section.
...and written a crate general enough for all of them. Maybe I'll update my crate for that sometime. My crate currently assumes there's a Content-Length header for each part, which the RFC doesn't require but which makes sense in the context I use it. It wouldn't be hard to also support plain boundary delimiters, and then maybe add a form-data parser on top of that.
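To make the "common syntax" idea concrete, here is a rough Python sketch (not the crate's actual code) of a splitter that works on boundary delimiters alone and uses a per-part Content-Length only when one happens to be present. The function name `split_multipart` is made up for illustration, and it assumes the whole body is already in memory and that the boundary string does not appear in the preamble.

```python
from typing import Dict, Iterator, Tuple


def split_multipart(body: bytes, boundary: bytes) -> Iterator[Tuple[Dict[str, str], bytes]]:
    """Yield (headers, payload) for each part of a complete multipart body.

    Simplified sketch: no header folding, no streaming, minimal validation.
    """
    delimiter = b"--" + boundary
    # Everything before the first delimiter is preamble and is ignored.
    pos = body.find(delimiter)
    if pos < 0:
        raise ValueError("opening boundary not found")
    pos += len(delimiter)

    while True:
        # "--boundary--" is the close delimiter: no more parts.
        if body.startswith(b"--", pos):
            return
        # Skip the rest of the delimiter line (optional padding, then CRLF).
        eol = body.find(b"\r\n", pos)
        if eol < 0:
            raise ValueError("truncated delimiter line")
        pos = eol + 2

        # Headers run until a blank line; a part may have no headers at all.
        headers: Dict[str, str] = {}
        if body.startswith(b"\r\n", pos):
            pos += 2
        else:
            header_end = body.find(b"\r\n\r\n", pos)
            if header_end < 0:
                raise ValueError("truncated part headers")
            for line in body[pos:header_end].split(b"\r\n"):
                name, _, value = line.partition(b":")
                headers[name.strip().lower().decode("latin-1")] = value.strip().decode("latin-1")
            pos = header_end + 4

        if "content-length" in headers:
            # Trust the declared length, then check that a delimiter follows.
            end = pos + int(headers["content-length"])
            if not body.startswith(b"\r\n" + delimiter, end):
                raise ValueError("Content-Length does not line up with a boundary")
        else:
            # No length given: scan forward for the next delimiter.
            end = body.find(b"\r\n" + delimiter, pos)
            if end < 0:
                raise ValueError("closing boundary not found")
        yield headers, body[pos:end]
        pos = end + 2 + len(delimiter)
```

Trusting Content-Length lets a parser skip the scan for the next delimiter; the fallback is what the RFC syntax alone guarantees.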
btw, the article also talks specifically about proxying the body. I don't get why they're parsing the multipart data at all. I presume they have a reason, but I don't see it explained. I'd expect that a body is a body is a body. You can stream it along, and perhaps also buffer it in case you want to support retrying the backhaul request. You'd probably stop buffering at some byte limit, at which point you give up on the possibility of retries, because keeping arbitrarily large bodies around (in RAM or even spilling to SSD/disk) doesn't sound fun.
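A minimal sketch of that "stream it along, buffer up to a limit" idea, in Python. The function name, the `send` callback, and the `limit` parameter are all hypothetical, and a real proxy would do this asynchronously; this only shows the bookkeeping.

```python
from typing import Callable, Iterable, Optional


def stream_with_retry_buffer(
    chunks: Iterable[bytes],
    send: Callable[[bytes], None],
    limit: int,
) -> Optional[bytes]:
    """Forward every chunk via send() while keeping a copy for a possible retry.

    Returns the buffered body if it stayed within `limit` bytes, or None
    once the body grew too large to be worth keeping around.
    """
    buffered = bytearray()
    replayable = True
    for chunk in chunks:
        send(chunk)  # streaming upstream always happens
        if replayable:
            if len(buffered) + len(chunk) > limit:
                # Too big: give up on retries and free the memory.
                replayable = False
                buffered.clear()
            else:
                buffered += chunk
    return bytes(buffered) if replayable else None
```

If the return value is not None, the caller can resend it when the backhaul request fails; otherwise the failure just propagates to the client.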
[1] https://crates.io/crates/multipart-stream
-
Shameless plug for my multipart crate, which I've been using happily in production for a long time now: https://github.com/cetra3/mpart-async
-
You can technically add a Content-Length header for each part. It's not forbidden by the RFC, but nor is it common. It caused [problems](https://github.com/square/okhttp/issues/2138) for OkHttp, and they eventually removed it. Might be fine for internal-only use, though.
Boundaries are a lot like UUIDs, and rely on the same logic. When generating random data, once you have enough bits, the odds are against that sequence of bits ever having been generated before in the universe.
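To put rough numbers on the UUID comparison, here is a small Python sketch. The function names are made up, 128 bits is just an assumed size (the same as a UUID), and the probability uses the usual birthday approximation rather than an exact figure.

```python
import secrets


def make_boundary(n_bytes: int = 16) -> str:
    """Return a random boundary with n_bytes * 8 bits of entropy, hex-encoded.

    With the default this is 32 characters, well under the 70-character
    limit on boundaries.
    """
    return secrets.token_hex(n_bytes)


def collision_probability(num_boundaries: int, bits: int) -> float:
    """Birthday approximation: p ~= n * (n - 1) / 2^(bits + 1).

    Only meaningful while the result is much smaller than 1.
    """
    return num_boundaries * (num_boundaries - 1) / 2 ** (bits + 1)
```

Under these assumptions, even a trillion generated 128-bit boundaries give a collision probability of roughly 1.5e-15, which is why nobody bothers scanning the payload for the boundary first.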
-
HTTP/1 requests (uploads in this case) are also separate to some degree (though there are fairly stringent limits on connections per domain, iirc, which HTTP/2 resolves via the mentioned streams/multiplexing).
The problem they have specifically would be that in a single request (a form post, for example) those uploads will be linear.
The solution really boils down to parallelizing the upload, using protocols/standards like https://tus.io/ or S3-compatible APIs to push the data up, then synchronizing with a record/document on the server.