html5ever
taffy
Metric | html5ever | taffy |
---|---|---|
Mentions | 5 | 36 |
Stars | 1,987 | 1,807 |
Growth | 1.3% | 4.6% |
Activity | 7.6 | 8.5 |
Last commit | 10 days ago | 16 days ago |
Language | Rust | Rust |
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
html5ever
-
I'm fed up with it, so I'm writing a browser
Would you consider using some libraries in your project? There are lots of good ones in the Rust ecosystem, and many of them are not part of any existing browsers.
For example:
- https://github.com/servo/html5ever (HTML parsing - note: this is used in Servo)
- https://github.com/parcel-bundler/lightningcss (CSS parsing)
- https://github.com/DioxusLabs/taffy (web layout)
- https://github.com/pop-os/cosmic-text (text layout and rendering)
Obviously you should be free to work on whatever you like, but just as a benchmark on the scope of your project: I spent ~6 months implementing just the CSS Grid algorithm in Taffy last year. An entire browser from literal scratch is probably a 10 year project for one person.
-
Ask HN: A fast, Rust HTML parser that works?
So I'm doing some web scraping in Rust, and so I will need to parse HTML. [scraper](https://docs.rs/scraper/latest/scraper/) (which uses [html5ever](https://github.com/servo/html5ever)) is doing fine except that it's the bottleneck of my application.
So I need a faster parser. I've tried [tl](https://docs.rs/tl/latest/tl/) which would've been perfect except that it doesn't actually work on the HTML I have. When I try to `query_selector` the elements I need, it returns nothing.
[Kuchiki](https://docs.rs/kuchiki/latest/kuchiki/) is abandoned.
I couldn't figure out how to get [lol-html](https://github.com/cloudflare/lol-html) to work for me (it's designed for re-writing HTML, whatever that means). It doesn't seem to have an API to extract the inner text of an element.
[html5gum](https://github.com/untitaker/html5gum) seems to be just an HTML tokenizer, or otherwise just too low-level. I have not yet tried [quick-xml](https://github.com/tafia/quick-xml/) but judging from the README, it's pretty low-level too. I mean, if these are the only options left then I will try them. Otherwise, I would love to use a parser that's faster but as ergonomic as `scraper` or `tl`.
At this point, I would be happy with an Lxml bridge/port of some sort. I don't need to mutate HTML, just parse and read data from it.
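To make the "just a tokenizer" complaint above concrete, here is a minimal sketch of what working at token level tends to look like. It uses only the standard library; the `Token` type and the `tokenize` and `inner_text` functions are hypothetical names invented for this illustration and are not the API of html5gum, html5ever, or any other crate. It assumes a tiny, well-formed HTML subset (no comments, no entities, no `<` or `>` inside attribute values).

```rust
/// A token as a low-level tokenizer might emit it.
/// Hypothetical type for illustration; not taken from any real crate.
#[derive(Debug, PartialEq)]
enum Token {
    StartTag(String),
    EndTag(String),
    Text(String),
}

/// Split a tiny, well-formed HTML subset into tokens.
/// Assumption: no comments, no entities, no '<' or '>' inside attributes.
fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut rest = input;
    while !rest.is_empty() {
        if let Some(stripped) = rest.strip_prefix('<') {
            // Everything up to the next '>' is the tag body.
            let end = stripped.find('>').unwrap_or(stripped.len());
            let body = &stripped[..end];
            rest = stripped.get(end + 1..).unwrap_or("");
            if let Some(name) = body.strip_prefix('/') {
                tokens.push(Token::EndTag(name.trim().to_ascii_lowercase()));
            } else {
                // The tag name is the first word; attributes are dropped.
                let name = body.split_whitespace().next().unwrap_or("");
                let name = name.trim_end_matches('/').to_ascii_lowercase();
                tokens.push(Token::StartTag(name));
            }
        } else {
            // Plain text runs until the next tag or the end of input.
            let end = rest.find('<').unwrap_or(rest.len());
            tokens.push(Token::Text(rest[..end].to_string()));
            rest = &rest[end..];
        }
    }
    tokens
}

/// Collect the text inside the first `<tag>...</tag>` pair,
/// tracking nesting depth by hand because there is no tree to walk.
fn inner_text(tokens: &[Token], tag: &str) -> Option<String> {
    let mut depth = 0usize;
    let mut out = String::new();
    for token in tokens {
        match token {
            Token::StartTag(name) if name.as_str() == tag => depth += 1,
            Token::EndTag(name) if name.as_str() == tag && depth > 0 => {
                depth -= 1;
                if depth == 0 {
                    return Some(out);
                }
            }
            Token::Text(text) if depth > 0 => out.push_str(text),
            _ => {}
        }
    }
    None
}

fn main() {
    let tokens = tokenize("<div><p>Hello, <b>world</b>!</p></div>");
    println!("{:?}", inner_text(&tokens, "p"));
}
```

The point of the sketch is the bookkeeping in `inner_text`: with only a token stream, the caller has to track nesting manually, which is the work a tree-building parser such as `scraper` does for you.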
- Any HTML parsing resources without going straight to W3C?
- I’m developing a Rust module like Google's PageSpeed nginx module, which will rewrite HTML for each request it receives for dynamic optimisation. What library is fastest for this? I’m using this now
-
What is the best way to parse HTML tags?
See https://github.com/servo/html5ever/tree/master/rcdom for an example implementation to imitate.
taffy
-
Show HN: Dropflow, a CSS layout engine for node or <canvas>
I maintain a standalone web layout engine[0] (currently implementing Flexbox and CSS Grid) which has no scripting support. The use of `<script>` in WPT layout tests is a major blocker to us running WPT tests against our library. Yoga (used by React Native) is in a similar position.

Do you think the WPT would accept pull requests replacing such tests with equivalent tests that don't use `<script>` (perhaps using a build script to generate multiple tests instead - or simply writing out the tests longhand)?

I could run against only the ref-tests, but if I can't get full coverage then the WPT seems to provide little value over our own test suite.

[0]: https://github.com/DioxusLabs/taffy
-
CSS for Printing to Paper
> Is there any easy to use/hack HTML layouting engine where I could experiment with custom CSS attributes and bridge that gap? Would anything from Servo be suitable?
Servo could be used for this. You'd want to add support for parsing the CSS properties themselves to the style crate in https://github.com/servo/stylo and then the layout implementation to the layout2020 crate in https://github.com/servo/servo. You do effectively get a whole browser though.
I'm currently working on building a lighter weight / hackable layout engine based on a combination of https://github.com/servo/stylo (for css parsing and selector resolution), https://github.com/DioxusLabs/taffy (for box-level layout) and https://github.com/pop-os/cosmic-text (for flow/inline layout). I expect to have something decent in around 6 months
Neither of these setups currently have any support for pagination though.
-
I'm fed up with it, so I'm writing a browser
I maintain a web layout library that is designed to be integrated into other software:
https://github.com/DioxusLabs/taffy
It needs to be combined with a text layout engine (such as https://github.com/pop-os/cosmic-text), and it doesn't support everything yet (notable features that are currently missing: "float", "display: inline-block", "box-sizing: content-box", "position: static"). But we have Block, Flexbox and CSS Grid support with more on the way.
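As a rough illustration of what "box-level layout" means in the Flexbox case, the sketch below distributes free space along the main axis of a single row in proportion to each item's flex-grow factor. It is a deliberately simplified standalone example: the `Item` struct and `layout_row` function are hypothetical names, not Taffy's API, and it ignores shrinking, min/max constraints, gaps, margins, wrapping, and the special case where the grow factors sum to less than 1.

```rust
/// A flex item reduced to the two inputs this sketch needs.
/// Hypothetical type for illustration; not Taffy's style struct.
struct Item {
    basis: f32, // preferred main-axis size
    grow: f32,  // flex-grow factor
}

/// Return (offset, size) along the main axis for each item in one row.
/// Simplification: no shrink, min/max, gaps, margins, or wrapping.
fn layout_row(container: f32, items: &[Item]) -> Vec<(f32, f32)> {
    let used: f32 = items.iter().map(|i| i.basis).sum();
    let total_grow: f32 = items.iter().map(|i| i.grow).sum();
    // Free space is whatever the base sizes leave over, never negative here.
    let free = (container - used).max(0.0);
    let mut offset = 0.0;
    items
        .iter()
        .map(|item| {
            // Each item receives a share of the free space proportional to its grow factor.
            let extra = if total_grow > 0.0 {
                free * item.grow / total_grow
            } else {
                0.0
            };
            let size = item.basis + extra;
            let placed = (offset, size);
            offset += size;
            placed
        })
        .collect()
}

fn main() {
    let items = [
        Item { basis: 100.0, grow: 0.0 },
        Item { basis: 50.0, grow: 1.0 },
        Item { basis: 50.0, grow: 3.0 },
    ];
    for (offset, size) in layout_row(300.0, &items) {
        println!("offset = {offset}, size = {size}");
    }
}
```

With a 300-unit container, the 100 units of free space are split 1:3 between the second and third items. A real layout engine layers the rest of the specification on top of this core step.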
-
Looking for this. html + css rendering through wgpu.
All of these projects have in common that they use Taffy (the project that I work on!) for box-level layout (which currently gives them block, flexbox, and grid layout), and are either using or planning to use cosmic-text for text/inline layout. This gives you a decent first approximation of web layout, but it's not perfect and there are major features like float, display: inline-block, position: static, box-sizing: content-box missing. Not to mention that none of these implementations currently resolve CSS selectors, so you are effectively limited to inline styles (if you're interested in something in that direction then you may be interested in https://github.com/vizia/vizia).
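For a sense of what "limited to inline styles" means in practice, here is a minimal sketch that turns the value of a `style="..."` attribute into property/value pairs, with no selector matching or cascade involved. The `parse_inline_style` function is a hypothetical helper written for this illustration, not part of Taffy or any of the projects mentioned, and it is far from a real CSS parser.

```rust
/// Parse an inline `style="..."` value into (property, value) pairs.
/// Simplification: declarations are split on ';' naively, so a value that
/// itself contains a semicolon (e.g. inside a quoted string) would be mangled.
fn parse_inline_style(style: &str) -> Vec<(String, String)> {
    style
        .split(';')
        .filter_map(|decl| {
            // Split at the first ':' only; skip declarations without one.
            let (prop, value) = decl.split_once(':')?;
            let (prop, value) = (prop.trim(), value.trim());
            if prop.is_empty() || value.is_empty() {
                return None;
            }
            // Property names are case-insensitive; values are kept as written.
            Some((prop.to_ascii_lowercase(), value.to_string()))
        })
        .collect()
}

fn main() {
    for (prop, value) in parse_inline_style("display: flex; Flex-Grow: 1; ;width:100px") {
        println!("{prop} = {value}");
    }
}
```

The resulting pairs would still need to be mapped onto whatever style representation the layout engine expects; resolving stylesheets and selectors is a separate, much larger job.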
-
Show HN: Slint - A Declarative UI Toolkit Written in Rust for Embedded & Desktop
While there are a lot of Rust UI frameworks, none of them are really recommended for production use yet. I suspect a few of them will die off and work will coalesce around a few once things mature a bit.
Another nice feature of the Rust UI ecosystem is that lots of it is being built in a modular way. For example I maintain a layout engine [0] library which just does layout and can be easily integrated by anybody creating a UI library. And there are a bunch of similar composable libraries covering rendering, text layout, accessibility, window creation, clipboard access, etc.
[0]: https://github.com/DioxusLabs/taffy
-
Conflict-Driven Synthesis for Layout Engines
You might be interested in the combination of Taffy [0] which handles box-level browser layout (block, flexbox, grid, etc) and Cosmic Text [1] which handles text-level layout and basic text editing functionality.
Integrating them into browsers while retaining accessibility could be tricky. But in general they're relatively small standalone libraries implementing most of the layout algorithms that browsers implement (although there are currently a few key missing features like laying out "inline-block" items in line with text).
[0]: https://github.com/DioxusLabs/taffy
[1]: https://github.com/pop-os/cosmic-text
-
Ink: React for interactive command-line apps
I maintain a library (https://github.com/DioxusLabs/taffy) that implements both Flexbox and CSS Grid, and is designed to be easily embedded (similar to Yoga, which Ink is using).
-
[Media] Version 0.3 of Inlyne - An interactive markdown renderer written entirely in Rust
https://github.com/DioxusLabs/taffy (disclaimer: I work on this crate) which does CSS layout given CSS styles. This would probably be much more useful once we merge support for display: block (https://github.com/DioxusLabs/taffy/pull/474), and if in the future we support display: table. Taffy doesn't handle text layout but is designed to integrate nicely with external layout systems.
-
Project idea: port markdownlint to Rust
Ok, "1.4GB" made me look into this more. I hadn't realised that we were using a "superlinter" action that includes linters for over 10 languages. Switching to a different GitHub action brought the time down to 3 seconds! https://github.com/DioxusLabs/taffy/pull/463
- GitHub Accelerator: our first cohort and what's next
What are some alternatives?
rust-htmlescape - A HTML entity encoding library for Rust
dioxus - Fullstack GUI library for web, desktop, mobile, and more.
serde - Serialization framework for Rust
stretch - High performance flexbox implementation written in rust
byteorder - Rust library for reading/writing numbers in big-endian and little-endian.
mirrord - Connect your local process and your cloud environment, and run local code in cloud conditions.
retrokit - :joystick: Bring back the old Web(Kit) and make it secure
pomsky - A new, portable, regular expression language
bincode - A binary encoder / decoder implementation in Rust.
yoga - Yoga is an embeddable layout engine targeting web standards.
tersenet - A new type of JavaScript-free light-weight fast browser built on Rust and WebAssembly. Does not actually exist.
pypandoc - Thin wrapper for "pandoc" (MIT)