hackernews
| | sanitizer-api | hackernews |
|---|---|---|
| Mentions | 5 | 13 |
| Stars | 220 | 605 |
| Growth | 3.6% | - |
| Activity | 6.5 | 0.0 |
| Last commit | 7 days ago | about 9 years ago |
| Language | Bikeshed | Arc |
| License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
sanitizer-api
-
Mastering DOM manipulation with vanilla JavaScript
That entire post is poor.
• “Using regular expressions”: it suggests that this approach is acceptable within its limits. It’s not at all. As a simple example, the expression shown is trivially bypassed by "…". This is why, unlike what the post claims, using regular expressions for cleaning HTML is not a common approach.
• (“Eliminating the script tags”: not sure quite why you’re against it, but I also want to grumble about using `[...scriptElements].forEach((s) => s.remove())` instead of `for (const s of scriptElements) { s.remove(); }` or even `Array.prototype.forEach.call(scriptElements, (s) => s.remove())`. Creating an array from that HTMLCollection is just unnecessary and a bad habit.)
• “Removing event handlers”: `value.startsWith('javascript:') || value.startsWith('data:text/html')` is inadequate. Tricks like capitalising in order to bypass such poor checks have been common for decades.
• “Retrieving the sanitized HTML”: you are now vulnerable to mXSS attacks, which undo all your effort.
• “Elements and attributes to remove from the DOM tree”: this proposes a blacklist approach and mentions a few examples of things that should be removed. Each example misses adjacent but equally-important things that should be removed. You will not get acceptable filtering if you start from this approach.
• “Simplifying HTML sanitization with external libraries”: this is pitched merely as easier, faster and cheaper, rather than as the only way to have any confidence in the result.
• “Conclusion”: as I hope I’ve shown, “The DOMParser API is one tool you can use to get the job done right.” is not an acceptable position.
Really, the article could be significantly improved by presenting it as what a common developer might think, and then scribbling all over the problematic things with these explanations of why they’re so bad, and ending with the conclusion “so: just use the DOMPurify library; consider nothing else acceptable”. (There have at times been a couple of other libraries of acceptable quality, but as far as I’m concerned, DOMPurify has long been the one that everyone should use. I note also that this article is talking about client-side filtration. I’m not familiar with the state of the art in server-side HTML sanitisation, where you probably don’t have an actual DOM; this is also a reasonable place to wish to do filtering, but the remaining active mXSS vectors might pose a challenge. I’d want to research carefully before doing anything.)
I look forward to the Sanitizer API <https://wicg.github.io/sanitizer-api/> being completed and deployed, so that DOMPurify can become just a fallback library for older browsers.
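To make the point about capitalisation tricks concrete, here is a minimal sketch, not a replacement for DOMPurify. It contrasts the article's `startsWith` check with an allowlist applied after the URL has been normalised by the standard `URL` parser. The names `naiveIsDangerous`, `isAllowedUrl` and `ALLOWED_PROTOCOLS`, the placeholder base URL, and the choice of allowed schemes are all assumptions made for illustration.

```javascript
// Hypothetical allowlist; adjust to whatever schemes your application needs.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// The check quoted from the article: case-sensitive and blocklist-based.
function naiveIsDangerous(value) {
  return value.startsWith('javascript:') || value.startsWith('data:text/html');
}

// Allowlist check: let the URL parser normalise the scheme (it lowercases it
// and strips leading whitespace), then accept only known-good protocols.
// The base URL is a placeholder so that relative URLs can be resolved.
function isAllowedUrl(value, base = 'https://example.invalid/') {
  let parsed;
  try {
    parsed = new URL(value, base);
  } catch {
    return false; // Unparseable input is rejected rather than guessed at.
  }
  return ALLOWED_PROTOCOLS.has(parsed.protocol);
}

// The naive check lets a capitalised scheme through...
console.log(naiveIsDangerous('JavaScript:alert(1)'));
// ...while the allowlist rejects it and still accepts ordinary links.
console.log(isAllowedUrl('JavaScript:alert(1)'));
console.log(isAllowedUrl('/docs/page'));
```

Even this only covers one attribute check; it does nothing about mXSS or the other problems listed above, which is why a maintained sanitizer remains the recommendation.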
-
5 injection vulnerabilities hackers don't want developers to know about (and how to prevent them)
The upcoming Sanitizer API - kinda like a native DOMPurify that provides el.setHTML() and Document.parseHTML()
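A minimal sketch of how a page might prefer `el.setHTML()` where it exists and fall back to a library elsewhere. The helper name `setSanitizedHTML` and the `fallbackSanitize` parameter are hypothetical; in a browser the fallback would typically be something like `DOMPurify.sanitize`, and exact Sanitizer API behaviour may differ between browser versions.

```javascript
// Hypothetical helper: use the native Sanitizer API when the element supports
// it, otherwise sanitize with a caller-supplied function before assigning.
function setSanitizedHTML(element, dirty, fallbackSanitize) {
  if (typeof element.setHTML === 'function') {
    // Native path: the browser parses and sanitizes in one step.
    element.setHTML(dirty);
  } else {
    // Fallback path for older browsers: the library returns a cleaned string.
    element.innerHTML = fallbackSanitize(dirty);
  }
}

// Usage in a browser, assuming DOMPurify has been loaded on the page:
//   setSanitizedHTML(document.querySelector('#comment'), userInput,
//                    (html) => DOMPurify.sanitize(html));
```

Passing the fallback in as a parameter keeps the helper free of a hard dependency, so the library can be loaded only for browsers that need it.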
-
Google, Mozilla Close to Finalizing Sanitizer API for Chrome and Firefox Browsers
The benefit of doing this client-side instead of server-side is that you can stay up to date with any changes that the client may make to how it's processing HTML that may have security implications. Additionally, you get to use the exact same code that the browser is ultimately using to parse the HTML, so a browser parsing bug, spec nuance, or un-specced legacy behavior that your backend developer didn't consider doesn't turn into a serious security flaw.
Additionally, the Sanitizer API does a much better job of handling contextual parsing than many other similar backend APIs. What happens when you parse an HTML fragment assuming it will live in a `div`, and then it actually gets inserted into a `table` cell? The spec goes into this in more detail here: https://wicg.github.io/sanitizer-api/#strings
The downsides, of course, are those associated with any thick-client/thin-server API design—more logic on the front-end means more logic to reimplement for different consumers.
Personally, I would probably still stick with Nokogiri for my own applications, but I can see both sides of the trade-off.
hackernews
-
Can anyone teach me how to make a forum like this one
this might help a little: https://github.com/wting/hackernews
-
Ask HN: How is it possible to shop on Walmart.com? Everything is out of stock
I think it's a ratio of votes to time. I think as little as 4 votes can get something on the homepage if they come in fairly quickly.
The source code for hn is available if you want to go and look up the specifics. I'm not sure if this is the most up-to-date mirror, but the site doesn't change that often: https://github.com/wting/hackernews
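As a rough illustration of "a ratio of votes to time", here is a sketch using the commonly cited approximation `(votes - 1) / (ageHours + 2)^gravity` with gravity 1.8. The formula, the constants, and the function name `rankScore` are assumptions for illustration; the mirrored source and the live site apply more adjustments than this.

```javascript
// Hypothetical score: more votes raise it, age lowers it.
// `gravity` controls how quickly older stories sink.
function rankScore(votes, ageHours, gravity = 1.8) {
  return (votes - 1) / Math.pow(ageHours + 2, gravity);
}

// A story with 4 quick votes versus a day-old story with 50 votes.
const fresh = rankScore(4, 0.5);
const dayOld = rankScore(50, 24);
console.log(fresh > dayOld ? 'fresh story ranks higher' : 'older story ranks higher');
```

Under these assumed constants, a handful of early votes can outweigh a much larger total on an older story, which matches the observation above.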
-
Why Lisp Syntax Works
Might not count as modern, but the original Reddit and HackerNews codebases:
- https://github.com/reddit-archive/reddit1.0
- https://github.com/wting/hackernews (actually news.arc, based on old hn)
-
Ask HN: Is there an open-source HN forum clone?
There's also this https://github.com/wting/hackernews -- which is a version of the source code to the site from sometime in the past.
-
Whoops: Linux's Strcmp() for the M68k Has Always Been Broken
"Otherwise" was the operative word in my (slightly sarcastic) example. :)
Avoiding all caps words means you sometimes have to go back and change "FAA" back from "Faa".
HN's software is no longer open source, but at one time, this is how it processed titles on initial submission: https://github.com/wting/hackernews/blob/master/news.arc#L15...
-
U.S. appeals court rejects big tech's right to regulate online speech
And at any rate, #1 on HN is not the product of any simple rule like "most upvotes per unit time with some decay function applied." There is significant judgment expressed in the way that stories are ranked. The source code as of 2012 was enough to demonstrate this, but in my understanding yet more judgment has been applied since then.
https://github.com/wting/hackernews/blob/master/news.arc
-
Ask HN: How does HN manage to be always online?
"ad-hoc filesystem based solution" is the closest of your definitions, I think. Last time I saw/heard, HN was built in Arc, a Lisp dialect, and use(s/d) a variant of this (mirrored) code: https://github.com/wting/hackernews
Check out around this area of the code to see how simple it is. All just plain files. A database, of sorts, but not in the way you might be expecting: https://github.com/wting/hackernews/blob/master/news.arc#L16...
There is a modern maintained variant at https://github.com/arclanguage/anarki/tree/master/apps/news as well.
File syncing between machines is pretty much an easily solved problem. I don't know how they do it, but it could be something like https://syncthing.net/ or even some scripting with `rsync`. Heck, a cronned `tar | gzip | scp` might even be enough for an app whose data isn't exactly mission critical.
- Ask HN: Why are you programming your hobby projects in a niche language?
- News.Y Combinator.com/S.gif
-
Ask HN: How is HN internally structured?
The old version in arc, mirrored at https://github.com/wting/hackernews/blob/5a3296417d23d1ecc90..., uses the file system as a database.
https://github.com/wting/hackernews/blob/5a3296417d23d1ecc90... shows the monotonically increasing number:
(def new-item-id ()
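The Arc snippet above is truncated, so the following is not a translation of it. It is a rough JavaScript (Node.js) sketch of the general idea described here: one plain file per item, named by a monotonically increasing number. The class name `FileItemStore`, its methods, and the JSON file format are all assumptions for illustration.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical store: each item is a JSON file whose name is its numeric id.
class FileItemStore {
  constructor(dir) {
    this.dir = dir;
    fs.mkdirSync(dir, { recursive: true });
    // Resume from the highest numeric filename already on disk.
    this.maxId = fs.readdirSync(dir)
      .map((name) => Number.parseInt(name, 10))
      .filter((id) => Number.isInteger(id))
      .reduce((max, id) => Math.max(max, id), 0);
  }

  // Bump the counter until it names a file that does not exist yet.
  newItemId() {
    do {
      this.maxId += 1;
    } while (fs.existsSync(path.join(this.dir, String(this.maxId))));
    return this.maxId;
  }

  // Write the item to its own file and return the stored record.
  save(item) {
    const id = this.newItemId();
    const record = { ...item, id };
    fs.writeFileSync(path.join(this.dir, String(id)), JSON.stringify(record));
    return record;
  }

  // Read an item back by id.
  load(id) {
    return JSON.parse(fs.readFileSync(path.join(this.dir, String(id)), 'utf8'));
  }
}
```

This sketch assumes a single process; it makes no attempt at locking, so two writers sharing a directory could still race.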
What are some alternatives?
uBlock-issues - This is the community-maintained issue tracker for uBlock Origin
Hacker News API - Documentation and Samples for the Official HN API
DOMPurify - DOMPurify - a DOM-only, super-fast, uber-tolerant XSS sanitizer for HTML, MathML and SVG. DOMPurify works with a secure default, but offers a lot of configurability and hooks. Demo:
anarki - Community-managed fork of the Arc dialect of Lisp; for commit privileges submit a pull request.
html-dom - Common tasks of managing HTML DOM with vanilla JavaScript. Give me 1 ⭐if it’s useful.
api - A RESTful API package for the Laravel and Lumen frameworks.
React - The library for web and native user interfaces.
nativefier - Make any web page a desktop application
hscrpt
ChessPositionRanking - Software suite for ranking chess positions and accurately estimating the number of legal chess positions
awesome-hacker-news - Awesome Hacker News: a collection of awesome Hacker News apps, libraries, resources and shiny things.