sansio-tld-parser
sansio-tld-parser | psl-problems
---|---
1 | 4
0 | 102
- | -
0.0 | 0.0
over 2 years ago | over 4 years ago
Python |
MIT License | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
sansio-tld-parser
- Public Suffix List
Small plug for a random Python tool I maintain that uses this.
Parsing domains is a pain in the ass. Without a canonical list and a parser, it can be impossible to know what is part of the TLD, what is a subdomain, and so on.
Here's a sansio domain / TLD splitter: https://github.com/theelous3/sansio-tld-parser
Use case: you want to block all edu domains, but suffixes like wa.edu.au exist, so you have to parse them out.
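To make the use case concrete, here is a minimal sketch of longest-suffix matching in plain Python. It does not use the sansio-tld-parser API; `TOY_SUFFIXES`, `split_domain`, and `is_edu` are hypothetical names, and the hard-coded suffix set stands in for the full Public Suffix List, which a real tool would load instead.

```python
# Toy suffix set for illustration only (an assumption, not the real list).
# A real parser would load the complete Public Suffix List.
TOY_SUFFIXES = {"com", "au", "edu", "edu.au", "wa.edu.au"}


def split_domain(hostname, suffixes=TOY_SUFFIXES):
    """Return (subdomain, domain, suffix) using the longest matching suffix."""
    labels = hostname.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            subdomain = ".".join(labels[:i - 1]) if i > 1 else ""
            domain = labels[i - 1] if i > 0 else ""
            return subdomain, domain, candidate
    # No match: fall back to treating the last label as the suffix.
    domain = labels[-2] if len(labels) > 1 else ""
    return ".".join(labels[:-2]), domain, labels[-1]


def is_edu(hostname):
    """True if any label of the matched suffix is 'edu'."""
    _, _, suffix = split_domain(hostname)
    return "edu" in suffix.split(".")


print(split_domain("www.school.wa.edu.au"))
print(is_edu("mit.edu"), is_edu("edu.example.com"))
```

A plain substring or "ends with .edu" check would miss `school.wa.edu.au` and could wrongly flag `edu.example.com`, which is why the split has to be driven by the suffix list.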
psl-problems
- See this page fetch itself, byte by byte, over TLS
Ryan Sleevi has written about this before on Hacker News and here's his list https://github.com/sleevi/psl-problems
It's quite possible that Ryan would consider using the PSL for HN a reasonable choice, because the use there is mostly cosmetic, but in general you should not add more dependencies on it.
- Public Suffix List Problems
- Public Suffix List
Before you begin to make use of the PSL, consider some of its problems: https://github.com/sleevi/psl-problems
FWIW, the link above successfully convinced me and a coworker not to use the PSL.
- W3C slaps down Google's proposal to treat multiple domains as same origin
(googler here, but this is my opinion)
I think there's a big abstraction gap between what we use domains for and what they were meant to be used for, to the point that we shouldn't assume anything about ownership based on the domain alone.
For instance, you can have a number of sites that use separate domains but are owned by the same entity (N domains for 1 party). You could also have the same base domain being used by several unrelated parties, think hosting a store on Shopify (1 domain for N parties). This is so ambiguous that even inside the browser there are two different implementations of this attribution, one for cookies and one for the Same-Origin Policy.
There's a good write-up about this problem at https://github.com/sleevi/psl-problems. Sometimes I wonder how the web got here with the amount of kludge that we have to carry.
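The two notions of "who owns this" can be sketched side by side. This is a rough illustration, not how any browser implements it: `origin` compares scheme, host, and port, while `site` groups hosts under suffix plus one label, so its answer depends entirely on the suffix set. The hostnames, the `myshopify.com` entry, and the function names are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Toy suffix set (assumption); a real check would use the full Public Suffix List.
TOY_SUFFIXES = {"com", "myshopify.com"}


def origin(url):
    """Origin-style identity: (scheme, host, port), with default ports filled in."""
    parts = urlsplit(url)
    default_port = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, parts.port or default_port)


def site(url, suffixes=TOY_SUFFIXES):
    """Cookie-style identity: the longest matching suffix plus one more label."""
    labels = urlsplit(url).hostname.split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            return ".".join(labels[max(i - 1, 0):])
    # No match: fall back to the last two labels.
    return ".".join(labels[-2:])


a, b = "https://www.example.com/", "https://api.example.com/"
print(origin(a) == origin(b), site(a) == site(b))

s1, s2 = "https://shop-a.myshopify.com/", "https://shop-b.myshopify.com/"
print(site(s1) == site(s2), site(s1, {"com"}) == site(s2, {"com"}))
```

With the shared-hosting suffix in the set, the two hypothetical shops are separate sites; drop that entry and they collapse into one, which is the kind of list-dependent behavior the write-up criticizes.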
What are some alternatives?
list - The Public Suffix List
first-party-sets
standards-positions
fenced-frame - Proposal for a strong boundary between a page and its embedded content
subtls - A proof-of-concept TypeScript TLS 1.3 client