I'm sure this must have been asked before, but I can't find the answer anywhere.
Here on HN, every submitted link has a site that it belongs to, which is shown next to the title, in parentheses. Usually it's just the domain name, but sometimes it includes part of the path. For example, for a submission like https://github.com/rails/rails/pull/12345, the site would be github.com/rails.
How is this implemented? Is there a list somewhere of the sites to treat differently (like github.com or wordpress.com), and if so, is that list publicly available? If not, is there a similar list that someone maintains that I could use for a side project?
There's the Public Suffix List (https://publicsuffix.org/), but it only deals with domain names, so your github.com/rails example isn't covered by it. I'm pretty sure HN simply has a manually coded list of URL patterns for popular domains.
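For a side project, a rough sketch of that kind of hand-coded approach might look like the following in Python. To be clear, this is a guess at the general idea, not HN's actual code: the function name `site_label`, the `PATH_SEGMENT_HOSTS` table, and the hosts listed in it are all made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical table: hosts where the leading path segment(s) identify the
# owner of the content. The entries are illustrative assumptions, not HN's list.
PATH_SEGMENT_HOSTS = {
    "github.com": 1,
    "gitlab.com": 1,
}


def site_label(url: str) -> str:
    """Return a short site label for a URL: the host, plus leading path
    segments for hosts listed in PATH_SEGMENT_HOSTS."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Drop a leading "www." so www.example.com and example.com match.
    if host.startswith("www."):
        host = host[4:]

    # Number of path segments to keep for this host (0 for unlisted hosts).
    depth = PATH_SEGMENT_HOSTS.get(host, 0)
    segments = [s for s in parts.path.split("/") if s]

    if depth and len(segments) >= depth:
        return "/".join([host] + segments[:depth])
    return host


print(site_label("https://github.com/rails/rails/pull/12345"))
print(site_label("https://www.example.com/some/article"))
```

With this table, the first call gives github.com/rails and the second gives example.com. Sites that put the user in a subdomain rather than the path would need a different rule, which is where something like the Public Suffix List could help.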