-
URLExtract
URLExtract is a Python class for collecting (extracting) URLs from a given text by locating TLDs.
You probably know that you can get the first half of that at
https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
(that is, you can see the most-upvoted comments, but you don't see the count of upvotes)
That'd explain some of the holes mentioned in these comments. I think you just want to match any "word" containing ".[valid TLD]" and then exclude invalid URLs (an "@" in the first part indicating an email address, etc.).
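A rough sketch of that idea in Python, to make it concrete. This is not how URLExtract or any other library does it; the names `VALID_TLDS`, `looks_like_url` and `find_url_candidates` are made up for illustration, and the TLD set is a tiny hand-picked sample (a real one would presumably be loaded from the IANA list).

```python
import re

# Assumption: a tiny sample of TLDs for illustration only.
VALID_TLDS = {"com", "org", "net", "io", "dev"}

# A "word" is any run of non-whitespace characters.
WORD_RE = re.compile(r"\S+")


def looks_like_url(word):
    """Return True if the word's host part ends in a known TLD."""
    # Drop an optional scheme, then isolate the "first part" (the host).
    rest = word.split("://", 1)[-1]
    host = re.split(r"[/?#]", rest, maxsplit=1)[0]
    if "@" in host:
        return False  # "@" in the first part: most likely an email address
    host = host.split(":", 1)[0]  # ignore a port number, if any
    labels = host.lower().split(".")
    if len(labels) < 2 or not all(labels):
        return False  # no dot at all, or an empty label such as "a..b"
    return labels[-1] in VALID_TLDS


def find_url_candidates(text):
    """Collect words that contain '.<valid TLD>' and are not emails."""
    found = []
    for word in WORD_RE.findall(text):
        # Trim punctuation and brackets that commonly wrap URLs in prose.
        candidate = word.strip("()[]<>\"'.,;!?")
        if looks_like_url(candidate):
            found.append(candidate)
    return found


sample = (
    "See example.com/docs, mail bob@example.org or visit "
    "https://news.ycombinator.com. Version 1.2.3 and file.txt are not URLs."
)
print(find_url_candidates(sample))
```

On the sample text this keeps `example.com/docs` and `https://news.ycombinator.com` and drops the email address, `1.2.3` and `file.txt`. It will still miss plenty of edge cases (URLs with userinfo, internationalized domains, trailing punctuation that is actually part of the URL), so treat it as a starting point only.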
I've been using this[0] Python library which seemed good enough for my needs in some scraping project.
0: https://github.com/lipoja/URLExtract