RSS-Link-Database-2023
Django-link-archive
| | RSS-Link-Database-2023 | Django-link-archive |
---|---|---|
Mentions | 6 | 13 |
Stars | 2 | 14 |
Growth | - | - |
Activity | 9.4 | 9.6 |
Last commit | 5 months ago | about 21 hours ago |
Language | HTML | Python |
License | GNU General Public License v3.0 only | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
RSS-Link-Database-2023
- Show HN: Link metadata. Complete year 2023
-
Ask HN: What apps have you created for your own use?
[4] https://github.com/rumca-js/Django-link-archive
These are then exported to GitHub repositories:
[5] https://github.com/rumca-js/RSS-Link-Database - bookmarks
[6] https://github.com/rumca-js/RSS-Link-Database-2023 - news headlines from 2023
[7] https://github.com/rumca-js/Internet-Places-Database - all domains and RSS feeds known to me
-
The Small Website Discoverability Crisis
My own repositories:
- bookmarked entries https://github.com/rumca-js/RSS-Link-Database
- mostly domains https://github.com/rumca-js/Internet-Places-Database
- all 'news' from 2023 https://github.com/rumca-js/RSS-Link-Database-2023
I am using my own Django program to capture and manage links https://github.com/rumca-js/Django-link-archive.
-
What gets to the front page of Hacker News?
Hi, I am collecting links from various places, even from Hacker News. I have links going back to the start of the year [1]. Maybe someone will find them useful. Look at the files with names like [2].
[1] https://github.com/rumca-js/RSS-Link-Database-2023
[2] https.hnrss.orgfrontpage_entries.json
-
Google No Longer Automatically Indexes Websites – WTF?
That is why I wrote [1] for myself. It stores links in a database, which I can query. Everything is later exported, as in [2] and [3]. I can browse the history and find useful data. I am not saying it has replaced Google for me. It is a nice addition that helps me gather data I encounter on the Internet.
It is a link database. At first glance it resembles a Reddit clone, but my focus is on creating a link database, not on providing a cancerous social media experience.
Links:
[1] https://github.com/rumca-js/Django-link-archive
[2] https://github.com/rumca-js/RSS-Link-Database
[3] https://github.com/rumca-js/RSS-Link-Database-2023
-
Link Archive – 03.2023 Update
- https://github.com/rumca-js/RSS-Link-Database-2023 - all captured links in 2023
Django-link-archive
-
Google fights Invidious (a privacy YouTube Front end)
I am running my own YouTube front end.
Link: https://github.com/rumca-js/Django-link-archive
Demo: https://renegat0x0.ddns.net/apps/catalog
It allows me to add channels, download individual videos, bookmark videos, etc. It uses an iframe to display the video.
I have no problems with viewing videos:
- I use Firefox with an ad blocker
- videos are embedded using the no-referrer-when-downgrade referrer policy
I have literally no ads.
There are some drawbacks:
- YouTube may change its policies or its embedding strategies
- To have control over videos, you still have to download them manually
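The iframe embedding described above can be sketched roughly as follows. This is a minimal illustration, not code taken from Django-link-archive: the helper name `youtube_iframe` and the default width and height are assumptions, while the `https://www.youtube.com/embed/<id>` URL form and the `referrerpolicy` iframe attribute are standard.

```python
from urllib.parse import quote


def youtube_iframe(video_id: str, width: int = 560, height: int = 315) -> str:
    """Return an <iframe> tag that embeds one YouTube video.

    Hypothetical helper: the name and the default size are assumptions.
    """
    # Percent-encode the ID so it cannot break out of the src attribute.
    src = f"https://www.youtube.com/embed/{quote(video_id, safe='')}"
    return (
        f'<iframe width="{width}" height="{height}" src="{src}" '
        'referrerpolicy="no-referrer-when-downgrade" '
        "allowfullscreen></iframe>"
    )


if __name__ == "__main__":
    # "VIDEO_ID" is a placeholder, not a real video.
    print(youtube_iframe("VIDEO_ID"))
```

In a Django template the same tag could be written directly in HTML; building it in Python only makes the policy attribute easy to see and test.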
-
Show HN: Free Plain-Text Bookmarking
I wrote a bookmark manager in Django.
https://github.com/rumca-js/Django-link-archive
You can self-host it.
You can add RSS sources and automatically import new links on a regular basis.
It may not be state of the art, but it gets the job done.
The demo is below, but it may not be working when you look at it. It runs on a Raspberry Pi.
https://renegat0x0.ddns.net/apps/catalog/entry/11503/
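The idea of importing new links from an RSS source, mentioned above, can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's implementation: the function name `new_links`, the in-memory `known_links` set (a real application would use its database), and the sample feed are all hypothetical, and only plain RSS 2.0 `<item>` elements are handled (Atom feeds use namespaced `<entry>` elements and would need extra handling).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed used only for demonstration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


def new_links(rss_xml: str, known_links: set) -> list:
    """Return items whose link has not been seen before and remember them."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        title = (item.findtext("title") or "").strip()
        if link and link not in known_links:
            known_links.add(link)
            fresh.append({"title": title, "link": link})
    return fresh


if __name__ == "__main__":
    seen = set()
    print(len(new_links(SAMPLE_RSS, seen)))  # first import of the sample feed
    print(len(new_links(SAMPLE_RSS, seen)))  # same feed again, nothing new
```

Running the import on a schedule (cron, a task queue, or similar) would then only store entries that were not seen in an earlier run.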
-
A search engine in 80 lines of Python
I have dabbled a little in this subject myself. Some of my notes:
- some RSS feeds are protected by Cloudflare. It is true, however, that working around it is not necessary for the majority of blogs. If you want to do more, Selenium would be a way to handle Cloudflare-protected links
- sometimes even headless Selenium is not enough, and a full-blown browser driven by Selenium is necessary to fool its protection
- sometimes even that is not enough
- then I started to wonder why some RSS feeds are so well protected by Cloudflare, but who am I to judge?
- sometimes it is beneficial to disguise the user agent. I feel bad for setting my user agent to Chrome, but again, why are RSS feeds so well protected?
- you cannot parse or read the entire Internet, so you always need to think about compromises. For example, in one of my projects I have narrowed my searches to domains only. Now I can find most of the common domains, and I sort them by their "importance"
- RSS links do change. There need to be automated means of disabling some feeds, to avoid checking inactive domains
- I do not see any configurable timeout for reading a page, but I am not familiar with aiohttp. Some pages might waste your time
- I hate that some RSS feeds are not configured properly. Some sites do not provide a valid "link" element with the type "application/rss+xml". Some RSS feeds have naive titles like "Home", or no title at all. Such a waste of opportunity
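The last note refers to feed autodiscovery: a page advertises its feed with a `<link rel="alternate" type="application/rss+xml" href="...">` element in its head. A minimal sketch of reading that element with the standard library follows. The class and function names are hypothetical, and accepting `application/atom+xml` as well is an assumption added for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Feed MIME types to accept; including Atom here is an assumption.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect (absolute_url, title) pairs for feeds advertised by a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").lower()
        href = attr.get("href", "")
        if "alternate" in rel and mime in FEED_TYPES and href:
            # Resolve relative hrefs against the page URL.
            self.feeds.append((urljoin(self.base_url, href), attr.get("title", "")))


def find_feeds(html: str, base_url: str) -> list:
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


if __name__ == "__main__":
    page = (
        '<html><head><link rel="alternate" type="application/rss+xml" '
        'title="Home" href="/feed.xml"></head><body></body></html>'
    )
    print(find_feeds(page, "https://example.com/blog/"))
```

An empty result for a page that clearly has a feed is exactly the misconfiguration the note complains about, and a title such as "Home" shows up here unchanged.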
My RSS feed parser, link archiver, and web crawler: https://github.com/rumca-js/Django-link-archive. The file rsshistory/webtools.py could be especially interesting. It is not advanced programming craft, but it gets the job done.
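The note above about automatically disabling feeds that stop working could look roughly like the sketch below. It is not taken from rsshistory/webtools.py: the `FeedStatus` class, the `MAX_FAILURES` threshold of 3, and the "consecutive failures" rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold: disable a feed after this many failures in a row.
MAX_FAILURES = 3


@dataclass
class FeedStatus:
    url: str
    consecutive_failures: int = 0
    enabled: bool = True

    def record(self, success: bool) -> None:
        """Update the status after one fetch attempt."""
        if success:
            # Any successful fetch resets the failure streak.
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= MAX_FAILURES:
            self.enabled = False


if __name__ == "__main__":
    feed = FeedStatus("https://example.com/feed.xml")
    for outcome in (False, False, True, False, False, False):
        feed.record(outcome)
    print(feed.enabled, feed.consecutive_failures)
```

A crawler would skip feeds whose `enabled` flag is false, so inactive domains stop consuming fetch time until someone re-enables them.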
Additionally, in another project I have collected around 2378 personal sites. I collect domains in https://github.com/rumca-js/Internet-Places-Database/tree/ma... . The files are JSON. All personal sites have the tag "personal".
Most of these were collected from:
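Selecting the entries tagged "personal" from such a JSON file might look like the sketch below. The layout of the real files is not described here, so the schema is an assumption: a top-level list of objects, each with a "link" string and a "tags" list. The function name and the sample data are hypothetical as well.

```python
import json

# Hypothetical sample; the real export files may be structured differently.
SAMPLE_JSON = """[
  {"link": "https://example.com", "tags": ["personal", "blog"]},
  {"link": "https://example.org", "tags": ["news"]},
  {"link": "https://example.net"}
]"""


def sites_with_tag(entries: list, tag: str = "personal") -> list:
    """Return the links of all entries that carry the given tag."""
    return [e["link"] for e in entries if tag in e.get("tags", [])]


if __name__ == "__main__":
    entries = json.loads(SAMPLE_JSON)
    print(sites_with_tag(entries))
```

For a file on disk, `json.load(open(path, encoding="utf-8"))` would replace the `json.loads` call, with the field names adjusted to whatever the files actually contain.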
https://nownownow.com/
https://searchmysite.net/
I also wanted to process domains from https://downloads.marginalia.nu/, but I haven't had time to study the structure of the files
-
Is YouTube starting to protect channel RSS feeds?
Disclaimer: I have an automated RSS reader enabled in my network: https://github.com/rumca-js/Django-link-archive
-
Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search [pdf]
On the other hand, it is not 1995. Time has moved on. I wrote a simple RSS feed reader that also serves as a search engine for bookmarks.
I am able to run it in my attic on a Raspberry Pi. We do not have to rely so heavily on Google.
https://github.com/rumca-js/Django-link-archive
It is true that it does not serve me as a Google or Kagi replacement. It is a very nice addition, though.
With a little bit of determination I do not have to be so dependent on Google.
Here is also a dump of known domains. Some are personal.
https://github.com/rumca-js/Internet-Places-Database
...and my bookmarks
https://github.com/rumca-js/RSS-Link-Database
A few more years, and Google can go to hell.
-
Ask HN: What apps have you created for your own use?
[4] https://github.com/rumca-js/Django-link-archive
These are then exported to GitHub repositories:
[5] https://github.com/rumca-js/RSS-Link-Database - bookmarks
[6] https://github.com/rumca-js/RSS-Link-Database-2023 - news headlines from 2023
[7] https://github.com/rumca-js/Internet-Places-Database - all domains and RSS feeds known to me
-
The Small Website Discoverability Crisis
My own repositories:
- bookmarked entries https://github.com/rumca-js/RSS-Link-Database
- mostly domains https://github.com/rumca-js/Internet-Places-Database
- all 'news' from 2023 https://github.com/rumca-js/RSS-Link-Database-2023
I am using my own Django program to capture and manage links https://github.com/rumca-js/Django-link-archive.
-
Homebrew Website Club
A list of blogs mentioned on Hacker News; some were added manually by me:
https://github.com/rumca-js/Django-link-archive/blob/main/an...
-
Ask HN: Tell us about your project that's not done yet but you want feedback on
I have a project. I posted it once here on HN. I did not receive any feedback then, and it has not gained much traction.
It is a link aggregator. It can be used as an RSS client, or as a YouTube front end for subscriptions.
It is intended for light, personal use, so it is not very scalable, but it supports user management.
https://github.com/rumca-js/Django-link-archive
-
Self hosted YouTube media server – Tube Archivist
Ha, I have also written something similar
https://github.com/rumca-js/Django-link-archive
I support not only YouTube, but also any RSS source.
It functions as link aggregation software. I can also fetch metadata for all videos in a channel, and download video and audio.
I am using the standard Django auth module.
It still lacks polish, and it is under development. I am not a web developer, so I am still struggling with the overall architecture
What are some alternatives?
Django-rss-feed - Link archive for a NAS drive [Moved to: https://github.com/rumca-js/Django-link-archive]
hoyolab-rss-feeds - RSS feed (JSON & Atom) generator for Genshin Impact's official Hoyolab news feed
full-text-tabs-forever - Full text search all your browsing history
org-clive
spotprice - Quickly get AWS spot instance pricing
catwiki_p3 - CatWiki (using Python 3)
RSS-Link-Database - Bookmarked archived links
youtube-cue - Generate CUE sheet from timestamps in youtube video description
webring - Make yourself a website
ytdl-pvr - A script/Docker image to continuously archive YouTube videos using ytdlp.