| | tidy-html5 | markdownload |
|---|---|---|
| Mentions | 9 | 36 |
| Stars | 2,663 | 2,483 |
| Growth | 0.2% | - |
| Activity | 0.0 | 5.2 |
| Last commit | 8 days ago | 30 days ago |
| Language | C | JavaScript |
| License | - | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tidy-html5
- Show HN: I made a tool to clean and convert any webpage to Markdown
- Localize HTML Tidy (README.md)
- libtidy, compilation errors
  So I included the tidy libraries in my project.
- Searching for the *old* W3C XHTML/CSS validator or something of equivalent functionality
  Maybe look into HTML Tidy. Its job is to clean up HTML and convert legacy code to modern form, so it knows about DTDs. You might be able to pass it some options to get what you want.
- Converting a IETM delivered in HTML to XML S1000D 4.0.
  I've always used tidy for HTML/XML formatting jobs.
- Expand one very long HTML line (>30k characters) as multi-line formatted indented HTML?
  Personally, I use a command that switches the file type to html and then formats the buffer with tidy. It assumes you're pasting into a new buffer.
- Unminify HTML in terminal
  I use tidy.
- Inspecting the Clipboard (on Linux)
  So I installed HTML tidy.
- The most underused browser feature
  Prune instructs the parser to remove any elements within the extracted article block that look superfluous. This can result in false positives, so we tend to disable it when we've gone to the trouble of creating site-specific extraction rules.
  Tidy determines if the source HTML should be cleaned up first with HTML Tidy - https://github.com/htacg/tidy-html5. If you're parsing the source HTML with an HTML 5 parser, as we are now, it shouldn't be necessary any more (I think we actually ignore it now). We used it more before when we relied on libxml parsing, which often trips up on modern HTML.
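
Several of the mentions above use tidy to re-indent minified HTML from a terminal or editor. A minimal Python sketch of that workflow follows, assuming the `tidy` executable from tidy-html5 is installed and on `PATH`. The helper names `build_tidy_command` and `unminify` are hypothetical, and the option set is just one reasonable choice, not the commands the commenters actually used.

```python
import shutil
import subprocess


def build_tidy_command(wrap: int = 0) -> list[str]:
    """Build an argv list that asks tidy to re-indent HTML read from stdin."""
    return [
        "tidy",
        "-quiet",             # suppress non-essential output
        "-indent",            # indent element content
        "-wrap", str(wrap),   # 0 disables line wrapping
        "--tidy-mark", "no",  # do not add a generator <meta> tag
    ]


def unminify(html: str) -> str:
    """Pipe html through the tidy CLI and return the formatted markup."""
    if shutil.which("tidy") is None:
        raise RuntimeError("the tidy executable was not found on PATH")
    result = subprocess.run(
        build_tidy_command(),
        input=html,
        capture_output=True,
        text=True,
    )
    # tidy exits with 0 on success, 1 on warnings, and 2 on errors.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout
```

Calling `unminify(open("page.html").read())` would return the indented document as a string. Warnings (exit status 1) are tolerated here because tidy reports them for most real-world pages while still producing output.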
markdownload
- 2markdown – Transform Websites into Markdown
  I'll stick with using Markdownload (https://github.com/deathau/markdownload) when I need to do something like this.
- Show HN: I made a tool to clean and convert any webpage to Markdown
  This fork: https://github.com/deathau/markdownload
  With extension available for Firefox, Google Chrome, Microsoft Edge and Safari.
- Show HN: Zenfetch – Turn your saved browsing content into an AI second brain
- A structured note-taking app for personal use
  > Not really. Obsidian has its share of problems too, and most of them originate from using Markdown.
  Aha. Which problems do you mean?
  > Markdown is a freeform text format, and works very well for writing text, but it really sucks for data and structured content.
  Joplin uses Markdown too. And if Joplin does a good job on "data" and "structured content" (whatever you mean by that) by separating that out in its DB, it's a big NO for me, since it's a closed silo.
  This: https://github.com/blacksmithgu/obsidian-dataview works wonderfully for me, and it never breaks anything in my simple Markdown files.
  > Most plugins and features in that area are very brittle and overspecialized, working only well enough in their specific use case.
  Aha. I don't think so. Which authority says that? And even if it's like that, my Markdown files would survive everything, since they are a) in git (https://github.com/denolehov/obsidian-git) and b) easy to fix, since they are text files. Gosh!
  > And gosh, Obsidian has really a huge amount of plugins for data-handling.
  And gosh, this is a good thing!
  > At some point, it was so bad that there were multiple competing task-plugins which broke each other just because they had different formatting for dates.
  Installing multiple task plugins shows that something is "broke" on the user side. It's not the fault of Markdown or Obsidian.
  Just have a look at https://github.com/ivan-lednev/obsidian-day-planner, but you don't need a fancy task plugin like this if you know your way around https://github.com/blacksmithgu/obsidian-dataview or https://github.com/obsidian-tasks-group/obsidian-tasks
  Thanks to the ecosystem around Obsidian and pure Markdown, I stay in my browser (https://github.com/deathau/markdownload) and nvim (https://github.com/epwalsh/obsidian.nvim) most of the time.
- What are your second brain apps like Obsidian?
  markdownload (Firefox) - I can use it to download entire webpages as Markdown - https://github.com/deathau/markdownload - sometimes it's just easier to snip out a thing I want to keep or reference.
- Ask HN: What are some unpopular technologies you wish people knew more about?
- Grimoire: Open-Source bookmark manager with extra features
  My perfect bookmark manager is Markdownload https://github.com/deathau/markdownload
  Just save the complete page, only the selected text, or only the link to a Markdown file or to Obsidian, with pictures downloaded, linked, or omitted. My OS and Obsidian can search those files, and they have more (automatically added) metadata.
  I can even edit them in the browser: add my thoughts and tags, or change the name of the file, before they are saved.
  I can (automatically) do with them whatever I need. They can be used to (automatically) generate an always up-to-date start page or a data vault on GitHub.
  My local AI assistant can parse them.
  Local, versatile, permanent, flexible, cost-effective, future-safe. No need for a bookmark manager.
- Copy webpage text, convert to Markdown
- Ask HN: Should we be saving our favorite information locally?
  Yes and no.
  Instead of PDF, use Markdownload (on iOS, use a Safari extension that converts web content to a Markdown file): https://github.com/deathau/markdownload
  And save in a journaled folder like "YYYY-MM-DD - Page Title.md" with a YAML frontmatter of all available metadata.
  Have this as a folder in your PKM of choice (Obsidian, Foam, whatever).
  These days, point some text embedding at it, and let it generate your own LLM brain.
  But you can also static-site-generate that back into your own web knowledge site or base.
- The impacts of the new regulation allowing AFPs to invest in active ETFs
  How I extract text: MarkDownload on PC and markdownr on Android.
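
One of the mentions above suggests saving clipped pages as "YYYY-MM-DD - Page Title.md" with YAML frontmatter. A rough Python sketch of that naming and frontmatter scheme is shown below. The function names and the frontmatter keys (`title`, `source`, `saved`) are assumptions chosen for illustration; they are not MarkDownload's own template or API.

```python
import json
import re
from datetime import date


def journal_filename(title: str, day: date) -> str:
    """Return a "YYYY-MM-DD - Page Title.md" style file name."""
    # Replace characters that are invalid in file names on common systems.
    safe = re.sub(r'[\\/:*?"<>|]+', " ", title)
    safe = re.sub(r"\s+", " ", safe).strip()
    return f"{day.isoformat()} - {safe}.md"


def render_note(title: str, url: str, body: str, day: date) -> str:
    """Prefix a Markdown body with a small YAML frontmatter block."""
    # Hypothetical metadata keys; a real clipper may record many more.
    meta = {"title": title, "source": url, "saved": day.isoformat()}
    # json.dumps yields double-quoted strings, which YAML also accepts.
    lines = [f"{key}: {json.dumps(value)}" for key, value in meta.items()]
    return "---\n" + "\n".join(lines) + "\n---\n\n" + body
```

Writing `render_note(...)` to a file named by `journal_filename(...)` inside a PKM folder gives date-sorted notes whose metadata stays searchable by plain-text tools.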
What are some alternatives?
parser - 📜 Extract meaningful content from the chaos of a web page
logseq - A local-first, non-linear, outliner notebook for organizing and sharing your personal knowledge base. Use it to organize your todo list, to write your journals, or to record your unique life.
readability.php - PHP port of Mozilla's Readability.js
obsidian-clipper - A Chrome extension that easily clips selections to Obsidian
readability - Readability is a library written in Go (golang) to parse, analyze and convert HTML pages into readable content. Originally an Arc90 Experiment, it is now incorporated into Safari’s Reader View.
nulis - Mind-mapping software that helps writers collect and organize their knowledge, develop their ideas. Built with React, Redux, Node.js, hosted on Digital Ocean.
toltec - Community-maintained repository of free software for the reMarkable tablet.
obsidian-mind-map - An Obsidian plugin for displaying markdown notes as mind maps using Markmap.
SponsorBlock - Skip YouTube video sponsors (browser extension)
vscode-memo - Markdown knowledge base with bidirectional [[link]]s built on top of VSCode [Moved to: https://github.com/svsool/memo]
ftr-site-config - Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications.
Templater - A template plugin for obsidian