Top 23 Pandoc Open-Source Projects

pandoc

420 32,396 9.8 Haskell

Universal markup converter

Project mention: Beautifying Org Mode in Emacs (2018) | news.ycombinator.com | 2024-04-15

My main authoring tool is then Emacs Markdown Mode (https://jblevins.org/projects/markdown-mode/). For data entry, it comes with some bells and whistles similar to org-mode, like C-c C-l for inserting links etc.
I seldom export my notes for external usage, but when I do, I use lowdown (https://kristaps.bsd.lv/lowdown/), which also comes with some nice output targets (among the more unusual are Groff and Terminal). Of course pandoc (https://pandoc.org/) does a very good job here, too.

Zettlr

116 9,597 9.9 TypeScript

Your One-Stop Publication Workbench

Project mention: Obsidian 1.5 Desktop (Public) | news.ycombinator.com | 2023-12-26

nb

48 6,294 9.3 Shell

CLI and local web plain text note‑taking, bookmarking, and archiving with linking, tagging, filtering, search, Git versioning & syncing, Pandoc conversion, + more, in a single portable script.

Project mention: Nb – note taking and archiving on the command line | news.ycombinator.com | 2024-02-03

pandoc-latex-template

14 5,801 5.6 TeX

A pandoc LaTeX template to convert markdown files to PDF or LaTeX.

Project mention: A pandoc LaTeX template to convert Markdown files to PDF or LaTeX | news.ycombinator.com | 2024-02-22

markdown-preview-enhanced

5 4,058 4.9 HTML

One of the 'BEST' markdown preview extensions for Atom editor!

OSCP-Exam-Report-Template-Markdown

21 3,289 4.6 Ruby

:orange_book: Markdown Templates for Offensive Security OSCP, OSWE, OSCE, OSEE, OSWP exam report

Project mention: Exam Complete -- Got enough points but am worried about the report. | /r/oscp | 2023-07-05

Thank you! Yes, I used https://github.com/noraj/OSCP-Exam-Report-Template-Markdown and included vulnerability details, as well as how to fix the vulnerability and it got lengthy which I think was unnecessary, but I tried to make it nice and be thorough. I probably should've put more time in trying to fix the other issues I had but oh well.

rmarkdown

38 2,802 7.6 R

Dynamic Documents for R

Project mention: Pandoc | news.ycombinator.com | 2024-01-28

I'm surprised to see no one has pointed out [RMarkdown + RStudio](https://rmarkdown.rstudio.com) as one way to immediately interface with Pandoc.
I used to write papers and slides in LaTeX (using vim, because who needs render previews), then eventually switched to Pandoc (also vim). I eventually discovered RMarkdown+RStudio. I was looking for a nice way to format a simple table and discovered that rmarkdown had nice extensions of basic markdown (this was many years ago so maybe that is incorporated into vanilla markdown/pandoc).
The RMarkdown page claims:
> R Markdown supports dozens of static and dynamic output formats including HTML, PDF, MS Word, Beamer, HTML5 slides, Tufte-style handouts, books, dashboards, shiny applications, scientific articles, websites, and more.
...which I think is largely due to using pandoc as the core generator.
RStudio shows you the pandoc command it runs to generate your document, which I've used to figure out the pandoc command I want to run when I've switched to using pandoc directly.
This is a bit of a "lazy" way to interact with pandoc. Maybe the "laziest" aspect: when I get a new computer, I can install the entire stack by installing RStudio, then opening a new rmarkdown document. RStudio asks whether I'd like to install all the necessary libraries -- click "yes" and that's it. Maybe that sounds silly but it used to be a lot of work to manage your LaTeX install. These days I greatly favor things that save me time, which seems to get more precious every year.

patat

9 2,323 8.0 Haskell

Terminal-based presentations using Pandoc

Project mention: patat: Terminal-based presentations using Pandoc | /r/commandline | 2023-10-25

djot

43 1,576 5.8 HTML

A light markup language

Project mention: LaTeX and Neovim for technical note-taking | news.ycombinator.com | 2024-02-21

I know this doesn't solve your problem directly, but I recommend trying out Djot[0], a markup language from the author of CommonMark.
Djot has a single well-defined spec, and most of the basic formatting has the same syntax as (a) Markdown, so switching is pretty painless. One of its main goals is to be legible and visually pleasing as-is, just like Markdown.
What Djot adds is its _predictability_. Nested formatting, precedence order, line-break behavior, nested blocks, mixed inline and block formatting, and custom attributes are all laid out precisely in the spec in a thought-out manner. To this day I still can't remember how to put a line break within a list item in Markdown (and I'm sure there is more than one way).
[0]: https://djot.net/

phd_thesis_markdown

3 1,186 5.5 HTML

Template for writing a PhD thesis in Markdown

Project mention: ArXiv now offers papers in HTML format | news.ycombinator.com | 2023-12-21

This is the reason I've never liked LaTeX from a data point of view. It's made to be printed out or to look beautiful in a PDF, but it was never designed to get you to an HTML file or a Word file.
I've written my thesis in Markdown in the past because of this (best for humans) which can be easily transformed to HTML, Word, PDF and even LaTeX https://github.com/tompollard/phd_thesis_markdown
And I think that XML is the best format for machines.

vim-pandoc

11 940 3.1 Vim Script

pandoc integration and utilities for vim

Project mention: Would you honestly recommend someone learning neovim as they begin their coding journey? Or would you suggest some other kind of IDE first? | /r/neovim | 2023-05-12

With that, the only thing left to do was to make it as convenient as possible to export an MLA-formatted PDF from inside Neovim, so I wrote a custom function using the vim-pandoc plugin as a wrapper to make the command simpler.

pandoc-crossref

3 886 7.9 Haskell

Pandoc filter for cross-references

Project mention: Is there a way to use pandoc-crossref for foonotes? | /r/linuxquestions | 2023-05-18

I was going through this link but couldn't find anything for footnotes.

asynctasks.vim

25 876 7.4 Vim Script

:rocket: Modern Task System for Project Building, Testing and Deploying !!

Project mention: Build and run in one task using asynctasks.vim | /r/neovim | 2023-07-05

I'm currently using skywind3000/asynctasks.vim to build and run my project.

Marker

7 822 5.1 JavaScript

🖊 A gtk3 markdown editor (by fabiocolacio)

pypandoc

5 799 6.6 Python

Thin wrapper for "pandoc" (MIT)

Project mention: Web Scraping in Python – The Complete Guide | news.ycombinator.com | 2024-02-20

I recently used [0] Playwright for Python and [1] pypandoc to build a scraper that fetches a webpage and turns the content into sane markdown so that it can be passed into an AI coding chat [2].
They are both very gentle dependencies to add to a project. Both packages contain built-in or scriptable methods to install their underlying platform-specific binary dependencies. This means you don't need to ask end users to use some complex, platform-specific package manager to install playwright and pandoc.
Playwright lets you scrape pages that rely on JS. Pandoc is great at turning HTML into sensible markdown. Below is an excerpt of the openai pricing docs [3] that have been scraped to markdown [4] in this manner.
[0] https://playwright.dev/python/docs/intro
[1] https://github.com/JessicaTegner/pypandoc
[2] https://github.com/paul-gauthier/aider
[3] https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turb...
[4] https://gist.githubusercontent.com/paul-gauthier/95a1434a28d...
  ## GPT-4 and GPT-4 Turbo
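
As a rough illustration of the HTML-to-Markdown step described in the comment above, the sketch below shells out to the `pandoc` binary using only the Python standard library. It is not pypandoc's API: the helper names `pandoc_args` and `html_to_markdown` are invented for this example, and it assumes a `pandoc` executable is already on the PATH.

```python
import shutil
import subprocess


def pandoc_args(src_format: str, dst_format: str) -> list[str]:
    """Build a pandoc command line that reads stdin and writes stdout."""
    # --from/--to select the reader and writer; --wrap=none stops pandoc
    # from hard-wrapping paragraphs in the output.
    return ["pandoc", f"--from={src_format}", f"--to={dst_format}", "--wrap=none"]


def html_to_markdown(html: str) -> str:
    """Convert an HTML string to GitHub-flavoured Markdown via pandoc."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc executable not found on PATH")
    result = subprocess.run(
        pandoc_args("html", "gfm"),
        input=html,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pandoc exits non-zero
    )
    return result.stdout
```

Calling `html_to_markdown(page_html)` on the HTML fetched by a scraper would return the Markdown text; the `gfm` target is one choice among pandoc's Markdown writers.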

onenote-md-exporter

9 780 6.0 C#

ConsoleApp to export OneNote notebooks to Markdown formats

Project mention: What is your use case for One Note? | /r/OneNote | 2023-12-09

The situation isn’t as bad as it would be for a cloud only service though. As long as you have backups, you can open the “.one” files with OneNote even if you have no access to OneDrive or sync. You can also use an external tool like the OneNote markdown exporter (https://github.com/alxnbl/onenote-md-exporter) to get a copy that at least has the basic content accessible outside of OneNote.

emanote

20 735 9.3 Haskell

Emanate a structured view of your plain-text notes

Project mention: Taking math notes on your computer [LINUX] | /r/learnmath | 2023-05-05

I'm personally using Emanote, which does exactly what you describe. It supports LaTeX and lots of other features via Pandoc. It's also very nice to use in that it supports hot-reloading, instead of requiring manual refreshing. The only downside for some might be that it's installed via the Nix ecosystem, which is (great but) a bit of a rabbit hole you might not want to deal with, particularly depending on your level of technicality on the computer.

awesome-scientific-writing

1 692 5.8

:keyboard: A curated list of awesome tools, demos and resources to go beyond LaTeX

obsidian-pandoc

23 617 0.0 TypeScript

Pandoc document export plugin for Obsidian (https://obsidian.md)

ConvertOneNote2MarkDown

10 582 5.7 PowerShell

Ready to make the step to Markdown and say farewell to OneNote, Evernote, or whatever proprietary note-taking tool you are using? Nothing beats clear text, right? Read on!

boost

1 560 1.3

Get started right. Become a shell native. This is the way. (by rwxrob)

panflute

3 475 4.3 Python

A Pythonic alternative to John MacFarlane's pandocfilters, with extra helper functions

Project mention: Pandoc | news.ycombinator.com | 2024-01-28

Interesting idea re:internal links. For sufficiently complex issues of this nature, pandoc filters[0] are a powerful tool for this kind of mid-conversion processing. I've made some cool projects with the Python package panflute[1]
[0] https://pandoc.org/filters.html
[1] https://github.com/sergiocorreia/panflute
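
To make the filter idea in the comment above concrete, here is a minimal sketch of a pandoc JSON filter written with the standard library only. It does not use panflute; it walks the JSON AST that pandoc hands to filters and rewrites internal link targets. The function names and the `.md` to `.html` rewrite rule are assumptions chosen for illustration.

```python
import json
import sys


def rewrite_links(node):
    """Recursively walk a pandoc JSON AST, pointing local .md links at .html."""
    if isinstance(node, list):
        return [rewrite_links(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Link":
            # A Link's content is [attributes, inline text, [url, title]].
            attr, inlines, (url, title) = node["c"]
            # Leave external URLs (anything with a scheme) untouched.
            if url.endswith(".md") and "://" not in url:
                url = url[: -len(".md")] + ".html"
            return {"t": "Link", "c": [attr, rewrite_links(inlines), [url, title]]}
        return {key: rewrite_links(value) for key, value in node.items()}
    return node  # strings, numbers, booleans, None


def run_filter(stdin=sys.stdin, stdout=sys.stdout):
    """Read a JSON document from stdin and write the rewritten one to stdout."""
    document = json.load(stdin)
    json.dump(rewrite_links(document), stdout)
```

Saved as a script that ends with a call to `run_filter()`, this could be passed to pandoc through its `--filter` option. Libraries such as panflute wrap the same traversal in typed element classes, which tends to be more comfortable for larger filters.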

gwern.net

16 434 9.9 Haskell

Site infrastructure for gwern.net (CSS/JS/HS/images/icons). Custom Hakyll website with unique automatic link archiving, recursive tooltip popup UX, dark mode, and typography (sidenotes+dropcaps+admonitions+inflation-adjuster).

Project mention: Show HN: My related-posts finder script (with LLM and GPT4 enhancement) | news.ycombinator.com | 2023-12-08

I do something similar on my website ( https://www.gwern.net ; crummy code at https://github.com/gwern/gwern.net/ ) for the 'similar' feature: call OA API with embedding, nearest-neighbor via cosine, list of links for suggested further reading.
Because it's a static site, managing the similar links poses the difficulties OP mentions: where do you store & update it? In the raw original Markdown? We solve it by transclusion: the list of 'similar' links is stored in a separate HTML snippet, which is just transcluded into the web page on demand. The snippets can be arbitrarily updated without affecting the Markdown essay source. We do this for other things too, it's a handy design pattern for static sites, to make things more compositional (allowing one HTML snippet to be reused in arbitrarily many places or allowing 'extremely large' pages) at the cost of some client-side work doing the transclusion.
I refine it in a couple ways: I don't need to call GPT-4 for summarization because the links all have abstracts/excerpts; I usually write abstracts for my own essays/posts (which everyone should do, and if the summaries are good enough to embed, why not just use them yourself for your posts? would also help your cache & cost issues, and be more useful than the 'explanation'). Then I also throw in the table of contents (which is implicitly an abstract), available metadata like tags & authors, and I further throw into the embeddings a list of the parsed links as well as reverse citations/backlinks. My assumption is that these improve the embedding by explicitly listing the URLs/titles of references, and what other pages find a given thing worth linking.
Parsing the links means I can improve the list of suggestions by deleting anything already linked in the article. OP has so few posts this may not be a problem for him, but if you are heavily hyperlinking and also have good embeddings (like I do), this will happen a lot, and it is annoying to a reader to be suggested links he has already seen and either looked at or ignored. This also means that it's easy to provide a curated 'see also' list: simply dump the similar list at the beginning, and keep the ones you like. They will be filtered out of the generated list automatically, so you can present known-good ones upfront and then the similars provide a regularly updated list of more. (Which helps handle the tension he notes between making a static list up front while new links regularly enter the system.)
One neat thing you can do with a list of hits, that I haven't seen anyone else do, is sort them by their distance to each other. The default presentation everyone does is to simply present them in order of distance to the target. This is sorta sensible because you at least see the 'closest' first, but the more links you have, the smaller the difference is, and the more that sorting looks completely arbitrary. What you can do instead is sort them by their distance to each other: if you do that, even in a simple greedy way, you get a list which automatically clusters by the internal topics. (Imagine there are two 'clusters' of topics equidistant to the current article; the default distance sort would give you something random-looking like A/B/B/A/B/A/A/A/B/B/A, which is painful to read, but if you sort by distance to each other to minimize the total distance, you'd get something more like B/B/B/B/B/B/A/A/A/A/A/A.) I call this 'sort by magic' or 'sort by semantic similarity': https://gwern.net/design#future-tag-features
Additional notes: I would not present 'Similarity score: 79% match' because I assume this is just the cosine distance, which is equal for both suggestions (and therefore not helpful) and also is completely embedding dependent and basically arbitrary. (A good heuristic is: would it mean anything to the reader if the number were smaller, larger, or has one less digit? A 'similarity score' of 89%, or 7.9, or 70%, would all mean the same thing to the reader - nothing.)
> Complex or not, calculating cosine similarity is a lot less work than creating a fully-fledged search algorithm, and the results will be of similar quality. In fact, I'd be willing to bet that the embedding-based search would win a head-to-head comparison most of the time.
You are probably wrong. The full search algorithm, using exact word count indexes of everything, is highly competitive with embedding search. If you are interested, the baseline you're looking for in research papers on retrieval is 'BM25'.
> For each post, the script then finds the top two most-similar posts based on the cosine similarity of the embedding vectors.
Why only top two? It's at the bottom of the page, you're hardly hurting for space.
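
The two ideas in the comment above, dropping suggestions the article already links to and then re-sorting the hits by their distance to each other, can be sketched in a few lines. This is an illustration only and not the gwern.net code: it assumes embeddings are already computed as plain lists of non-zero floats keyed by URL, and every function name here is hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def similar_links(target, embeddings, already_linked, k=10):
    """Return up to k URLs nearest to target, skipping already-linked ones."""
    pool = [url for url in embeddings if url not in already_linked]
    pool.sort(key=lambda url: cosine_distance(target, embeddings[url]))
    return pool[:k]


def sort_by_neighbor(urls, embeddings):
    """Greedily reorder urls so each one is followed by its nearest remaining hit."""
    if not urls:
        return []
    ordered = [urls[0]]  # keep the closest hit to the target in first place
    remaining = list(urls[1:])
    while remaining:
        last = embeddings[ordered[-1]]
        nearest = min(remaining, key=lambda u: cosine_distance(last, embeddings[u]))
        remaining.remove(nearest)
        ordered.append(nearest)
    return ordered
```

Feeding the output of `similar_links` into `sort_by_neighbor` turns an interleaved A/B/A/B ranking into runs of neighbouring items. The greedy walk is only one way to approximate a minimum-total-distance ordering; it is quadratic in the number of hits, which is usually acceptable for a short suggestion list.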

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Pandoc related posts

Beautifying Org Mode in Emacs (2018)
6 projects | news.ycombinator.com | 15 Apr 2024
LaTeX makes me so angry at word
1 project | news.ycombinator.com | 26 Mar 2024
Launch HN: Onedoc (YC W24) – A better way to create PDFs
11 projects | news.ycombinator.com | 11 Mar 2024
A pandoc LaTeX template to convert Markdown files to PDF or LaTeX
1 project | news.ycombinator.com | 22 Feb 2024
Pandoc 3.1.12 Released
1 project | news.ycombinator.com | 16 Feb 2024
Show HN: CLI for generating beautiful PDF for offline reading
4 projects | news.ycombinator.com | 5 Feb 2024
Pandoc
17 projects | news.ycombinator.com | 28 Jan 2024

Index

What are some of the best open-source Pandoc projects? This list will help you:

#	Project	Stars
1	pandoc	32,396
2	Zettlr	9,597
3	nb	6,294
4	pandoc-latex-template	5,801
5	markdown-preview-enhanced	4,058
6	OSCP-Exam-Report-Template-Markdown	3,289
7	rmarkdown	2,802
8	patat	2,323
9	djot	1,576
10	phd_thesis_markdown	1,186
11	vim-pandoc	940
12	pandoc-crossref	886
13	asynctasks.vim	876
14	Marker	822
15	pypandoc	799
16	onenote-md-exporter	780
17	emanote	735
18	awesome-scientific-writing	692
19	obsidian-pandoc	617
20	ConvertOneNote2MarkDown	582
21	boost	560
22	panflute	475
23	gwern.net	434