Haskell Pandoc

Open-source Haskell projects categorized as Pandoc

Top 23 Haskell Pandoc Projects

  • pandoc

    Universal markup converter

  • Project mention: Beautifying Org Mode in Emacs (2018) | news.ycombinator.com | 2024-04-15

    My main authoring tool is then Emacs Markdown Mode (https://jblevins.org/projects/markdown-mode/). For data entry, it comes with some bells and whistles similar to org-mode, like C-c C-l for inserting links etc.

    I seldom export my notes for external usage, but when I do, I use lowdown (https://kristaps.bsd.lv/lowdown/), which also comes with some nice output targets (among the more unusual are Groff and Terminal). Of course pandoc (https://pandoc.org/) does a very good job here, too.

  • patat

    Terminal-based presentations using Pandoc

  • Project mention: patat: Terminal-based presentations using Pandoc | /r/commandline | 2023-10-25
  • pandoc-crossref

    Pandoc filter for cross-references

  • Project mention: Is there a way to use pandoc-crossref for footnotes? | /r/linuxquestions | 2023-05-18

    I was going through this link but couldn't find anything about footnotes.

  • emanote

    Emanate a structured view of your plain-text notes

  • Project mention: Taking math notes on your computer [LINUX] | /r/learnmath | 2023-05-05

    I'm personally using Emanote, which does exactly what you describe. It supports LaTeX and lots of other features via Pandoc. It's also very nice to use in that it supports hot-reloading instead of requiring manual refreshing. The only downside for some might be that it's installed via the Nix ecosystem, which is great, but a bit of a rabbit hole you might not want to deal with, particularly depending on how technical a computer user you are.

  • gwern.net

    Site infrastructure for gwern.net (CSS/JS/HS/images/icons). Custom Hakyll website with unique automatic link archiving, recursive tooltip popup UX, dark mode, and typography (sidenotes+dropcaps+admonitions+inflation-adjuster).

  • Project mention: Show HN: My related-posts finder script (with LLM and GPT4 enhancement) | news.ycombinator.com | 2023-12-08

    I do something similar on my website ( https://www.gwern.net ; crummy code at https://github.com/gwern/gwern.net/ ) for the 'similar' feature: call OA API with embedding, nearest-neighbor via cosine, list of links for suggested further reading.

    Because it's a static site, managing the similar links poses the difficulties OP mentions: where do you store & update it? In the raw original Markdown? We solve it by transclusion: the list of 'similar' links is stored in a separate HTML snippet, which is just transcluded into the web page on demand. The snippets can be arbitrarily updated without affecting the Markdown essay source. We do this for other things too, it's a handy design pattern for static sites, to make things more compositional (allowing one HTML snippet to be reused in arbitrarily many places or allowing 'extremely large' pages) at the cost of some client-side work doing the transclusion.

    I refine it in a couple ways: I don't need to call GPT-4 for summarization because the links all have abstracts/excerpts; I usually write abstracts for my own essays/posts (which everyone should do, and if the summaries are good enough to embed, why not just use them yourself for your posts? would also help your cache & cost issues, and be more useful than the 'explanation'). Then I also throw in the table of contents (which is implicitly an abstract), available metadata like tags & authors, and I further throw into the embeddings a list of the parsed links as well as reverse citations/backlinks. My assumption is that these improve the embedding by explicitly listing the URLs/titles of references, and what other pages find a given thing worth linking.
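
    The similarity machinery described here comes down to cosine similarity over embedding vectors. A minimal sketch in Python (the function names and toy vectors below are illustrative; this is not gwern.net's actual code):

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def most_similar(target, candidates, k=3):
    """Return the ids of the k candidates closest to `target`
    by cosine similarity, most similar first."""
    ranked = sorted(candidates.items(),
                    key=lambda kv: cosine(target, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy usage: "a" is identical to the target, "b" nearly so, "c" orthogonal.
embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(most_similar([1.0, 0.0], embeddings, k=2))
```

    Real embeddings have hundreds or thousands of dimensions, but the ranking step is exactly this.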

    Parsing the links means I can improve the list of suggestions by deleting anything already linked in the article. OP has so few posts this may not be a problem for him, but if you are heavily hyperlinking and also have good embeddings (like I do), this will happen a lot, and it is annoying to a reader to be suggested links he has already seen and either looked at or ignored. This also means that it's easy to provide a curated 'see also' list: simply dump the similar list at the beginning, and keep the ones you like. They will be filtered out of the generated list automatically, so you can present known-good ones upfront and then the similars provide a regularly updated list of more. (Which helps handle the tension he notes between making a static list up front while new links regularly enter the system.)

    One neat thing you can do with a list of hits, that I haven't seen anyone else do, is change how you sort them. The default presentation everyone uses is to simply present them in order of distance to the target. This is sorta sensible because you at least see the 'closest' first, but the more links you have, the smaller the differences are, and the more that sorting looks completely arbitrary. What you can do instead is sort them by their distance to each other: if you do that, even in a simple greedy way, you get a list which automatically clusters by the internal topics. (Imagine there are two 'clusters' of topics equidistant to the current article; the default distance sort would give you something random-looking like A/B/B/A/B/A/A/A/B/B/A, which is painful to read, but if you sort by distance to each other to minimize the total distance, you'd get something more like B/B/B/B/B/B/A/A/A/A/A/A.) I call this 'sort by magic' or 'sort by semantic similarity': https://gwern.net/design#future-tag-features
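
    The greedy version of this ordering can be sketched in a few lines, assuming precomputed embeddings (the item names and 2-D vectors below are made up for illustration):

```python
import math

def cosine_distance(a, b):
    """1 minus cosine similarity: 0 for identical directions."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def sort_by_magic(items, embeddings):
    """Greedy chain ordering: start with the first item, then repeatedly
    append whichever remaining item is nearest to the last one appended.
    Nearby items end up adjacent, so topic clusters come out contiguous."""
    remaining = list(items)
    ordered = [remaining.pop(0)]
    while remaining:
        last = embeddings[ordered[-1]]
        nxt = min(remaining, key=lambda it: cosine_distance(last, embeddings[it]))
        remaining.remove(nxt)
        ordered.append(nxt)
    return ordered

# Two interleaved 'clusters' (a* near (1,0), b* near (0,1)) come out grouped.
emb = {"a1": [1.0, 0.0], "a2": [0.95, 0.05], "a3": [0.9, 0.1],
       "b1": [0.0, 1.0], "b2": [0.05, 0.95]}
print(sort_by_magic(["a1", "b1", "a2", "b2", "a3"], emb))
```

    Greedy nearest-neighbor chaining does not globally minimize the total distance (that is a traveling-salesman problem), but as the comment notes, even the simple greedy pass is enough to surface the clusters.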

    Additional notes: I would not present 'Similarity score: 79% match', because I assume this is just the cosine distance, which is equal for both suggestions (and therefore not helpful) and is also completely embedding-dependent and basically arbitrary. (A good heuristic: would it mean anything to the reader if the number were smaller, larger, or had one fewer digit? A 'similarity score' of 89%, or 7.9, or 70%, would all mean the same thing to the reader - nothing.)

    > Complex or not, calculating cosine similarity is a lot less work than creating a fully-fledged search algorithm, and the results will be of similar quality. In fact, I'd be willing to bet that the embedding-based search would win a head-to-head comparison most of the time.

    You are probably wrong. A full-fledged search algorithm, using exact word-count indexes of everything, is highly competitive with embedding search. If you are interested, the baseline to look for in research papers on retrieval is 'BM25'.
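
    For reference, the BM25 baseline mentioned here fits in a few lines. This is a textbook sketch of the Okapi BM25 scoring formula with conventional defaults (k1=1.5, b=0.75), not any particular search engine's implementation:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document (a list of tokens) against the query terms
    using the Okapi BM25 formula. Higher scores mean better matches."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N  # average document length
    # Document frequency: how many documents contain each query term.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5) + 1)
            f = tf[t]
            # Term frequency saturates via k1; b normalizes for doc length.
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

# Toy usage: the doc mentioning 'pandoc' twice outranks the one mentioning it once.
docs = [["pandoc", "filter"], ["pandoc", "pandoc", "markdown"], ["haskell"]]
print(bm25_scores(["pandoc"], docs))
```

    Despite its simplicity, this term-frequency/document-length weighting is the standard lexical baseline that dense embedding retrievers are measured against.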

    > For each post, the script then finds the top two most-similar posts based on the cosine similarity of the embedding vectors.

    Why only top two? It's at the bottom of the page, you're hardly hurting for space.

  • pandoc-plot

    Render and include figures in Pandoc documents using your plotting toolkit of choice

  • citeproc

    CSL citation processing library in Haskell

  • pandoc-sidenote

    Convert Pandoc Markdown-style footnotes into sidenotes

  • pandoc-types

    Types for representing structured documents

  • pandoc-csv2table

    A Pandoc filter that renders CSV as Pandoc Markdown Tables.

  • pandoc-include

    An include filter for Pandoc

  • asciidoc-hs

    AsciiDoc parser that can be used as a Pandoc front-end, written in Haskell

  • pandoc-citeproc-preamble

    Insert a preamble before pandoc-citeproc's bibliography

  • pandoc-placetable

    Pandoc filter to include CSV data (from file or URL)

  • pandoc-emphasize-code

    A Pandoc filter for emphasizing code in fenced blocks

  • pandoc-markdown-ghci-filter

    A Pandoc filter that identifies Haskell code in Markdown, executes the code in GHCi and embeds the results in the returned Markdown.

  • pandoc-filter-graphviz

    Interprets '~~~ graphviz' blocks as calls to the Graphviz software and substitutes the text with the produced picture

  • pandoc-japanese-filters

    Pandoc filters for handling Japanese-specific markup

  • pandoc-lens

    Lenses for the Pandoc AST

  • reflex-dom-pandoc

    Render Pandoc documents in reflex-dom

  • styleFromMeta

    Pandoc filter to apply styles found in the metadata of the document

  • pandoc-utils

    Utility functions to work with Pandoc in Haskell applications.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Pandoc projects in Haskell? This list will help you:

Project Stars
1 pandoc 32,312
2 patat 2,323
3 pandoc-crossref 886
4 emanote 735
5 gwern.net 434
6 pandoc-plot 209
7 citeproc 139
8 pandoc-sidenote 135
9 pandoc-types 104
10 pandoc-csv2table 95
11 pandoc-include 60
12 asciidoc-hs 44
13 pandoc-citeproc-preamble 39
14 pandoc-placetable 38
15 pandoc-emphasize-code 28
16 pandoc-markdown-ghci-filter 14
17 hakyll-shortcut-links 11
18 pandoc-filter-graphviz 10
19 pandoc-japanese-filters 10
20 pandoc-lens 9
21 reflex-dom-pandoc 6
22 styleFromMeta 4
23 pandoc-utils 2
