Go Text processing

Open-source Go projects categorized as Text processing

Top 23 Go Text processing Projects

  • micro-editor

    A modern and intuitive terminal-based text editor

    Project mention: Does anyone know what program this is? | /r/commandline | 2023-05-16

    Hey there! That looks like the text editor "micro". It's got a nice ncurses interface and indeed has a minimap feature that renders the text. You can check it out here: https://github.com/zyedidia/micro

  • GoQuery

    A little like that j-thing, only in Go.

    Project mention: Check to see if JSON contains something | /r/golang | 2023-04-10

    How about - https://github.com/PuerkitoBio/goquery

  • sh

    A shell parser, formatter, and interpreter with bash support; includes shfmt (by mvdan)

    Project mention: Shfmt – format shell programs (like gofmt, rustfmt) | news.ycombinator.com | 2023-02-11

  • blackfriday

    Blackfriday: a markdown processor for Go

    Project mention: I wrote a markdown to html converter | /r/golang | 2023-02-01

    unless this is an exercise in "how to make my own markdown processor" I'd suggest using proven https://github.com/russross/blackfriday

  • toml

    TOML parser for Golang with reflection. (by BurntSushi)

    Project mention: how to write struct data into a file | /r/golang | 2022-10-20

  • go-humanize

    Go Humans! (formatters for units to human friendly sizes)

  • goldmark

    :trophy: A markdown parser written in Go. Easy to extend, standard(CommonMark) compliant, well structured.

    Project mention: Markdown library recommendations | /r/golang | 2023-05-22

    Goldmark used by Hugo.

  • bluemonday

    bluemonday: a fast golang HTML sanitizer (inspired by the OWASP Java HTML Sanitizer) to scrub user generated content of XSS

    Project mention: Sponsor the open source projects you depend on | news.ycombinator.com | 2023-04-10

    I'm on the receiving end of donations from sourcegraph for this. It's around $10 per month from that single donation and is for the only Go HTML sanitizer, which you use when you have user generated / untrusted input that you need to display as HTML. https://github.com/microcosm-cc/bluemonday

    For me the library has been good enough for my own use for a very very long time. I mostly neglect it unless there's some critical issue. I don't improve it at all as my time is better spent on my day job.

    I've often thought that there's room for improvement such as a DOM style sanitizer to validate input rather than just a SAX style sanitizer, perhaps formatting of output in addition to sanitising input, transformation rules, etc.

    When I got the donation I was surprised, first ever bit of support for open source software I'd written (as this was not written on company dime).

    Even at $10 per month it's motivating enough to think someone values it. If it accrues into something significant I may actually feel motivated to improve it.

    Interesting is that I'd regard this as successful by usage, it's used by virtually everything in the Go world that makes a website.

    Perhaps people don't know it exists though? And for that awareness thanks to thanks.dev

  • gofeed

    Parse RSS, Atom and JSON feeds in Go

    Project mention: IndieWebifying my Website Part 1 - Microformats and Webmentions | dev.to | 2022-11-12

    Luckily I did not have to implement any of this myself apart from some glue code to fit it together: I used the library gocron for scheduling the regular intervals, gofeed for parsing the RSS feed and webmention for extracting links and sending webmentions.

  • frangipanni

    Program to convert lines of text into a tree structure.

    Project mention: Ask HN: Programs that saved you 100 hours? (2022 edition) | news.ycombinator.com | 2022-12-20

  • xurls

    Extract urls from text

    Project mention: Guys, when using the regexp package, in reporting errors how do I need to fix this. | /r/golang | 2022-10-09

  • slug

    URL-friendly slugify with multiple languages support.

  • lingua-go

    The most accurate natural language detection library for Go, suitable for long and short text alike

    Project mention: Lingua 1.2.0 - The most accurate natural language detection library for Go, now with support for detecting multiple languages in mixed-language text | /r/golang | 2022-12-12

  • commonregex

    🍫 A collection of common regular expressions for Go (by mingrammer)

  • htmlquery

    htmlquery is golang XPath package for HTML query.

    Project mention: Dumb idea for testing output of a static site generator: use on-page DOM inspection instead of playwright? | /r/golang | 2022-11-25

    I would suggest https://github.com/antchfx/htmlquery instead. It would basically give you access to the same sort of things you'd run in the browser for this case, only without a browser. At least as long as we're talking static HTML generation.

  • Dataflow kit

    Extract structured data from web sites. Web sites scraping.

  • whatlanggo

    Natural language detection library for Go

  • xpath

    XPath package for Golang, supports HTML, XML, JSON document query.

    Project mention: I have this code On Playground.. It is very simplified... but when reading from file it breaks and cannot handle rune characters.... The strings.Replace function just stops working | /r/golang | 2023-02-23

    It looks like you're trying to parse HTML by using the strings package. For reference, you might be better off using an xpath tool or the html package that has built-in tokenizers to do your tokenizing. That makes it easier to find the nodes you're looking for and the values contained within those nodes.

  • omniparser

    omniparser: a native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.

  • mxj

    Decode / encode XML to/from map[string]interface{} (or JSON); extract values with dot-notation paths and wildcards. Replaces x2j and j2x packages.

    Project mention: Newbie: I have a big xml file, the content is much nested tags and what I need to do is adding a field in a very nested tag in this file. One “not elegant” way is to make thousands of structs to parse the file. Do you guys have a simple solution for a task like that. | /r/golang | 2023-01-30

    It generates Go structs from XML files. Compared to projects like https://github.com/clbanning/mxj, it generates much better Go code and you can feed it multiple example XML files.

  • go-runewidth

    wcwidth for golang

  • html-to-markdown

    ⚙️ Convert HTML to Markdown. Even works with entire websites and can be extended through rules. (by JohannesKaufmann)

  • gographviz

    Parses the Graphviz DOT language in golang

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-05-22.

Index

What are some of the best open-source Text processing projects in Go? This list will help you:

 #  Project            Stars
 1  micro-editor      21,628
 2  GoQuery           12,607
 3  sh                 5,891
 4  blackfriday        5,162
 5  toml               4,210
 6  go-humanize        3,617
 7  goldmark           2,733
 8  bluemonday         2,694
 9  gofeed             2,178
10  frangipanni        1,188
11  xurls              1,053
12  slug                 954
13  lingua-go            884
14  commonregex          856
15  htmlquery            606
16  Dataflow kit         593
17  whatlanggo           587
18  xpath                579
19  omniparser           576
20  mxj                  560
21  go-runewidth         528
22  html-to-markdown     524
23  gographviz           520