Python Text

Open-source Python projects categorized as Text

Top 21 Python Text Projects

  • TextRecognitionDataGenerator

    A synthetic data generator for text recognition

  • aeneas

    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

    Project mention: Anyone know of a tool to align (existing) subtitles to audio along sentence boundaries? | reddit.com/r/LanguageTechnology | 2023-02-03

    You could try aeneas. Syncabook apparently uses the afaligner library, which says that it was inspired by aeneas but uses FastDTW to find an approximation to the optimal warping path. This might make it slightly less accurate than aeneas.
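
    For context on the accuracy trade-off mentioned above: exact dynamic time warping (DTW) fills a full cost table to find the optimal warping path, whereas FastDTW only approximates that path. Below is a minimal sketch of exact DTW on two short numeric sequences. It is not code from aeneas or afaligner; the function name `dtw_path` and the absolute-difference cost are assumptions chosen for illustration.

```python
def dtw_path(a, b):
    """Exact DTW between two numeric sequences; returns (total_cost, path)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the last cell to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path
```

    The table has one cell per pair of positions, so time and memory grow with the product of the two lengths. That cost is what approximations such as FastDTW try to avoid.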

  • alibi-detect

    Algorithms for outlier, adversarial and drift detection

    Project mention: [D] Distributions to represent an Image Dataset | reddit.com/r/MachineLearning | 2023-02-24

    That is, to see whether a test image belongs in the distribution of the training images and to provide a routine for special cases. After a bit of reading I've found that this is related to the field of drift detection, for which I tried out alibi-detect. There, an autoencoder is trained on the training images, and any subsequent drift is flagged by the AE.

  • art

    🎨 ASCII art library for Python

    Project mention: ART 5.8 released: ASCII and Non-ASCII art library for Python | reddit.com/r/coolgithubprojects | 2022-11-23

  • evennia

    Python MUD/MUX/MUSH/MU* development system

    Project mention: I want to make a MUD - where to start? | reddit.com/r/MUD | 2023-03-22

    Evennia (https://www.evennia.com/) is quite easy to start with and flexible.

  • pytorch-widedeep

    A flexible package for multimodal-deep-learning to combine tabular data with text and images using Wide and Deep models in Pytorch

    Project mention: why can't I import pytorch-widedeep ? | reddit.com/r/learnmachinelearning | 2022-05-24

    Ask the dev https://github.com/jrzaurin/pytorch-widedeep/issues

  • pygame-menu

    A menu for pygame. Simple, and easy to use

    Project mention: Per-Pixel Alpha Function works (pretty much) | reddit.com/r/pygame | 2022-10-17

    So I was searching around for a clean way to make ripple effects for my splash screen when I came across pygame-menu. I was interested in the "apply_image_function" method in '/pygame-menu/baseimage.py'

  • pygame-text-input

    a small module that enables you to input text with your keyboard using pygame

    Project mention: Show HN: Pygame's Text Input Module | news.ycombinator.com | 2022-12-28

  • zeroshot_topics

    Topic Inference with Zeroshot models

  • py_midicsv

    A Python port and library-fication of the midicsv tool by John Walker. If you need to convert MIDI files to human-readable text files and back, this is the library for you.

    Project mention: Can you install GitHub software with Termux? | reddit.com/r/termux | 2022-09-12

  • namekrea

    NameKrea is an AI Domain Name Generator which uses GPT-2

  • To-ASCII

    Convert videos, images, gifs, and even live video to ascii art! (Now with color support!)

    Project mention: To-ASCII 5.0 - A command line tool and library for making ASCII art from images and video! | reddit.com/r/Python | 2022-07-22

    To-ASCII is an old project of mine that converts images, video, and even live video from a webcam to ASCII art! I recently came back to it and rewrote it to add color support, improved modularity, an improved CLI, and Nim extensions!
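
    The basic idea behind image-to-ASCII conversion can be sketched in a few lines. This is not To-ASCII's implementation or API: the function name `to_ascii`, the character ramp, and the plain 2D list of grayscale values as input are all assumptions made for illustration.

```python
# Assumed ramp: low pixel values map to sparse characters, high values to dense ones.
RAMP = " .:-=+*#%@"

def to_ascii(pixels):
    """Map a 2D list of grayscale values (0-255) to lines of ASCII characters."""
    lines = []
    for row in pixels:
        # Scale each value into an index of the ramp (0 .. len(RAMP) - 1).
        chars = [RAMP[value * (len(RAMP) - 1) // 255] for value in row]
        lines.append("".join(chars))
    return "\n".join(lines)
```

    A real tool would first decode and downscale the image or video frame. This sketch covers only the brightness-to-character mapping.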

  • pytextcodifier

    :package: Turn your text files into codified images or your codified images into text files.

  • litemark

    Lightweight Markdown dialect for Python desktop apps

    Project mention: Show HN: Exn – Write and render rich, scriptable, and interactive notes | news.ycombinator.com | 2023-02-26

    Hi HN!

    I'm Alex, a tech enthusiast. I'm excited to show you Exonote [1] and Exn [2], two projects I'm working on.

    Years ago I crafted Litemark [3] and Codegame [4], two projects for creating codegames, which are programming puzzles with a backstory. A codegame is made up of levels stored in plain old text files as prose written with Litemark, a markup language inspired by Markdown [5]. The player would open the Codegame app to access the first level of the game, read the prose, and then submit Python code to solve the puzzle. The submitted code is evaluated by the Codegame app, then the return value is compared to the expected result previously defined by the author for this level. A correct answer would unlock the next level.
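
    The evaluate-and-compare step described above can be sketched as follows. This is not Codegame's actual code: the function name `check_submission` and the convention that a submission defines a `solve()` function are assumptions, and `exec` here runs the submitted code with no sandboxing at all.

```python
def check_submission(source, expected):
    """Run submitted source, call its solve() function, compare with expected."""
    namespace = {}
    try:
        exec(source, namespace)  # WARNING: executes arbitrary code, no sandbox
        result = namespace["solve"]()
    except Exception:
        # Syntax errors, runtime errors, or a missing solve() count as wrong.
        return False
    return result == expected
```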

    The idea behind Litemark and Codegame has evolved to embrace more possibilities, leading to the creation of Exonote and Exn which are no longer limited to "programming puzzles with a backstory".

    Exonote is a Markdown-inspired markup language for writing rich, scriptable, and interactive notes. An eponymous Python package is available on PyPI to serve as the reference library. The lowercase word 'exonote' could be used as a common noun for a document written with this markup language.

    This markup language makes it possible to add interactivity to notes by embedding GUI programs written with Tkinter [6]. Additionally, all or part of an exonote can be arbitrarily generated using custom Python scripts.

    On top of Exonote, with Tkinter I built Exn, a lightweight Python desktop application to browse a dossier of exonotes. A dossier is a directory that contains plain old text files with the ".exn" extension (exonotes), assets (images for examples), and Python source code.

    Exn's graphical user interface is a metaphor for a book whose pages are exonotes. Thus, the left and right arrow keys allow the reader to navigate from one page to another. The order of the pages is determined by an index file which can be generated automatically from the command line. This file contains the list of exonotes (ordered by their creation timestamp), their titles, and their tags.

    Exn also has a built-in search engine that supports regular expressions [7], a Table of Contents (ToC) builder, a 'switcher' (Ctrl+Tab) and other cool stuff.

    By expanding the original Litemark/Codegame idea, I unwittingly introduced a security risk. Suppose Bob doesn't have a personal website but has a GitHub repository that he uses as the public dossier for his exonotes. Alice is a tech-savvy person who knows what exonotes are and would love to explore the contents of the dossier with Exn. But she worries about the security risk of running untrusted code.

    To solve this problem without going on an endless sandbox over-engineering journey, I added to Exn two command line options to browse a folder with low and high restriction. Low restriction mode blocks the execution of embedded programs while high restriction mode not only does the same but also blocks executable links (preventing the user from inadvertently running code by clicking on an executable link).
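
    The two restriction levels described above can be modeled as a small policy table. This is only a sketch of the stated behavior, not Exn's code or command line interface; the names `Restriction` and `allowed_actions` are hypothetical.

```python
from enum import Enum

class Restriction(Enum):
    NONE = 0   # trusted dossier: everything allowed
    LOW = 1    # blocks embedded programs
    HIGH = 2   # blocks embedded programs and executable links

def allowed_actions(level):
    """Return which risky actions a viewer would permit at a given level."""
    return {
        "run_embedded_programs": level is Restriction.NONE,
        "follow_executable_links": level is not Restriction.HIGH,
    }
```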

    There is more to say about this (double) project, such as the Viewer API to manipulate the live representation of an exonote from a Python script.

    This project is functional but still a work in progress, with precarious documentation. I'm planning to add a nice plugin mechanism (at the moment it's only possible to change colors and font size of elements) so people can customize Exn (themes, et cetera) or add new functions.

    A demo [8] is available to play around with and there is also a "Why use this project" section in the README which contains some interesting stuff not covered in this post.

    I would like to know what you think [9] of Exonote and Exn. Your questions, criticisms, and suggestions are welcome!

    Postscript: I found by serendipity that the third most popular post on HN (Mechanical Watch by Bartosz Ciechanowski) [10] looks like what I thought was a cool example of what can be done with Exonote/Exn. I had imagined a bike enthusiast working on a prototype on the weekends, taking notes, inserting relevant hyperlinks and images, building interactive 3D models of different parts of the bike and embedding them into the notes, et cetera. It would be a dossier of exonotes describing from scratch how the bike was built, with an astonishing level of detail. This person could keep this dossier private forever or publish it online. Once released, depending on the dossier's license, it can evolve much like open source software.

    [1] https://github.com/pyrustic/exonote

    [2] https://github.com/pyrustic/exn

    [3] https://github.com/pyrustic/litemark

    [4] https://github.com/pyrustic/codegame

    [5] https://en.wikipedia.org/wiki/Markdown

    [6] https://en.wikipedia.org/wiki/Tkinter

    [7] https://en.wikipedia.org/wiki/Regular_expression

    [8] https://github.com/pyrustic/exn#demo

    [9] http://sl4.org/crocker.html

    [10] https://news.ycombinator.com/item?id=31261533

  • lossytextcompressor

    Lossy text compressor

    Project mention: Silly Lossy Text Compression Idea | news.ycombinator.com | 2022-05-21

    Cool :) I made a little script that uses a thesaurus to try to shorten text https://github.com/anfractuosity/lossytextcompressor
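
    The thesaurus idea mentioned above can be illustrated with a toy example. This is not the linked script: the synonym table `SHORTER` is made up and stands in for a real thesaurus, and `shorten` is a hypothetical helper name.

```python
import re

# A tiny, made-up synonym table standing in for a real thesaurus.
SHORTER = {"approximately": "about", "utilize": "use", "purchase": "buy"}

def shorten(text, synonyms=SHORTER):
    """Replace each word with a shorter synonym when the table has one."""
    def swap(match):
        word = match.group(0)
        candidate = synonyms.get(word.lower(), word)
        # Only substitute when it actually saves characters.
        # Note: the original word's capitalization is not preserved.
        return candidate if len(candidate) < len(word) else word
    return re.sub(r"[A-Za-z]+", swap, text)
```

    The compression is lossy because the original wording cannot be recovered from the shortened text.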

  • TextyDungeon

    A text rpg game engine using python and json

    Project mention: I’m creating a text rpg game engine | reddit.com/r/Python | 2022-07-07

  • COVID19-Vaccine-Spotter-Extension-Python

    This is an extension of Vaccine Spotter. I use their API with a combination of Twilio and Python to monitor vaccine appointments and send texts.

  • pythontextnow

    Send SMS messages from Python! A Python wrapper for TextNow.

    Project mention: Python Library for TextNow | reddit.com/r/PythonProjects2 | 2022-08-23

    I recently released version 1.0.0 of a library called [pythontextnow](https://github.com/joeyagreco/pythontextnow) that allows you to interact with TextNow through Python.

  • downthon_AB

    downthon_AB is a simple Python script that can turn a folder of Markdown formatted .txt or .md files into something very nearly resembling a blog.

    Project mention: downthon_AB is a simple Python script that can turn a folder of Markdown formatted .txt or .md files into a static HTML blog. | reddit.com/r/Python | 2022-06-02

  • screenshot_identifier

    Make it easier to search through your screenshots in macOS by mining the data in the pictures using optical character recognition (OCR) and putting it in the title

  • oneline

    Read a text file, one line at a time (by CharlesHawkins)

    Project mention: Command like “less” but show one line at a time | reddit.com/r/commandline | 2022-09-16

    This program, oneline, does almost that, though it doesn't clear the screen. You could maybe do

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-03-22.

Index

What are some of the best open-source Text projects in Python? This list will help you:

Rank Project Stars
1 TextRecognitionDataGenerator 2,610
2 aeneas 2,133
3 alibi-detect 1,742
4 art 1,678
5 evennia 1,557
6 pytorch-widedeep 1,023
7 pygame-menu 424
8 pygame-text-input 131
9 zeroshot_topics 57
10 py_midicsv 56
11 namekrea 47
12 To-ASCII 43
13 pytextcodifier 14
14 litemark 10
15 lossytextcompressor 6
16 TextyDungeon 4
17 COVID19-Vaccine-Spotter-Extension-Python 3
18 pythontextnow 3
19 downthon_AB 1
20 screenshot_identifier 1
21 oneline 0