bebe
Tabula
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
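The page does not give the formula behind the activity number, so the following is only a minimal sketch of one way such a score could work. It assumes an exponential decay on commit age and a rank-based 0 to 10 scale. The function names, the 30-day half-life, and the scaling rule are all hypothetical, not the site's actual method.

```python
from datetime import date


def raw_activity(commit_dates, today, half_life_days=30.0):
    """Sum per-commit weights; a commit's weight halves every half_life_days.

    The decay rule and the 30-day default are assumptions for illustration.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def relative_activity(raw_scores):
    """Map raw scores to a 0-10 scale.

    Each project gets 10 * (fraction of tracked projects with a strictly
    lower raw score), so a score of 9.0 means 90% of projects rank below it.
    """
    n = len(raw_scores)
    return {
        name: 10.0 * sum(1 for other in raw_scores.values() if other < score) / n
        for name, score in raw_scores.items()
    }


if __name__ == "__main__":
    today = date(2021, 6, 1)
    # Hypothetical commit histories for two projects.
    raw = {
        "project-a": raw_activity([date(2021, 5, 30), date(2021, 5, 2)], today),
        "project-b": raw_activity([date(2020, 1, 15)], today),
    }
    print(relative_activity(raw))
```

Under this scheme the score is relative: a project's number can change even when its own commit history does not, because it depends on how the other tracked projects rank.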
bebe
- Ask HN: What are some tools / libraries you built yourself?
I built daria (https://github.com/MrPowers/spark-daria) to make it easier to write Spark and spark-fast-tests (https://github.com/MrPowers/spark-fast-tests) to provide a good testing workflow.
quinn (https://github.com/MrPowers/quinn) and chispa (https://github.com/MrPowers/chispa) are the PySpark equivalents.
Built bebe (https://github.com/MrPowers/bebe) to expose the Spark Catalyst expressions that aren't exposed to the Scala / Python APIs.
Also built spark-sbt.g8 to create a Spark project with a single command: https://github.com/MrPowers/spark-sbt.g8
- Finished porting all the Spark SQL functions that aren't exposed via the Scala API to the bebe project
The bebe project fills all these gaps in the Scala API. See the project README for examples on how each function works.
- Making the Spark DataFrame composition type safe(r)
See here for a more detailed discussion and let me know your thoughts!!
Tabula
- Tabula – Extract tables from PDF files
- Ask HN: What are some tools / libraries you built yourself?
tabula-java [0], a library for extracting tables from PDF files. It started as a monolithic webapp written in JRuby, and we later extracted the table detection and segmentation logic into a Java library.
[0] https://github.com/tabulapdf/tabula-java
- Tabula: Liberate Data From PDF Tables [jRuby]
Ties together a Cuba web app, the tabula-java library, and launch4j to provide a platform executable.
What are some alternatives?
frameless - Expressive types for Spark.
OpenPDF - OpenPDF is a free Java library for creating and editing PDF files, with an LGPL and MPL open source license. OpenPDF is based on a fork of iText. We welcome contributions from other developers. Please feel free to submit pull requests and bug reports to this GitHub repository.
kondo - Cleans dependencies and build artifacts from your projects.
Open HTML to PDF - An HTML to PDF library for the JVM. Based on Flying Saucer and Apache PDF-BOX 2. With SVG image support. Now also with accessible PDF support (WCAG, Section 508, PDF/UA)!
sqldb-logger - A logger for Go SQL database driver without modifying existing *sql.DB stdlib usage.
vscode-jq - jq LiveView Extension for VS Code
gutenberg - A fast static site generator in a single binary with everything built-in. https://www.getzola.org
cakephp-swagger-bake - Automatically generate OpenAPI, Swagger, and Redoc documentation from your existing CakePHP code.
yadm - Yet Another Dotfiles Manager
intercooler-js - Making AJAX as easy as anchor tags
Shynet - Modern, privacy-friendly, and detailed web analytics that works without cookies or JS.
QR-Code-generator - High-quality QR Code generator library in Java, TypeScript/JavaScript, Python, Rust, C++, C.