spark-sbt.g8 vs Tabula
| | spark-sbt.g8 | Tabula |
|---|---|---|
| Mentions | 2 | 6 |
| Stars | 73 | 1,736 |
| Growth | - | 0.9% |
| Activity | 1.8 | 0.0 |
| Last commit | about 3 years ago | 8 days ago |
| Language | Scala | Java |
| License | - | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
spark-sbt.g8
- Open source PySpark project idea
I built this for Spark Scala projects with SBT and it has been used extensively: https://github.com/MrPowers/spark-sbt.g8
- Ask HN: What are some tools / libraries you built yourself?
I built daria (https://github.com/MrPowers/spark-daria) to make it easier to write Spark and spark-fast-tests (https://github.com/MrPowers/spark-fast-tests) to provide a good testing workflow.
quinn (https://github.com/MrPowers/quinn) and chispa (https://github.com/MrPowers/chispa) are the PySpark equivalents.
Built bebe (https://github.com/MrPowers/bebe) to expose the Spark Catalyst expressions that aren't exposed to the Scala / Python APIs.
Also built spark-sbt.g8 to create a Spark project with a single command: https://github.com/MrPowers/spark-sbt.g8
Tabula
- Tabula – Extract tables from PDF files
- Ask HN: What are some tools / libraries you built yourself?
tabula-java [0], a library for extracting tables from PDF files. It started as a monolithic webapp written in JRuby, and we later extracted the table detection and segmentation logic into a Java library.
[0] https://github.com/tabulapdf/tabula-java
- Tabula: Liberate Data From PDF Tables [jRuby]
Ties together a Cuba web app, the tabula-java library, and launch4j to provide a platform executable.
What are some alternatives?
Pion WebRTC - Pure Go implementation of the WebRTC API
OpenPDF - OpenPDF is a free Java library for creating and editing PDF files, with an LGPL and MPL open source license. OpenPDF is based on a fork of iText. We welcome contributions from other developers. Please feel free to submit pull requests and bug reports to this GitHub repository.
Nullboard - Nullboard is a minimalist kanban board, focused on compactness and readability.
Open HTML to PDF - An HTML to PDF library for the JVM. Based on Flying Saucer and Apache PDF-BOX 2. With SVG image support. Now also with accessible PDF support (WCAG, Section 508, PDF/UA)!
tera - A template engine for Rust based on Jinja2/Django
vscode-jq - jq LiveView Extension for VS Code
emr-job-templates - A sample repository of production-ready Spark code for use with Amazon EMR.
cakephp-swagger-bake - Automatically generate OpenAPI, Swagger, and Redoc documentation from your existing CakePHP code.
GoJS, a JavaScript Library for HTML Diagrams - JavaScript diagramming library for interactive flowcharts, org charts, design tools, planning tools, visual languages.
intercooler-js - Making AJAX as easy as anchor tags
yadm - Yet Another Dotfiles Manager
QR-Code-generator - High-quality QR Code generator library in Java, TypeScript/JavaScript, Python, Rust, C++, C.