| | spark-sbt.g8 | gazpacho |
|---|---|---|
| Mentions | 2 | 1 |
| Stars | 73 | 731 |
| Growth | - | - |
| Activity | 1.8 | 3.2 |
| Latest commit | about 3 years ago | 5 months ago |
| Language | Scala | Python |
| License | - | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
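The exact formula behind the activity number is not given here, so the following is only a minimal sketch of one way such a score could behave. It assumes two things that are not stated above: that each commit's weight halves every `half_life_days` (a hypothetical parameter), and that the 0-10 score is the share of tracked projects with lower raw activity, times ten. The function names are illustrative only.

```python
from bisect import bisect_left


def raw_activity(commit_ages_days, half_life_days=30.0):
    """Sum of per-commit weights; a commit's weight halves every half_life_days.

    ASSUMPTION: exponential decay is used only to illustrate "recent commits
    have higher weight than older ones"; the real weighting is not published here.
    """
    return sum(0.5 ** (age / half_life_days) for age in commit_ages_days)


def activity_score(raw, all_raw):
    """Relative score on a 0-10 scale.

    ASSUMPTION: score = 10 * (fraction of tracked projects whose raw activity
    is strictly lower), so 9.0 means 90% of projects are less active,
    i.e. the project is in the top 10%.
    """
    ranked = sorted(all_raw)
    below = bisect_left(ranked, raw)  # number of projects strictly below `raw`
    return round(10.0 * below / len(ranked), 1)


if __name__ == "__main__":
    # 100 hypothetical tracked projects with raw activity 0.0 .. 99.0
    tracked = [float(i) for i in range(100)]
    print(activity_score(90.0, tracked))
    # A commit from yesterday outweighs one from 100 days ago
    print(raw_activity([1]) > raw_activity([100]))
```

Under these assumptions the score is purely relative: a project's number can drop without any change in its own commit history if other tracked projects become more active.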
spark-sbt.g8
Open source PySpark project idea
I built this for Spark Scala projects with SBT and it has been used extensively: https://github.com/MrPowers/spark-sbt.g8
Ask HN: What are some tools / libraries you built yourself?
I built daria (https://github.com/MrPowers/spark-daria) to make it easier to write Spark and spark-fast-tests (https://github.com/MrPowers/spark-fast-tests) to provide a good testing workflow.
quinn (https://github.com/MrPowers/quinn) and chispa (https://github.com/MrPowers/chispa) are the PySpark equivalents.
Built bebe (https://github.com/MrPowers/bebe) to expose the Spark Catalyst expressions that aren't exposed to the Scala / Python APIs.
Also built spark-sbt.g8 to create a Spark project with a single command: https://github.com/MrPowers/spark-sbt.g8
gazpacho
Ask HN: What are some tools / libraries you built yourself?
I've been working on gazpacho [1] for the last two years.
It's a general purpose web scraping library for Python that replaces BeautifulSoup + requests for most projects.
Just surpassed ~2K downloads per week!
[1] https://github.com/maxhumber/gazpacho
What are some alternatives?
Pion WebRTC - Pure Go implementation of the WebRTC API
selectolax - Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors).
Nullboard - Nullboard is a minimalist kanban board, focused on compactness and readability.
lxml - The lxml XML toolkit for Python
tera - A template engine for Rust based on Jinja2/Django
html5lib - Standards-compliant library for parsing and serializing HTML documents and fragments in Python
emr-job-templates - A sample repository of production-ready Spark code for use with Amazon EMR.
xmltodict - Python module that makes working with XML feel like you are working with JSON
GoJS, a JavaScript Library for HTML Diagrams - JavaScript diagramming library for interactive flowcharts, org charts, design tools, planning tools, visual languages.
xhtml2pdf - A library for converting HTML into PDFs using ReportLab
yadm - Yet Another Dotfiles Manager
untangle - Converts XML to Python objects