chispa
leapp
Metric | chispa | leapp |
---|---|---|
Mentions | 12 | 73 |
Stars | 508 | 1,524 |
Growth | - | 1.2% |
Activity | 6.7 | 9.7 |
Last commit | 7 days ago | 8 days ago |
Language | Python | TypeScript |
License | MIT License | Mozilla Public License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
chispa
-
Testing spark applications
Unit and e2e tests using a combination of pytest and chispa (https://github.com/MrPowers/chispa). Custom library to create random test data that fits schema with optional hardcoded overrides for relevant fields to test business logic.
-
Spark open source community is awesome
here's a little README fix a user pushed to chispa
-
Invitation to collaborate on open source PySpark projects
chispa is a library of PySpark testing functions.
-
installing pyspark on my m1 mac, getting an env error
The other approach I've used is Poetry; see the chispa project for an example. Poetry is especially nice for projects that you'd like to publish to PyPI because those commands are built in.
-
Spark: local dev environment
- All Spark transformations are tested with pytest + chispa (https://github.com/MrPowers/chispa)
-
Pyspark now provides a native Pandas API
Pandas syntax is far inferior to regular PySpark in my opinion. Goes to show how much data analysts value a syntax that they're already familiar with. Pandas syntax makes it harder to reason about queries, abstract DataFrame transformations, etc. I've authored some popular PySpark libraries like quinn and chispa and am not excited to add Pandas syntax support, haha.
-
Show dataengineering: beavis, a library for unit testing Pandas/Dask code
I am the author of spark-fast-tests and chispa, libraries for unit testing Scala Spark / PySpark code.
-
Tips for building popular open source data engineering projects
Blogging has been the main way I've been able to attract users. Someone searches "testing PySpark", they see this blog, and then they're motivated to try chispa.
-
Ask HN: What are some tools / libraries you built yourself?
I built daria (https://github.com/MrPowers/spark-daria) to make it easier to write Spark and spark-fast-tests (https://github.com/MrPowers/spark-fast-tests) to provide a good testing workflow.
quinn (https://github.com/MrPowers/quinn) and chispa (https://github.com/MrPowers/chispa) are the PySpark equivalents.
Built bebe (https://github.com/MrPowers/bebe) to expose the Spark Catalyst expressions that aren't exposed to the Scala / Python APIs.
Also built spark-sbt.g8 to create a Spark project with a single command: https://github.com/MrPowers/spark-sbt.g8
-
Open source contributions for a Data Engineer?
I've built popular PySpark (quinn, chispa) and Scala Spark (spark-daria, spark-fast-tests) libraries.
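One of the mentions above describes a custom library that generates random test data fitting a schema, with optional hardcoded overrides for the fields that matter to the business logic under test. That library is not shown in the source, so the following is only a minimal, standard-library sketch of the idea. The schema format, the type names, and the `make_rows` helper are all hypothetical.

```python
import random
import string
from datetime import date, timedelta

# Hypothetical schema format: column name -> simple type name.
SCHEMA = {"id": "int", "name": "string", "amount": "double", "signup": "date"}


def _random_value(type_name, rng):
    """Return one random value for the given (hypothetical) type name."""
    if type_name == "int":
        return rng.randint(0, 1_000_000)
    if type_name == "double":
        return rng.uniform(0.0, 1000.0)
    if type_name == "string":
        return "".join(rng.choices(string.ascii_lowercase, k=8))
    if type_name == "date":
        return date(2020, 1, 1) + timedelta(days=rng.randint(0, 365))
    raise ValueError(f"unsupported type: {type_name}")


def make_rows(schema, n, overrides=None, seed=0):
    """Build n rows (dicts) that fit the schema.

    Columns listed in `overrides` get the hardcoded value in every row;
    all other columns are filled with seeded random values.
    """
    rng = random.Random(seed)
    overrides = overrides or {}
    unknown = set(overrides) - set(schema)
    if unknown:
        raise KeyError(f"overrides not in schema: {sorted(unknown)}")
    return [
        {
            col: overrides[col] if col in overrides else _random_value(t, rng)
            for col, t in schema.items()
        }
        for _ in range(n)
    ]


# Only the field relevant to the test is pinned; the rest is noise.
rows = make_rows(SCHEMA, 3, overrides={"amount": 0.0})
for row in rows:
    print(row)
```

In a PySpark test suite, rows like these could be turned into a DataFrame, run through the transformation under test, and compared against an expected DataFrame with chispa's `assert_df_equality`. Seeding the generator keeps failures reproducible.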
leapp
-
Ask HN: Who wants to be hired? (March 2024)
Summary:
Do you find yourself overwhelmed with work, requests, or complaints and in need of assistance to alleviate the pressure, enhance communication, facilitate organization, prioritize tasks, and foster greater trust and transparency?
Alternatively, I can work as a full stack developer.
AWS Community builder, AWS User group Leader, public speaker (https://www.youtube.com/watch?v=qdu58NAQfU0&t=271s)
Or perhaps you need both? =)
I have 4+ years of experience as a product manager and 8 in product development (before pm: agile coach, UX designer, and developer).
I've been the co-founder of the open-core company behind the OSS project Leapp (https://github.com/Noovolari/leapp)
Please feel free to reach out.
-
OKTA Identity Engine Upgrade
You can switch to saml2aws using the browser method instead of the Okta method and it will continue to work after the upgrade. There is also a really neat GUI tool to manage your session tokens that also works. https://www.leapp.cloud
-
When using AWS Organizations SSO for multiple accounts (dev, stage, prod) I have a hard time knowing which account I'm currently logged into.
Give Leapp a try: https://github.com/Noovolari/leapp
-
Ask HN: Should open source projects track you?
Hello everyone, I'm the maintainer of an open-source developer tool (https://github.com/Noovolari/leapp)
With an estimated 7,000 daily users, I started feeling the need for more information on how users are using the project in order to improve it.
Is it the right thing to do to create a better Developer Experience and gain feedback for the end users?
On a side:
-
Ask HN: Secure and simple way for secret/credential management in a startup?
- For all your employees, I can recommend Leapp, an open-source project (https://github.com/Noovolari/leapp). It solves most of the problems listed here:
-
Alternative Official SDK
I am looking to manage Leapp (https://www.leapp.cloud/) from the StreamDeck. Leapp allows you to manage and switch between different Cloud Accounts (AWS, Azure, etc). Leapp has a command line interface which I could automate with a StreamDeck plugin. Unfortunately it looks like the only official SDK is the sandboxed JavaScript one. This means I cannot automate command line tools with it.
-
What are AWS credentials?
If you’re wondering whether there is a tool that lets you stop thinking about AWS credentials and where to store them safely, take a look at Leapp! It takes responsibility for storing long-term credentials in the system vault, generating and refreshing short-term credentials, and placing them where clients expect to find them.
-
AWS multi-account strategy explained
Still, there is an elementary problem that we need to address, and it’s more on the operational side of things. Once we secured and implemented a tremendous multi-account strategy, how do people access AWS accounts? It turns out there is a fantastic open-source tool that lets you handle that with no effort, and its name is Leapp.
-
AWS Credentials: from Environment Variables to credentials_process
When you have to configure access to multiple AWS accounts using the Assume Role access pattern, it becomes difficult to get rid of all the Named Profiles configuration data and relationships. When you have to deal with a complex access scenario, tools like Leapp (https://www.leapp.cloud) come to the rescue! Leapp spares you from specifying relationships between Named Profiles in the config file, as the access methods are stored in the tool-specific configuration file.
-
Multiple active AWS consoles in the same browser with Leapp open-source browser extension (for Firefox and Chrome)
Leapp GitHub repository
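To make the Named Profiles relationships from the Assume Role mention above concrete, here is a small sketch that generates the kind of hand-maintained AWS config file that mention describes. It does not use Leapp or the AWS SDK; it only writes the standard `role_arn` and `source_profile` keys under `[profile ...]` sections. The `build_aws_config` helper, the profile names, the account IDs, the role names, and the default region are placeholders chosen for illustration.

```python
import configparser
import io


def build_aws_config(source_profile, roles, region="eu-west-1"):
    """Render AWS config text with one assume-role profile per entry.

    roles maps a profile name to the ARN of the role it assumes.
    Every generated profile points back at the same source profile,
    which is the relationship that has to be kept in sync by hand.
    """
    config = configparser.ConfigParser()
    for name, role_arn in roles.items():
        config[f"profile {name}"] = {
            "role_arn": role_arn,
            "source_profile": source_profile,
            "region": region,
        }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Placeholder account IDs and role names, for illustration only.
text = build_aws_config(
    "base",
    {
        "dev": "arn:aws:iam::111111111111:role/Developer",
        "prod": "arn:aws:iam::222222222222:role/ReadOnly",
    },
)
print(text)
```

Every additional account or role adds another section that refers to `base`, which is the bookkeeping the quoted post says becomes hard to manage in complex setups.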
What are some alternatives?
spark-fast-tests - Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)
aws-vault - A vault for securely storing and accessing AWS credentials in development environments
spark-daria - Essential Spark extensions and helper methods ✨😲
sshportal - 🎩 simple, fun and transparent SSH (and telnet) bastion server
quinn - pyspark methods to enhance developer productivity 📣 👯 🎉
saml2aws - CLI tool which enables you to login and retrieve AWS temporary credentials using a SAML IDP
lowdefy - The config web stack for business apps - build internal tools, client portals, web apps, admin panels, dashboards, web sites, and CRUD apps with YAML or JSON.
gatus - ⛑ Automated developer-oriented status page
null - Nullable Go types that can be marshalled/unmarshalled to/from JSON.
dagster - An orchestration platform for the development, production, and observation of data assets.
simplelocalize-cli - SimpleLocalize CLI is a developer-friendly command-line tool for uploading and downloading translation files