humbug
DataProfiler
Metric | humbug | DataProfiler
---|---|---
Mentions | 9 | 61
Stars | 38 | 1,363
Growth | - | 1.0%
Activity | 6.6 | 6.3
Last commit | 10 months ago | 11 days ago
Language | Python | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
humbug
- Humbug: Understand what keeps users coming back to your developer tool
- See the errors your users are experiencing. From your IDE. Live.
Once you set up an integration and instrument your code, you can access your user reports at https://bugout.dev. This gives you a live view of what your users are experiencing.
- Crash reports and usage metrics for JavaScript libraries
If you would like support for another programming language, please create an issue.
- Show HN: Bugout.dev – Crash and usage reports for developer tools
Hello everyone, I'm Sophia, founder of Bugout.dev.
I started off as a professional ballerina, and entered technology later in my working life - through the OpenAI Scholars program. My co-founder, Neeraj (zomglings on HN), is a mathematician and now programmer.
When I was learning how to code I kept running into issues. I found Stack Overflow and GitHub issues hard to navigate, and they often led me to outdated solutions to the problems I was experiencing. That experience made me want a product that would collect crashes and immediately let the creators of the software I was using know about the issue. And when they or their community had fixed the issue, they could notify me about that and direct me to a public site detailing the solution.
Over time, this idea evolved and resulted in Bugout.dev. Bugout makes it easy for creators of developer tools to collect usage metrics and crash reports from their users. This applies equally well to libraries, command line utilities, and APIs.
We're advocates of ethical data collection, and all reports are collected with clear user consent. Maintainers can also comply with GDPR requests for access and deletion with a single API call each.
We are also building a public knowledge base of issues and solutions from open source projects. We were inspired by rustc error messages and how they point users to documentation that can help them resolve compiler errors. Projects integrating with Bugout can link users to the knowledge base using a search query, which allows them to direct users to solutions customized to operating system, library version, and even compiler/runtime version.
We support developer tools written in Python and in Go - we just launched the Go library this week!
Please check out our GitHub page: https://github.com/bugout-dev/humbug. We would greatly appreciate your feedback.
- Show HN: Usage and crash reports for Python libraries and command line tools
Understanding how your users experience your software is always difficult. It is especially difficult if we're talking about a developer tool like a library or command line utility.
Devtool maintainers have to rely on GitHub issues and IRC/Slack/Discord to talk with their users. They miss out on the experience of the majority of their users, who never build up the motivation to create an issue or post a message on Slack.
Humbug addresses this problem. It collects developer tool usage reports and crash reports in a principled manner, only with the end user's full consent. Individuals or teams that maintain developer tools can use these reports to identify issues in their software, prioritize features, and in general improve their users' experience.
You can find a lot more information on GitHub: https://github.com/bugout-dev/humbug
Here is a short YouTube video showing how Humbug works: https://www.youtube.com/watch?v=-k2c8o_sXC4
Humbug is free to use for small projects. I hope you find it useful.
If you would like to discuss your use case in greater detail, I would love to speak with you in the comments. You can also reach me by email (check my profile).
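As a rough illustration of the consent-first reporting described in the post above, here is a minimal Python sketch. It does not use Humbug's actual API: `ConsentGatedReporter`, `has_consent`, and the `EXAMPLE_TOOL_REPORTING` environment variable are hypothetical names, and an in-memory list stands in for the network call a real reporter would make.

```python
import os
import traceback
from typing import Callable, Dict, List, Mapping

# Hypothetical opt-in variable, chosen for this sketch only.
CONSENT_ENV_VAR = "EXAMPLE_TOOL_REPORTING"


def has_consent(environ: Mapping[str, str]) -> bool:
    """Treat reporting as opt-in: consent exists only if the variable is '1'."""
    return environ.get(CONSENT_ENV_VAR) == "1"


class ConsentGatedReporter:
    """Records crash reports, but only when the consent check passes."""

    def __init__(self, consent_check: Callable[[], bool]) -> None:
        self._consent_check = consent_check
        # Stand-in for sending reports to a server.
        self.sent: List[Dict[str, str]] = []

    def report_crash(self, exc: BaseException) -> bool:
        """Return True if a report was recorded, False if consent was missing."""
        if not self._consent_check():
            return False
        self.sent.append(
            {
                "type": type(exc).__name__,
                "message": str(exc),
                "traceback": "".join(
                    traceback.format_exception(type(exc), exc, exc.__traceback__)
                ),
            }
        )
        return True


# Example wiring: consent is read from the process environment at report time.
reporter = ConsentGatedReporter(lambda: has_consent(os.environ))
```

Checking consent at report time, rather than once at import, means a user who withdraws consent mid-session stops generating reports immediately.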
- Humbug: Usage and crash reports for Python libraries and command line tools
Thank you! Created an issue: https://github.com/bugout-dev/humbug/issues/31
- Humbug: Usage and crash reports for Python developer tools
We have taken a big step forward this week with the release of Humbug, which helps developer tool maintainers collect usage and crash reports from their users only with their full consent.
DataProfiler
- LongRoPE: Extending LLM Context Window Beyond 2M Tokens
It's been possible to skip tokenization for a long time; my team and I did it here: https://github.com/capitalone/DataProfiler
For what it's worth, we were actually working with LSTMs with nearly a billion parameters back around 2016-2017. Transformers made it far more effective to train and execute, but ultimately LSTMs are able to achieve similar results, though they are slower and require more training data.
- Data Profiler – What's in your data?
- Data Profiler 0.9.0 -- offering a massive improvement to memory usage during profiling of large datasets
Great call out -- would you be willing to write up an issue for that on the repo? Thank you! https://github.com/capitalone/DataProfiler/issues/new/choose
- FLiPN-FLaNK Stack Weekly for 20 March 2023
- Release 0.8.3 · capitalone/DataProfiler
What are some alternatives?
pulumi-aws - An Amazon Web Services (AWS) Pulumi resource package, providing multi-language access to AWS
ydata-profiling - 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
privado - Open Source Static Scanning tool to detect data flows in your code, find data security vulnerabilities & generate accurate Play Store Data Safety Report.
pyWhat - 🐸 Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! 🧙‍♀️
gdpr-tools - Sanitize any PHP application HTML response to be GDPR-compliant, including integration with any CMP on the frontend to reload the resources upon consent.
usaddress - :us: a python library for parsing unstructured United States address strings into address components
XlsxWriter - A Python module for creating Excel XLSX files.
superset - Apache Superset is a Data Visualization and Data Exploration Platform
vtuber-livechat-dataset - VTuber 1B: Billion-scale Live Chat and Moderation Event Dataset
elementary - The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
visions - Type System for Data Analysis in Python
lightly - A python library for self-supervised learning on images.