 | zdbc | DataProfiler
---|---|---
Mentions | 5 | 61
Stars | 2 | 1,363
Growth | - | 1.0%
Activity | 0.7 | 6.3
Last commit | about 1 year ago | 3 days ago
Language | Java | Python
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
zdbc
- Best way to access SQL Server over the Internet.
OpenZiti could be the ticket: it's an open source solution that provides an overlay network built on zero trust principles (more secure than a VPN). As a bonus, we wrapped one of our SDKs around JDBC, so users do not need a VPN client to access the DB; they just use 'ZDBC' instead of JDBC - https://github.com/openziti-test-kitchen/zdbc
- How to connect to mysql server pod from external standalone application?
You can use agents or their ZDBC, which is one of the most amazing things (since they work with APIs, you can embed their SDK into your code, so no agents are required).
- How to secure your Kubernetes cluster against attacks
- Example zitifications: https://github.com/openziti-test-kitchen/zdbc & https://openziti.github.io/articles/zitification/prometheus/part1.html
- Clientless Private Access for Autonomous Database with no complexity of wallet or VPNs
Would be great to have thoughts, comments, etc - https://github.com/openziti-test-kitchen/zdbc or https://www.youtube.com/watch?v=H6o5z6iXYPQ&ab_channel=Percona
- Launch HN: Metaplane (YC W20) – Datadog for Data
Congrats on the launch. Well done.
Another security approach is to enable your customers to close their inbound firewall ports and link listeners. This helps cloud and on-prem models have far stronger security.
Example here (disclosure: I am a founder of the company behind this solution) with both open source and OEM/SaaS models:
https://github.com/openziti-incubator/zdbc (code for one implementation - a wrapper around the JDBC drivers)
https://netfoundry.io/zero-trust-database-security/ (blog post with links to developer example video, whitepaper, etc)
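Several of the posts above describe zdbc as a wrapper around the JDBC drivers. The following is only a minimal sketch of that general pattern, not zdbc's actual code: the class name `WrappingDriver`, the `jdbc:wrapped:` URL prefix, and the helper `toDelegateUrl` are all hypothetical names chosen for illustration, and the step where an overlay network would be set up is left as a comment.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

// Hypothetical wrapper driver: accepts URLs with its own prefix, then
// hands the connection off to whichever real JDBC driver matches.
public class WrappingDriver implements Driver {
    // Assumed prefix for illustration; not the prefix zdbc uses.
    static final String PREFIX = "jdbc:wrapped:";

    // "jdbc:wrapped:postgresql://host/db" becomes "jdbc:postgresql://host/db".
    static String toDelegateUrl(String url) {
        return "jdbc:" + url.substring(PREFIX.length());
    }

    @Override
    public boolean acceptsURL(String url) {
        return url != null && url.startsWith(PREFIX);
    }

    @Override
    public Connection connect(String url, Properties info) throws SQLException {
        if (!acceptsURL(url)) {
            return null; // JDBC contract: return null for URLs this driver does not handle
        }
        // A real wrapper would prepare its network transport here
        // before delegating to the underlying driver.
        return DriverManager.getConnection(toDelegateUrl(url), info);
    }

    @Override
    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
        return new DriverPropertyInfo[0];
    }

    @Override
    public int getMajorVersion() { return 0; }

    @Override
    public int getMinorVersion() { return 1; }

    @Override
    public boolean jdbcCompliant() { return false; }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        throw new SQLFeatureNotSupportedException("java.util.logging is not used");
    }
}
```

With a wrapper like this registered, application code would keep using the standard `DriverManager.getConnection` call and change only the URL, which is the "use ZDBC instead of JDBC" idea the posts describe.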
DataProfiler
- LongRoPE: Extending LLM Context Window Beyond 2M Tokens
It's been possible to skip tokenization for a long time; my team and I did it here - https://github.com/capitalone/DataProfiler
For what it's worth, we were working with LSTMs with nearly a billion params back around 2016-2017. Transformers made models far more efficient to train and run, but LSTMs can ultimately achieve similar results, though they are slower and require more training data.
- Data Profiler – What's in your data?
- Data Profiler 0.9.0 -- offering a massive improvement to memory usage during profiling of large datasets
Great call out -- would you be willing to write up an issue for that on the repo? Thank you! https://github.com/capitalone/DataProfiler/issues/new/choose
- FLiPN-FLaNK Stack Weekly for 20 March 2023
- Release 0.8.3 · capitalone/DataProfiler
What are some alternatives?
elementary - The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
ydata-profiling - 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
lightdash - Self-serve BI to 10x your data team ⚡️
pyWhat - 🐸 Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! 🧙‍♀️
Tailwind CSS - A utility-first CSS framework for rapid UI development.
usaddress - :us: a python library for parsing unstructured United States address strings into address components
superset - Apache Superset is a Data Visualization and Data Exploration Platform
XlsxWriter - A Python module for creating Excel XLSX files.
z_archived_kubectl - A zitified kubectl client. Archived in favor of the kubeztl repo jan-3-2023
vtuber-livechat-dataset - 📊 VTuber 1B: Billion-scale Live Chat and Moderation Event Dataset