proton
ydata-profiling
Metric | proton | ydata-profiling
---|---|---
Mentions | 10 | 43
Stars | 1,293 | 12,070
Growth | 3.8% | 1.1%
Activity | 9.7 | 8.5
Latest commit | 6 days ago | 4 days ago
Language | C++ | Python
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
proton
- FLaNK-AIM Weekly 06 May 2024
- Loading a trillion rows of weather data into TimescaleDB
What's the process for adding support for other databases to your tool qStudio?
I'm thinking perhaps you could add support for Timeplus [1]? Timeplus is a streaming-first database built on ClickHouse. The core DB engine Timeplus Proton is open source [2].
It seems that qStudio is open source [3] and written in Java, so it would need a JDBC driver to support a new RDBMS? If so, Timeplus Proton has an open source JDBC driver [4], based on ClickHouse's driver with modifications for streaming use cases.
1: https://www.timeplus.com/
2: https://github.com/timeplus-io/proton
3: https://github.com/timeseries/qstudio
4: https://github.com/timeplus-io/proton-java-driver
- Comparing Timeplus Proton and ksqlDB for stream processing
* Proton is more developer friendly
To explore Proton yourself, visit the [Proton GitHub repo](https://github.com/timeplus-io/proton) or create your own workspace on [Timeplus Cloud](https://timeplus.com).
- FLaNK Stack Weekly 19 Feb 2024
- Proton, a fast and lightweight alternative to Apache Flink
- Proton, extending the historical data, storage, and computing of ClickHouse
- Proton, a unified database for streaming and historical data in a single binary
- First 15 Open Source Advent projects
5. Proton by Timeplus | Github | tutorial
- Timeplus has open-sourced its core streaming processing engine Proton
ydata-profiling
- FLaNK 25 December 2023
- First 15 Open Source Advent projects
6. Ydata-synthetic and Ydata-profiling by YData | Github | tutorial
- Coding Wonderland: Contribute to YData Profiling and YData Synthetic in this Advent of Code
Send us your North ⭐️: "On the first day of Christmas, my true contributor gave to me..." a star in my GitHub tree! 🎵 If you love these projects too, star ydata-profiling or ydata-synthetic and let your friends know why you love it so much!
- Data exploration is not dead
- Explore your data in a single line of code
- Which preprocessing steps to improve the performance of a naive bayes classifier
My suggestion: start with the EDA. There are a lot of packages that already automate that for you. My usual go-to: https://github.com/ydataai/ydata-profiling.
- Simulating sales data
If you're not sure about the behaviour of your data (i.e., if the original data has properties like seasonality), you can use ydata-profiling to profile your data first.
- I recorded a Data Science Project using Python and uploaded it on Youtube
Super cool! For EDA, you could give ydata-profiling a spin sometime and speed up the process!
- Ydata-Profiling and Dask
Hey guys,
We were recently at the Dask Demo Day, and we're hoping to launch a new feature on ydata-profiling: support for Dask dataframes!
We're looking for Dask Wizards to start collaborating on this feature, so if you're interested, please join us to define the roadmap of the project and start making it real.
Current GitHub branch is here: https://github.com/ydataai/ydata-profiling/tree/feat/dask
Dedicated dask channel here: https://discord.gg/EHDBuSSDuy
- 🧠 ydata-profiling + Dask!
We're looking for Dask Wizards 🧙🏻‍♂️ to start collaborating on this branch, so if you're interested, please join us to define the roadmap of the project and start making it real 🚀
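Several of the mentions above recommend ydata-profiling as a quick first step for EDA ("Explore your data in a single line of code"). A minimal sketch of that workflow follows, assuming pandas and ydata-profiling are installed. The helper name `profile_csv`, the report title, and the file names are hypothetical and chosen only for illustration; `ProfileReport` and its `to_file` method are the library's documented entry points.

```python
from pathlib import Path


def profile_csv(csv_path: str, out_html: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module still loads where the optional
    # dependencies (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The "single line": build the profiling report from the dataframe.
    report = ProfileReport(df, title="Exploratory data analysis")
    out = Path(out_html)
    # Write the report as a standalone HTML file.
    report.to_file(out)
    return out
```

Calling `profile_csv("sales.csv")` on a file of your own would write `report.html` next to your script. The report summarizes each column (types, missing values, distributions, correlations), which is the kind of overview the comments above suggest looking at before preprocessing or simulating data.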
What are some alternatives?
ytsaurus - YTsaurus is a scalable and fault-tolerant open-source big data platform.
dtale - Visualizer for pandas data structures
proton-python-driver - Python driver for Proton which supports the Proton native wire protocol
DataProfiler - What's in your data? Extract schema, statistics and entities from datasets
OSQuery - SQL powered operating system instrumentation, monitoring, and analytics.
dataframe-go - DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
duckdb - DuckDB is an in-process SQL OLAP Database Management System
lux - Automatically visualize your pandas dataframe via a single print! 📊 💡
POCO - The POCO C++ Libraries are powerful cross-platform C++ libraries for building network- and internet-based applications that run on desktop, server, mobile, IoT, and embedded systems.
get-started-with-JAX - The purpose of this repo is to make it easy to get started with JAX, Flax, and Haiku. It contains my "Machine Learning with JAX" series of tutorials (YouTube videos and Jupyter Notebooks) as well as the content I found useful while learning about the JAX ecosystem.
ClickHouse - ClickHouse® is a free analytics DBMS for big data
evidently - Evaluate and monitor ML models from validation to production. Join our Discord: https://discord.com/invite/xZjKRaNp8b