csv_log_cleaner
Clean CSV files to conform to a type schema by streaming them through small memory buffers using multiple threads and logging data loss. (by ambidextrous)
dtype_diet
Tries to shrink your Pandas column dtypes with no data loss so you have more spare RAM (by noklam)
Metric | csv_log_cleaner | dtype_diet
---|---|---
Mentions | 2 | 1
Stars | 2 | 34
Growth | - | -
Activity | 6.3 | 3.8
Last commit | about 2 months ago | 4 months ago
Language | Rust | Jupyter Notebook
License | MIT License | MIT License
The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
csv_log_cleaner
Posts with mentions or reviews of csv_log_cleaner.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-04-14.
-
How do you guys handle pandas and its sh*tty data type inference
Sounds like it could be more of a data cleansing problem you're facing than a data inference one. Even a single non-numerical value in a million rows of numbers will necessarily mess up type inference for the whole column. I work with a lot of CSVs and that's one of the issues we have to spend a huge amount of time dealing with. I even ended up writing this open source tool to handle the cleansing: https://github.com/ambidextrous/csv_log_cleaner
-
Hey Rustaceans! Got a question? Ask here! (39/2022)!
Hi. I'm new to Rust. I've written a little open source tool to clean CSV files as a practical learning exercise that will help me with my job: https://github.com/ambidextrous/csv_cleaner Where would be a good place to post it for code review?
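The first post above makes the point that a single non-numerical value can derail type inference for a whole column. A minimal sketch of that effect in pandas follows. The column name `reading` and the sample values are made up for illustration, and the coercion step uses plain `pd.to_numeric`, not csv_log_cleaner's own method.

```python
import io

import pandas as pd

# Hypothetical sample data: the same column, once clean and once with
# a single stray non-numeric value.
clean_csv = "reading\n1\n2\n3\n4\n"
dirty_csv = "reading\n1\n2\nerror\n4\n"

clean = pd.read_csv(io.StringIO(clean_csv))
dirty = pd.read_csv(io.StringIO(dirty_csv))

# The clean column is inferred as numeric; the dirty one is not,
# because one bad cell forces the whole column to a string type.
clean_is_numeric = pd.api.types.is_numeric_dtype(clean["reading"])
dirty_is_numeric = pd.api.types.is_numeric_dtype(dirty["reading"])
print(clean_is_numeric)  # True
print(dirty_is_numeric)  # False

# Coerce to numbers explicitly and count what was lost, so the data
# loss is logged rather than silent.
coerced = pd.to_numeric(dirty["reading"], errors="coerce")
lost_rows = int(coerced.isna().sum())
print(lost_rows)  # 1
```

Counting the values that become `NaN` during coercion is one simple way to make the data loss visible before the cleaned column is used downstream.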
dtype_diet
Posts with mentions or reviews of dtype_diet.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-04-14.
What are some alternatives?
When comparing csv_log_cleaner and dtype_diet you can also consider the following projects:
unescape-rs - "Unescapes" strings with escape sequences written with literal characters and converts them into properly escaped ones.
CSVLint - CSV Lint plug-in for Notepad++ for syntax highlighting, csv validation, automatic column and datatype detecting, fixed width datasets, change datetime format, decimal separator, sort data, count unique values, convert to xml, json, sql etc. A plugin for data cleaning and working with messy data files.
doku - fn(Code) -> Docs
mimalloc - mimalloc is a compact general purpose allocator with excellent performance.
Peroxide - Rust numeric library with R, MATLAB & Python syntax
esp8266-hal - An experimental hardware abstraction layer for the esp8266 written in Rust.