Unredactor
In this project we are trying to create an unredactor. The unredactor takes a redacted document and the redaction flag as input and returns the most likely candidates to fill in each redacted location. This project is only concerned with unredacting names. The data is the IMDB data set, which contains many review files. These files are used to build the corpus for computing TF-IDF scores. A few files are used for training; in these files the names are redacted and the results are written to the redacted folder. These redacted files are used for testing, and different classification models are built to predict the probability of each class. The top 5 classes, i.e. the names most similar to the test features, are written at the end of the text in the unredacted folder. (by gt0410)
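The last step described above, picking the top 5 names from per-class probabilities, could look roughly like the sketch below. This is not the project's actual code: the function name `top_candidates`, the placeholder names, and the probability values are all hypothetical, and the classifier that would produce the probabilities is left out.

```python
import heapq


def top_candidates(class_probabilities, k=5):
    """Return the k most probable (name, probability) pairs, best first.

    `class_probabilities` maps each candidate name (a class label) to the
    probability a classifier assigned to it for one redacted span.
    """
    return heapq.nlargest(k, class_probabilities.items(), key=lambda item: item[1])


# Hypothetical classifier output for a single redacted span (made-up numbers).
probabilities = {
    "Alice Smith": 0.31,
    "Bob Jones": 0.22,
    "Carol White": 0.18,
    "Dan Brown": 0.12,
    "Eve Black": 0.09,
    "Frank Green": 0.08,
}

for name, prob in top_candidates(probabilities):
    print(f"{name}: {prob:.2f}")
```

Ranking by probability keeps the output independent of the model used, so the same helper would work with any classifier that exposes per-class probabilities.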
mljar-supervised
Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation (by mljar)
Metric | Unredactor | mljar-supervised |
---|---|---|
Mentions | 1 | 51 |
Stars | 0 | 2,936 |
Growth | - | 0.8% |
Activity | 10.0 | 8.5 |
Last commit | over 2 years ago | 20 days ago |
Language | Python | Python |
License | GNU General Public License v3.0 only | MIT License |
The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Unredactor
Posts with mentions or reviews of Unredactor.
We have used some of these posts to build our list of alternatives
and similar projects.
- Redacted and Sanitized
Interestingly, some years back (perhaps 12-15 years?) someone developed a program that would examine the font a physically redacted document was written in, along with the spacing, to try to unredact it. It had relatively decent success, since only a limited set of word and letter combinations could fill a specific redacted portion. Of course, the larger the redacted block, the harder it becomes. It was interesting nonetheless, though I'm not sure what happened to it. This: https://github.com/gt0410/Unredactor is similar, but not what I was thinking of, and this: https://hackaday.com/2008/08/01/exposing-poorly-redacted-pdfs/ may also prove interesting for you.
mljar-supervised
Posts with mentions or reviews of mljar-supervised.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-08-24.
- Show HN: Web App with GUI for AutoML on Tabular Data
The Web App uses two open-source packages that I've created:
- MLJAR AutoML - Python package for AutoML on tabular data https://github.com/mljar/mljar-supervised
- Mercury - framework for converting Jupyter Notebooks into Web App https://github.com/mljar/mercury
You can run the Web App locally. What is more, you can adjust the notebooks' code to your needs. For example, you can set different validation strategies, evaluation metrics, or longer training times. The notebooks in the repo are a good starting point for developing more advanced apps.
- Fairness in machine learning
It's an Automated Machine Learning Python package. It's open-source; you can see how it works on GitHub: https://github.com/mljar/mljar-supervised
- [P] Build data web apps in Jupyter Notebook with Python only
Sure, at the bottom of our website you can subscribe to the newsletter.
- Show HN: AutoML Python Package for Tabular Data with Automatic Documentation
- library / framework to test multiple sklearn regression models at once
If you need a simple and fast solution, go with auto-sklearn. Maybe a bit more complex, but very powerful, was mljar-supervised.
- Python AutoML on Tabular Data with FeatureEng, HP Tuning, Explanations, AutoDoc
- Data Science and full-stack-web development
In my case, I had experience in DS and software engineering. It gave me the ability to start a company that works on Data Science tools.
- Learning Python tricks by reading other people's code. But who?
MLJAR AutoML is a Python package for Automated Machine Learning on tabular data with feature engineering, explanations, and automatic documentation.
- 'start with a simple model'
I recommend trying my AutoML package. You can easily check many different algorithms. What is more, baseline algorithms are checked as well (a majority-class predictor for classification and a mean predictor for regression). The advantage of AutoML is that it is really quick. You don't need to write preprocessing code; just call the fit method.
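To make the two baselines mentioned above concrete, here is a minimal standard-library sketch of a majority-class predictor and a mean predictor. It only illustrates the idea and is not how mljar-supervised implements its baselines; the helper names `majority_class_baseline`, `mean_baseline`, and `accuracy` are hypothetical.

```python
from collections import Counter
from statistics import mean


def majority_class_baseline(y_train):
    """Build a predictor that always returns the most frequent training label."""
    most_common_label, _ = Counter(y_train).most_common(1)[0]
    return lambda rows: [most_common_label for _ in rows]


def mean_baseline(y_train):
    """Build a predictor that always returns the mean of the training targets."""
    average = mean(y_train)
    return lambda rows: [average for _ in rows]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (made up for illustration); the features are ignored by both baselines.
predict_class = majority_class_baseline(["spam", "ham", "ham", "ham"])
print(predict_class([[0.1], [0.7]]))

predict_value = mean_baseline([10.0, 20.0, 30.0])
print(predict_value([[0.1]]))
```

Any real model should beat these scores; if it does not, the features or the preprocessing are the first thing to check.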