file
tablib
Metric | file | tablib
---|---|---
Mentions | 7 | 2
Stars | 979 | 4,220
Growth | 2.3% | 0.7%
Activity | 8.6 | 4.2
Last commit | 5 days ago | 9 days ago
Language | C | Python
License | GNU General Public License v3.0 or later | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
file
- Linux `file` Equivalent
- Fun with File Formats
Also the magic number database for guessing the format of a file:
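For readers unfamiliar with the idea, magic-number detection can be sketched roughly as follows. This is not the `file` database or its matching logic; the `SIGNATURES` table and the `guess_format` helper are hypothetical names for illustration, and the table holds only a handful of well-known leading byte signatures.

```python
# Illustrative signatures only; the real `file` magic database is far larger
# and supports offsets, masks, and nested tests that this sketch ignores.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x1f\x8b", "gzip compressed data"),
]


def guess_format(data: bytes) -> str:
    """Return a format name if the data starts with a known signature."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"
```

In practice one would pass in only the first few bytes of a file, for example `guess_format(open(path, "rb").read(16))`, since every signature in this sketch sits at offset zero.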
tablib
- Fun with File Formats
There are two problems leading to the decision of only accepting public domain info: licensing and provenance.
"Licensing" is hard. The "Open Specifications Promise" [1], which covers a bunch of Microsoft-designed file formats, is merely a covenant not to sue.
"Provenance" is tricky. For example, much of the knowledge of the Apple iWork formats was derived by reverse-engineering the source programs and extracting protobuf definitions. Many open source projects have freely copied from each other, making detailed analysis tricky [2].
[1] https://en.wikipedia.org/wiki/Microsoft_Open_Specification_P...
What are some alternatives?
pymorphy2 - Morphological analyzer / inflection engine for Russian and Ukrainian languages.
Kaitai Struct - Kaitai Struct: declarative language to generate binary data parsers in C++ / C# / Go / Java / JavaScript / Lua / Nim / Perl / PHP / Python / Ruby
tika-docker - Convenience Docker images for Apache Tika Server
feather - Feather: fast, interoperable binary data frame storage for Python, R, and more powered by Apache Arrow
fuzzywuzzy - Fuzzy String Matching in Python