tika-docker
spacedrive
 | tika-docker | spacedrive
---|---|---
Mentions | 20 | 31
Stars | 100 | 28,841
Growth | - | 2.3%
Activity | 4.1 | 9.9
Last commit | 24 days ago | 1 day ago
Language | Shell | TypeScript
License | Apache License 2.0 | GNU Affero General Public License v3.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tika-docker
- Text Extraction from Documents
- Apache Tika – Extract text and metadata from doc types (the backbone of RAG)
- Demystifying Text Data with the Unstructured Python Library
  If you're willing to run Java, Apache Tika is extremely good at parsing content (https://tika.apache.org/)
- Help with a Search Engine (Ajuda com Buscador)
- How do you manage and find large amounts of files?
  Apache Tika can spit out text from lots of formats. I've used it with grep (or rg) for small-scale searching of local folders. Tika does a really good job at OCR for finding whether text is in a file.
- 40 Containers & Counting...
  https://tika.apache.org Metadata from things.
- Hosted app to manage server inventory
- Best FOSS (ideally Docker) that can split PDF files?
- OK, ElasticSearch works, text files are indexed. How about images? Can images be indexed in NextCloud and fulltextsearched?
- Document Parsing - an unsolved problem?
  At my previous job we had the same problem, which we solved by using Tika. We called it on the server along with other stuff, but there is also a Python binding.
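Several of the comments above describe running Tika on a server and searching the text it extracts. The following is a minimal sketch of that workflow, not a definitive implementation. It assumes a Tika server (for example, one started from the tika-docker image) is reachable at `localhost:9998` and uses the server's `PUT /tika` endpoint with an `Accept: text/plain` header. The helper names `extract_text`, `grep_text`, and `search_folder` are hypothetical and are not part of Tika.

```python
import re
import urllib.request
from pathlib import Path

# Assumption: a Tika server is listening here (9998 is the Tika server default port).
TIKA_URL = "http://localhost:9998/tika"


def extract_text(path, tika_url=TIKA_URL):
    """Send a file to the Tika server and return the extracted plain text."""
    data = Path(path).read_bytes()
    request = urllib.request.Request(
        tika_url,
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


def grep_text(text, pattern):
    """Return the lines of `text` that match the regular expression `pattern`."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [line for line in text.splitlines() if regex.search(line)]


def search_folder(folder, pattern, extractor=extract_text):
    """Yield (path, matching_line) pairs for every file under `folder`."""
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            for line in grep_text(extractor(path), pattern):
                yield path, line
```

With a server running, `list(search_folder("docs", "invoice"))` would list every matching line together with the file it came from. The `extractor` parameter exists so that the search logic can be exercised without a server, for instance by passing a function that simply reads text files.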
spacedrive
- Interview with Mo Rajabi, co-founder and CEO of Noor
  In the video, Mo talked about a few packages like Cidre and StrOm, and we referred to SpaceDrive.
- Spacedrive: Unify files from all your devices and clouds into one easy explorer
  AGPLv3 (switched in 2022 https://github.com/spacedriveapp/spacedrive/commit/8e5c71dea... ) and FWIW I don't see any mention of CLA or other license assignment, so I don't believe they can currently rug pull containing contributed changes since they don't own the license for them: https://github.com/spacedriveapp/spacedrive/blob/main/CONTRI...
- Spacedrive Alpha 0.1.0
- Spacedrive – an open source cross-platform file explorer
  Already opened a bug report for that: https://github.com/spacedriveapp/spacedrive/issues/1481
- Spacedrive is an open source cross-platform file explorer written in Rust
- Modern graphical file explorer
  While Electron wouldn't be on top of my wishlist, if it looked nice and was functional I wouldn't mind at all. I found this project, https://github.com/spacedriveapp/spacedrive, which uses Tauri and seems to be very interesting, but they haven't released yet.
- (Ab)using a server library as a GUI - bad idea or only sort of bad idea?
  In Tauri (or Axum) the app compiles to a single binary. rspc is the key to this because it allows for multiple transports with the frontend: it supports Tauri IPC, HTTP, and websockets. Our core crate (at ./core) exports an rspc router that is transport agnostic; then within the apps (at ./apps/desktop or ./apps/server) we expose it with a transport. We use Tauri IPC for desktop and websockets for Axum because we use subscriptions. Then in the wrapper React project (at ./apps/desktop/src/App.tsx) we create the rspc client with the Tauri link, mount its React context and then mount the app package (). You can give the codebase a look if you want because it's all open source: https://github.com/spacedriveapp/spacedrive
- Real World Rust Backend For Web APIs (GraphQL / REST)
  Taking a departure from REST and GraphQL, I'd suggest checking out rspc instead of GraphQL and Prisma Client Rust as your ORM. Both have been developed by a coworker and me for Spacedrive, the company we work for, and have provided what we believe is the best Rust + TypeScript stack that doesn't use GraphQL (new GraphQL server incoming one day tho).
- Sync Github, Local, and Google Drive together?
  This might help https://github.com/spacedriveapp/spacedrive
- Space drive - open source cross-platform file explorer, powered by a virtual distributed filesystem written in Rust
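One of the comments above describes a core that exports a transport-agnostic router, with each app exposing that router over its own transport (IPC for desktop, websockets for the server). The sketch below illustrates only that general pattern in Python. It is not rspc's API and not Spacedrive's code: `Router`, `build_core_router`, `ipc_transport`, `json_transport`, and the `files.count` procedure are all hypothetical names chosen for illustration.

```python
import json


class Router:
    """Maps procedure names to handlers; knows nothing about transports."""

    def __init__(self):
        self._procedures = {}

    def procedure(self, name):
        """Decorator that registers a handler under `name`."""
        def register(handler):
            self._procedures[name] = handler
            return handler
        return register

    def call(self, name, payload):
        if name not in self._procedures:
            raise KeyError(f"unknown procedure: {name}")
        return self._procedures[name](payload)


# "Core": builds the router once, independent of how it will be exposed.
def build_core_router():
    router = Router()

    @router.procedure("files.count")
    def count_files(payload):
        return len(payload["paths"])

    return router


# Transport 1: direct in-process calls, standing in for an IPC bridge.
def ipc_transport(router):
    def invoke(name, payload):
        return router.call(name, payload)
    return invoke


# Transport 2: JSON text messages, standing in for a websocket connection.
def json_transport(router):
    def handle_message(message):
        request = json.loads(message)
        result = router.call(request["name"], request["payload"])
        return json.dumps({"result": result})
    return handle_message
```

Both transports wrap the same router object, so procedures are defined once in the core and every app that exposes the router gets them without duplication.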
What are some alternatives?
Paperless-ng - A supercharged version of paperless: scan, index and archive all your physical documents
xlite - Query Excel spreadsheets (.xlsx, .xls, .ods) using SQLite
sist2 - Lightning-fast file system indexer and search tool
sigma-file-manager - "Sigma File Manager" is a free, open-source, quickly evolving, modern file manager (explorer / browser) app for Windows and Linux.
spyglass - A personal search engine: Create a searchable library from your personal documents, interests, and more!
QDirStat - QDirStat - Qt-based directory statistics (KDirStat without any KDE - from the original KDirStat author)
yew - Rust / Wasm framework for creating reliable and efficient web applications
Envy - Envy. Multi P2P Filesharing+Bittorrent, Shareaza Legacy.
self-hosted_docker_setups - A collection of my docker-compose files used to setup self-hosted services on Raspberry Pi 4 running 64-bit Raspberry Pi OS
asammdf - a rust crate to parse and write ASAM MDF file.
server - self-hosted tag-based time tracking
memfs - JavaScript file system utilities