tika-docker
filemanager
Metric | tika-docker | filemanager
---|---|---
Mentions | 20 | 305
Stars | 103 | 23,791
Growth | 4.9% | 2.2%
Activity | 4.1 | 8.8
Last commit | about 1 month ago | 1 day ago
Language | Shell | Go
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tika-docker
- Text Extraction from Documents
- Apache Tika – Extract text and metadata from doc types (the backbone of RAG)
- Demystifying Text Data with the Unstructured Python Library
  If you accept running Java, Apache Tika is extremely good at parsing content (https://tika.apache.org/).
- Help with a Search Engine
- How do you manage and find large amount of files?
  Apache Tika can spit out text from lots of formats. I've used it with grep (or rg) for small-scale searching of local folders. Tika does a really good job at OCR for finding if text is in a file.
- 40 Containers & Counting...
  https://tika.apache.org Metadata from things.
- Hosted app to manage server inventory
- Best FOSS (ideally Docker) that can split PDF files ?
- OK, ElasticSearch works, text files are indexed. How about images? Can images be indexed in NextCloud and fulltextsearched?
- Document Parsing - an unsolved problem?
  At my previous job we had the same problem, which we solved by using Tika. We called it on the server along with other stuff, but there is also a Python binding.
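Several of the comments above describe sending files to Tika and searching the extracted text. A minimal sketch of that workflow follows, assuming a Tika server (for example one started from the tika-docker image) is listening on `localhost:9998` and accepts a `PUT` to `/tika` with `Accept: text/plain`. The function names (`build_tika_request`, `extract_text`, `matching_lines`, `search_folder`) are hypothetical helpers for illustration, not part of Tika or any binding.

```python
import re
import urllib.request
from pathlib import Path

# Assumption: a Tika server (e.g. from the tika-docker image) listens here.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(data: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request asking the Tika server for plain-text output."""
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(path: Path, url: str = TIKA_URL) -> str:
    """Send one file to the Tika server and return the extracted text."""
    request = build_tika_request(path.read_bytes(), url)
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read().decode("utf-8", errors="replace")


def matching_lines(text: str, pattern: str) -> list[str]:
    """Return the lines of `text` matching `pattern`, case-insensitively."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [line for line in text.splitlines() if regex.search(line)]


def search_folder(folder: Path, pattern: str, url: str = TIKA_URL) -> dict[str, list[str]]:
    """Extract text from every file under `folder` and collect matching lines."""
    hits: dict[str, list[str]] = {}
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            lines = matching_lines(extract_text(path, url), pattern)
            if lines:
                hits[str(path)] = lines
    return hits
```

With a server running, `search_folder(Path("docs"), "invoice")` would map each matching file path to its matching lines, roughly what piping Tika's output into grep does. The folder name and pattern here are placeholders.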
filemanager
- Ask HN: Online File Repository System?
  Check out https://awesome-selfhosted.net/tags/file-transfer---web-base...
  I've used https://filebrowser.org/ and it's okay. I've also used Seafile, but my current setup is sftp clients (Transmit nowadays) and Syncthing if I need the files on multiple computers.
- Homelab Adventures: Crafting a Personal Tech Playground
  File Browser
- h5ai – modern HTTP web server index
  Thanks for sharing. I wasn't aware of dufs and it looks very solid. File Browser[0] is another popular choice, though it's more GUI-oriented for file operations.
  [0]: https://filebrowser.org/
- Ask HN: Spreadsheets like Google Sheets but not from Google?
  The OnlyOffice desktop app is a pretty good and free alternative to Microsoft Office Suite. You can simply install it on your local machine for offline access.
  OnlyOffice is also self-hostable as a web app for a cloud alternative to Google Sheets.
  Filebrowser is a self-hostable alternative to Google Drive.
  There's a pull request open to integrate OnlyOffice with Filebrowser for a self-hosted Google Drive + Google Docs.
  https://github.com/filebrowser/filebrowser/pull/1420
- Ask HN: What is the best FOSS file sharing protocol/app?
  For strictly local use, Google's Nearby Share is technically FOSS, but the documentation is basically non-existent and a proper Linux implementation is not here yet. Alternatives aren't hard to find though, with Mint's Warpinator or KDE Connect having worked well for me.
  For non-local use (everything out of Bluetooth range), you almost have to trust a third party and it really depends on your use case. Want to send your friend a file or host pictures of your birthday for multiple people to download? For the former, Magic Wormhole works great; for the latter, you could almost spin up a Nextcloud or similar (personally I like https://github.com/filebrowser/filebrowser ). Want to regularly send files from device 1 to device 2? Now classic sync solutions like Syncthing become really viable.
  If everything else fails, FTP always has your back.
- Finally a decent file browser in Game mode
  I have been looking for a file browser which can run in game mode and is reasonably user friendly for simple file operations (copy/delete/rename, etc). Most people recommend Dolphin. It does work, but there are issues: the color scheme looks really weird in game mode, and the context menu does not like game mode either. I got File Browser (https://github.com/filebrowser/filebrowser) working in game mode, which is essentially an Edge app accessing a web server on localhost (running as a user service). It took some time to set up, but the end result is exactly what I would like to have.
- List of your reverse proxied services
  File Browser - For access to the files on my NAS
- Self Hosted File upload service
  filebrowser has user management plus sharing capabilities
- Folder/File sharing with multiple links
  Filebrowser supports multiple shares with different expiration dates. It also offers file previews and generates QR codes for the shares.
- I need help creating a diy nas for under $1000
  Nextcloud is great for this, but if we're talking sharing files from your sync'd project collection, I'd probably instead recommend Filebrowser. You can point it to the same data store that Syncthing is using and it'll make it easy to share the projects. Note that in order to do this you'll need to open up and expose Filebrowser publicly. The simplest way to do this would probably be a Cloudflare Tunnel, and for sharing files like this ad hoc I don't see any issues with their TOS. For things like Syncthing, though, you'll still wanna do conventional port forwarding. The DIY approach instead of a Cloudflare Tunnel would be to port forward, set up a dynamic DNS record, and set up Let's Encrypt certs.
What are some alternatives?
Paperless-ng - A supercharged version of paperless: scan, index and archive all your physical documents
Nextcloud - ☁️ Nextcloud server, a safe home for all your data
sist2 - Lightning-fast file system indexer and search tool
Filestash - 🦄 A modern web client for SFTP, S3, FTP, WebDAV, Git, Minio, LDAP, CalDAV, CardDAV, Mysql, Backblaze, ...
spyglass - A personal search engine: Create a searchable library from your personal documents, interests, and more!
filegator - Powerful Multi-User File Manager
yew - Rust / Wasm framework for creating reliable and efficient web applications
OpenMediaVault - openmediavault is the next generation network attached storage (NAS) solution based on Debian Linux. Thanks to the modular design of the framework it can be enhanced via plugins. openmediavault is primarily designed to be used in home environments or small home offices.
spacedrive - Spacedrive is an open source cross-platform file explorer, powered by a virtual distributed filesystem written in Rust.
h5ai - HTTP web server index for Apache httpd, lighttpd and nginx.
self-hosted_docker_setups - A collection of my docker-compose files used to setup self-hosted services on Raspberry Pi 4 running 64-bit Raspberry Pi OS
tinyfilemanager - Single-file PHP file manager, browse and manage your files efficiently and easily with tinyfilemanager