 | trafilatura | docker-minecraft-server
---|---|---
Mentions | 13 | 211
Stars | 2,853 | 8,384
Growth | - | -
Activity | 8.7 | 9.4
Last commit | 2 days ago | 6 days ago
Language | Python | Shell
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
trafilatura
-
Trafilatura: Python tool to gather text on the Web
The feature list answers that question pretty well: https://github.com/adbar/trafilatura#features
Basically: you could implement all of this on top of BeautifulSoup - polite crawling policies, sitemap and feed parsing, URL de-duplication, parallel processing, download queues, heuristics for extracting just the main article content, metadata extraction, language detection... but it would require writing an enormous amount of extra code.
-
Show HN: Build AI Dags with Memory; Run and Validate LLM Tools in Containers
The WebScraper tool uses Trafilatura [1] to scrape and parse HTML—nothing too fancy. "Scraping" a React site would require a totally different approach, probably something more akin to Adept's ACT-1 [2].
I run a local chat app built with Griptape and I use it to give me summaries of web pages or answer specific questions all the time :)
1. https://github.com/adbar/trafilatura/
-
Powerful and free scraper with a headless browser under the hood and Readability for parsing
I've been playing with Trafilatura lately, and it's very good. There are a few very thorough comparisons to other projects and it really shines. It doesn't do anything headless from what I can tell, but it doesn't have to do the scraping itself. Maybe an option could be to use Playwright to scrape, then Trafilatura to parse. Food for thought.
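The "Playwright to scrape, then Trafilatura to parse" idea could look roughly like the sketch below. It assumes both packages are installed (plus a Chromium build for Playwright); `scrape_then_parse` and the two helper names are hypothetical, chosen only for this illustration. The render and parse steps are passed in as functions so either one can be swapped out or stubbed.

```python
from typing import Callable, Optional


def render_with_playwright(url: str) -> str:
    """Load the page in headless Chromium and return the rendered HTML."""
    # Imported lazily so this module still loads without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            page.goto(url)
            return page.content()
        finally:
            browser.close()


def parse_with_trafilatura(html: str) -> Optional[str]:
    """Return the main text of the page, or None if nothing was extracted."""
    # Imported lazily for the same reason as above.
    import trafilatura

    return trafilatura.extract(html)


def scrape_then_parse(
    url: str,
    render: Callable[[str], str] = render_with_playwright,
    parse: Callable[[str], Optional[str]] = parse_with_trafilatura,
) -> Optional[str]:
    """Hypothetical helper: render the page first, then extract its text."""
    return parse(render(url))
```

Calling `scrape_then_parse("https://example.org/article")` would run the headless browser and hand the resulting HTML to Trafilatura; for pages that do not need JavaScript, a plain HTTP fetch could be passed as `render` instead.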
-
I made a Chrome Extension that lets you ask any question about the page you are on (bluf.ai)
Cool! If you care to explain further... :) ... I tried parsing a page using https://github.com/adbar/trafilatura, JSON-stringifying the result, and passing it to https://platform.openai.com/docs/api-reference/embeddings/create. How do I use the response as an input later? <3
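One common way to use an embeddings response later, sketched here under assumptions: the endpoint returns a vector of floats per input, so the page text is split into chunks, each chunk's vector is stored next to its text, and the question is embedded the same way and compared against the stored vectors. The function names and the toy two-dimensional vectors below are made up for illustration; real vectors come from the API and are much longer.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(
    question_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Return up to top_k (score, text) pairs, most similar chunk first."""
    scored = [(cosine_similarity(question_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data standing in for stored (chunk text, embedding) pairs.
stored = [
    ("chunk about pricing", [1.0, 0.0]),
    ("chunk about licensing", [0.0, 1.0]),
    ("chunk about both", [1.0, 1.0]),
]
best = rank_chunks([1.0, 0.0], stored, top_k=2)
```

The top-ranked chunk texts would then be pasted into the prompt of a chat or completion request together with the question, rather than feeding the vectors themselves back to the model.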
-
Testing fast installation in tear-down environment
I want to test how easy it is to install a package plus special extra dependencies to run a certain script in that package: https://github.com/adbar/trafilatura
- Advice on standard design pattern for comparison test script
- Automate dependency installation
- Issue with sklearn
- Questions about some code
- How does Firefox's Reader View work?
docker-minecraft-server
-
PaperMC/Paper: The most widely used, high performance Minecraft server
Looks nice! I have a comprehensive docker compose file with itzg’s image [0] repeated a dozen times.
[0] https://github.com/itzg/docker-minecraft-server
- Docker-Minecraft-Server
-
Minecraft server
This is just from a quick search: https://github.com/itzg/docker-minecraft-server
-
Game Server Manager For Windows
I'd like something like this that works with the itzg docker image. I usually have 2-3 servers up and running at any given time for me and my kids to play on. Different modpacks, haven't done anything fancy to manage them except portainer.
-
Need help with PS4 / Java co-play
I have a nephew who loves Minecraft (my fault, I introduced it to him). He plays it on PS4, which I don't have, and he would like to play it together with me. I live a few thousand kilometers away, so I thought I'd host a server he can join. I decided to host the server via Docker, using itzg's repo. The server is unauthenticated and exposed to the internet, and I can join it using my Java instance.
-
I made an install script which sets up a Minecraft server for Linux.
I think that a better option would be to use https://github.com/itzg/docker-minecraft-server and then use a networking solution like tailscale or equivalent.
-
homelab snowball still snowballing
Using itzg/docker-minecraft-server to host a CurseForge modded server running the All the Mods 8 modpack, though I might look into Pterodactyl for future use!
-
Easiest way to run a server with a given modpack?
I'm aware of itzg/docker-minecraft-server: Docker image that provides a Minecraft Server that will automatically download selected version at startup (github.com) which makes things pretty easy, but getting any modpack to work still requires quite a bit of fiddling with environment variables (mods to exclude on the server, java version, mc version, enabling command block, etc). Is there something easier, which doesn't require me to search through issues to find working configurations?
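To make the "fiddling with environment variables" concrete, here is a small sketch that turns a dictionary of settings into a `docker run` command for the image. The helper `build_docker_run` is hypothetical, and the variable names shown (`EULA`, `VERSION`, `ENABLE_COMMAND_BLOCK`) and their values are assumptions to check against the image's documentation; the image name and port 25565 are taken from the other posts quoted on this page.

```python
import shlex
from typing import Dict, List


def build_docker_run(
    name: str,
    env: Dict[str, str],
    host_port: int = 25565,
    image: str = "itzg/minecraft-server",
) -> List[str]:
    """Hypothetical helper: build the argv list for a detached `docker run`."""
    cmd = ["docker", "run", "-d", "--name", name, "-p", f"{host_port}:25565"]
    # Sort keys so the same settings always produce the same command line.
    for key in sorted(env):
        cmd += ["-e", f"{key}={env[key]}"]
    cmd.append(image)
    return cmd


# Assumed settings; verify each variable against the image documentation.
settings = {
    "EULA": "TRUE",
    "VERSION": "1.20.1",
    "ENABLE_COMMAND_BLOCK": "true",
}
print(shlex.join(build_docker_run("modded", settings)))
```

Keeping one such dictionary per modpack would make a known-good configuration easy to reuse, though it does not remove the need to find the right values in the first place.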
-
Installing Java 8 on EndeavourOS
Also, as somebody already mentioned, you could run the server in a Docker container. I quickly found this: https://hub.docker.com/r/itzg/minecraft-server. You could look into it. For this you will have to set up Docker, but it should be pretty easy (install it and add your user to the docker group; check the Arch wiki). Docker then creates a small "VM" (without its own kernel) in which Java is installed to run the server itself.
-
Help with setting up TLS for Dynmap with Caddy Reverse Proxy on Docker server
Server is running latest version of PaperMC using the itzg/docker-minecraft-server image on Ubuntu 22.04 LTS. Ports 8123 and 25565 of the container are exposed. Dynmap with LiveAtlas is installed in the server's /plugins/ directory and can be accessed through [server IP]:8123.
What are some alternatives?
newspaper - newspaper3k is a news, full-text, and article metadata extraction in Python 3. Advanced docs:
Geyser - A bridge/proxy allowing you to connect to Minecraft: Java Edition servers with Minecraft: Bedrock Edition.
python-goose - Html Content / Article Extractor, web scrapping lib in Python
Paper - The most widely used, high performance Minecraft server that aims to fix gameplay and mechanics inconsistencies
TWINT - An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
docker-rcon-web-admin - A Docker image that runs rcon-web-admin
html2text - Convert HTML to Markdown-formatted text.
panel - Pterodactyl® is a free, open-source game server management panel built with PHP, React, and Go. Designed with security in mind, Pterodactyl runs all game servers in isolated Docker containers while exposing a beautiful and intuitive UI to end users.
Goose3 - A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html
papermc-docker - Docker image for a PaperMC Minecraft server.
textract - extract text from any document. no muss. no fuss.
Purpur - Purpur is a drop-in replacement for Paper servers designed for configurability, and new fun and exciting gameplay features.