| | Trino | fd |
|---|---|---|
| Mentions | 44 | 172 |
| Stars | 9,576 | 31,668 |
| Growth | 1.8% | - |
| Activity | 10.0 | 8.8 |
| Latest commit | 3 days ago | about 11 hours ago |
| Language | Java | Rust |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Trino
- Trino: Fast distributed SQL query engine for big data analytics
-
Game analytic power: how we process more than 1 billion events per day
We decided not to waste time reinventing the wheel and simply installed Trino on our servers. It’s a full-featured SQL query engine that works on your data. Now our analysts can use it to work with data from AppMetr and execute queries at different levels of complexity.
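The workflow described here — analysts pointing plain SQL at a Trino cluster — can be sketched with the official Trino CLI; the server address, catalog, schema, and table below are placeholders, not values from the article:

```shell
# Ad-hoc analytics query via the Trino CLI.
QUERY='SELECT event_name, count(*) AS events
FROM app_events
GROUP BY event_name
ORDER BY events DESC
LIMIT 10'

# Guarded so the snippet is a no-op on machines without the CLI installed.
if command -v trino >/dev/null 2>&1; then
  trino --server http://trino.example.com:8080 \
        --catalog hive --schema analytics \
        --execute "$QUERY"
fi
```

The `--execute` flag runs a single statement and exits, which makes the CLI easy to drive from scripts as well as interactively.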
-
Your Thoughts on OLAPs Clickhouse vs Apache Druid vs Starrocks in 2023/2024
DevRel for StarRocks. Trino doesn't have a great caching layer (https://github.com/trinodb/trino/pull/16375) or great performance (https://github.com/trinodb/trino/issues/14237; see also https://github.com/oap-project/Gluten-Trino). In benchmarks and community user testing, StarRocks has outperformed it.
-
Making Hard Things Easy
What if my SQL engine is Presto, Trino [1], or a similar query engine? If it's federating multiple source databases we peel the SQL back and get... SQL? Or you peel the SQL back and get... S3 + Mongo + Hadoop? Junior analysts would work at 1/10th the speed if they had to use those raw.
[1] https://trino.io/
- Trino, an open query engine that runs at ludicrous speed
-
Questions about Athena, Trino and Iceberg
The good thing is that the concepts, in terms of the SQL supported by Trino, transfer between them all. So it's completely reasonable to start with one and move to another; in fact, that is something that happens regularly. I invite you to check out the talks from the Trino Fest event that is just wrapping up today. There are presentations about all these aspects and the different scenarios users encounter. All videos and slides will go live on the Trino website soon. Also feel free to join the Trino Slack to chat about all this with other users.
-
Multi-Databases across Multiple Servers - MySQL
There are distributed query engines like Trino that help with this sort of problem https://trino.io/
-
Iceberg on Cloudtrail Logs with Athena
This issue in particular is a killer for me: https://github.com/trinodb/trino/issues/10974
-
Data Lake, Real-time Analytics, or Both? Exploring Presto and ClickHouse
AFAIK Presto was forked, and Trino (https://trino.io/) is now the leading SQL query engine.
-
Apache Iceberg as storage for on-premise data store (cluster)
Trino or Hive for SQL querying. Get Trino/Hive to talk to Nessie.
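Getting Trino to "talk to Nessie" comes down to a catalog properties file for the Iceberg connector. A sketch, e.g. `etc/catalog/iceberg.properties` — the property names follow recent Trino releases, and the URI and warehouse path are placeholders:

```properties
connector.name=iceberg
iceberg.catalog.type=nessie
iceberg.nessie-catalog.uri=http://nessie.example.com:19120/api/v1
iceberg.nessie-catalog.ref=main
iceberg.nessie-catalog.default-warehouse-dir=s3://warehouse/
```

Check the Trino Iceberg connector documentation for your version; Nessie catalog support and its property names have evolved across releases.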
fd
-
Level Up Your Dev Workflow: Conquer Web Development with a Blazing Fast Neovim Setup (Part 1)
ripgrep: A super-fast file searcher. You can install it using your system's package manager (e.g., brew install ripgrep on macOS). fd: Another blazing-fast file finder. Installation instructions can be found here: https://github.com/sharkdp/fd
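For reference, a few basic fd invocations (guarded so the snippet is a no-op where fd isn't installed; the patterns are placeholders):

```shell
# Check for fd first so the examples do nothing where it isn't available.
FD_BIN=$(command -v fd || echo "")
if [ -n "$FD_BIN" ]; then
  fd 'readme'     # match file names, respecting .gitignore
  fd -e md        # every Markdown file under the current directory
  fd -H 'config'  # -H also searches hidden files
fi
```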
-
Hyperfine: A command-line benchmarking tool
hyperfine is such a great tool that it's one of the first I reach for when doing any sort of benchmarking.
I encourage anyone who's tried hyperfine and enjoyed it to also look at sharkdp's other utilities, they're all amazing in their own right with fd[1] being the one that perhaps get the most daily use for me and has totally replaced my use of find(1).
[1]: https://github.com/sharkdp/fd
-
Z – Jump Around
You call it with `n` and get an interactive fuzzy search for your directories. If you call it with an argument, as in `n query`, it’ll start the find with that query already filled in (and if there’s only one match, jump to it directly). The `ls` is optional, but I find that I like having the contents visible as soon as I change directory.
I’m also including iCloud Drive but excluding the Library directory as that is too noisy. I have a separate `nl` function which searches just inside `~/Library` for when I need it, as well as other specialised `n` functions that search inside specific places that I need a lot.
¹ https://github.com/sharkdp/fd
² https://github.com/junegunn/fzf
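A minimal sketch of an `n` function like the one described above, assuming fd¹ and fzf² are installed; the exact flags and search roots are guesses, not the commenter's implementation:

```shell
# Fuzzy directory jumper: `n` opens an interactive picker over directories
# under $HOME; `n foo` pre-fills the query and jumps straight to a sole match.
n() {
  local dir
  dir=$(fd --type d --hidden --exclude Library . "$HOME" |
        fzf --select-1 --query="$*") || return
  cd "$dir" && ls
}
```

`fzf --select-1` provides the jump-directly-on-a-single-match behaviour, and `--query="$*"` pre-fills the search with the function's arguments.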
-
Unix as IDE: Introduction (2012)
Many (most?) of them have been overhauled with success. For find there is fd[1]. There's bat (cat), exa (ls), ripgrep, fzf, atuin (history), delta (diff), and many more.
Most are both backwards compatible and fresh and friendly. Your hard-won muscle memory is still of good use, but there are sane flags and defaults too. They're faster, more colorful (if you wish), and better integrated with one another (e.g. exa/eza being aware of git modifications). And, in my case, they often have features I never knew I needed (atuin sync!, ripgrep using gitignore).
1 https://github.com/sharkdp/fd
- Tell HN: My Favorite Tools
-
Powering Up Your Linux Experience: Meet the Rust Tools for Efficient Development
Learn more about fd at: https://github.com/sharkdp/fd
-
Making Hard Things Easy
AFAIK there is a find replacement with sane defaults: https://github.com/sharkdp/fd , a lot of people I know love it.
However, I already have this in my muscle memory:
-
🐚🦀 Shell commands rewritten in Rust
fd
-
Oils 0.17.0 – YSH Is Becoming Real
> without zsh globs I have to remember find syntax
My "solution" to this is using https://github.com/sharkdp/fd (even when in zsh and having glob support). I'm not sure if using a tool that's not present by default would be suitable for your use cases, but if you're considering alternate shells, I suspect you might be
-
Bfs 3.0: The Fastest Find Yet
Nice to see other alternatives to find. I personally use fd (https://github.com/sharkdp/fd) a lot, as I find the UX much better. There is one thing that I think could be better, around the difference between "wanting to list all files that follow a certain pattern" and "wanting to find one or a few specific files". Technically, those are the same, but an issue I'll often run into is wanting to search something in dotfiles (for example the Go tools), use the unrestricted mode, and it'll find the few files I'm looking for, alongside hundreds of files coming from some cache/backup directory somewhere. This happens even more with rg, as it'll look through the files contents.
I'm not sure if this is me not using the tool how I should, me not using Linux how I should, me using the wrong tool for this job, something missing from the tool or something else entirely. I wonder if other people have this similar "double usage issue", and I'm interested in ways to avoid it.
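The "double usage" the commenter describes maps onto fd's layered filtering: the default search is narrow, and flags progressively widen it. A sketch of the relevant flags (the pattern and exclude glob are placeholders, and the commands are guarded so they are a no-op without fd):

```shell
if command -v fd >/dev/null 2>&1; then
  fd 'pattern'                   # default: skips hidden and .gitignore'd files
  fd -H 'pattern'                # -H: also search hidden files
  fd -I 'pattern'                # -I: also search ignored files
  fd -u 'pattern'                # -u: unrestricted, shorthand for -H -I
  fd -u --exclude '.cache' 'pattern'  # unrestricted, but prune noisy directories
fi
```

Combining `-u` with `--exclude` is one way to search dotfiles without drowning in cache and backup directories.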
What are some alternatives?
Apache Spark - Apache Spark - A unified analytics engine for large-scale data processing
telescope.nvim - Find, Filter, Preview, Pick. All lua, all the time.
dremio-oss - Dremio - the missing link in modern data
ripgrep - ripgrep recursively searches directories for a regex pattern while respecting your gitignore
Presto - The official home of the Presto distributed SQL query engine for big data
fzf - :cherry_blossom: A command-line fuzzy finder
Apache Drill - Apache Drill is a distributed MPP query layer for self describing data
exa - A modern replacement for ‘ls’.
Apache Calcite - Apache Calcite
skim - Fuzzy Finder in rust!
ClickHouse - ClickHouse® is a free analytics DBMS for big data
vim-grepper - :space_invader: Helps you win at grep.