marvin
jdupes
DISCONTINUED
| | marvin | jdupes |
|---|---|---|
| Mentions | 16 | 44 |
| Stars | 4,601 | 1,681 |
| Growth | 6.3% | - |
| Activity | 9.9 | 0.0 |
| Last commit | about 18 hours ago | 6 months ago |
| Language | Python | C |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
marvin
- Show HN: Magentic – Use LLMs as simple Python functions
Seems a lot like https://github.com/PrefectHQ/marvin?
The prompting you do seems an awful lot like:
Yes, similar ideas. Marvin [asks the LLM to mimic the python function](https://github.com/PrefectHQ/marvin/blob/f37ad5b15e2e77dd998...), whereas in magentic the function signature just represents the inputs/outputs to the prompt-template/LLM, so the LLM “doesn’t know” that it is pretending to be a python function - you specify all the prompts.
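The distinction drawn in this reply can be sketched in a few lines. This is only an illustration under assumptions, not code from Marvin or magentic: `mimic_prompt` and `template_prompt` are hypothetical helpers, and no LLM is called. The first builds a prompt that shows the model the function itself; the second fills a caller-written template, so the model never sees that a function is involved.

```python
import inspect


def mimic_prompt(fn, *args, **kwargs):
    """Hypothetical helper: ask a model to act as `fn` and predict its output."""
    sig = inspect.signature(fn)
    bound = sig.bind(*args, **kwargs)
    bound.apply_defaults()
    return (
        "You are the Python function below. Return only its output.\n"
        f"def {fn.__name__}{sig}:\n"
        f'    """{inspect.getdoc(fn)}"""\n'
        f"Arguments: {dict(bound.arguments)}"
    )


def template_prompt(template, fn, *args, **kwargs):
    """Hypothetical helper: fill a caller-written template with fn's arguments."""
    bound = inspect.signature(fn).bind(*args, **kwargs)
    bound.apply_defaults()
    return template.format(**bound.arguments)


def list_fruits(n: int, color: str = "red") -> list:
    """Generate a list of n fruits of the given color."""


# The model is shown the function and asked to imitate it.
print(mimic_prompt(list_fruits, 3))

# The model only sees the text the caller wrote.
print(template_prompt("Name {n} {color} fruits.", list_fruits, 3))
```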
- 4-Apr-2023
Marvin: a batteries-included library for building AI-powered software. Marvin's job is to integrate AI directly into your codebase by making it look and feel like any other function (https://github.com/PrefectHQ/marvin)
- Magic - AI functions for Typescript
Sure! I was inspired by this Python library: https://github.com/PrefectHQ/marvin
- Show HN: A ChatGPT TUI with custom bots
I see Langchain has support for Azure chat models, and Marvin is built on Langchain so it may not be so difficult! Tracking issue here: https://github.com/PrefectHQ/marvin/issues/189
- FLaNK Stack Weekly 3 April 2023
- Show HN: Marvin – build AI functions that use an LLM as a runtime
Check out this example from the docs to see how to take a URL as argument and then pass content to the LLM: https://www.askmarvin.ai/guide/concepts/ai_functions/#sugges...
(The previous example is also good)
A few things you could consider:
1. We have a utility for getting content out of HTML at marvin.utilities.strings.html_to_content. That would probably significantly compress it.
2. Chunk the HTML into batches that fit in context, send each over with an AI function that summarizes it (you could instruct the AI function to optimize the summary to help with title generation), then send all the resulting summaries to a title generator
3. We have a suite of HTML loader classes that will probably be ready for production in a couple releases (see https://github.com/PrefectHQ/marvin/blob/main/src/marvin/loa...) but you could try them out now (note: these use parts of Marvin beyond just AI functions, so I'm not recommending it as a drop-in right now). Our loader classes are (ideally) designed to do more than just chunk the input; depending on the nature of the input we do different preprocessing steps to help with insight.
4. Experiment and let us know what you learn - we can incorporate it into a loader class if it's effective
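Suggestion 2 above (chunk, summarize each chunk, then generate a title from the summaries) could be sketched roughly as follows. This is a minimal sketch under assumptions: `chunk_text` and `title_from_chunks` are hypothetical names, the `summarize` and `make_title` callables stand in for whatever AI functions you define, and a character count is used as a crude proxy for the model's context limit.

```python
from typing import Callable, List


def chunk_text(text: str, max_chars: int) -> List[str]:
    """Split text into consecutive pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def title_from_chunks(
    text: str,
    summarize: Callable[[str], str],
    make_title: Callable[[str], str],
    max_chars: int = 2000,
) -> str:
    """Summarize each chunk, then build one title from the joined summaries."""
    summaries = [summarize(chunk) for chunk in chunk_text(text, max_chars)]
    return make_title("\n".join(summaries))


# Stand-in callables so the sketch runs without an LLM.
title = title_from_chunks(
    "first part. second part. third part.",
    summarize=lambda chunk: chunk.strip()[:12],
    make_title=lambda summaries: summaries.splitlines()[0].title(),
    max_chars=13,
)
print(title)
```

Splitting on a fixed character count can cut through words or tags; splitting on paragraph or element boundaries would likely give the summarizer cleaner input.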
Here https://github.com/PrefectHQ/marvin/blob/main/examples/end-t... the prompt says
instructions=(
Hi!
This example was produced using GPT 3.5 turbo, where yes, the LLM does not always align ideally. I used 3.5 for the example since that's Marvin's default and I know many people wouldn't have gpt4 access yet (which is significantly better at following instructions) - didn't want to set a misleading expectation.
That said, my instructions for the bot in this example certainly could have been more precise :) For a more realistic example, you could check out the other example (which works pretty well on 3.5): https://github.com/PrefectHQ/marvin/blob/main/examples/load_...
Thanks!
Caching is highly requested! We have an issue open (https://github.com/PrefectHQ/marvin/issues/102) and expect to tackle it soon.
You can set temperature as a setting today (sorry we haven't documented all the settings yet) by setting the env var `MARVIN_OPENAI_MODEL_TEMPERATURE=0.2` or at runtime with `marvin.settings.openai_model_temperature=0.2`. Note the temperature is set when a bot / ai_fn is created, not when it's called, so you need to do this early.
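The note about timing is easy to trip over, so here is a small stand-in that mimics the described behavior. It is not Marvin's code: the `Settings` class, `make_bot`, and the 0.8 default are hypothetical, and only the environment variable name and the `openai_model_temperature` attribute name come from the comment above. The point it illustrates is that a value read at creation time is not affected by later changes.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    # Arbitrary default chosen for this sketch only.
    openai_model_temperature: float = 0.8

    @classmethod
    def from_env(cls, prefix: str = "MARVIN_") -> "Settings":
        """Read the temperature from the environment if it is set."""
        raw = os.environ.get(f"{prefix}OPENAI_MODEL_TEMPERATURE")
        return cls(float(raw)) if raw is not None else cls()


settings = Settings()


def make_bot():
    """Create a stand-in bot that captures the temperature when it is created."""
    temperature = settings.openai_model_temperature

    def bot(prompt: str) -> str:
        return f"[temperature={temperature}] {prompt}"

    return bot


settings.openai_model_temperature = 0.2   # set early, before creation
early_bot = make_bot()
settings.openai_model_temperature = 0.9   # too late for early_bot
late_bot = make_bot()

print(early_bot("hello"))
print(late_bot("hello"))
```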
jdupes
- fdupes: Identify or Delete Duplicate Files
200 lines of Nim [1] seems to run about 9X faster than the 8000 lines of C in fdupes on a little test dir I have. If you need C, I think jdupes [2] is faster as @TacticalCoder points out a couple of times here. In my testing, `dups` is usually faster than `jdupes`, though.
- I'm amazed how I find anything & why I have so many dupes!
There's always the well-respected tool, Czkawka. Or, if the CLI is your thing, jdupes is a good option.
- Anyone know of any good file deduplication tools?
- Johnny Decimal
My research into this many years ago concluded that jdupes was the right / best solution I could find for my use case.
https://github.com/jbruchon/jdupes
Though that works fine from a script perspective, I'd like some more interactive way of sorting directories etc. Identifying is just the first step; jdupes helps with linking the files (both soft and hard links come with caveats, though!), but that is mostly to save space, not to help with reorganisation.
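For the "identifying" step mentioned above, a common general approach is to group files by size first and hash only the files whose sizes collide. The sketch below shows that idea in Python; it is an illustration only and makes no claim about how jdupes or fdupes work internally. The names `file_digest` and `find_duplicates` are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups of paths under root whose contents hash identically."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(group) for group in by_hash.values() if len(group) > 1)
    return groups


if __name__ == "__main__":
    for group in find_duplicates("."):
        print(group)
```

This sketch does not check whether two paths are already hard links to the same file, so it would report them as duplicates even though they occupy no extra space.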
- Any good duplicate file finder for windows?
jdupes is a tuned fork of the well-known fdupes, and has Win32 releases.
- FLaNK Stack Weekly 3 April 2023
- Backing Up Data: Tips/Advice for Tons of Unorganized Data and Duplicate Files from Multiple Sources
- Anyone running Bees? Or deduping data some other way?
If not bees, do you run other programs for deduping? I see jdupes has support for BTRFS, https://github.com/jbruchon/jdupes, and also duperemove, https://github.com/markfasheh/duperemove.
- Ask HN: Tool to find identical file subtrees scattered over disks
What are some alternatives?
fdupes - FDUPES is a program for identifying or deleting duplicate files residing within specified directories.
dupeguru - Find duplicate files
rmlint - Extremely fast tool to remove duplicates and other lint from your filesystem
rdfind - find duplicate files utility
duperemove - Tools for deduping file systems
czkawka - Multi functional app to find duplicates, empty folders, similar images etc.
fclones - Efficient Duplicate File Finder
phockup - Media sorting tool to organize photos and videos from your camera in folders by year, month and day.
btrfs-progs - Development of userspace BTRFS tools
cdecrypt - Decrypt Wii U NUS content — Forked from: https://code.google.com/archive/p/cdecrypt/
dduper - Fast block-level out-of-band BTRFS deduplication tool.
xxHash - Extremely fast non-cryptographic hash algorithm