parquet-format vs generative-models

parquet-format

Apache Parquet (by apache)

generative-models

Generative Models by Stability AI (by Stability-AI)

Project	parquet-format	generative-models
Mentions	4	21
Stars	1,655	22,508
Growth	1.8%	4.4%
Activity	7.2	7.3
Latest Commit	5 days ago	26 days ago
Language	Thrift	Python
License	Apache License 2.0	MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

parquet-format

Posts with mentions or reviews of parquet-format. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-11-16.

Summing columns in remote Parquet files using DuckDB
4 projects | news.ycombinator.com | 16 Nov 2023

Right, there's all sorts of metadata and often stats included in any parquet file: https://github.com/apache/parquet-format#file-format
The offsets of said metadata are well-defined (i.e. in the footer), so for S3 / blob storage, as long as you can efficiently request a range of bytes, you can pull the metadata without having to read all the data.
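
As a rough sketch of the point above: a Parquet file ends with a 4-byte little-endian footer length followed by the 4-byte magic `PAR1`, so a reader can fetch the last 8 bytes first and then request only the footer range. The helper name `footer_byte_range` and the synthetic file contents are illustrative assumptions, not part of any Parquet library.

```python
import struct
from typing import Tuple

MAGIC = b"PAR1"  # magic bytes at the start and end of a Parquet file


def footer_byte_range(tail: bytes, file_size: int) -> Tuple[int, int]:
    """Given the last 8 bytes of a file, return (start, end) of the footer
    metadata, with end exclusive. Hypothetical helper for illustration."""
    if len(tail) != 8 or tail[4:] != MAGIC:
        raise ValueError("tail does not look like the end of a Parquet file")
    (footer_len,) = struct.unpack("<I", tail[:4])  # little-endian uint32
    end = file_size - 8          # footer stops where the length field begins
    start = end - footer_len
    if start < len(MAGIC):
        raise ValueError("footer length is larger than the file allows")
    return start, end


# Synthetic stand-in for a real file: magic, data, footer, length, magic.
fake_footer = b"\x00" * 20
fake_file = (
    MAGIC + b"column-data" + fake_footer
    + struct.pack("<I", len(fake_footer)) + MAGIC
)
start, end = footer_byte_range(fake_file[-8:], len(fake_file))
```

With a remote object, the two slices would become two ranged reads: one for the final 8 bytes and one for `start` to `end`.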
FLaNK Stack for 4th of July
15 projects | dev.to | 3 Jul 2023
I have question related to Parquet files and AWS Glue
1 project | /r/dataengineering | 18 Jun 2023

As I read here https://github.com/apache/parquet-format/blob/master/LogicalTypes.md , they are stored as integers, and these integers represent the number of days (for Date) or the number of milliseconds, microseconds or nanoseconds (for DateTime) since 1970-01-01. This works as expected with the Parquet files written by our ETL tool from the internal database --> S3: all Date/DateTime columns are integers, which means that in the Glue job I have to convert these integers back to Date/DateTime values to do some transformation on them. But when Parquet files are written by Spark, the columns are in Date/DateTime (or Timestamp, to be more precise) format, not integers (I checked by reading these files again in another Glue job), and that confuses me.
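
A minimal sketch of the manual conversion the post describes, using only the Python standard library. The function names and the unit labels used as dictionary keys are assumptions made for this example. It treats timestamps as UTC instants and truncates nanoseconds to microseconds, because `datetime` has no finer resolution.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH_DATE = date(1970, 1, 1)
EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Units per second for each timestamp unit (labels chosen for this sketch).
_UNITS_PER_SECOND = {"MILLIS": 10**3, "MICROS": 10**6, "NANOS": 10**9}


def decode_date(days: int) -> date:
    """Convert days since 1970-01-01 into a calendar date."""
    return EPOCH_DATE + timedelta(days=days)


def decode_timestamp(value: int, unit: str) -> datetime:
    """Convert an integer timestamp into a UTC datetime.

    Assumes the value is an instant relative to the Unix epoch; nanosecond
    values lose their sub-microsecond part.
    """
    per_second = _UNITS_PER_SECOND[unit]
    micros = value * 10**6 // per_second
    return EPOCH_UTC + timedelta(microseconds=micros)


one_second_millis = decode_timestamp(1_000, "MILLIS")
one_second_nanos = decode_timestamp(1_000_000_999, "NANOS")
first_day = decode_date(0)
```

Whether a given writer stores these columns as plain integers or annotates them with a logical type is a property of the file, so a reader may hand back either form.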
Parquet: More than just “Turbo CSV”
7 projects | news.ycombinator.com | 3 Apr 2023

Date is confusing with a timezone (UTC or otherwise) and the doco makes no such suggestion.
The Parquet datatypes documentation is pretty clear that there is a flag isAdjustedToUTC to define if the timestamp should be interpreted as having Instant semantics or Local semantics.
https://github.com/apache/parquet-format/blob/master/Logical...
Still no option to include a TZ offset in the data (so the same datum can be interpreted with both Local and Instant semantics) but not bad really.
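
To make the Instant versus Local distinction concrete, here is a small sketch of how a reader might act on the `isAdjustedToUTC` flag. The function `interpret_micros` is hypothetical, and mapping Local semantics to a naive `datetime` is one reasonable choice rather than anything the format prescribes.

```python
from datetime import datetime, timedelta, timezone


def interpret_micros(micros: int, is_adjusted_to_utc: bool) -> datetime:
    """Interpret microseconds since the epoch under either semantics.

    Instant semantics: return a timezone-aware UTC datetime.
    Local semantics: return a naive datetime (wall-clock time, no zone).
    """
    naive = datetime(1970, 1, 1) + timedelta(microseconds=micros)
    if is_adjusted_to_utc:
        return naive.replace(tzinfo=timezone.utc)
    return naive


instant = interpret_micros(0, True)   # a point on the global timeline
local = interpret_micros(0, False)    # a wall-clock reading, zone unknown
```

The same stored integer yields two different kinds of value, which is why the flag has to travel with the column.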

generative-models

Posts with mentions or reviews of generative-models. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-12.

Creating Videos with Stable Video Diffusion
3 projects | dev.to | 12 Feb 2024

git clone https://github.com/Stability-AI/generative-models.git && cd generative-models
Show HN: I have created a free text-to-image website that supports SDXL Turbo
2 projects | news.ycombinator.com | 17 Dec 2023
How To Increase Performance Time on MacOS
3 projects | /r/StableDiffusion | 10 Dec 2023
Introducing Stable Video Diffusion: Stability AI's New AI Research Tool for Image-to-Video Synthesis
1 project | /r/Linkeesproject | 8 Dec 2023

Generative Models by Stability AI Github Repository
image-to-video tutorial
1 project | /r/StableDiffusion | 26 Nov 2023

# clone SD repo
!git clone https://github.com/Stability-AI/generative-models.git

# cd into working directory
# the % sets the pwd globally as usually each command is run in a subshell in google colab
%cd /content/generative-models/

# installing dependencies
!pip install -r requirements/pt2.txt
!pip install .

# HACK
# I was getting ModuleNotFoundError: No module named 'scripts'
# This is what ChatGPT suggested (let me know if there is a better way)
file_path = '/content/generative-models/scripts/sampling/simple_video_sample.py'
new_text = "import sys\nsys.path.append('/content/generative-models')\n\n"
with open(file_path, 'r') as file:
    original_content = file.read()
updated_content = new_text + original_content
with open(file_path, 'w') as file:
    file.write(updated_content)

# Need to create a checkpoints/ folder - that is where the system looks for weights
import os
dir_name = 'checkpoints'
if not os.path.exists(dir_name):
    os.makedirs(dir_name)
    print(f"Directory '{dir_name}' created")
else:
    print(f"Directory '{dir_name}' already exists")

# Download weights into checkpoints/ folder
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="stabilityai/stable-video-diffusion-img2vid", filename="svd.safetensors", local_dir="checkpoints", local_dir_use_symlinks=False)

# I can't remember if this step is needed but it aims to reduce the memory footprint of pytorch
# I kept getting CUDA out of memory
# I got these instructions from the out of memory error message
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:512'
print(os.environ['PYTORCH_CUDA_ALLOC_CONF'])

# Inside of scripts/sampling/simple_video_sample.py you need to make 2 updates
# 1. input_path (line 26): update to the location of your file (I attached Gdrive so mine was "/content/drive/MyDrive/examples/car.jpeg")
# 2. decoding_t (line 34): update it to 5. You need to do this for memory preservation (CUDA out of memory).
#    I'm not sure if 5 is the best value but it worked for me

# Finally generate the video (output will be in the outputs/ folder)
!python scripts/sampling/simple_video_sample.py
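
One weakness of the file-patching hack in the post is that re-running the cell prepends the import lines again. A small sketch of an idempotent variant follows; `prepend_once` is a hypothetical helper, and the temporary file stands in for `simple_video_sample.py` so the example runs anywhere. It is not claimed to be the better way the author asks about, only a way to make the same hack safe to repeat.

```python
import tempfile
from pathlib import Path

# Same lines the post prepends; the repo path is taken from the post.
HEADER = "import sys\nsys.path.append('/content/generative-models')\n\n"


def prepend_once(path: Path, header: str) -> bool:
    """Prepend header to the file unless it already starts with it.

    Returns True if the file was modified, False if it was left alone.
    """
    original = path.read_text()
    if original.startswith(header):
        return False
    path.write_text(header + original)
    return True


# Demo on a throwaway file standing in for the real script.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "simple_video_sample.py"
    script.write_text("print('hello')\n")
    first = prepend_once(script, HEADER)    # modifies the file
    second = prepend_once(script, HEADER)   # no-op on the second run
    result = script.read_text()
```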
Stable Video Diffusion
6 projects | news.ycombinator.com | 21 Nov 2023

It looks like the huggingface page links their github that seems to have python scripts to run these: https://github.com/Stability-AI/generative-models
GitHub - Stability-AI/generative-models: Generative Models by Stability AI
1 project | /r/cryptogeum | 4 Nov 2023
How does ComfyUI load SDXL 1.0 so VRAM-efficiently? How do I do the same in vanilla python code?
1 project | /r/StableDiffusion | 18 Aug 2023

However, when using the example code from HuggingFace or setting up stuff from the StabilityAI/generative-models repo in a jupyter notebook, I end up using 21 GB of VRAM just for running the default pipeline (with no base model output). If I try to run the extra `base.vae.decode(base_latents)` after generation to get unrefined outputs, I get a CUDA out of memory error as it blows past the 24GB of my NVIDIA RTX 3090.
SDXL 1.0 is out!
1 project | /r/StableDiffusion | 28 Jul 2023
SDXL 0.9 Anyone having luck NOT centering subjects?
1 project | /r/StableDiffusion | 10 Jul 2023

SDXL uses cropping information as part of the conditioning. Images were randomly cropped during training and the coordinates of the crop were included as two integers at the end of the conditioning vector. If you're using ComfyUI you can use the CLIPTextEncodeSDXL node to specify where the upper left corner of the image should appear to be in relation to some hypothetical uncropped image. Here's a figure with examples from the report on SDXL:

What are some alternatives?

When comparing parquet-format and generative-models you can also consider the following projects:

rapidgzip - Gzip Decompression and Random Access for Modern Multi-Core Machines

background-removal-js - Remove backgrounds from images directly in the browser environment with ease and no additional costs or privacy concerns. Explore an interactive demo.

xgen - Salesforce open-source LLMs with 8k sequence length.

wizmap - Explore and interpret large embeddings in your browser with interactive visualization! 📍

evernote-ai-chatbot

FastSAM - Fast Segment Anything

gping - Ping, but with a graph

graphic-walker - An open source alternative to Tableau. Embeddable visual analytic
