parquet-format
configu
| | parquet-format | configu |
|---|---|---|
| Mentions | 4 | 17 |
| Stars | 1,655 | 1,506 |
| Growth | 2.4% | 1.3% |
| Activity | 7.2 | 9.1 |
| Last commit | 5 days ago | 2 days ago |
| Language | Thrift | TypeScript |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
parquet-format
- Summing columns in remote Parquet files using DuckDB
Right, there's all sorts of metadata and often stats included in any parquet file: https://github.com/apache/parquet-format#file-format
The offsets of that metadata are well-defined (it lives in the footer), so for S3 or other blob storage, as long as you can efficiently request a range of bytes, you can pull the metadata without having to read all the data.
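As a rough sketch of the range-read idea: a Parquet file ends with the file metadata, a 4-byte little-endian metadata length, and the magic bytes `PAR1`, so two small range requests are enough to fetch the footer. The `read_range` callable and the in-memory `blob` below are hypothetical stand-ins for an S3 or HTTP range request, and the footer bytes are a placeholder rather than real Thrift-encoded metadata.

```python
import struct

MAGIC = b"PAR1"  # magic bytes at the start and end of a Parquet file


def read_footer(read_range, file_size):
    """Return the raw footer metadata bytes using two range reads.

    read_range(start, end) is a hypothetical callable that returns
    bytes [start, end) of the file, e.g. backed by an HTTP Range request.
    """
    # Last 8 bytes: 4-byte little-endian footer length, then the magic.
    tail = read_range(file_size - 8, file_size)
    if tail[4:] != MAGIC:
        raise ValueError("missing trailing PAR1 magic bytes")
    (footer_len,) = struct.unpack("<I", tail[:4])
    footer_start = file_size - 8 - footer_len
    if footer_start < len(MAGIC):
        raise ValueError("footer length is larger than the file")
    # Second read: only the footer itself, not the column data.
    return read_range(footer_start, file_size - 8)


# Simulated file in memory; real footers are Thrift-encoded structures.
fake_footer = b"thrift-encoded-metadata"
blob = (
    MAGIC
    + b"column chunk bytes"
    + fake_footer
    + struct.pack("<I", len(fake_footer))
    + MAGIC
)

requests = []  # records which byte ranges were requested


def read_range(start, end):
    requests.append((start, end))
    return blob[start:end]


footer = read_footer(read_range, len(blob))
print(footer, requests)
```

Decoding the returned bytes into row-group and column statistics would need a Thrift reader, which this sketch leaves out.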
- FLaNK Stack for 4th of July
- I have question related to Parquet files and AWS Glue
As I read here https://github.com/apache/parquet-format/blob/master/LogicalTypes.md , they are stored as integers, and these integers represent the number of days (for Date) or the number of milliseconds, microseconds or nanoseconds (for DateTime) since 1970-01-01. This works as expected with the Parquet files written by our ETL tool from the internal database --> S3: all Date/DateTime columns are integers, which means that in the Glue job I have to convert these integers back to Date/DateTime values to do some transformations on them. But when Parquet files are written by Spark, the columns are in Date/DateTime (or Timestamp, to be more precise) format, not integers (I checked by reading these files again in another Glue job), and that confuses me.
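For illustration, converting such raw integers back to dates by hand could look like the sketch below. The function names are hypothetical, the timestamps are assumed to be UTC-adjusted, and Python's `datetime` only keeps microsecond resolution, so nanosecond values are truncated.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH_DATE = date(1970, 1, 1)
EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Number of units per second for each timestamp unit.
UNITS_PER_SECOND = {
    "MILLIS": 1_000,
    "MICROS": 1_000_000,
    "NANOS": 1_000_000_000,
}


def days_to_date(days):
    """Convert days since 1970-01-01 to a date."""
    return EPOCH_DATE + timedelta(days=days)


def timestamp_to_datetime(value, unit):
    """Convert an integer timestamp to a UTC datetime (assumes UTC-adjusted)."""
    # Integer arithmetic avoids float rounding; NANOS lose sub-microsecond digits.
    micros = value * 1_000_000 // UNITS_PER_SECOND[unit]
    return EPOCH_UTC + timedelta(microseconds=micros)


print(days_to_date(365))
print(timestamp_to_datetime(1_500, "MILLIS"))
```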
- Parquet: More than just “Turbo CSV”
Date is confusing with a timezone (UTC or otherwise) and the doco makes no such suggestion.
The Parquet datatypes documentation is pretty clear that there is a flag isAdjustedToUTC to define if the timestamp should be interpreted as having Instant semantics or Local semantics.
https://github.com/apache/parquet-format/blob/master/Logical...
Still no option to include a TZ offset in the data (so the same datum can be interpreted with both Local and Instant semantics) but not bad really.
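One way to picture the two semantics, as a sketch rather than how any particular reader implements it: the same stored integer becomes a timezone-aware UTC value under Instant semantics and a timezone-less wall-clock value under Local semantics. The function name `interpret_micros` is hypothetical, and microsecond units are assumed.

```python
from datetime import datetime, timedelta, timezone


def interpret_micros(micros, is_adjusted_to_utc):
    """Interpret microseconds since 1970-01-01 under either semantics."""
    wall_clock = datetime(1970, 1, 1) + timedelta(microseconds=micros)
    if is_adjusted_to_utc:
        # Instant semantics: a fixed point on the UTC timeline.
        return wall_clock.replace(tzinfo=timezone.utc)
    # Local semantics: a wall-clock reading with no timezone attached.
    return wall_clock


instant = interpret_micros(86_400_000_000, is_adjusted_to_utc=True)
local = interpret_micros(86_400_000_000, is_adjusted_to_utc=False)
print(instant.isoformat())
print(local.isoformat())
```

Neither result carries the writer's original offset, which matches the limitation noted above.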
configu
- Hacktoberfest 2023: Where Open Source Enthusiasts of All Levels Unite
As we celebrate Hacktoberfest, we at Configu invite you to be part of our mission to redefine software configuration management. We've set out to tackle the persistent challenge of configuration chaos, and we're making strides every day. If you're searching for a place to make a significant impact this Hacktoberfest, consider Configu. Delve into our open-source repository, understand our vision, and contribute to shaping our journey. If you're unsure where to begin or need some help along the way, our Configu Discord community is always here to guide you. For newcomers, we recommend starting with issues labeled 'good-first-issues'.
- Unmasking Ghost Parameters, or How to Save Time and Money
Enter Configu, an open source implementation of the concept of Configuration-as-Code, ensuring that the code remains the source of truth. But we didn't stop there. We've just launched a new feature called configu find that takes configuration management to the next level.
- Shift from ENV Files to Configuration-as-Code
Thanks Geva for sharing my article.
In this piece, I delve into the world of application configuration, discussing the drawbacks of using environment variables stored in env files and introducing the powerful alternative - Configuration as Code (CaC).
Looking forward to your thoughts! If you're intrigued, I invite you to explore our open-source project on GitHub: https://github.com/configu/configu
- GitHub - Configu: Open-source project that puts an end to your configuration Chaos
- Configu - Open-source project that puts an end to your configuration Chaos
- Configu - Unified all your configuration solutions under the same interface
- FLaNK Stack for 4th of July
- Configu: a simple, modern, and generic standard for managing and collaborating software configurations ⚙️✨
- Configu: Unleashing the Power of Configuration as Code
What are some alternatives?
rapidgzip - Gzip Decompression and Random Access for Modern Multi-Core Machines
background-removal-js - Remove backgrounds from images directly in the browser environment with ease and no additional costs or privacy concerns. Explore an interactive demo.
xgen - Salesforce open-source LLMs with 8k sequence length.
wizmap - Explore and interpret large embeddings in your browser with interactive visualization! 📍
gping - Ping, but with a graph
FastSAM - Fast Segment Anything
generative-models - Generative Models by Stability AI
graphic-walker - An open source alternative to Tableau. Embeddable visual analytic
hacktoberfest-data - Generating stats from the raw Hacktoberfest application data.