parquet-files

Open-source projects categorized as parquet-files

Top 5 parquet-file Open-Source Projects

  • petastorm

    The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.

  • Cinchoo ETL

    ETL framework for .NET (parser / writer for CSV, flat, XML, JSON, key-value, Parquet, YAML, and Avro formatted files)

  • parquet4s

    Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.

  • parquet-floor

    A lightweight Java library that facilitates reading and writing Apache Parquet files without Hadoop dependencies

    Project mention: Welcome to mwmbl, the free, open-source and non-profit search engine | news.ycombinator.com | 2023-09-18

    ChatGPT has other failure modes. When a question doesn't have an answer written down somewhere, it really struggles. A case in point is something like "how do I write a parquet file in Java without using Hadoop".

    This is not at all trivial but quite possible[1], yet ChatGPT will 100% of the time either hallucinate APIs, disregard the instruction not to use Hadoop, or give otherwise plausible-looking but incorrect answers.

    The trick is that it isn't doable by simply finding the correct dependencies and API calls; you need to extract and override filesystem classes from the Hadoop project to cut those ties.

    [1] https://github.com/strategicblue/parquet-floor

  • Threat-Detection-and-Visualization

    Threat Detection and Visualization

    Project mention: help by increase in stars, forks and pull request | /r/github | 2023-09-10

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
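
The parquet-floor mention above is about avoiding Hadoop when writing Parquet from Java. As a smaller, related illustration, the Parquet container framing can be inspected with no Hadoop and no Parquet library at all. The following is a minimal Python sketch, not taken from any of the listed projects. It assumes the standard unencrypted layout from the Parquet format description: the file begins and ends with the 4-byte magic PAR1, and the 4 bytes before the trailing magic hold the footer metadata length as a little-endian integer. The function name read_footer_length is hypothetical.

```python
import struct

PARQUET_MAGIC = b"PAR1"  # 4-byte marker at both ends of an unencrypted Parquet file


def read_footer_length(path):
    """Return the footer metadata length in bytes for a Parquet file.

    Raises ValueError if the file does not have the expected framing.
    Hypothetical helper for illustration; it does not parse the metadata itself.
    """
    with open(path, "rb") as f:
        f.seek(0, 2)              # jump to the end to learn the file size
        size = f.tell()
        # Smallest possible framing: leading magic + length field + trailing magic.
        if size < 12:
            raise ValueError("file too small to be Parquet")
        f.seek(0)
        head = f.read(4)          # leading magic
        f.seek(-8, 2)
        tail = f.read(8)          # 4-byte footer length + trailing magic

    if head != PARQUET_MAGIC or tail[4:] != PARQUET_MAGIC:
        raise ValueError("missing PAR1 magic bytes")

    (footer_len,) = struct.unpack("<I", tail[:4])  # little-endian unsigned 32-bit
    if footer_len > size - 12:
        raise ValueError("footer length exceeds file size")
    return footer_len
```

Calling read_footer_length("data.parquet") (a placeholder path) returns the size of the serialized file metadata. Decoding that metadata, or writing a valid file, still requires a Thrift-aware Parquet implementation such as the libraries listed above.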

Index

What are some of the best open-source parquet-file projects? This list will help you:

#  Project                              Stars
1  petastorm                            1,752
2  Cinchoo ETL                            736
3  parquet4s                              271
4  parquet-floor                           36
5  Threat-Detection-and-Visualization      35
