Read files from s3 using Pandas/s3fs or AWS Data Wrangler?

This page summarizes the projects mentioned and recommended in the original post on /r/dataengineering

InfluxDB – Built for High-Performance Time Series Workloads
InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now.
www.influxdata.com
featured
SaaSHub - Software Alternatives and Reviews
SaaSHub helps you find the best software and product alternatives
www.saashub.com
featured
  1. Pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

  2. InfluxDB

    InfluxDB – Built for High-Performance Time Series Workloads. InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now.

    InfluxDB logo
  3. s3fs

    S3 Filesystem

  4. AWS Data Wrangler

    pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

    I had no problem with awswrangler (https://github.com/aws/aws-sdk-pandas) and it supports reading and writing partitions which was really helpful and a few other optimizations that made it a great tool

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts

  • We are the developers behind pandas, currently preparing for the 2.0 release :) AMA

    9 projects | /r/Python | 1 Mar 2023
  • Data Visualisation Basics

    3 projects | dev.to | 6 Sep 2024
  • Useful Python Libraries for AI/ML

    5 projects | dev.to | 10 Aug 2024
  • The Design Philosophy of Great Tables (Software Package)

    7 projects | news.ycombinator.com | 4 Apr 2024
  • Welcome to 14 days of Data Science!

    1 project | dev.to | 7 Mar 2024

Did you know that Python is
the 2nd most popular programming language
based on number of references?