wimsey VS data-engineer-handbook

Compare wimsey vs data-engineer-handbook and see how they differ.

data-engineer-handbook

This is a repo with links to everything you'd ever want to learn about data engineering (by DataExpert-io)
              wimsey         data-engineer-handbook
Mentions      4              3
Stars         128            27,545
Growth        3.9%           2.8%
Activity      7.3            9.1
Last commit   15 days ago    14 days ago
Language      Python         Jupyter Notebook
License       MIT License    -
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

wimsey

Posts with mentions or reviews of wimsey. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2025-02-09.
  • Classic Data science pipelines built with LLMs
    5 projects | news.ycombinator.com | 9 Feb 2025
    I'm definitely biased because my day job is writing ETL pipelines and supporting software, and my current side project is a data contracts library for helping with the above[0]. Still, I'm not sure I see this happening.

    80% of the focus of an ETL pipeline is in ensuring edge cases are handled appropriately (i.e. not producing models from potentially erroneous data, dead letter queueing unknown fields etc).

    I think an LLM would be great for "take this json and make it a pandas dataframe", but a lot less great for "interact with this billing API to produce auditable payment tables".

    For areas that are reliability focused, LLMs still need a lot more improvements to be useful.

    [0] https://github.com/benrutter/wimsey

  • The Data Engineering Handbook
    2 projects | news.ycombinator.com | 19 Nov 2024
    Nice list! Although as somebody who works on open source tools for data engineering, it kills me a little to see "companies" as the list header rather than, say, "projects".

    (also, shameless plug for my latest project Wimsey which is non-company affiliated but does let you test data in a nice, lightweight way: https://github.com/benrutter/wimsey)

  • Wimsey: A flexible, lightweight data contracts library
    1 project | news.ycombinator.com | 15 Nov 2024
  • This Week In Python
    5 projects | dev.to | 1 Nov 2024
    wimsey – Easy and flexible data testing and documentation

data-engineer-handbook

Posts with mentions or reviews of data-engineer-handbook. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-11-25.
  • Data-engineer-handbook: everything to learn about data engineering
    1 project | news.ycombinator.com | 3 Dec 2024
    This thing points to some sort of GitHub metrics dashboard.

    The actual handbook is at: https://github.com/DataExpert-io/data-engineer-handbook

  • FLAIV-KING Weekly 25 Nov 2024
    19 projects | dev.to | 25 Nov 2024
    ❄️ Snowflake Cortex AI + Slack 🌐 Mistral Multi Modal ❄️ Journey to Snowflake Monitoring Mastery ❄️ Best Practices for Using QueryTag in Snowflake 💻 CFP The Way 🦾 Multi Agent Framework AWS 📊 Cool Emojis 📈 bRAG LangChain 📝 Tool: What's in your stack? 📎 Awesome Event Driven Architecture Articles 📝 10 AI Open Source Tools 🫶 Documind 💻 Zerox AI 📈 Garak Open Source LLM Scanner 🦾 Magic Quill Image Project 🏃 vLLM to serve LLMs 🤖 OASIS Simulator 🙋🏻‍♂️ Creating Projects with UV 🛠️ RagFormation ✅ AutoKitteh Tool ✅ WebVM in the browser ✅ UV Pytorch ✅ Redis SQL Trino ✅ Bluesky Websocket Firehose in browser ✅ Data Engineer Handbook ✅ Automatic Speech Recognition (ASR) on Edge Devices 🦾 Automatic Researcher with Ollama 🛠️ Podcastify Open Source 🙋🏻‍♂️ NVIDIA Jetpack 6.1 Upgrade
  • The Data Engineering Handbook
    2 projects | news.ycombinator.com | 19 Nov 2024

What are some alternatives?

When comparing wimsey and data-engineer-handbook you can also consider the following projects:

Scrapling - 🕷️ An undetectable, powerful, flexible, high-performance Python library that makes Web Scraping simple and easy again!

LLaVA-o1

finstruments - Financial instrument definitions built with Python and Pydantic

awesome-sqlite - A curated list of awesome things related to SQLite

abacus-minimal - A minimal event-based ledger in Python that follows accounting rules

awesome-selfhosted-data - machine-readable data for https://awesome-selfhosted.net

