dbt-fal
Pandas
Metric | dbt-fal | Pandas
---|---|---
Mentions | 12 | 399
Stars | 851 | 42,104
Growth | - | 0.9%
Activity | 7.7 | 10.0
Last commit | about 1 month ago | 4 days ago
Language | Python | Python
License | Apache License 2.0 | BSD 3-clause "New" or "Revised" License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
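To make the growth figure concrete, here is a minimal sketch of the month-over-month arithmetic, assuming growth is the percentage change in star count from one month to the next. The function names are hypothetical, and because the published 0.9% is a rounded value, the back-calculated star count is only an estimate.

```python
def month_over_month_growth(previous_stars: int, current_stars: int) -> float:
    """Percentage change in stars between two consecutive months."""
    return (current_stars - previous_stars) / previous_stars * 100


def previous_star_count(current_stars: int, growth_percent: float) -> int:
    """Estimate last month's star count from the current count and growth rate."""
    return round(current_stars / (1 + growth_percent / 100))


# Figures from the comparison table: 42,104 stars at 0.9% monthly growth.
estimated_previous = previous_star_count(42_104, 0.9)
estimated_gain = 42_104 - estimated_previous
print(estimated_previous, estimated_gain)
```

Under these assumptions, a 0.9% monthly growth on a project of this size corresponds to a few hundred new stars per month.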
dbt-fal
-
machine learning in snowflake, unhappy data scientists
Happy data scientists use fal and dbt
-
dbt for ML Engineering
fal (https://github.com/fal-ai/fal) helps with this! In fact, we recently wrote a blog post about feature engineering with fal and dbt.
-
Dbt-fal: a dbt Python adapter with local code execution
We built a dbt adapter that lets you run local Python code in your dbt project, alongside any data warehouse. You can see it here: https://github.com/fal-ai/fal/tree/main/adapter
This new adapter helps you run your dbt Python models with isolated Python environments using our open source library: https://github.com/fal-ai/isolate
-
Data Stack for Python Scripts (and other transformations)
Have you considered fal? https://github.com/fal-ai/fal
-
Comparing dbt with Delta Live Tables for doing transformations
Something worth mentioning in the post is that dbt will soon introduce Python transformations on data warehouse offerings (e.g. Snowpark), and that tools like fal enable these Python transformations to run in a different environment that you control.
-
What are the hottest dbt Repositories you should star on Github 2022? - Here are mine.
Fal-AI ( https://github.com/fal-ai/fal ) Fal helps to run Python scripts directly from the dbt project. For example, you can load dbt models directly into the Python context, which helps to apply data science libraries like scikit-learn and Prophet in the dbt models. This especially improves the data science capabilities within a data pipeline. What I really like about fal is that it extends dbt from an interesting angle.
-
What are your hottest dbt repositories in 2022 so far? Here are mine!
- fal ai: Fal helps to run Python scripts directly from the dbt project. For example, you can load dbt models directly into the Python context, which helps to apply data science libraries like scikit-learn and Prophet in the dbt models.
-
Wanting to move away from SQL
I haven't tried it yet, but I know https://fal.ai/ helps you run Python alongside dbt.
-
Do I need orchestration for a Fivetran-dbt stack?
Yes, I agree with you that having Fivetran/Airbyte and dbt covers a lot of the Airflow use cases. That being said, you might still want to run some scripts after the dbt transformation is over. We ran into this exact problem and built a useful CLI tool for running Python scripts alongside the dbt run.
-
Why is Data Build Tool (DBT) so popular? What are some other alternatives?
Great write-up! For your logging integration, you might have a look at fal. There's an example of sending events to Datadog
Pandas
- The Birth of Parquet
- PDEP-13: The Pandas Logical Type System
- PHP Doesn't Suck Anymore
-
AWS Serverless Diversity: Multi-Language Strategies for Optimal Solutions
Python is a natural fit for serverless development. It boasts a vast array of libraries, including Powertools for AWS and robust libraries for data engineers. Its versatility and excellent developer experience make it a top choice for serverless projects, offering a seamless and enjoyable development experience.
-
Pandas reset_index(): How To Reset Indexes in Pandas
In data analysis, managing the structure and layout of data before analyzing them is crucial. Python offers versatile tools to manipulate data, including the often-used Pandas reset_index() method.
-
Deploying a Serverless Dash App with AWS SAM and Lambda
Dash is a Python framework that enables you to build interactive frontend applications without writing a single line of JavaScript. Internally and in projects, we like to use it to build quick proofs of concept for data-driven applications because of its nice integration with Plotly and pandas. For this post, I'm going to assume that you're already familiar with Dash and won't explain that part in detail. Instead, we'll focus on what's necessary to make it run serverless.
-
Help Us Build Our Roadmap - Pydantic
There is a pull request to integrate into both pydantic extra types and pandas core [1]
[1]: https://github.com/pandas-dev/pandas/issues/53999
-
Stuff I Learned during Hanukkah of Data 2023
Last year I worked through the challenges using VisiData, Datasette, and Pandas. I walked through my thought process and solutions in a series of posts.
-
Introducing Flama for Robust Machine Learning APIs
pandas: A library for data analysis in Python
-
Exploring Open-Source Alternatives to Landing AI for Robust MLOps
Data analysis involves scrutinizing datasets for class imbalances or protected features and understanding their correlations and representations. A classical tool like pandas would be my obvious choice for most of the analysis, and I would use OpenCV or Scikit-Image for image-related tasks.
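As a brief illustration of the `reset_index()` method mentioned in the posts above, the following sketch shows its two common uses on a filtered DataFrame. The column names and values are made up for the example; only `DataFrame.reset_index` and its `drop` parameter come from pandas itself.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"city": ["Oslo", "Lima", "Pune"], "temp_c": [4, 19, 31]})

# Filtering keeps the original row labels, so the index now has a gap.
warm = df[df["temp_c"] > 10]

# drop=True discards the old labels and renumbers the rows from 0.
renumbered = warm.reset_index(drop=True)

# Without drop, the old labels are kept as a new column named "index".
kept = warm.reset_index()

print(renumbered)
print(kept)
```

Whether to pass `drop=True` depends on whether the old row labels carry meaning you still need after filtering.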
What are some alternatives?
dbt-metabase - dbt + Metabase integration
Cubes - [NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis
dbt-expectations - Port(ish) of Great Expectations to dbt test macros
tensorflow - An Open Source Machine Learning Framework for Everyone
kuwala - Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data for data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) high-resolution demographics data, b) points of interest from OpenStreetMap, c) Google Popular Times
orange - Orange: Interactive data analysis
evidence - Business intelligence as code: build fast, interactive data visualizations in pure SQL and markdown
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
airflow-dbt - Apache Airflow integration for dbt
Keras - Deep Learning for humans
re_data - re_data - fix data issues before your users & CEO would discover them
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration