eventsim vs corp

eventsim

Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic. (by viirya)

corp

Assets related to the operation of Fishtown Analytics. (by dbt-labs)

Project	eventsim	corp
Mentions	3	12
Stars	61	413
Growth	-	-0.2%
Activity	0.0	4.6
Latest Commit	4 months ago	19 days ago
Language	Scala	-
License	-	Apache License 2.0

The number of mentions is the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
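
The exact formulas behind these numbers are not given here, so the following is only a minimal sketch of one way such metrics could be computed. It assumes growth is the percentage change in stars between two monthly snapshots, that each commit's weight halves every `half_life_days` (an assumed parameter), and that the 0-10 activity value is a percentile rank across tracked projects. All function names are hypothetical.

```python
from datetime import date


def star_growth_percent(previous_stars, current_stars):
    """Month-over-month growth in stars, as a percentage (assumed definition)."""
    return round(100.0 * (current_stars - previous_stars) / previous_stars, 1)


def weighted_commit_score(commit_dates, today, half_life_days=30.0):
    """Sum of commit weights; a commit's weight halves every half_life_days.

    The exponential decay is an assumption used to make recent commits
    count more than older ones.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_scores(raw_scores):
    """Map raw scores to a 0-10 scale by percentile rank across projects.

    A project that outranks 90% of the tracked projects gets 9.0.
    """
    n = len(raw_scores)
    result = {}
    for name, score in raw_scores.items():
        below = sum(1 for other in raw_scores.values() if other < score)
        result[name] = round(10.0 * below / n, 1)
    return result
```

Under these assumptions, a project with no recent commits ends up with a raw score near zero, so its activity value drops toward 0.0 however many stars it has.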

eventsim

Posts with mentions or reviews of eventsim. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-05-25.

What are some good publicly available real-time data sources?
6 projects | /r/dataengineering | 25 May 2023

In my last real-time side project, I used an open source music app simulator called Eventsim. The original doesn't work anymore, but this repo still did when I used it around Nov/Dec.
Mock Data Stream for learning purposes
1 project | /r/dataengineering | 6 Mar 2023

You can use this event simulator repo perhaps? They have a docker file as well. Hope this helps.
Completed my first Data Engineering project with Kafka, Spark, GCP, Airflow, dbt, Terraform, Docker and more!
13 projects | /r/dataengineering | 2 Apr 2022

Eventsim is a program that generates event data to replicate page requests for a fake music web site. The results look like real use data, but are totally fake. The docker image is borrowed from viirya's fork of it, as the original project has gone without maintenance for a few years now.
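
To illustrate the idea described in these posts, here is a minimal sketch of a pseudo-random event stream for a fake music site. It is not Eventsim's code or output format: the page names, the field names (`ts`, `userId`, `page`) and the `simulate_events` function are assumptions chosen for illustration.

```python
import json
import random
from itertools import islice

# Hypothetical page names for a fake music site.
PAGES = ["Home", "NextSong", "Settings", "Logout"]


def simulate_events(n_users, start_ts_ms, seed=0):
    """Yield an endless stream of fake page-request events.

    A seeded generator keeps the stream reproducible between runs.
    """
    rng = random.Random(seed)
    ts = start_ts_ms
    while True:
        # Advance the clock by a random gap so timestamps always increase.
        ts += rng.randint(1, 5000)
        yield {
            "ts": ts,
            "userId": rng.randrange(1, n_users + 1),
            "page": rng.choice(PAGES),
        }


# Print the first five events as JSON lines, as a stream consumer might read them.
for event in islice(simulate_events(n_users=3, start_ts_ms=0), 5):
    print(json.dumps(event))
```

In a learning project, each JSON line could be sent to a message broker such as Kafka instead of being printed.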

corp

Posts with mentions or reviews of corp. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-05-03.

Are there database design Standards out there? As in, formal documents listing exact best practices for OLTP database design?
2 projects | /r/dataengineering | 3 May 2023

Here's one that covers some of your points and that I like in general: https://github.com/dbt-labs/corp/blob/main/dbt_style_guide.md Except instead of prefixing my table names with the processing stage, I keep them in schemas by processing stage (source, staging, analytics). So, I can tell my analysts to look into the analytics schema for all the final tables, and they won't be bothered by intermediate models. The table names also have a precise structure that corresponds to our specific subject.
Looking to understand why the dbt style guide recommends to use *all lower case* for keywords, field names, and function names?
1 project | /r/SQL | 8 Feb 2023
Best practices for data modeling with SQL and dbt
1 project | /r/SQL | 23 Aug 2022

I find the content more or less ripped from dbt's own style guide.
SQL Code Style Properties Questions
1 project | /r/Jetbrains | 15 Jun 2022

For anyone wondering, this is the dbt style guide I am referencing.
A modern data stack for startups
4 projects | dev.to | 21 Apr 2022

While the tool choice is obvious, how to use dbt is going to be more controversial. There's a load of great resources on dbt best practices, but as you can see from my Slack questions, there's enough ambiguity to tie you up.
Completed my first Data Engineering project with Kafka, Spark, GCP, Airflow, dbt, Terraform, Docker and more!
13 projects | /r/dataengineering | 2 Apr 2022

Just a slight critique, but I noticed some of the dbt models are a bit hard to read. Especially your dim_users SCD2 model, which uses lots of nested subqueries and multiple columns on the same line. You may want to refer to this style guide from dbt Labs. I find CTEs are a lot easier to parse and read.
What are some good resources for learning to write clean, production-quality code?
3 projects | /r/datascience | 20 Feb 2022

I really like this SQL style guide, and if you use dbt, the dbt style guide.
How do you format your SQL queries?
1 project | /r/SQL | 10 Feb 2022

I like this one from dbt very much.
Where do you like to do the L of ELT? Python or DBT?
1 project | /r/dataengineering | 8 Feb 2022

I recommend you write one. You can take inspiration from dbt's or GitLab's.
Confused about benefits of CTE
1 project | /r/SQL | 2 Apr 2021

I've seen the Fishtown Analytics coding conventions recommended a lot around here, but there are a few things about their recommendations on CTE use that confuse me.
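
Several of the posts above contrast nested subqueries with CTEs. As a minimal sketch of that difference, the snippet below runs the same aggregation both ways against an in-memory SQLite database. The `orders` table, its columns, and the data are hypothetical, and the lowercase SQL follows the convention mentioned in the posts rather than quoting the style guide itself.

```python
import sqlite3

# In-memory database with a small hypothetical table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table orders (user_id integer, amount real);
    insert into orders values (1, 10.0), (1, 15.0), (2, 5.0);
""")

# Nested subquery: the aggregation is buried inside the from clause.
nested = """
    select user_id, total
    from (
        select user_id, sum(amount) as total
        from orders
        group by user_id
    ) as user_totals
    where total > 20
"""

# CTE: the same aggregation is named up front, so the final select reads top to bottom.
with_cte = """
    with user_totals as (
        select user_id, sum(amount) as total
        from orders
        group by user_id
    )
    select user_id, total
    from user_totals
    where total > 20
"""

# Both forms return the same rows; only readability differs.
assert conn.execute(nested).fetchall() == conn.execute(with_cte).fetchall()
```

The readability gap tends to grow with each extra level of nesting, since every additional subquery becomes another named step in the CTE version.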

What are some alternatives?

When comparing eventsim and corp you can also consider the following projects:

streamify - A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!

nodejs-bigquery - Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.

spark-bigquery-connector - BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.

sql-style-guide - An opinionated guide for writing clean, maintainable SQL.

RedfinScraper - Scrapes Redfin data.

terraform - Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.

mockingbird - Mockingbird is a mock streaming data generator

pgsink - Logically replicate data out of Postgres into sinks (files, Google BigQuery, etc)

eventsim - Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic.

awesome-public-real-time-datasets - A list of publicly available datasets with real-time data maintained by the team at bytewax.io