Launch HN: Evidence (YC S21) – Web framework for data analysts

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • evidence

    Business intelligence as code: build fast, interactive data visualizations in pure SQL and markdown

    Hi HN!

    We’re Adam and Sean from Evidence (https://evidence.dev). We’re building a static site generator for data analysts.

    It's like Jekyll or Hugo for SQL analysts.

    In Evidence, pages are markdown documents. When you write SQL inside that markdown, the SQL runs against your database (we support BigQuery, Snowflake, and Postgres, with more to come). You can reference the results of those queries with a simple templating syntax, either to inline values into text or to generate report sections from a query. Evidence also includes a component library that lets you add charts and graphs (driven by your queries) by writing declarative component tags, as in the sketch below.
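
    To make this concrete, here's a rough sketch of what a page might look like. The query, column names, templating expression, and chart tag are all simplified for illustration:

    ````markdown
    <!-- orders.md: a simplified, illustrative Evidence page -->

    ```sql orders_by_month
    select date_trunc('month', created_at)::date as month,
           count(*) as orders
    from orders
    group by 1
    order by 1
    ```

    In our first month we received {orders_by_month[0].orders} orders.

    <LineChart data={orders_by_month} x=month y=orders/>
    ````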

    How is it different? Most BI tools use a no-code drag-and-drop interface. Analysts click around to build their queries, set up their charts, etc., and then drag them into place on a dashboard. To stick with the analogy: if Evidence is Hugo, most BI tools are Squarespace. BI tools are built that way because they assume that data analysts are fundamentally non-technical. In our experience, that assumption is no longer correct. Data analysts increasingly want tools that let them adopt software engineering practices like version control, testing, and abstraction.

    When everything is under version control, you are less likely to ship an incorrect report. When you can write a for loop, you can show a section for each region, product line, etc., instead of asking your users to engage with a filter interface (see the sketch below). When you can abstract a piece of analysis into a reusable component, you don’t have to maintain the same content in multiple places. Basically, we’re providing the fundamentals of programming in a way that analysts can easily make use of.
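
    For example, a loop over a query named regions could generate one section per region. Again a rough sketch; the field names are illustrative:

    ````markdown
    {#each regions as region}

    ## {region.name}

    {region.name} generated {region.revenue} in revenue this quarter.

    {/each}
    ````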

    Reporting tools have been around since COBOL, and have gone through many iterations as tech and markets have evolved. Our view is that it’s time for the next major iteration. We worked together for five years building the data science group at a private equity firm in Canada. We set up ‘the modern data stack’ (Fivetran, dbt, BigQuery etc.) at many of the firm’s portfolio companies and we were in the room during a lot of key corporate decisions.

    In our experience, the BI layer is the weakest part of the modern data stack. The BI layer has a poor developer experience, and decision makers don’t really like the outputs they get. It turns out these two issues are closely related. The drag-and-drop experience is so slow and low-leverage that the only way to get all the content on the page is to push a lot of cognitive load onto the end user: global filters, drill-down modals, grids of charts without context. Like most users, business people hate that shit. And because the production process isn’t in code, the outputs are hard to version control and test, so dashboards break, results are internally inconsistent, and so on, in just the way that software would suck if you didn’t version control and test it.

    As early adopters of the modern data stack, we saw the value in treating analytics more like software development, but we were consistently disappointed with the workflow and the quality of the outputs our team could deliver using BI tools and notebook products. Graphics teams that we admire at newspapers like the New York Times don’t use BI tools or Jupyter notebooks to present their work. They code their data products by hand, and the results are dramatically better than what you see in a typical BI deployment. That’s too much of an engineering lift for most data teams, but with a framework designed for their needs and their range of expertise, we think data teams could build products that come much closer to those high standards.

    Evidence is built on Svelte and SvelteKit, the JS framework the NYT has used to build some of its more recent data products, like its Covid risk maps. Sean and I fell in love with Svelte, and we owe a huge debt to that project. At this early stage, Evidence is really just a set of convenience features wrapped around SvelteKit to make it accessible to data analysts (the markdown preprocessor, db connections, the chart library). The core framework will always be open source, and eventually we plan to launch a paid cloud version of our product, including hosting, granular access control, and other features that enterprises might pay for.

    We would love to hear your thoughts, questions, concerns, or ideas about what we’re building - or about your experiences with business intelligence in general. We appreciate all feedback and suggestions!

  • rmarkdown

    Dynamic Documents for R

    Have you heard of knitr (https://yihui.org/knitr/)? It's the gold standard as far as I'm concerned for dynamic report generation that needs to run code. Since it supports running arbitrary shell commands, it can already be used to query remote databases as long as you have a CLI to query them with. Combined with RMarkdown (https://rmarkdown.rstudio.com/), which augments Markdown with support for LaTeX typesetting, it's the ultimate toolset for doing this kind of thing. You can read a blog post here on how to use knitr within RMarkdown: https://kbroman.org/knitr_knutshell/pages/Rmarkdown.html
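
    For example, here's roughly what that workflow looks like: a shell chunk that pulls data with a database CLI (psql here, but any CLI works), and an R chunk that plots the result. The table and column names are made up:

    ````markdown
    ---
    title: "Orders by Month"
    output: html_document
    ---

    ```{bash}
    # Query the database with whatever CLI you have; psql is just an example
    psql "$DATABASE_URL" --csv -c "
      select date_trunc('month', created_at)::date as month,
             count(*) as orders
      from orders group by 1 order by 1" > orders.csv
    ```

    ```{r}
    orders <- read.csv("orders.csv")
    plot(as.Date(orders$month), orders$orders, type = "l",
         xlab = "Month", ylab = "Orders")
    ```

    In total there were `r sum(orders$orders)` orders.
    ````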

    I'm not trying to be a downer, but it seems like your product just duplicates the functionality of these existing tools while doing less, since it only supports SQL and Markdown.

    I guess you autogenerate charts, but you say you're targeting a technical audience, one that is presumably comfortable calling Python or R functions for data visualization.

    This is nitpicky, and I'm sure you have some command line option to choose another port (though your "get started" doesn't show how), but mdbook also uses 3000. I'm sure they probably weren't the first to default to that, either.

    I hope this doesn't come across as downplaying your product. It looks nice. I just don't see what you offer here that can't already be done with existing data ecosystem tools. I was using RMarkdown with knitr to generate all of my papers when I was an ML grad student years ago. It felt back then like I was the only person at Georgia Tech who realized these tools existed, and now it still feels that way.

  • metriql

    The metrics layer for your data. Join us at https://metriql.com/slack

    We use the BSL license, and metriql is free with a single database target. If you want to connect multiple dbt projects in a single deployment, you need to go through the sales cycle.

    We work with ETL vendors that use metriql to generate revenue through our BI tool integrations, so we picked the BSL license to structure our business model in a way where you're required to pay only if you're reselling metriql to your customers.

    You can find the license here: https://github.com/metriql/metriql
