| | datasets | sqlfluff |
|---|---|---|
| Mentions | 15 | 35 |
| Stars | 18,443 | 7,219 |
| Growth | 1.0% | 1.2% |
| Activity | 9.5 | 9.6 |
| Last commit | 1 day ago | 7 days ago |
| Language | Python | Python |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
datasets
- 23 issues to grow yourself as an exceptional open-source Python expert
- Mastering ROUGE Matrix: Your Guide to Large Language Model Evaluation for Summarization with Examples
-
How to Train Large Models on Many GPUs?
https://github.com/huggingface/datasets
https://github.com/huggingface/transformers
-
[D] Can we use Ray for distributed training on Vertex AI? Can someone provide examples? Also, which dataframe libraries have you used for training machine learning models on huge datasets (100 GB+), since pandas can't handle data that large?
https://huggingface.co/docs/datasets backed with an Arrow file or buffer
- Need help with a data science project
-
Is there a text evaluation metric that does not need reference text?
I'm looking for an automatic evaluation metric that can score the first text higher (since it's more grammatically correct/better for other reasons). All the metrics for NLG I found require some reference text to match the generated text with, which I don't have.
-
FauxPilot - an open-source GitHub Copilot server
And then pass that my_code.json as the dataset name.
[1] https://github.com/huggingface/datasets
-
Hugging Face Introduces "Datasets": A Lightweight Community Library For Natural Language Processing (NLP)
Code for https://arxiv.org/abs/2109.02846 found: https://github.com/huggingface/datasets
Quick Read | Paper | Github
- Datasets: A Community Library for Natural Language Processing
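The FauxPilot mention above talks about passing a local `my_code.json` file as the dataset name. A minimal sketch of producing such a file follows, assuming a JSON Lines layout with a single `"text"` field per record; the field name and both helper functions are hypothetical and are not taken from the quoted comment.

```python
import json
from pathlib import Path


def write_code_dataset(snippets, path):
    """Write one JSON object per line, each holding a code snippet.

    Assumption: the "text" field name is illustrative only; use whatever
    field your training script expects.
    """
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for snippet in snippets:
            # json.dumps escapes newlines, so each record stays on one line.
            f.write(json.dumps({"text": snippet}) + "\n")
    return path


def read_code_dataset(path):
    """Read the JSON Lines file back into a list of dicts."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Assumption: with the Hugging Face `datasets` package installed, a file
# like this can typically be loaded with
#     load_dataset("json", data_files="my_code.json")
```

Calling `write_code_dataset(["print('hi')"], "my_code.json")` writes a one-record file that `read_code_dataset("my_code.json")` reads back unchanged.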
sqlfluff
-
23 issues to grow yourself as an exceptional open-source Python expert
Repo : https://github.com/sqlfluff/sqlfluff
-
SQL Reserved Words - The Empirical List
I'm surprised sqlfluff hasn't been mentioned yet. Perhaps not a comprehensive list, but it's worked for everything I've thrown at it. There's an ANSI keyword list [0], and then dialect-specific lists for everything from DB2 [1] to Snowflake [2].
[0]: https://github.com/sqlfluff/sqlfluff/blob/main/src/sqlfluff/...
-
Show HN: Postgres Language Server
It has tons of annoying quirks, but I couldn't imagine running a DBT project without it: https://github.com/sqlfluff/sqlfluff
-
Front page news headline scraping data engineering project
Move SQL queries to sql files and read from files (Use sqlfluff to lint the code https://github.com/sqlfluff/sqlfluff)
- Anything like SQLFluff written in Rust?
-
Code autoformatter for SQL in VSCode that plays nicely with dbt
SQLFluff is a good CLI tool for this and includes support for jinja and dbt. I don't think there's a VSCode plugin for it yet.
-
Ask HN: How do you test SQL?
This linter can really enforce some best practices https://github.com/sqlfluff/sqlfluff
A list of best practices:
-
What is something you would learn at college but not a bootcamp (hard skills)
BigQuery SQL and SQLFluff
-
Is the knowledge on how Compilers work applicable to the role of a Data Engineer?
There's a SQL parser/linter called SQLFluff that my team uses for our CI/CD. I've made a few pull requests to fix the parser for the particular SQL dialect we used, and my college compiler classes definitely helped.
-
sqlfluff VS ANTLR - a user suggested alternative
2 projects | 12 Dec 2022
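One of the mentions above suggests moving SQL queries into `.sql` files and reading them from there so they can be linted. A minimal sketch of that pattern, using only the standard library, might look like the following; the `load_query` and `run_query` helpers and the one-file-per-query layout are assumptions for illustration. Files stored this way can then be checked from the command line with something like `sqlfluff lint <directory> --dialect ansi`.

```python
import sqlite3
from pathlib import Path


def load_query(sql_dir, name):
    """Return the text of <sql_dir>/<name>.sql (hypothetical layout)."""
    return (Path(sql_dir) / f"{name}.sql").read_text(encoding="utf-8")


def run_query(conn, sql_dir, name, params=()):
    """Load a named query from disk and execute it with bound parameters."""
    return conn.execute(load_query(sql_dir, name), params).fetchall()
```

Keeping each query in its own file means the linter sees plain SQL rather than strings embedded in Python, and parameters are still bound at execution time instead of being formatted into the query text.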
What are some alternatives?
sentence-transformers - Multilingual Sentence & Image Embeddings with BERT
vscode-sqlfluff - An extension to use the sqlfluff linter in vscode.
datumaro - Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.
sqlparse - A non-validating SQL parser module for Python
cypress-realworld-app - A payment application to demonstrate real-world usage of Cypress testing methods, patterns, and workflows.
dbt-utils - Utility functions for dbt projects.
edex-ui - A cross-platform, customizable science fiction terminal emulator with advanced monitoring & touchscreen support.
ale - Check syntax in Vim/Neovim asynchronously and fix files, with Language Server Protocol (LSP) support
first-contributions - Help beginners to contribute to open source projects
soda-sql - Data profiling, testing, and monitoring for SQL accessible data.
frankmocap - A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator
Metabase - The simplest, fastest way to get business intelligence and analytics to everyone in your company :yum: