I built the largest open database of Australian law

open-australian-legal-corpus-creator

3 57 8.3 Python

The code used to create and update the Open Australian Legal Corpus, the first and only multijurisdictional open corpus of Australian legislative and judicial documents.

> Just one note - the link in your GitHub readme to https://umarbutler.com/open-australian-legal-corpus doesn't seem to go anywhere.
Thanks for the heads up! I've fixed that now.
> For someone interested in using the data (and helping out with bugs/issues), where would you suggest starting?
I think the best place to start is to download the Corpus: visit https://huggingface.co/datasets/umarbutler/open-australian-l... , click "Files and versions" and then "corpus.jsonl". You can then parse the dataset with my Python library orjsonl by running `corpus = orjsonl.load('corpus.jsonl')`. From there, you could use the dataset for any number of applications: you could pretrain a model like BERT or ELECTRA and share it on Hugging Face, or you could connect the dataset to GPT and do retrieval-augmented generation (RAG) over it.
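
A minimal sketch of that first step, for anyone who wants to try it before downloading the full file. It uses `orjsonl.load` as named in the reply when the library is installed, and otherwise falls back to the standard `json` module. The sample records, the field names `type` and `text`, and the helpers `load_corpus` and `count_by` are assumptions made up for illustration; check the dataset card for the real schema.

```python
import json
import os
import tempfile
from collections import Counter

try:
    import orjsonl  # the library named in the post; optional here
except ImportError:
    orjsonl = None


def load_corpus(path):
    """Return every record in a JSON Lines file as a list of dicts."""
    if orjsonl is not None:
        return orjsonl.load(path)
    # Fallback: one JSON object per non-empty line, parsed with the standard library.
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def count_by(records, field):
    """Count records by the value of one field; missing values count as 'unknown'."""
    return Counter(record.get(field, "unknown") for record in records)


# Stand-in for corpus.jsonl: three invented records with hypothetical fields.
sample_records = [
    {"type": "decision", "text": "Placeholder judgment text."},
    {"type": "decision", "text": "Another placeholder judgment."},
    {"type": "primary_legislation", "text": "Placeholder Act text."},
]
sample_path = os.path.join(tempfile.mkdtemp(), "sample.jsonl")
with open(sample_path, "w", encoding="utf-8") as f:
    for record in sample_records:
        f.write(json.dumps(record) + "\n")

corpus = load_corpus(sample_path)
type_counts = count_by(corpus, "type")
print(len(corpus), dict(type_counts))
```

To run this against the real download, replace `sample_path` with the path to `corpus.jsonl`. Loading the whole file into a list may need a lot of memory, so reading it line by line could be a better fit on a small machine.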


Related posts

Show HN: Mapping almost every law, regulation and case in Australia
1 project | news.ycombinator.com | 22 Mar 2024

[N] Grammarly releases a grammatical error correction (GEC) dataset for the Ukrainian language
1 project | /r/MachineLearning | 6 Apr 2021

Unconventional Aggregation
1 project | news.ycombinator.com | 29 Apr 2024

How Slow Is Database IO？
1 project | news.ycombinator.com | 25 Apr 2024

Computing Engine on Web
1 project | news.ycombinator.com | 22 Apr 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Tags: Law, australia, Corpus, Dataset, Legal
Post date: 29 Oct 2023
