Open-australian-legal-corpus-creator Alternatives
Similar projects and alternatives to open-australian-legal-corpus-creator
-
DragonBreath
"DragonBreath F.10 (American Constitutional Supreme and Mandatory Primary Source Case/Common Law Bulk Parser)" (by wethepeopleonline)
open-australian-legal-corpus-creator reviews and mentions
-
Show HN: Mapping almost every law, regulation and case in Australia
Hey HN,
After months of hard work, I am excited to share the first ever semantic map of Australian law.
My map represents the first attempt to map Australian laws, cases and regulations across the Commonwealth, States and Territories semantically, that is, by their underlying meaning.
Each point on the map is a unique document in the [Open Australian Legal Corpus](https://huggingface.co/datasets/umarbutler/open-australian-l...), the largest open database of Australian law (which, full disclosure, I [created](https://umarbutler.com/how-i-built-the-largest-open-database...)). The closer any two points are on the map, the more similar they are in underlying meaning.
As I cover in my article, there’s a lot you can learn by mapping Australian law. Some of the most interesting insights to come out of this initiative are that:
⦁ Migration, family and substantive criminal law are the most isolated branches of case law on the map;
⦁ Migration, family and substantive criminal law are the most distant branches of case law from legislation on the map;
⦁ Development law is the closest branch of case law to legislation on the map;
⦁ Case law is more of a continuum than a rigidly defined structure and the borders between branches of case law can often be quite porous; and
⦁ The map does not reveal any noticeable distinctions between Australian state and federal law, whether it be in style, principles of interpretation or general jurisprudence.
If you’re interested in learning more about what the map has to teach us about Australian law or if you’d like to find out how you can create semantic maps of your own, check out the full article on my blog, which provides a detailed analysis of my map and also covers the finer details of how I built it, with code examples offered along the way.
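The idea that closer points are more similar in meaning can be illustrated with a minimal sketch. This is not the pipeline from the article: it assumes each document has already been turned into an embedding vector, and the three vectors and the names `doc_a`, `doc_b`, `doc_c`, `cosine_similarity` and `nearest` are hypothetical, chosen only for illustration. Cosine similarity is used here as one common way to compare embeddings; the article may use a different measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings; real document embeddings would have
# hundreds of dimensions and come from an embedding model.
embeddings = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.1],
    "doc_c": [0.8, 0.2, 0.1],
}


def nearest(name, vectors):
    """Return the name of the other document most similar to `name`."""
    target = vectors[name]
    others = (key for key in vectors if key != name)
    return max(others, key=lambda key: cosine_similarity(target, vectors[key]))


closest_to_a = nearest("doc_a", embeddings)
```

On a real map, documents that score highly against each other under a measure like this would be expected to sit near one another once the vectors are projected down to two dimensions.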
-
I built the largest open database of Australian law
> Just one note - the link in your Github readme to https://umarbutler.com/open-australian-legal-corpus doesn't seem to go anywhere.
Thanks for the heads up! I've fixed that now.
> For someone interested in using the data (and help out with bugs/issues), where would you suggest starting?
I think the best place to start is by downloading the Corpus (visit https://huggingface.co/datasets/umarbutler/open-australian-l... , click "Files and versions" and then "corpus.jsonl"). You can then use my Python library orjsonl to parse the dataset (you'd run `corpus = orjsonl.load('corpus.jsonl')`). At that point, there are any number of applications you could use the dataset for. You could pretrain a model like BERT or ELECTRA and share it on HuggingFace. You could connect the dataset to GPT and do RAG over it. And so on.
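For readers who want to see what parsing a JSON Lines file involves, here is a minimal sketch using only the Python standard library in place of orjsonl. The helper `load_jsonl`, the sample records and the field names `type` and `text` are assumptions made for illustration; check the dataset card for the Corpus's actual schema.

```python
import json
import os
import tempfile
from collections import Counter


def load_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with open(path, encoding="utf-8") as file:
        for line in file:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Hypothetical stand-in records; the real corpus.jsonl is far larger
# and its field names may differ.
sample = [
    {"type": "decision", "text": "Sample judgment text."},
    {"type": "primary_legislation", "text": "Sample Act text."},
    {"type": "decision", "text": "Another sample judgment."},
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "corpus.jsonl")
    # Write one JSON object per line, which is all JSON Lines requires.
    with open(path, "w", encoding="utf-8") as file:
        for record in sample:
            file.write(json.dumps(record) + "\n")
    corpus = load_jsonl(path)

# Count documents per (assumed) document type.
counts = Counter(record["type"] for record in corpus)
```

For a file as large as the full Corpus, reading line by line and processing each record as it arrives would use far less memory than building one list.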
-
Show HN: I created a first-of-its-kind open corpus of Australian law
Hey HN, today I'm sharing my latest project, the Open Australian Legal Corpus, a first-of-its-kind multijurisdictional open corpus of Australian legislative and judicial documents. The idea behind this dataset was born a few months ago, when, while attempting to pretrain a BERT model for the Australian legal domain, I discovered that there was no freely accessible, openly licensed text corpus of Australian laws and cases that I could use. This was in contrast to the US, UK and EU, which all had multiple large open legal corpora available. Thus, I set out to fill the gap in Australian legal AI research by compiling a dataset of as many in-force Australian laws, regulations, bills and decisions as I could find. The end product was a corpus of 97,750 texts totalling over forty million lines and half a billion tokens, and spanning five states, one external territory and the Commonwealth.
You can view the corpus on [HuggingFace](https://huggingface.co/datasets/umarbutler/open-australian-l...) and the code used to create it on [GitHub](https://github.com/umarbutler/open-australian-legal-corpus-c...).
Stats
umarbutler/open-australian-legal-corpus-creator is an open source project licensed under MIT License which is an OSI approved license.
The primary programming language of open-australian-legal-corpus-creator is Python.
Popular Comparisons
- open-australian-legal-corpus-creator VS realestate-com-au-api
- open-australian-legal-corpus-creator VS open-australian-legal-corpus-c
- open-australian-legal-corpus-creator VS licensee
- open-australian-legal-corpus-creator VS Chinese-Names-Corpus
- open-australian-legal-corpus-creator VS Actions
- open-australian-legal-corpus-creator VS ua-gec
- open-australian-legal-corpus-creator VS a2jauthor
- open-australian-legal-corpus-creator VS DragonBreath