It's all fully public. Here are some tools that will fetch the links to the negotiated-rate files, along with the size of every file: https://github.com/alecstein/transparency-in-coverage-filesi...
So - veteran data wrangler here. I skimmed some of Humana's files. There's lots of repetition that can easily be removed when converting from a raw input to an analytical dataset - basically the huge blocks of text in "BILL_TYPE_CODE: 130,139,..." in the ADDITIONAL_INFO field can be normalized away by building a quasi-Huffman-encoded lookup table.
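A minimal Python sketch of that lookup-table idea. It is plain dictionary encoding with ids handed out in frequency order (the most common string gets id 0), which is only loosely Huffman-like. The function name and sample rows are made up for illustration and are not taken from Humana's files.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def encode_with_lookup(values: Iterable[str]) -> Tuple[List[int], List[str]]:
    """Replace each distinct string with a small integer id.

    Returns (encoded ids, lookup table). Ids are assigned in descending
    frequency order, so the most repeated text gets the smallest id.
    """
    values = list(values)
    counts = Counter(values)
    lookup = [text for text, _ in counts.most_common()]
    ids = {text: i for i, text in enumerate(lookup)}
    return [ids[v] for v in values], lookup


# Hypothetical ADDITIONAL_INFO values, shortened for the example.
sample_rows = [
    "BILL_TYPE_CODE: 130,139",
    "BILL_TYPE_CODE: 110",
    "BILL_TYPE_CODE: 130,139",
    "BILL_TYPE_CODE: 130,139",
]

encoded, lookup = encode_with_lookup(sample_rows)

# Decoding is a list index, so nothing is lost.
decoded = [lookup[i] for i in encoded]
```

The fact table then stores one integer per row and the long text lives once in the lookup table, which is the same thing a columnar store's dictionary encoding would do.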
Noteworthy(?): there seems to be a limit of 100 sets of prices, as seen in the filenames:
2022-08-25_NN_in-network-rates_0000000XXXXX.csv.gz
Did I miss something? ... or is this some kind of technical limitation for Humana?
Also, each plan member's JSON file has a small chunk of useful information, then a useless list of all 15k gz parts of a relevant NN_in-network-rates file (you only need the first filename to figure out which NN to reference).
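Pulling the NN out of the first filename could look roughly like this. The regex is a guess based only on the filename pattern quoted above: it assumes NN is a run of digits and the last field is a zero-padded part number. The sample filename is invented.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: <date>_<NN>_in-network-rates_<part>.csv.gz
RATE_FILE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"_(?P<rate_set>\d+)"
    r"_in-network-rates_(?P<part>\d+)\.csv\.gz$"
)


class RateFile(NamedTuple):
    date: str
    rate_set: str  # the "NN" in the filename, kept as text
    part: int


def parse_rate_filename(name: str) -> Optional[RateFile]:
    """Split a rates filename into its fields, or return None if it does not match."""
    m = RATE_FILE.match(name)
    if m is None:
        return None
    return RateFile(m.group("date"), m.group("rate_set"), int(m.group("part")))


# Invented filename that follows the quoted pattern.
info = parse_rate_filename("2022-08-25_07_in-network-rates_000000000012.csv.gz")
```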
For these files, you can use Range requests to download only the first, say, 50KB, and pipe it to gunzip and jq. (https://github.com/stedolan/jq/issues/31#issuecomment-900184...)
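The same trick without the shell, as a rough Python sketch using only the standard library. `fetch_prefix` and `gunzip_prefix` are names invented here, and it assumes the server honours `Range` headers (if it does not, the read is still capped at the requested size).

```python
import urllib.request
import zlib


def fetch_prefix(url: str, num_bytes: int = 50_000) -> bytes:
    """Request only the first num_bytes of a remote file via an HTTP Range header."""
    request = urllib.request.Request(
        url, headers={"Range": f"bytes=0-{num_bytes - 1}"}
    )
    with urllib.request.urlopen(request) as response:
        # Cap the read in case the server ignores Range and sends everything.
        return response.read(num_bytes)


def gunzip_prefix(prefix: bytes) -> bytes:
    """Decompress as much as possible from a truncated gzip stream.

    A streaming decompressor is used because gzip.decompress() raises an
    error when the stream is cut off before its trailer.
    """
    # MAX_WBITS | 16 tells zlib to expect a gzip header.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
    return decompressor.decompress(prefix)
```

The decompressed prefix will usually end mid-record, so whatever reads it has to tolerate truncated JSON rather than expect a complete document.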
I would also be interested in helping throw such an analytical dataset into BigQuery. It'll be great for sharing an open dataset. No doubt this will still be a gigantic headache, but it is tractable.
I'm the CEO and cofounder of a health insurer that published its pricing data. See it on GitHub (https://github.com/evryhealth/price-transparency). To add some fuel to this discussion: US healthcare has grown 3x as a percentage of GDP in 60 years, from approx. 5% to more than 15% [1]. It has done so at the expense of the rest of the economy and communities.
Pricing transparency is only one piece of the puzzle. It is a tremendously antiquated industry. Fax is still state of the art -- welcome to the 1980s!
[1] CMS. https://www.cms.gov/Research-Statistics-Data-and-Systems/Sta...
All this banter arguing over CSV, JSON, and SQLite seems unnecessary when you can just push format X through a pipe and get whichever format Y you want back out: https://github.com/liquidaty/zsv/blob/main/docs/csv_json_sql...
(disclaimer: I'm one of the zsv authors)
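To make the "pipe X in, get Y out" point concrete, here is a small stream converter sketched with Python's standard library. It is not zsv and says nothing about how zsv works; the function name and the sample columns are invented for the example.

```python
import csv
import io
import json
from typing import TextIO


def csv_to_jsonl(src: TextIO, dst: TextIO) -> int:
    """Read CSV with a header row from src, write one JSON object per line to dst.

    Rows are streamed one at a time, so memory use does not grow with file size.
    Returns the number of rows written.
    """
    count = 0
    for row in csv.DictReader(src):
        dst.write(json.dumps(row) + "\n")
        count += 1
    return count


# In-memory demo with made-up columns.
source = io.StringIO("code,rate\n99213,81.50\n99214,120.00\n")
sink = io.StringIO()
rows_written = csv_to_jsonl(source, sink)
```

Passing `sys.stdin` and `sys.stdout` instead of the `StringIO` objects turns it into a filter that can sit in a shell pipeline.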