-
citation-file-format
The Citation File Format lets you provide citation metadata for software or datasets in plaintext files that are easy to read by both humans and machines.
TLDR: Think of it like Zotero having field names that change depending on the item type. If you select "Journal Article", the CSL `container-title` field is displayed as "Publication", whereas for a book chapter, it's "Book Title". The `software` type in CSL gets a similar treatment in Zotero, and CFF is like that but as a YAML schema for writing by hand. Additionally, you get to give reference data for the dependencies or datasets you built the present one from.
The main page for CFF does not argue for its existence very effectively. You're better off reading the schema (https://github.com/citation-file-format/citation-file-format...). Two takeaways:
1. It is specifically for citing software and datasets. Only those two things. The fields look useful for that purpose: you can have multiple DOIs, one for each published version of the software. You can include a few more kinds of URL than most formats allow, like zipped repo contents for a version or a dataset download link (repository-artifact) etc. If you use the preferred-citation field to point people to a paper instead, you are warned that this goes against the principle of citing software and datasets in their own right, as if they were papers themselves.
2. Unfortunately, though, because the schema only has enough fields for citing those two, if your software is (e.g.) an implementation of an algorithm described in a paper, you cannot express that in CFF. There is a `references` field, but in reality it can only contain other software and datasets because CFF can't describe anything else. It would be better if there were separate fields for CFF software/dataset references and other kinds of reference data, the latter incorporating CSL-JSON by reference. CSL-JSON isn't really written by hand except by a handful of people making CSL tests (me!); the point of such a field would be a space to dump an export from your reference library.
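To make the fields from point 1 concrete, here is a rough sketch of the data a YAML parser might hand you for a small CITATION.cff file, written as a plain Python dict. The key names follow my reading of the schema; every value (title, DOIs, URLs, names) is made up, and `dois_by_description` is a hypothetical helper, not part of any CFF tooling.

```python
# Hypothetical metadata: the dict a YAML parser could produce from a
# CITATION.cff file. All values below are invented for illustration.
citation = {
    "cff-version": "1.2.0",
    "message": "If you use this software, please cite it as below.",
    "type": "software",
    "title": "example-tool",
    "version": "2.1.0",
    "date-released": "2021-08-01",
    "authors": [{"family-names": "Doe", "given-names": "Jane"}],
    # Several DOIs, one per published version.
    "identifiers": [
        {"type": "doi", "value": "10.0000/example.1", "description": "Version 1.0.0"},
        {"type": "doi", "value": "10.0000/example.2", "description": "Version 2.1.0"},
        {"type": "url", "value": "https://example.org/example-tool"},
    ],
    "repository-code": "https://example.org/example-tool.git",
    # Zipped repo contents for this version.
    "repository-artifact": "https://example.org/example-tool-2.1.0.zip",
    # Things this software was built from.
    "references": [
        {"type": "dataset", "title": "example-dataset",
         "authors": [{"name": "Example Lab"}]},
    ],
}


def dois_by_description(cff):
    """Return {description: DOI} for every identifier whose type is 'doi'."""
    return {
        item.get("description", ""): item["value"]
        for item in cff.get("identifiers", [])
        if item["type"] == "doi"
    }
```

Calling `dois_by_description(citation)` picks out the two DOI entries and skips the plain URL identifier.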
So in sum, what is the advantage of this?
- Anything but JSON. People hate writing it by hand.
- Software-specific field names, whereas if you used CSL-JSON directly `date-released` would be `issued`... who issues software?
- Separates the "main" citation and the "references", whereas other formats are basically a dump of a reference library.
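The field-naming point is easiest to see as a mapping. Below is a minimal sketch of translating a few CFF keys to their CSL-JSON counterparts, assuming the input is an already-parsed dict. `cff_to_csl` and the sample record are hypothetical, only a handful of fields are covered, and passing `type` straight through is an assumption on my part, not how any real converter necessarily does it.

```python
from datetime import date


def cff_to_csl(cff):
    """Map a few CFF keys onto CSL-JSON names (illustrative, not complete)."""
    released = date.fromisoformat(cff["date-released"])
    authors = []
    for a in cff.get("authors", []):
        if "name" in a:
            # Organisations carry a single `name` in CFF; CSL-JSON uses `literal`.
            authors.append({"literal": a["name"]})
        else:
            authors.append({"family": a["family-names"], "given": a["given-names"]})
    return {
        "type": cff.get("type", "software"),  # assumption: passed through as-is
        "title": cff["title"],
        "version": cff.get("version"),
        "author": authors,
        # `date-released` in CFF becomes the generic `issued` in CSL-JSON.
        "issued": {"date-parts": [[released.year, released.month, released.day]]},
    }


# Made-up record for illustration.
example = {
    "type": "software",
    "title": "example-tool",
    "version": "2.1.0",
    "date-released": "2021-08-01",
    "authors": [
        {"family-names": "Doe", "given-names": "Jane"},
        {"name": "Example Lab"},
    ],
}
csl_item = cff_to_csl(example)
```

The CFF side reads like a description of a software release, while the CSL-JSON side reuses the generic vocabulary (`issued`, `author`) that every item type shares.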
(Disclosure: I'm the author of Zotero's next-gen citation processor, citeproc-rs.)
CSL doesn't appear to support software as a 'type' of thing; its types come from a hardcoded list of options[0]. Of course, maybe they should have just fixed an existing format instead of creating a new one.
[0] https://github.com/citation-style-language/schema/blob/maste...