-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
There are a few papers on watermarking LLM output, but from what I have seen they all use complex methods of detection to allow the watermark to go unseen by the end user, only to be detected by algorithm. I believe that a more overt system of watermarking might also be beneficial. One simple method that I have tried is character substitution. For this model, I LORA finetuned openlm-research/open_llama_3b on the alpaca_data_cleaned_archive.json dataset from https://github.com/tloen/alpaca-lora/ modified by replacing all instances of the "." character in the outputs with a "ι" The results are pretty good, with the correct the correct substitutions being generated by the model in most cases. It doesn't always work, but this was only a LORA training and for two epochs of 400 steps each, and 100% substitution isn't really required.