Our great sponsors
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
java-probabilistic-earley-parser
🎲 Efficient Java implementation of the probabilistic Earley algorithm to parse Stochastic Context Free Grammars (SCFGs)
Next with some research I read Parsing With Derivatives, with an implementation in scala. Unfortunately, this still appears to have exponential runtime; my initial test target string of ~20 characters took an hour to complete.
Next, I tried a general GLL parser. This, however, required more memory than my 16GB macbook could provide. A 256G AWS cloud instance proved insufficient, as did a 1TB cloud instance. No good.
Finally, I looked for java libraries for Earley parsers, and found this one. Constructing the grammar was straightforward, and I had the advantage of being able to "weight" each production rule with a 1-epsilon probability, meaning it would selectively produce the shortest possible parse derivation first.