Jevko: a minimal general-purpose syntax

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • easyjevko.lua

    An Easy Jevko library for Lua.

  • > 1. It relies on later stages to do things like convert to native values or remove whitespace. As such an intermediary can't really "understand" the document very well. E.g., if you are using this syntax to express key/value then you can only know the keys if you know the whitespace rules that will be used to interpret the keys.

    From the point of view of Jevko, handling whitespace is a higher-level concern. Indeed, you typically will need additional rules to communicate effectively with Jevko.

    This is where you should specify a format (which should be standardized) that you use, e.g.:

    * https://github.com/jevko/easyjevko.lua

    Note: this is just a simple library I wrote recently that does the most straightforward thing imaginable. I haven't yet written a spec for the format.

    > 2. It reminds me more of XML than S-Expressions. If you are familiar with the ElementTree [1] representation of XML then these parsed examples become more familiar, just a prefix instead of suffix, and no attribute children.

    Yes, Jevko has the features of both XML and S-expressions (or neither, depending on how you look at it). It's supposed to be uniquely suitable for both markup and encoding of data/code in a simple way.

    > 3. Connecting whitespace and XML, I suppose this explains why so many XML formats ended up only using attributes for text values, and used tags and children only for nesting. Otherwise interpretation requires understanding how to interpret these mixed documents with unclear whitespace rules. But Jevko doesn't have attributes.

    A markup format built on Jevko can have the notion of attributes and rules for whitespace, e.g. see https://news.ycombinator.com/item?id=33334774 -- a nice thing about this particular format is that text nodes are explicitly specified, so you know exactly where your significant whitespace goes.

    The nice thing about attributes made with Jevko over XML attributes is that you could naturally make them extensible (which is a pain in XML -- if you want to turn your unstructured string value stored in an attribute into a tree you have a problem).

    > 5. I don't see any escaping rules so including literal [] seems... impossible? You can essentially deserialize the parsed value to reintroduce them, but that's weird and awkward, and only works with balanced brackets anyway.

    That's not correct. There are only 3 special symbols (delimiters) and they can all be escaped. It's all in the specification.
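
    A minimal sketch of what such escaping can look like. It assumes the three special characters are "[", "]" and "`", and that each one is escaped by putting "`" in front of it; the helper names `escapeText` and `unescapeText` are made up for this illustration and do not come from any Jevko library.

    ```javascript
    // Assumption: the three special characters are "[", "]" and "`",
    // and each is escaped by prefixing it with "`".
    const SPECIAL = new Set(["[", "]", "`"]);

    // Hypothetical helper: make arbitrary text safe to embed in a Jevko.
    function escapeText(text) {
      let out = "";
      for (const ch of text) {
        out += SPECIAL.has(ch) ? "`" + ch : ch;
      }
      return out;
    }

    // Hypothetical helper: undo escapeText, rejecting stray escape characters.
    function unescapeText(text) {
      const chars = Array.from(text);
      let out = "";
      for (let i = 0; i < chars.length; i++) {
        if (chars[i] === "`") {
          const next = chars[i + 1];
          if (!SPECIAL.has(next)) throw new SyntaxError("invalid escape at " + i);
          out += next;
          i++; // skip the escaped character
        } else {
          out += chars[i];
        }
      }
      return out;
    }

    console.log(escapeText("a[0] = `x`")); // a`[0`] = ``x``
    ```

    Because every delimiter is escaped on its own, unbalanced brackets such as "[[]" survive a round trip as well.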

    > 6. No way to embed binary data, or otherwise uninterpreted data. JSON has the same problem. Base64 encoding things is yet another way in which the data is not interpretable by intermediaries.

    Indeed, Jevko is not a binary format. Although I've been experimenting with binary equivalents, e.g.:

    https://github.com/jevko/binary-experiments

    At some point I might go forward with one, but that's not the focus right now.

  • easyjevko.js

    A JavaScript library for Easy Jevko -- a simple data format built on Jevko.

  • Responding to some points I left off here https://news.ycombinator.com/item?id=33336789

    I guess the main one is this:

    > If your audience is people like me, I think it would probably be worthwhile for you to spend some time up front describing the intended semantics of a data model, as I've attempted above, rather than leaving people to infer it from the grammar. (Maybe OCaml is not a good way to explain it, though.) You might also want to specify that leading and trailing whitespace in prefixes is not significant, though it is in the suffix ("body"); this would enable people to format their name-value pairs readably without corrupting the data. As far as I can tell, this addendum wouldn't interfere with any of your existing uses for Jevko, though in some cases it would simplify their implementations.

    You're right, things should be explained more clearly (TODO). Especially the exact role of Jevko and treatment of whitespace. I'll try to improve that.

    Here is a sketch of an explanation.

    Plain Jevko is meant to be a low-level syntactic layer.

    It takes care of turning a Unicode sequence into a tree.

    On this level, all whitespace is preserved in the tree.
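
    To make this layer concrete, here is a rough sketch of a parser that turns a string into a tree and keeps every whitespace character. It is not the reference implementation: the function name `parseJevko` and the node shape `{ subjevkos: [{ prefix, jevko }], suffix }` are assumptions chosen for the example, as is the rule that "`" escapes one of the three special characters.

    ```javascript
    // Minimal sketch of a Jevko parser. Assumptions: "[" opens, "]" closes,
    // "`" escapes one of the three special characters; the node shape
    // { subjevkos: [{ prefix, jevko }], suffix } is chosen for illustration.
    function parseJevko(source) {
      const root = { subjevkos: [], suffix: "" };
      const stack = []; // parents of the node currently being filled
      let current = root;
      let text = ""; // text collected since the last bracket

      for (let i = 0; i < source.length; i++) {
        const ch = source[i];
        if (ch === "`") {
          const next = source[i + 1];
          if (next !== "[" && next !== "]" && next !== "`") {
            throw new SyntaxError("invalid escape at index " + i);
          }
          text += next;
          i++; // skip the escaped character
        } else if (ch === "[") {
          // The text before "[" becomes the prefix of a new child, verbatim.
          const child = { subjevkos: [], suffix: "" };
          current.subjevkos.push({ prefix: text, jevko: child });
          stack.push(current);
          current = child;
          text = "";
        } else if (ch === "]") {
          if (stack.length === 0) {
            throw new SyntaxError("unexpected ] at index " + i);
          }
          // The text before "]" becomes the suffix of the node being closed.
          current.suffix = text;
          current = stack.pop();
          text = "";
        } else {
          text += ch;
        }
      }
      if (stack.length > 0) throw new SyntaxError("missing ] at end of input");
      root.suffix = text;
      return root;
    }

    const tree = parseJevko(" name [ Jevko ]\n");
    // tree.subjevkos[0].prefix is " name " and its suffix is " Jevko ":
    // nothing is trimmed at this level.
    console.log(JSON.stringify(tree));
    ```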

    To represent key-value pairs and other data, you most likely want another layer above Jevko -- this would be a Jevko-based format, such as queryjevko (somewhat explained below) or, a very similar one, easyjevko, implemented and very lightly documented here: https://github.com/jevko/easyjevko.js

    Or you could have a markup format, such as https://github.com/jevko/markup-experiments#asttoxml5

    This format layer defines certain restrictions which may make a subset of Jevkos invalid in it.

    It also specifies how to interpret the valid Jevkos. This includes the treatment of whitespace, e.g. that leading or trailing whitespace is insignificant in prefixes, but conditionally significant in suffixes, etc.

    Different formats will define different restrictions and interpretations.

    For example:

    # queryjevko

    queryjevko is a format which uses (a variant of) Jevko as a syntax. Only a subset of Jevko is valid queryjevko.

    > I think this is a more useful level of abstraction, and it's more or less the level used by, for example, queryjevko.js's jevkoToJs, although that erroneously uses () instead of [].

    The `()` are used on purpose -- queryjevko is meant to be used in URL query strings and be readable. If square brackets were used, things like JS' encodeURIComponent would escape them, making the string unreadable. Using `()` solves that. "~" is used instead of "`" for the same reason. So technically we are dealing not with a spec-compliant Jevko, but a trivial variant of it. Maybe I should write a meta-spec which allows one to pick the three special characters before instantiating itself into a spec. Anyway the parser implementation is configurable in that regard, so I simply configure it to use "~()" instead of "`[]".
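
    The effect on URL encoding is easy to check with the standard `encodeURIComponent`. The two sample strings below are made up for the illustration; only the choice of delimiters matters.

    ```javascript
    // Made-up sample data written with the standard delimiters and with
    // the URL-friendly ones.
    const standard = "name[Jevko]tags[[a][b]]";
    const queryStyle = "name(Jevko)tags((a)(b))";

    console.log(encodeURIComponent(standard));
    // name%5BJevko%5Dtags%5B%5Ba%5D%5Bb%5D%5D
    console.log(encodeURIComponent(queryStyle));
    // name(Jevko)tags((a)(b))

    // The same holds for the escape character: "`" is percent-encoded, "~" is not.
    console.log(encodeURIComponent("`"), encodeURIComponent("~"));
    // %60 ~
    ```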

    > (Also, contrary to your assertion above that this is an example of "leaving [Jevko's data model] as-is", it forgets the order of the name-value pairs as well as I guess all but one of any duplicate set of fields with the same name and also the possibility that there could be both fields and a body.)

    I meant [whitespace] rather than [Jevko's data model].

    Again, queryjevko is a format which uses Jevko as an underlying syntax. It specifies how syntax trees are converted to JS values, by restricting the range of valid Jevkos. It also specifies conversion in the opposite direction, likewise placing restrictions on JS values that can be safely converted to queryjevko.

    The order of name-value pairs happens to get preserved (because of the way JS works), but that's not necessarily relevant. If I were to write a cross-language spec for queryjevko, I'd probably specify that this shouldn't be relied upon.

    Duplicate fields and Jevkos with both fields and a non-whitespace body will produce an error when converting Jevko->JS.
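
    A simplified sketch of such a conversion, to show where the two errors come from. This is not the actual queryjevko.js code: the function name `toValue`, the node shape `{ subjevkos: [{ prefix, jevko }], suffix }`, and the restriction to strings and plain objects are all assumptions made for the example.

    ```javascript
    // Simplified sketch, not the actual queryjevko.js code. Assumed node shape:
    // { subjevkos: [{ prefix, jevko }], suffix }.
    function toValue(jevko) {
      const { subjevkos, suffix } = jevko;
      // A node without children is a plain string.
      if (subjevkos.length === 0) return suffix;
      // A node with fields must not also carry a non-whitespace body.
      if (suffix.trim() !== "") {
        throw new Error("fields and body together: " + JSON.stringify(suffix));
      }
      const fields = new Map();
      for (const { prefix, jevko: child } of subjevkos) {
        // Keys are taken verbatim here; a format may choose to trim them.
        if (fields.has(prefix)) {
          throw new Error("duplicate field: " + JSON.stringify(prefix));
        }
        fields.set(prefix, toValue(child));
      }
      return Object.fromEntries(fields);
    }

    const leaf = (suffix) => ({ subjevkos: [], suffix });
    const doc = {
      subjevkos: [
        { prefix: "name", jevko: leaf("Jevko") },
        { prefix: "kind", jevko: leaf("syntax") },
      ],
      suffix: "",
    };
    console.log(JSON.stringify(toValue(doc))); // {"name":"Jevko","kind":"syntax"}
    ```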

    I hope this clarifies things somewhat.

    Lastly, I'll respond to this for completeness:

    > (By the way, if you want to attribute your JSON example for copyright reasons, you need to attribute it to its author or authors, not to the Wikipedia, which is just the site they posted it on.)

    According to this:

    https://en.wikipedia.org/wiki/Wikipedia:Reusing_Wikipedia_co...

    there are 3 options, one of them being what I did, which is to include a link.

    I think that's all.

    Have a good one!

  • specifications

    Specifications related to Jevko. (by jevko)

  • Thank you for your feedback. Can you clarify?

    What is the "first page" that you are referring to?

    Can you paste a link to it along with the broken examples link?

    This Hacker News submission features the blog post under this URL:

    https://djedr.github.io/posts/jevko-2022-02-22.html

    Clearly, you are not talking about this page, as that contains multiple links rather than a singular link.

    Perhaps you are talking about the specification which is here:

    https://github.com/jevko/specifications/blob/master/spec-sta...

    (linked from the blog post)

    and here:

    https://jevko.org/spec.html

    (linked from jevko.org)

    All three link to Jevko examples here:

    https://github.com/jevko/examples

    but all these examples links seem to be correct on my end.

    I agree about the importance of examples, and I try to lead with them on jevko.org and jevko.github.io (which are the front pages of Jevko -- possibly I should merge them into one).

    However a formal specification is not necessarily the place to put the leading examples.

    This is also where the Subjevko rule is defined. It isn't quite introduced as "known knowledge" -- the purpose of a specification is to define the unknown, more or less from the ground up. This is also why specifications tend to get a little abstract. Jevko's spec is no exception. This should be in line with expectations of authors of tools such as parsers, validators, generators, or other kinds of processors, for which the spec is the authoritative reference.

    It is not necessarily the best first place to look for explanation, if you are approaching from a more casual side.

    I agree that from that side a clear picture of what Jevko is and how it can be used is still lacking. I certainly should add more examples and explain the concepts with analogies.

    So I appreciate the essence of your advice and hope I'll manage to improve on that.

  • examples

    Examples of information encoded with Jevko. (by jevko)

  • writing

    A public place for unpolished technical writing. (by jevko)

  • I had the same idea. Simple enough, but still. Brackets are simpler to formalize and implement and not harder to explain.

    > They also put some thought into designing a query language for their rose-tree-like data model, which might be adaptable to Jevko — though they label only nodes, and Jevko labels both nodes (with suffixes) and arcs (with prefixes).

    Yes, that might be interesting to look at, thanks for pointing it out. I have thought about this and came up with some ideas, but haven't decided on anything. I was thinking more along the lines of having the path DSL be simply implemented on top of Jevko, not as a completely separate grammar.

    > Maybe that's the subtitle for Jevko? "A minimal Unicode syntax for ordered trees with labeled nodes and labeled arcs." If that's the intended semantics it would be pretty easy to whip up a diagram in Dot to illustrate it.

    It's a nice description, but I think a little too detailed and technical to fit into a tagline. Maybe a little explanatory article with the diagram included. Would probably look something like this:

    https://github.com/jevko/writing/blob/main/2022-01-10-jevko-...

    Although I'd gladly see your take on it. ;)

  • markup-experiments

    A collection of experiments with Jevko and text markup.

  • binary-experiments

    Experiments with various binary formats based on Jevko.

  • treenotation.org

    TreeNotation.org website

  • > concatenating them changes the label for [b] from "a" to "z\na", and perhaps more damningly, erases the whitespace before "z". But, since none of the alternative formats (except ndjson and I guess plain uninterpreted binary, ASCII, or Unicode) is closed under concatenation, maybe that's less important.

    Yes, being closed under concatenation is a feature I was aiming for and it indeed does bring with it this issue.

    Just something to keep in mind when devising formats. A simple solution here is to disallow having anything other than whitespace in the suffix of a Jevko with > 0 children. Then, if a format converts these labels to keys in a map, trimming leading and trailing whitespace, there is no problem. This is how I did it here:

    https://github.com/jevko/easyjevko.js
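
    The concatenation issue and the trimming fix can be shown with a toy reader. The sketch below only handles flat documents (no nesting, no escapes), and `flatFields` is a hypothetical helper written for this example, not part of easyjevko.js.

    ```javascript
    // Toy reader for flat documents only (no nesting, no escapes); returns
    // [prefix, body] pairs with the prefix kept verbatim.
    function flatFields(source) {
      const pairs = [];
      for (const m of source.matchAll(/([^\[\]]*)\[([^\[\]]*)\]/g)) {
        pairs.push([m[1], m[2]]);
      }
      return pairs;
    }

    // Two documents whose top-level suffix is whitespace only.
    const first = "a[1]\n";
    const second = "b[2]\n";
    const joined = first + second;

    // After concatenation the trailing newline of the first document
    // becomes part of the second document's first prefix.
    const rawKeys = flatFields(joined).map(([prefix]) => prefix);
    // Trimming the keys removes the difference again.
    const trimmedKeys = rawKeys.map((key) => key.trim());

    console.log(JSON.stringify(rawKeys));     // ["a","\nb"]
    console.log(JSON.stringify(trimmedKeys)); // ["a","b"]
    ```

    If the first document ended in non-whitespace text instead, that text would end up inside the key, which is why the suffix is restricted to whitespace.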

    > I don't know if you saw the last time this topic came up I linked to https://ogdl.org/, which seems pretty close to a minimal rose-tree notation.

    Yes, I've seen OGDL before. It's pretty nice. A similar one is https://treenotation.org/

    I have experimented with indentation-based syntaxes myself, before settling on brackets.

    I have found them to be problematic, at least because:

    * For complex structures they become less compact.

    * A grammar that correctly captures significant indentation can't really be written in pure BNF. The way OGDL does it is this:

      [12] space(n) ::= char_space*n ; where n is the equivalent number of spaces (can be 0)

  • jevkalk

    A Jevko-based interpreter.

  • > is doing? It sure looks to me like it's asking whether a symbol (i.e. indivisible atom) ends with an equal sign, which is semantic gibberish.

    There are no symbols or indivisible atoms here.

    What's happening here is parsing. `jevkoToHtml` is a kind of parser-transpiler which operates on a syntax tree, rather than a sequence of characters or tokens.

    The syntax tree is the output of an earlier stage of parsing, done by the Jevko parser.

    So you can think of this as multi-pass parsing, by analogy with multi-pass compilation.

    At the same time as this second pass of parsing is happening, translation to HTML is happening as well.
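
    A rough sketch of what such a second pass can look like. It is not the real `jevkoToHtml`: the function names, the node shape `{ subjevkos: [{ prefix, jevko }], suffix }`, and the convention that a prefix ending in "=" is an attribute while any other prefix is a tag name are assumptions made for the illustration.

    ```javascript
    // Sketch of a second pass over a parsed tree; not the real jevkoToHtml.
    // Assumed node shape: { subjevkos: [{ prefix, jevko }], suffix }.
    // Assumed convention: a prefix ending in "=" is an attribute of the
    // enclosing element, any other prefix is a tag name.
    function escapeHtml(text) {
      return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
    }

    // Interprets the children of one node and translates them in the same walk.
    function renderChildren(jevko) {
      let attrs = "";
      let body = "";
      for (const { prefix, jevko: child } of jevko.subjevkos) {
        const name = prefix.trim();
        if (name.endsWith("=")) {
          attrs += " " + name.slice(0, -1) + '="' + escapeHtml(child.suffix) + '"';
        } else {
          const inner = renderChildren(child);
          body += "<" + name + inner.attrs + ">" + inner.body + "</" + name + ">";
        }
      }
      return { attrs, body: body + escapeHtml(jevko.suffix) };
    }

    const leaf = (suffix) => ({ subjevkos: [], suffix });
    const page = {
      subjevkos: [{
        prefix: "a",
        jevko: {
          subjevkos: [{ prefix: "href=", jevko: leaf("/spec.html") }],
          suffix: "spec",
        },
      }],
      suffix: "",
    };
    console.log(renderChildren(page).body); // <a href="/spec.html">spec</a>
    ```

    The check on the trailing "=" is applied to the text of a prefix in the tree, so it is a parsing decision of this second layer rather than a test on an atom.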

    Hope this clarifies things!

    ---

    [0] To clearly see the point, here is a toy programming language which uses Jevko as its syntax: https://github.com/jevko/jevkalk

  • community

    Features Jevko-related things created by various authors (by jevko)

  • If you ever publish the project, send me a gmail at darius.j.chuck

    I'd love to feature it here: https://github.com/jevko/community

  • jevkodom.js

    Experimental Jevko to browser DOM

  • queryjevko.js

    Functions to convert between complex values and a human-readable format which fits into URL query strings.

  • The grammar of S-exps on the other hand, I won't quote here, but I assure you it's much more complicated. How much depends on your flavor (Jevko is also simpler in this regard: there is only one flavor, clearly specified).

    There is no (intended) ambiguity around whitespace in Jevko: whitespace does not occur explicitly in the grammar. Whitespace characters are just characters. This is the defining feature of the syntax.

    For this reason Jevko is more low-level: if you want to treat whitespace in some special way, you have to do that yourself. Although for most use-cases this is very similar and simple, e.g. https://news.ycombinator.com/item?id=33334314

    But the point is that you can also leave it as-is, e.g.: https://github.com/jevko/queryjevko.js

    or do something else -- it's up to your format.

  • jevko

  • Nice! I wrote a little parser for it (https://github.com/lgastako/jevko). It was fun. I'll have to play with building higher level formats on top of it.

  • interjevko.js

    Experimental Schema-based Minimal Data Interchange with Jevko.

  • yapl

    YAml Programming Language

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Jevko: a minimal general-purpose syntax

    5 projects | /r/programming | 25 Oct 2022
  • Labeled ordered trees encoded with Jevko and visualized with Dot diagrams

    1 project | /r/jevko | 7 Dec 2022
  • Syntax Design

    9 projects | news.ycombinator.com | 18 Oct 2022
  • The Journey Ahead: My 6-Month Plan to Master GoLang

    2 projects | dev.to | 17 Apr 2024
  • ExFAT Driver Boasts Much Faster "Dirsync" Performance with Linux 6.9

    2 projects | news.ycombinator.com | 21 Mar 2024