PEGTL
Hopscotch map
Our great sponsors
PEGTL | Hopscotch map | |
---|---|---|
12 | 3 | |
1,861 | 698 | |
1.5% | - | |
7.2 | 3.7 | |
5 days ago | 7 months ago | |
C++ | C++ | |
Boost Software License 1.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
PEGTL
-
Show HN: Matcheroni, a tiny C++20 header library for building lexers/parsers
Very cool, and I like the name!
I'd be interested in reading about how Matcheroni compares with PEGTL and Lexy.
https://github.com/taocpp/PEGTL
-
Use PEGTL to remove my clunky homemade parser
I found a library I wanted to test: Pegtl
-
What are some cool modern libraries you enjoy using?
I like PEGTL
-
Are C/C++ developers allowed to import libraries to make coding easier or are they expected to build every functions and methods from scratch (without importing anything like String.h)?
Sure - libraries that are expected to be entirely self-contained. The one that comes to mind is PEGTL, a parser combinator library that is intended to be embedded inside a larger program. Making it import more dependencies would break this philosophy. Similarly, in the Rust world, there are a variety of "no-std" crates that should be able to be imported even if the standard library is not available on the target platform.
-
TIL: Visual Studio has quantum state values 🤨
The program in the post was just an example meant to illustrate the problem. Originally, this (new) behavior of MSVC broke my code in the PEGTL, see [this commit](https://github.com/taocpp/PEGTL/commit/e3c8cb499dc3d1d76d23f2d5d79469dcb15550c5) that I needed to apply to fix it.
-
We Built a C++ Rendering Engine for the Web
As a professional C++ programmer I feel a lot of the reasons C++ gets this response is because it's simply not "batteries included" like Go or Rust.
C++ is a very powerful, unopinionated language, that gives you a lot of freedom to attack your problem domain the way you best see fit.
If you're writing a networked application, don't use POSIX sockets, go and find a higher level library. If you're parsing complex text formats, don't iterate over buffers with char*'s, go pick up PEGTL[0]. If you're working on graphs, or need to properly index in-memory data, go pick up Boost[1][2]. If you need a GUI, go pick up Qt.
It's extremely common in C++, due to the lack of a universal package management solution, for people to try and "muddle through" and do shit themselves when it's far outside their core competency.
At one of my last employers, the core product was parsing JSON with std::regex, simply because they couldn't be bothered to integrate a JSON library.
[0] https://github.com/taocpp/PEGTL
[1] https://www.boost.org/doc/libs/1_76_0/libs/graph/
[2] https://www.boost.org/doc/libs/1_76_0/libs/multi_index/doc/i...
-
Is there anything like sly for C++?
You are looking for Boost.Spirit (https://www.boost.org/doc/libs/1_76_0/libs/spirit/doc/x3/html/index.html) or PEGTL (https://github.com/taocpp/PEGTL)
-
Why no more Lex/Yakk/ANTLR/whatever?
I personally prefer to use parsing combinator libraries in C++, where the "grammar" is just part of normal C++ and directly integrate. Examples are Boost.Spirit, pegtl, or (my own) lexy.
- Rust's Most Unrecognized Contributor
-
Wondered if anyone is interested in a c++ parser combinators library?
While I'm not quite sure how this might transfer to your approach, with your Haskell-inspired style being quite different from our C++ templates, in the PEGTL our equivalent to your Char, which is called one, is variadic (true to the T in PEGTL a variadic template) and takes a list of possible matches.
Hopscotch map
-
boost::unordered map is a new king of data structures
Unordered hash map shootout CMAP = https://github.com/tylov/STC KMAP = https://github.com/attractivechaos/klib PMAP = https://github.com/greg7mdp/parallel-hashmap FMAP = https://github.com/skarupke/flat_hash_map RMAP = https://github.com/martinus/robin-hood-hashing HMAP = https://github.com/Tessil/hopscotch-map TMAP = https://github.com/Tessil/robin-map UMAP = std::unordered_map Usage: shootout [n-million=40 key-bits=25] Random keys are in range [0, 2^25). Seed = 1656617916: T1: Insert/update random keys: KMAP: time: 1.949, size: 15064129, buckets: 33554432, sum: 165525449561381 CMAP: time: 1.649, size: 15064129, buckets: 22145833, sum: 165525449561381 PMAP: time: 2.434, size: 15064129, buckets: 33554431, sum: 165525449561381 FMAP: time: 2.112, size: 15064129, buckets: 33554432, sum: 165525449561381 RMAP: time: 1.708, size: 15064129, buckets: 33554431, sum: 165525449561381 HMAP: time: 2.054, size: 15064129, buckets: 33554432, sum: 165525449561381 TMAP: time: 1.645, size: 15064129, buckets: 33554432, sum: 165525449561381 UMAP: time: 6.313, size: 15064129, buckets: 31160981, sum: 165525449561381 T2: Insert sequential keys, then remove them in same order: KMAP: time: 1.173, size: 0, buckets: 33554432, erased 20000000 CMAP: time: 1.651, size: 0, buckets: 33218751, erased 20000000 PMAP: time: 3.840, size: 0, buckets: 33554431, erased 20000000 FMAP: time: 1.722, size: 0, buckets: 33554432, erased 20000000 RMAP: time: 2.359, size: 0, buckets: 33554431, erased 20000000 HMAP: time: 0.849, size: 0, buckets: 33554432, erased 20000000 TMAP: time: 0.660, size: 0, buckets: 33554432, erased 20000000 UMAP: time: 2.138, size: 0, buckets: 31160981, erased 20000000 T3: Remove random keys: KMAP: time: 1.973, size: 0, buckets: 33554432, erased 23367671 CMAP: time: 2.020, size: 0, buckets: 33218751, erased 23367671 PMAP: time: 2.940, size: 0, buckets: 33554431, erased 23367671 FMAP: time: 1.147, size: 0, buckets: 33554432, erased 23367671 RMAP: time: 1.941, size: 0, buckets: 33554431, erased 23367671 HMAP: time: 1.135, size: 0, buckets: 33554432, erased 23367671 TMAP: time: 1.064, size: 0, buckets: 33554432, erased 23367671 UMAP: time: 5.632, size: 0, buckets: 31160981, erased 23367671 T4: Iterate random keys: KMAP: time: 0.748, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 CMAP: time: 0.627, size: 23367671, buckets: 33218751, repeats: 8, sum: 4465059465719680 PMAP: time: 0.680, size: 23367671, buckets: 33554431, repeats: 8, sum: 4465059465719680 FMAP: time: 0.735, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 RMAP: time: 0.464, size: 23367671, buckets: 33554431, repeats: 8, sum: 4465059465719680 HMAP: time: 0.719, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 TMAP: time: 0.662, size: 23367671, buckets: 33554432, repeats: 8, sum: 4465059465719680 UMAP: time: 6.168, size: 23367671, buckets: 31160981, repeats: 8, sum: 4465059465719680 T5: Lookup random keys: KMAP: time: 0.943, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 CMAP: time: 0.863, size: 23367671, buckets: 33218751, lookups: 34235332, found: 29040438 PMAP: time: 1.635, size: 23367671, buckets: 33554431, lookups: 34235332, found: 29040438 FMAP: time: 0.969, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 RMAP: time: 1.705, size: 23367671, buckets: 33554431, lookups: 34235332, found: 29040438 HMAP: time: 0.712, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 TMAP: time: 0.584, size: 23367671, buckets: 33554432, lookups: 34235332, found: 29040438 UMAP: time: 1.974, size: 23367671, buckets: 31160981, lookups: 34235332, found: 29040438
-
Yes, this is embarrassingly slow .so I solved your problem
the map member used for the lookups is a tsl::hopscotch_map (https://github.com/Tessil/hopscotch-map), which is a proper hash map. so it seems to be the latter, that the API is wrong, but from what I can tell it is only a wrongly named class. i don't see where the API makes guarantees about iteration order, which is where the implementation difference would be noticeable (beyond performance for lookup).
-
Any suggestions for resources to optimize for memory allocation/reallocation?
using an open-addressing hash table, such as abseil flat_hash_map or tessil/hopscotch-map
What are some alternatives?
lexy - C++ parsing DSL
C++ B-tree - Git mirror of the official (mercurial) repository of cpp-btree
cpp-peglib - A single file C++ header-only PEG (Parsing Expression Grammars) library
sparsehash-c11 - Experimental C++11 version of sparsehash
spirit - Boost.org spirit module
sparsehash - C++ associative containers
Optional Argument in C++ - Named Optional Arguments in C++17
pybind11 - Seamless operability between C++11 and Python
Hashmaps - Various open addressing hashmap algorithms in C++
sparsepp - A fast, memory efficient hash map for C++
robin-map - C++ implementation of a fast hash map and hash set using robin hood hashing