amzn-drivers
Aerospike
 | amzn-drivers | Aerospike |
---|---|---|
Mentions | 4 | 16 |
Stars | 441 | 972 |
Growth | 0.7% | 1.6% |
Activity | 9.1 | 8.7 |
Last commit | 17 days ago | 5 days ago |
Language | C | C |
License | - | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
amzn-drivers
-
Looking for programmer volunteers who want to contribute/learn about low level C++, Linux, Networking, high frequency trading.
Amazon (AWS) cloud EC2 instance-specific role (kernel and user space networking, Linux OS related). Amazon has its own network card with its own Linux driver (open source); for user space they use DPDK (open source). https://github.com/amzn/amzn-drivers I've measured the time between calling TCP send in software and the packet leaving the NIC (network card): it is around 50 microseconds of latency, and AWS also stated in a paper that it is around that number. Goals: figure out how to build from source code and load the kernel; reduce latency.
-
FreeBSD optimizations used by Netflix to serve video at 800Gb/s [pdf]
It means, for example, writing a FreeBSD kernel driver for the Elastic Network Adapter (ENA). Both the Linux kernel driver and the FreeBSD kernel driver are available at https://github.com/amzn/amzn-drivers
-
Dragonflydb – A modern replacement for Redis and Memcached
Of course, there are.
I was mostly running on AWS. In terms of hardware, for small-packet load tests most systems are constrained on throughput, i.e. the number of packets per second. Some systems saturate on interrupts, reaching 100% CPU on all cores, and some cannot even saturate the CPU: you will see that CPU is at 60% but you cannot go beyond some limit. The best systems network-wise are the c6gn family types. They are also better than other cloud providers. btw, you mentioned hypervisors... About 8 months ago I opened a bug with the AWS Graviton team, https://github.com/amzn/amzn-drivers/issues/195, about a performance issue they had on their instances at high throughput. Recently they issued the fix. I suspect it was in their hypervisor.
In terms of my software I found many performance bugs at those speeds. For example, using a default allocator is a big no. I use mimalloc for uncontended allocations. In general, you can not use mutexes and spinlocks at those speeds. Those will just cripple the system. Sometimes it can be very annoying since you can not rely on a 3rd party library without carefully analyzing its design. For example, I could not use openmetrics c++ library because it was not performant enough. Even to implement a simple counter, say to gather statistics for INFO command becomes an interesting engineering problem:
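The excerpt is cut off at this point, so the author's own counter design is not shown. As a rough illustration only, one common way to keep a statistics counter off the hot path is to give every worker thread its own slot and sum the slots when a reader asks for the total. The sketch below assumes that approach; `ShardedCounter` and `run_workers` are hypothetical names, and Python is used only to show the structure (a production version in C or C++ would typically pad each slot to a cache line).

```python
import threading


class ShardedCounter:
    """Counter split into one slot per worker (hypothetical illustration)."""

    def __init__(self, num_shards):
        self._shards = [0] * num_shards

    def add(self, shard_id, n=1):
        # Only the owning worker ever writes to its slot, so no lock is taken.
        self._shards[shard_id] += n

    def total(self):
        # Readers sum all slots; the value may lag slightly while writers run.
        return sum(self._shards)


def run_workers(counter, num_workers, increments_each):
    """Start one thread per shard, let each bump its own slot, return the sum."""

    def work(worker_id):
        for _ in range(increments_each):
            counter.add(worker_id)

    threads = [threading.Thread(target=work, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.total()
```

The trade-off is that reads cost a pass over all slots and are only approximately current, which is usually acceptable for something like an INFO-style statistics command.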
-
Ask HN: Anybody enabled IOMMU on AWS metal servers?
https://doc.dpdk.org/guides/nics/ena.html
and:
https://github.com/amzn/amzn-drivers/tree/master/userspace/dpdk/enav2-vfio-patch
Enabling IOMMU on i3 or c5 metal instances is as easy as adding "iommu=1 intel_iommu=on" to /etc/default/grub followed by update-grub, reboot.
I can't get this to work. Every time I update grub and reboot, I cannot reconnect via ssh. Also, the EC2 console fails to report a good status.
My config:
Ubuntu 20.04 stock AWS AMI x86 64-bit
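On an instance that does come back up, one way to confirm that the two flags quoted above actually reached the kernel is to inspect the boot command line, which Linux exposes at `/proc/cmdline`. The helper below is a minimal sketch under that assumption; `parse_cmdline` and `has_iommu_flags` are hypothetical names, and the whitespace split does not handle quoted values that contain spaces.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (for example 'ro') map to None.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def has_iommu_flags(cmdline):
    """Return True if both 'iommu=1' and 'intel_iommu=on' are present."""
    params = parse_cmdline(cmdline)
    return params.get("iommu") == "1" and params.get("intel_iommu") == "on"
```

On a running machine this could be called as `has_iommu_flags(open("/proc/cmdline").read())`.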
Aerospike
- System Design: Databases and DBMS
- Ask HN: Why are there no open source NVMe-native key value stores in 2023?
-
Aerospike Driver for LINQPad
Aerospike for LINQPad 7 is a data context dynamic driver for interactively querying and updating an Aerospike database using “LINQPad”. The driver is free. For more information go to this blog post. You can directly download the driver from the LINQPad NuGet manager.
-
Using In-Memory Databases in Data Science
Aerospike is a real-time cloud structured platform with good performance capabilities. This IMDB platform allows enterprises to perform their operations in real time through the hybrid memory and parallelism model.
- System Design: Caching, Content Delivery Networks (CDN) & Proxies.
-
Block and Filesystem side-by-side with K8s and Aerospike
Block storage stores a sequence of bytes in a fixed-size block (page) on a storage device. Each block has a unique hash that references the address location of the specified block. Unlike a filesystem, block storage doesn't have the associated metadata such as format-type, owner, date, etc. Also, block storage doesn't use the conventional storage paths to access data like a filesystem file. This reduction in overhead contributes to improved overall access speeds when using raw block devices. The ability to store bytes in blocks allows applications the flexibility to decide how these blocks are accessed and managed, making block storage an ideal choice for low-latency databases such as Aerospike. From a developer's perspective, a block device is simply a large array of bytes, usually with some minimum granularity for reads and writes. In Aerospike this granularity is configured and referred to as the write-block-size. The Aerospike Kubernetes Operator uses the storage infrastructure software inside of Kubernetes, and the need for data platforms to use raw block storage becomes ever more important.
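The "large array of bytes with a minimum granularity" view can be sketched with an in-memory stand-in for a device. This is an illustration only, not how Aerospike implements storage: `BlockDevice` and `blocks_needed` are hypothetical names, and the block size used in the example is arbitrary rather than a recommended write-block-size.

```python
class BlockDevice:
    """In-memory stand-in for a raw block device (hypothetical illustration)."""

    def __init__(self, num_blocks, block_size):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self._data = bytearray(num_blocks * block_size)

    def _offset(self, index):
        # Blocks are addressed by index; the byte offset is index * block_size.
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index out of range")
        return index * self.block_size

    def write_block(self, index, payload):
        # Writes happen in whole blocks, so short payloads are zero-padded.
        if len(payload) > self.block_size:
            raise ValueError("payload larger than one block")
        start = self._offset(index)
        padded = payload.ljust(self.block_size, b"\x00")
        self._data[start:start + self.block_size] = padded

    def read_block(self, index):
        start = self._offset(index)
        return bytes(self._data[start:start + self.block_size])


def blocks_needed(nbytes, block_size):
    """Number of whole blocks required to hold nbytes (ceiling division)."""
    return -(-nbytes // block_size)
```

For example, with an 8-byte block size a 17-byte record occupies three blocks, and writing `b"abc"` to a block stores it followed by five zero bytes.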
-
Aerospike & IoT using MQTT
This example shows how the Aerospike database can be easily and scalably used to store industrial time series data made available by the MQTT ecosystem. Aerospike plus its Community Time Series Client streamlines the storage and retrieval of the data, supporting the ability to both write and read millions of data points per second if required.
-
Building Large-Scale Real-Time JSON Applications
Real-time large-scale JSON applications need reliably fast access to data, high ingest rates, powerful queries, rich document functionality, scalability with no practical limit, always-on operation, and integration with streaming and analytical platforms. They need all this at low cost. The Aerospike Real-time Data Platform provides all this functionality, making it a good choice for building such applications. The Collection Data Types (CDTs) in Aerospike provide powerful support for modeling, organizing, and querying a large JSON document store. Visit the tutorials and code sandbox on the Developer Hub to explore the capabilities of the platform, and play with the Document API and query capabilities for JSON.
- System Design: NoSQL databases
- System Design: Caching
What are some alternatives?
dragonfly - A modern replacement for Redis and Memcached
neon - Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, branching, and bottomless storage.
Redis - Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.
cachegrand - cachegrand - a modern data ingestion, processing and serving platform built for today's hardware
yugabyte-db - YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
helio - A modern framework for backend development based on io_uring Linux interface
ClickHouse - ClickHouse® is a free analytics DBMS for big data
midi-redis - A toy memory store with great performance
webdis - A Redis HTTP interface with JSON output
ydb - YDB is an open source Distributed SQL Database that combines high availability and scalability with strong consistency and ACID transactions