Aerospike
yugabyte-db
Metric | Aerospike | yugabyte-db
---|---|---
Mentions | 15 | 87
Stars | 971 | 8,486
Growth | 2.7% | 1.3%
Activity | 8.7 | 10.0
Last commit | 24 days ago | 3 days ago
Language | C | C
License | GNU General Public License v3.0 or later | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Aerospike
- Ask HN: Why are there no open source NVMe-native key value stores in 2023?
- Aerospike Driver for LINQPad
Aerospike for LINQPad 7 is a data context dynamic driver for interactively querying and updating an Aerospike database using “LINQPad”. The driver is free. For more information go to this blog post. You can directly download the driver from the LINQPad NuGet manager.
- Using In-Memory Databases in Data Science
Aerospike is a real-time cloud structured platform with good performance capabilities. This IMDB platform allows enterprises to perform their operations in real time through the hybrid memory and parallelism model.
- System Design: Caching, Content Delivery Networks (CDN) & Proxies.
- Block and Filesystem side-by-side with K8s and Aerospike
Block storage stores a sequence of bytes in a fixed-size block (page) on a storage device. Each block has a unique hash that references the address location of the specified block. Unlike a filesystem, block storage doesn't have associated metadata such as format type, owner, date, etc. Also, block storage doesn't use conventional storage paths to access data the way a filesystem does. This reduction in overhead contributes to improved overall access speeds when using raw block devices. The ability to store bytes in blocks gives applications the flexibility to decide how these blocks are accessed and managed, making block storage an ideal choice for low-latency databases such as Aerospike. From a developer's perspective, a block device is simply a large array of bytes, usually with some minimum granularity for reads and writes. In Aerospike this granularity is configurable and is referred to as the write-block-size. Because the Aerospike Kubernetes Operator relies on the storage infrastructure software inside Kubernetes, the ability of data platforms to use raw block storage becomes ever more important.
- Aerospike & IoT using MQTT
This example shows how the Aerospike database can be easily and scalably used to store industrial time series data made available by the MQTT ecosystem. Aerospike plus its Community Time Series Client streamlines the storage and retrieval of the data, supporting the ability to both write and read millions of data points per second if required.
- Building Large-Scale Real-Time JSON Applications
Real-time large-scale JSON applications need reliably fast access to data, high ingest rates, powerful queries, rich document functionality, scalability with no practical limit, always-on operation, and integration with streaming and analytical platforms. They need all this at low cost. The Aerospike Real-time Data Platform provides all this functionality, making it a good choice for building such applications. The Collection Data Types (CDTs) in Aerospike provide powerful support for modeling, organizing, and querying a large JSON document store. Visit the tutorials and code sandbox on the Developer Hub to explore the capabilities of the platform, and play with the Document API and query capabilities for JSON.
- System Design: NoSQL databases
- System Design: Caching
- Aerospike named to Inc. 5000 list of fastest-growing companies in America.
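
The block storage excerpt above describes a block device as a large array of bytes with a minimum granularity for reads and writes. As a rough illustration of what that granularity means for an application, the sketch below computes the smallest block-aligned byte range that covers a request. The function name `aligned_span` and the `block_size` parameter are hypothetical; this is generic alignment arithmetic, not Aerospike's implementation of write-block-size.

```python
def aligned_span(offset: int, length: int, block_size: int) -> tuple[int, int]:
    """Return (start, end) of the smallest block-aligned range covering the request.

    offset, length: the byte range the application wants to touch.
    block_size: hypothetical minimum I/O granularity of the device, in bytes.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")

    # Round the start down to a block boundary.
    start = (offset // block_size) * block_size
    # Round the end up to a block boundary (ceiling division on integers).
    end = -(-(offset + length) // block_size) * block_size
    return start, end


if __name__ == "__main__":
    # A 5000-byte request at offset 1000 with 4096-byte blocks touches two blocks.
    print(aligned_span(1000, 5000, 4096))
```

A request that is already aligned is returned unchanged, which is why databases that manage raw devices tend to size their buffers as multiples of the device granularity.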
yugabyte-db
- Best Practice: use the same datatypes for comparisons, like joins and foreign keys
It is possible to apply Batched Nested Loop, but with additional code that checks the range of the outer bigint and compares it only if it falls within the range of integer. This was added in YugabyteDB 2.21 with #20715 (YSQL: Allow BNL on joins over different integer types) to help migrations from PostgreSQL with such datatype inconsistencies.
- Jonathan Katz: Thoughts on PostgreSQL in 2024
It can be done like https://github.com/yugabyte/yugabyte-db/ has.
- Is co-partition or interleave necessary in Distributed SQL?
Therefore, interleaving or co-partitioning is probably not necessary, and would reduce agility and scalability more than it would improve performance, unless you have a good reason for it that you can share on Issue #79. But first, test and tune the queries to see if you need something else.
- PostGIS on YugabyteDB Alma8 (workarounds)
This is a workaround and is not supported. I've opened the following issue to get it solved in the YugabyteDB deployment: https://github.com/yugabyte/yugabyte-db/issues/19389
- Bitmap Scan in YugabyteDB
Note that there may still be a need for bitmaps, especially with disjunctions (OR) as the following is about conjunction (AND), and it can still be implemented, differently than PostgreSQL. This is tracked by #4634.
- Yugabyte – distributed PostgreSQL, 100% open source
- PL/Python on YugabyteDB

      FROM almalinux:8 as build
      RUN dnf -y update &&\
          dnf groupinstall -y 'Development Tools'
      # get YugabyteDB sources
      ARG YB_TAG=2.18
      RUN git clone --branch ${YB_TAG} https://github.com/yugabyte/yugabyte-db.git
      WORKDIR yugabyte-db
      # install dependencies and compilation tools
      RUN dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
      RUN dnf -y install epel-release libatomic rsync python3-devel cmake3 java-1.8.0-openjdk maven npm golang gcc-toolset-12 gcc-toolset-12-libatomic-devel patchelf glibc-langpack-en ccache vim wget python3.11-devel python3.11-pip clang ncurses-devel readline-devel libsqlite3x-devel
      RUN mkdir /opt/yb-build
      RUN chown "$USER" /opt/yb-build
      # Install Python 3
      RUN alternatives --remove-all python3
      RUN alternatives --remove-all python
      RUN alternatives --install /usr/bin/python python /usr/bin/python3.11 3
      RUN alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 3
      # add #include "pg_yb_utils.h" to src/postgres/src/pl/plpython/plpy_procedure.c
      RUN sed -e '/#include "postgres.h"/a#include "pg_yb_utils.h"' -i src/postgres/src/pl/plpython/plpy_procedure.c
      # if using python > 3.9 remove #include <compile.h> and #include <eval.h> from src/postgres/src/pl/plpython/plpython.h
      RUN sed -e '/#include <compile.h>/d' -e '/#include <eval.h>/d' -i src/postgres/src/pl/plpython/plpython.h
      # add '--with-python', to python/yugabyte/build_postgres.py under the configure_postgres method
      RUN sed -e "/'\.\/configure',/a\ '--with-python'," -i python/yugabyte/build_postgres.py
      # Build and package the release
      RUN YB_CCACHE_DIR="$HOME/.cache/yb_ccache" ./yb_build.sh -j$(nproc) --clean-all --build-yugabyted-ui --no-linuxbrew --clang15 -f release
      RUN chmod +x bin/get_clients.sh bin/parse_contention.py bin/yb-check-consistency.py
      RUN YB_USE_LINUXBREW=0 ./yb_release --force
      WORKDIR /
      RUN mv /yugabyte-db/build/yugabyte*.tar.gz /yugabyte.tgz
- YugabyteDB official Dockerfile
You have seen me using the official YugabyteDB Docker image extensively. This image is suitable for various purposes, including labs, development, testing, and even production. In the past, we built it internally because that integrated seamlessly with our build process. However, some companies prefer to build the image on their own, which is a commendable practice. After all, it's not advisable to run random images with root privileges on your servers. As a result, we have added a refined Dockerfile to our GitHub repository.
- FlameGraphs on Steroids with profiler.firefox.com
Of course, I can guess from the function names, but YugabyteDB is Open Source and I can search for them. What happens here is that I didn't declare a Primary Key for my table and then an internal one (ybctid) is generated, because secondary indexes need a key to address the table row. This ID generation calls /dev/urandom. I made this simple example to show that low-level traces can give a clue about high level data model problems.
- Understand what you run before publishing your (silly) benchmark results
To show that it is not difficult to understand what you run when using a PostgreSQL-compatible database, I'll look at the HammerDB benchmark connected to YugabyteDB. HammerDB has no specific code for it, but YugabyteDB is PostgreSQL-compatible (it uses PostgreSQL code on top of distributed storage and transactions).
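
The first excerpt in the list above mentions comparing an outer bigint against an integer column only when the value fits in the integer range. The sketch below illustrates that idea in isolation. The function names are hypothetical and the logic is a simplified model of the range check, not YugabyteDB's actual Batched Nested Loop code.

```python
# Bounds of a 4-byte signed integer (PostgreSQL "integer").
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def fits_int32(value: int) -> bool:
    """True if a bigint value can be represented as a 4-byte integer."""
    return INT32_MIN <= value <= INT32_MAX


def batchable_keys(outer_bigints: list[int]) -> list[int]:
    """Keep only the outer values that could match an integer column.

    Values outside the int32 range can never equal an integer key,
    so they are dropped before the batch lookup is built.
    """
    return [v for v in outer_bigints if fits_int32(v)]


def matches(outer_bigint: int, inner_int32: int) -> bool:
    """Compare only after the range check, mirroring the excerpt's description."""
    return fits_int32(outer_bigint) and outer_bigint == inner_int32


if __name__ == "__main__":
    print(batchable_keys([1, 2**31, -2**31, 5_000_000_000]))
```

The excerpt's broader advice still applies: using the same datatype on both sides of a join or foreign key avoids the need for this kind of check altogether.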
What are some alternatives?
dragonfly - A modern replacement for Redis and Memcached
citus - Distributed PostgreSQL as an extension
Redis - Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.
cockroach - CockroachDB - the open source, cloud-native distributed SQL database.
ClickHouse - ClickHouse® is a free analytics DBMS for big data
neon - Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, branching, and bottomless storage.
psycopg2 - PostgreSQL database adapter for the Python programming language
ydb - YDB is an open source Distributed SQL Database that combines high availability and scalability with strong consistency and ACID transactions
realtime - Broadcast, Presence, and Postgres Changes via WebSockets
rupy - HTTP App. Server and JSON DB - Shared Parallel (Atomic) & Distributed
Apache AGE - Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL. [Moved to: https://github.com/apache/age]