An Introduction to the UnQLite Open Source NoSQL Database
Mrad Chems Eddine, Symisc Systems, http://symisc.net/
UnQLite is an embedded NoSQL database engine. It is both a standard key/value store, similar to the more popular Berkeley DB, and a document store, similar to MongoDB, with a built-in scripting language called Jx9 that looks like JavaScript.
Unlike most other NoSQL databases, UnQLite does not have a separate server process. UnQLite reads and writes directly to ordinary disk files. A complete database with multiple collections is contained in a single disk file. The database file format is cross-platform: you can freely copy a database between 32-bit and 64-bit systems or between big-endian and little-endian architectures.
In this article, we introduce the concepts behind UnQLite, its architecture and a high-level overview of its C/C++ API.
Web Site: http://unqlite.org
Version tested: 1.1.6
System requirements: Windows or UNIX systems (Linux, FreeBSD, Oracle Solaris, Mac OS X)
License & Pricing: open source, http://unqlite.org/licensing.html
Support: http://unqlite.org/suport.html
Development Timeline
UnQLite was designed in late 2012, when the Symisc development team (Symisc is the company behind UnQLite), led by Mrad Chems Eddine, was working on a distributed P2P and VoIP solution similar in concept to Skype and Kademlia. The challenge was to store all node meta-information, such as IP addresses and blobs, locally and very efficiently. The available solutions were not to our taste: Berkeley DB was too complex to embed in a plain C application, and Google's LevelDB did not support transactions (unlike UnQLite). After a moment of reflection, we decided to implement our own embedded database engine by combining the SQLite3 backend (i.e. the VFS layer, the locking mechanism and the transaction manager) with our own storage backend, which had to support on-disk as well as in-memory operations, plus Jx9. A few months later, UnQLite was stable enough and its performance rocks!
The UnQLite Architecture
Like most modern database engines, UnQLite is built in layers. The upper layers, namely the document-store and the key/value store layers, are presented to the host application via a set of exported interfaces (i.e. the UnQLite API).
The principal task of a database engine is to store and retrieve records. UnQLite supports both structured and raw database record storage. The document-store layer is used to store JSON documents (objects, arrays, strings, etc.) in the database and is powered by the Jx9 programming language, while the key/value store layer is used to store raw records in the database.
Key/Value store layer.
UnQLite is a standard key/value store similar to Berkeley DB, Tokyo Cabinet, LevelDB, etc., but with a rich feature set including support for transactions (ACID), concurrent readers, etc. Under the KV store, both keys and values are treated as simple arrays of bytes, so content can be anything from ASCII strings to binary blobs and even disk files. The KV store layer is presented to host applications via a set of interfaces. These include unqlite_kv_store(), unqlite_kv_append(), unqlite_kv_fetch_callback(), etc.
Document store layer.
The document store that is used to store JSON docs (i.e. Objects, Arrays, Strings, etc.) in the database is powered by the Jx9 programming language.
Jx9 is an embeddable scripting language (also called an extension language) designed to support general procedural programming with data description facilities. Jx9 is a Turing-complete, dynamically typed programming language based on JSON and implemented as a library in the UnQLite core.
Jx9 is built with a ton of features and has a clean and familiar syntax similar to C and JavaScript. Being an extension language, Jx9 has no notion of a main program; it only works embedded in a host application. The host program (UnQLite in our case) can write and read Jx9 variables and can register C/C++ functions to be called by Jx9 code.
Pluggable Run-time Interchangeable Storage Engines.
UnQLite works with run-time interchangeable storage engines (e.g. Hash, B+Tree, R+Tree, LSM, etc.). The storage engine works with key/value pairs where both the key and the value are byte arrays of arbitrary length and with no restrictions on content. UnQLite comes with two built-in KV storage engines: a Virtual Linear Hash (VLH) storage engine, used for persistent on-disk databases with O(1) lookup time, and an in-memory hash-table or red-black tree storage engine, used for in-memory databases. Future versions of UnQLite might add other built-in storage engines (e.g. LSM).
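To give a feel for how a linear hash can locate a bucket in constant time, here is a minimal sketch of textbook linear hashing. It is not UnQLite's VLH code, which this article does not show: the `lh_state` structure, the `lh_bucket()` helper and the choice of the FNV-1a hash are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a over an arbitrary byte array; keys may contain NUL bytes. */
static uint32_t fnv1a(const void *key, size_t len)
{
    const unsigned char *p = (const unsigned char *)key;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hypothetical table state for textbook linear hashing. */
struct lh_state {
    uint32_t initial_buckets; /* bucket count at level 0 */
    uint32_t level;           /* number of completed doubling rounds */
    uint32_t split;           /* next bucket to be split in this round */
};

/* Map a hash value to a bucket index with at most two modulo operations. */
static uint32_t lh_bucket(const struct lh_state *s, uint32_t hash)
{
    uint32_t n = s->initial_buckets << s->level;
    uint32_t b = hash % n;
    if (b < s->split)        /* bucket already split in this round */
        b = hash % (n * 2);  /* so use the next level's modulus */
    return b;
}
```

The table grows one bucket at a time by advancing `split`, so a lookup never needs more than the two modulo operations above, whatever the number of stored records.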
Transaction Manager/Pager module.
The underlying storage engine requests information from the disk in fixed-size chunks. The default chunk size is 4096 bytes but can vary between 512 and 65536 bytes. The pager module is responsible for reading, writing, and caching these chunks. The pager module also provides the rollback and atomic commit abstraction and takes care of locking of the database file. The storage engine requests particular pages from the page cache and notifies the pager module when it wants to modify pages or commit or rollback changes. The pager module handles all the messy details of making sure the requests are handled quickly, safely, and efficiently.
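The fixed-size chunk arithmetic can be sketched in a few lines of C. The helper names below are hypothetical and the zero-based page numbering is an assumption; only the 512, 4096 and 65536 byte figures come from the text above.

```c
#include <stdint.h>

#define PAGE_SIZE_MIN     512u
#define PAGE_SIZE_DEFAULT 4096u
#define PAGE_SIZE_MAX     65536u

/* Returns 1 when the page size lies within the range quoted above. */
static int page_size_in_range(uint32_t page_size)
{
    return page_size >= PAGE_SIZE_MIN && page_size <= PAGE_SIZE_MAX;
}

/* Which page (zero-based, by assumption) holds a given byte offset? */
static uint64_t page_number(uint64_t file_offset, uint32_t page_size)
{
    return file_offset / page_size;
}

/* Where inside that page does the byte live? */
static uint32_t offset_in_page(uint64_t file_offset, uint32_t page_size)
{
    return (uint32_t)(file_offset % page_size);
}
```

With the default page size, byte offset 10000 falls in page 2 at offset 1808, so a pager would read or cache that whole 4096-byte page to serve the request.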
Virtual File System.
UnQLite is designed to run on a multitude of platforms. In order to provide portability between these platforms (i.e. between POSIX and Win32/64 operating systems), UnQLite uses an abstraction layer to interface with the operating system. Each supported operating system has its own implementation.
An instance of the unqlite_vfs object defines the interface between the UnQLite core and the underlying operating system. The "vfs" in the name of the object stands for "Virtual File System".
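The idea of such an abstraction layer can be illustrated with a table of function pointers. The sketch below is deliberately simplified: `demo_vfs`, `mem_file` and `core_roundtrip()` are hypothetical names, and the real unqlite_vfs structure has different members. An in-memory backend stands in for a platform-specific one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-simplified stand-in for a VFS method table. */
typedef struct demo_vfs {
    const char *name;
    int (*write_at)(void *file, size_t off, const void *buf, size_t len);
    int (*read_at)(void *file, size_t off, void *buf, size_t len);
} demo_vfs;

/* One possible backend: a fixed-size in-memory "file". */
typedef struct mem_file {
    unsigned char bytes[256];
} mem_file;

static int mem_write_at(void *file, size_t off, const void *buf, size_t len)
{
    mem_file *f = (mem_file *)file;
    if (off > sizeof f->bytes || len > sizeof f->bytes - off)
        return -1; /* request falls outside the file */
    memcpy(f->bytes + off, buf, len);
    return 0;
}

static int mem_read_at(void *file, size_t off, void *buf, size_t len)
{
    mem_file *f = (mem_file *)file;
    if (off > sizeof f->bytes || len > sizeof f->bytes - off)
        return -1;
    memcpy(buf, f->bytes + off, len);
    return 0;
}

static const demo_vfs mem_vfs = { "memory", mem_write_at, mem_read_at };

/* "Core" code sees only the method table, never the platform details. */
static int core_roundtrip(const demo_vfs *vfs, void *file,
                          const char *msg, char *out, size_t n)
{
    if (vfs->write_at(file, 0, msg, n) != 0)
        return -1;
    return vfs->read_at(file, 0, out, n);
}
```

Porting to another operating system then means supplying another `demo_vfs` instance, while `core_roundtrip()` stays unchanged.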
Introduction to the UnQLite C/C++ API
Early versions of UnQLite were very easy to learn since they only supported 12 C/C++ interfaces. But as the UnQLite engine has grown in capability, new C/C++ interfaces have been added so that now there are over 95 distinct APIs. This can be overwhelming to a new programmer. Fortunately, most of the C/C++ interfaces in UnQLite are very specialized and never need to be used. Despite having so many entry points, the core API is still relatively simple and easy to code to. This article aims to provide all of the background information needed to easily understand how the UnQLite database engine works.
The principal task of a database engine is to store and retrieve records. UnQLite supports both structured and raw database record storage.
In order to accomplish this purpose, the developer needs to know about two objects:
- The Database Engine Handle: unqlite
- The UnQLite (Via Jx9) Virtual Machine Object: unqlite_vm
The database engine handle and, optionally, the virtual machine object are controlled by a small set of C/C++ interface routines. These routines and the two objects form the core functionality of UnQLite. The developer who understands them will have a good foundation for using UnQLite.
Typical Usage of the Core C/C++ Interfaces
An application that wants to use UnQLite will typically use the following interfaces with their optional components such as the transaction manager and the cursors interfaces. Note that an application may switch between the Key/Value store and the Document-store interfaces without any problem.
unqlite_open()
This routine opens a connection to an UnQLite database file and returns a database handle. This is often the first UnQLite API call that an application makes and is a prerequisite for most other UnQLite APIs. Many UnQLite interfaces require a pointer to the database handle as their first parameter and can be thought of as methods on the database handle. This routine is the constructor for the database handle.
Key/Value Store Interfaces
Under the Key/Value store, both keys and values are treated as simple arrays of bytes, so content can be anything from ASCII strings to binary blobs and even disk files. This set of interfaces allows clients to store and retrieve raw database records efficiently regardless of the underlying KV storage engine.
Document Store Interfaces
This set of interfaces works with the unqlite_vm object, which is obtained after successful compilation of the target Jx9 script. Jx9 is the scripting language that powers the document-store interface to UnQLite. The document-store interface to UnQLite works as follows:
1. Obtain a new database handle via unqlite_open().
2. Compile your Jx9 script using one of the compile interfaces such as unqlite_compile() or unqlite_compile_file(). On successful compilation, the engine will automatically create an instance of this structure (unqlite_vm) and a pointer to this structure is made available to the caller.
3. When something goes wrong during compilation of the target Jx9 script due to a compile-time error, the caller must discard this pointer and fix the erroneous Jx9 code. Compile-time error logs can be extracted via a call to unqlite_config() with a configuration verb set to UNQLITE_CONFIG_JX9_ERR_LOG.
4. Optionally, configure the virtual machine using unqlite_vm_config().
5. Optionally, register one or more foreign functions or constants using the unqlite_create_function() or unqlite_create_constant() interfaces.
6. Execute the compiled Jx9 program by calling unqlite_vm_exec().
7. Optionally, extract the contents of one or more variables declared inside your Jx9 script using unqlite_vm_extract_variable().
8. Optionally, reset the virtual machine using unqlite_vm_reset(), then go back to step 6. Do this zero or more times.
9. Destroy the virtual machine using unqlite_vm_release().
unqlite_kv_cursor_first_entry()
Database Cursors
Cursors provide a mechanism by which you can iterate over the records in a database. Using cursors, you can seek, fetch, move, and delete database records.
unqlite_begin(), unqlite_commit()
Manual Transaction Manager
This set of interfaces allows the host application to manually start a write-transaction. Note that UnQLite is smart enough to start a write-transaction automatically in the background when needed, so calls to these routines are usually not necessary.
unqlite_close()
This routine is the destructor for the unqlite handle obtained by a prior successful call to unqlite_open(). Each database connection must be closed in order to avoid memory leaks and a malformed database image.
Conclusion
This article only mentions the foundational UnQLite interfaces. The UnQLite library includes many other APIs implementing useful features that are not described here. A complete list of the functions that form the UnQLite application programming interface can be found in the C/C++ Interface Specification. Refer to that document for complete and authoritative information about all UnQLite interfaces.