Reindexer
Reindexer is an embeddable, in-memory, document-oriented database with a high-level Query builder interface.
Reindexer’s goal is to provide fast search with complex queries. We at Restream weren’t happy with Elasticsearch and created Reindexer as a more performant alternative.
The core is written in C++ and the application level API is in Go.
This document describes the Go connector and its API. For information about the Reindexer server and the HTTP API, refer to the Reindexer documentation.
Features
Key features:
- Sortable indices
- Aggregation queries
- Indices on array fields
- Complex primary keys
- Composite indices
- Join operations
- Full-text search
- Up to 256 indexes (255 user indexes + 1 internal index) per namespace
- ORM-like query interface
- SQL queries
Performance
Performance has been our top priority from the start, and we think we managed to get it pretty good. Benchmarks show that Reindexer’s performance is on par with a typical key-value database. On a single CPU core, we get:
- up to 500K queries/sec for queries
SELECT * FROM items WHERE id='?'
- up to 50K queries/sec for queries
SELECT * FROM items WHERE year > 2010 AND name = 'string' AND id IN (....)
- up to 20K queries/sec for queries
SELECT * FROM items WHERE year > 2010 AND name = 'string' JOIN subitems ON ...
See benchmarking results and more details in the benchmarking repo.
Memory Consumption
Reindexer aims to consume as little memory as possible; most queries are processed without any memory allocation at all.
To achieve that, several optimizations are employed, both on the C++ and Go level:
Documents and indices are stored in dense binary C++ structs, so they don’t impose any load on Go’s garbage collector.
String duplicates are merged.
Memory overhead is about 32 bytes per document + ≈4-16 bytes per each search index.
There is an object cache on the Go level for deserialized documents produced after query execution. Future queries use the pre-deserialized documents, which cuts repeated deserialization and allocation costs.
The Query interface uses sync.Pool to reuse internal structures and buffers. The combination of these techniques allows Reindexer to handle most queries without any allocations.
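As an illustration of the buffer-reuse technique (a sketch of the general pattern, not Reindexer's actual internals), a sync.Pool lets hot paths recycle buffers instead of allocating per call:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool recycles byte buffers between calls, mirroring the
// technique the binding uses for its internal structures.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// serialize builds a small JSON string using a pooled buffer.
func serialize(id int, name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer bufPool.Put(buf)
	buf.Reset() // reuse the previous allocation
	fmt.Fprintf(buf, "{\"id\":%d,\"name\":%q}", id, name)
	return buf.String()
}

func main() {
	fmt.Println(serialize(1, "test"))
}
```

After the first few calls warm the pool, steady-state calls typically perform no buffer allocations at all.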
Disk Storage
Reindexer can store documents to and load documents from disk via LevelDB. Documents are written to the storage backend asynchronously, in large batches, automatically in the background.
When a namespace is opened, all its documents are loaded into RAM, so queries on these documents run entirely in in-memory mode.
Official docker image
The simplest way to get the Reindexer server is to pull and run the Docker image from Docker Hub.
docker run -p9088:9088 -p6534:6534 -it reindexer/reindexer
Prerequisites
Reindexer’s core is written in C++17 and uses LevelDB as the storage backend, so CMake, a C++17 toolchain and LevelDB must be installed before installing Reindexer.
To build Reindexer, g++ 8+, clang 7+ or mingw64 is required.
In the builtin and builtinserver modes, Reindexer’s Go binding depends on Reindexer’s static libraries (core, server and resource).
Get Reindexer using go.mod and replace
If you need to modify Reindexer’s sources, you can use a replace directive as follows.
- Download and build reindexer:
# Clone reindexer via git. It's also possible to use 'go get -a github.com/restream/reindexer/v3', but its behavior may vary depending on the Go version
git clone https://github.com/restream/reindexer.git $GOPATH/src/reindexer
bash $GOPATH/src/reindexer/dependencies.sh
# Generate builtin binding
cd $GOPATH/src/reindexer
go generate ./bindings/builtin
# Optional (build builtin server binding)
go generate ./bindings/builtinserver
- Add Reindexer’s module to your application’s go.mod and replace it with the local package:
# Go to your app's directory
cd /your/app/path
go get -a github.com/restream/reindexer/v3
go mod edit -replace github.com/restream/reindexer/v3=$GOPATH/src/reindexer
In this case, the Go binding will generate an explicit list of libraries and paths and will not use pkg-config.
Get Reindexer for apps without go.mod (vendoring)
If you’re not using go.mod, it’s possible to get and build Reindexer from sources this way:
export GO111MODULE=off # Disable go1.11 modules
# Go to your app's directory
cd /your/app/path
# Clone reindexer via git. It's also possible to use 'go get -a github.com/restream/reindexer', but its behavior may vary depending on the Go version
git clone --branch master https://github.com/restream/reindexer.git vendor/github.com/restream/reindexer/v3
# Generate builtin binding
go generate -x ./vendor/github.com/restream/reindexer/v3/bindings/builtin
# Optional (build builtin server binding)
go generate -x ./vendor/github.com/restream/reindexer/v3/bindings/builtinserver
Nested Structs
By default, Reindexer scans all nested structs and adds their fields to the namespace (as well as indexes specified).
type Actor struct {
Name string `reindex:"actor_name"`
}
type BaseItem struct {
ID int64 `reindex:"id,hash,pk"`
UUIDValue string `reindex:"uuid_value,hash,uuid"`
}
type ComplexItem struct {
BaseItem // Index fields of BaseItem will be added to reindex
Actor []Actor // Index fields of Actor will be added to reindex as arrays
Name string `reindex:"name"` // Hash-index for "name"
Year int `reindex:"year,tree"` // Tree-index for "year"
Value int `reindex:"value,-"` // Store(column)-index for "value"
Metainfo int `json:"-"` // Field "Metainfo" will not be stored in reindexer
Parent *Item `reindex:"-"` // Index fields of parent will NOT be added to reindex
}
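A usage sketch for the structs above (the connection DSN, storage path and namespace name are illustrative assumptions, not from the original):

```go
package main

import (
	"fmt"

	"github.com/restream/reindexer/v3"
	_ "github.com/restream/reindexer/v3/bindings/builtin" // builtin (embedded) binding
)

// Assumes the Actor, BaseItem and ComplexItem structs defined above.
func main() {
	// "builtin:///tmp/reindex/testdb" and "items" are hypothetical names
	db := reindexer.NewReindex("builtin:///tmp/reindex/testdb")
	// Indexes declared in BaseItem and Actor are created automatically
	if err := db.OpenNamespace("items", reindexer.DefaultNamespaceOptions(), ComplexItem{}); err != nil {
		panic(err)
	}
	err := db.Upsert("items", &ComplexItem{
		BaseItem: BaseItem{ID: 1, UUIDValue: "e1b2c3d4-0000-4000-8000-000000000001"},
		Actor:    []Actor{{Name: "John"}},
		Name:     "first item",
		Year:     2021,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("item upserted")
}
```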
Sort
Reindexer can sort documents by fields (including nested and fields of the joined namespaces) or by expressions in ascending or descending order.
To sort by non-index fields, all the values must be convertible to each other, i.e. either have the same type or be one of the numeric types (bool, int, int64 or float).
Sort expressions can contain:
- field and index names (including nested fields and fields of the joined namespaces) of bool, int, int64, float or string types; all the values must be convertible to numbers, ignoring leading and trailing spaces;
- numbers;
- the functions rank(), abs() and ST_Distance();
- parentheses;
- the arithmetic operations +, - (unary and binary), * and /.
If a field name is followed by +, they must be separated by a space to distinguish the expression from a composite index name.
Fields of the joined namespaces must be written as joined_namespace.field.
abs() means the absolute value of an argument.
rank() means the fulltext rank of a match and is applicable only in fulltext queries.
ST_Distance() means the distance between geometry points (see the geometry subsection). The points can be columns in the current or joined namespaces, or a fixed point in the format ST_GeomFromText('point(1 -3)').
In an SQL query, the sort expression must be quoted.
type Person struct {
Name string `reindex:"name"`
Age int `reindex:"age"`
}
type City struct {
Id int `reindex:"id"`
NumberOfPopulation int `reindex:"population"`
Center reindexer.Point `reindex:"center,rtree,linear"`
}
type Actor struct {
ID int `reindex:"id"`
PersonData Person `reindex:"person"`
Price int `reindex:"price"`
Description string `reindex:"description,text"`
BirthPlace int `reindex:"birth_place_id"`
Location reindexer.Point `reindex:"location,rtree,greene"`
}
....
query := db.Query("actors").Sort("id", true) // Sort by field
....
query = db.Query("actors").Sort("person.age", true) // Sort by nested field
....
// Sort by joined field
// Works for inner join only, when each item from left namespace has exactly one joined item from right namespace
query = db.Query("actors").
InnerJoin(db.Query("cities")).On("birth_place_id", reindexer.EQ, "id").
Sort("cities.population", true)
....
// Sort by expression:
query = db.Query("actors").Sort("person.age / -10 + price / 1000 * (id - 5)", true)
....
query = db.Query("actors").Where("description", reindexer.EQ, "ququ").
Sort("rank() + id / 100", true) // Sort with fulltext rank
....
// Sort by geometry distance
query = db.Query("actors").
Join(db.Query("cities")).On("birth_place_id", reindexer.EQ, "id").
SortStPointDistance("cities.center", reindexer.Point{1.0, -3.0}, true).
SortStFieldDistance("location", "cities.center", true)
....
// In SQL query:
iterator := db.ExecSQL ("SELECT * FROM actors ORDER BY person.name ASC")
....
iterator := db.ExecSQL ("SELECT * FROM actors WHERE description = 'ququ' ORDER BY 'rank() + id / 100' DESC")
....
iterator := db.ExecSQL ("SELECT * FROM actors ORDER BY 'ST_Distance(location, ST_GeomFromText(\'point(1 -3)\'))' ASC")
....
iterator := db.ExecSQL ("SELECT * FROM actors ORDER BY 'ST_Distance(location, cities.center)' ASC")
It is also possible to set a custom sort order like this:
type SortModeCustomItem struct {
ID int `reindex:"id,,pk"`
InsItem string `reindex:"item_custom,hash,collate_custom=a-zA-Z0-9"`
}
or like this
type SortModeCustomItem struct {
ID int `reindex:"id,,pk"`
InsItem string `reindex:"item_custom,hash,collate_custom=АаБбВвГгДдЕеЖжЗзИиКкЛлМмНнОоПпРрСсТтУуФфХхЦцЧчШшЩщЪъЫыЬьЭ-ЯAaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz0-9ЁёЙйэ-я"`
}
The very first character in this list has the highest priority; the last character has the lowest. The sorting algorithm puts items that start with the first character before others. If some characters are skipped, their priorities keep their usual values (relative to the characters in the list).
Text pattern search with LIKE condition
For simple text-pattern search in string fields, the LIKE condition can be used. It looks for strings matching the pattern. In the pattern, _ means any single character and % means any sequence of characters.
Go example:
query := db.Query("items").
Where("field", reindexer.LIKE, "pattern")
SQL example:
SELECT * FROM items WHERE fields LIKE 'pattern'
‘me_t’ matches ‘meet’, ‘meat’, ‘melt’ and so on.
‘%tion’ matches ‘tion’, ‘condition’, ‘creation’ and so on.
CAUTION: the LIKE condition uses a scan method. It should be used for debugging purposes or within queries with other highly selective conditions.
Generally, for full-text search with reasonable speed, we recommend using a fulltext index.
Update queries
UPDATE queries are used to modify existing items of a namespace. There are several kinds of update queries: updating existing fields, adding new fields and dropping existing non-indexed fields.
UPDATE Sql-Syntax
UPDATE nsName
SET field1 = value1, field2 = value2, ..
WHERE condition;
It is also possible to use arithmetic expressions with +, -, /, * and brackets:
UPDATE NS SET field1 = field2+field3-(field4+5)/2
including functions like now(), sec() and serial(). To use expressions from Go code, the SetExpression() method needs to be called instead of Set().
To make an array field empty:
UPDATE NS SET arrayfield = [] WHERE id = 100
and to set a field to null:
UPDATE NS SET field = NULL WHERE id > 100
For non-indexed fields, setting a value of a different type replaces the field completely. For indexed fields, it is only possible to convert between adjacent types (integral types and bool), and between numeric strings (like “123456”) and integral types. Setting an indexed field to null resets it to its default value.
It is possible to add new fields to existing items:
UPDATE NS SET newField = 'Brand new!' WHERE id > 100
and even to add a new field by a complex nested path:
UPDATE NS SET nested.nested2.nested3.nested4.newField = 'new nested field!' WHERE id > 100
This will create the nested objects nested, nested2, nested3 and nested4, with newField as a member of the object nested4.
Example of using update queries in Go code:
db.Query("items").Where("id", reindexer.EQ, 40).Set("field1", values).Update()
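For the expression form, SetExpression is called instead of Set; a sketch (field names are illustrative and an opened db is assumed):

```go
// Server-side computed update: field1 = field2 + field3 - (field4 + 5) / 2
db.Query("items").
	Where("id", reindexer.EQ, 40).
	SetExpression("field1", "field2 + field3 - (field4 + 5) / 2").
	Update()
```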
Update field with object
Reindexer allows updating and adding object fields. An object can be set with a struct, a map, or a byte array (the JSON representation of the object).
type ClientData struct {
Name string `reindex:"name" json:"name"`
Age int `reindex:"age" json:"age"`
Address int `reindex:"year" json:"year"`
Occupation string `reindex:"occupation" json:"occupation"`
TaxYear int `reindex:"tax_year" json:"tax_year"`
TaxConsultant string `reindex:"tax_consultant" json:"tax_consultant"`
}
type Client struct {
ID int `reindex:"id" json:"id"`
Data ClientData `reindex:"client_data" json:"client_data"`
...
}
clientData := updateClientData(clientId)
db.Query("clients").Where("id", reindexer.EQ, 100).SetObject("client_data", clientData).Update()
In this case, a Go map can only use string as a key; map[string]interface{} is a perfect choice.
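The update above could therefore also be written with a map (keys and values are illustrative, following the json tags of the target object):

```go
// Using a map[string]interface{} instead of the ClientData struct
clientData := map[string]interface{}{
	"name":       "John Doe",
	"age":        40,
	"occupation": "Bank Manager",
}
db.Query("clients").
	Where("id", reindexer.EQ, 100).
	SetObject("client_data", clientData).
	Update()
```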
Updating an object field with an SQL statement:
UPDATE clients SET client_data = {"Name":"John Doe","Age":40,"Address":"Fifth Avenue, Manhattan","Occupation":"Bank Manager","TaxYear":1999,"TaxConsultant":"Jane Smith"} WHERE id = 100;
Remove field via update-query
UPDATE Sql-Syntax of queries that drop existing non-indexed fields:
UPDATE nsName
DROP field1, field2, ..
WHERE condition;
db.Query("items").Where("id", reindexer.EQ, 40).Drop("field1").Update()
Synchronous mode
// Create new transaction object
tx, err := db.BeginTx("items")
if err != nil {
panic(err)
}
// Fill transaction object
tx.Upsert(&Item{ID: 100})
tx.Upsert(&Item{ID: 101})
tx.Query().WhereInt("id", reindexer.EQ, 102).Set("Name", "Petya").Update()
// Apply transaction
if err := tx.Commit(); err != nil {
panic(err)
}
Async batch mode
To speed up insertion of bulk records, async mode can be used.
// Create new transaction object
tx, err := db.BeginTx("items")
if err != nil {
panic(err)
}
// Prepare transaction object async.
tx.UpsertAsync(&Item{ID: 100},func(err error) {})
tx.UpsertAsync(&Item{ID: 100},func(err error) {})
// Wait for the async operations to complete, and apply the transaction.
if err := tx.Commit(); err != nil {
panic(err)
}
The second argument of UpsertAsync is a completion function, which will be called after receiving the server response. If any error occurs during the prepare process, tx.Commit will return an error. So it is enough to check the error returned by tx.Commit to be sure whether all data has been committed successfully.
Transactions commit strategies
Depending on the amount of changes in a transaction, there are 2 possible commit strategies:
- Locked atomic update. Reindexer locks the namespace and applies all changes under a common lock. This mode is used for small amounts of changes.
- Copy & atomic replace. Reindexer makes a snapshot of the namespace, applies all changes to this snapshot, and atomically replaces the namespace without a lock.
The data size threshold for selecting a commit strategy can be set in the namespace configuration. Check the fields StartCopyPolicyTxSize, CopyPolicyMultiplier and TxSizeToAlwaysCopy in the struct DBNamespacesConfig (describer.go).
Implementation notes
- The transaction object is not thread-safe and can’t be used from different goroutines;
- The transaction object holds Reindexer’s resources, therefore the application should explicitly call Rollback or Commit, otherwise the resources will leak;
- It is safe to call Rollback after Commit;
- It is possible to run a query inside a transaction by calling tx.Query("ns").Exec();
- Only serializable isolation is available, i.e. each transaction takes an exclusive lock over the target namespace until all the steps of the transaction are committed.
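The notes above suggest the following usage pattern (a sketch; assumes an opened "items" namespace and the Item struct): deferring Rollback releases the transaction's resources even on an early return, and calling it after a successful Commit is safe.

```go
tx, err := db.BeginTx("items")
if err != nil {
	panic(err)
}
// Rollback after Commit is safe, so defer it unconditionally to
// avoid leaking the transaction's resources on early returns.
defer tx.Rollback()

tx.Upsert(&Item{ID: 100})
// Queries may also be issued inside the transaction:
// it := tx.Query().WhereInt("id", reindexer.EQ, 100).Exec()

if err := tx.Commit(); err != nil {
	panic(err)
}
```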
Join
Reindexer can join documents from multiple namespaces into a single result:
type Actor struct {
ID int `reindex:"id"`
Name string `reindex:"name"`
IsVisible bool `reindex:"is_visible"`
}
type ItemWithJoin struct {
ID int `reindex:"id"`
Name string `reindex:"name"`
ActorsIDs []int `reindex:"actors_ids"`
ActorsNames []string `reindex:"actors_names"`
Actors []*Actor `reindex:"actors,,joined"`
}
....
query := db.Query("items_with_join").Join(
db.Query("actors").
WhereBool("is_visible", reindexer.EQ, true),
"actors"
).On("actors_ids", reindexer.SET, "id")
it := query.Exec()
In this example, Reindexer uses reflection under the hood to create Actor slice and copy Actor struct.
A join query may have from one to several On conditions connected with And (by default), Or or Not operators:
query := db.Query("items_with_join").
Join(
db.Query("actors").
WhereBool("is_visible", reindexer.EQ, true),
"actors").
On("actors_ids", reindexer.SET, "id").
Or().
On("actors_names", reindexer.SET, "name")
An InnerJoin combines data from two namespaces where there is a match on the joining fields in both namespaces. A LeftJoin returns all valid items from the namespace on the left side of the LeftJoin keyword, along with the values from the namespace on the right side, or nothing if a matching item doesn’t exist. Join is an alias for LeftJoin.
An InnerJoin can be used as a condition in the Where clause:
query1 := db.Query("items_with_join").
WhereInt("id", reindexer.RANGE, []int{0, 100}).
Or().
InnerJoin(db.Query("actors").WhereString("name", reindexer.EQ, "ActorName"), "actors").
On("actors_ids", reindexer.SET, "id").
Or().
InnerJoin(db.Query("actors").WhereInt("id", reindexer.RANGE, []int{100, 200}), "actors").
On("actors_ids", reindexer.SET, "id")
query2 := db.Query("items_with_join").
WhereInt("id", reindexer.RANGE, []int{0, 100}).
Or().
OpenBracket().
InnerJoin(db.Query("actors").WhereString("name", reindexer.EQ, "ActorName"), "actors").
On("actors_ids", reindexer.SET, "id").
InnerJoin(db.Query("actors").WhereInt("id", reindexer.RANGE, []int{100, 200}), "actors").
On("actors_ids", reindexer.SET, "id").
CloseBracket()
query3 := db.Query("items_with_join").
WhereInt("id", reindexer.RANGE, []int{0, 100}).
Or().
InnerJoin(db.Query("actors").WhereInt("id", reindexer.RANGE, []int{100, 200}), "actors").
On("actors_ids", reindexer.SET, "id").
Limit(0)
Note that usually the Or operator implements short-circuiting for Where conditions: if the previous condition is true, the next one is not evaluated. InnerJoin works differently: in query1 (from the example above) both InnerJoin conditions are evaluated regardless of the result of WhereInt.
Limit(0) as part of an InnerJoin (query3 from the example above) does not join any data: it works only as a filter to verify the conditions.
Joinable interface
To avoid using reflection, Item can implement the Joinable interface. If it is implemented, Reindexer uses it instead of the slow reflection-based implementation. This increases overall performance by 10-20% and reduces the number of allocations.
// Joinable interface implementation.
// Join adds items from the joined namespace to the `ItemWithJoin` object.
// When calling Joinable interface, additional context variable can be passed to implement extra logic in Join.
func (item *ItemWithJoin) Join(field string, subitems []interface{}, context interface{}) {
switch field {
case "actors":
for _, joinItem := range subitems {
item.Actors = append(item.Actors, joinItem.(*Actor))
}
}
}
Subqueries (nested queries)
A condition can be applied to the result of another query (subquery) included in the current query. The condition may either be on the resulting rows of the subquery:
query := db.Query("main_ns").
WhereQuery(db.Query("second_ns").Select("id").Where("age", reindexer.GE, 18), reindexer.GE, 100)
or between a field of the main query’s namespace and the result of the subquery:
query := db.Query("main_ns").
Where("id", reindexer.EQ, db.Query("second_ns").Select("id").Where("age", reindexer.GE, 18))
The result of the subquery may either be a certain field specified by the Select method (in this case it must set a single-field filter):
query1 := db.Query("main_ns").
WhereQuery(db.Query("second_ns").Select("id").Where("age", reindexer.GE, 18), reindexer.GE, 100)
query2 := db.Query("main_ns").
Where("id", reindexer.EQ, db.Query("second_ns").Select("id").Where("age", reindexer.GE, 18))
or the count of items satisfying the subquery, requested by the ReqTotal or CachedTotal methods:
query1 := db.Query("main_ns").
WhereQuery(db.Query("second_ns").Where("age", reindexer.GE, 18).ReqTotal(), reindexer.GE, 100)
query2 := db.Query("main_ns").
Where("id", reindexer.EQ, db.Query("second_ns").Where("age", reindexer.GE, 18).CachedTotal())
or aggregation:
query1 := db.Query("main_ns").
WhereQuery(db.Query("second_ns").Where("age", reindexer.GE, 18).AggregateMax("age"), reindexer.GE, 33)
query2 := db.Query("main_ns").
Where("age", reindexer.GE, db.Query("second_ns").Where("age", reindexer.GE, 18).AggregateAvg("age"))
Only the Min, Max, Avg, Sum, Count and CountCached aggregations are allowed. A subquery cannot contain multiple aggregations at the same time.
A subquery can be applied to the same namespace or to another one. A subquery cannot contain another subquery, join or merge.
If you want to check whether at least one item satisfies the subquery, you may use the ANY or EMPTY condition:
query1 := db.Query("main_ns").
WhereQuery(db.Query("second_ns").Where("age", reindexer.GE, 18), reindexer.ANY, nil)
query2 := db.Query("main_ns").
WhereQuery(db.Query("second_ns").Where("age", reindexer.LE, 18), reindexer.EMPTY, nil)
Complex Primary Keys and Composite Indexes
A document can have multiple fields as a primary key. To enable this feature, add a composite index to the struct. A composite index is an index over multiple fields; it can be used instead of several separate indexes.
type Item struct {
ID int64 `reindex:"id"` // 'id' is a part of a primary key
SubID int `reindex:"sub_id"` // 'sub_id' is a part of a primary key
// Fields
// ....
// Composite index
_ struct{} `reindex:"id+sub_id,,composite,pk"`
}
OR
type Item struct {
ID int64 `reindex:"id,-"` // 'id' is a part of a primary key, WITHOUT a personal searchable index
SubID int `reindex:"sub_id,-"` // 'sub_id' is a part of a primary key, WITHOUT a personal searchable index
SubSubID int `reindex:"sub_sub_id,-"` // 'sub_sub_id' is a part of a primary key WITHOUT a personal searchable index
// Fields
// ....
// Composite index
_ struct{} `reindex:"id+sub_id+sub_sub_id,,composite,pk"`
}
Composite indexes are also useful for sorting results by multiple fields:
type Item struct {
ID int64 `reindex:"id,,pk"`
Rating int `reindex:"rating"`
Year int `reindex:"year"`
// Composite index
_ struct{} `reindex:"rating+year,tree,composite"`
}
...
// Sort query results by rating first, then by year
query := db.Query("items").Sort("rating+year", true)
// Sort query results by rating first, then by year, and put item where rating == 5 and year == 2010 first
query := db.Query("items").Sort("rating+year", true,[]interface{}{5,2010})
To query a composite index, pass []interface{} to the .WhereComposite method of the query builder:
// Get results where rating == 5 and year == 2010
query := db.Query("items").WhereComposite("rating+year", reindexer.EQ,[]interface{}{5,2010})
All the fields in a regular (non-fulltext) composite index must be indexed. I.e., to be able to create the composite index rating+year, it is necessary to create some kind of index for both rating and year first:
type Item struct {
ID int64 `reindex:"id,,pk"`
Rating int `reindex:"rating,-"` // this field must be indexed (using index type '-' in this example)
Year int `reindex:"year"` // this field must be indexed (using index type 'hash' in this example)
_ struct{} `reindex:"rating+year,tree,composite"`
}
Search in array fields with matching array indexes
Reindexer allows searching data in array fields when matching values have the same index positions. For instance, we’ve got an array of structures:
type Elem struct {
F1 int `reindex:"f1"`
F2 int `reindex:"f2"`
}
type A struct {
Elems []Elem
}
A common attempt to search values in this array:
db.Query("Namespace").Where("f1",EQ,1).Where("f2",EQ,2)
finds all items of the array Elem[] where f1 is equal to 1 and f2 is equal to 2.
The EqualPosition function allows searching in array fields with equal indexes.
Queries like this:
db.Query("Namespace").Where("f1", reindexer.GE, 5).Where("f2", reindexer.EQ, 100).EqualPosition("f1", "f2")
or
SELECT * FROM Namespace WHERE f1 >= 5 AND f2 = 100 EQUAL_POSITION(f1,f2);
will find all the items of the array Elem[] with equal array indexes where f1 is greater than or equal to 5 and f2 is equal to 100 (for instance, the query returns 5 items where only the 3rd elements of both arrays have the appropriate values).
With complex expressions (expressions with brackets), equal_position() may be placed inside a bracket:
SELECT * FROM Namespace WHERE (f1 >= 5 AND f2 = 100 EQUAL_POSITION(f1,f2)) OR (f3 = 3 AND f4 < 4 AND f5 = 7 EQUAL_POSITION(f3,f4,f5));
SELECT * FROM Namespace WHERE (f1 >= 5 AND f2 = 100 AND f3 = 3 AND f4 < 4 EQUAL_POSITION(f1,f3) EQUAL_POSITION(f2,f4)) OR (f5 = 3 AND f6 < 4 AND f7 = 7 EQUAL_POSITION(f5,f7));
SELECT * FROM Namespace WHERE f1 >= 5 AND (f2 = 100 AND f3 = 3 AND f4 < 4 EQUAL_POSITION(f2,f3)) AND f5 = 3 AND f6 < 4 EQUAL_POSITION(f1,f5,f6);
equal_position doesn’t work with the following conditions: IS NULL, IS EMPTY and IN (with an empty parameter list).
Atomic on update functions
There are atomic functions which execute under the namespace lock and therefore guarantee data consistency:
- serial() - sequence of integer, useful for auto-increment keys
- now() - current timestamp, useful for data synchronization. It may take one of the following arguments: msec, usec, nsec or sec. The “sec” argument is used by default.
These functions can be passed to Upsert/Insert/Update in the 3rd and subsequent arguments. If these functions are provided, the item passed by reference will be changed to the updated value.
// set ID field from serial generator
db.Insert ("items",&item,"id=serial()")
// set current timestamp in nanoseconds to updated_at field
db.Update ("items",&item,"updated_at=now(NSEC)")
// set current timestamp and ID
db.Upsert ("items",&item,"updated_at=now(NSEC)","id=serial()")
Direct JSON operations
Upsert data in JSON format
If source data is available in JSON format, it is possible to improve the performance of Upsert/Delete operations by passing the JSON directly to Reindexer. JSON deserialization is done by C++ code, without extra allocations/deserialization in Go code.
The Upsert and Delete functions can process JSON by passing a []byte argument with the JSON:
json := []byte (`{"id":1,"name":"test"}`)
db.Upsert ("items",json)
It is just a faster equivalent of:
item := &Item{}
json.Unmarshal ([]byte (`{"id":1,"name":"test"}`),item)
db.Upsert ("items",item)
Get Query results in JSON format
If query results need to be serialized in JSON format, it is possible to improve performance by obtaining the results in JSON format directly from Reindexer. JSON serialization is done by C++ code, without extra allocations/serialization in Go code.
...
iterator := db.Query("items").
Select ("id","name"). // Filter the output JSON: select only the "id" and "name" fields of items; other fields will be omitted
Limit (1).
ExecToJson ("root_object") // Name of root object of output JSON
json,err := iterator.FetchAll()
// Check the error
if err != nil {
panic(err)
}
fmt.Printf ("%s\n",string (json))
...
This code will print something like:
{ "root_object": [{ "id": 1, "name": "test" }] }
Using object cache
To avoid race conditions, the object cache is turned off by default, and all objects are allocated and deserialized from Reindexer’s internal format (called CJSON) per each query.
The deserialization uses reflection, so its speed is not optimal (in fact, CJSON deserialization is ~3-10x faster than JSON and ~1.2x faster than GOB), but performance is still seriously limited by the reflection overhead.
There are 2 ways to enable object cache:
- Provide DeepCopy interface
- Ask query return shared objects from cache
DeepCopy interface
If an object implements the DeepCopy interface, Reindexer will turn on the object cache and use the DeepCopy interface to copy objects from the cache to the query results. The DeepCopy implementation is responsible for making a deep copy of the source object.
Here is a sample DeepCopy implementation:
func (item *Item) DeepCopy () interface {} {
copyItem := &Item{
ID: item.ID,
Name: item.Name,
Articles: make ([]int,len (item.Articles),cap (item.Articles)),
Year: item.Year,
}
copy (copyItem.Articles,item.Articles)
return copyItem
}
Get shared objects from object cache (USE WITH CAUTION)
To speed up queries and avoid allocating new objects per each query, it is possible to ask the query to return objects directly from the object cache. To enable this behavior, call AllowUnsafe(true) on the Iterator.
WARNING: when AllowUnsafe(true) is used, queries return shared pointers to structs in the object cache. Therefore, the application MUST NOT modify the returned objects.
res, err := db.Query("items").WhereInt ("id",reindexer.EQ,1).Exec().AllowUnsafe(true).FetchAll()
if err != nil {
panic (err)
}
if len (res) > 0 {
// item is a SHARED pointer to a struct in the object cache
item := res[0].(*Item)
// It's OK - fmt.Printf will not modify item
fmt.Printf ("%v",item)
// It's WRONG - can race, and will corrupt data in object cache
item.Name = "new name"
}
Limit size of object cache
By default, the maximum size of the object cache is 256000 items for each namespace. To change the maximum size, use the ObjCacheSize method of NamespaceOptions, passed to OpenNamespace, e.g.
// Set object cache limit to 4096 items
db.OpenNamespace("items_with_huge_cache", reindexer.DefaultNamespaceOptions().ObjCacheSize(4096), Item{})
Note: this cache should not be used for namespaces replicated from other nodes: it may be inconsistent for those replica namespaces.
Logging, debug, profiling and tracing
Turn on logger
The Reindexer logger can be turned on with the db.SetLogger() method, like in this snippet of code:
type Logger struct {
}
func (Logger) Printf(level int, format string, msg ...interface{}) {
log.Printf(format, msg...)
}
...
db.SetLogger (Logger{})
Debug queries
Another useful feature is debug printing of processed queries. There are 2 methods to print query details:
- db.SetDefaultQueryDebug(namespace string, level int) - globally enables printing the details of all queries on a namespace
- query.Debug(level int) - prints the details of a single query execution
level is the level of verbosity:
- reindexer.INFO - prints only query conditions
- reindexer.TRACE - prints query conditions and execution details with timings
query.Explain() - calculates and stores query execution details.
iterator.GetExplainResults() - returns query execution details.
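These methods can be combined like this (namespace and field names are illustrative; assumes an opened db):

```go
// Globally print TRACE-level details for all queries on "items"
db.SetDefaultQueryDebug("items", reindexer.TRACE)

// Ask the engine to record execution details for a single query
it := db.Query("items").
	Where("year", reindexer.GT, 2010).
	Explain().
	Exec()
defer it.Close()

explain, err := it.GetExplainResults()
if err != nil {
	panic(err)
}
fmt.Printf("%+v\n", explain)
```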
Profiling
Known profiling issues
Due to Golang’s internal specifics, it is not recommended to collect CPU and heap profiles simultaneously, because it may cause a deadlock inside the profiler.
Integration with other program languages
A list of connectors for working with Reindexer from other programming languages (to be continued):
Spring wrapper
Spring wrapper for Java-connector: https://github.com/evgeniycheban/spring-data-reindexer
3rd party open source connectors
Limitations and known issues
Currently, Reindexer is stable and production ready, but it is still a work in progress, so there are some limitations and issues:
- Internal C++ API is not stabilized and is subject to change.
Getting help
You can get help in several ways:
- Join Reindexer Telegram group
- Write an issue
References
Landing: https://reindexer.io/
Packages repo: https://repo.reindexer.io/
More documentation (RU): https://reindexer.io/reindexer-docs/