XUtils

libucl

Universal configuration library parser. [`FreeBSD`](https://directory.fsf.org/wiki?title=License:FreeBSD)


Basic structure

UCL is heavily infused by nginx configuration as the example of a convenient configuration system. However, UCL is fully compatible with JSON format and is able to parse json files. For example, you can write the same configuration in the following ways:

  • in nginx like:
param = value;
section {
    param = value;
    param1 = value1;
    flag = true;
    number = 10k;
    time = 0.2s;
    string = "something";
    subsection {
        host = {
            host = "hostname";
            port = 900;
        }
        host = {
            host = "hostname";
            port = 901;
        }
    }
}
  • or in JSON:
{
    "param": "value",
    "section": {
        "param": "value",
        "param1": "value1",
        "flag": true,
        "number": 10000,
        "time": "0.2s",
        "string": "something",
        "subsection": {
            "host": [
                {
                    "host": "hostname",
                    "port": 900
                },
                {
                    "host": "hostname",
                    "port": 901
                }
            ]
        }
    }
}

Improvements to the json notation.

There are various things that make ucl configuration more convenient for editing than strict json:

General syntax sugar

  • Braces are not necessary to enclose a top object: it is automatically treated as an object:
"key": "value"

is equal to:

{"key": "value"}
  • There is no requirement of quotes for strings and keys, moreover, : may be replaced = or even be skipped for objects:
key = value;
section {
    key = value;
}

is equal to:

{
    "key": "value",
    "section": {
        "key": "value"
    }
}
  • No commas mess: you can safely place a comma or semicolon for the last element in an array or an object:
{
    "key1": "value",
    "key2": "value",
}

Automatic arrays creation

  • Non-unique keys in an object are allowed and are automatically converted to the arrays internally:
{
    "key": "value1",
    "key": "value2"
}

is converted to:

{
    "key": ["value1", "value2"]
}

Named keys hierarchy

UCL accepts named keys and organize them into objects hierarchy internally. Here is an example of this process:

section "blah" {
	key = value;
}
section foo {
	key = value;
}

is converted to the following object:

section {
	blah {
		key = value;
	}
	foo {
		key = value;
	}
}

Plain definitions may be more complex and contain more than a single level of nested objects:

section "blah" "foo" {
	key = value;
}

is presented as:

section {
	blah {
		foo {
			key = value;
		}
	}
}

Convenient numbers and booleans

  • Numbers can have suffixes to specify standard multipliers:
    • [kKmMgG] - standard 10 base multipliers (so 1k is translated to 1000)
    • [kKmMgG]b - 2 power multipliers (so 1kb is translated to 1024)
    • [s|min|d|w|y] - time multipliers, all time values are translated to float number of seconds, for example 10min is translated to 600.0 and 10ms is translated to 0.01
  • Hexadecimal integers can be used by 0x prefix, for example key = 0xff. However, floating point values can use decimal base only.
  • Booleans can be specified as true or yes or on and false or no or off.
  • It is still possible to treat numbers and booleans as strings by enclosing them in double quotes.

General improvements

Sample single line comment

/* some comment /* nested comment */ end of comment */


# Default priority: 0.
foo = 6
.priority 5
# The following will have priority 5.
bar = 6
baz = 7
# The following will be included with a priority of 3, 5, and 6 respectively.
.include(priority=3) "path.conf"
.include(priority=5) "equivalent-path.conf"
.include(priority=6) "highpriority-path.conf"

Multiline strings

UCL can handle multiline strings as well as single line ones. It uses shell/perl like notation for such objects:

key = <<EOD
some text
splitted to
lines
EOD

In this example key will be interpreted as the following string: some text\nsplitted to\nlines. Here are some rules for this syntax:

  • Multiline terminator must start just after << symbols and it must consist of capital letters only (e.g. <<eof or << EOF won’t work);
  • Terminator must end with a single newline character (and no spaces are allowed between terminator and newline character);
  • To finish multiline string you need to include a terminator string just after newline and followed by a newline (no spaces or other characters are allowed as well);
  • The initial and the final newlines are not inserted to the resulting string, but you can still specify newlines at the beginning and at the end of a value, for example:
key <<EOD

some
text

EOD

Single quoted strings

It is possible to use single quoted strings to simplify escaping rules. All values passed in single quoted strings are NOT escaped, with two exceptions: a single ' character just before \ character, and a newline character just after \ character that is ignored.

key = 'value'; # Read as value
key = 'value\n\'; # Read as  value\n\
key = 'value\''; # Read as value'
key = 'value\
bla'; # Read as valuebla

Performance

Are UCL parser and emitter fast enough? Well, there are some numbers. I got a 19Mb file that consist of ~700 thousand lines of json (obtained via http://www.json-generator.com/). Then I checked jansson library that performs json parsing and emitting and compared it with UCL. Here are results:

jansson: parsed json in 1.3899 seconds
jansson: emitted object in 0.2609 seconds

ucl: parsed input in 0.6649 seconds
ucl: emitted config in 0.2423 seconds
ucl: emitted json in 0.2329 seconds
ucl: emitted compact json in 0.1811 seconds
ucl: emitted yaml in 0.2489 seconds

So far, UCL seems to be significantly faster than jansson on parsing and slightly faster on emitting. Moreover, UCL compiled with optimizations (-O3) performs significantly faster:

ucl: parsed input in 0.3002 seconds
ucl: emitted config in 0.1174 seconds
ucl: emitted json in 0.1174 seconds
ucl: emitted compact json in 0.0991 seconds
ucl: emitted yaml in 0.1354 seconds

You can do your own benchmarks by running make check in libucl top directory.

Conclusion

UCL has clear design that should be very convenient for reading and writing. At the same time it is compatible with JSON language and therefore can be used as a simple JSON parser. Macro logic provides an ability to extend configuration language (for example by including some lua code) and comments allow to disable or enable the parts of a configuration quickly.


Articles

  • coming soon...