protox

An Elixir implementation of Protocol Buffers.


Prerequisites

  • Elixir >= 1.12
  • protoc >= 3.0. This dependency is only required at compile time: protox uses Google’s protoc to parse .proto files, and it must be available in $PATH.

👉 You can download it here or you can install it with your favorite package manager (brew install protobuf, apt install protobuf-compiler, etc.).

ℹ️ If you choose to generate files (see Files generation below), protoc isn’t needed to compile them.

Protobuf binary format

Encode

Here’s how to create and encode a new message to binary protobuf:

iex> msg = %Fiz.Foo{a: 3, b: %{1 => %Fiz.Baz{}}}
iex> {:ok, iodata} = Protox.encode(msg)

Or, with throwing style:

iex> iodata = Protox.encode!(msg)

It’s also possible to call encode/1 and encode!/1 directly on the generated structures:

iex> {:ok, iodata} = Fiz.Foo.encode(msg)
iex> iodata = Fiz.Foo.encode!(msg)

ℹ️ Note that encode/1 returns IO data for efficiency. Such IO data can be used directly in file or socket write operations:

iex> {:ok, iodata} = Protox.encode(%Fiz.Foo{a: 3, b: %{1 => %Fiz.Baz{}}})
{:ok, [[[], <<18>>, <<4>>, "\b", <<1>>, <<18>>, <<0>>], "\b", <<3>>]}
iex> {:ok, file} = File.open("msg.bin", [:write])
{:ok, #PID<0.1023.0>}
iex> IO.binwrite(file, iodata)
:ok

👉 You can use :binary.list_to_bin/1 or IO.iodata_to_binary/1 to get a binary:

iex> %Fiz.Foo{a: 3, b: %{1 => %Fiz.Baz{}}} |> Protox.encode!() |> :binary.list_to_bin()
<<8, 3, 18, 4, 8, 1, 18, 0>>

Decode

Here’s how to decode a message from binary protobuf:

iex> {:ok, msg} = Protox.decode(<<8, 3, 18, 4, 8, 1, 18, 0>>, Fiz.Foo)

Or, with throwing style:

iex> msg = Protox.decode!(<<8, 3, 18, 4, 8, 1, 18, 0>>, Fiz.Foo)

It’s also possible to call decode/1 and decode!/1 directly on the generated structures:

iex> {:ok, msg} = Fiz.Foo.decode(<<8, 3, 18, 4, 8, 1, 18, 0>>)
iex> msg = Fiz.Foo.decode!(<<8, 3, 18, 4, 8, 1, 18, 0>>)

Protobuf JSON format

protox implements Google’s Protobuf JSON specification.

Encode

Here’s how to encode a message to JSON, exported as IO data:

iex> msg = %Fiz.Foo{a: 42}
iex> {:ok, iodata} = Protox.json_encode(msg)
{:ok, ["{", ["\"a\"", ":", "42"], "}"]}

Or, with throwing style:

iex> msg = %Fiz.Foo{a: 42}
iex> iodata = Protox.json_encode!(msg)
["{", ["\"a\"", ":", "42"], "}"]

It’s also possible to call json_encode and json_encode! directly on the generated structures:

iex> {:ok, iodata} = Fiz.Foo.json_encode(msg)
iex> iodata = Fiz.Foo.json_encode!(msg)

Decode

Here’s how to decode JSON to a message:

iex> Protox.json_decode("{\"a\":42}", Fiz.Foo)
{:ok, %Fiz.Foo{__uf__: [], a: 42, b: %{}}}

Or, with throwing style:

iex> Protox.json_decode!("{\"a\":42}", Fiz.Foo)
%Fiz.Foo{__uf__: [], a: 42, b: %{}}

It’s also possible to call json_decode and json_decode! directly on the generated structures:

iex> Fiz.Foo.json_decode("{\"a\":42}")
iex> Fiz.Foo.json_decode!("{\"a\":42}")

JSON library configuration

By default, protox uses Jason to encode values to JSON (mostly to escape strings). You can also use Poison:

iex> Protox.json_decode!(iodata, Fiz.Foo, json_library: Protox.Poison)
iex> Protox.json_encode!(msg, json_library: Protox.Poison)

ℹ️ You can use any other library by implementing the Protox.JsonLibrary behaviour.

👉 Don’t forget to add the chosen library to the list of dependencies in mix.exs.

Packages and namespaces

Packages

Protobuf provides a package directive:

package abc.def;
message Baz {}

Modules generated by protox will include this package declaration. Thus, the example above will be translated to Abc.Def.Baz (note the camelization of package abc.def to Abc.Def).
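
For instance, using the inline schema form shown in the next section (the wrapping Elixir module name MyDefs is arbitrary and chosen for this sketch), the message above becomes available as Abc.Def.Baz:

defmodule MyDefs do
  use Protox, schema: """
    syntax = "proto3";

    package abc.def;

    message Baz {}
    """
end

iex> %Abc.Def.Baz{}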

Prepend namespaces

In addition, protox can prepend a namespace to all generated modules with the namespace option:

defmodule Bar do
  use Protox, schema: """
    syntax = "proto3";

    package abc;

    message Msg {
        int32 a = 1;
      }
    """,
    namespace: MyApp
end

In this example, the module MyApp.Abc.Msg is generated:

iex> msg = %MyApp.Abc.Msg{a: 42}

Specify import path

An import path can be specified with the path: option (a single directory) or the paths: option (a list of directories), which tell protox where to search for imports:

defmodule Baz do
  use Protox,
    files: [
      "./defs/prefix/foo.proto",
      "./defs/prefix/bar/bar.proto",
    ],
    path: "./defs"
end

If multiple search paths are needed:

defmodule Baz do
  use Protox,
    files: [
      "./defs1/prefix/foo.proto",
      "./defs1/prefix/bar.proto",
      "./defs2/prefix/baz/baz.proto"
    ],
    paths: [
      "./defs1",
      "./defs2"
    ]
end

These options correspond to the -I option of protoc.

Unknown fields

Unknown fields are fields that are present on the wire but do not correspond to any entry in the protobuf definition. This typically occurs when the sender has a newer version of the protobuf definition, and it enables backward compatibility: a receiver with an older version of the definition can still decode the fields it knows about.

When unknown fields are encountered at decoding time, they are kept in the decoded message. You can access them with the unknown_fields/1 function generated alongside the message:

iex> msg = Msg.decode!(<<8, 42, 42, 4, 121, 97, 121, 101, 136, 241, 4, 83>>)
%Msg{a: 42, b: "", z: -42, __uf__: [{5, 2, <<121, 97, 121, 101>>}]}

iex> Msg.unknown_fields(msg)
[{5, 2, <<121, 97, 121, 101>>}]

You must always use unknown_fields/1, as the name of the underlying field (e.g. __uf__ in the above example) is generated at compile time to avoid collisions with the actual fields of the Protobuf message. This function returns a list of tuples {tag, wire_type, bytes}. For more information, please see the protobuf encoding guide.

When you encode a message that contains unknown fields, they are re-encoded in the serialized output.
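
For instance, re-encoding the message decoded above keeps its unknown field. A minimal sketch (no output shown, as the byte ordering of the re-encoded message may differ from the original input):

iex> msg |> Msg.encode!() |> :binary.list_to_bin()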

Implementation choices

  • This library enforces the presence of required fields (Protobuf 2). Therefore, an error is raised when encoding or decoding a message with a missing required field:

    defmodule Bar do
      use Protox, schema: """
        syntax = "proto2";
    
    
        message Required {
          required int32 a = 1;
        }
      """
    end
    
    
    iex> Protox.encode!(%Required{})
    ** (Protox.RequiredFieldsError) Some required fields are not set: [:a]
    
    
    iex> Required.decode!(<<>>)
    ** (Protox.RequiredFieldsError) Some required fields are not set: [:a]
    
  • When decoding enum aliases, the last encountered constant is used. For instance, in the following example, :BAR is always used if the value 1 is read on the wire (see the sketch after this list):

    enum E {
      option allow_alias = true;
      FOO = 0;
      BAZ = 1;
      BAR = 1;
    }
    
  • Unset optionals

    • For Protobuf 2, unset optional fields are mapped to nil. You can use the generated default/1 function to get the default value of a field:

      defmodule Bar do
        use Protox,
        schema: """
          syntax = "proto2";
      
      
          message Foo {
            optional int32 a = 1 [default = 42];
          }
        """
      end
      
      
      iex> Foo.default(:a)
      {:ok, 42}
      
      
      iex> %Foo{}.a
      nil
      
      
      

      This means that if you need to know whether a field has been set by the sender, you only have to check whether its value is nil.

    • For Protobuf 3, unset fields are mapped to their default values. However, if you use the optional keyword (available in protoc version 3.15 and higher), then unset fields will be mapped to nil:

      defmodule Bar do
        use Protox,
        schema: """
          syntax = "proto3";
      
      
          message Foo {
            int32 a = 1;
            optional int32 b = 2;
          }
        """
      end
      
      
      iex> Foo.default(:a)
      {:ok, 0}
      
      
      iex> %Foo{}.a
      0
      
      
      iex> Foo.default(:b)
      {:error, :no_default_value}
      
      
      iex> %Foo{}.b
      nil
      
  • Message and enum names are converted using the Macro.camelize/1 function. Thus, in the following example, non_camel_message becomes NonCamelMessage, but the field non_camel_field is left unchanged:

    defmodule Bar do
      use Protox,
      schema: """
        syntax = "proto3";
    
    
        message non_camel_message {
        }
    
    
        message CamelMessage {
          int32 non_camel_field = 1;
        }
      """
    end
    
    
    iex> msg = %NonCamelMessage{}
    %NonCamelMessage{__uf__: []}
    
    
    iex> msg = %CamelMessage{}
    %CamelMessage{__uf__: [], non_camel_field: 0}
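
Returning to the enum-alias rule above, here is a minimal sketch of that behaviour (the module name EnumAliasExample, the wrapping message Msg and its field e are assumptions made for this example):

defmodule EnumAliasExample do
  use Protox, schema: """
    syntax = "proto3";

    enum E {
      option allow_alias = true;
      FOO = 0;
      BAZ = 1;
      BAR = 1;
    }

    message Msg {
      E e = 1;
    }
    """
end

iex> Msg.decode!(<<8, 1>>).e
:BAR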
    

Generated code reference

The detailed reference of the generated code is available here.

Files generation

It’s possible to generate a file that will contain all code corresponding to the protobuf messages:

MIX_ENV=prod mix protox.generate --output-path=/path/to/message.ex --include-path=./test/samples test/samples/messages.proto test/samples/proto2.proto

The generated file can be used in any project as long as protox is declared in the dependencies, as it needs functions from the protox runtime.
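
A minimal sketch of such a declaration in mix.exs (the version requirement is an assumption; use the latest published release):

defp deps do
  [
    # The version requirement below is an assumption; pick the latest protox release.
    {:protox, "~> 1.7"}
  ]
end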

Conformance

The protox library has been thoroughly tested using the conformance checker provided by Google.

Here’s how to launch the conformance tests:

  • Get conformance-test-runner sources.

  • Compile conformance-test-runner (macOS and Linux only):

    tar xf protobuf-3.18.0.tar.gz && cd protobuf-3.18.0 && ./autogen.sh && ./configure && make -j && cd conformance && make -j
    
  • Launch the conformance tests:

    mix protox.conformance --runner=/path/to/protobuf-3.18.0/conformance/conformance-test-runner
    
  • A report will be generated in the directory conformance_report and the following text should be displayed:

    CONFORMANCE TEST BEGIN ====================================
    
    
    CONFORMANCE SUITE PASSED: 1996 successes, 0 skipped, 21 expected failures, 0 unexpected failures.
    
    
    CONFORMANCE TEST BEGIN ====================================
    
    
    CONFORMANCE SUITE PASSED: 0 successes, 120 skipped, 0 expected failures, 0 unexpected failures.
    
  • You can alternatively launch these conformance tests with mix test by setting the PROTOBUF_CONFORMANCE_RUNNER environment variable and including the conformance tag:

     PROTOBUF_CONFORMANCE_RUNNER=/path/to/conformance-test-runner MIX_ENV=test mix test --include conformance
    

Types mapping

The following table shows how Protobuf types are mapped to Elixir types.

Protobuf    Elixir
int32       integer()
int64       integer()
uint32      integer()
uint64      integer()
sint32      integer()
sint64      integer()
fixed32     integer()
fixed64     integer()
sfixed32    integer()
sfixed64    integer()
float       float() | :infinity | :'-infinity' | :nan
double      float() | :infinity | :'-infinity' | :nan
bool        boolean()
string      String.t()
bytes       binary()
repeated    list(value_type) where value_type is the type of the repeated field
map         map()
oneof       {atom(), value_type} where atom() is the name of the set field and value_type its type
enum        atom() | integer()
message     struct()
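
As an illustration of the oneof row, here is a minimal sketch (the module, message and field names are made up for this example); the oneof group is exposed as a single struct field, named after the group, holding a {field_name, value} tuple:

defmodule OneofExample do
  use Protox, schema: """
    syntax = "proto3";

    message Payload {
      oneof content {
        string text = 1;
        int32 number = 2;
      }
    }
    """
end

iex> msg = %Payload{content: {:text, "hello"}}
iex> {:text, value} = msg.content
iex> value
"hello"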

Benchmarks

You can launch benchmarks to see how protox performs:

mix run ./benchmarks/generate_payloads.exs # first time only, generates random payloads
mix run ./benchmarks/run.exs --lib=./benchmarks/protox.exs
mix run ./benchmarks/load.exs

Development

protox uses pre-commit to run git hooks. It’s strongly recommended to install it and then install the hooks as follows:

pre-commit install && pre-commit install -t pre-push

Credits

Both gpb and exprotobuf were very useful in understanding how to implement Protocol Buffers.


Articles

  • coming soon...