VictoriaMetrics

A fast, resource-efficient and scalable open source time series database. It can be used as long-term remote storage for Prometheus and supports PromQL.


Components

The VictoriaMetrics ecosystem contains the following components in addition to single-node VictoriaMetrics:

  • vmagent - lightweight agent for receiving metrics via pull-based and push-based protocols, transforming and sending them to the configured Prometheus-compatible remote storage systems such as VictoriaMetrics.
  • vmalert - a service for processing Prometheus-compatible alerting and recording rules.
  • vmalert-tool - a tool for validating alerting and recording rules.
  • vmauth - authorization proxy and load balancer optimized for VictoriaMetrics products.
  • vmgateway - authorization proxy with per-tenant rate limiting capabilities.
  • vmctl - a tool for migrating and copying data between different storage systems for metrics.
  • vmbackup, vmrestore and vmbackupmanager - tools for creating backups and restoring from backups for VictoriaMetrics data.
  • vminsert, vmselect and vmstorage - components of VictoriaMetrics cluster.
  • VictoriaLogs - user-friendly cost-efficient database for logs.

Operation

Environment variables

All the VictoriaMetrics components allow referring to environment variables in yaml configuration files (such as -promscrape.config) and in command-line flags via %{ENV_VAR} syntax. For example, -metricsAuthKey=%{METRICS_AUTH_KEY} is automatically expanded to -metricsAuthKey=top-secret if the METRICS_AUTH_KEY=top-secret environment variable exists at VictoriaMetrics startup. This expansion is performed by VictoriaMetrics itself.

VictoriaMetrics recursively expands %{ENV_VAR} references in environment variables on startup. For example, FOO=%{BAR} environment variable is expanded to FOO=abc if BAR=a%{BAZ} and BAZ=bc.
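The recursive expansion described above can be illustrated with a minimal Python sketch. It is not the VictoriaMetrics implementation: the function name expand_env is hypothetical, and the handling of undefined and circular references is an assumption made only for this illustration.

```python
import re

# Matches %{NAME} references.
_REF = re.compile(r"%\{([^}]+)\}")


def expand_env(value, env, _seen=()):
    """Recursively replace %{NAME} references in value using the env mapping."""

    def replace(match):
        name = match.group(1)
        if name in _seen:
            # Assumption: circular references are reported as errors.
            raise ValueError(f"circular reference to {name}")
        if name not in env:
            # Assumption: references to undefined variables are left untouched.
            return match.group(0)
        return expand_env(env[name], env, _seen + (name,))

    return _REF.sub(replace, value)


env = {"FOO": "%{BAR}", "BAR": "a%{BAZ}", "BAZ": "bc"}
print(expand_env(env["FOO"], env))  # prints "abc"
```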

Additionally, all the VictoriaMetrics components allow setting flag values via environment variables according to these rules:

  • The -envflag.enable flag must be set.
  • Each . char in flag name must be substituted with _ (for example -insert.maxQueueDuration <duration> will translate to insert_maxQueueDuration=<duration>).
  • For repeating flags an alternative syntax can be used by joining the different values into one using , char as separator (for example -storageNode <nodeA> -storageNode <nodeB> will translate to storageNode=<nodeA>,<nodeB>).
  • Environment var prefix can be set via -envflag.prefix flag. For instance, if -envflag.prefix=VM_, then env vars must be prepended with VM_.
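The naming rules above can be sketched as a small Python helper that turns a flag and its values into an environment variable name and value. The helper name flag_to_env is hypothetical and exists only for illustration; the -envflag.enable flag still has to be set for the variables to take effect.

```python
def flag_to_env(flag, values, prefix=""):
    """Return the (name, value) pair of the env var that corresponds to a flag.

    Dots in the flag name become underscores, repeated flag values are
    joined with commas, and the optional prefix mirrors -envflag.prefix.
    """
    name = flag.lstrip("-").replace(".", "_")
    return f"{prefix}{name}", ",".join(values)


print(flag_to_env("-insert.maxQueueDuration", ["1m"]))
print(flag_to_env("-storageNode", ["nodeA", "nodeB"], prefix="VM_"))
```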

Configuration with snap package

Snap package for VictoriaMetrics is available here.

Command-line flags for the Snap package can be set with the following command:

echo 'FLAGS="-selfScrapeInterval=10s -search.logSlowQueryDuration=20s"' > $SNAP_DATA/var/snap/victoriametrics/current/extra_flags
snap restart victoriametrics

Do not change the value of the -storageDataPath flag, because the snap package has limited access to the host filesystem.

The scrape configuration can be changed with a text editor:

vi $SNAP_DATA/var/snap/victoriametrics/current/etc/victoriametrics-scrape-config.yaml

After making changes, trigger a config re-read with the command curl 127.0.0.1:8428/-/reload.

Running as Windows service

In order to run VictoriaMetrics as a Windows service it is required to create a service configuration for WinSW and then install it as a service according to the following guide:

  1. Create a service configuration:

    <service>
      <id>VictoriaMetrics</id>
      <name>VictoriaMetrics</name>
      <description>VictoriaMetrics</description>
      <executable>%BASE%\victoria-metrics-windows-amd64-prod.exe</executable>

      <onfailure action="restart" delay="10 sec"/>
      <onfailure action="restart" delay="20 sec"/>

      <resetfailure>1 hour</resetfailure>

      <arguments>-envflag.enable</arguments>

      <priority>Normal</priority>

      <stoptimeout>15 sec</stoptimeout>

      <stopparentprocessfirst>true</stopparentprocessfirst>
      <startmode>Automatic</startmode>
      <waithint>15 sec</waithint>
      <sleeptime>1 sec</sleeptime>

      <logpath>%BASE%\logs</logpath>
      <log mode="roll">
        <sizeThreshold>10240</sizeThreshold>
        <keepFiles>8</keepFiles>
      </log>

      <env name="loggerFormat" value="json" />
      <env name="loggerOutput" value="stderr" />
      <env name="promscrape_config" value="C:\Program Files\victoria-metrics\promscrape.yml" />
    </service>
    
  2. Install WinSW by following this documentation.

  3. Install VictoriaMetrics as a service by running the following from elevated PowerShell:

    winsw install VictoriaMetrics.xml
    Get-Service VictoriaMetrics | Start-Service
    

See this issue for more details.

How to upgrade VictoriaMetrics

VictoriaMetrics is developed at a fast pace, so it is recommended to periodically check the CHANGELOG page and perform regular upgrades.

It is safe to upgrade VictoriaMetrics to new versions unless release notes say otherwise. It is safe to skip multiple versions during the upgrade unless release notes say otherwise. It is recommended to perform regular upgrades to the latest version, since it may contain important bug fixes, performance optimizations or new features.

It is also safe to downgrade to older versions unless release notes say otherwise.

The following steps must be performed during the upgrade / downgrade procedure:

  • Send SIGINT signal to VictoriaMetrics process in order to gracefully stop it. See how to send signals to processes.
  • Wait until the process stops. This can take a few seconds.
  • Start the upgraded VictoriaMetrics.

Prometheus doesn’t drop data during VictoriaMetrics restart. See this article for details. The same applies also to vmagent.

Top queries

VMUI provides a top queries tab, which helps identify the following query types:

  • the most frequently executed queries;
  • queries with the biggest average execution duration;
  • queries that took the most total time for execution.

This information is obtained from the /api/v1/status/top_queries HTTP endpoint.

Active queries

VMUI provides an active queries tab, which shows currently executing queries. It provides the following information for each query:

  • The query itself, together with the time range and step args passed to /api/v1/query_range.
  • The duration of the query execution.
  • The address of the client that initiated the query execution.

This information is obtained from the /api/v1/status/active_queries HTTP endpoint.

Metrics explorer

VMUI provides an ability to explore metrics exported by a particular job / instance in the following way:

  1. Open the vmui at http://victoriametrics:8428/vmui/.
  2. Click the Explore Prometheus metrics tab.
  3. Select the job you want to explore.
  4. Optionally select the instance for the selected job to explore.
  5. Select metrics you want to explore and compare.

It is possible to change the selected time range for the graphs in the top right corner.

Cardinality explorer

VictoriaMetrics provides the ability to explore time series cardinality on the Explore cardinality tab in vmui in the following ways:

  • To identify metric names with the highest number of series.
  • To identify labels with the highest number of series.
  • To identify values with the highest number of series for the selected label (aka focusLabel).
  • To identify label=value pairs with the highest number of series.
  • To identify labels with the highest number of unique values. Note that cluster version of VictoriaMetrics may show lower than expected number of unique label values for labels with small number of unique values. This is because of implementation limits.

By default, cardinality explorer analyzes time series for the current date. A different day can be selected at the top right corner. By default, all the time series for the selected date are analyzed. It is possible to narrow down the analysis to series matching the specified series selector.

Cardinality explorer is built on top of /api/v1/status/tsdb.

See cardinality explorer playground. See the example of using the cardinality explorer here.

Cardinality explorer statistic inaccuracy

In cluster version of VictoriaMetrics each vmstorage tracks the stored time series individually. vmselect requests stats via /api/v1/status/tsdb API from each vmstorage node and merges the results by summing per-series stats. This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes due to replication or rerouting.
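The inflation effect can be illustrated with a small worked example in Python. The numbers and the function name merge_series_counts are hypothetical: the sketch assumes three real series of one metric, each replicated to two of three vmstorage nodes, and merges the per-node counts by plain summation as described above.

```python
from collections import Counter


def merge_series_counts(per_node_stats):
    """Sum the per-metric series counts reported by every node."""
    total = Counter()
    for stats in per_node_stats:
        total.update(stats)
    return dict(total)


# Hypothetical layout: 3 real series, 2 copies each, spread over 3 nodes,
# so every node tracks 2 series of the metric.
node_a = {"http_requests_total": 2}
node_b = {"http_requests_total": 2}
node_c = {"http_requests_total": 2}

merged = merge_series_counts([node_a, node_b, node_c])
print(merged)  # prints {'http_requests_total': 6}
```

The merged value is 6, although only 3 distinct series exist, because each replica is counted once per node that stores it.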

How to apply new config to VictoriaMetrics

VictoriaMetrics is configured via command-line flags, so it must be restarted when new command-line flags should be applied:

  • Send SIGINT signal to VictoriaMetrics process in order to gracefully stop it.
  • Wait until the process stops. This can take a few seconds.
  • Start VictoriaMetrics with the new command-line flags.

Prometheus doesn’t drop data during VictoriaMetrics restart. See this article for details. The same applies also to vmagent.

How to send data from DataDog agent

VictoriaMetrics accepts data from DataDog agent, DogStatsD and DataDog Lambda Extension via “submit metrics” API at /datadog/api/v2/series or via “sketches” API at /datadog/api/beta/sketches.

Sending metrics to VictoriaMetrics

DataDog agent allows configuring destinations for metrics sending via ENV variable DD_DD_URL or via configuration file in section dd_url.

To configure DataDog agent via ENV variable add the following prefix:

DD_DD_URL=http://victoriametrics:8428/datadog

Choose correct URL for VictoriaMetrics here.

To configure DataDog agent via configuration file add the following line:

dd_url: http://victoriametrics:8428/datadog

vmagent can also accept the DataDog metrics format. Depending on where vmagent will forward data, pick single-node or cluster URL formats.

Sending metrics to DataDog and VictoriaMetrics

DataDog allows configuring Dual Shipping for metrics sending via ENV variable DD_ADDITIONAL_ENDPOINTS or via configuration file additional_endpoints.

Run DataDog using the following ENV variable with VictoriaMetrics as additional metrics receiver:

DD_ADDITIONAL_ENDPOINTS='{\"http://victoriametrics:8428/datadog\": [\"apikey\"]}'

Choose correct URL for VictoriaMetrics here.

To configure DataDog Dual Shipping via configuration file add the following line:

additional_endpoints:
  "http://victoriametrics:8428/datadog":
  - apikey

Send via cURL

See how to send data to VictoriaMetrics via DataDog “submit metrics” API here.

The imported data can be read via export API.

How to send data in InfluxDB v2 format

VictoriaMetrics exposes endpoint for InfluxDB v2 HTTP API at /influx/api/v2/write and /api/v2/write.

To write data with InfluxDB line protocol to a local VictoriaMetrics using curl, run:

curl -d 'measurement,tag1=value1,tag2=value2 field1=123,field2=1.23' -X POST 'http://localhost:8428/api/v2/write'

The /api/v1/export endpoint should return the following response:

{"metric":{"__name__":"measurement_field1","tag1":"value1","tag2":"value2"},"values":[123],"timestamps":[1695902762311]}
{"metric":{"__name__":"measurement_field2","tag1":"value1","tag2":"value2"},"values":[1.23],"timestamps":[1695902762311]}
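The mapping visible in the response above (measurement and field name joined into the metric name, tags kept as labels) can be sketched in Python. This is a simplified illustration, not the VictoriaMetrics parser: the function name influx_line_to_series and the sep parameter are hypothetical, and escaping, quoted strings, integer suffixes and timestamps are not handled.

```python
def influx_line_to_series(line, sep="_"):
    """Convert one simple InfluxDB line protocol line into per-field series."""
    # The line has the form: measurement,tags fields [timestamp]
    head, fields, *_timestamp = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    labels = dict(pair.split("=", 1) for pair in tag_pairs)

    series = []
    for field in fields.split(","):
        key, raw = field.split("=", 1)
        series.append({
            # One series per field, named <measurement><sep><field>.
            "metric": {"__name__": f"{measurement}{sep}{key}", **labels},
            "value": float(raw),
        })
    return series


line = "measurement,tag1=value1,tag2=value2 field1=123,field2=1.23"
for item in influx_line_to_series(line):
    print(item)
```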

How to send data from Graphite-compatible agents such as StatsD

Enable Graphite receiver in VictoriaMetrics by setting -graphiteListenAddr command line flag. For instance, the following command will enable Graphite receiver in VictoriaMetrics on TCP and UDP port 2003:

/path/to/victoria-metrics-prod -graphiteListenAddr=:2003

Use the configured address in Graphite-compatible agents. For instance, set graphiteHost to the VictoriaMetrics host in StatsD configs.

Example for writing data with Graphite plaintext protocol to local VictoriaMetrics using nc:

echo "foo.bar.baz;tag1=value1;tag2=value2 123 `date +%s`" | nc -N localhost 2003

VictoriaMetrics sets the current time if the timestamp is omitted. An arbitrary number of lines delimited by \n (aka newline char) can be sent in one go. After that the data may be read via /api/v1/export endpoint:

curl -G 'http://localhost:8428/api/v1/export' -d 'match=foo.bar.baz'

The /api/v1/export endpoint should return the following response:

{"metric":{"__name__":"foo.bar.baz","tag1":"value1","tag2":"value2"},"values":[123],"timestamps":[1560277406000]}

Graphite relabeling can be used if the imported Graphite data is going to be queried via MetricsQL.
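For agents written in Python, the nc example above can be approximated with the standard socket module. This is a minimal sketch: the helper names graphite_line and send_lines are hypothetical, and the host and port defaults assume the -graphiteListenAddr=:2003 setup shown earlier.

```python
import socket
import time


def graphite_line(name, value, tags=None, timestamp=None):
    """Format one Graphite plaintext line with optional ;tag=value pairs."""
    tagged = name + "".join(f";{k}={v}" for k, v in (tags or {}).items())
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{tagged} {value} {ts}\n"


def send_lines(lines, host="localhost", port=2003):
    """Send newline-delimited lines over a single TCP connection."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall("".join(lines).encode())


line = graphite_line("foo.bar.baz", 123, {"tag1": "value1", "tag2": "value2"}, 1560277406)
print(line, end="")
# send_lines([line])  # requires a running Graphite receiver
```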

Sending data via telnet put protocol

Enable OpenTSDB receiver in VictoriaMetrics by setting -opentsdbListenAddr command line flag. For instance, the following command enables OpenTSDB receiver in VictoriaMetrics on TCP and UDP port 4242:

/path/to/victoria-metrics-prod -opentsdbListenAddr=:4242

Send data to the given address from OpenTSDB-compatible agents.

Example for writing data with OpenTSDB protocol to local VictoriaMetrics using nc:

echo "put foo.bar.baz `date +%s` 123 tag1=value1 tag2=value2" | nc -N localhost 4242

An arbitrary number of lines delimited by \n (aka newline char) can be sent in one go. After that the data may be read via /api/v1/export endpoint:

curl -G 'http://localhost:8428/api/v1/export' -d 'match=foo.bar.baz'

The /api/v1/export endpoint should return the following response:

{"metric":{"__name__":"foo.bar.baz","tag1":"value1","tag2":"value2"},"values":[123],"timestamps":[1560277292000]}

Sending OpenTSDB data via HTTP /api/put requests

Enable HTTP server for OpenTSDB /api/put requests by setting -opentsdbHTTPListenAddr command line flag. For instance, the following command enables OpenTSDB HTTP server on port 4242:

/path/to/victoria-metrics-prod -opentsdbHTTPListenAddr=:4242

Send data to the given address from OpenTSDB-compatible agents.

Example for writing a single data point:

curl -H 'Content-Type: application/json' -d '{"metric":"x.y.z","value":45.34,"tags":{"t1":"v1","t2":"v2"}}' http://localhost:4242/api/put

Example for writing multiple data points in a single request:

curl -H 'Content-Type: application/json' -d '[{"metric":"foo","value":45.34},{"metric":"bar","value":43}]' http://localhost:4242/api/put

After that the data may be read via /api/v1/export endpoint:

curl -G 'http://localhost:8428/api/v1/export' -d 'match[]=x.y.z' -d 'match[]=foo' -d 'match[]=bar'

The /api/v1/export endpoint should return the following response:

{"metric":{"__name__":"foo"},"values":[45.34],"timestamps":[1566464846000]}
{"metric":{"__name__":"bar"},"values":[43],"timestamps":[1566464846000]}
{"metric":{"__name__":"x.y.z","t1":"v1","t2":"v2"},"values":[45.34],"timestamps":[1566464763000]}

Extra labels may be added to all the imported time series by passing extra_label=name=value query args. For example, /api/put?extra_label=foo=bar would add {foo="bar"} label to all the ingested metrics.
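The curl requests above can also be assembled with the Python standard library. The sketch below only builds the request; the function name build_put_request is hypothetical, and the base address assumes the -opentsdbHTTPListenAddr=:4242 setup from this section.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_put_request(points, base="http://localhost:4242", extra_labels=None):
    """Build a POST request for /api/put with optional extra_label query args."""
    query = urlencode(
        [("extra_label", f"{k}={v}") for k, v in (extra_labels or {}).items()]
    )
    url = f"{base}/api/put" + (f"?{query}" if query else "")
    return urllib.request.Request(
        url,
        data=json.dumps(points).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


points = [{"metric": "foo", "value": 45.34}, {"metric": "bar", "value": 43}]
request = build_put_request(points, extra_labels={"foo": "bar"})
print(request.full_url)
# urllib.request.urlopen(request)  # requires a running OpenTSDB HTTP receiver
```

The = sign inside the extra_label value is percent-encoded by urlencode, which is a standard URL encoding of the same query arg.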

NewRelic agent data mapping

VictoriaMetrics maps NewRelic Events to raw samples in the following way:

  1. Every numeric field is converted into a raw sample with the corresponding name.
  2. The eventType and all the other fields with string value type are attached to every raw sample as metric labels.
  3. The timestamp field is used as timestamp for the ingested raw sample. The timestamp field may be specified either in seconds or in milliseconds since the Unix Epoch. If the timestamp field is missing, then the raw sample is stored with the current timestamp.

For example, let’s import the following NewRelic Events request to VictoriaMetrics:

[
  {
    "Events":[
      {
        "eventType":"SystemSample",
        "entityKey":"macbook-pro.local",
        "cpuPercent":25.056660790748904,
        "cpuUserPercent":8.687987912389374,
        "cpuSystemPercent":16.36867287835953,
        "cpuIOWaitPercent":0,
        "cpuIdlePercent":74.94333920925109,
        "cpuStealPercent":0,
        "loadAverageOneMinute":5.42333984375,
        "loadAverageFiveMinute":4.099609375,
        "loadAverageFifteenMinute":3.58203125
      }
    ]
  }
]

Save this JSON into newrelic.json file and then use the following command in order to import it into VictoriaMetrics:

curl -X POST -H 'Content-Type: application/json' --data-binary @newrelic.json http://localhost:8428/newrelic/infra/v2/metrics/events/bulk

Let’s fetch the ingested data via data export API:

curl http://localhost:8428/api/v1/export -d 'match={eventType="SystemSample"}'
{"metric":{"__name__":"cpuStealPercent","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[0],"timestamps":[1697407970000]}
{"metric":{"__name__":"loadAverageFiveMinute","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[4.099609375],"timestamps":[1697407970000]}
{"metric":{"__name__":"cpuIOWaitPercent","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[0],"timestamps":[1697407970000]}
{"metric":{"__name__":"cpuSystemPercent","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[16.368672878359],"timestamps":[1697407970000]}
{"metric":{"__name__":"loadAverageOneMinute","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[5.42333984375],"timestamps":[1697407970000]}
{"metric":{"__name__":"cpuUserPercent","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[8.687987912389],"timestamps":[1697407970000]}
{"metric":{"__name__":"cpuIdlePercent","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[74.9433392092],"timestamps":[1697407970000]}
{"metric":{"__name__":"loadAverageFifteenMinute","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[3.58203125],"timestamps":[1697407970000]}
{"metric":{"__name__":"cpuPercent","entityKey":"macbook-pro.local","eventType":"SystemSample"},"values":[25.056660790748],"timestamps":[1697407970000]}
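The mapping rules listed at the start of this section can be sketched in Python as follows. This is an illustration rather than the VictoriaMetrics implementation: the function name newrelic_event_to_samples is hypothetical, boolean fields are skipped as an assumption, and the detection of seconds versus milliseconds in the timestamp field is not modelled.

```python
import time


def newrelic_event_to_samples(event, now_ms=None):
    """Map one NewRelic event to raw samples in the export JSON shape."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # Rule 3: use the timestamp field if present, else the current time.
    # Unit detection (seconds vs milliseconds) is left out of this sketch.
    timestamp = event.get("timestamp", now_ms)
    # Rule 2: eventType and the other string fields become labels.
    labels = {k: v for k, v in event.items() if isinstance(v, str)}

    samples = []
    for key, value in event.items():
        if key == "timestamp" or isinstance(value, bool):
            continue
        if isinstance(value, (int, float)):
            # Rule 1: every numeric field becomes a raw sample.
            samples.append({
                "metric": {"__name__": key, **labels},
                "values": [value],
                "timestamps": [timestamp],
            })
    return samples


event = {"eventType": "SystemSample", "entityKey": "macbook-pro.local",
         "cpuStealPercent": 0, "loadAverageFiveMinute": 4.099609375}
for sample in newrelic_event_to_samples(event, now_ms=1697407970000):
    print(sample)
```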

Prometheus querying API enhancements

VictoriaMetrics accepts optional extra_label=<label_name>=<label_value> query arg, which can be used for enforcing additional label filters for queries. For example, /api/v1/query_range?extra_label=user_id=123&extra_label=group_id=456&query=<query> would automatically add {user_id="123",group_id="456"} label filters to the given <query>. This functionality can be used for limiting the scope of time series visible to the given tenant. It is expected that the extra_label query args are automatically set by auth proxy sitting in front of VictoriaMetrics. See vmauth and vmgateway as examples of such proxies.

VictoriaMetrics accepts optional extra_filters[]=series_selector query arg, which can be used for enforcing arbitrary label filters for queries. For example, /api/v1/query_range?extra_filters[]={env=~"prod|staging",user="xyz"}&query=<query> would automatically add {env=~"prod|staging",user="xyz"} label filters to the given <query>. This functionality can be used for limiting the scope of time series visible to the given tenant. It is expected that the extra_filters[] query args are automatically set by auth proxy sitting in front of VictoriaMetrics. See vmauth and vmgateway as examples of such proxies.
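An auth proxy that sets these query args could assemble the request URL roughly as in the following Python sketch. The function name scoped_query_url and the example values are hypothetical; only the extra_label and extra_filters[] arg names come from the text above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def scoped_query_url(base, query, extra_labels=None, extra_filters=None, **params):
    """Build a /api/v1/query_range URL with enforced label filters."""
    args = [("query", query)]
    args += [("extra_label", f"{k}={v}") for k, v in (extra_labels or {}).items()]
    args += [("extra_filters[]", f) for f in (extra_filters or [])]
    args += list(params.items())
    # urlencode percent-encodes braces, quotes and other reserved characters.
    return f"{base}/api/v1/query_range?{urlencode(args)}"


url = scoped_query_url(
    "http://victoriametrics:8428",
    "up",
    extra_labels={"user_id": "123", "group_id": "456"},
    extra_filters=['{env=~"prod|staging",user="xyz"}'],
    start="1654543486",
    step="60",
)
print(url)
# Decoding the query string shows the args as the server would see them.
print(parse_qs(urlsplit(url).query))
```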

VictoriaMetrics accepts multiple formats for time, start and end query args - see these docs.

VictoriaMetrics accepts round_digits query arg for /api/v1/query and /api/v1/query_range handlers. It can be used for rounding response values to the given number of digits after the decimal point. For example, /api/v1/query?query=avg_over_time(temperature[1h])&round_digits=2 would round response values to up to two digits after the decimal point.

VictoriaMetrics accepts limit query arg for /api/v1/labels and /api/v1/label/<labelName>/values handlers for limiting the number of returned entries. For example, the query to /api/v1/labels?limit=5 returns a sample of up to 5 unique labels, while ignoring the rest of labels. If the provided limit value exceeds the corresponding -search.maxTagKeys / -search.maxTagValues command-line flag values, then limits specified in the command-line flags are used.

By default, VictoriaMetrics returns time series for the last day starting at 00:00 UTC from /api/v1/series, /api/v1/labels and /api/v1/label/<labelName>/values, while the Prometheus API defaults to all time. Explicitly set start and end to select the desired time range. VictoriaMetrics rounds the specified start..end time range to day granularity because of performance optimization concerns. If you need the exact set of label names and label values on the given time range, then send queries to /api/v1/query or to /api/v1/query_range.

VictoriaMetrics accepts limit query arg at /api/v1/series for limiting the number of returned entries. For example, the query to /api/v1/series?limit=5 returns a sample of up to 5 series, while ignoring the rest of series. If the provided limit value exceeds the corresponding -search.maxSeries command-line flag values, then limits specified in the command-line flags are used.

Additionally, VictoriaMetrics provides the following handlers:

  • /vmui - Basic Web UI. See these docs.
  • /api/v1/series/count - returns the total number of time series in the database. Some notes:
    • the handler scans the entire inverted index, so it can be slow if the database contains tens of millions of time series;
    • the handler may count deleted time series in addition to normal time series due to internal implementation restrictions;
  • /api/v1/status/active_queries - returns the list of currently running queries. This list is also available at active queries page at VMUI.
  • /api/v1/status/top_queries - returns the following query lists:
    • the most frequently executed queries - topByCount
    • queries with the biggest average execution duration - topByAvgDuration
    • queries that took the most time for execution - topBySumDuration

The number of returned queries can be limited via topN query arg. Old queries can be filtered out with maxLifetime query arg. For example, request to /api/v1/status/top_queries?topN=5&maxLifetime=30s would return up to 5 queries per list, which were executed during the last 30 seconds. VictoriaMetrics tracks the last -search.queryStats.lastQueriesCount queries with durations at least -search.queryStats.minQueryDuration.
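To make the difference between the three lists concrete, the following Python sketch ranks a hypothetical log of query executions by count, by average duration and by summed duration. It illustrates the semantics of the list names only; it does not reproduce how VictoriaMetrics tracks queries, and the function name top_queries and the sample data are made up for the example.

```python
from collections import defaultdict


def top_queries(executions, top_n=5):
    """Rank queries from (query, duration_seconds) records in three ways."""
    durations = defaultdict(list)
    for query, seconds in executions:
        durations[query].append(seconds)

    def rank(key):
        return sorted(durations, key=key, reverse=True)[:top_n]

    return {
        "topByCount": rank(lambda q: len(durations[q])),
        "topByAvgDuration": rank(lambda q: sum(durations[q]) / len(durations[q])),
        "topBySumDuration": rank(lambda q: sum(durations[q])),
    }


executions = [
    ("up", 0.1), ("up", 0.1), ("up", 0.1),
    ("rate(x[5m])", 2.0),
    ("sum(y)", 1.5), ("sum(y)", 1.5),
]
print(top_queries(executions, top_n=1))
```

With this data each list has a different leader: up runs most often, rate(x[5m]) is the slowest on average, and sum(y) consumes the most time in total.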

See also top queries page at VMUI.

How to build from sources

We recommend using either binary releases or docker images instead of building VictoriaMetrics from sources. Building from sources is reasonable when developing additional features specific to your needs or when testing bugfixes.

Production build

  1. Install docker.
  2. Run make victoria-metrics-prod from the root folder of the repository. It builds victoria-metrics-prod binary and puts it into the bin folder.

ARM build

ARM build may run on Raspberry Pi or on energy-efficient ARM servers.

Production ARM build

  1. Install docker.
  2. Run make victoria-metrics-linux-arm-prod or make victoria-metrics-linux-arm64-prod from the root folder of the repository. It builds victoria-metrics-linux-arm-prod or victoria-metrics-linux-arm64-prod binary respectively and puts it into the bin folder.

Building docker images

Run make package-victoria-metrics. It builds victoriametrics/victoria-metrics:<PKG_TAG> docker image locally. <PKG_TAG> is auto-generated image tag, which depends on source code in the repository. The <PKG_TAG> may be manually set via PKG_TAG=foobar make package-victoria-metrics.

The base docker image is alpine but it is possible to use any other base image by setting it via <ROOT_IMAGE> environment variable. For example, the following command builds the image on top of scratch image:

ROOT_IMAGE=scratch make package-victoria-metrics

Start with docker-compose

Docker-compose helps to spin up VictoriaMetrics, vmagent and Grafana with one command. More details may be found here.

Setting up service

Read instructions on how to set up VictoriaMetrics as a service for your OS. A snap package is available for Ubuntu.

How to work with snapshots

Send a request to http://<victoriametrics-addr>:8428/snapshot/create endpoint in order to create an instant snapshot. The page returns the following JSON response on successful creation of snapshot:

{"status":"ok","snapshot":"<snapshot-name>"}

Snapshots are created under <-storageDataPath>/snapshots directory, where <-storageDataPath> is the corresponding command-line flag value. Snapshots can be archived to backup storage at any time with vmbackup.

Snapshots consist of a mix of hard-links and soft-links to various files and directories inside -storageDataPath. See this article for more details. This adds some restrictions on what can be done with the contents of <-storageDataPath>/snapshots directory:

  • Do not delete subdirectories inside <-storageDataPath>/snapshots with rm or similar commands, since this will leave some snapshot data undeleted. Prefer using the /snapshot/delete API for deleting snapshot. See below for more details about this API.
  • Do not copy subdirectories inside <-storageDataPath>/snapshots with cp, rsync or similar commands, since there are high chances that these commands won’t copy some data stored in the snapshot. Prefer using vmbackup for making copies of snapshot data.

See also snapshot troubleshooting.

The http://<victoriametrics-addr>:8428/snapshot/list endpoint returns the list of available snapshots.

Send a query to http://<victoriametrics-addr>:8428/snapshot/delete?snapshot=<snapshot-name> in order to delete the snapshot with <snapshot-name> name.

Navigate to http://<victoriametrics-addr>:8428/snapshot/delete_all in order to delete all the snapshots.
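The snapshot endpoints above can be scripted, for example before running a backup. The following Python sketch uses only the standard library; the helper names are hypothetical, and the shape of an unsuccessful response is an assumption (anything other than "status":"ok" is treated as a failure).

```python
import json
import urllib.request


def snapshot_url(addr, action, snapshot=None):
    """Build the URL for /snapshot/<action>, e.g. create, list, delete."""
    url = f"http://{addr}/snapshot/{action}"
    return f"{url}?snapshot={snapshot}" if snapshot else url


def parse_create_response(body):
    """Return the snapshot name from a /snapshot/create JSON response."""
    reply = json.loads(body)
    if reply.get("status") != "ok":
        # Assumption: any other status means the snapshot was not created.
        raise RuntimeError(f"snapshot creation failed: {reply}")
    return reply["snapshot"]


def create_snapshot(addr):
    """Create a snapshot and return its name (needs a running instance)."""
    with urllib.request.urlopen(snapshot_url(addr, "create")) as resp:
        return parse_create_response(resp.read())


print(snapshot_url("localhost:8428", "delete", "example-snapshot"))
print(parse_create_response('{"status":"ok","snapshot":"example-snapshot"}'))
```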

How to restore from a snapshot

  1. Stop VictoriaMetrics with kill -INT.
  2. Restore snapshot contents from backup with vmrestore to the directory pointed by -storageDataPath.
  3. Start VictoriaMetrics.

Snapshot troubleshooting

A snapshot doesn’t occupy disk space just after its creation thanks to the used approach. Old snapshots may start occupying additional disk space if they refer to old parts, which were already deleted during background merge. That’s why it is recommended to delete old snapshots after they are no longer needed in order to free up the disk space they use. This can be done either manually or automatically if the -snapshotsMaxAge command-line flag is set. Make sure that the backup process has enough time to complete when setting the -snapshotsMaxAge command-line flag.

VictoriaMetrics exposes the current number of available snapshots via vm_snapshots metric at /metrics page.

Forced merge

VictoriaMetrics performs data compactions in background in order to keep good performance characteristics when accepting new data. These compactions (merges) are performed independently on per-month partitions. This means that compactions are stopped for per-month partitions if no new data is ingested into these partitions. Sometimes it is necessary to trigger compactions for old partitions. For instance, in order to free up disk space occupied by deleted time series. In this case forced compaction may be initiated on the specified per-month partition by sending request to /internal/force_merge?partition_prefix=YYYY_MM, where YYYY_MM is per-month partition name. For example, http://victoriametrics:8428/internal/force_merge?partition_prefix=2020_08 would initiate forced merge for August 2020 partition. The call to /internal/force_merge returns immediately, while the corresponding forced merge continues running in background.

Forced merges may require additional CPU, disk IO and storage space resources. It is unnecessary to run forced merge under normal conditions, since VictoriaMetrics automatically performs optimal merges in background when new data is ingested into it.
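When several old partitions need a forced merge, the YYYY_MM partition names can be generated instead of typed by hand. The Python sketch below is illustrative: the function names are hypothetical, and it only prints the URLs rather than sending requests.

```python
from datetime import date


def partition_names(first, last):
    """List per-month partition names (YYYY_MM) from first to last inclusive."""
    names = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        names.append(f"{year:04d}_{month:02d}")
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return names


def force_merge_url(addr, partition):
    """Build the forced merge URL for a single per-month partition."""
    return f"http://{addr}/internal/force_merge?partition_prefix={partition}"


for name in partition_names(date(2020, 11, 15), date(2021, 2, 1)):
    print(force_merge_url("victoriametrics:8428", name))
```

Since forced merges need extra CPU, disk IO and storage space, it may be reasonable to trigger them one partition at a time.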

How to export time series

VictoriaMetrics provides the following handlers for exporting data:

  • /api/v1/export for exporting data in JSON line format. See these docs for details.
  • /api/v1/export/csv for exporting data in CSV. See these docs for details.
  • /api/v1/export/native for exporting data in native binary format. This is the most efficient format for data export. See these docs for details.
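Data exported in JSON line format can be processed line by line. The following Python sketch flattens export lines of the shape shown elsewhere in this document into individual samples; the function name iter_samples is hypothetical.

```python
import json


def iter_samples(jsonl_text):
    """Yield (metric_name, labels, value, timestamp_ms) for every sample."""
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        record = json.loads(line)
        labels = record["metric"]
        # values and timestamps are parallel arrays of equal length.
        for value, ts in zip(record["values"], record["timestamps"]):
            yield labels["__name__"], labels, value, ts


exported = (
    '{"metric":{"__name__":"foo.bar.baz","tag1":"value1","tag2":"value2"},'
    '"values":[123],"timestamps":[1560277406000]}\n'
)
for name, labels, value, ts in iter_samples(exported):
    print(name, value, ts)
```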

How to export data in native format

Send a request to http://<victoriametrics-addr>:8428/api/v1/export/native?match[]=<timeseries_selector_for_export>, where <timeseries_selector_for_export> may contain any time series selector for metrics to export. Use {__name__=~".*"} selector for fetching all the time series.

On large databases you may hit the limit on the number of time series that can be exported. In this case you need to adjust the -search.maxExportSeries command-line flag:

# count unique time series in database
wget -O- -q 'http://your_victoriametrics_instance:8428/api/v1/series/count' | jq '.data[0]'

# relaunch victoriametrics with search.maxExportSeries more than value from previous command

Optional start and end args may be added to the request in order to limit the time frame for the exported data. See allowed formats for these args.

For example:

curl http://<victoriametrics-addr>:8428/api/v1/export/native -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/api/v1/export/native -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'

The exported data can be imported to VictoriaMetrics via /api/v1/import/native. The native export format may change in incompatible way between VictoriaMetrics releases, so the data exported from the release X can fail to be imported into VictoriaMetrics release Y.

The deduplication isn’t applied for the data exported in native format. It is expected that the de-duplication is performed during data import.

How to import time series data

VictoriaMetrics can discover and scrape metrics from Prometheus-compatible targets (aka “pull” protocol) - see these docs. Additionally, VictoriaMetrics can accept metrics via popular data ingestion protocols (aka “push” protocols), including the DataDog, InfluxDB, Graphite, OpenTSDB and NewRelic protocols and the import formats covered in this document.

Please note that most of the ingestion APIs (except the Prometheus remote_write API) are optimized for performance and process data in a streaming fashion. This means that a client can transfer an unlimited amount of data through an open connection. Because of this, import APIs may not return parsing errors to the client, since the data stream is expected not to be interrupted. Instead, look for parsing errors on the server side (VictoriaMetrics single-node or vminsert) or check for changes in the vm_rows_invalid_total metric exported by the server side.

How to import data in JSON line format

VictoriaMetrics accepts metrics data in JSON line format at /api/v1/import endpoint. See these docs for details on this format.

Example for importing data obtained via /api/v1/export:

# Export the data from <source-victoriametrics>:
curl http://source-victoriametrics:8428/api/v1/export -d 'match={__name__!=""}' > exported_data.jsonl

# Import the data to <destination-victoriametrics>:
curl -X POST http://destination-victoriametrics:8428/api/v1/import -T exported_data.jsonl

Pass Content-Encoding: gzip HTTP request header to /api/v1/import for importing gzipped data:

# Export gzipped data from <source-victoriametrics>:
curl -H 'Accept-Encoding: gzip' http://source-victoriametrics:8428/api/v1/export -d 'match={__name__!=""}' > exported_data.jsonl.gz

# Import gzipped data to <destination-victoriametrics>:
curl -X POST -H 'Content-Encoding: gzip' http://destination-victoriametrics:8428/api/v1/import -T exported_data.jsonl.gz

Extra labels may be added to all the imported time series by passing extra_label=name=value query args. For example, /api/v1/import?extra_label=foo=bar would add "foo":"bar" label to all the imported time series.

Note that it may be necessary to flush the response cache after importing historical data. See these docs for details.

VictoriaMetrics parses input JSON lines one by one. It loads the whole JSON line into memory, parses it and then saves the parsed samples into persistent storage. This means that VictoriaMetrics can occupy large amounts of RAM when importing very long JSON lines. The solution is to split such lines into shorter ones. It is OK if samples for a single time series are split among multiple JSON lines. JSON line length can be limited via the max_rows_per_line query arg when exporting via /api/v1/export.

The maximum JSON line length, which can be parsed by VictoriaMetrics, is limited by -import.maxLineLen command-line flag value.
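Splitting an overly long JSON line can be sketched as follows. This is an illustrative helper, not part of VictoriaMetrics. It assumes each line has the metric, values and timestamps fields shown in the export output elsewhere in this document.

```python
import json


def split_json_line(line: str, max_samples: int) -> list[str]:
    """Split one JSON line into lines holding at most max_samples samples."""
    if max_samples <= 0:
        raise ValueError("max_samples must be positive")
    row = json.loads(line)
    values, timestamps = row["values"], row["timestamps"]
    if len(values) != len(timestamps):
        raise ValueError("values and timestamps must have equal length")
    chunks = []
    for i in range(0, len(values), max_samples):
        # Every chunk repeats the same labels, so all chunks
        # still describe the same time series.
        chunk = {
            "metric": row["metric"],
            "values": values[i:i + max_samples],
            "timestamps": timestamps[i:i + max_samples],
        }
        chunks.append(json.dumps(chunk, separators=(",", ":")))
    return chunks
```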

How to import data in native format

The specification of VictoriaMetrics’ native format may still change and is not formally documented yet, so we currently do not recommend that external clients pack their own metrics into native format files.

However, if you have a native format file obtained via /api/v1/export/native, this is the most efficient protocol for importing data:

# Export the data from <source-victoriametrics>:
curl http://source-victoriametrics:8428/api/v1/export/native -d 'match={__name__!=""}' > exported_data.bin

# Import the data to <destination-victoriametrics>:
curl -X POST http://destination-victoriametrics:8428/api/v1/import/native -T exported_data.bin

Extra labels may be added to all the imported time series by passing extra_label=name=value query args. For example, /api/v1/import/native?extra_label=foo=bar would add "foo":"bar" label to all the imported time series.

Note that it may be necessary to flush the response cache after importing historical data. See these docs for details.

How to import data in Prometheus exposition format

VictoriaMetrics accepts data in Prometheus exposition format, in OpenMetrics format and in Pushgateway format via /api/v1/import/prometheus path.

For example, the following command imports a single line in Prometheus exposition format into VictoriaMetrics:

curl -d 'foo{bar="baz"} 123' -X POST 'http://localhost:8428/api/v1/import/prometheus'

The following command may be used for verifying the imported data:

curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"foo"}'

It should return something like the following:

{"metric":{"__name__":"foo","bar":"baz"},"values":[123],"timestamps":[1594370496905]}
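An exported line like the one above can be decoded into (time, value) pairs with a short sketch such as the following. The function name is an assumption, and the timestamps are assumed to be Unix milliseconds.

```python
import json
from datetime import datetime, timedelta, timezone


def parse_export_line(line: str):
    """Return (labels, [(utc_datetime, value), ...]) for one exported JSON line."""
    row = json.loads(line)
    samples = []
    for ts_ms, value in zip(row["timestamps"], row["values"]):
        # Integer arithmetic avoids float rounding of the millisecond part.
        seconds, millis = divmod(ts_ms, 1000)
        when = datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)
        samples.append((when, value))
    return row["metric"], samples
```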

The following command imports a single metric via Pushgateway format with {job="my_app",instance="host123"} labels:

curl -d 'metric{label="abc"} 123' -X POST 'http://localhost:8428/api/v1/import/prometheus/metrics/job/my_app/instance/host123'

Pass the Content-Encoding: gzip HTTP request header to /api/v1/import/prometheus for importing gzipped data.

# Add {cluster="dev"} label.
- target_label: cluster
  replacement: dev

# Drop the metric (or scrape target) with `{__meta_kubernetes_pod_container_init="true"}` label.
- action: drop
  source_labels: [__meta_kubernetes_pod_container_init]
  regex: true

VictoriaMetrics provides additional relabeling features such as Graphite-style relabeling. See these docs for more details.

The relabeling can be debugged at http://victoriametrics:8428/metric-relabel-debug page or at our public playground. See these docs for more details.

Federation

VictoriaMetrics exports Prometheus-compatible federation data at http://<victoriametrics-addr>:8428/federate?match[]=<timeseries_selector_for_federation>.

Optional start and end args may be added to the request in order to scrape the last point for each selected time series on the [start ... end] interval. See allowed formats for these args.

For example:

curl http://<victoriametrics-addr>:8428/federate -d 'match[]=<timeseries_selector_for_federation>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/federate -d 'match[]=<timeseries_selector_for_federation>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'

By default, the last point on the interval [now - max_lookback ... now] is scraped for each time series. The default value for max_lookback is 5m (5 minutes), but it can be overridden with max_lookback query arg. For instance, /federate?match[]=up&max_lookback=1h would return last points on the [now - 1h ... now] interval. This may be useful for time series federation with scrape intervals exceeding 5m.
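A federation URL with several selectors and an optional max_lookback can be assembled as in the sketch below. The helper name and host are illustrative assumptions. urlencode percent-encodes the brackets in match[], which is equivalent to the literal form once decoded by the server.

```python
from urllib.parse import urlencode


def federate_url(base_url, selectors, max_lookback=None):
    """Build a /federate URL with one match[] arg per selector."""
    params = [("match[]", selector) for selector in selectors]
    if max_lookback is not None:
        params.append(("max_lookback", max_lookback))
    return f"{base_url}/federate?{urlencode(params)}"
```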

High availability

The general approach for achieving high availability is the following:

  • Run two identically configured VictoriaMetrics instances in distinct datacenters (availability zones).
  • Store the collected data simultaneously into these instances via vmagent or Prometheus.
  • Query the first VictoriaMetrics instance and fail over to the second instance when the first instance becomes temporarily unavailable. This can be done via vmauth according to these docs.

Such a setup guarantees that the collected data isn’t lost when one of the VictoriaMetrics instances becomes unavailable. The collected data continues to be written to the available VictoriaMetrics instance, so it should be available for querying. Both vmagent and Prometheus buffer the collected data locally if they cannot send it to the configured remote storage, so the collected data will be written to the temporarily unavailable VictoriaMetrics instance after it becomes available again.
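The query-side failover idea can be illustrated with a small client-side sketch. This is not how vmauth is implemented. It only shows the "try the first instance, then fall back to the second" logic, with the HTTP call injected as a fetch function so that any client library can be plugged in. All names are assumptions.

```python
from typing import Callable, Sequence


def query_with_failover(urls: Sequence[str], fetch: Callable[[str], str]) -> str:
    """Return the first successful response, trying instances in order."""
    last_error = None
    for url in urls:
        try:
            return fetch(url)
        except OSError as err:
            # Connection errors (including urllib.error.URLError)
            # derive from OSError; remember the error, try the next one.
            last_error = err
    raise RuntimeError("all instances are unavailable") from last_error
```

With the standard library, fetch could be a function that opens the URL with urllib.request.urlopen and returns the decoded body.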

If you use vmagent for storing the data into VictoriaMetrics, then it can be configured with multiple -remoteWrite.url command-line flags, where every flag points to the VictoriaMetrics instance in a particular availability zone, in order to replicate the collected data to all the VictoriaMetrics instances. For example, the following command instructs vmagent to replicate data to vm-az1 and vm-az2 instances of VictoriaMetrics:

/path/to/vmagent \
  -remoteWrite.url=http://<vm-az1>:8428/api/v1/write \
  -remoteWrite.url=http://<vm-az2>:8428/api/v1/write

If you use Prometheus for collecting and writing the data to VictoriaMetrics, then the following remote_write section in Prometheus config can be used for replicating the collected data to vm-az1 and vm-az2 VictoriaMetrics instances:

remote_write:
  - url: http://<vm-az1>:8428/api/v1/write
  - url: http://<vm-az2>:8428/api/v1/write

It is recommended to use vmagent instead of Prometheus for highly loaded setups, since it uses lower amounts of RAM, CPU and network bandwidth than Prometheus.

If you use identically configured vmagent instances for collecting the same data and sending it to VictoriaMetrics, then do not forget to enable deduplication on the VictoriaMetrics side.

Storage

VictoriaMetrics buffers the ingested data in memory for up to a second. Then the buffered data is written to in-memory parts, which can be searched during queries. The in-memory parts are periodically persisted to disk, so they could survive unclean shutdown such as out of memory crash, hardware power loss or SIGKILL signal. The interval for flushing the in-memory data to disk can be configured with the -inmemoryDataFlushInterval command-line flag (note that too short flush interval may significantly increase disk IO).

In-memory parts are persisted to disk into part directories under the <-storageDataPath>/data/small/YYYY_MM/ folder, where YYYY_MM is the month partition for the stored data. For example, 2022_11 is the partition for parts with raw samples from November 2022. Each partition directory contains parts.json file with the actual list of parts in the partition.
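The YYYY_MM partition name for a given sample can be derived as in the sketch below. It assumes that timestamps are Unix milliseconds and that month boundaries are taken in UTC. The helper itself is hypothetical.

```python
from datetime import datetime, timezone


def partition_name(timestamp_ms: int) -> str:
    """Return the YYYY_MM partition name for a millisecond timestamp."""
    # Assumption: partitions follow UTC calendar months.
    dt = datetime.fromtimestamp(timestamp_ms // 1000, tz=timezone.utc)
    return f"{dt.year:04d}_{dt.month:02d}"
```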

Every part directory contains metadata.json file with the following fields:

  • RowsCount - the number of raw samples stored in the part
  • BlocksCount - the number of blocks stored in the part (see details about blocks below)
  • MinTimestamp and MaxTimestamp - minimum and maximum timestamps across raw samples stored in the part
  • MinDedupInterval - the deduplication interval applied to the given part.

Each part consists of blocks sorted by internal time series id (aka TSID). Each block contains up to 8K raw samples, which belong to a single time series. Raw samples in each block are sorted by timestamp. Blocks for the same time series are sorted by the timestamp of the first sample. Timestamps and values for all the blocks are stored in compressed form in separate files under part directory - timestamps.bin and values.bin.

The part directory also contains index.bin and metaindex.bin files. These files contain an index for fast lookup of blocks that belong to the given TSID and cover the given time range.

Parts are periodically merged into bigger parts in background. The background merge provides the following benefits:

  • keeping the number of data files under control, so they don’t exceed limits on open files
  • improved data compression, since bigger parts are usually compressed better than smaller parts
  • improved query speed, since queries over smaller number of parts are executed faster
  • various background maintenance tasks such as de-duplication, downsampling and freeing up disk space for the deleted time series are performed during the merge

Newly added parts either successfully appear in the storage or fail to appear. The newly added part is atomically registered in the parts.json file under the corresponding partition after it is fully written and fsynced to the storage. Thanks to this algorithm, storage never contains partially created parts, even if hardware power off occurs in the middle of writing the part to disk - such incompletely written parts are automatically deleted on the next VictoriaMetrics start.
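The "fully written, fsynced, then atomically registered" idea can be sketched with the common write-to-temp-file-and-rename pattern. This is an illustration of the general technique, not VictoriaMetrics code. The flat JSON list used here is a made-up layout and not the real parts.json schema.

```python
import json
import os
import tempfile


def register_part(partition_dir: str, part_name: str) -> None:
    """Atomically append part_name to parts.json inside partition_dir."""
    parts_path = os.path.join(partition_dir, "parts.json")
    try:
        with open(parts_path) as f:
            parts = json.load(f)
    except FileNotFoundError:
        parts = []
    parts.append(part_name)
    # Write the new list to a temporary file in the same directory,
    # flush it to disk, then atomically swap it into place.
    fd, tmp_path = tempfile.mkstemp(dir=partition_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(parts, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, parts_path)
```

Readers see either the old list or the new list, never a half-written file. A production implementation would typically also fsync the directory after the rename.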

The same applies to the merge process: parts are either fully merged into a new part or fail to merge, leaving the source parts untouched. However, due to hardware issues, data on disk may be corrupted regardless of the VictoriaMetrics process. VictoriaMetrics can detect corruption during decompressing, decoding or sanity checking of the data blocks, but it cannot fix the corrupted data. Data parts that fail to load on startup need to be deleted or restored from backups. This is why it is recommended to perform regular backups.

VictoriaMetrics doesn’t use checksums for stored data blocks. See why here.

VictoriaMetrics doesn’t merge parts if their total size exceeds the free disk space. This prevents potential out-of-disk-space errors during merge. Under free disk space shortage, the number of parts may increase significantly over time. This increases overhead during data querying, since VictoriaMetrics needs to read data from a bigger number of parts per request. That’s why it is recommended to have at least 20% of free disk space under the directory pointed to by the -storageDataPath command-line flag.
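The merge precondition and the 20% recommendation can be expressed as two small checks. This is a sketch of the rule as described here, not the actual implementation. Function names are assumptions.

```python
import shutil


def can_merge(part_sizes, free_bytes):
    """A merge is allowed only if the parts' total size fits into free space."""
    return sum(part_sizes) <= free_bytes


def free_space_fraction(path):
    """Return the fraction of free disk space for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total
```

For example, free_space_fraction("/path/to/storage") < 0.2 could be used as a warning condition in an operational script. The path is a placeholder.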

Information about merging process is available in the dashboard for single-node VictoriaMetrics and the dashboard for VictoriaMetrics cluster. See more details in monitoring docs.

See this article for more details.

See also how to work with snapshots.

Alerting

It is recommended to use vmalert for alerting.

Additionally, alerting can be set up with the following tools:

Tuning

  • No need for tuning VictoriaMetrics - it uses reasonable defaults for command-line flags, which are automatically adjusted for the available CPU and RAM resources.
  • No need for tuning the operating system - VictoriaMetrics is optimized for default OS settings. The only recommended change is increasing the limit on the number of open files in the OS. This recommendation is not specific to VictoriaMetrics. It applies to any service that handles many HTTP connections and stores data on disk.
  • VictoriaMetrics is a write-heavy application and its performance depends on disk performance. So be careful with other applications or utilities (like fstrim) which could exhaust disk resources.
  • The recommended filesystem is ext4. The recommended persistent storage is persistent HDD-based disk on GCP, since it is protected from hardware failures via internal replication and can be resized on the fly. If you plan to store more than 1TB of data on an ext4 partition or plan to extend it to more than 16TB, then it is recommended to pass the following options to mkfs.ext4:
mkfs.ext4 ... -O 64bit,huge_file,extent -T huge

Monitoring

VictoriaMetrics exports internal metrics in Prometheus exposition format at /metrics page. These metrics can be scraped via vmagent or any other Prometheus-compatible scraper.

If you use Google Cloud Managed Prometheus for scraping metrics from VictoriaMetrics components, then pass the -metrics.exposeMetadata command-line flag to them, so they add TYPE and HELP comments for each exposed metric at the /metrics page. See these docs for details.

Alternatively, single-node VictoriaMetrics can self-scrape the metrics when -selfScrapeInterval command-line flag is set to duration greater than 0. For example, -selfScrapeInterval=10s would enable self-scraping of /metrics page with 10 seconds interval.

Please note: never use a load balancer address for scraping metrics. All the monitored components should be scraped directly by their address.

Official Grafana dashboards are available for single-node and clustered VictoriaMetrics. See an alternative dashboard for clustered VictoriaMetrics created by the community.

Graphs on the dashboards contain useful hints - hover over the i icon in the top left corner of each graph to read it.

We recommend setting up alerts via vmalert or via Prometheus.

VictoriaMetrics exposes currently running queries and their execution times at active queries page.

VictoriaMetrics exposes queries, which take the most time to execute, at top queries page.

See also VictoriaMetrics Monitoring and troubleshooting docs.

TSDB stats

VictoriaMetrics returns TSDB stats at /api/v1/status/tsdb page in the way similar to Prometheus - see these Prometheus docs. VictoriaMetrics accepts the following optional query args at /api/v1/status/tsdb page:

  • topN=N where N is the number of top entries to return in the response. By default top 10 entries are returned.
  • date=YYYY-MM-DD where YYYY-MM-DD is the date for collecting the stats. By default the stats are collected for the current day. Pass date=1970-01-01 in order to collect global stats across all the days.
  • focusLabel=LABEL_NAME returns label values with the highest number of time series for the given LABEL_NAME in the seriesCountByFocusLabelValue list.
  • match[]=SELECTOR where SELECTOR is an arbitrary time series selector for series to take into account during stats calculation. By default all the series are taken into account.
  • extra_label=LABEL=VALUE. See these docs for more details.

In cluster version of VictoriaMetrics each vmstorage tracks the stored time series individually. vmselect requests stats via /api/v1/status/tsdb API from each vmstorage node and merges the results by summing per-series stats. This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes due to replication or rerouting.
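The inflation effect can be seen in a toy model of the summing merge. The sketch below is illustrative only. The metric name and numbers are made up, and real responses contain more fields than a plain name-to-count mapping.

```python
from collections import Counter


def merge_stats(per_node_stats):
    """Merge per-node {name: series_count} maps by summing the counts."""
    merged = Counter()
    for stats in per_node_stats:
        merged.update(stats)  # Counter.update adds counts instead of replacing them
    return dict(merged)


# Two hypothetical vmstorage nodes holding the same 100 series due to replication:
node_a = {"http_requests_total": 100}
node_b = {"http_requests_total": 100}
merged = merge_stats([node_a, node_b])
```

Here the merged count is 200 even though only 100 distinct series exist, which is the inflation described above.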

VictoriaMetrics provides a UI on top of /api/v1/status/tsdb - see cardinality explorer docs.

Cardinality limiter

By default VictoriaMetrics doesn’t limit the number of stored time series. The limit can be enforced by setting the following command-line flags:

  • -storage.maxHourlySeries - limits the number of time series that can be added during the last hour. Useful for limiting the number of active time series.
  • -storage.maxDailySeries - limits the number of time series that can be added during the last day. Useful for limiting daily churn rate.

Both limits can be set simultaneously. If any of these limits is reached, then incoming samples for new time series are dropped. A sample of dropped series is put in the log with WARNING level.

The exceeded limits can be monitored with the following metrics:

  • vm_hourly_series_limit_rows_dropped_total - the number of metrics dropped due to exceeded hourly limit on the number of unique time series.

  • vm_hourly_series_limit_max_series - the hourly series limit set via -storage.maxHourlySeries command-line flag.

  • vm_hourly_series_limit_current_series - the current number of unique series during the last hour. The following query can be useful for alerting when the number of unique series during the last hour exceeds 90% of the -storage.maxHourlySeries:

  vm_hourly_series_limit_current_series / vm_hourly_series_limit_max_series > 0.9

  • vm_daily_series_limit_rows_dropped_total - the number of metrics dropped due to exceeded daily limit on the number of unique time series.

  • vm_daily_series_limit_max_series - the daily series limit set via -storage.maxDailySeries command-line flag.

  • vm_daily_series_limit_current_series - the current number of unique series during the last day. The following query can be useful for alerting when the number of unique series during the last day exceeds 90% of the -storage.maxDailySeries:

  vm_daily_series_limit_current_series / vm_daily_series_limit_max_series > 0.9

These limits are approximate, so VictoriaMetrics can underflow/overflow the limit by a small percentage (usually less than 1%).
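The limiter's behavior can be modeled with a simplified sketch. Unlike the approximate limiter described above, this toy version tracks series exactly in a set and omits the hourly or daily reset. The class and attribute names are assumptions.

```python
class SeriesLimiter:
    """Toy exact limiter: admits samples for known series, caps new series."""

    def __init__(self, max_series: int):
        self.max_series = max_series
        self.seen = set()
        self.rows_dropped = 0

    def allow(self, series_key: str) -> bool:
        """Return True if a sample for series_key may be stored."""
        if series_key in self.seen:
            return True  # samples for already tracked series always pass
        if len(self.seen) >= self.max_series:
            self.rows_dropped += 1  # new series over the limit: drop the sample
            return False
        self.seen.add(series_key)
        return True

    def usage(self) -> float:
        """Current series count relative to the limit (cf. the 0.9 alert threshold)."""
        return len(self.seen) / self.max_series
```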

See also more advanced cardinality limiter in vmagent and cardinality explorer docs.

Cache removal

VictoriaMetrics uses various internal caches. These caches are stored in the <-storageDataPath>/cache directory during graceful shutdown (e.g. when VictoriaMetrics is stopped by sending a SIGINT signal). The caches are read on the next VictoriaMetrics startup. Sometimes these caches need to be removed before the next startup. This can be done in the following ways:

  • By manually removing the <-storageDataPath>/cache directory when VictoriaMetrics is stopped.
  • By placing reset_cache_on_startup file inside the <-storageDataPath>/cache directory before the restart of VictoriaMetrics. In this case VictoriaMetrics will automatically remove all the caches on the next start. See this issue for details.
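The second option can be scripted, for instance as in the sketch below. The function name is an assumption, and the storage path must be replaced with the actual -storageDataPath value.

```python
from pathlib import Path


def request_cache_reset(storage_data_path: str) -> Path:
    """Create the reset_cache_on_startup marker inside <storage_data_path>/cache."""
    cache_dir = Path(storage_data_path) / "cache"
    # Creating the directory is harmless if it does not exist yet.
    cache_dir.mkdir(parents=True, exist_ok=True)
    marker = cache_dir / "reset_cache_on_startup"
    marker.touch()
    return marker
```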

It is also possible to remove the rollup result cache on startup by passing the -search.resetRollupResultCacheOnStartup command-line flag to VictoriaMetrics.

Rollup result cache

VictoriaMetrics caches query responses by default. This allows increasing performance for repeated queries to /api/v1/query and /api/v1/query_range with the increasing time, start and end query args.

This cache may work incorrectly when ingesting historical data into VictoriaMetrics. See these docs for details.

The rollup cache can be disabled either globally by running VictoriaMetrics with -search.disableCache command-line flag or on a per-query basis by passing nocache=1 query arg to /api/v1/query and /api/v1/query_range.

See also cache removal docs.

Data migration

From VictoriaMetrics

The simplest way to migrate data from one single-node instance (source) to another (destination), or from one vmstorage node to another, is the following:

  1. Stop the VictoriaMetrics (source) with kill -INT.
  2. Copy (via rsync or any other tool) the entire folder specified via -storageDataPath from the source node to an empty folder at the destination node.
  3. Once the copy is done, stop the VictoriaMetrics (destination) with kill -INT and verify that its -storageDataPath points to the folder copied in step 2.
  4. Start the VictoriaMetrics (destination). The copied data should now be available.

Things to consider when copying data:

  1. Data formats between single-node and vmstorage node aren’t compatible and can’t be copied.
  2. Copying data folder means complete replacement of the previous data on destination VictoriaMetrics.
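The copy step can be guarded so that it never mixes with existing data at the destination. The following is a local-filesystem sketch with a hypothetical helper name. For copying between hosts, rsync or a similar tool is still needed, and the source instance must be stopped first as described in the steps above.

```python
import os
import shutil


def copy_storage(src: str, dst: str) -> None:
    """Copy the whole storage folder from src into an empty (or missing) dst."""
    if os.path.isdir(dst) and os.listdir(dst):
        # Refuse to overwrite or mix with existing data.
        raise RuntimeError(f"destination {dst!r} is not empty")
    # dirs_exist_ok (Python 3.8+) allows copying into an existing empty folder.
    shutil.copytree(src, dst, dirs_exist_ok=True)
```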

For more complex scenarios such as single-to-cluster, cluster-to-single, re-sharding or migrating only a fraction of data, see the vmctl docs on migrating data from VictoriaMetrics.

vmalert

A single-node VictoriaMetrics is capable of proxying requests to vmalert when -vmalert.proxyURL flag is set. Use this feature for the following cases:

  • for proxying requests from Grafana Alerting UI;
  • for accessing vmalert’s UI through the single-node VictoriaMetrics web interface.

To access vmalert’s UI through single-node VictoriaMetrics, configure the -vmalert.proxyURL flag and visit the http://<victoriametrics-addr>:8428/vmalert/ link.

Benchmarks

Note that vendors (including VictoriaMetrics) are often biased when doing benchmarks. E.g. they try highlighting the best parts of their product, while highlighting the worst parts of competing products. So we encourage users and all independent third parties to conduct their own benchmarks for the products they are evaluating in production and to publish the results.

As a reference, please see benchmarks conducted by VictoriaMetrics team. Please also see the helm chart for running ingestion benchmarks based on node_exporter metrics.

Profiling

VictoriaMetrics provides handlers for collecting the following Go profiles:

  • Memory profile. It can be collected with the following command (replace 0.0.0.0 with hostname if needed):
curl http://0.0.0.0:8428/debug/pprof/heap > mem.pprof
  • CPU profile. It can be collected with the following command (replace 0.0.0.0 with hostname if needed):
curl http://0.0.0.0:8428/debug/pprof/profile > cpu.pprof

The command for collecting CPU profile waits for 30 seconds before returning.

The collected profiles may be analyzed with go tool pprof. It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.

Integrations

Third-party contributions

Reporting bugs

Report bugs and propose new features here.

Documentation

VictoriaMetrics documentation is available at https://docs.victoriametrics.com/. It is built from *.md files located in docs folder and gets automatically updated once changes are merged to master branch. To update the documentation follow the steps below:

  • Fork the VictoriaMetrics repo and apply changes to the docs.
  • If your changes contain an image then see images in documentation.
  • Once changes are made, execute the command below to finalize and sync the changes:
make docs-sync
  • Create a pull request with proposed changes and wait for it to be merged.

Requirements for changes to docs:

  • Keep backward compatibility of existing links. Avoid changing anchors or deleting pages as they could have been used or posted in other docs, GitHub issues, Stack Overflow answers, etc.
  • Keep docs simple. Try using as simple wording as possible.
  • Keep docs consistent. When modifying existing docs, verify that other places referencing this doc are still relevant.
  • Prefer improving the existing docs instead of adding new ones.
  • Use absolute links.

Images in documentation

Please, keep image size and number of images per single page low. Keep the docs page as lightweight as possible.

If the page needs to have many images, consider using the web-optimized image format webp. When adding a new doc with many images, use the webp format right away. Alternatively, use the Makefile command below to automatically convert the existing images in the docs folder to webp format:

make docs-images-to-webp

Once conversion is done, update the path to images in your docs and verify everything is correct.

Zip contains three folders with different image orientations (main color and inverted version).

Files included in each folder:

  • 2 JPEG Preview files
  • 2 PNG Preview files with transparent background
  • 2 EPS Adobe Illustrator EPS10 files

Font used

  • Lato Black
  • Lato Regular

Color Palette

We kindly ask

  • Do not use any font other than the suggested ones.
  • Keep enough clear space around the logo.
  • Do not change spacing, alignment, or relative locations of the design elements.
  • Do not change the proportions for any of the design elements or the design itself. You may resize as needed but must retain all proportions.
