Description

If you landed here then you must have searched for examples about how you can get Dynatrace to work with Aviatrix.

Worry not…I also searched for those as I’m a pretty lazy person and prefer not to reinvent the wheel if I can avoid it.

I found a VSCode Dynatrace plugin that makes things easier, but no premade examples, so I went out to try ChatGPT / Gemini.

I lost an hour thinking that AI is the holy grail which will save me time, and I was proven wrong.

Not a single line of code from it worked. I just ended up being frustrated.

See below what I came up with and how…

Also, to avoid confusion between VSCode and Dynatrace, I call:

- plugin: the Dynatrace VSCode extension itself (which enables you to write Dynatrace Portal Extensions)
- extension: the Dynatrace extensions that get uploaded to their service and are used for gathering metrics and setting up Dynatrace-Aviatrix monitoring

Before we start

If you get lost (like me) on the Dynatrace website and feel overwhelmed by the volume of info available, then for this specific use case I found a great video.

You’ll need about an hour of your time but I really feel it was worth it:

Dynatrace VSCode & Extensions

Dynatrace elements - high level

Dynatrace has the following elements:

- Custom extension (Python, Prometheus metric config)
  - By default, the premade Prometheus extension that Dynatrace has is integrated with K8S out of the box.
  - The extension defines, among other things, metrics, topology (relations between the elements/metrics it fetches, like Spoke depends on Transit), dashboards, and alerts.
  - The extension needs to be run somewhere. That somewhere is the Monitoring Configuration.
- Monitoring Configuration
  - defines which URL to contact to fetch the metrics defined before
  - defines where to run it from (OneAgent, ActiveGate)

I’ve made a picture here of what my own understanding of all the components is. I’m more the visual type so I need a 360 degree overview to know in which direction to go and what to read for configuring something.

Setting it up

Activate Aviatrix Metrics API

Aviatrix provides a Prometheus/OpenMetrics endpoint on the CoPilot platform, which can be queried over HTTPS using a Bearer token for authentication.

The documentation displayed here is pretty explanatory and clear: https://docs.aviatrix.com/documentation/latest/monitoring-troubleshooting/metrics-api-enable.html?expand=true

You basically have to go in:
CoPilot -> Settings -> Configuration -> Network Insights API -> Enable

and you will get a Token (the “key” from the picture above).

If you want to try it out and see what output you get, you can use curl from any Linux/macOS terminal, or Postman.

curl -k -X GET 'https://<copilot-ip>/metrics-api/v1/gateways' -H 'Authorization: Bearer <TOKEN YOU GOT FROM COPILOT UI>'

It will look like this:

memory_available{gateway="Controller"} 5297311744 1722869678000
memory_cached{gateway="Controller"} 2892021760 1722869678000
memory_buf{gateway="Controller"} 2364575744 1722869678000
memory_swpd{gateway="Controller"} 0 1722869678000
memory_free{gateway="Controller"} 363819008 1722869678000
cpu_idle{gateway="Controller"} 72 1722869678000
cpu_wait{gateway="Controller"} 0 1722869678000
cpu_us{gateway="Controller"} 4 1722869678000
cpu_ks{gateway="Controller"} 23 1722869678000
rx_drop{gateway="Controller", interface="eth0"} 0 1722869678000
tx_drop{gateway="Controller", interface="eth0"} 0 1722869678000
rate_rx_drop{gateway="Controller", interface="eth0"} 0 1722869678000
rate_tx_drop{gateway="Controller", interface="eth0"} 0 1722869678000
rate_pkt_drop{gateway="Controller", interface="eth0"} 0 1722869678000
bandwidth_ingress_limit_exceeded{gateway="Controller", interface="eth0"} 0 1722869678000
rate_sent{gateway="Controller", interface="eth0"} 181953.86666666667 1722869678000
rate_received{gateway="Controller", interface="eth0"} 162616.8 1722869678000
rate_total{gateway="Controller", interface="eth0"} 344570.6666666666 1722869678000
pps_limit_exceeded{gateway="Controller", interface="eth0"} 0 1722869678000
rx_drop{gateway="Controller", interface="docker0"} 0 1722869678000
tx_drop{gateway="Controller", interface="docker0"} 0 1722869678000
rate_rx_drop{gateway="Controller", interface="docker0"} 0 1722869678000
rate_tx_drop{gateway="Controller", interface="docker0"} 0 1722869678000
rate_pkt_drop{gateway="Controller", interface="docker0"} 0 1722869678000
bandwidth_ingress_limit_exceeded{gateway="Controller", interface="docker0"} 0 1722869678000
rate_sent{gateway="Controller", interface="docker0"} 0 1722869678000
rate_received{gateway="Controller", interface="docker0"} 0 1722869678000
rate_total{gateway="Controller", interface="docker0"} 0 1722869678000
pps_limit_exceeded{gateway="Controller", interface="docker0"} 0 1722869678000
rx_drop{gateway="Controller", interface="lo"} 0 1722869678000
tx_drop{gateway="Controller", interface="lo"} 0 1722869678000
rate_rx_drop{gateway="Controller", interface="lo"} 0 1722869678000
rate_tx_drop{gateway="Controller", interface="lo"} 0 1722869678000
rate_pkt_drop{gateway="Controller", interface="lo"} 0 1722869678000
bandwidth_ingress_limit_exceeded{gateway="Controller", interface="lo"} 0 1722869678000
rate_sent{gateway="Controller", interface="lo"} 471936.5333333333 1722869678000
rate_received{gateway="Controller", interface="lo"} 471936.5333333333 1722869678000
rate_total{gateway="Controller", interface="lo"} 943873.0666666667 1722869678000
pps_limit_exceeded{gateway="Controller", interface="lo"} 0 1722869678000

You will see similar entries for each of your gateways (gateway="<your gateway name>") and for each gateway + interface combination, for example {gateway="spoke-123", interface="tun-XXX"}.

Keep these metric names in mind: they are the values you can fetch from CoPilot. They are split into categories based on "dimensions", where the dimensions are:

gateway = "name of the entity the values are returned for"
interface = "together with gateway, the interface on that gateway the stats refer to"
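To get a feel for how metric names and dimensions fit together, here is a minimal Python sketch that splits lines like the ones above into name, dimensions, value and timestamp. It only assumes the line shape shown in the sample output; the function names are my own, and it is not a full Prometheus/OpenMetrics parser (for example, it does not handle a "}" inside a label value).

```python
import re

# One sample line looks like: name{label="value", ...} value timestamp
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>[^}]*)\})?'            # optional {dimensions}
    r'\s+(?P<value>\S+)'                     # sample value
    r'(?:\s+(?P<ts>\d+))?$'                  # optional timestamp in ms
)
LABEL_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_sample(line):
    """Split one line into (metric name, dimensions, value, timestamp)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a metric sample: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    ts = match.group("ts")
    return (match.group("name"), labels,
            float(match.group("value")), int(ts) if ts else None)


def metrics_by_dimension(text):
    """Group metric names by the set of dimensions they are reported with."""
    groups = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        name, labels, _value, _ts = parse_sample(line)
        groups.setdefault(tuple(sorted(labels)), set()).add(name)
    return groups
```

Running metrics_by_dimension on a saved copy of the curl output should give you two groups: metrics reported per gateway, and metrics reported per gateway + interface.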

Install ActiveGate somewhere on a machine allowed to poll CoPilot IP (NSGs/SGs have to permit this)

Remember that the ActiveGate or OneAgent machines will be the ones polling the CoPilot APIs. That is where your extension gets deployed and the monitoring configuration activated.

Dynatrace SaaS is similar to a control plane which configures agents and gets stats from them. It then processes them and builds dashboards, alerts, business logic, etc. OneAgent/ActiveGate is where your extensions run to gather data from the things you want to monitor.

If you’re curious like me, your first question will be: can I not use OneAgent instead of ActiveGate? What is the difference?

As long as I can deploy my custom Prometheus extension on either and specify a non-localhost destination, is there any notable difference? The answer goes beyond functionality and also touches on metrics/licensing. Searching the Dynatrace forum revealed this explanation (albeit for a custom use case with IBM MQ):

ActiveGate vs Oneagent

Extensions run within OneAgent and fully integrate the new technology monitoring into Dynatrace smartscape and problem detection.
ActiveGate Extensions (aka Remote Extensions) are executed on ActiveGate and can acquire metrics and topology from remote sources.
One of the main advantages is that it will start consuming host DDUs first. Metrics should be the same in both cases.
If you’ve got an OS that supports OA (OneAgent) extensions (Windows or Linux) and you’re allowed by the IBM MQ team to install the extension directly on the box I’d go with that one if I were you.
Maintaining the extension on an ActiveGate might be easier than on the IBM MQ server, but that’s more an organizational topic.

Run the package installer for ActiveGate on your Linux box that has access to poll CoPilot Metric API:

root@my-system# sh Dynatrace-ActiveGate-Linux-x86-1.295.27.sh
root@my-system# systemctl status dynatracegateway.service
root@my-system# systemctl status dynatraceautoupdater.service

Install Dynatrace Plugin in VSCode

Install Mac CA VSCode

Wondering what this is, right?

If you use the VSCode Dynatrace plugin to auto-populate metric definitions in your custom Dynatrace extension code (think of it as a cool discovery mode), then the plugin needs to contact CoPilot from your laptop to fetch the metrics.
If CoPilot uses a custom, internal company CA, VSCode won’t trust the certificate it gets from https://<copilot-ip>/metrics-api/v1/gateways.

By default, VSCode extensions won’t look in your OS certificate trust store to see if that company CA is in there.

The Mac CA VSCode plugin changes this behavior and forces VSCode plugins to look there.

There is also an alternative: you can use a file that contains a dump of the metrics to prepopulate/auto-discover metric definitions in the VSCode Dynatrace plugin.

I admit to having used that as it was less time consuming.
I tried the Mac CA plugin and solved the certificate trust problem (there is no option to ignore self-signed certs), but then VSCode complained that the CoPilot cert is self-signed at the top level and not signed by a CA.

If you have Windows, then you need win-ca

Generate Access Token

VSCode needs to contact Dynatrace SaaS to publish the extension that you create and make it available to be used for collecting metric data.
For this communication to be allowed you need an API Token and the proper rights in Dynatrace assigned to it.

Go to your Dynatrace SaaS URL and generate a Token:

Add rights to it:

You will later configure this token in VSCode, in the Dynatrace plugin settings.

Create Environment, Workspace, Initialize Workspace

Environment

Go to VSCode and click on the Dynatrace icon (the sign above the magnifying glass).

Add Environment -> put here your Dynatrace URL & then the access token that you previously generated

Choose to set it as your default environment, give it a name (in my case it is Aviatrix)

Don’t worry if clicking on it shows nothing.
Later on, when we finish our Monitoring Configuration (which URL to poll for metrics and which agents, OneAgent or ActiveGate, send the request), it will start showing up there.

Workspace

Now it’s time to add a folder which will be your workspace for the custom Prometheus metrics extension that gets values from Aviatrix CoPilot.
In my case, queries will be sourced from a machine running Dynatrace ActiveGate and contacting the CoPilot Metrics API (I was too lazy and installed ActiveGate on CoPilot itself).

If you want to see what other options of the Dynatrace plugin exist, then press FN+F1 on a Mac Keyboard to get this dropdown:

Initialize Workspace

You select a schema version.
This determines things like what you can define in your extension, the formatting of metric definitions, the structure of the elements you put in your extension, which fields are mandatory and which are not, and how they are validated before being pushed to Dynatrace SaaS.

Then you need your certificates set up. Remember that you (the VSCode Dynatrace plugin, in fact) will be signing your extension.

Select No; there’s only a 1% chance that you decide to install OneAgent/ActiveGate on your development workstation.

We choose Extension 2.0 (we have no Python custom extension or JMX = java monitoring)

Import VSCode Dynatrace extension CA Certificate on ActiveGate

The VSCode plugin only copies your CA to OneAgent/ActiveGate instances running locally (on the same machine).
This is rarely the case in production, so you must copy the cert to your ActiveGate instance yourself.

Take the path from VSCode -> Plugin Settings for Dynatrace for the CA:

Get the CA contents and paste them on the ActiveGate machine in the proper place:

cat <path from screenshot above>
ssh <machine with ActiveGate>
sudo vim /var/lib/dynatrace/remotepluginmodule/agent/conf/certificates/ca.pem
# paste the contents from the "cat" command above and save the file

Restart ActiveGate service:

systemctl restart dynatracegateway

Create Extension

Define your extension metadata:

name: custom:aviatrix
version: 2.0.56
minDynatraceVersion: "1.900"
author:
  name: Mihai

Add prometheus section:

Creating the extension to poll the CoPilot API gives you a cool option: Scrape Data

What does this do?
It either polls your CoPilot URL (and of course also asks for the API key we got in the first step, when configuring the Metrics API in CoPilot)
OR
if you’re feeling lazy or running into errors with self-signed certificates, you can dump the reply from the CoPilot Metrics API into a file (query via curl) and use that file for reading metric definitions.

It can then use the metric names it just fetched (it does not need the values at this point) to generate the metric definitions inside the Dynatrace extension in VSCode automagically.
This saves you from going through the Dynatrace documentation for syntax and formatting; no more headaches, really cool stuff. Once you have uploaded your extension to Dynatrace SaaS, these definitions tell Dynatrace what it should expect to read from CoPilot and import into its own system/DB.
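If you go the file route and would rather not use curl, the dump can be sketched in Python roughly like this. It mirrors the curl call from the Metrics API step (same path, same Bearer header); the function names and the verify_tls switch are my own assumptions, and turning verification off is the equivalent of curl -k, so keep that for lab use.

```python
import ssl
import urllib.request


def build_request(copilot_host, token):
    """Build the same GET request as the curl call: Bearer token in the header."""
    url = f"https://{copilot_host}/metrics-api/v1/gateways"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def dump_metrics(copilot_host, token, out_path, verify_tls=True):
    """Fetch the metrics text and write it to out_path; returns the line count."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent of curl -k: skip certificate checks (lab use only).
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = build_request(copilot_host, token)
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        body = response.read().decode("utf-8")
    with open(out_path, "w", encoding="utf-8") as handle:
        handle.write(body)
    return len(body.splitlines())
```

Something like dump_metrics("<copilot-ip>", "<token>", "avx_metrics.txt", verify_tls=False) would then give you a file to point Scrape Data at (the file name is just an example).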

To see that in action, type “metrics:” and you will see a lightbulb appear; select Insert all scraped metrics.

The result should look like this (almost):

prometheus:
  - group: avx_metrics
    metrics:
      - key: memory_available
        value: metric:memory_available
        type: gauge
        featureSet: memory
      - key: memory_cached
        value: metric:memory_cached
        type: gauge
        featureSet: memory
      - key: memory_buf
        value: metric:memory_buf
        type: gauge
        featureSet: memory
      - key: memory_swpd
        value: metric:memory_swpd
        type: gauge
        featureSet: memory
      - key: memory_free
        value: metric:memory_free
        type: gauge
        featureSet: memory
      - key: cpu_idle
        value: metric:cpu_idle
        type: gauge
        featureSet: cpu
      - key: cpu_wait
        value: metric:cpu_wait
        type: gauge
        featureSet: cpu
      - key: cpu_us
        value: metric:cpu_us
        type: gauge
        featureSet: cpu
      - key: cpu_ks
        value: metric:cpu_ks
        type: gauge
        featureSet: cpu
      - key: rx_drop
        value: metric:rx_drop
        type: gauge
        featureSet: interfaces
      - key: tx_drop
        value: metric:tx_drop
        type: gauge
        featureSet: interfaces
      - key: rate_rx_drop
        value: metric:rate_rx_drop
        type: gauge
        featureSet: interfaces
      - key: rate_tx_drop
        value: metric:rate_tx_drop
        type: gauge
        featureSet: interfaces
      - key: rate_pkt_drop
        value: metric:rate_pkt_drop
        type: gauge
        featureSet: interfaces
      - key: bandwidth_ingress_limit_exceeded
        value: metric:bandwidth_ingress_limit_exceeded
        type: gauge
        featureSet: interfaces
      - key: rate_sent
        value: metric:rate_sent
        type: gauge
        featureSet: interfaces
      - key: rate_received
        value: metric:rate_received
        type: gauge
        featureSet: interfaces
      - key: rate_total
        value: metric:rate_total
        type: gauge
        featureSet: interfaces
      - key: pps_limit_exceeded
        value: metric:pps_limit_exceeded
        type: gauge
        featureSet: interfaces

I said almost because I cheated and added “featureSet”.
featureSet lets me define groups of metrics for which I can enable/disable polling (for example, I can say later in my Monitoring Configuration: “when using ActiveGate XYZ to poll CoPilot URL A.B.C.D, only do it for the memory category of metrics”).
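If you don’t want to add featureSet by hand to every entry, a small script can do it. The sketch below is only an illustration of the grouping I used above (memory_*, cpu_*, everything else goes to interfaces); the function names are made up, and it just prints YAML text in the same shape as the block above rather than using any Dynatrace tooling.

```python
def feature_set_for(metric_name):
    """Pick a featureSet from the metric name prefix (the grouping used in this post)."""
    if metric_name.startswith("memory_"):
        return "memory"
    if metric_name.startswith("cpu_"):
        return "cpu"
    return "interfaces"


def prometheus_section(metric_names, group="avx_metrics"):
    """Render the prometheus section as YAML text, one gauge entry per metric."""
    lines = ["prometheus:", f"  - group: {group}", "    metrics:"]
    for name in metric_names:
        lines += [
            f"      - key: {name}",
            f"        value: metric:{name}",
            "        type: gauge",
            f"        featureSet: {feature_set_for(name)}",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(prometheus_section(["memory_free", "cpu_idle", "rx_drop"]))
```

Prefix matching is the simplest rule that fits these metric names; if your CoPilot returns metrics that don’t follow it, swap the function for an explicit name-to-featureSet dictionary.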

Metric Metadata

This part covers the metric names, descriptions and unit of measurement for each metric.

I added the following in my lab:

metrics:
  - key: memory_available
    metadata:
      displayName: (AVX) Memory Available
      description: Available Memory to be allocated to CPUs (includes cache)
      unit: Byte
  - key: memory_cached
    metadata:
      displayName: (AVX) Memory Cached
      description: Cached memory (can be reallocated if needed)
      unit: Byte
  - key: memory_buf
    metadata:
      displayName: (AVX) Memory Buffers
      description: Memory used for kernel buffers
      unit: Byte
  - key: memory_swpd
    metadata:
      displayName: (AVX) Memory Swapped
      description: Memory Swap
      unit: Byte
  - key: memory_free
    metadata:
      displayName: (AVX) Memory Free
      description: Free memory (not yet allocated, does NOT include cache)
      unit: Byte
  - key: cpu_idle
    metadata:
      displayName: (AVX) CPU Idle
      description: CPU Idle time, doing really nothing
      unit: Percent
  - key: cpu_us
    metadata:
      displayName: (AVX) CPU Userspace
      description: CPU used by userspace programs
      unit: Percent
  - key: cpu_wait
    metadata:
      displayName: (AVX) CPU Wait Time
      description: CPU waits on I/O access
      unit: Percent
  - key: cpu_ks
    metadata:
      displayName: (AVX) CPU KernelSpace
      description: CPU used by linux kernel
      unit: Percent
  - key: rx_drop
    metadata:
      displayName: (AVX) Packets dropped on RX
      description: Packets dropped on RECEIVE on interface
      unit: Count
  - key: tx_drop
    metadata:
      displayName: (AVX) Packets dropped on TX
      description: Packets dropped on TRANSMIT on interface
      unit: Count
  - key: rate_rx_drop
    metadata:
      displayName: (AVX) Rate RX Drop
      description: Rate Packets dropped on RECEIVE on interface
      unit: PerSecond
  - key: rate_tx_drop
    metadata:
      displayName: (AVX) Rate TX Drop
      description: Rate Packets dropped on TRANSMIT on interface
      unit: PerSecond
  - key: rate_pkt_drop
    metadata:
      displayName: (AVX) Rate Packets Dropped (overall)
      description: Rate dropped packets
      unit: PerSecond
  - key: bandwidth_ingress_limit_exceeded
    metadata:
      displayName: (AVX) BW Ingress Limit Exceeded
      description: Exceeded BW Allowance in AWS
      unit: PerSecond
  - key: rate_sent
    metadata:
      displayName: (AVX) Rate TX
      description: Rate Bytes Sent
      unit: BitPerSecond
  - key: rate_received
    metadata:
      displayName: (AVX) Rate RX
      description: Rate Bytes Received
      unit: BitPerSecond
  - key: rate_total
    metadata:
      displayName: (AVX) Rate TX+RX
      description: Rate Bytes Sent/Received
      unit: BitPerSecond
  - key: pps_limit_exceeded
    metadata:
      displayName: (AVX) PPS Limit Exceeded
      description: PPS allowance in AWS exceeded for instance
      unit: Count

In practice this means that, for example, pps_limit_exceeded will show up in Dynatrace under Metrics like this:

key is what binds the metadata definition to the scraped metric (the Dynatrace name to the metric name returned from polling the CoPilot API):

key: pps_limit_exceeded  ==> the name it ends up with in Dynatrace
value: metric:pps_limit_exceeded ==> from the CoPilot poll, take pps_limit_exceeded
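Since key is what ties the two sections together, a typo in one of them means a metric without a display name (or metadata for a metric that never gets scraped). Here is a rough Python sketch to catch that; it simply collects the "- key:" lines from two chunks of YAML text with a regex, it is not a real YAML parser, and the function names are my own.

```python
import re

# Matches list entries of the form "- key: some_metric_name"
KEY_RE = re.compile(r"^\s*-\s*key:\s*(\S+)\s*$", re.MULTILINE)


def keys_in(yaml_text):
    """Return every metric key declared in a chunk of extension YAML, in order."""
    return KEY_RE.findall(yaml_text)


def unmatched_keys(prometheus_yaml, metadata_yaml):
    """Return (scraped keys with no metadata, metadata keys that are never scraped)."""
    scraped = set(keys_in(prometheus_yaml))
    described = set(keys_in(metadata_yaml))
    return sorted(scraped - described), sorted(described - scraped)
```

Paste the prometheus section and the metrics section into two strings and both returned lists should come back empty.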

Uploading your extension to Dynatrace

Normally you would need to follow these steps:

- Build it (results in an archive)
- Upload it to Dynatrace
- Activate it (Dynatrace uses versioning and you can switch between different releases of your custom extension)

If you don’t like going through three separate steps, there’s a much easier way that does it all upon Cmd+S / Save. Right click here and enable Fast Development Mode.

Monitoring Configuration

We created our extension and defined the metrics we fetch, their names and descriptions, but so far no configuration exists for polling the CoPilot API:

- From which ActiveGate
- Which featureSets to poll for
- Which IP + API key to use (CoPilot API + the Bearer authorization token we got when enabling the Metrics API there)

You can also do this part in VSCode; the problem is that you cannot do it there the first time.
To define a Monitoring Configuration in VSCode you need a “scope” field which uniquely identifies your ActiveGate.
You only get to see the value of this field once you do the configuration manually in the Dynatrace Portal and select your ActiveGate :)

This is where you do it in VSCode (but 1st time as stated you need manual config in Dynatrace Portal):

In the Dynatrace GUI, go to the search bar on the left side and look for Extensions. If you have only one custom extension, it will land directly on your Aviatrix extension; if not, select it.

Click on that Add Monitoring config option

Add Prometheus Endpoint (Aviatrix CoPilot)

Configure CoPilot URL + API Token:

Configure FeatureSets (for which metric categories to poll CoPilot):

Create a Dashboard in Dynatrace

Dashboard - CPU/Memory/Rate Received

You just need to add the counters that are relevant to you.
Recommended Aviatrix Counters to Monitor

Troubleshoot

You can see what is happening in case of errors in VSCode

AND

for metric/extension errors inside Dynatrace:

Extras: Create Alert from VSCode

On macOS press FN+F1, type Dynatrace, select Create Alert:

This results in the following JSON file being created:

Once you see what the file contents look like, you can also use a script to generate this kind of content for each of your monitored metrics without having to go through all those menus in the Dynatrace VSCode plugin.
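As a rough idea of what such a script could look like: take the JSON the plugin generated for one metric as a template, and swap the metric key for each of the others. The Python sketch below deliberately assumes nothing about the field names in that file; it just replaces the template metric key wherever it appears in a string value. The function names are hypothetical, so adapt them to what your generated file actually contains.

```python
import json


def swap_metric(node, old_key, new_key):
    """Return a copy of a parsed JSON value with old_key replaced in every string value."""
    if isinstance(node, dict):
        return {name: swap_metric(value, old_key, new_key) for name, value in node.items()}
    if isinstance(node, list):
        return [swap_metric(value, old_key, new_key) for value in node]
    if isinstance(node, str):
        return node.replace(old_key, new_key)
    return node  # numbers, booleans and null stay as they are


def alerts_for(template_text, template_key, metric_keys):
    """Build one alert definition per metric key from a single template alert."""
    template = json.loads(template_text)
    return {key: swap_metric(template, template_key, key) for key in metric_keys}
```

You can then json.dumps each result into its own file. Thresholds are copied as-is from the template, so you will still want to review those per metric.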

Dynatrace: Monitor your Aviatrix Environment
