apex toolkit : docs





Author: Jens Saade <jens@verticle.io> v0.7, 2017-01

Note
The docs are a work in progress. Please help us make them better by sending a PR on GitHub.

1. APEX in a Nutshell

APEX was written to provide better visibility into, and understanding of, what happens in a software product's application layer at runtime.

APEX is an abbreviation for All Purpose Extraction Toolkit.

1.1. Foreword: Visibility creates Awareness

When we first discussed and prototyped APEX, we were looking to establish better monitoring of our Java-based applications. Too often we had seen software and projects get out of control because nobody could tell in detail what was really going on inside. Reading the logs was not sufficient: finding the information we needed was like searching for a needle in a haystack or, worse, it was simply never written out.

As a result our most important KPI - Mean Time to Resolution (MTTR) - was simply too high because of low visibility caused by ineffective tooling.


Our findings were:

  1. You cannot become aware of a problem if it is not visible.

  2. Without awareness you have no chance to react proactively.

  3. Without the ability to act you make your customers ring the bell first.


Of course you can add debug and logging code, or use JMX, BTrace and other tools. There are also fairly obvious methods such as remote debugging the JVM, which is a good choice but not applicable everywhere. We were searching for a more convenient way.

This was our first leap towards APEX: establishing a highly adaptable monitoring tool. After a lot of discussion we discovered that we had created something more generic than we realized, touching multiple scenarios of interest: runtime monitoring, debugging, advanced testing and business analysis.

We decided to take a step back from our initial intent, refactor the APEX framework components and share them on GitHub so others can benefit. In the future we plan to develop services that are based on this framework.

1.2. Technical Approach

The basic idea is to hook into an existing code base without cluttering it with log statements or debug code, and to participate in code execution right on the spot. We do this by instrumenting classes as they are loaded by the classloader and installing handler hooks at specific locations using the Javassist framework.
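
The load-time hook can be sketched with the JDK's own java.lang.instrument API. This is a minimal illustration under stated assumptions, not APEX's implementation: the class names HookingTransformer and SketchAgent and the target com.example.OrderService are hypothetical, and the actual bytecode rewriting (which APEX does with Javassist) is only marked by a comment.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical transformer: sees every class as it is loaded and picks out the targets.
class HookingTransformer implements ClassFileTransformer {
    private final Set<String> targets;
    // Names of target classes seen so far (kept only to make the sketch observable).
    final List<String> matched = new CopyOnWriteArrayList<>();

    HookingTransformer(Set<String> targets) {
        this.targets = targets;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        // The JVM passes internal names such as com/example/OrderService.
        String dotted = className == null ? null : className.replace('/', '.');
        if (dotted == null || !targets.contains(dotted)) {
            return null; // null tells the JVM to keep the class unchanged
        }
        matched.add(dotted);
        // A real collector would rewrite classfileBuffer here and return the new bytes.
        return null;
    }
}

// Hypothetical agent entry point; the JVM calls premain before the application's main method.
class SketchAgent {
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(
                new HookingTransformer(Collections.singleton("com.example.OrderService")));
    }
}
```

For the JVM to call premain, the agent jar's manifest has to name the agent class in its Premain-Class attribute.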

The handler hooks can be configured to run prebuilt and custom handler classes that perform specific tasks. For instance, you can measure code execution and store the results in a metrics reservoir, or you can inject a generic dumper that writes the current scope to the log. We also provide a handler that can execute arbitrary Groovy code.

Additionally there is a metric & event messaging component that can be used by the handlers to transport information to a service endpoint.

1.3. Component Overview

APEX consists of a couple of components. Some are mandatory, some are optional. Let's take a look around.

1.3.1. Collector

This Java agent is used to connect to the target application and control the instrumentation of the code. It is attached to the JVM and will download instrumentation configs from the repository first and then inject the hooks once target classes get loaded by the application.

1.3.1.1. Handlers

Handlers are part of the instrumentation instructions and tell the collector what to do when the application code executes the placed hooks. Handlers are part of the SDK and intended to be expanded/extended by the user community.

Currently we have a very small set of handlers available that serve different purposes.

Note
Since release 0.7 handlers can be extended via the SDK. See the chapter Extending the APEX Collector Agent.

Measuring Performance

The BasicPerformanceHandler measures the execution time of a method body. We use Dropwizard Metrics to store and aggregate timers and histograms. See http://metrics.dropwizard.io/ for more details.
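
The idea of timing a method body from a pair of hooks can be sketched as follows. This is an illustration only, not the BasicPerformanceHandler: TimingSketch and its method names are hypothetical, and plain JDK counters stand in for the Dropwizard timers and histograms. Nested or recursive calls on the same thread are not handled.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for a performance handler.
class TimingSketch {
    // Start time of the call currently running on this thread.
    private final ThreadLocal<Long> startNanos = new ThreadLocal<>();
    // Plain JDK counters standing in for Dropwizard timers.
    final Map<String, LongAdder> callCount = new ConcurrentHashMap<>();
    final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

    // Hook placed at the start of the method body.
    void before() {
        startNanos.set(System.nanoTime());
    }

    // Hook placed at the end of the method body.
    void after(String metricName) {
        Long start = startNanos.get();
        if (start == null) {
            return; // after() without a matching before(): nothing to record
        }
        startNanos.remove();
        long elapsed = System.nanoTime() - start;
        callCount.computeIfAbsent(metricName, k -> new LongAdder()).increment();
        totalNanos.computeIfAbsent(metricName, k -> new LongAdder()).add(elapsed);
    }

    // Mean duration in milliseconds, or 0.0 if nothing was recorded.
    double meanMillis(String metricName) {
        LongAdder calls = callCount.get(metricName);
        if (calls == null || calls.sum() == 0) {
            return 0.0;
        }
        return totalNanos.get(metricName).sum() / 1_000_000.0 / calls.sum();
    }
}
```

Keeping the start time in a ThreadLocal lets concurrent calls on different threads be timed independently.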

Tracking Execution Context

The BasicContextDumpHandler can be used on methods to dump the current state of all signature variables before and after method body execution.
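
What such a dump could look like is sketched below. The class ContextDumpSketch, the output format and the sample method names are assumptions for illustration; the actual BasicContextDumpHandler output may differ.

```java
import java.util.Arrays;

// Hypothetical helper that renders one log line per hook phase.
class ContextDumpSketch {
    // phase is "before" or "after"; args are the values of the signature variables.
    static String dump(String phase, String methodName, Object... args) {
        // deepToString also expands nested arrays instead of printing their hash codes.
        return "[" + phase + "] " + methodName + " args=" + Arrays.deepToString(args);
    }

    public static void main(String[] unused) {
        System.out.println(dump("before", "transfer", "ACC-1", 250));
        System.out.println(dump("after", "transfer", "ACC-1", 250));
    }
}
```

Dumping the same variables before and after the method body makes it visible when a method mutates its arguments.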

Generic Class Access

The BasicScriptedHandler is probably the most powerful handler. It lets you run Groovy scripts through method callbacks. In a callback you have access to the so-called AdvisorContext, which includes the current class instance, the method arguments and the service message object to fill with metadata and object maps.

An example:

// A simple Groovy script that gets executed on each instrumented method call
def before(){
    println("method executed")

    // some objects are implicitly available here, e.g. 'context',
    // the AdvisorContext containing the instrumented instance
    def instance = context.getInstance()

    // message - our message object for apex
    message.addField("msg", "Foo? Bar!")

    // you can call methods ...
    message.addField("something", instance.sampleField.getSomething());

    // ... or read fields ...
    message.addField("somethingElse", instance.somethingElse);

    // ... and also log to sysout, quick and dirty
    println("instance:" + context.instance);

    // we could also access the method arguments via 'args' and 'argsCount'
    // argsCount: the number of arguments in the signature
    // arg1 ... argN: the arguments
    println("arg 1:" + arg1.toString())

    // the message object will now be sent to the apex service for further processing.
}

def after(){

}

1.3.1.2. Messaging

Gathered data has to be forwarded to the service to be of use. Currently, three methods are available: AMQP-based messaging (e.g. using RabbitMQ), REST-based web services and ELK Beats.

1.3.2. Instrumentation Configuration

To tell the collector what to instrument on the target system and how, the agent connects to a specified git repository and clones the instructions to the local system.

The repository has a specific layout that needs to be followed. The directory structure follows the pattern

/<artifact name>/<user-package>/<artifact version>/, e.g. /minecraft/io.verticle.apex.minecraft/0.1/ or /jenkins/com.foo.bar/2.1/.

Inside this directory there are a couple of noteworthy files:

  • meta.json - package descriptor file

  • <metricqualifier>.json - instrumentation definition file (per qualified metric)

  • <metricqualifier>.groovy - optional groovy script for a BasicScriptedHandler
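
The layout can be expressed as a few path computations. This is a sketch assuming only the directory pattern and file names listed above; the class RepoLayoutSketch, the repository root and the metric qualifier "login" are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper that mirrors the repository layout described above.
class RepoLayoutSketch {
    // <repo>/<artifact name>/<user-package>/<artifact version>/
    static Path packageDir(Path repoRoot, String artifact, String userPackage, String version) {
        return repoRoot.resolve(artifact).resolve(userPackage).resolve(version);
    }

    // Package descriptor file.
    static Path metaFile(Path packageDir) {
        return packageDir.resolve("meta.json");
    }

    // Instrumentation definition file for one qualified metric.
    static Path definitionFile(Path packageDir, String metricQualifier) {
        return packageDir.resolve(metricQualifier + ".json");
    }

    // Optional Groovy script for a BasicScriptedHandler.
    static Path scriptFile(Path packageDir, String metricQualifier) {
        return packageDir.resolve(metricQualifier + ".groovy");
    }

    public static void main(String[] args) {
        Path dir = packageDir(Paths.get("repo"), "jenkins", "com.foo.bar", "2.1");
        System.out.println(metaFile(dir));
        System.out.println(definitionFile(dir, "login"));
        System.out.println(scriptFile(dir, "login"));
    }
}
```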

1.3.3. Instrumentation Repository

This GitHub-based repo stores ready-to-use instrumentations for common software products. It is designed as a contribution catalog and will be extended gradually through PRs. If you have built a neat instrumentation configuration for APEX, this is the place to share it.

It also houses a configuration template to help you start creating your own. Simply fork the repo on GitHub and adjust existing configs to your needs.

1.3.4. APEX Service

This is the data sink to which collectors send their gathered data. We provide two different approaches here:

  • Roll your own by cloning our Spring Boot template project on GitHub (https://github.com/verticle-io/apex-service-template). Messaging is already in and you can extend it to your needs.

  • Use our upcoming cloud service offering, which is currently being prepared for beta.




2. System requirements

2.1. for the target machine

JVM

APEX packages use JVM features like java agents and instrumentation.

You will need a Sun/Oracle JDK version >= 8.
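
A quick way to check the version requirement on a given machine is to read the standard java.specification.version system property. The sketch below is an assumption-free JDK example apart from the hypothetical class name JavaVersionSketch; it is not part of APEX.

```java
// Hypothetical helper that checks the running JVM against the ">= 8" requirement.
class JavaVersionSketch {
    // Maps "1.5" to 5 and "1.8" to 8 (old scheme), and "9" or "11" to 9 or 11 (new scheme).
    static int major(String specVersion) {
        String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
        int dot = v.indexOf('.');
        return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
    }

    static boolean meetsRequirement(String specVersion) {
        return major(specVersion) >= 8;
    }

    public static void main(String[] args) {
        String current = System.getProperty("java.specification.version");
        System.out.println("Java " + major(current)
                + (meetsRequirement(current) ? " is" : " is not") + " sufficient");
    }
}
```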

Note
The code is currently developed for Java 8. We will backport parts of it to make it run smoothly with previous versions down to Java 1.5.

git

The collector uses git to access its instrumentations.

2.2. for the service machine

In case you build your own service based on our service template you will need:

JVM

You will need a Sun/Oracle JDK version >= 8.

MVN

The build is based on Maven. Either install the latest version or use the wrapper mvnw.




3. Quickstart

This guide takes approx. 15 minutes to walk through.

We will set up the individual components of the APEX toolkit: install the collector, configure and change the instrumentation configuration using the instrumentation repository, and create a small consuming service using the APEX service template project.

Note
Instructions are brief. If you want to dig deeper, we add pointers to other parts of the docs.

3.1. Preparations

Create a directory for the setup. We will call it <APEX_HOME> from here on.

3.1.1. Retrieve artifacts and configurations

Step 1:

Retrieve the latest binary release of the collector at https://github.com/verticle-io/apex-toolkit/releases/ and place it in <APEX_HOME>.

Step 2:

Fork and clone the instrumentation repository on GitHub. Head over to https://github.com/verticle-io/apex-instrumentation-repo and click the fork button. We will refer to the fork as <FORKED_REPO>.

$> cd <APEX_HOME>
$> git clone <FORKED_REPO>.git

Step 3:

Finally, clone and compile the APEX service template.

$> cd <APEX_HOME>
$> git clone https://github.com/verticle-io/apex-service-template.git
$> cd apex-service-template
# compile using the maven wrapper
$> ./mvnw install

Your <APEX_HOME> directory now should look like this:

apexAgent-<version>.jar

apex-instrumentation-repo
    /tomcat/v8/io.verticle.apex.instrumentation.tomcat8/meta.json
    / ...

apex-service-template
    /pom.xml
    /src/main/java
    / ...

Now you are done with the preparations. Next, let's install and configure APEX for your target application.

3.2. Installation

3.2.1. Apply the Collector to target

Head over to your target application and add the following parameter to your JVM command line:

-javaagent:/<APEX_HOME>/apexAgent-0.5-all.jar=/<APEX_HOME>/myApexAgentConfig.properties

Now your application will load the APEX Collector Agent. Let's configure its behaviour first.
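
In the -javaagent parameter, everything after the = sign is handed by the JVM to the agent's premain method as a single string. The following sketch shows how an agent could treat that string as the path to a properties file. It is an illustration under that assumption, not the APEX Collector's actual startup code; AgentArgsSketch and loadConfig are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.lang.instrument.Instrumentation;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical agent that reads its settings from the file named in the agent options.
class AgentArgsSketch {
    // The JVM passes the text after '=' in -javaagent:<jar>=<options> as agentArgs.
    public static void premain(String agentArgs, Instrumentation inst) {
        Properties config = loadConfig(agentArgs);
        System.out.println("loaded " + config.size() + " settings");
    }

    static Properties loadConfig(String agentArgs) {
        Properties config = new Properties();
        if (agentArgs == null || agentArgs.trim().isEmpty()) {
            return config; // no options given on the command line
        }
        try (InputStream in = Files.newInputStream(Paths.get(agentArgs.trim()))) {
            config.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read agent config " + agentArgs, e);
        }
        return config;
    }
}
```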

3.2.2. Configure the Collector

Create a configuration file called <APEX_HOME>/myApexAgentConfig.properties in an editor and adjust the settings.

# Transport settings. APEX can report via AMQP, REST and ELK-Beat
# [amqp,http,beat]
verticle.apex.service.method=http

# http and beat
verticle.apex.service.method.http.debug=false
verticle.apex.service.uri=http://127.0.0.1:9005

# http only
verticle.apex.service.username=user
verticle.apex.service.password=password

# amqp only
verticle.apex.mq.server=127.0.0.1
verticle.apex.mq.server.port=5672

# Repository Settings. APEX will be instrumented from the configuration within this repository.

# the local repo path (where to clone to)
verticle.apex.instrumentation.config.path=/var/opt/apex/repo/

# the external git repository (can be anywhere)
verticle.apex.instrumentation.remoterepository.uri=https://github.com/verticle-io/apex-instrumentation-repo.git
verticle.apex.instrumentation.remoterepository.username=user
verticle.apex.instrumentation.remoterepository.password=pass
verticle.apex.instrumentation.remoterepository.update=true
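
Before starting the target application it can help to sanity-check the file. The sketch below parses the properties and applies a few checks derived from the comments in the sample above (which keys belong to which transport). The class AgentConfigSketch and the rules themselves are assumptions for illustration, not the Collector's own validation. The sample lists beat as a transport value while the ApexBeats chapter sets beats, so the sketch accepts both.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical pre-flight check for the collector configuration.
class AgentConfigSketch {
    static final List<String> METHODS = Arrays.asList("amqp", "http", "beat", "beats");

    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    // Returns the trimmed value, or null if the key is missing or empty.
    private static String value(Properties p, String key) {
        String v = p.getProperty(key);
        return v == null || v.trim().isEmpty() ? null : v.trim();
    }

    // Returns the problems found; an empty list means the sketch's checks passed.
    static List<String> validate(Properties p) {
        List<String> problems = new ArrayList<>();
        String method = value(p, "verticle.apex.service.method");
        if (method == null || !METHODS.contains(method)) {
            problems.add("verticle.apex.service.method must be one of " + METHODS);
        } else if (method.equals("amqp")) {
            if (value(p, "verticle.apex.mq.server") == null) {
                problems.add("verticle.apex.mq.server is required for amqp");
            }
        } else if (value(p, "verticle.apex.service.uri") == null) {
            problems.add("verticle.apex.service.uri is required for " + method);
        }
        if (value(p, "verticle.apex.instrumentation.remoterepository.uri") == null) {
            problems.add("verticle.apex.instrumentation.remoterepository.uri is missing");
        }
        return problems;
    }
}
```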

3.3. Configure Instrumentation

Now the tricky part. Which code of the target system do you want to instrument?

Check your <FORKED_REPO> to get an idea on how to attach. Check out the minimal project at <APEX_HOME>/apex-instrumentation-repo/minimal.

Open minimal.json and adjust

  • the target class: TODO

  • the signature classes: TODO

If the targeting is done right your target application will print out Foo? Bar! when the target is executed.

Let's give our installation a spin.

3.4. Run

Step 1:

Start the APEX Service template:

$> cd <APEX_HOME>/apex-service-template
$> ./mvnw spring-boot:run

Step 2:

Start your target application

You know best how this works. The log file should contain some APEX-related output, e.g. the configuration bootstrapping:

***********************************************************************
***                     APEX COLLECTOR Configuration                ***
***********************************************************************

and you should see something like

<apex> trying to weave ...
<apex> weaving method ...
<apex> successfully weaved method ...

And finally our beloved

Foo? Bar!




4. Extending the APEX Collector Agent

@since 0.7

With release 0.7 the SDK and the API of the apexAgent collector have been open sourced. Both are still at a very early stage and subject to change.

The SDK allows you to write your own Handlers to be processed by the Collector. You can create your own extension and deploy it as an additional agent.

Warning
Since there are no limitations on what your handlers can do, you should test them thoroughly before applying them to a critical system!

4.1. Setting up a new Extension Project

Clone and open the example project provided with the SDK. Now either use and change the apexAgent-ext submodule to your needs or copy the structure to a new project (for this guide we assume that you change it).

4.1.1. Implement a custom Handler

This project skeleton comes with a Gradle-based build and a sample handler class to extend.

All handler methods will be called when your target configuration (class methods and constructors) matches.

  • handle() will be called at start and end of the method

  • handleBefore() at the start only

  • handleAfter() at the end only

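import java.util.Map;

// assuming SLF4J as the logging API, which matches the LoggerFactory.getLogger(Class) call
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Handler, AdvisorContext and HandlerOption come from the APEX SDK;
// import them from the SDK packages used by the example project.
public class MyHandler implements Handler {

    static final Logger logger = LoggerFactory.getLogger(MyHandler.class);

    /**
     * The handle method will be called both before and after a class method
     * that your handler matches is executed.
     * It supplies a context object that includes e.g. the current instance.
     */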
    public void handle(AdvisorContext advisorContext) {
      logger.info("MyHandler handle() called");
    }

    /**
     * Same as above but only executed at the start of the matching classmethod
     */
    public void handleBefore(AdvisorContext advisorContext) {
      logger.info("MyHandler handleBefore() called");
    }

    /**
     * Same as above but only executed at the end of the matching classmethod
     */
    public void handleAfter(AdvisorContext advisorContext) {
      logger.info("MyHandler handleAfter() called");
    }

    /**
     * You can configure options for your handler, they will be injected here as a Map
     */
    public void setOptions(Map<HandlerOption, Object> map) {

    }
}


The AdvisorContext provides accessors to the current instance and enables your handler to

  • get the method signature’s arguments and types,

  • get the method's return value and its type,

  • and get the instantiated class.

4.1.2. Sending Messages

To send a message, create a MetricMessage object and report it via the ApexCollectorFactory:

public void handleAfter(AdvisorContext advisorContext) {
      logger.info("MyHandler handleAfter() called");

      // use the NamingStrategy to create proper metric qualifiers
      // the domain is related to the area where the metric originates
      String name = DefaultNamingStrategy.getInstrumentedAdviceName(advisorContext);
      MetricMessage message = new MetricMessage(Domain.application, name, new Date());

      // you can add any number of fields to the message
      // ('trigger' and 'args' are assumed to be defined elsewhere in the handler)
      message.addField("correlationId", advisorContext.getCorrelationId());
      message.addField("trigger", trigger);
      message.addField("args", args);

      try {
          // always access via the CollectorFactory
          ApexCollectorFactory.get().reportDirect(this.getClass(), message);
      } catch (Exception e) {
          logger.error("Could not send message", e);
      }
}

Check out the basic SDK handler implementations to get more ideas. You can also use them as a base implementation and extend them: https://github.com/verticle-io/apex-toolkit/tree/master/apexAgent-sdk/src/main/java/io/verticle/oss/apex/agent/sdk/handler

4.2. Compiling and Deploying your Extension

Create your apexAgent-ext.jar with the shadowJar task:

# run from the apex-toolkit directory
./gradlew -b apexAgent-ext/build.gradle shadowJar

Then copy apexAgent-ext.jar to the target machine and place it next to apexAgent-all.jar.

Change the JVM command line and place an additional javaagent directive before the one for the apexAgent-all jar:

-javaagent:<PATHTO>/apexAgent-ext.jar -javaagent:<PATHTO>/apexAgent-all.jar=<PATHTO>/apexAgent.properties




5. Integrations

5.1. Elasticsearch: ApexBeats

@since 0.7

Elasticsearch can be extended using Community Beats, a range of service integrations that provide data to the Elasticsearch stack.

APEX provides an integration called ApexBeats. This Beats implementation is deployed with an APEX Collector Agent. It is a bridgehead for importing data generated by APEX directly into Elasticsearch.

ApexBeats is also open sourced on GitHub.

5.1.1. Installing and Running ApexBeats

5.1.1.1. Prerequisites

  • Golang 1.7

  • GOPATH set

  • project location at ${GOPATH}/github.com/verticle-io

5.1.1.2. Build

Clone https://github.com/verticle-io/apexbeats to your project location dir.

# fetch dependencies
make setup

# compile it with
make

5.1.1.3. Configure

Now adapt the configuration file apexbeat.yml to your needs and set the HTTP port properly.

5.1.1.4. Run

ApexBeats is now ready to start.

# run it with
./apexbeat -c apexbeat.yml -e -d "*"

Note
You can find a more detailed setup guide in the GitHub README.

5.1.2. Connecting the APEX Collector to ApexBeats

At this point we assume your APEX Collector has already been set up. If not, check out the Quickstart chapter.

Making the APEX Collector talk to ApexBeats is simple. Alter your Collector configuration like this:

# set the mode to beats
verticle.apex.service.method=beats
# set the service URL to point to ApexBeats
verticle.apex.service.uri=http://127.0.0.1:9005

Now restart the JVM with the Collector attached and view the logs.

Note
The Collector and ApexBeats can be deployed either to the same host or to different hosts. When deployed to different hosts, make sure ApexBeats is not blocked by a firewall.
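
To check from the Collector's host whether the configured service URI is reachable, a plain TCP connection attempt is usually enough. The sketch below is a generic JDK example, not part of APEX; the class name ReachabilitySketch and the timeout value are assumptions, and the URI is the value of verticle.apex.service.uri from the configuration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;

// Hypothetical connectivity check for the configured service URI.
class ReachabilitySketch {
    static String host(String serviceUri) {
        return URI.create(serviceUri).getHost();
    }

    // Uses the explicit port if present, otherwise the scheme's default port.
    static int port(String serviceUri) {
        URI uri = URI.create(serviceUri);
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    // True if a TCP connection can be opened within the timeout.
    static boolean canConnect(String serviceUri, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host(serviceUri), port(serviceUri)), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out or filtered, e.g. by a firewall
        }
    }

    public static void main(String[] args) {
        String uri = args.length > 0 ? args[0] : "http://127.0.0.1:9005";
        System.out.println(uri + " reachable: " + canConnect(uri, 2000));
    }
}
```

A successful connection only shows that the port is open; it does not prove that ApexBeats is the process listening on it.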