Coroner

Interaction with the coronerd object store occurs either through the web console or the command-line tools. This section focuses on the coroner tool, which is a command-line client for interacting with the object store. It is fairly simple to use but allows for complex queries and integrates easily into your terminal development environment.

Initial Setup

coroner 0.12.0 or newer

After a user has been added to the object store, they must log in to a server in order to interact with it.

For example, the user initiates a session with the object store located at errors.company.com using the following invocation:

$ coroner login https://errors.company.com
User: myuser
Password: **********

At this point the user is logged in and credentials are cached locally. The password is not saved; a session token is used instead. Once logged in, the user is able to interact with the server as specified in the "Basic Usage" section.

Additional options can be specified in ~/.coroner.cf or a configuration file the user provides to the -c option.

For more information about coroner.cf files, see Coroner Client Configuration.

coroner 0.11.4 or older

The coroner tool expects to find a configuration file named .coroner.cf in your home directory. An alternate configuration file may also be specified with the -c option to coroner. For example, coroner -c /tmp/testing.cf forces coroner to use the configuration file /tmp/testing.cf.

The object store at corp.company.com is configured to store object snapshots for the projects aclient, fastore, cronndbcacher, cacher and squid.

A project is a grouping of events, snapshots and root causes. A project may host multiple applications or a single application, depending on whether your team would find correlation of faults across applications to be useful.

With this minimal configuration file, you will now be able to interact with the object store through the various coroner sub-commands.

For more information about coroner.cf files, see Coroner Client Configuration.

Basic Usage

Summary

The summary sub-command allows us to get a quick summary of all fault activity across configured projects.

$ coroner summary
PROJECT          TRACES  GROUPS      ACTIVITY
aclient               9       2      2015-06-15 17:45 EDT [1w]
fastore             254      24      2015-06-19 17:23 EDT [8h]
cronndbcacher        64       8      2015-06-19 10:34 EDT [2d]
cacher              523     171      2015-06-20 11:53 EDT [1M]
squid              6269      44      2015-06-21 09:06 EDT [1h]

The above output tells us that there are 6269 faults in the squid project, 44 of which were deemed to be unique. The last failure in squid occurred at 2015-06-21 09:06 EDT, which was 1 hour before coroner summary was invoked.
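
When the summary needs to be consumed by a script, the columns can be split mechanically. The Python sketch below is only an illustration: it parses the sample output shown above, and the names ProjectSummary and parse_summary, as well as the assumed row layout (name, two integers, a timestamp, then a bracketed age), are hypothetical and not part of coroner.

```python
import re
from typing import List, NamedTuple

# Sample text copied from the `coroner summary` output shown above.
SUMMARY = """\
PROJECT          TRACES  GROUPS      ACTIVITY
aclient               9       2      2015-06-15 17:45 EDT [1w]
fastore             254      24      2015-06-19 17:23 EDT [8h]
cronndbcacher        64       8      2015-06-19 10:34 EDT [2d]
cacher              523     171      2015-06-20 11:53 EDT [1M]
squid              6269      44      2015-06-21 09:06 EDT [1h]
"""


class ProjectSummary(NamedTuple):
    project: str
    traces: int      # total faults recorded
    groups: int      # faults deemed unique
    last_seen: str   # timestamp of the most recent fault
    age: str         # relative age as printed, e.g. "1h"


# Assumed row layout: name, two integers, a timestamp, then "[age]".
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(.+?)\s+\[(\w+)\]\s*$")


def parse_summary(text: str) -> List[ProjectSummary]:
    rows = []
    for line in text.splitlines()[1:]:   # skip the header row
        m = ROW.match(line)
        if m:
            name, traces, groups, last_seen, age = m.groups()
            rows.append(ProjectSummary(name, int(traces), int(groups), last_seen, age))
    return rows


rows = parse_summary(SUMMARY)
busiest = max(rows, key=lambda r: r.traces)
print(busiest.project, busiest.traces, busiest.groups)
```

With the sample above, the project with the most traces is squid, matching the reading of the table given in the text.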

List

The list sub-command allows you to issue queries and perform analysis on the coroner object store. The basic usage of coroner list is specified below.

$ coroner list [<options>] <project>

coroner list aclient will output information on all faults that have ever occurred in the aclient project, grouped by uniqueness.

$ coroner list aclient
[42572cdc2329eb2fe6f149506671328d5a963d76d6639c80b23cdcd0b644289f]
  Date: Mon Jun 15 17:24:26 2015 - Mon Jun 15 17:29:45 2015
  Occurrences: 8 (over 0 days)
  Classification:
    segv (8 buckets)
    null (8 buckets)
    stop (8 buckets)
  Frames:
    event_client_init ← aclient_event_client_thread

[a36e5ba7862a49e653a7f5c7dc508423c850521186d457c6473ccbb83ecbafa8]
  Date: Thu May 14 16:11:17 2015
  Occurrences: 1 (over 0 days)
  Classification:
    ill (1 buckets)
    stop (1 buckets)
  Frames:
    crash_handler ← evhttp_handle_request ← evhttp_get_body ←
      bufferevent_trigger_nolock_ ← bufferevent_readcb ←
      event_persist_closure ← event_process_active_single_queue ←
      event_process_active ← event_base_loop ← agg_core_free_thread ← main

The long string of hex digits is a group identifier. This is a unique SHA256 signature that is used to identify a group of similar faults. The content below this group signature is summary data about the fault.

The 42572cdc2329eb2fe6f149506671328d5a963d76d6639c80b23cdcd0b644289f fault occurred 8 times, starting on June 15, 2015 at 17:24:26, with the last occurrence on the same day at 17:29:45. All 8 occurrences of the fault involved application snapshots where the monitored process suffered a segmentation fault from a null dereference (to learn more about classifiers, see the ptrace page). In addition, all instances of the fault contained faulted threads with the backtrace sequence of event_client_init being called from aclient_event_client_thread.

To view all occurrences of faults within this group, the -i option may be passed to display the instances within the group. The -H1 option is passed to coroner list here to indicate that only the first group should be displayed.

[42572cdc2329eb2fe6f149506671328d5a963d76d6639c80b23cdcd0b644289f]
  Date: Mon Jun 15 17:24:26 2015 - Mon Jun 15 17:29:45 2015
  Occurrences: 8 (over 0 days)
  Classification:
    segv (8 buckets)
    null (8 buckets)
    stop (8 buckets)
  Frames:
    packrat_client_init ← aclient_packrat_client_thread
  Objects:
    [e6702f1063af4e1f8d2b20fa6444e7e5] (Mon Jun 15 17:29:45 2015)
    [e45fa54e3d6f4f34b72fa50b45daf45a] (Mon Jun 15 17:29:27 2015)
    [e9ca85077fa54f6baae0d645e5799bba] (Mon Jun 15 17:29:10 2015)
    [1e2f8d976d70493784d886055ed0fe16] (Mon Jun 15 17:28:54 2015)
    [8cd727dabcba4046b166a827bf2db6f5] (Mon Jun 15 17:25:17 2015)
    [b096491ee72f4ddc8428fad00a626def] (Mon Jun 15 17:25:01 2015)
    [cf513af02a494db58944b92bbfa0dabf] (Mon Jun 15 17:24:43 2015)
    [7ebdd696c2154130ad66e2563b120edc] (Mon Jun 15 17:24:26 2015)

Each shorter sequence of hex characters is a unique identifier for an individual fault. These identifiers can be used to retrieve and edit snapshots, as well as share them. We will learn more about viewing snapshots later.
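
In the sample output, group identifiers are 64 hex characters long and object identifiers are 32, so the two can be told apart by length. The Python sketch below uses that observation to collect the objects listed under each group. It is a rough illustration that works on an abbreviated copy of the output above; the function name objects_by_group is hypothetical, and the line layout it expects is assumed from the sample rather than guaranteed by coroner.

```python
import re

# Identifier shapes as they appear in the sample output (an assumption).
GROUP_ID = re.compile(r"^\[([0-9a-f]{64})\]$")                 # SHA256 group signature
OBJECT_ID = re.compile(r"^\s+\[([0-9a-f]{32})\]\s+\((.+)\)$")  # per-fault identifier

# Abbreviated copy of the `coroner list` output shown above.
SAMPLE = """\
[42572cdc2329eb2fe6f149506671328d5a963d76d6639c80b23cdcd0b644289f]
  Date: Mon Jun 15 17:24:26 2015 - Mon Jun 15 17:29:45 2015
  Occurrences: 8 (over 0 days)
  Objects:
    [e6702f1063af4e1f8d2b20fa6444e7e5] (Mon Jun 15 17:29:45 2015)
    [e45fa54e3d6f4f34b72fa50b45daf45a] (Mon Jun 15 17:29:27 2015)
"""


def objects_by_group(text):
    """Map each group identifier to its (object id, timestamp) pairs."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.rstrip()          # tolerate trailing whitespace
        g = GROUP_ID.match(line)
        if g:
            current = g.group(1)
            groups[current] = []
            continue
        o = OBJECT_ID.match(line)
        if o and current is not None:
            groups[current].append((o.group(1), o.group(2)))
    return groups


for group, objects in objects_by_group(SAMPLE).items():
    print(group[:12], len(objects))
```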

Key-Value Attributes

How do we go about displaying key-value attributes? The list command can specify that certain key-value attributes be exposed and be used as a factor for grouping faults. To print all hostname and tag attributes along with instances, the --expand option is used. In the example below, 5 is passed to -i so that only the five most recent instances of a fault are displayed.

$ coroner list aclient --expand=hostname,tag -i5 -H1
[42572cdc2329eb2fe6f149506671328d5a963d76d6639c80b23cdcd0b644289f]
  Date: Mon Jun 15 17:24:26 2015 - Mon Jun 15 17:29:45 2015
  Occurrences: 8 (over 0 days)
  Attributes:
    tag (1 buckets)
    hostname (1 buckets)
  Classification:
    segv (8 buckets)
    null (8 buckets)
    stop (8 buckets)
  Frames:
    packrat_client_init ← aclient_packrat_client_thread
  Objects:
    [e6702f1063af4e1f8d2b20fa6444e7e5] (Mon Jun 15 17:29:45 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [e45fa54e3d6f4f34b72fa50b45daf45a] (Mon Jun 15 17:29:27 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [e9ca85077fa54f6baae0d645e5799bba] (Mon Jun 15 17:29:10 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [1e2f8d976d70493784d886055ed0fe16] (Mon Jun 15 17:28:54 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [8cd727dabcba4046b166a827bf2db6f5] (Mon Jun 15 17:25:17 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [b096491ee72f4ddc8428fad00a626def] (Mon Jun 15 17:25:01 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [cf513af02a494db58944b92bbfa0dabf] (Mon Jun 15 17:24:43 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [7ebdd696c2154130ad66e2563b120edc] (Mon Jun 15 17:24:26 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 

The --frequency option may be used to display histograms of all values associated with a key. For example, to request a histogram of all affected tag attributes for a group, pass --frequency=tag to coroner list.

$ coroner list aclient --expand=hostname,tag -i5 -H1 --frequency=tag
[42572cdc2329eb2fe6f149506671328d5a963d76d6639c80b23cdcd0b644289f]
  Date: Mon Jun 15 17:24:26 2015 - Mon Jun 15 17:29:45 2015
  Occurrences: 8 (over 0 days)
  Attributes:
    tag (1 buckets)
      0.145                                          8 100.00% ███████████████
    hostname (1 buckets)
  Classification:
    segv (8 buckets)
    null (8 buckets)
    stop (8 buckets)
  Frames:
    packrat_client_init ← aclient_packrat_client_thread
  Objects:
    [e6702f1063af4e1f8d2b20fa6444e7e5] (Mon Jun 15 17:29:45 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [e45fa54e3d6f4f34b72fa50b45daf45a] (Mon Jun 15 17:29:27 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [e9ca85077fa54f6baae0d645e5799bba] (Mon Jun 15 17:29:10 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [1e2f8d976d70493784d886055ed0fe16] (Mon Jun 15 17:28:54 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [8cd727dabcba4046b166a827bf2db6f5] (Mon Jun 15 17:25:17 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [b096491ee72f4ddc8428fad00a626def] (Mon Jun 15 17:25:01 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [cf513af02a494db58944b92bbfa0dabf] (Mon Jun 15 17:24:43 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
    [7ebdd696c2154130ad66e2563b120edc] (Mon Jun 15 17:24:26 2015)
      Classification: segv null stop 
      Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1 
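
The histogram row printed under tag can be reproduced by hand from the per-object attribute lines. The Python sketch below only illustrates that counting and is not coroner's implementation; the function names frequency and render are hypothetical, and the bar width of 15 and the column spacing are assumptions modeled on the sample output.

```python
from collections import Counter

# Attribute lines as printed under each object above (eight identical rows).
ATTRIBUTE_LINES = ["Attributes: tag=0.145 hostname=21.bm-aclient.prod.sin1"] * 8


def frequency(lines, key):
    """Count how often each value of `key` appears across attribute lines."""
    counts = Counter()
    for line in lines:
        _, _, rest = line.partition("Attributes:")
        for pair in rest.split():
            name, _, value = pair.partition("=")
            if name == key:
                counts[value] += 1
    return counts


def render(counts, width=15):
    """Format one row per value: value, count, share and a proportional bar."""
    total = sum(counts.values())
    rows = []
    for value, n in counts.most_common():
        share = n / total
        bar = "█" * round(share * width)
        rows.append(f"{value:<20}{n:>6} {share * 100:6.2f}% {bar}")
    return rows


for row in render(frequency(ATTRIBUTE_LINES, "tag")):
    print(row)
```

Because every instance in this group carries the same tag, the single value 0.145 accounts for all 8 occurrences.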

Saving Preferences

It would be tedious to pass commonly used options to coroner list and other coroner sub-commands on every invocation. The default configuration block in your configuration file may be used to specify that certain options always be set for sub-commands. See the example configuration below.

[universe]
name=company
read=127.0.0.1:4097

[default]
list.frequency=environment,tag,dc,collection
list.expand=environment,tag,dc,collection,application,hostname
list.sort=hostname
list.instances=3

This tells coroner that whenever the list sub-command is invoked, it should behave as if --frequency=environment,tag,dc,collection, --expand=environment,tag,dc,collection,application,hostname, --sort=hostname and --instances=3 were prepended to the list of options supplied on the command line.

The general form of the default configuration block is <sub-command>.<option>=<value>. For example, X.Y=Z specifies that if coroner X <options> is specified then it should be interpreted as if it were coroner X --Y=Z <options>.
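
The X.Y=Z rule can be sketched in a few lines of Python. This is an illustration of the rewriting described above, not coroner's own code: the function name expand_defaults is hypothetical, and it assumes the configuration file can be read by Python's configparser, which the source does not state.

```python
import configparser

# The example configuration from this section.
CONFIG = """\
[universe]
name=company
read=127.0.0.1:4097

[default]
list.frequency=environment,tag,dc,collection
list.expand=environment,tag,dc,collection,application,hostname
list.sort=hostname
list.instances=3
"""


def expand_defaults(config_text, subcommand, user_args):
    """Prepend --Y=Z for every X.Y=Z entry whose X equals `subcommand`."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    prefix = subcommand + "."
    defaults = []
    # "[default]" is an ordinary section here; configparser only treats
    # the upper-case name "DEFAULT" specially.
    if parser.has_section("default"):
        for key, value in parser.items("default"):
            if key.startswith(prefix):
                defaults.append(f"--{key[len(prefix):]}={value}")
    return ["coroner", subcommand, *defaults, *user_args]


print(" ".join(expand_defaults(CONFIG, "list", ["aclient"])))
```

Entries for other sub-commands are ignored, so expand_defaults(CONFIG, "summary", []) yields just coroner summary.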

Examples

Common use-cases are displayed below.

Sort groups by recent activity

The -R option allows you to sort groups by date of recent activity. An example is coroner list project -R.

List all activity in the last week

The --age option may be used to filter groups within a recent time window. coroner list project --age=1w would list all groups and faults with activity in the past week.

Find groups with a certain classification

The --classifier option is used to filter groups by classification. coroner list project --classifier=null displays all faults that were classified as a NULL dereference. For a complete list of classifiers, please visit the ptrace page.

Find all groups by a callstack

The --frames option is used to specify a sequence of functions to search for. coroner list project --frames=A,B,C finds all groups of faults whose backtrace matches the regular expressions A, B and C, in that order. For example, --frames=^ck_,^worker would match a callstack of ck_hs_put <- read_data <- open_cache <- worker <- start_thread.
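
The ordered matching can be pictured as a subsequence search over the callstack. The Python sketch below is one plausible reading of the example above, not coroner's actual matcher: the function name frames_match is hypothetical, and it assumes that unmatched frames may sit between matches and that each frame satisfies at most one pattern.

```python
import re


def frames_match(patterns, callstack):
    """Return True if each regex matches some frame, in order (gaps allowed)."""
    frames = iter(callstack)
    for pattern in patterns:
        regex = re.compile(pattern)
        # any() consumes frames up to and including the first match, so the
        # next pattern can only match frames further along the callstack.
        if not any(regex.search(frame) for frame in frames):
            return False
    return True


# Callstack from the example above, innermost frame first.
stack = ["ck_hs_put", "read_data", "open_cache", "worker", "start_thread"]

print(frames_match(["^ck_", "^worker"], stack))   # True
print(frames_match(["^worker", "^ck_"], stack))   # False: wrong order
```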

Aggregate listed groups by key-value

The --aggregate option can be used to aggregate coroner list output. coroner list --aggregate=tag aggregates all crashes returned by the listing by their tag attribute. It is also possible to aggregate by functions in a callstack by using ::frames, for example, coroner list --aggregate=::frames.

Download a snapshot to disk

The coroner get command can be used to view a snapshot. For example, coroner get aclient e9ca85077fa54f6baae0d645e5799bba will open the snapshot e9ca85077fa54f6baae0d645e5799bba of the aclient project locally. The -o option is used to store to a local file. coroner get -o copy.btt aclient e9ca85077fa54f6baae0d645e5799bba will download the specified snapshot to copy.btt.

Additional Information

Please refer to coroner list --help.