API

This section covers the documentation for Kadabra’s classes, including the client API, the Agent, Metrics, Channels, and Publishers.

Client

class kadabra.Kadabra(configuration=None)

Main client API for Kadabra. In conjunction with the MetricsCollector, allows you to collect metrics from your application and queue them for publishing via a channel.

Typically you will use it like so:

import datetime

from kadabra import Kadabra, Units

kadabra = Kadabra()
metrics = kadabra.metrics()
...
metrics.add_count("myCount", 1.0)
...
metrics.set_timer("myTimer", datetime.timedelta(seconds=5), Units.SECONDS)
...
metrics.add_count("myCount", 1.0)
...
kadabra.send(metrics.close())
Parameters:configuration (dict) – Dictionary of configuration to use in place of the defaults.
metrics()

Return a MetricsCollector initialized with any dimensions as specified by the default dimensions. The collector can be used to gather metrics from your application code.

Return type:MetricsCollector
Returns:A MetricsCollector instance.
send(metrics)

Send a Metrics instance to this client’s configured channel so that it can be received and published by the agent. Note that a Metrics instance can be retrieved from a collector by calling its close() method.

Parameters:metrics (Metrics) – The Metrics instance to be published.
class kadabra.client.MetricsCollector(timestamp_format, **dimensions)

A class for collecting metrics. Once initialized, instances of this class collect metrics by aggregating counts and keeping track of dimensions and timers. Typically you won’t instantiate this class directly, but rather retrieve an instance from the client’s metrics() method.

Counters are floating point values aggregated over the lifetime of this object, and published as a single value (per counter name).

Timers are floating point values, each recorded along with a unit.

A collector instance can be used to collect metrics until it is closed by calling its close() method. After close() has been called, this object can be safely published without the possibility of “losing” additional metrics between the time it is closed and the time it is published.

Although collector objects are thread-safe (meaning the same object can be used by multiple threads), note that any thread that attempts to use a collector instance after it has been closed will raise an exception.

Parameters:
  • timestamp_format (string) – The format for timestamps when serializing into a Metrics instance.
  • dimensions (dict) – Any dimensions that this object should be initialized with.
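
In practice you would retrieve a collector from the client’s metrics() method; for illustration, here is a minimal direct-instantiation sketch. The timestamp format shown matches the default used by Metrics, and the dimension names are purely illustrative:

from kadabra.client import MetricsCollector

# Normally obtained via Kadabra().metrics(); shown here directly for clarity.
collector = MetricsCollector("%Y-%m-%dT%H:%M:%S.%fZ",
                             environment="production",
                             host="web-1")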
add_count(name, value, timestamp=None, metadata=None, replace_timestamp=False)

Add a new counter to this collector object, or add the value to an existing counter if it already exists.

Parameters:
  • name (string) – The name of the counter.
  • value (float) – The floating point value to either initialize a new counter with, or add to an existing one.
  • timestamp (datetime) – The timestamp to use for when this count was recorded. If unspecified, defaults to now (in UTC).
  • metadata (dict) – Any metadata to include with this counter as a dictionary of strings to strings. These will be included as unindexed fields for this counter in certain metrics databases. Note that if you specify this for an existing counter, it will completely overwrite the existing metadata. However if you do not specify it, the previous metadata for the counter will remain unchanged.
  • replace_timestamp (bool) – Whether to replace the existing timestamp for a counter if it already exists. Set this to True if you want to update the timestamp when you add to an existing counter.
Raises:

CollectorClosedError – If this collector object has already been closed.
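
A short sketch of counter aggregation; the counter names and metadata values are illustrative:

collector.add_count("pageViews", 1.0)
collector.add_count("pageViews", 1.0)   # "pageViews" now aggregates to 2.0
collector.add_count("errors", 1.0,
                    metadata={"errorType": "timeout"},
                    replace_timestamp=True)  # refreshes the timestamp if "errors" already exists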

close()

Close this collector object and return an equivalent Metrics object. After this method is called, you can no longer set dimensions, set timers, or add counts to this object.

Return type:Metrics
Returns:A Metrics instance from the collector’s dimensions, counters, and timers.
Raises:CollectorClosedError – Raised if this collector object has already been closed.
set_dimension(name, value)

Set a dimension for this collector object. If it already exists, it will be overwritten with the new value.

Parameters:
  • name (string) – The name of the dimension to set.
  • value (string) – The value of the dimension to be set.
Raises:

CollectorClosedError – If this collector object has already been closed.
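
For example (dimension names are illustrative):

collector.set_dimension("environment", "production")
collector.set_dimension("environment", "staging")   # overwrites the previous value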

set_timer(name, value, unit, timestamp=None, metadata=None)

Set a timer value for this collector object using a timedelta. If it already exists, it will be overwritten with the new value.

Parameters:
  • name (string) – The name of the timer to set.
  • value (timedelta) – The value to use for the timer.
  • unit (Unit) – The unit to use for this timer. Common units are specified in kadabra.Units.
  • timestamp (datetime) – The timestamp to use for when this timer was recorded. If unspecified, defaults to now (in UTC).
  • metadata (dict) – Any metadata to include with this timer as a dictionary of strings to strings. These will be included as unindexed fields for this timer in certain metrics databases. Note that if you specify this for an existing timer, it will completely overwrite the existing metadata. However if you do not specify it, the previous metadata for the timer will remain unchanged.
Raises:

CollectorClosedError – If this collector object has already been closed.
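
A sketch using a timedelta together with one of the built-in units (the timer name is illustrative):

import datetime

from kadabra import Units

collector.set_timer("requestLatency",
                    datetime.timedelta(milliseconds=250),
                    Units.MILLISECONDS)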

class kadabra.client.CollectorClosedError

Raised if you try to add metrics to or close a MetricsCollector object that has already been closed.
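
For example, attempting to record a count after closing the collector raises this error:

from kadabra.client import CollectorClosedError

metrics = collector.close()
try:
    collector.add_count("tooLate", 1.0)
except CollectorClosedError:
    pass  # the collector can no longer accept metrics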

Agent

class kadabra.Agent(configuration=None)

Reads metrics from a channel and publishes them (see Publishers). The agent will spin up threads which listen to the configured channel and attempt to publish the metrics using the configured publisher. The agent will also periodically monitor the metrics that have been in progress for a while and attempt to republish them. Because the agent is meant to run indefinitely, side by side with your application, it should be configured and started in a separate, dedicated process.

Internally this object just manages a Receiver and Nanny.

Parameters:configuration (dict) – Dictionary of configuration to use in place of the defaults.
start()

Start the agent. It will receive metrics from the channel, publish them, and attempt to republish metrics that have been pending for a long time (in the case of publishing failures). The agent runs until stopped; thus, you should call this method from a dedicated Python process, as it will block until the process is killed, a keyboard interrupt is detected, or the stop() method is called.

stop(*args, **kwargs)

Stop the Agent gracefully, ensuring that any pending publish attempts are finished. This method accepts arbitrary arguments so that it can be called from any context (such as a signal handler).
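
A minimal sketch of running the agent in a dedicated process, with stop() wired up as a signal handler (the default configuration is assumed):

import signal

from kadabra import Agent

agent = Agent()

# stop() accepts arbitrary arguments, so it can be registered directly as a handler.
signal.signal(signal.SIGTERM, agent.stop)

agent.start()   # blocks until the process is killed or stop() is called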

class kadabra.agent.Receiver(channel, publisher, logger, num_threads)

Manages ReceiverThreads which receive metrics from the channel, move them from the queue to in-progress, and attempt to publish them. Publishing failures will result in the metrics remaining in-progress and getting picked up by the Nanny which will attempt to republish them.

Parameters:
  • channel (Channels) – The channel to read metrics from. See Channels.
  • publisher (Publishers) – The publisher to use for publishing metrics. See Publishers.
  • logger (Logger) – The logger to use.
  • num_threads (integer) – The number of threads to use for publishing metrics.
start()

Start the receiver by starting up each ReceiverThread.

stop()

Stop the receiver by stopping each ReceiverThread.

class kadabra.agent.ReceiverThread(channel, publisher, logger)

Listens to a channel for metrics and publishes them.

Parameters:
  • channel (Channels) – The channel to listen to for metrics. See Channels.
  • publisher (Publishers) – The publisher to use for publishing metrics. See Publishers.
  • logger (Logger) – The logger to use.
run()

Run this thread until stopped.

stop()

Stop this thread, ensuring that the current run will be the last one.

class kadabra.agent.BatchedReceiver(channel, publisher, logger, publishing_interval, max_batch_size)

An alternative to the normal Receiver, the BatchedReceiver will periodically wake up and attempt to publish all metrics at once. Publishing failures will still result in the metrics being retried by the Nanny.

Parameters:
  • channel (Channels) – The channel to read metrics from. See Channels.
  • publisher (Publishers) – The publisher to use for publishing metrics. See Publishers.
  • logger (Logger) – The logger to use.
  • publishing_interval (integer) – The interval at which metrics should be attempted to be published.
  • max_batch_size (integer) – The maximum number of metric collections to publish at once. Note that the total number of individual metrics published may be larger, since each collection contains several metrics grouped by dimension. This parameter only controls the number of metric collections that are retrieved from the channel and published at once.
start()

Start the batched receiver.

stop()

Stop the batched receiver.

class kadabra.agent.Nanny(channel, publisher, logger, frequency_seconds, threshold_seconds, query_limit, num_threads)

Monitors metrics that have been in progress for a long time and attempts to republish them. This object will periodically query objects in the in-progress queue and try to republish them if the time between now and when they were serialized is greater than a threshold (indicating that the first attempt to publish the metrics failed). It will grab the first X elements from the in-progress queue (where X is a configured value) and add them to a queue, which NannyThreads will read from and attempt to republish. If metrics are successfully published, they will be marked as complete.

Parameters:
  • channel (Channels) – The channel to monitor.
  • publisher (Publishers) – The publisher to use for republishing metrics.
  • logger (Logger) – The logger to use.
  • frequency_seconds (integer) – How often the Nanny should query the in_progress queue.
  • threshold_seconds (integer) – The threshold seconds to determine if metrics should be attempted to be republished.
  • query_limit (integer) – The maximum number of elements to query from the in-progress queue for any given Nanny run. This is necessary because the in-progress queue will constantly be changing; the Nanny therefore needs to take a “snapshot” rather than iterate through the queue.
  • num_threads (integer) – The number of NannyThreads to use for republishing.
start()

Start the nanny by starting up each NannyThread.

stop()

Stop the nanny by stopping it from listening to the channel and by stopping each NannyThread.

class kadabra.agent.NannyThread(channel, publisher, queue, logger)

Listens to a queue for metrics that have been in progress for a long time and attempts to republish them. If the publishing is successful, marks the metrics as complete.

Parameters:
  • channel (Channels) – The channel to mark the metrics as complete upon successful publishing.
  • publisher (Publishers) – The publisher to be used to publish the metrics object.
  • queue (Queue) – The queue to monitor for metrics to republish.
  • logger (Logger) – The Logger to log messages to.
run()

Run this thread until stopped.

stop()

Stop this thread, ensuring that the current run will be the last one.

class kadabra.agent.BatchedNanny(channel, publisher, logger, frequency_seconds, threshold_seconds, max_batch_size)

Just like the regular Nanny, but deals with batches of metrics, attempting to publish them directly as a single batch. Used with the BatchedReceiver.

Parameters:
  • channel (Channels) – The channel to monitor.
  • publisher (Publishers) – The publisher to use for republishing metrics.
  • logger (Logger) – The logger to use.
  • frequency_seconds (integer) – How often the Nanny should query the in_progress queue.
  • threshold_seconds (integer) – The threshold seconds to determine if metrics should be attempted to be republished.
  • max_batch_size (integer) – The maximum size of the batch to receive from the channel and attempt to republish.
start()

Start the batched nanny.

stop()

Stop the batched nanny.

Metrics

class kadabra.Dimension(name, value)

Dimensions are used for grouping sets of metrics by shared components. They are key-value string pairs which are meant to be indexed in the metrics storage for ease of querying metrics.

Parameters:
  • name (string) – The name of the dimension.
  • value (string) – The value of the dimension.
static deserialize(value)

Deserializes a dictionary into a Dimension instance.

Parameters:value (dict) – The dictionary to deserialize into a Dimension instance.
Return type:Dimension
Returns:A dimension that the dictionary represents.
serialize()

Serializes this dimension to a dictionary.

Return type:dict
Returns:The dimension as a dictionary.
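
A round-trip sketch (the exact dictionary layout is an internal detail, so it is not shown here):

from kadabra import Dimension

dim = Dimension("environment", "production")
data = dim.serialize()              # plain dict, safe to transport
same = Dimension.deserialize(data)  # reconstructs an equivalent Dimension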
class kadabra.Counter(name, timestamp, metadata, value)

A counter metric, which consists of a name and a floating-point value.

Parameters:
  • name (string) – The name of the metric.
  • timestamp (datetime) – The timestamp of the metric.
  • metadata (dict) – Metadata associated with this metric, in the form of string-string key-value pairs. This metadata is meant to be stored as non-indexed fields in the metrics storage.
  • value (float) – The floating-point value of this counter.
static deserialize(value, timestamp_format)

Deserializes a dictionary into a Counter instance.

Parameters:value (dict) – The dictionary to deserialize into a Counter instance.
Return type:Counter
Returns:A counter that the dictionary represents.
serialize(timestamp_format)

Serializes this counter to a dictionary.

Parameters:timestamp_format (string) – The format string for this counter’s timestamp.
Return type:dict
Returns:The counter as a dictionary.
class kadabra.Timer(name, timestamp, metadata, value, unit)

A timer metric representing an elapsed period of time, identified by a datetime.timedelta and a Unit.

Parameters:
  • name (string) – The name of the timer.
  • timestamp (datetime) – The timestamp of the timer.
  • metadata (dict) – The metadata associated with the timer.
  • value (timedelta) – The value of the timer.
  • unit (kadabra.Unit) – The unit of the timer value.
static deserialize(value, timestamp_format)

Deserializes a dictionary into a Timer instance.

Parameters:value (dict) – The dictionary to deserialize into a Timer instance.
Return type:Timer
Returns:A timer that the dictionary represents.
serialize(timestamp_format)

Serializes this timer to a dictionary.

Parameters:timestamp_format (string) – The format string for this timer’s timestamp.
Return type:dict
Returns:The timer as a dictionary.
class kadabra.Unit(name, seconds_offset)

A unit, representing an offset from seconds. This is used by kadabra.Timer for unambiguous reporting of the timer’s value.

Parameters:
  • name (string) – The name of the unit.
  • seconds_offset (integer) – The offset of the unit relative to seconds.
static deserialize(value)

Deserializes a dictionary into a Unit instance.

Parameters:value (dict) – The dictionary to deserialize into a Unit instance.
Return type:Unit
Returns:A unit that the dictionary represents.
serialize()

Serializes this unit to a dictionary.

Return type:dict
Returns:The unit as a dictionary.
class kadabra.Units

Container for commonly used units.

MILLISECONDS = <kadabra.metrics.Unit object>

Unit representing milliseconds.

SECONDS = <kadabra.metrics.Unit object>

Unit representing seconds.
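
A sketch constructing a Timer with one of the built-in units; the name, timestamp, and value are illustrative:

import datetime

from kadabra import Timer, Units

timer = Timer("requestLatency",
              datetime.datetime.utcnow(),
              {},
              datetime.timedelta(milliseconds=42),
              Units.MILLISECONDS)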

class kadabra.Metrics(dimensions, counters, timers, timestamp_format='%Y-%m-%dT%H:%M:%S.%fZ', serialized_at=None)

This class encapsulates metrics which can be transported over a channel and received by the agent. It should only ever be initialized, never modified after construction (i.e. instances are meant to be immutable). This guarantees correct behavior with respect to the client (which transports the metrics) and the agent (which receives and publishes the metrics).

Parameters:
  • dimensions (list) – Dimensions for this set of metrics.
  • counters (list) – Counters for this set of metrics.
  • timers (list) – Timers for this set of metrics.
  • timestamp_format (string) – The format string for timestamps.
  • serialized_at (string) – The timestamp string for when the metrics were serialized, if they were previously serialized.
static deserialize(value)

Deserializes a dictionary into a Metrics instance.

Parameters:value (dict) – The dictionary to deserialize into a Metrics instance.
Return type:Metrics
Returns:A metrics that the dictionary represents.
serialize()

Serializes this set of metrics into a dictionary.

Return type:dict
Returns:The metrics as a dictionary.
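
A round-trip sketch, assuming metrics is a closed collector as in the client example above:

from kadabra import Metrics

payload = metrics.serialize()            # plain dict suitable for sending over a channel
restored = Metrics.deserialize(payload)  # an equivalent Metrics instance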

Channels

class kadabra.channels.RedisChannel(host, port, db, logger, queue_key, inprogress_key)

A channel for transporting metrics using Redis.

Parameters:
  • host (string) – The host of the Redis server.
  • port (int) – The port of the Redis server.
  • db (int) – The database to use on the Redis server. This should be used exclusively for Kadabra to prevent collisions with keys that might be used by your application.
  • logger (string) – The name of the logger to use.
  • queue_key (string) – The key of the Redis list to use as the queue for pending metrics.
  • inprogress_key (string) – The key of the Redis list that holds metrics which are currently in progress.
DEFAULT_ARGS = {'db': 0, 'host': 'localhost', 'inprogress_key': 'kadabra_inprogress', 'logger': 'kadabra.channel', 'queue_key': 'kadabra_queue', 'port': 6379}

Default arguments for the Redis channel. These will be used by the client and agent to initialize this channel if custom configuration values are not provided.
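
A construction sketch using the defaults shown above (a Redis server running locally on the default port is assumed):

from kadabra.channels import RedisChannel

channel = RedisChannel(**RedisChannel.DEFAULT_ARGS)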

complete(metrics)

Mark a list of metrics as completed by removing them from the in-progress queue.

Parameters:metrics (list) – The list of Metrics to mark as complete.
in_progress(query_limit)

Return a list of the metrics that are in_progress.

Parameters:query_limit (int) – The maximum number of items to get from the in progress queue.
Return type:list
Returns:A list of Metrics that are in progress.
receive()

Receive metrics from the queue so they can be published. Once received, the metrics will be moved into a temporary “in progress” queue until they have been acknowledged as published (by calling complete()). This method will block until metrics are available on the queue, or for at most 10 seconds.

Return type:Metrics
Returns:The metrics to be published, or None if there were no metrics received after the timeout.
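
A sketch of the receive/publish/complete cycle that an agent thread performs; channel and publisher are assumed to have been constructed already, and the publisher is assumed to accept a list of Metrics (as documented for DebugPublisher below):

m = channel.receive()        # blocks for up to 10 seconds
if m is not None:
    publisher.publish([m])   # attempt to publish the received metrics
    channel.complete([m])    # acknowledge so the metrics leave the in-progress queue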
receive_batch(max_batch_size)

Receive a list of metrics from the queue so they can be published. Once received, all metrics will be moved into a temporary “in progress” queue until they have been acknowledged as published (by calling complete()). The returned list contains at most max_batch_size entries and may be empty.

Parameters:max_batch_size (int) – The maximum number of metrics to receive in the batch.
Return type:list
Returns:The list of metrics to be published. It contains at most max_batch_size entries, and will be empty if there are no metrics in the queue.
send(metrics)

Send metrics to a Redis list, which acts as a queue of pending metrics waiting to be received and published.

Parameters:metrics (Metrics) – The metrics to be sent.

Publishers

class kadabra.publishers.DebugPublisher(logger_name)

Publish metrics to a logger using the given logger name. Useful for debugging.

Parameters:logger_name (string) – The name of the logger to use.
DEFAULT_ARGS = {'logger_name': 'kadabra.publisher'}

Default arguments for this publisher. These will be used by the agent to initialize this publisher if custom configuration values are not provided.

publish(metrics)

Publish the metrics by logging them (in serialized JSON format) to the publisher’s logger at the INFO level.

Parameters:metrics (list) – The list of Metrics to publish.
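
A sketch of publishing a single set of metrics for debugging; metrics is assumed to be a closed Metrics instance, and logging must be configured to emit INFO messages:

import logging

from kadabra.publishers import DebugPublisher

logging.basicConfig(level=logging.INFO)

publisher = DebugPublisher("kadabra.publisher")
publisher.publish([metrics])   # logs the serialized metrics at INFO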
class kadabra.publishers.InfluxDBPublisher(host, port, database, timeout)

Publish metrics by persisting them into an InfluxDB database. A series will be created for each metric: each metric name becomes a measurement and its dimensions become the tag set. A single field called ‘value’ will be created, containing the value of the counter or timer. Timers will have an additional field called ‘unit’ which contains the name of the unit. Any metadata will become additional fields, although note that ‘value’ is a reserved name that will be overwritten for both metric types, and ‘unit’ will be overwritten for timers. For more information about InfluxDB, see the docs at https://docs.influxdata.com/influxdb.

Parameters:
  • host (string) – The hostname of the InfluxDB database.
  • port (int) – The port of the InfluxDB database.
  • database (string) – The name of the database to use for publishing metrics with this publisher. Note that this database must exist prior to publishing metrics with this publisher - make sure you set it up beforehand!
  • timeout (int) – The timeout to wait for when calling the InfluxDB database before failing.
DEFAULT_ARGS = {'host': 'localhost', 'port': 8086, 'timeout': 5, 'database': 'kadabra'}

Default arguments for this publisher. These will be used by the agent to initialize this publisher if custom configuration values are not provided.

publish(metrics)

Publish the metrics by writing them to InfluxDB.

Parameters:metrics (Metrics) – The metrics to publish.
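
A construction sketch using the documented defaults; the ‘kadabra’ database is assumed to exist already, and publish() is called with a Metrics instance as documented above:

from kadabra.publishers import InfluxDBPublisher

publisher = InfluxDBPublisher(**InfluxDBPublisher.DEFAULT_ARGS)
publisher.publish(metrics)   # writes the metrics to the 'kadabra' database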