How do you perform logging in software that represents a large mathematical model? If the model isn't behaving as expected, is it the code or the data that's at fault?
March 2018
Logging is crucial in any large system, but numerical programs add some unique challenges. You can't simply write a large numeric array to a text file: neither a human nor a computer can easily read that.
At Man AHL, we've solved this with a special diagnostic log format, affectionately known as a 'diag'.
Logging Values
Suppose we have some code of the form:
import numpy as np

def complex_model(val):
    return val + 1.0

def apply_complex_model(arr):
    return np.apply_along_axis(complex_model, 0, arr)
If we're getting unexpected values from apply_complex_model, does that indicate an issue with complex_model or the value we passed in for arr?
It's tempting to print arr to the log, but that doesn't scale for large values. We also want to plot bad values, so we can eyeball the data.
We modify the code to log the value itself:
import numpy as np
import ahl.diags as diags

def apply_complex_model(arr):
    # In practice we provide decorators for common use cases
    # like logging inputs and outputs.
    with diags.prefix('complex_model'):
        diags.log("input", arr)
        output = np.apply_along_axis(complex_model, 0, arr)
        diags.log("output", output)
    return output
This has many advantages:
Interactivity: We can load up the diag in ipython and examine it.
Visibility: We can visualise the actual data that was used when the program ran. Inputs are typically timeseries, which lend themselves to plotting.
Reproducibility: Since we have the interesting inputs, we can re-run our apply_complex_model function with these inputs. If we're bugfixing, we can run our new implementation against the same inputs.
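The comment in the snippet above mentions decorators for logging inputs and outputs. As a rough illustration of that idea, and of the reproducibility point, here is a minimal sketch. It does not use ahl.diags: DIAG_STORE and log_inputs_and_outputs are hypothetical names, and a plain in-memory dict stands in for the diag.

```python
import functools

import numpy as np

# Hypothetical in-memory stand-in for a diag: {function name: {name: value}}.
DIAG_STORE = {}


def log_inputs_and_outputs(func):
    """Record a function's positional inputs and its output under its name."""
    @functools.wraps(func)
    def wrapper(*args):
        # Copy before the call so in-place changes cannot alter the record.
        inputs = [np.copy(a) for a in args]
        output = func(*args)
        DIAG_STORE[func.__name__] = {"inputs": inputs, "output": np.copy(output)}
        return output
    return wrapper


def complex_model(val):
    return val + 1.0


@log_inputs_and_outputs
def apply_complex_model(arr):
    return np.apply_along_axis(complex_model, 0, arr)


apply_complex_model(np.arange(5))

# Reproduce the run: feed the recorded input to an implementation
# (here the same one) and compare with the recorded output.
recorded = DIAG_STORE["apply_complex_model"]
replayed = np.apply_along_axis(complex_model, 0, recorded["inputs"][0])
matches = np.array_equal(replayed, recorded["output"])
```

When bugfixing, the replay step would call the new implementation instead, so any difference from the recorded output is attributable to the code rather than the data.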
Storage
Large mathematical models often have large inputs and outputs. Loading the entire diag into memory would be slow and resource intensive.
We store diags as efficiently serialised data in HDF5 files. HDF5 allows us to only load the values from the diag that we're interested in, without reading the whole file. This keeps loading snappy.
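To make the partial-loading point concrete, here is a small sketch using the open-source h5py library. The file layout (a 'complex_model' group holding 'input' and 'output' datasets) is an assumption made for illustration, not a description of the real diag format.

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "example_diag.h5")

# Write two large values under a 'complex_model' group (assumed layout).
with h5py.File(path, "w") as f:
    group = f.create_group("complex_model")
    values = np.arange(1_000_000, dtype=np.float64)
    group.create_dataset("input", data=values)
    group.create_dataset("output", data=values + 1.0)

# Looking up a dataset gives a handle, not the data. Indexing the handle
# reads only the requested slice into memory.
with h5py.File(path, "r") as f:
    dataset = f["complex_model/input"]
    total_length = dataset.shape[0]  # available without loading the values
    first_values = dataset[:5]       # NumPy array holding five elements
```

The 'output' dataset is never touched in the read step, so its values are not loaded at all.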
Viewing Diags
A diag looks like a nested dict of dicts:
>>> from ahl.diags import import_diag
>>> my_diag = import_diag("~/example_diag.h5")
>>> my_diag['complex_model']['input'].value
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Since we often examine them in ipython, we provide a more convenient API that aids tab-completion:
>>> my_diag.complex_model.input.value
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
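One way to offer both access styles over a nested dict is a thin wrapper that implements __getattr__ and __dir__. The sketch below is an assumption about how such an API could be built, not the ahl.diags implementation. AttrView is a hypothetical name, and leaves are returned as raw values rather than objects with a .value attribute.

```python
class AttrView:
    """Expose the keys of a nested dict as attributes (illustrative only)."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, key):
        value = self._data[key]
        # Wrap nested dicts so both access styles keep working further down.
        return AttrView(value) if isinstance(value, dict) else value

    def __getattr__(self, name):
        # Only called when normal lookup fails. Private names are rejected
        # so a missing _data can never trigger infinite recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __dir__(self):
        # Interactive completers typically rely on dir() for candidates.
        return list(super().__dir__()) + list(self._data)


diag = AttrView({"complex_model": {"input": list(range(10))}})
same = diag.complex_model.input == diag["complex_model"]["input"]
```

Because the dict keys appear in dir(diag), pressing tab after diag. in an interactive session can offer complex_model as a completion.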
Shared Diags
Good logs are readily available, and diags are no exception. We store our diags in the 'diags repo': a shared filesystem that's available to all our researchers and developers.
Every time a model runs, its diag is stored in this shared directory. This is incredibly powerful for debugging, and we've built reporting tools on top of the diags repo.
The directory itself is hard to navigate: with tens of thousands of files, it's difficult to find the diag you're interested in.
We monitor the directory with a 'diags indexer' tool. When new diags are available, we update a Mongo database with diag metadata. We then provide a Python object that queries this database.
This database allows us to load diags according to specific constraints:
>>> my_diag = diags.repo.by_user.jdoe.last
This is great for discoverability: we can just press tab to interactively see what data is available:
# Which strategies are running in live?
>>> diags.repo.by_platform.live.by_strategy.<TAB>
# Which markets are we trading in preprod?
>>> diags.repo.by_platform.preprod.by_market.<TAB>
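Chained queries of this shape can be layered on top of diag metadata with a small amount of attribute magic. The sketch below is only an illustration under stated assumptions: an in-memory list of records stands in for the Mongo database, and RepoQuery, the record fields, the strategy names and the file paths are all hypothetical.

```python
# Hypothetical metadata records, as an indexer might store them.
RECORDS = [
    {"user": "jdoe", "platform": "live", "strategy": "trend",
     "created": 1, "path": "/diags/1.h5"},
    {"user": "asmith", "platform": "live", "strategy": "carry",
     "created": 2, "path": "/diags/2.h5"},
    {"user": "jdoe", "platform": "preprod", "strategy": "trend",
     "created": 3, "path": "/diags/3.h5"},
]


class RepoQuery:
    """Chainable filter over diag metadata records (illustrative only)."""

    def __init__(self, records, field=None):
        self._records = records
        self._field = field  # set while waiting for a value, e.g. after by_user

    @property
    def last(self):
        # Path of the most recently created diag among the current matches.
        return max(self._records, key=lambda r: r["created"])["path"]

    def _choices(self):
        if self._field is None:
            keys = {k for r in self._records for k in r}
            return sorted("by_" + k for k in keys - {"created", "path"})
        return sorted({r[self._field] for r in self._records if self._field in r})

    def __getattr__(self, name):
        # Guard private names so lookups during construction cannot recurse.
        if name.startswith("_") or name not in self._choices():
            raise AttributeError(name)
        if self._field is None:
            return RepoQuery(self._records, field=name[len("by_"):])
        matches = [r for r in self._records if r.get(self._field) == name]
        return RepoQuery(matches)

    def __dir__(self):
        # Valid next steps, which is what tab-completion would display.
        return self._choices() + ["last"]


repo = RepoQuery(RECORDS)
latest_for_jdoe = repo.by_user.jdoe.last
live_strategies = [n for n in dir(repo.by_platform.live.by_strategy) if n != "last"]
```

Each by_<field> step narrows nothing by itself; it only selects which field the next attribute filters on, so the set of valid completions always reflects the records that still match.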
Closing Thoughts
The diag has all the advantages of a log, but it's structured and easy to build upon. More importantly, it contains real Python objects, so it's easy to examine. It's become a ubiquitous part of our tooling.