===============================================================================
DiLES - A disclosure learning evaluation system
===============================================================================

DiLES is an evaluation system for learning methods that are meant to
conceptualize information disclosure preferences in social interaction within
smart environments.  It is a Python application which uses its own learning
algorithms as well as algorithms provided by external libraries like
`LibSVM`_, `Orange`_ and `SciPy`_. Evaluation results are rendered using
`matplotlib`_ and presented within an interactive HTML-based interface.

Software and data related to DiLES are hosted at the `Open Science Repository`__
of the Computer Science Department at Rostock University.

.. __: http://opsci.informatik.uni-rostock.de/index.php/DiLES

-------------------------------------------------------------------------------
Setup
-------------------------------------------------------------------------------

`Buildout`_ is used to set up the development and usage environment (DiLES is
not meant to be installed but to be used directly within the source tree).

To create a buildout, run::

    $ python bootstrap.py
    $ bin/buildout # this one fails (see buildout.cfg for details)
    $ bin/buildout # this one and subsequent calls should succeed

As a result, the ``bin`` directory contains some ready-to-use scripts, i.e.
scripts for which all dependencies are installed and set up properly. Each
script prints some help when run with the ``--help`` option.

-------------------------------------------------------------------------------
Usage
-------------------------------------------------------------------------------

Defining Scenarios
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Scenarios are described in `YAML`_. They consist of a list of dictionaries of
context information and disclosures, each describing a single disclosure
situation. However, the first dictionary is not handled as a situation but as a
default dictionary providing default values for information items not given in
subsequent dictionaries. The only required items in a situation dictionary are
`subject` (the one who has to decide on a disclosure), `persons` (a list of
potential information receivers), and `disclosure` (a list of information items
to disclose). Disclosure information items must be strings which optionally
encode a hierarchical structure using a dot separator. A simple example
scenario might look like this::

    # first, the defaults
    - medium: display-wall
      room: 123
      tags: []

    # situation 1
    - subject: Bob
      persons: Sue
      trigger: start-meeting
      tags: meeting
      disclosure:
        - workspace.project-x
        - workspace.project-y
        - contact

    # situation 2
    - subject: Bob
      persons: Sue, Paul
      trigger: start-meeting
      tags: meeting
      disclosure:
        - workspace.project-x.overview
        - contact.business

This scenario describes two situations in which Bob decides the disclosure of
information for collaboration purposes at the beginning of a meeting. The first
situation is a meeting of Bob and Sue only. Here Bob shares all working
documents related to projects *x* and *y*, as well as contact information.
In the second situation, this time with Paul as an additional participant, Bob
shares only an overview document of project *x* and business related contact
information.  The context items *tags* and *trigger* describe the circumstances
of information disclosures and are just examples of how situations may be
described besides the required items *subject*, *persons*, and *disclosure*.
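
The role of the default dictionary can be sketched in a few lines of Python.
This is a minimal illustration, assuming that the scenario has already been
loaded into a list of dictionaries and that defaults are applied by a plain
dictionary merge. The function name ``apply_defaults`` is hypothetical and not
part of DiLES:

```python
def apply_defaults(scenario):
    """Return the situations of a scenario with default values filled in.

    Assumption: the first dictionary holds the defaults and every further
    dictionary describes one situation (the function name is hypothetical).
    """
    defaults, situations = scenario[0], scenario[1:]
    merged = []
    for situation in situations:
        complete = dict(defaults)   # start with a copy of the defaults
        complete.update(situation)  # explicitly given items win
        merged.append(complete)
    return merged


scenario = [
    {'medium': 'display-wall', 'room': 123, 'tags': []},
    {'subject': 'Bob', 'persons': 'Sue', 'tags': 'meeting',
     'disclosure': ['workspace.project-x', 'contact']},
]
situations = apply_defaults(scenario)
```

Here the single situation inherits *medium* and *room* from the defaults while
keeping its own value for *tags*.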

If a scenario includes situations where subjects may disclose information
using one of multiple possible modalities, the chosen modality may be encoded
within the disclosure as an item's prefix, separated by a colon::

    - subject: Bob
      persons: Sue
      medium: wall, mobile
      disclosure:
        - wall:workspace.project-x.overview

Scenario preprocessors handle such colon-separated modality information in
different ways.
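
To make the item syntax concrete, the following sketch splits a disclosure item
into its optional modality prefix and its hierarchy levels. It is not taken
from the DiLES preprocessors; the function name ``split_item`` is hypothetical,
and the sketch assumes that the first colon ends the modality prefix:

```python
def split_item(item):
    """Split a disclosure item into (modality, hierarchy levels).

    Assumption: an optional modality prefix ends at the first colon and
    hierarchy levels are separated by dots (the function name is
    hypothetical).
    """
    modality, sep, rest = item.partition(':')
    if not sep:  # no colon found: the item carries no modality
        modality, rest = None, item
    return modality, rest.split('.')


with_modality = split_item('wall:workspace.project-x.overview')
without_modality = split_item('contact.business')
```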

Using the Scripts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once a buildout of DiLES has been created, several scripts are available in the
``bin`` directory. Each script provides usage instructions when run with the
``--help`` option. Following is a short overview of the scripts.


``diles-evaluate-scenario``:
    Evaluates preprocessors and learners with regard to a given scenario.

``diles-join-evaluation-results``:
    Joins multiple result files generated by ``diles-evaluate-scenario``. This
    is useful when a scenario has been evaluated in multiple steps, each only
    considering certain subjects, preprocessors, and learners (which may be
    necessary for memory-usage reasons).

``diles-plot-evaluation-results``:
    Renders plots for an evaluation results file as produced by the script
    ``diles-evaluate-scenario``.

``diles-rank-evaluation-results``:
    Generates various *global* rankings of evaluation results, e.g. which
    validators performed best in general (i.e. not in context of a specific
    subject, preprocessor or learning method).

``diles-render-plot-summary``:
    Renders an HTML-based interactive interface to the plots rendered by the
    script ``diles-plot-evaluation-results``.

``diles-analyze-scenario``:
    Analyzes a scenario and renders corresponding plots. These plots are
    recognized by ``diles-render-plot-summary``. These scenario analysis plots
    help in correlating evaluation results with disclosure patterns.

``diles-convert-dihabs-results-to-scenario``:
    Converts the results of a DiHabs survey to a DiLES scenario file.

``tests``:
    Runs all or selected tests (written as `doctests`_).

``python``:
    A Python interpreter with access to the ``diles`` package (for interactive
    testing/debugging).

.. _Buildout: http://www.buildout.org/
.. _LibSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
.. _Orange: http://orange.biolab.si/
.. _SciPy: http://www.scipy.org/
.. _YAML: http://yaml.org/
.. _matplotlib: http://matplotlib.sourceforge.net/
.. _doctests: http://docs.python.org/library/doctest.html

-------------------------------------------------------------------------------
Packages
-------------------------------------------------------------------------------

The package ``diles`` consists of several sub-packages and modules.
Learning-related functionality is grouped in the sub-package ``diles.learn``.
Functionality related to scenario evaluation is grouped in the sub-package
``diles.scenario``. Modules contained in the main package ``diles`` provide
generally used utilities.

diles.learn
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Learning related functionality: learning algorithms, optimization functions,
performance evaluation functions and metrics.

``diles.learn.eval``:
    Utilities for learner evaluation.

``diles.learn.learners``:
    Base learners for single-label learning problems.

``diles.learn.learners.bayes``:
    Naive Bayes learning methods.

``diles.learn.learners.cbr``:
    Case-based reasoning learning methods.

``diles.learn.learners.rules``:
    Rule learning methods.

``diles.learn.learners.svm``:
    Support vector machine learning methods.

``diles.learn.util``:
    Miscellaneous utilities for learning-related functionality.

``diles.learn.validator``:
    Disclosure prediction validation methods.

``diles.learn.wrappers``:
    Wrappers enhancing base-learners, e.g. by transforming multi-label learning
    problems into multiple single-label learning problems or by improving
    predictions based on social pattern analysis.

``diles.learn.wrappers.hierarchy``:
    Hierarchical multi-label wrappers (standard methods to transform
    hierarchical multi-label problems into single-label problems).

``diles.learn.wrappers.multi``:
    Multi-label wrappers (standard methods to transform multi-label problems
    into single-label problems).

``diles.learn.wrappers.social``:
    Wrappers exploiting social patterns for improved disclosure prediction.
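
As an illustration of what transforming a multi-label problem into single-label
problems means, the sketch below shows *binary relevance*, one well-known
transformation of this kind. Whether the DiLES wrappers use exactly this method
is not stated here, and the function name ``binary_relevance`` is hypothetical:

```python
def binary_relevance(labelsets):
    """Turn one multi-label problem into one binary problem per label.

    ``labelsets`` holds one list of labels per sample. The result maps each
    label to a list of booleans (one per sample) telling whether the label
    applies to that sample.
    """
    all_labels = sorted(set(l for ls in labelsets for l in ls))
    return dict(
        (label, [label in ls for ls in labelsets]) for label in all_labels
    )


labelsets = [['contact', 'workspace'], ['contact'], []]
problems = binary_relevance(labelsets)
```

Each entry of ``problems`` can then be handed to a separate single-label
learner.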

diles.scenario
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Scenario evaluation related functionality: scenario preprocessing,
conversion, and evaluation as well as evaluation results analysis and
illustration.

``diles.scenario.analyze``:
    Scenario analysis and plotting (implements the script
    ``diles-analyze-scenario``).

``diles.scenario.dihabs``:
    Converter for DiHabs survey results (implements the script
    ``diles-convert-dihabs-results-to-scenario``).

``diles.scenario.html``:
    Summarize scenario evaluation results (implements the script
    ``diles-render-plot-summary``).

``diles.scenario.plot``:
    Plot scenario evaluation results (implements the script
    ``diles-plot-evaluation-results``).

``diles.scenario.preprocessors``:
    Scenario preprocessors and their combinations.

``diles.scenario.rank``:
    Plot global rankings of learning methods and validators across several
    scenario evaluation results (implements the script
    ``diles-rank-evaluation-results``).

``diles.scenario.util``:
    Miscellaneous utilities for scenario evaluation.

diles.stats
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Simple statistical functions (less powerful but faster than their *numpy*
equivalents).

diles.tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test runner (implements the script ``tests`` and provides frequently used
test data).

diles.util
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Miscellaneous utilities.

-------------------------------------------------------------------------------
Extensions
-------------------------------------------------------------------------------

Adding new scenario preprocessors, base learners, wrapping learners, or
validators is easy, as shown in the following sub-sections.

In addition to the instructions below, it pays off to inspect the source code
directly, which contains comprehensive and detailed documentation. The
`doctests`_ are especially helpful in understanding how the different units of
DiLES work and interact.

Adding New Base Learners
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add a new base learner, create a new module in the package
``diles.learn.learners`` which contains a class deriving from
``diles.learn.Learner``. To be recognized automatically, the name of this class
must end with `Learner`. The class must define a static attribute named
``paramspace`` which holds a dictionary mapping the names of the learner's
constructor parameters to lists of possible values. This `paramspace` is used
for finding optimal learner configurations. The constructor must accept these
parameters as keywords.

The following example provides a simple guessing learner::

    import diles.learn

    class GuessLearner(diles.learn.Learner):

        paramspace = {'foo': [1, 2, 3, 4]}

        def __init__(self, foo=1):
            self.foo = foo # unused, just for the paramspace example

        def train(self, samples, labels, fwl=None, fwt=0):
            label = labels[0]
            confidence = 0.2
            ranking = None
            self.prediction = label, confidence, ranking

        def predict(self, sample):
            # we always predict the same
            return self.prediction

As seen in this example, the prediction method of a learner is expected to
return a 3-tuple: the predicted label, a confidence value between 0 and 1 and
optionally a ranking of all known labels (i.e. a list of rank-value/label
pairs).  This ranking is mainly used for debugging purposes and may also be
`None` as in this example.

For more sophisticated examples inspect the package ``diles.learn.learners``.
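
How a ``paramspace`` translates into candidate configurations can be sketched
with ``itertools.product``. This only illustrates an exhaustive enumeration of
the parameter space; how DiLES actually searches for optimal configurations is
not described here, and the helper name ``expand_paramspace`` is hypothetical:

```python
import itertools


def expand_paramspace(paramspace):
    """Yield one keyword dictionary per combination of parameter values.

    Hypothetical helper: it enumerates the full grid spanned by the lists
    of possible values in ``paramspace``.
    """
    names = sorted(paramspace)
    for values in itertools.product(*(paramspace[n] for n in names)):
        yield dict(zip(names, values))


configs = list(expand_paramspace({'foo': [1, 2], 'bar': ['a', 'b']}))
```

Each resulting dictionary can be passed to a learner's constructor as keyword
arguments, e.g. ``SomeLearner(**config)``.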

Adding New Wrapping Learners
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add a new wrapping learner, create a new module in the package
``diles.learn.wrappers`` which contains a class deriving from
``diles.learn.Learner``. To be recognized automatically as a wrapper, the name
of this class must end with `Wrapper`. Similar to base learners, the class must
define a static attribute named ``paramspace`` which holds a dictionary mapping
the wrapper's constructor parameter names to lists of possible values.

In contrast to base learners, wrapping learner constructors must accept the
class of a base learner as their first argument. Additionally, next to the
parameters used by the wrapper itself, the constructor must accept arbitrary
parameters for the base learner. The following example illustrates these
requirements::

    class DummyWrapper(diles.learn.Learner):

        paramspace = {'bar': ['a', 'b']}

        def __init__(self, basecls, bar='a', **baseparams):
            super(DummyWrapper, self).__init__()
            self.bar = bar # unused, just for illustration purposes
            self.model = basecls(**baseparams)

        def train(self, samples, labels, fwl=None, fwt=0):
            self.model.train(samples, labels, fwl=fwl, fwt=fwt)

        def predict(self, sample):
            return self.model.predict(sample)

Of course, this is a rather useless wrapper which directly forwards any call to
an instance of its base learner class. A dummy wrapper for the above-mentioned
guessing learner could be instantiated like this::

    dw = DummyWrapper(GuessLearner, bar='b', foo=3)

Adding New Validators
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Validators are methods of the class ``Validator`` in the module
``diles.learn.validator``, following a certain naming pattern. In particular,
any method whose name starts with ``_vld_`` is considered a validator
method.  A validator must return `True` to support a prediction or `False` to
prevent it. The existing validator methods in the referenced module serve
as examples for implementing additional validators.
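
The naming pattern can be illustrated with a small, self-contained sketch. The
``Validator`` class below is only a stand-in for the real one, and the
signature of the validator methods (a single prediction tuple) as well as the
helper ``find_validators`` are assumptions made for this example:

```python
class Validator(object):
    """Hypothetical stand-in; the real class lives in diles.learn.validator."""

    def _vld_always(self, prediction):
        return True  # supports every prediction

    def _vld_confident(self, prediction):
        label, confidence, ranking = prediction
        return confidence >= 0.5  # prevents low-confidence predictions

    def helper(self):
        return None  # not a validator: the name lacks the _vld_ prefix


def find_validators(obj, prefix='_vld_'):
    """Map validator names (prefix stripped) to bound validator methods."""
    return dict(
        (name[len(prefix):], getattr(obj, name))
        for name in dir(obj)
        if name.startswith(prefix) and callable(getattr(obj, name))
    )


validators = find_validators(Validator())
verdicts = dict(
    (name, vld(('contact', 0.2, None))) for name, vld in validators.items()
)
```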

Adding New Preprocessors and Combinations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Preprocessors are defined in the module ``diles.scenario.preprocessors``. A
preprocessor function expects a list of situations from a scenario and must
return a new list of situations. Preprocessors may be chained. Adding a new
preprocessor means adding a new function to the referenced module and listing
this function in at least one chain given by the module's ``chains`` attribute.
Again, inspect the referenced module's source for explanatory examples.
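
The contract of a preprocessor (a list of situations in, a new list of
situations out) and the idea of chaining can be sketched as follows. Both
preprocessor functions, the helper ``run_chain``, and the layout of ``chains``
as a dictionary of function tuples are assumptions for illustration; the real
module may organize its chains differently:

```python
def drop_untagged(situations):
    """Hypothetical preprocessor: keep only situations that carry tags."""
    return [s for s in situations if s.get('tags')]


def strip_modality(situations):
    """Hypothetical preprocessor: remove modality prefixes from items."""
    result = []
    for s in situations:
        s = dict(s)  # copy, so the input situations stay untouched
        s['disclosure'] = [i.split(':', 1)[-1] for i in s['disclosure']]
        result.append(s)
    return result


def run_chain(chain, situations):
    """Apply the preprocessors of a chain one after another."""
    for preprocessor in chain:
        situations = preprocessor(situations)
    return situations


# assumed layout: chain names mapped to sequences of preprocessor functions
chains = {'tagged-plain': (drop_untagged, strip_modality)}

situations = [
    {'subject': 'Bob', 'tags': 'meeting', 'disclosure': ['wall:contact']},
    {'subject': 'Bob', 'tags': [], 'disclosure': ['contact.business']},
]
processed = run_chain(chains['tagged-plain'], situations)
```

Because every preprocessor returns a new list, the original situations remain
available for other chains.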