===============================================================================
DiLES - A disclosure learning evaluation system
===============================================================================

DiLES is an evaluation system for learning methods that are meant to
conceptualize information disclosure preferences in social interaction within
smart environments. It is a Python application which uses its own learning
algorithms as well as algorithms provided by external libraries such as
`LibSVM`_, `Orange`_ and `SciPy`_. Evaluation results are rendered using
`matplotlib`_ and presented within an interactive HTML-based interface.

Software and data related to DiLES are hosted at the
`Open Science Repository`__ of the Computer Science Department at Rostock
University.

.. __: http://opsci.informatik.uni-rostock.de/index.php/DiLES

-------------------------------------------------------------------------------
Setup
-------------------------------------------------------------------------------

`Buildout`_ is used to set up the development and usage environment (DiLES is
not meant to be installed but used directly within the source tree). To create
a buildout, run::

    $ python bootstrap.py
    $ bin/buildout  # this one fails (see buildout.cfg for details)
    $ bin/buildout  # this one and subsequent calls should succeed

As a result, there are some ready-to-use scripts in the ``bin`` directory.
Ready to use means that all dependencies are installed and used properly. These
scripts provide some help when run with the ``--help`` option.

-------------------------------------------------------------------------------
Usage
-------------------------------------------------------------------------------

Defining Scenarios
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Scenarios are described in `YAML`_. They consist of a list of dictionaries of
context information and disclosures, each describing a single disclosure
situation.
However, the first dictionary is not handled as a situation but as a
default dictionary providing default values for information items not given in
subsequent dictionaries. The only required items in a situation dictionary are
`subject` (the one who has to decide on a disclosure), `persons` (a list of
potential information receivers), and `disclosure` (a list of information items
to disclose). Disclosure information items must be strings which optionally
encode a hierarchical structure using a dot separator. A simple example
scenario might look like this::

    # first, the defaults
    - medium: display-wall
    - room: 123
    - tags: []

    # situation 1
    - subject: Bob
    - persons: Sue
    - trigger: start-meeting
    - tags: meeting
    - disclosure:
        - workspace.project-x
        - workspace.project-y
        - contact

    # situation 2
    - subject: Bob
    - persons: Sue, Paul
    - trigger: start-meeting
    - tags: meeting
    - disclosure:
        - workspace.project-x.overview
        - contact.business

This scenario describes two situations in which Bob decides on the disclosure
of information for collaboration purposes at the beginning of a meeting. The
first situation is a meeting of Bob and Sue only. Here Bob shares all working
documents related to projects *x* and *y*, as well as contact information. In
the second situation, this time with Paul as an additional participant, Bob
shares only an overview document of project *x* and business-related contact
information. The context items *tags* and *trigger* describe the circumstances
of information disclosures and are just examples of how situations may be
described besides the required items *subject*, *persons*, and *disclosure*.
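
The role of the leading default dictionary can be illustrated with a minimal
sketch. It assumes the scenario has already been parsed into a Python list of
dictionaries; the helper ``apply_defaults`` and the sample data are hypothetical
and do not reflect how DiLES actually loads scenarios::

    def apply_defaults(entries):
        """Merge the leading defaults dictionary into each situation."""
        if not entries:
            return []
        defaults, situations = entries[0], entries[1:]
        merged = []
        for situation in situations:
            combined = dict(defaults)   # copy, so the defaults stay untouched
            combined.update(situation)  # explicitly given items win
            merged.append(combined)
        return merged

    # Hypothetical, already parsed scenario data (assumption for illustration).
    entries = [
        {'medium': 'display-wall', 'room': 123, 'tags': []},
        {'subject': 'Bob', 'persons': ['Sue'], 'tags': ['meeting'],
         'disclosure': ['workspace.project-x', 'contact']},
    ]
    situations = apply_defaults(entries)

In this sketch the single resulting situation inherits ``medium`` and ``room``
from the defaults, while its own ``tags`` value replaces the default one.
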

If a scenario includes situations where subjects may disclose information
using one of multiple possible modalities, the chosen modality may be encoded
within the disclosure as an item's prefix, separated by a colon::

    - subject: Bob
    - persons: Sue
    - medium: wall, mobile
    - disclosure:
        - wall:workspace.project-x.overview

Scenario preprocessors handle such colon-separated modality information in
different ways.

Using the Scripts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Once a buildout of DiLES has been created, several scripts are available in
the ``bin`` directory. Each script provides usage instructions when run with
the ``--help`` option. The following is a short overview of the scripts.

``diles-evaluate-scenario``:
    Evaluates preprocessors and learners with regard to a given scenario.

``diles-join-evaluation-results``:
    Joins multiple result files generated by ``diles-evaluate-scenario``. This
    is useful when a scenario has been evaluated in multiple steps, each only
    considering certain subjects, preprocessors, and learners (which may be
    necessary for memory-usage reasons).

``diles-plot-evaluation-results``:
    Renders plots for an evaluation results file as produced by the script
    ``diles-evaluate-scenario``.

``diles-rank-evaluation-results``:
    Generates various *global* rankings of evaluation results, e.g. which
    validators performed best in general (i.e. not in the context of a
    specific subject, preprocessor or learning method).

``diles-render-plot-summary``:
    Renders an HTML-based interactive interface to the plots rendered by the
    script ``diles-plot-evaluation-results``.

``diles-analyze-scenario``:
    Analyzes a scenario and renders corresponding plots. These plots are
    recognized by ``diles-render-plot-summary``. The scenario analysis plots
    help in correlating evaluation results with disclosure patterns.

``diles-convert-dihabs-results-to-scenario``:
    Converts the results of a DiHabs survey to a DiLES scenario file.

``tests``:
    Runs all or selected tests (written as `doctests`_).

``python``:
    A Python interpreter with access to the ``diles`` package (for interactive
    testing/debugging).

.. _Buildout: http://www.buildout.org/
.. _LibSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
.. _Orange: http://orange.biolab.si/
.. _SciPy: http://www.scipy.org/
.. _YAML: http://yaml.org/
.. _matplotlib: http://matplotlib.sourceforge.net/
.. _doctests: http://docs.python.org/library/doctest.html

-------------------------------------------------------------------------------
Packages
-------------------------------------------------------------------------------

The package ``diles`` consists of several sub-packages and modules.
Learning-related functionality is grouped in the sub-package ``diles.learn``.
Functionality related to scenario evaluation is grouped in the sub-package
``diles.scenario``. Modules contained in the main package ``diles`` provide
general-purpose utilities.

diles.learn
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Learning-related functionality: learning algorithms, optimization functions,
performance evaluation functions and metrics.

``diles.learn.eval``:
    Utilities for learner evaluation.

``diles.learn.learners``:
    Base learners for single-label learning problems.

``diles.learn.learners.bayes``:
    Naive Bayes learning methods.

``diles.learn.learners.cbr``:
    Case-based reasoning learning methods.

``diles.learn.learners.rules``:
    Rule learning methods.

``diles.learn.learners.svm``:
    Support vector machine learning methods.

``diles.learn.util``:
    Miscellaneous utilities for learning-related functionality.

``diles.learn.validator``:
    Disclosure prediction validation methods.

``diles.learn.wrappers``:
    Wrappers enhancing base learners, e.g. by transforming multi-label learning
    problems into multiple single-label learning problems or by improving
    predictions based on social pattern analysis.

``diles.learn.wrappers.hierarchy``:
    Hierarchical multi-label wrappers (standard methods to transform
    hierarchical multi-label problems into single-label problems).

``diles.learn.wrappers.multi``:
    Multi-label wrappers (standard methods to transform multi-label problems
    into single-label problems).

``diles.learn.wrappers.social``:
    Wrappers exploiting social patterns for improved disclosure prediction.

diles.scenario
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Functionality related to scenario evaluation: scenario preprocessing,
conversion, and evaluation as well as analysis and illustration of evaluation
results.

``diles.scenario.analyze``:
    Scenario analysis and plotting (implements the script
    ``diles-analyze-scenario``).

``diles.scenario.dihabs``:
    Converter for DiHabs survey results (implements the script
    ``diles-convert-dihabs-results-to-scenario``).

``diles.scenario.html``:
    Summarizes scenario evaluation results (implements the script
    ``diles-render-plot-summary``).

``diles.scenario.plot``:
    Plots scenario evaluation results (implements the script
    ``diles-plot-evaluation-results``).

``diles.scenario.preprocessors``:
    Scenario preprocessors and their combinations.

``diles.scenario.rank``:
    Plots global rankings of learning methods and validators across several
    scenario evaluation results (implements the script
    ``diles-rank-evaluation-results``).

``diles.scenario.util``:
    Miscellaneous utilities for scenario evaluation.

diles.stats
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Simple statistical functions (less powerful but faster than their *numpy*
equivalents).

diles.tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test runner (implements the script ``tests`` and provides frequently used test
data).

diles.util
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Miscellaneous utilities.

-------------------------------------------------------------------------------
Extensions
-------------------------------------------------------------------------------

Adding new scenario preprocessors, base learners, wrapping learners, or
validators is easy, as shown in the following sub-sections. In addition to the
instructions below, it pays off to inspect the source code directly, which
contains comprehensive and detailed documentation. The `doctests`_ are
especially helpful in understanding how the different units of DiLES work and
interact.

Adding New Base Learners
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add a new base learner, create a new module in the package
``diles.learn.learners`` which contains a class deriving from
``diles.learn.Learner``. To be recognized automatically, the class's name must
end with ``Learner``. The class must define a static attribute named
``paramspace`` which holds a dictionary mapping the learner's constructor
parameters to lists of possible values. This ``paramspace`` is used for finding
optimal learner configurations. The constructor must accept these parameters
as keywords. The following example provides a simple guessing learner::

    import diles.learn

    class GuessLearner(diles.learn.Learner):

        paramspace = {'foo': [1, 2, 3, 4]}

        def __init__(self, foo=1):
            self.foo = foo  # unused, just for the paramspace example

        def train(self, samples, labels, fwl=None, fwt=0):
            # "learn" by remembering the first label with a fixed confidence
            label = labels[0]
            confidence = 0.2
            ranking = None
            self.prediction = label, confidence, ranking

        def predict(self, sample):
            # we always predict the same
            return self.prediction

As seen in this example, the prediction method of a learner is expected to
return a 3-tuple: the predicted label, a confidence value between 0 and 1, and
optionally a ranking of all known labels (i.e. a list of rank-value/label
pairs). This ranking is mainly used for debugging purposes and may also be
``None``, as in this example.
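
How a ``paramspace`` dictionary spans a set of candidate configurations can be
sketched as follows. The helper ``iter_configurations`` is hypothetical and not
part of DiLES, and exhaustive enumeration is only an assumption about one
possible search strategy; the actual optimization code may work differently::

    import itertools

    def iter_configurations(paramspace):
        """Yield one keyword dictionary per combination of parameter values."""
        names = sorted(paramspace)
        value_lists = [paramspace[name] for name in names]
        for values in itertools.product(*value_lists):
            yield dict(zip(names, values))

    # The paramspace of the guessing learner from the example above.
    paramspace = {'foo': [1, 2, 3, 4]}
    configs = list(iter_configurations(paramspace))

Each yielded dictionary can be passed to a learner constructor as keyword
arguments, which is why the constructor must accept the ``paramspace`` keys as
keywords.
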

For more sophisticated examples, inspect the package ``diles.learn.learners``.

Adding New Wrapping Learners
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add a new wrapping learner, create a new module in the package
``diles.learn.wrappers`` which contains a class deriving from
``diles.learn.Learner``. To be recognized automatically as a wrapper, the
class's name must end with ``Wrapper``. Similar to base learners, the class
must define a static attribute named ``paramspace`` which holds a dictionary
mapping the wrapper's constructor parameter names to lists of possible values.
In contrast to base learners, wrapping learner constructors must accept the
class of a base learner as their first argument. Additionally, besides the
parameters used by the wrapper itself, the constructor must accept arbitrary
parameters for the base learner. The following example illustrates these
requirements::

    import diles.learn

    class DummyWrapper(diles.learn.Learner):

        paramspace = {'bar': ['a', 'b']}

        def __init__(self, basecls, bar='a', **baseparams):
            super(DummyWrapper, self).__init__()
            self.bar = bar  # unused, just for illustration purposes
            # remaining keyword arguments configure the wrapped base learner
            self.model = basecls(**baseparams)

        def train(self, samples, labels, fwl=None, fwt=0):
            self.model.train(samples, labels, fwl=fwl, fwt=fwt)

        def predict(self, sample):
            return self.model.predict(sample)

Of course, this is a rather useless wrapper which directly forwards any call
to an instance of its base learner class. A dummy wrapper for the guessing
learner mentioned above could be instantiated like this::

    dw = DummyWrapper(GuessLearner, bar='b', foo=3)

Adding New Validators
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Validators are methods of the class ``Validator`` in the module
``diles.learn.validator`` which follow a certain naming pattern. In
particular, any method whose name starts with ``_vld_`` is considered a
validator method.
A validator must return ``True`` to support a prediction or ``False`` to
prevent it. The existing validator methods in the referenced module serve as
examples for implementing additional validators.

Adding New Preprocessors and Combinations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Preprocessors are defined in the module ``diles.scenario.preprocessors``. A
preprocessor function expects a list of situations from a scenario and must
return a new list of situations. Preprocessors may be chained. Adding a new
preprocessor means adding a new function to the referenced module and listing
this function in at least one chain given by the module's ``chains``
attribute. Again, inspect the referenced module's source for explanatory
examples.
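
The preprocessor contract (a list of situations in, a new list of situations
out) and the idea of chaining can be sketched as follows. All names here
(``drop_modality``, ``top_level_only``, ``run_chain``) are hypothetical, and
representing a chain as a plain list of functions is an assumption; the actual
layout of the ``chains`` attribute is defined in the module's source::

    def drop_modality(situations):
        """Hypothetical preprocessor: strip 'modality:' prefixes from items."""
        result = []
        for situation in situations:
            updated = dict(situation)  # never modify the input situations
            updated['disclosure'] = [item.split(':', 1)[-1]
                                     for item in situation.get('disclosure', [])]
            result.append(updated)
        return result

    def top_level_only(situations):
        """Hypothetical preprocessor: keep only the top hierarchy level."""
        result = []
        for situation in situations:
            updated = dict(situation)
            updated['disclosure'] = [item.split('.', 1)[0]
                                     for item in situation.get('disclosure', [])]
            result.append(updated)
        return result

    def run_chain(chain, situations):
        """Apply each preprocessor of a chain to the output of the previous."""
        for preprocessor in chain:
            situations = preprocessor(situations)
        return situations

    chain = [drop_modality, top_level_only]  # assumed chain representation
    situations = [{'subject': 'Bob',
                   'disclosure': ['wall:workspace.project-x.overview']}]
    processed = run_chain(chain, situations)

Because every function returns a new list and leaves its input untouched, the
same scenario can be passed through several different chains.
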