The Plugin-Oriented Pipeline for Python (POPPy) framework provides a standard way to develop, install and run workflows. It is designed primarily for data processing pipelines that produce files.
POPPy supports the basic features that are usually needed when building and executing pipelines, such as:
POPPy was first written as part of the RPW Operations Centre (ROC) project. The ROC is the main entity in charge of the ground segment of the Radio and Plasma Waves (RPW) instrument on board the European Solar Orbiter space probe. Visit http://rpw.lesia.obspm.fr/ for more details about Solar Orbiter, RPW and the ROC.
The POPPy framework is released under the TBD license.
The POPPy developer teams can be contacted via roc.support@sympa.obspm.fr.
This section details how to install POPPy and use it to develop a pipeline.
POPPy has been tested on the Debian Linux operating system.
Make sure that the following software is installed on your system before deploying and using POPPy:
Additionally, a relational database management system (RDBMS) is required to run a POPPy pipeline.
POPPy must be installed before you can develop a pipeline.
It is strongly recommended to use POPPy inside a Python virtual environment (virtualenv) in order to avoid dependency conflicts. Python's built-in venv module, available since version 3.3, provides this mechanism.
To create a virtualenv, open a terminal and enter:
$ python3 -m venv /path/to/myprojectvenv
Where /path/to/myprojectvenv is the path to the virtualenv’s directory.
Then, to activate it, enter the command:
$ source /path/to/myprojectvenv/bin/activate
For more details about Python virtual environments, please visit https://docs.python.org/3/tutorial/venv.html.
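Before installing anything, it can be useful to confirm that the virtualenv is really active. The following is a minimal sketch that uses only the standard library; the helper name in_virtualenv is illustrative and is not part of POPPy.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
```

If this reports False after running the activate command, the shell is most likely still using the system interpreter.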
To install POPPy in the virtualenv, execute the following three commands in order:
$ pip install git+https://gitlab.obspm.fr/POPPY/POPPyCore.git@develop#egg=poppy.core
$ pip install git+https://gitlab.obspm.fr/POPPY/POP.git@develop#egg=poppy.pop
$ pip install git+https://gitlab.obspm.fr/POPPY/PIPER.git@develop#egg=poppy.piper
The first command retrieves the POPPy core library from the remote Git server and sets it up. The second and third commands do the same for the mandatory POP and PIPER plugins.
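One way to check that the three installations succeeded is to ask Python which distributions it can see. This is only a sketch: it assumes the installed distribution names match the "#egg=" fragments of the commands (poppy.core, poppy.pop, poppy.piper), which may not hold, and the helper missing_distributions is hypothetical. It requires Python 3.8 or later for importlib.metadata.

```python
from importlib import metadata

# Assumption: distribution names are taken from the "#egg=" fragments of the
# pip commands; the names actually registered by the packages may differ.
EXPECTED = ["poppy.core", "poppy.pop", "poppy.piper"]


def missing_distributions(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_distributions(EXPECTED)
    print("missing:", missing if missing else "none")
```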
You can generate a pipeline, together with all the boilerplate code needed for a basic pipeline that uses the framework:
$ poppy create pipeline poppy_tuto
You will get a directory called poppy_tuto/ in the current directory containing multiple files:
poppy_tuto
├── config.json
├── descriptor.json
├── lib
├── manage.py
├── requirements.txt
└── settings.py
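The layout above can be checked with a few lines of Python, for instance in a continuous integration job. This is a sketch only: the entry names come from the directory tree, while the helper missing_entries is hypothetical and not provided by POPPy.

```python
from pathlib import Path

# Entries taken from the directory tree of the generated pipeline.
EXPECTED_ENTRIES = [
    "config.json",
    "descriptor.json",
    "lib",
    "manage.py",
    "requirements.txt",
    "settings.py",
]


def missing_entries(pipeline_root):
    """Return the expected entries that are absent from pipeline_root."""
    root = Path(pipeline_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # An empty list means every expected file and directory is present.
    print(missing_entries("poppy_tuto"))
```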
config.json
: Contains the output path of the pipeline and the database credentials and address. This is the only file that should not be tracked by your version control system.
descriptor.json
: Provides metadata associated with the pipeline, the project and the databases.
settings.py
: Contains the list of active plugins, a variable pointing to the root directory and the identifier of the main database.
requirements.txt
: Contains the list of Python library dependencies.
lib/
: Contains any external libraries (in the case of the RPW pipeline, this directory contains NASA's CDF library and the Instrument Database).
manage.py
: The entry point of the pipeline.
You can then create a plugin skeleton the same way we created the pipeline:
$ poppy create plugin guide.myplugin
Your plugin name must be of the form namespace.pluginname. It is once again
a way to split the code in a meaningful way.
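The naming rule can be expressed as a small check. This sketch assumes that a valid name consists of exactly two dot-separated parts, each a valid Python identifier; POPPy itself may apply different or additional rules, and is_valid_plugin_name is a hypothetical helper.

```python
def is_valid_plugin_name(name: str) -> bool:
    """Check that name looks like 'namespace.pluginname'."""
    parts = name.split(".")
    # Assumption: exactly two parts, each a valid Python identifier.
    return len(parts) == 2 and all(part.isidentifier() for part in parts)


if __name__ == "__main__":
    for candidate in ("guide.myplugin", "myplugin", "guide."):
        print(candidate, is_valid_plugin_name(candidate))
```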
To help you sort your code, create a directory called plugins/ in the root directory of your pipeline. However, your plugins can be located wherever you want.
You will see that it has once again created a set of files prefilled with common boilerplate code.
In order to use the namespace feature, the Python code of your plugin must be located in the directory plugin/namespace/plugin/ (see PEP 420 for more information on namespaces).
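To illustrate what PEP 420 namespaces make possible, the sketch below builds two independent plugin roots in a temporary directory and imports both under one shared namespace. The names demo_ns, alpha and beta are invented for the demonstration and have nothing to do with POPPy; the key point is that the namespace directory itself contains no __init__.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_plugin(root: Path, namespace: str, plugin: str) -> None:
    """Create root/namespace/plugin/__init__.py, with no __init__.py in namespace/."""
    package_dir = root / namespace / plugin
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text(f"NAME = '{namespace}.{plugin}'\n")


with tempfile.TemporaryDirectory() as tmp:
    # Two separate plugin roots sharing the hypothetical namespace "demo_ns".
    first_root = Path(tmp) / "first"
    second_root = Path(tmp) / "second"
    make_plugin(first_root, "demo_ns", "alpha")
    make_plugin(second_root, "demo_ns", "beta")

    sys.path[:0] = [str(first_root), str(second_root)]
    try:
        alpha = importlib.import_module("demo_ns.alpha")
        beta = importlib.import_module("demo_ns.beta")
        loaded = (alpha.NAME, beta.NAME)
    finally:
        # Restore sys.path so the demonstration leaves no trace.
        sys.path.remove(str(first_root))
        sys.path.remove(str(second_root))

print(loaded)  # ('demo_ns.alpha', 'demo_ns.beta')
```

Both packages import successfully even though they live in different directories, which is what lets separately installed plugins share a namespace.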
In the plugin root directory there are:
setup.py
: A standard Python file that allows you to install your Python module using pip.
system_reqs.ini
: Lists any external libraries needed by the plugin.
In the myplugin/guide/myplugin/ directory you will find multiple Python files. The POPPy framework does not require you to split your code into multiple files, but it is good practice, so POPPy assumes you would like to do so and generates multiple files.
descriptor.json
: As for the pipeline, each plugin needs a descriptor file. For plugins, it contains information about the plugin, the tasks it will perform and their targets (input and output files).
commands.py
: In this file you should register the commands you want to call from the Command Line Interface (CLI).
tasks.py
: A file containing the tasks of your plugin. Usually those tasks are simply decorated Python functions.
tests.py
: This prefilled file should encourage you to write unit, functional, end-to-end or any other kind of tests for your pipeline. The test procedure is integrated into the POPPy framework, and wrapper classes and functions exist to help you.
models
: You will put in this directory all the database models corresponding to your plugin.
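To give an idea of what "decorated Python functions" means for tasks, here is a generic, framework-independent sketch of a decorator that registers functions in a dictionary. It is not POPPy's API: the names task, TASKS and count_lines are all hypothetical, and the real decorators generated in tasks.py should be used instead.

```python
from typing import Callable, Dict

# Hypothetical registry; POPPy's real task machinery is not shown in this guide.
TASKS: Dict[str, Callable] = {}


def task(func: Callable) -> Callable:
    """Register func under its own name and return it unchanged."""
    TASKS[func.__name__] = func
    return func


@task
def count_lines(text: str) -> int:
    """Toy task: count the lines of an input string."""
    return len(text.splitlines())


if __name__ == "__main__":
    print(sorted(TASKS))
    print(TASKS["count_lines"]("a\nb\nc"))
```

The pattern keeps each task an ordinary function that can be called and tested directly, while the decorator makes it discoverable by name.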