Descriptor

Introduction

A pipeline built with the POPPy framework usually needs several plugins to generate different data. For traceability, the pipeline needs to track the versions of the plugins and of the data formats used to generate data products. This is done by registering the plugin information through a descriptor file, which provides the interface and the information the pipeline needs.

The following sections describe the descriptor interface and the process for loading plugins using the descriptor file.

Descriptor interface

There are two kinds of descriptors:

  • one for the pipeline,
  • one for each plugin.

Pipeline descriptor

The pipeline descriptor provides metadata about the pipeline, its databases and the plugins it uses. All information is stored inside the pipeline object.

Identification

The identity of the pipeline is defined with the following JSON objects:

  • identifier: the identifier of the pipeline.
  • name: a human readable name for the pipeline.
  • description: the purpose of the pipeline.

Release

The release object shall inform about the current S/W release. It shall contain the following attributes:

  • version: current version of the pipeline in the format ‘MAJOR.MINOR.REVISION’, following the ROC conventions [AD2].
  • date: date of the release of the S/W in the format ‘YYYY-MM-DD’, where ‘YYYY’, ‘MM’ and ‘DD’ are respectively the year, month and day of the release.
  • author: name of the person, team or entity responsible for the release.
  • contact: contact of the author (e.g., email)
  • institute: name of the institute that delivers the release.
  • modification: a string containing the list of S/W modifications in the current release.

In addition, the release object can provide the following optional attributes:

  • file: file name to reference a data schema, master CDF, etc.
  • reference: name of a file as reference for the documentation, used as an indication for the ROC team.
  • url: indication for an online resource.
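
As an illustration of the constraints above, the following minimal sketch checks a release object for the mandatory attributes and for the stated version and date formats. The function name check_release and the regular expressions are assumptions made for this example; they are not part of the POPPy API, and the conventions in [AD2] may impose further rules.

```python
import re
from datetime import datetime

# Mandatory attributes listed above for a release object.
REQUIRED_RELEASE_KEYS = (
    "version", "date", "author", "contact", "institute", "modification",
)
# Assumed reading of 'MAJOR.MINOR.REVISION': three dot-separated integers.
VERSION_PATTERN = re.compile(r"^\d+\.\d+\.\d+$")
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def check_release(release):
    """Return a list of problems found in a release object (empty if none)."""
    problems = [
        f"missing attribute: {key}"
        for key in REQUIRED_RELEASE_KEYS
        if key not in release
    ]
    version = release.get("version")
    if version is not None and not VERSION_PATTERN.match(str(version)):
        problems.append(f"version {version!r} is not MAJOR.MINOR.REVISION")
    date = release.get("date")
    if date is not None:
        try:
            # The pattern enforces zero padding, strptime rejects
            # impossible calendar dates such as 2018-02-30.
            if not DATE_PATTERN.match(str(date)):
                raise ValueError("bad layout")
            datetime.strptime(str(date), "%Y-%m-%d")
        except ValueError:
            problems.append(f"date {date!r} is not a valid YYYY-MM-DD date")
    return problems
```

Calling check_release on a complete release object returns an empty list; each missing attribute or malformed value adds one message. Note that the dataset release objects in the example at the end of this page use versions such as "01", so the version pattern here is only meant for S/W releases.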

Project

The project field contains information used for the auto-generation of some CDF skeleton files.

  • name: name of the project, as it is to be set in the skeleton.
  • source: the source of the data.
  • provider: the provider of the data.
  • discipline: category of the data.
  • PI: information on the Principal Investigator.
    • PI.name: the name of the PI.
    • PI.affiliation: the institution the PI is affiliated with.
  • instrument_type: the instrument used in the project.
  • mission_group: the group of the mission.

Databases

The databases object contains a list of objects, each representing a database used by the pipeline together with its metadata, for future reference. Each database follows this structure:

  • identifier: the identifier of the database.
  • name: a human-readable name for the database.
  • description: the purpose of the database.
  • release: a release object as defined for the release of the pipeline. The structure must be the same.

Calibration software

calibration_softwares contains the list of paths to the external calibration software packages whose descriptors must be loaded into the ROC database.
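
To show how the sections described above could fit together, here is a hypothetical pipeline descriptor built as a Python dictionary and serialized to JSON. All values are placeholders, and the exact nesting of the sections under the pipeline key (including PI as a nested object) is an assumption based on the descriptions above, not a copy of a real descriptor.

```python
import json

# Hypothetical release object; every value is a placeholder.
release = {
    "version": "1.0.0",
    "date": "2018-06-03",
    "author": "Example Team",
    "contact": "team@example.org",
    "institute": "Example Institute",
    "modification": "Initial release",
}

# Assumed layout: all sections are nested under the "pipeline" key.
pipeline_descriptor = {
    "pipeline": {
        "identifier": "EXAMPLE",
        "name": "Example pipeline",
        "description": "Illustrates the sections of a pipeline descriptor",
        "release": release,
        "project": {
            "name": "Example project",
            "source": "Example source",
            "provider": "Example provider",
            "discipline": "Example discipline",
            "PI": {"name": "Jane Doe", "affiliation": "Example Institute"},
            "instrument_type": "Example instrument",
            "mission_group": "Example group",
        },
        "databases": [
            {
                "identifier": "MAIN-DB",
                "name": "Main database",
                "description": "Stores the pipeline data",
                # Same structure as the pipeline release object.
                "release": dict(release, modification="Initial schema"),
            }
        ],
        "calibration_softwares": ["/path/to/calibration/software"],
    }
}

descriptor_text = json.dumps(pipeline_descriptor, indent=4)
print(descriptor_text)
```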

Plugin descriptor

The plugin descriptor is a JSON file located at namespace/plugin/descriptor.json.

A descriptor file is validated against the schema located at poppy/pop/config/plugin-descriptor-schema.json.
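
The sketch below gives a rough idea of what reading a plugin descriptor involves. It is not the framework's loader and does not perform the schema validation mentioned above: it only parses the JSON and checks for the three top-level sections that appear in the example at the end of this page. The function names parse_descriptor and load_descriptor are hypothetical.

```python
import json

# Top-level sections seen in the example plugin descriptor of this page.
EXPECTED_SECTIONS = ("identification", "release", "tasks")


def parse_descriptor(text):
    """Parse descriptor JSON text and check its top-level sections."""
    descriptor = json.loads(text)
    if not isinstance(descriptor, dict):
        raise ValueError("descriptor must be a JSON object")
    missing = [name for name in EXPECTED_SECTIONS if name not in descriptor]
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    return descriptor


def load_descriptor(path):
    """Read a descriptor.json file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_descriptor(handle.read())
```

For example, load_descriptor("namespace/plugin/descriptor.json") would return the descriptor as a dictionary, or raise ValueError if a section is missing.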

Identification

Each S/W will be identified in the pipeline by the attributes provided in the identification JSON object:

  • project: name of the project. It shall be “ROC-SGSE” for S/W used in the ROC-SGSE pipeline, and “RPW” otherwise.
  • name: a human-readable name for the software.
  • identifier: a unique name used as a reference by the ROC-SGSE pipeline to identify the S/W. It shall contain Latin alphabet uppercase letters only. For ease of use, the identifier should have the form namespace.plugin. In all cases, the developer team shall validate the identifier (ID) to avoid duplicate names.
  • description: short description of the software. This description will be saved in the ROC database.

Release

The release object shall inform about the current S/W release. It is also used to describe the output data (see section Outputs). It shall contain the following attributes:

  • version: Current version of the S/W in the format ‘MAJOR.MINOR.REVISION’, following the ROC conventions [AD2].
  • date: Date of the release of the S/W in the format ‘YYYY-MM-DD’, where ‘YYYY’, ‘MM’ and ‘DD’ are respectively the year, month and day of the release.
  • author: Name of the person, team or entity responsible for the release.
  • contact: Contact of the author (e.g., email).
  • institute: Name of the institute that delivers the release.
  • modification: a string containing the list of S/W modifications in the current release.

In addition, the release object can provide the following optional attributes:

  • file: file name to reference a data schema, master CDF, etc., as an indication for the ROC team. Only required in the release object of an output dataset description.
  • reference: name of a file as reference for the documentation, used as an indication for the ROC team.
  • url: indication for an online resource.

Tasks

The tasks object contains the list of tasks that the plugin defines. For each task, its name, its purpose and the list of input/output datasets to be read/saved shall be supplied. This allows the pipeline to check that the expected output data files are correctly saved.

Each task listed in the tasks object shall contain the following attributes and JSON objects:

  • name: the name of the task. It will be used as an internal reference by the pipeline.
  • description: a short description of the purpose of the task.
  • inputs: a JSON object containing the list of the input datasets required by the task.
  • outputs: a JSON object containing the list of the outputs returned by the task.

The purpose and the content of the inputs and outputs are detailed in the two next sections.

Inputs

The pipeline requires information in order to identify the specific input parameters of a given task. This is the aim of the inputs object, which lists the specific input parameters as JSON objects, each with the following two mandatory attributes:

  • identifier: The dataset ID associated to the input data file, as referenced in the ROC system.
  • version: The version of the input data file. This allows the ROC pipeline to ensure that S/W will process the right version of the input file.

Outputs

The pipeline requires information in order to verify that the expected output data files have been correctly produced at the end of a given task execution.

Each JSON object listed in outputs shall describe in detail the corresponding dataset, using the following attributes/objects:

  • identifier: The dataset ID associated with the output, as referenced in the ROC system. It shall be unique and comply with the naming convention listed in [AD2].
  • name: a more human-readable name for the dataset, not necessarily unique.
  • description: a short description of the dataset.
  • level: the processing level of the dataset. Allowed values are “LZ”, “L0”, “L1”, “L2”, “L2R”, “L2S”, “L3”, “L4”, “AUX”, “LL0”, “LL1”, “HK”.
  • release: information about the release of the dataset. The structure is the same as defined in the section Release. In the case of a dataset using the CDF format, the “file” attribute shall provide the name of the master CDF file used to generate the output data files.
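
The attribute list above can be turned into a simple check. The following sketch, with the hypothetical helper check_output, reports missing attributes and rejects a level outside the allowed values; it is an illustration only and does not reproduce the framework's own validation.

```python
# Allowed processing levels, as listed above.
ALLOWED_LEVELS = {
    "LZ", "L0", "L1", "L2", "L2R", "L2S", "L3", "L4",
    "AUX", "LL0", "LL1", "HK",
}
# Attributes/objects expected in each output description.
REQUIRED_OUTPUT_KEYS = ("identifier", "name", "description", "level", "release")


def check_output(key, output):
    """Return a list of problems for one entry of a task's outputs object."""
    problems = [
        f"{key}: missing attribute {attr}"
        for attr in REQUIRED_OUTPUT_KEYS
        if attr not in output
    ]
    level = output.get("level")
    if level is not None and level not in ALLOWED_LEVELS:
        problems.append(f"{key}: level {level!r} is not allowed")
    return problems
```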

In any case, the POPPy framework will perform an automated validation of the descriptor file at each new release, in order to check that the file content is consistent with the ROC database information. In particular, it will verify that the datasets declared in the tasks object are all defined in the descriptor and across S/W.
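
A reduced version of such a consistency check can be sketched as follows. It only looks inside a single descriptor, whereas the validation described above also involves the ROC database and other S/W. It further assumes that an input's version must equal the release version of the output that declares the dataset, which matches the example below but is not stated explicitly. The function name find_undeclared_inputs is hypothetical.

```python
def find_undeclared_inputs(descriptor):
    """Return (task, identifier, version) for inputs no task output declares."""
    # Collect every dataset declared as an output, keyed by ID and version.
    declared = set()
    for task in descriptor.get("tasks", []):
        for output in task.get("outputs", {}).values():
            declared.add((output["identifier"], output["release"]["version"]))

    # Report each input whose (identifier, version) pair was not declared.
    undeclared = []
    for task in descriptor.get("tasks", []):
        for dataset in task.get("inputs", {}).values():
            pair = (dataset["identifier"], dataset["version"])
            if pair not in declared:
                undeclared.append((task["name"],) + pair)
    return undeclared
```

Applied to the tuto.texter example below, this check returns an empty list, since every input refers to a dataset and version produced by an earlier task.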

Example

The following example file is taken from the tuto.texter module.

{
    "identification": {
        "project": "TUTO",
        "identifier": "tuto.texter",
        "name": "Text maker",
        "description": "Reads dictionnary and packets, reconstruct the text"
    },
    "release": {
        "version": "0.0.1",
        "date": "2018-06-03",
        "author": "Grégoire Duvauchelle",
        "contact": "gregoire.duvauchelle@protonmail.com",
        "institute": "LESIA",
        "modification": "Starting version",
        "reference": "reference_document.pdf"
    },
    "tasks": [
        {
            "name": "get_data",
            "category": "Software execution",
            "description": "",
            "inputs":{},
            "outputs": {
                "packets": {
                    "identifier": "TUTO-PKT-L0",
                    "name": "Tutorial data packets",
                    "description": "Contains the identifier of the word",
                    "level": "L0",
                    "release": {
                        "author": "Grégoire Duvauchelle",
                        "date": "2018-06-03",
                        "version": "01",
                        "contact": "gregoire.duvauchelle@protonmail.com",
                        "institute": "LESIA",
                        "modification": "Starting"
                    }
                }
            }
        },
        {
            "name": "load_dict",
            "category": "Software execution",
            "description": "Loads the dictionary in the database",
            "inputs": {},
            "outputs": {}
        },
        {
            "name": "decommute",
            "category": "Software execution",
            "description": "Replace the indexes with the words for every packet",
            "inputs": {
                "packets": {
                    "identifier": "TUTO-PKT-L0",
                    "version": "01"
                }
            },
            "outputs": {
                "words": {
                    "identifier": "TUTO-WDS-L1",
                    "name": "Tutorial words file",
                    "description": "Contains the words",
                    "level": "L1",
                    "release": {
                        "author": "Grégoire Duvauchelle",
                        "date": "2018-06-03",
                        "version": "01",
                        "contact": "gregoire.duvauchelle@protonmail.com",
                        "institute": "LESIA",
                        "modification": "Starting"
                    }
                }
            }
        },
        {
            "name": "make_text",
            "category": "Software execution",
            "description": "",
            "inputs": {
                "words": {
                    "identifier": "TUTO-WDS-L1",
                    "version": "01"
                }
            },
            "outputs": {
                "text": {
                    "identifier": "TUTO-TXT-L2",
                    "name": "Tutorial text file",
                    "description": "contains the text",
                    "level": "L2",
                    "release": {
                        "author": "Grégoire Duvauchelle",
                        "date": "2018-06-03",
                        "version": "01",
                        "contact": "gregoire.duvauchelle@protonmail.com",
                        "institute": "LESIA",
                        "modification": "Starting"
                    }
                }
            }
        }
    ]
}