A pipeline built with the POPPy framework usually needs several plugins to generate different data. For traceability, the pipeline needs to track the versions of the plugins and the data formats used to generate data products. This is done by registering the plugin information through a descriptor file, which provides the interface and the information the pipeline needs.
The following sections describe the descriptor interface and the process for loading plugins using the descriptor file.
There are two kinds of descriptors: the pipeline descriptor and the plugin descriptor.
The pipeline descriptor provides metadata associated with the pipeline, its databases and the plugins it uses. All information is inside pipeline.
The identity of the pipeline is defined with the following JSON attributes:
identifier
: the identifier of the pipeline.
name
: a human-readable name for the pipeline.
description
: the purpose of the pipeline.
The release object shall describe the current S/W release. It shall contain the following attributes:
version
: current version of the pipeline in the format ‘MAJOR.MINOR.REVISION’, following the ROC conventions [AD2].
date
: date of the release of the S/W in the format ‘YYYY-MM-DD’, where ‘YYYY’, ‘MM’ and ‘DD’ are respectively the year, month and day of the release.
author
: name of the person, team or entity responsible for the release.
contact
: contact of the author (e.g., email).
institute
: name of the institute that delivers the release.
modification
: a string containing the list of S/W modifications in the current release.
In addition, the release object can provide the following optional attributes:
file
: file name to reference a data schema, master CDF, etc.
reference
: name of a file used as reference for the documentation, as an indication for the ROC team.
url
: pointer to an online resource.
The project field contains information used for the auto-generation of some CDF skeleton files.
name
: name of the project as it is to be set in the skeleton.
source
: the source of the data.
provider
: the provider of the data.
discipline
: category of the data.
PI
: information on the Principal Investigator: PI.name for the name of the PI and PI.affiliation for the institution the PI is affiliated with.
instrument_type
: the instrument used in the project.
mission_group
: the group of the mission.
The databases object contains a list of objects representing the databases used by the pipeline, together with their metadata for future reference. Each database follows this structure:
identifier
: the identifier of the database.
name
: a human-readable name for the database.
description
: the purpose of the database.
release
: a release object as defined for the release of the pipeline. The structure must be the same.
calibration_softwares contains the list of paths to the external calibration software packages whose descriptors must be loaded into the ROC database.
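As an illustration of the pipeline descriptor layout described above, the following Python sketch builds a minimal descriptor and checks the mandatory attributes of its release object. The nesting under a top-level pipeline key, every value in the dictionary and the helper check_release are assumptions made for this example; none of them comes from POPPy itself.

```python
import re
from datetime import datetime

# Hypothetical pipeline descriptor: the layout is assumed, the values are placeholders.
PIPELINE_DESCRIPTOR = {
    "pipeline": {
        "identifier": "EXAMPLE",
        "name": "Example pipeline",
        "description": "Illustrates the descriptor layout",
        "release": {
            "version": "1.0.0",
            "date": "2018-06-03",
            "author": "Jane Doe",
            "contact": "jane.doe@example.org",
            "institute": "Example institute",
            "modification": "Initial release",
        },
        "databases": [],
        "calibration_softwares": [],
    }
}

MANDATORY_RELEASE_KEYS = (
    "version", "date", "author", "contact", "institute", "modification",
)
VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")  # MAJOR.MINOR.REVISION


def check_release(release):
    """Return a list of problems found in a release object (empty if none)."""
    problems = [
        f"missing attribute: {key}"
        for key in MANDATORY_RELEASE_KEYS
        if key not in release
    ]
    version = release.get("version")
    if version is not None and not VERSION_PATTERN.fullmatch(version):
        problems.append(f"version is not MAJOR.MINOR.REVISION: {version!r}")
    date = release.get("date")
    if date is not None:
        try:
            # Rejects strings that are not a year-month-day calendar date.
            datetime.strptime(date, "%Y-%m-%d")
        except ValueError:
            problems.append(f"date is not YYYY-MM-DD: {date!r}")
    return problems


print(check_release(PIPELINE_DESCRIPTOR["pipeline"]["release"]))  # prints []
```

The same helper can be reused for the release object of each database entry, since the structure is required to be identical.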
The plugin descriptor is a JSON file located at namespace/plugin/descriptor.json.
A descriptor file is validated against the schema located at poppy/pop/config/plugin-descriptor-schema.json.
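A minimal sketch of reading a plugin descriptor with the standard json module and checking that the top-level sections identification, release and tasks are present. This is not POPPy's own schema validation: load_descriptor and REQUIRED_SECTIONS are hypothetical names, and a temporary file stands in for a real descriptor.json.

```python
import json
import tempfile
from pathlib import Path

# Top-level sections expected in a plugin descriptor (assumed from the documentation).
REQUIRED_SECTIONS = ("identification", "release", "tasks")


def load_descriptor(path):
    """Parse a descriptor file and fail early if a top-level section is missing."""
    with open(path, encoding="utf-8") as handle:
        descriptor = json.load(handle)
    missing = [name for name in REQUIRED_SECTIONS if name not in descriptor]
    if missing:
        raise ValueError(f"{path}: missing sections {missing}")
    return descriptor


# Demonstration with a throwaway file standing in for namespace/plugin/descriptor.json.
with tempfile.TemporaryDirectory() as folder:
    demo_path = Path(folder) / "descriptor.json"
    demo_path.write_text(
        json.dumps({"identification": {}, "release": {}, "tasks": []}),
        encoding="utf-8",
    )
    demo = load_descriptor(demo_path)

print(sorted(demo))  # prints ['identification', 'release', 'tasks']
```

A full validation would compare the file against the JSON schema mentioned above; this check only catches a missing section before any further processing.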
Each S/W is identified in the pipeline by the attributes provided in the identification JSON object:
project
: name of the project. It shall be “ROC-SGSE” for S/W used in the ROC-SGSE pipeline, and “RPW” otherwise.
name
: a human-readable name for the calibration software.
identifier
: a unique name used as reference by the ROC-SGSE pipeline to identify the S/W. It shall contain Latin alphabet uppercase letters only. For ease of use, the identifier should be namespace.plugin. In all cases, the developer team shall validate the identifier (ID) to avoid duplicated names.
description
: short description of the software. This description will be saved in the ROC database.
The release object shall describe the current S/W release. It is also used to describe the output data (see section Outputs). It shall contain the following attributes:
version
: current version of the S/W in the format ‘MAJOR.MINOR.REVISION’, following the ROC conventions [AD2].
date
: date of the release of the S/W in the format ‘YYYY-MM-DD’, where ‘YYYY’, ‘MM’ and ‘DD’ are respectively the year, month and day of the release.
author
: name of the person, team or entity responsible for the release.
contact
: contact of the author (e.g., email).
institute
: name of the institute that delivers the release.
modification
: a string containing the list of S/W modifications in the current release.
In addition, the release object can provide the following optional attributes:
file
: file name to reference a data schema or master CDF, as an indication for the ROC team. Only required in the outputs object descriptions.
reference
: name of a file used as reference for the documentation, as an indication for the ROC team.
url
: pointer to an online resource.
The tasks object contains the list of tasks that the plugin defines. For each task, its name, its purpose and the list of input/output datasets to be read/saved shall be supplied. This allows the pipeline to check that the expected output data files are correctly saved.
Each task listed in the tasks object shall contain the following attributes and JSON objects:
name
: the name of the task. It will be used as internal reference by the pipeline.
description
: a short description of the purpose of the task.
inputs
: a JSON object containing the list of the input datasets required by the task.
outputs
: a JSON object containing the list of the outputs returned by the task.
The purpose and the content of the inputs and outputs objects are detailed in the next two sections.
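The task attributes listed above can be checked with a few lines of Python. This is only a sketch: check_tasks and the broken_task entry are hypothetical and serve to show what an incomplete task entry looks like.

```python
# Attributes every task entry is expected to provide.
TASK_KEYS = ("name", "description", "inputs", "outputs")


def check_tasks(tasks):
    """Return (task label, missing key) pairs for incomplete task entries."""
    problems = []
    for index, task in enumerate(tasks):
        # Fall back to the position in the list when the task has no name.
        label = task.get("name", f"<task #{index}>")
        for key in TASK_KEYS:
            if key not in task:
                problems.append((label, key))
    return problems


tasks = [
    {
        "name": "load_dict",
        "description": "Loads the dictionary in the database",
        "inputs": {},
        "outputs": {},
    },
    {"name": "broken_task", "inputs": {}},  # hypothetical incomplete entry
]

print(check_tasks(tasks))
# prints [('broken_task', 'description'), ('broken_task', 'outputs')]
```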
The pipeline requires information in order to identify the specific input parameters of a given task. This is the aim of the inputs object, which provides the list of the specific input parameters as JSON objects, with the following two mandatory attributes:
identifier
: the dataset ID associated with the input data file, as referenced in the ROC system.
version
: the version of the input data file. This allows the ROC pipeline to ensure that the S/W will process the right version of the input file.
The pipeline requires information in order to verify that the expected output data files have been correctly produced at the end of a given task execution.
Each JSON object listed in outputs shall describe in detail the corresponding dataset, using the following attributes/objects:
identifier
: the dataset ID associated with the output, as referenced in the ROC system. It shall be unique and comply with the naming convention listed in [AD2].
name
: a more human-readable name for the dataset, not necessarily unique.
description
: a short description of the dataset.
level
: the processing level of the dataset. Allowed values are “LZ”, “L0”, “L1”, “L2”, “L2R”, “L2S”, “L3”, “L4”, “AUX”, “LL0”, “LL1”, “HK”.
release
: information about the release of the dataset. The structure is the same as defined in the section Release. In the case of a dataset using the CDF format, the “file” attribute shall provide the name of the master CDF file used to generate the output data files.
In any case, the POPPy framework will perform an automated validation of the descriptor file at each new release, in order to check that the file content is consistent with the ROC database information. In particular, it will verify that the datasets declared in the tasks object are all defined in the descriptor and across S/W.
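The kind of consistency check described above can be approximated as follows. This sketch is not the validation performed by POPPy: check_datasets is a hypothetical helper, it only looks inside a single descriptor (not across S/W), and the task list is a trimmed illustration in which one output was removed on purpose.

```python
ALLOWED_LEVELS = {
    "LZ", "L0", "L1", "L2", "L2R", "L2S", "L3", "L4", "AUX", "LL0", "LL1", "HK",
}


def check_datasets(tasks):
    """Collect inconsistencies between the inputs and outputs of a task list."""
    problems = []
    declared = set()  # identifiers of all datasets produced by some task
    for task in tasks:
        for output in task.get("outputs", {}).values():
            if output.get("level") not in ALLOWED_LEVELS:
                problems.append(
                    f"{task['name']}: invalid level {output.get('level')!r}"
                )
            declared.add(output["identifier"])
    for task in tasks:
        for dataset in task.get("inputs", {}).values():
            # Both attributes are mandatory for an input dataset.
            for key in ("identifier", "version"):
                if key not in dataset:
                    problems.append(f"{task['name']}: input lacks {key}")
            identifier = dataset.get("identifier")
            if identifier is not None and identifier not in declared:
                problems.append(f"{task['name']}: unknown dataset {identifier}")
    return problems


tasks = [
    {"name": "get_data", "inputs": {}, "outputs": {
        "packets": {"identifier": "TUTO-PKT-L0", "level": "L0"}}},
    {"name": "decommute", "inputs": {
        "packets": {"identifier": "TUTO-PKT-L0", "version": "01"}}, "outputs": {}},
    {"name": "make_text", "inputs": {
        "words": {"identifier": "TUTO-WDS-L1", "version": "01"}}, "outputs": {}},
]

print(check_datasets(tasks))
# prints ['make_text: unknown dataset TUTO-WDS-L1']
```

Collecting all problems instead of stopping at the first one makes it easier to fix a descriptor in a single pass.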
An example file taken from the tuto.texter module:
{
  "identification": {
    "project": "TUTO",
    "identifier": "tuto.texter",
    "name": "Text maker",
    "description": "Reads dictionnary and packets, reconstruct the text"
  },
  "release": {
    "version": "0.0.1",
    "date": "2018-06-03",
    "author": "Grégoire Duvauchelle",
    "contact": "gregoire.duvauchelle@protonmail.com",
    "institute": "LESIA",
    "modification": "Starting version",
    "reference": "reference_document.pdf"
  },
  "tasks": [
    {
      "name": "get_data",
      "category": "Software execution",
      "description": "",
      "inputs": {},
      "outputs": {
        "packets": {
          "identifier": "TUTO-PKT-L0",
          "name": "Tutorial data packets",
          "description": "Contains the identifier of the word",
          "level": "L0",
          "release": {
            "author": "Grégoire Duvauchelle",
            "date": "2018-06-03",
            "version": "01",
            "contact": "gregoire.duvauchelle@protonmail.com",
            "institute": "LESIA",
            "modification": "Starting"
          }
        }
      }
    },
    {
      "name": "load_dict",
      "category": "Software execution",
      "description": "Loads the dictionary in the database",
      "inputs": {},
      "outputs": {}
    },
    {
      "name": "decommute",
      "category": "Software execution",
      "description": "Replace the indexes with the words for every packet",
      "inputs": {
        "packets": {
          "identifier": "TUTO-PKT-L0",
          "version": "01"
        }
      },
      "outputs": {
        "words": {
          "identifier": "TUTO-WDS-L1",
          "name": "Tutorial words file",
          "description": "Contains the words",
          "level": "L1",
          "release": {
            "author": "Grégoire Duvauchelle",
            "date": "2018-06-03",
            "version": "01",
            "contact": "gregoire.duvauchelle@protonmail.com",
            "institute": "LESIA",
            "modification": "Starting"
          }
        }
      }
    },
    {
      "name": "make_text",
      "category": "Software execution",
      "description": "",
      "inputs": {
        "words": {
          "identifier": "TUTO-WDS-L1",
          "version": "01"
        }
      },
      "outputs": {
        "text": {
          "identifier": "TUTO-TXT-L2",
          "name": "Tutorial text file",
          "description": "contains the text",
          "level": "L2",
          "release": {
            "author": "Grégoire Duvauchelle",
            "date": "2018-06-03",
            "version": "01",
            "contact": "gregoire.duvauchelle@protonmail.com",
            "institute": "LESIA",
            "modification": "Starting"
          }
        }
      }
    }
  ]
}
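To make the links between the example tasks easier to see, the sketch below reduces each task to the dataset identifiers it reads and writes, then derives which task depends on which. It is a reading aid only: how POPPy itself schedules tasks is not covered here, task_dependencies is a hypothetical helper, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Inputs and outputs of the example tasks, reduced to dataset identifiers.
TASK_IO = {
    "get_data": {"inputs": [], "outputs": ["TUTO-PKT-L0"]},
    "load_dict": {"inputs": [], "outputs": []},
    "decommute": {"inputs": ["TUTO-PKT-L0"], "outputs": ["TUTO-WDS-L1"]},
    "make_text": {"inputs": ["TUTO-WDS-L1"], "outputs": ["TUTO-TXT-L2"]},
}


def task_dependencies(task_io):
    """Map each task to the set of tasks producing its input datasets."""
    producers = {
        dataset: name
        for name, io in task_io.items()
        for dataset in io["outputs"]
    }
    return {
        name: {producers[d] for d in io["inputs"] if d in producers}
        for name, io in task_io.items()
    }


dependencies = task_dependencies(TASK_IO)
# One possible execution order that respects the dataset dependencies.
order = list(TopologicalSorter(dependencies).static_order())

print(dependencies["make_text"])  # prints {'decommute'}
print(order.index("get_data") < order.index("decommute") < order.index("make_text"))
# prints True
```

The position of load_dict in the computed order is not constrained, because it neither reads nor writes a declared dataset.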