Ingestion

Introduction to Ingestion

In Rosetta Core, Ingestion is the process of converting an (electronic) input file into CDM.

The input file is in one of a number of standard formats for electronic data storage and transport, such as XML or JSON. The CDM output can be either:

  • another electronic file, using the same range of supported formats as the input file and with CDM as the underlying data model (a.k.a. a Serialised CDM document), or
  • a CDM object that computer code can be directly executed on (a.k.a. a De-Serialised CDM object).

The Serialised view is useful for displaying the CDM output in an interface, in a way that a user can browse and understand. This is the view used when demonstrating the Ingestion application in Rosetta Core. The De-Serialised object is not meant to be human-readable.
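The difference between the two output forms can be illustrated with a minimal Python sketch. It does not use any Rosetta or CDM API: the TradeSketch class and its fields are hypothetical stand-ins for a CDM type, and JSON is assumed as the serialisation format.

```python
import json
from dataclasses import dataclass, asdict


# Hypothetical, heavily simplified stand-in for a CDM type.
# The real CDM model is far richer; these names are illustrative only.
@dataclass
class TradeSketch:
    trade_id: str
    notional: float
    currency: str


# De-serialised form: an in-memory object that code can operate on directly.
trade = TradeSketch(trade_id="T-001", notional=1000000.0, currency="USD")
doubled = trade.notional * 2

# Serialised form: a text document (JSON here) that can be stored,
# transported or displayed to a user.
serialised = json.dumps(asdict(trade), indent=2)

# Reading the serialised document back yields an equivalent object.
restored = TradeSketch(**json.loads(serialised))
```

In this sketch, `serialised` plays the role of the view shown in the CDM panel, while `trade` and `restored` play the role of the object that code executes on.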

This section presents the basic functionalities of the Ingestion App in Rosetta Core.

Overview of Ingestion Interface

After opening the Ingestion App, the Rosetta Core window shows the Ingestion interface at the bottom of the screen (or across the full height of the screen if the Editor application is closed):

../_images/ingestion_overview.png

The Ingestion section is further split horizontally into four panels:

  • INGESTION: Where the user selects which file to ingest
  • INPUT: To visualise the input file in detail
  • CDM: To visualise the (serialised) CDM output once ingested
  • DIAGNOSTICS: Statistics and other key information summarising the success (or not) of ingesting the file

Each panel can be folded or expanded by clicking on its title. Here, for instance, INGESTION and INPUT are folded and only CDM is expanded:

../_images/2020-07-16-13-02-29.png

Note

By default, the DIAGNOSTICS panel is shown folded when opening the Ingestion App.

The function of each panel is described in the following list of functionalities and detailed in the sections below:

  • Uploading an input file
  • Running / re-running ingestion
  • Inspecting the input and output
  • Using Ingestion diagnostics

Uploading a File

To upload an electronic file for ingestion, click the Upload button in the INGESTION panel and select the file from the popup. The file must be accessible from the user’s computer.

../_images/ingestion_file_upload.png

Note

Ingestion in Rosetta Core currently supports only XML input, but further formats will be added in the future.

Running Ingestion

Once uploaded, and provided that all the required code in the user workspace is ready to be run, the file is automatically processed through ingestion. The INGESTION panel shows the file name with a spinning signal indicating it is being run:

../_images/ingestion_run.png

Assuming the input file is valid and can be mapped to the model (see the Using Synonyms section), the signal turns green and results are displayed in the INPUT, CDM and DIAGNOSTICS panels:

../_images/ingestion_run_success.png

When the file cannot be ingested, the signal turns red:

../_images/ingestion_run_fail.png

When the user makes modifications to the model in their workspace, the ingestion results become stale, as those changes may impact the way the input should be mapped and represented in the model. This is indicated by the orange signal:

../_images/ingestion_run_modification.png

Once the user has completed their changes and the updated code in their workspace is ready to be run, they can re-run ingestion by clicking the -> command (which was previously deactivated):

../_images/ingestion_run_success_rerun.png
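The signal colours described in this section can be summarised as a small state machine. The sketch below is only a reading aid and says nothing about how Rosetta Core is implemented: the Status enum, the event names and the next_status function are all hypothetical, and it assumes that a failed run also becomes stale when the model changes.

```python
from enum import Enum


class Status(Enum):
    RUNNING = "spinning"  # file is being processed
    SUCCESS = "green"     # file ingested, results displayed
    FAILED = "red"        # file could not be ingested
    STALE = "orange"      # model changed since the last run


# Hypothetical event names; each pair maps (current status, event)
# to the status that follows it.
TRANSITIONS = {
    (Status.RUNNING, "ingested"): Status.SUCCESS,
    (Status.RUNNING, "failed"): Status.FAILED,
    (Status.SUCCESS, "model_changed"): Status.STALE,
    (Status.FAILED, "model_changed"): Status.STALE,  # assumption
    (Status.STALE, "rerun"): Status.RUNNING,
}


def next_status(current: Status, event: str) -> Status:
    """Return the status after an event; unknown events leave it unchanged."""
    return TRANSITIONS.get((current, event), current)


# Example walk: a successful run, a model edit, then a re-run.
status = Status.RUNNING
for event in ["ingested", "model_changed", "rerun"]:
    status = next_status(status, event)
```

The "rerun" event is only defined for the stale state, which mirrors the -> command being deactivated until the model has changed.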

Inspecting Results

Coming soon

Ingestion Diagnostics

Coming soon

Implementing Mappings

The process of converting an electronic document into CDM relies on mapping the attributes of that electronic document to CDM attributes. In the CDM, these mappings are implemented via synonyms.
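As a rough illustration of attribute mapping, the Python sketch below translates the child elements of a small XML document into target attribute names using a lookup table. It is not Rosetta DSL synonym syntax and not how Rosetta Core performs ingestion: the SYNONYMS table, the element names, the target attribute names and the ingest function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping table: input element name -> target attribute name.
# In the CDM, this role is played by synonyms defined in the model.
SYNONYMS = {
    "tradeId": "tradeIdentifier",
    "notionalAmount": "quantity",
    "ccy": "currency",
}


def ingest(xml_text):
    """Map the direct children of the root element to target attributes.

    Returns the mapped values and the list of input elements that have
    no mapping defined.
    """
    root = ET.fromstring(xml_text)
    mapped, unmapped = {}, []
    for element in root:
        target = SYNONYMS.get(element.tag)
        if target is None:
            unmapped.append(element.tag)
        else:
            mapped[target] = (element.text or "").strip()
    return mapped, unmapped


sample = "<trade><tradeId>T-001</tradeId><ccy>USD</ccy><broker>ACME</broker></trade>"
mapped, unmapped = ingest(sample)
```

Here `mapped` holds the values that found a target attribute, and `unmapped` lists input elements with no mapping, which is the kind of information an ingestion diagnostic might summarise.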

Users are invited to read the Mapping Component section of the Rosetta DSL documentation to get familiar with how this concept is used in the model, prior to reading the following section.

Coming soon

Integration Tests

Coming soon