Review of the mDIS default database schema
Review of the mDIS default database schema
Contribution Jan. 2021
by H. Lorenz, Univ. Uppsala
Draft, ICDP-Internal
For now, this is an ICDP-internal document, for discussion and review. It is somewhat internal. It should not be used for reference by the general public.
Slightly edited and formatted for the Web by Knut Behrends, GFZ.
Edits entail highlighting, setting headings, and some Markdown-specific tweaks.
Review of mDIS DB-Schema
Comments on the data model of the GitLab master branch, 2021-01-25.
Table project_expedition
An expedition can have several moratoria. Data in a moratorium come from boreholes. Thus, I suggest moving the moratorium start/end information to project_hole. The only possible conflict with this solution would be if a borehole with a moratorium is deepened later and the data are subject to a new moratorium.
All location information should be given under project_site, because an expedition can have several locations, including land and offshore sites (fields project_location => site_location, type_of_drilling, country, state, county, city, location_description; location names could also be implemented via a gazetteer).
rock_classification and geological_age: is this JET specific? If not, move to project_site. This information can vary from site to site.
Funding agencies can vary from borehole to borehole; move funding_agency to project_hole.
mDIS data model (expedition.png File)
![]() |
| expedition annotated |
Table project_site
Much of the information under project_site is actually borehole-specific and does not need to be common for all boreholes on a site.
Move the following information to project_hole:drilling_method, drilling_method_details, platform_type, platform_name, platform_description, platform_operator, bit_sizes
mDIS data model (site.png File)
![]() |
| site annotated |
Table project_hole
ukbgs_* fields shouldn’t be part of the master branch.
Position of the borehole:
coordinate_system: use EPSG codes to make the entry unambiguous and the data machine-readable.latitudeandlongitude: with the option of specifying a coordinate system other than lat/lon, the fields should be called northing and easting.What is
core_depth_ccsf,core_depth_csf,drilling_depth_dsf,drilling_depth_drf?
Even with the comments in the database, I don’t understand. Is this JET specific or part of the basic mDIS? I would stick with the driller’s depth and possibly a corrected depth. There is no need for double fields for below surface and below sea bottom, respectively, because this information is given higher in the hierarchy (type_of_drilling(land, sea, lake), which should be onsitelevel).directionandinclinationneed a better explanation – what is it? The direction of directional drilling at the surface? Of the line between the surface and the endpoint of the borehole, calculated from borehole survey data? Something else?methods_in_holeis obsolete; implement the respective tables from the legacy DIS:campaign,run,files(already done for COSC, can be imported into the data model).
mDIS data model (hole.png File)
![]() |
| hole annotated |
Table core_core
IGSN is missingNOT incore_core. - Maybe it is missing inproject_site?core_typeshould be explained as the method of coring, not the method of drilling.drillers_depthandtop_depthare redundant. Usetop_depthand dropdrillers_depth.bottom_depthcan be calculated astop_depth+drilled_length. Dropbottom_depth.core_recovery_pccan be calculated ascore_recovery/drilled_length*100. Dropcore_recovery_pc.What is
continuity?
The explanation for the column doesn’t help me. Is it JET only?
Is it the condition of the drill core (e.g. like good = no significant disturbance/fracturing, partially fractured = numerous fractures occur, the drill core is still coherent; heavily fractured = the coherence of the drill core cannot be preserved)?
If yes, move tocore_section.rqd_*andigsn_ukbgsare project-specific and shouldn’t be on GitLab master branch.MCD=> see separate discussion at the bottom of the page.
mDIS data model (core.png File)
![]() |
| `core_core` annotated |
Table core_section
top_depthandbottom_depthcan be calculated fromcore_core.top_depthandsection_lengthof sections in the same core. Droptop_depthandbottom_depth.ukbgs_*fields shouldn’t be part of the master branch.MCD=> see separate discussion.
mDIS data model (section.png File)
![]() |
| `core_section` annotated |
Table curation_section_split
- Duplicates a lot of information from
core_section. Drop all duplicated fields.
Move mscl, xrf, radiographs, photographs, etc. to a separate table (e.g. curation_measurements).
It can then be linked to split sections or directly to the parent section if the core is not split at all or in case the measurements were made before the split.
ukbgs_*fields shouldn’t be part of the master branch.
mDIS data model (split.png File)
![]() |
| `curation_section_split` annotated |
Table curation_sample
bottom=top + interval, dropbottom.Any
MCDinformation (includingsection_top_mbsf,sample_top_mbsf,sample_bottom_mbsf) is directly derived fromMCDinformation in higher levels and thus can be referenced. Drop allMCDrelated fields.What is
amount? Rename tosample_size? This requires explanation.What is
volume? Why is it required in addition toamount?The column
scientistduplicates information fromsample_request. Dropscientist.approve*is then part ofcuration_sample_request.ukbgs_*fields shouldn’t be part of the master branch.
mDIS data model (sample.png File)
![]() |
| `curation_sample` annotated |
Table sample_request
Should this reside under
curation, likecuration_sample? i.e.curation_sample_request? Else changecuration_sampletosample_sample(it’s both, operation and curation).sample_volumeandsample_amount, what is it? – same as the question incuration_sampleabove.
mDIS data model (request.png File)
![]() |
| `sample_request` annotated |
Table geology_lithology etc.
These are project-specific and shouldn’t be on GitLab master branch.
We could establish examples according to best practices (use of published vocabularies, etc.) in the template manager but not generate it.
mDIS data model (lithology.png File)
![]() |
| `lithology_geology` - annotated |
To-Dos; other Tables
Improvements to
archive_file(and the process of assigning images to database objects).Decide how to implement core boxes as objects that can be handled by the curation part of the mDIS (maybe clear from AWI order to IG?).
Implement controlled vocabularies for all list items.
MCD
`MCD`
What is MCD and how to handle it?
MCD - “metres corrected depth”
If I remember correctly, MCD simply stands for “metres corrected depth”.
Values can be specified at several levels in the hierarchy (core, section, sample) and in different ways (absolute, relative to top_depth).
Thus, all doors are widely open for database inconsistency due to redundant data (changed here but not there) and for scientific mistakes due to different ideas about how to interpret the data (does mcd_offset on section level add to or override mcd_offset on core level?).
mDIS data model (mcd.png File)
![]() |
| `mcd_*` and `curated_*` fields - annotated |
We need to answer the following questions:
What is "MCD" in practice?
Where does the value come from? (Or “What was the original idea with it?”)
Problem: MCD and mcd_offset are redundant
(MCD = top_depth + mcd_offset). Which one do we want to keep?
Is there a reason for the introduction of
MCDat core and section level?
If yes, what is the reason for it? And what is the original intention to handle this in case there is both a core and a section value formcd_offset?Can we make the system of
MCDcorrections simpler and unambiguous?
MCD: My answers and suggestions
In COSC, the drill core depth is corrected to the depth master downhole log (GRmaster). Televiewer imagery was correlated to GRmaster and then, in Corelyzer, the core scans were correlated to the televiewer images. The resulting depth was extracted from Corelyzer and imported to the mDIS as
MCD.
In the Corelyzer file format, the depth imported from mDIS is calledMCD, and the present “working-depth” (as drill core is displayed on the screen, possibly after depth shifts by the user) as “depth,” hinting that the original workflows included some other depth correction procedure forMCDand thus, already provided Corelyzer with anMCDfrom the legacy DIS.It is good practice to provide absolute values only once (
top_depth) and all other values relative to it, i.e. usemcd_offsetand dropMCD.I don’t know, but I could imagine that it was only about unambiguous absolute values in the beginning and the
mcd_offsetwas added later to easily allow users to evaluate the difference betweenMCDand driller’s depth. If all fields are filled and kept up-to-date at all times, this works. Else it is a mess.I would reduce
MCDtomcd_offsetat section level. An offset that affects the entire core run will have the same value for all sections in this core run (no offset, i.e. 0, by default). Offsets of individual sections are added to it. Any depth values on a section (top) can be calculated to driller’s depth (top_depth+top) orMCD(top_depth+mcd_offset+top).
I suggest changing the data model according to my answer to point 4.









