mDIS User Documentation

Setup

Document Rendering

(Detailed document rendering options not shown)

R Packages

Get metadata formats

The available metadata formats are: dif, oai_datacite, iso19139, oai_dc, datacite.

Get Sets/Metadata Catalogs
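
As a minimal sketch of how the sets could be listed, the code below builds an OAI-PMH request URL and turns a ListSets response into a data frame. It assumes the xml2 package is installed and uses the endpoint that appears in the request URLs under "DOIs per year". The helper names oai_url() and parse_sets() are hypothetical and are not taken from the original code.

```r
# Base URL of the OAI-PMH endpoint (taken from the request URLs in this document).
oai_base <- "https://doidb.wdc-terra.org/oaip/oai"

# Hypothetical helper: build an OAI-PMH request URL from a verb and named arguments.
oai_url <- function(verb, ...) {
  args <- c(list(verb = verb), list(...))
  query <- paste0(names(args), "=", unlist(args), collapse = "&")
  paste0(oai_base, "?", query)
}

# Hypothetical helper: turn a ListSets response into a data frame.
# Requires the xml2 package; `x` may be a URL or a string of XML.
parse_sets <- function(x) {
  doc <- xml2::xml_ns_strip(xml2::read_xml(x))
  sets <- xml2::xml_find_all(doc, "//set")
  data.frame(
    name = xml2::xml_text(xml2::xml_find_first(sets, "./setName")),
    spec = xml2::xml_text(xml2::xml_find_first(sets, "./setSpec")),
    stringsAsFactors = FALSE
  )
}

# Usage (performs a network request):
# sets <- parse_sets(oai_url("ListSets"))
```

The same oai_url() helper would also cover the ListMetadataFormats request from the previous section by passing a different verb.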

Calculate a data frame of year intervals

    #  year  doy_first   doy_last
1   1  2009  2009-01-01  2009-12-31
2   2  2010  2010-01-01  2010-12-31
3   3  2011  2011-01-01  2011-12-31
4  12  2020  2020-01-01  2020-12-31
5  13  2021  2021-01-01  2021-12-31
6  14  2022  2022-01-01  2022-12-31
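
A table like the one above could be produced along the following lines. This is a sketch in base R, not the original code: the function name year_intervals() is hypothetical, the column names follow the printed output, and the year range 2009 to 2022 is read off the rows shown.

```r
# Build one row per calendar year with its first and last day.
year_intervals <- function(first_year, last_year) {
  years <- seq(first_year, last_year)
  data.frame(
    `#` = seq_along(years),                           # running index
    year = years,
    doy_first = as.Date(sprintf("%d-01-01", years)),  # first day of the year
    doy_last = as.Date(sprintf("%d-12-31", years)),   # last day of the year
    check.names = FALSE                               # keep "#" as a column name
  )
}

years <- year_intervals(2009, 2022)
head(years, 3)
```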

Goal: run one HTTP GET request for each catalog-year combination.

Step 1/3: Prepare a data frame of HTTP calls that use the OAI-PMH verb ListRecords.
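
One way to prepare such a data frame is a cross join of catalogs and years, followed by string formatting. The sketch below uses base R and small toy inputs; the URL pattern is copied from the request URLs shown under "DOIs per year", while the variable names sets, years and calls are assumptions.

```r
oai_base <- "https://doidb.wdc-terra.org/oaip/oai"

# Toy inputs; in the real workflow these come from the previous sections.
sets <- data.frame(
  name = c("ArboDat 2016", "TERENO"),
  spec = c("DOIDB.ARBODAT", "DOIDB.TERENO"),
  stringsAsFactors = FALSE
)
years <- data.frame(
  year = 2002:2003,
  doy_first = as.Date(c("2002-01-01", "2003-01-01")),
  doy_last = as.Date(c("2002-12-31", "2003-12-31"))
)

# Cross join: merge() with by = NULL returns every catalog-year combination.
calls <- merge(sets, years, by = NULL)

# One ListRecords URL per row, restricted to the set and the date range.
calls$req <- sprintf(
  "%s?verb=ListRecords&metadataPrefix=oai_dc&from=%s&until=%s&set=%s",
  oai_base, format(calls$doy_first), format(calls$doy_last), calls$spec
)
```

With two catalogs and two years this yields four rows; the full workflow would yield one row per catalog and year.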

Step 2/3: Now compose a getter function for fetching XML data via HTTP.
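
A getter of this kind might look like the sketch below. It assumes the xml2 package is installed; the names count_records() and get_count() are hypothetical, and returning NA for an empty result is an assumption based on the NA values in the results table.

```r
# Hypothetical helper: count the records in one parsed ListRecords response.
count_records <- function(doc) {
  doc <- xml2::xml_ns_strip(doc)
  # OAI-PMH reports an empty result as an error with code "noRecordsMatch".
  empty <- xml2::xml_find_all(doc, "//error[@code='noRecordsMatch']")
  if (length(empty) > 0) return(NA_integer_)
  length(xml2::xml_find_all(doc, "//ListRecords/record"))
}

# Hypothetical getter: fetch a URL and return the record count,
# or NA if the request or the parsing fails.
get_count <- function(url) {
  tryCatch(count_records(xml2::read_xml(url)), error = function(e) NA_integer_)
}

# Usage (performs a network request):
# get_count("https://doidb.wdc-terra.org/oaip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2022-01-01&until=2022-12-31&set=DOIDB.WSM")
```

Note that an OAI-PMH server may split a long list into pages and end each page with a resumptionToken. This sketch counts only the records in the first response, so it would undercount such sets unless the remaining pages are followed.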

Step 3/3: Use the XML getter function to extend the data frame of URLs with the count of DOIs assigned per year.

This will perform 322 HTTP requests (14 years × 23 catalogs).
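
Applying a getter to every URL could be sketched as follows. The helper name add_counts() and the pause between requests are assumptions; the getter is passed in as an argument, so the example can run with a stand-in function and without network access.

```r
# Hypothetical helper: add a `cnt` column by calling `getter` once per URL.
# `getter` is any function that maps a URL to a single integer (or NA).
add_counts <- function(calls, getter, pause = 0.2) {
  calls$cnt <- vapply(calls$req, function(u) {
    Sys.sleep(pause)  # short pause between requests to go easy on the server
    as.integer(getter(u))
  }, integer(1), USE.NAMES = FALSE)
  calls
}

# Usage with a stand-in getter (string length), so no request is sent here.
demo <- data.frame(
  req = c("https://example.org/a", "https://example.org/bb"),
  stringsAsFactors = FALSE
)
demo <- add_counts(demo, getter = nchar, pause = 0)
```

In the real run the stand-in would be replaced by a function that fetches the URL and counts the returned records.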

DOIs per year

     #  name                                                    spec           year  doy_first   doy_last    req                                                                                                                             cnt
1    1  ArboDat 2016                                            DOIDB.ARBODAT  2002  2002-01-01  2002-12-31  https://doidb.wdc-terra.org/oaip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2002-01-01&until=2002-12-31&set=DOIDB.ARBODAT   NA
2    2  CRC1211DB CRC 1211 Database                             DOIDB.CRC1211  2002  2002-01-01  2002-12-31  https://doidb.wdc-terra.org/oaip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2002-01-01&until=2002-12-31&set=DOIDB.CRC1211   NA
3    3  DEKORP - German Continental Seismic Reflection Program  DOIDB.DEKORP   2002  2002-01-01  2002-12-31  https://doidb.wdc-terra.org/oaip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2002-01-01&until=2002-12-31&set=DOIDB.DEKORP    NA
4  479  SFB806 and CRC806-Database                              DOIDB.SFB806   2022  2022-01-01  2022-12-31  https://doidb.wdc-terra.org/oaip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2022-01-01&until=2022-12-31&set=DOIDB.SFB806     2
5  480  TERENO                                                  DOIDB.TERENO   2022  2022-01-01  2022-12-31  https://doidb.wdc-terra.org/oaip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2022-01-01&until=2022-12-31&set=DOIDB.TERENO   103
6  481  TR32DB CRC/Transregio 32 Database                       DOIDB.TR32DB   2022  2022-01-01  2022-12-31  https://doidb.wdc-terra.org/oaip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2022-01-01&until=2022-12-31&set=DOIDB.TR32DB    33
7  482  TRR228DB CRC/Transregio 228 Database                    DOIDB.TRR228   2022  2022-01-01  2022-12-31  https://doidb.wdc-terra.org/oaip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2022-01-01&until=2022-12-31&set=DOIDB.TRR228     3
8  483  WDS World Stress Map                                    DOIDB.WSM      2022  2022-01-01  2022-12-31  https://doidb.wdc-terra.org/oaip/oai?verb=ListRecords&metadataPrefix=oai_dc&from=2022-01-01&until=2022-12-31&set=DOIDB.WSM        5

Done with getting data.


Prepare data for plotting

Add a few columns to make calculations and plotting easier.
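
The original helper columns are not shown, so the following is only an illustration of what such columns might be. The column names n, catalog and cumulative are hypothetical, and the toy values are partly made up.

```r
# Toy data in the shape of the harvested table (values are illustrative).
dois <- data.frame(
  spec = c("DOIDB.TERENO", "DOIDB.TERENO", "DOIDB.WSM", "DOIDB.WSM"),
  year = c(2021, 2022, 2021, 2022),
  cnt  = c(NA, 103, 4, 5),
  stringsAsFactors = FALSE
)

# Treat years without records (NA) as zero so that sums work.
dois$n <- ifelse(is.na(dois$cnt), 0, dois$cnt)

# Short catalog label without the "DOIDB." prefix.
dois$catalog <- sub("^DOIDB\\.", "", dois$spec)

# Running total per catalog; rows are first ordered by catalog and year.
dois <- dois[order(dois$spec, dois$year), ]
dois$cumulative <- ave(dois$n, dois$spec, FUN = cumsum)
```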

Checking data quality.
The sum of records in the catalog "DOIDB" should equal the sum of records across all other data centers:
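
A check of this kind can be sketched in base R as shown below. It assumes that the umbrella catalog has the set spec "DOIDB" and that every data center has a spec of the form "DOIDB.<NAME>"; the counts are toy values, not harvested results.

```r
# Toy counts: one umbrella catalog plus two data centers.
dois <- data.frame(
  spec = c("DOIDB", "DOIDB.TERENO", "DOIDB.WSM"),
  cnt  = c(108, 103, 5),
  stringsAsFactors = FALSE
)

# Split the rows into the umbrella catalog and everything else.
is_umbrella <- dois$spec == "DOIDB"
total_umbrella <- sum(dois$cnt[is_umbrella], na.rm = TRUE)
total_centers  <- sum(dois$cnt[!is_umbrella], na.rm = TRUE)

# TRUE when both totals agree.
total_umbrella == total_centers
```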

Currently, the two sums are equal, so the check evaluates to TRUE.

Plots

2 Large GFZ Catalogs

GFZ's smaller DOI-Catalogs

Sorted by total number of records first

Same plot as before, zoomed in, subplots sorted by total number of records, with individual Y-axis scales.
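
A plot with these properties could be sketched as follows, assuming the ggplot2 package is installed. The data are toy values, and the helper name plot_catalogs() and the column names catalog, year and n are hypothetical.

```r
# Toy data with one row per catalog and year.
dois <- data.frame(
  catalog = rep(c("A", "B", "C"), each = 2),
  year    = rep(c(2021, 2022), times = 3),
  n       = c(1, 2, 30, 40, 5, 6)
)

# Order the catalogs by their total record count, largest first,
# so that the facets are laid out in that order.
dois$catalog <- reorder(factor(dois$catalog), -dois$n, FUN = sum)

# Hypothetical plotting helper; requires the ggplot2 package.
# scales = "free_y" gives every subplot its own Y-axis scale.
plot_catalogs <- function(d) {
  ggplot2::ggplot(d, ggplot2::aes(x = year, y = n)) +
    ggplot2::geom_col() +
    ggplot2::facet_wrap(~ catalog, scales = "free_y")
}

# Usage:
# print(plot_catalogs(dois))
```

Sorting is done on the factor levels rather than on the rows, because facet_wrap() lays out panels in the order of the factor levels.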

The End.

PS

Check out the analysis of IGSN catalogs at GFZ.