2008-06-17

IDF Open Meeting: Second Half

Jan Brase, German National Library of Science and Technology (TIB): Access to non-textual information



Data > Publication > Knowledge: the knowledge, via publications, is accessible; the data itself is not published or accessible. Known problems: verifiability, duplication, data reuse. Ten-year data accessibility has been mandated in Germany, and ignored.

Solution: strong data centres; global access to datasets & metadata through existing library catalogues; persistent identifiers.

Results: citability of primary data. High visibility. Verification. Data re-use. Bibliometrics. Enforcing good scientific practice.

Use of DOI to that end: TIB is now a non-commercial DOI registration agency for datasets. Datasets get DOIs and catalogue entries. Through portal access to a dataset one can disaggregate it (e.g. into multiple measurements), accept conditions, choose formats, etc.
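(Aside: a rough sketch of what registration buys you. The DOI below is made up; the doi.org proxy is the real public resolver, and it simply redirects to whatever landing page the registration agency has recorded.)

    # Minimal sketch: resolve a (hypothetical) dataset DOI via the public DOI proxy.
    # The proxy answers with an HTTP redirect to the landing page that the
    # registration agency (here TIB) has recorded for this DOI.
    import urllib.request

    doi = "10.1594/EXAMPLE/DATASET-1"   # hypothetical dataset DOI, for illustration only
    with urllib.request.urlopen("https://doi.org/" + doi) as resp:
        print(resp.geturl())            # the dataset's landing page, after redirects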

TIB registers data worldwide, and any community-funded research in Europe. Half a million objects registered (but not stored at TIB; they are not a data centre).

Scientific information is not only text: data, tables, pictures, slides, movies, source code and more, all of which should also be accessible through library catalogues as publicly funded research outputs. The catalogue becomes a portal within a network of trusted content providers, with persistent links.

Institutions often find it hard to get DOIs from a foreign library (TIB currently being the only show in town); so TIB wants to set up a new worldwide agency, paralleling CrossRef, as a consortium of libraries registering DOIs for scientific content. ETH Zürich and INIST France have signed up so far.

ICSTI has started a project for citing numerical data and integrating it with text, in which TIB is a participant.

Jill Cousins, European Digital Library Foundation: Access to National Resources



The European Library: a consortium of Council of Europe national libraries. Federated search starts at a federated registry and then goes out to the library servers (SRU, Z39.50). This gave national libraries a low barrier to entry, annoying as it is to the user. The national libraries themselves know that this won't scale, and are moving from Z39.50 to OAI harvesting.
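(Aside: what the move to OAI harvesting looks like at the protocol level. The endpoint URL below is made up; the verb, parameters and namespace are standard OAI-PMH 2.0.)

    # Minimal OAI-PMH harvesting sketch: fetch one page of Dublin Core records
    # from a (hypothetical) national-library repository endpoint.
    import urllib.request
    import urllib.parse
    import xml.etree.ElementTree as ET

    BASE = "https://library.example.org/oai"   # hypothetical endpoint
    query = urllib.parse.urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})

    with urllib.request.urlopen(BASE + "?" + query) as resp:
        tree = ET.parse(resp)

    # Count the harvested records and check for a resumptionToken (next page).
    ns = {"oai": "http://www.openarchives.org/OAI/2.0/"}
    records = tree.findall(".//oai:record", ns)
    token = tree.find(".//oai:resumptionToken", ns)
    print(len(records), "records;", "more to come" if token is not None else "done")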

Persistent identifiers were not a priority for national libraries: they hadn't digitised much (5 million items for the whole continent), and Z39.50 didn't need to interoperate with external systems. This will change: 100 million items to be digitised in the next 5 years, born-digital content, the move to OAI-PMH and OpenURL.

CENL (the Conference of European National Librarians) recommends there must be resolution services, based on URNs, primarily from NBN namespaces. Each national library is to have its own resolver service for access to its own holdings, following European standards. The URN service must deal with other identifier schemes. For long-term survival, DOIs can eventually redirect to national library copies (copies of last resort).
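(Aside: such a resolver is, in practice, another HTTP redirect service. The URN below is made up; nbn-resolving.de, the German national resolver, is used here as the assumed example.)

    # Sketch of URN:NBN resolution: the national resolver maps a URN from the
    # NBN namespace to the current location of the object via an HTTP redirect.
    import urllib.request

    urn = "urn:nbn:de:example-12345"        # hypothetical URN:NBN
    resolver = "https://nbn-resolving.de/"  # each national library runs its own

    with urllib.request.urlopen(resolver + urn) as resp:
        print(resp.geturl())                # redirected to the library's copy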

NBNs are already being used. They identify Items, not Works, though now that libraries digitise material themselves they are moving away from that. Resolvers need to deal with the appropriate copy.
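(Aside: the "appropriate copy" problem is what OpenURL, mentioned above, addresses: rather than pointing at one fixed target, the citation metadata is sent to the user's own institutional resolver, which chooses the copy that user can access. The resolver base URL below is hypothetical; the key names follow the OpenURL 1.0 KEV convention.)

    # Sketch of an OpenURL: describe the cited object, and let the user's local
    # link resolver decide which copy is "appropriate" for that user.
    from urllib.parse import urlencode

    resolver = "https://resolver.example-university.edu/openurl"  # hypothetical
    citation = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.issn": "1234-5678",   # illustrative values only
        "rft.volume": "12",
        "rft.spage": "34",
    }
    print(resolver + "?" + urlencode(citation))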

SURFnet is proposing a global resolver; national libraries are interested "because it's free (at the moment)", and it is prepared to work with both NBNs and DOIs. National libraries are still learning what the point of persistent identifiers is. They are not working with the IDF because of a perception of high costs and little return (ah, but did they negotiate?), and because they need to resolve the "last resort" issue, which does not depend on the IDF. There is also a lot of "not invented here": a desire to avoid external providers. Libraries already have NBNs that work internally, so until now they have not had the pressure to resolve consistently.

European Digital Library (Europeana) underway. No standards for unique identification yet! Still trying to work out how to realise it (e.g. decentralised?)
