Hi Dan,
I looked at it earlier in the year. You may be interested to note that
NIWA's OBIS node was also established with Sloan Foundation funding... they get
fingers in lots of pies - but that was also about harvesting from institutional
to regional to global repositories (SW Pacific OBIS node to OBIS to GBIF).
Apart from dat only being in very early stages of development...
Generally in NZ we have focused on a federated approach to data sharing with
mostly institutional repositories, made available (& discoverable) via web
services. This is in contrast to the monolithic data warehouse approach followed
by dat.
I believe (& your work on data management maturity benchmarking might provide
tools to prove or disprove this) that most agencies in NZ are still trying to
get their own houses in order, & trying to build a dat system on top of poorly
managed & documented datasets is not likely to be that useful.
If we can all agree on data delivery & discovery standards, & provide
guidelines & tools to help institutions get up to speed to do this with their
holdings, then a dat style data warehouse, harvesting from many repositories
via common standards with common vocabularies becomes a very achievable goal.
For example - in the hydrometric/climate data domain, various agencies capture
rainfall data. This might be a 24hr measurement, but is that a calendar day or
0900-0900 (eg: if manual gauges are read at 0900)? In the institutional
datasets, the reading may be called "rainfall-24", "rainfall_24hr",
"precipitation", etc... So simply harvesting such data into a dat style
repository does not necessarily help users know if their mashups are apples &
apples, or not...
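
To make the rainfall example concrete, here is a rough Python sketch of the kind of mapping an agreed vocabulary would let a harvester apply. Everything in it is an assumption for illustration: the term "rainfall_24h", the two window labels, the agency names and the functions are hypothetical and do not come from dat or from any agency's actual schema.

```python
from dataclasses import dataclass

# Hypothetical lookup from institutional field names to one agreed term.
# Treating "precipitation" as a 24hr total is itself an assumption that
# the supplying agency would need to confirm.
COMMON_VOCAB = {
    "rainfall-24": "rainfall_24h",
    "rainfall_24hr": "rainfall_24h",
    "precipitation": "rainfall_24h",
}

# Hypothetical labels for the two 24hr windows mentioned above.
WINDOWS = {"calendar_day", "0900-0900"}


@dataclass(frozen=True)
class Reading:
    agency: str
    term: str        # agreed vocabulary term
    window: str      # which 24hr period the total covers
    value_mm: float


def normalise(agency, local_name, window, value_mm):
    """Translate one institutional reading into the agreed vocabulary."""
    term = COMMON_VOCAB.get(local_name.lower())
    if term is None:
        raise KeyError(f"no agreed term for {local_name!r}")
    if window not in WINDOWS:
        raise ValueError(f"unknown measurement window {window!r}")
    return Reading(agency, term, window, float(value_mm))


def comparable(a, b):
    """True only if both readings share a term and a 24hr window."""
    return a.term == b.term and a.window == b.window


if __name__ == "__main__":
    a = normalise("agency_a", "rainfall-24", "calendar_day", 12.5)
    b = normalise("agency_b", "precipitation", "0900-0900", 12.5)
    print(comparable(a, b))
```

The point of the sketch is that renaming the fields is the easy part. The two readings end up with the same term, but they still should not be mashed together unless the measurement window is recorded and matches, and that information can only come from the institution that holds the data.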
In general, I believe that data needs to be well managed at the institutional
level, according to commonly agreed standards, discovered & accessed via
agreed, standard protocols with common vocabularies, as a precursor to any
monolithic repository. The simplest way to do this is by creating a
standardised, federated system, which then allows the various datasets to be
usefully harvested into a central system, like a dat repository, if this is
regarded as useful.
So I think dat is a possible tool for the future, but to start looking at
national repositories, when our institutional ones still need work, may be
trying to run before we can walk - despite the efforts by the dat development
team to address these issues.
Cheers,
Brent Wood