NeCTAR Melbourne Town Hall

NeCTAR Townhall, 2010-11-26

NeCTAR: National eResearch Collaboration Tools and Resources

$47 million of funding, 2010-2014. Build electronic collaboration infrastructure for the national research community. UniMelb is the lead agent.

Aims to enhance research collaboration & outcomes, and to support the connected researcher at the desktop/benchtop. Aims to deploy national research infrastructure and services not otherwise available, in order to enable research collaboration and more rapid outcomes.

Board to approve final project plan, submit to DIISR by Mar 31 2011. Townhall meetings over the next two months.

Consultation paper [PDF] circulated; 60+ responses received; responses available.

Response themes:
* avoid duplicating existing offerings
* needs to be researcher-driven
* questions on how to leverage institutional investments
* need coherent outcomes across nectar
* need to focus on service delivery
* need to establish sustainability

Interim Project Plan [PDF] available:
NeCTAR is funding four strands of activity. Two are discipline-specific, two are generic and overlaid on the discipline-specific strands.
* Research Tools (discipline-specific, may eventually generalise)
* Virtual labs (resources, not just instruments, available from the desktop; the emphasis on resources keeps virtual labs from being applicable only to instrument science).
* Research cloud (general or multi-disciplinary applications and services, plus a framework for using them)
* National server programme (core services, authentication, collaboration, data management services).
NeCTAR will clear up their use of terminology in future communications.

NeCTAR is meant to be serving Research Communities: these are defined as being discipline-based, and range across institutions. e-Research facilitates remote access to shared resources from desktop, in order to enhance collaboration for Research communities (making them Virtual Research communities).

NeCTAR will remain lightweight, to respond to generic and discipline-specific research community needs. Infrastructure is to be built through NeCTAR subprojects. The lead agent UniMelb will subcontract other organisations; some outcomes may be sourced from outside the research community. NeCTAR may start with early adopter groups who already have lots of infrastructure, and may take up existing groupware solutions from them. NeCTAR can only fund infrastructure and not operational services, as it is funded through EIF. Sustainability (as always) is entrusted to the disciplines; NeCTAR will cease in 2014.

Expert panels from across the community are to advise the NeCTAR board on allocating subcontracts, as NeCTAR places a premium on transparency. Subcontracts must demonstrate a competitive co-investment model for what NeCTAR can't fund: these will take the form of matching funds, likely in-kind, to cover maintenance and support as well as development.
Expert panels will include both researchers, and e-research experts who are familiar with what infrastructure already exists.

There will be a staged model for NeCTAR issuing subcontracts. In 2011 NeCTAR is funding projects that promise early positive outcomes, in order to give slower-adopting communities more time to develop their proposals. Progress will be reviewed, and the next stage planned, in late 2011.

Research Communities will define the customised solutions they need; these will be delivered through Research Tools & Virtual Labs. NeCTAR will reserve some funds from the subcontracts to fund research communities directly, to bring them into virtual mode.

Considerations such as the resourcing, scale, and timeframe of target Virtual Research Communities will inform NeCTAR's priorities on what to fund.

For the Research Cloud, NeCTAR is funded to deploy resources onto cloud nodes, but not to create the nodes themselves. NeCTAR will work with existing cloud nodes, e.g. from the Research Data Storage Infrastructure (RDSI). Some Research Cloud nodes and RDSI nodes will coexist—but more will be known once the RDSI lead agent has been announced. The consultation responses show a desire for a consistent user experience, which requires a consistent framework for service provision, based on international best practice. (This encompasses virtual machines, data store access, application migration, security, licensing, etc.) The framework for the Research Cloud will be developed in parallel with the early projects.

The National Server Program (NSP) will provide core services relevant to all disciplines, e.g. interfaces out of AAF, ANDS, RDSI. The underlying NSP infrastructure will be reliable enough to use as a foundation for more innovative services. The prospect of database hosting has been under much discussion. The National Server Program Allocation Committee is to recommend services for hosting to the NeCTAR board.

Contrast between the National Server Program and the Research Cloud:
* NSP supports core services (EVO, Sharepoint), Research Cloud supports discipline-specific services built on top of the core. (These can include: data analysis, visualisation, collaboration, security, data access, development environments, portals.)
* NSP runs for years, Research Cloud services may only run for minutes.
* NSP provides 24/7 support, Research Cloud provides 9-5 support.
* NSP has strict entry, security, maintenance criteria; Research Cloud less so.

UniMelb is delivering the NSP basic access phase: 50-100 virtual machines, at no charge in 2011, located at UniMelb. This is the first stage of deployment: there will be nodes elsewhere, and Virtual Machine numbers will ramp up.

Many universities are already delivering Virtual Machines, but they can use NeCTAR infrastructure as leverage. Virtual Machine distribution is increasingly used for application release, e.g. with TARDIS.

International exemplars for NeCTAR infrastructure: National Grid Service (UK): Eucalyptus; NASA (US): OpenNebula. NeCTAR will run an expert workshop early next year, inviting international experts and all potential research cloud nodes.

Discussion (from the Twitsphere: #NeCTAR)

* Will the existing ARCS data fabric be maintained? NeCTAR is not able to answer that, since the question is outside NeCTAR's remit. DIISR is in discussions with ARCS on the future of the Data Fabric as well as EVO.


ADLRR2010: My summary

At an unsafe distance, I have posted my summary of what was discussed at the ADLRR2010 summit on the Link Affiliates group blog.


ADLRR2010: Wrap-up

Dan Rehak:
Strongest trend in the meeting's discussions: what is the problem we're trying to solve, who is the target community, and how do we engage them?
Also: Sustainability, success, return on investment
Good consensus on No More Specs, figure out how to make what we already have work
Still seeing spectrum of solutions and different ecosystems, and don't know yet where to align along that spectrum
We should not focus on what we build, but on what requirements we satisfy
Learning is special, but not because it's the registry/repository architecture, but because we have particular requirements
We are technically mature, but socially immature in our solutions
Throwing it all away and starting from scratch has to be an option; cannot be captive to past approaches
Followup meeting in London next week

Paul Jesukiewicz:
ADL is soul-searching with the Administration on the way forward (under the time constraint of the next two years: they want to leave their mark on Education)
Govts worldwide are reprioritising their repository infrastructure
ADL is putting in recommendations, and govt wants to tackle the Dark Web

ADLRR2010: Breakout Groups: Future Directions

Challenge: Where should ADL spend 10 mill bucks now on repository stuff? (Or 1 mill bucks, instructions to groups varied, and spending some money on a cruise was an option.)

Group 1:
We're back to herding cats
* Do we understand the problem statement clearly yet? Lots of discussion these past 2 days, all over the place. Need to work out the grand challenge with the community still
* Need awareness of the problem space; lots of terms used loosely (repository vs registry), ppl don't know what they're trying to solve yet. What's happening in other domains?
* Harmonise, explore what's going on in the space, work with what you've got instead of reinventing solutions
* More infrastructure support: if solutions are out there, what we're missing is the market for it. What good is a highway system without any cars to go on it?

Group 2:
* Understand business drivers and requirements, formally (40%)
* Models for embedding and highly usable systems (25%) (widgets, allowing people to adapt their current systems of choice; don't end up in competition with grassroots systems that are more usable)
* Create mechanisms to establish trust and authority, in terms of functionality and content (15%) (clearinghouse? means of rating? something like Sourceforge?)—this is where the value of a content repository is
* Virtualise content repositories and registries (Google) (10%)—ref what Nick Nicholas was talking about: allow grassroots generated content to be a layer below Google for discovery: middleware API, Web Services, Cloud Computing: essentially data mining
* Study systems (Content Repositories, LCMSs) that already work (10%)

Group 3:
Most time spent on defining what the problem is
* Some thought there is no single problem, depends on what is being asked
* Some thought there is no problem at all, we'll take the money and go on holiday

There are two engineering problems, and one cultural problem:
* Cultural: Incentives for parties creating and maintaining content are diversifying, and are at odds with each other (e.g. more vs less duplicate content)
* Engineering 1: Need to discover repositories: registry-of-repositories (ASPECT). Sharing is the driver: without sharing, no reuse, and content is underutilised. Repositories are not discoverable by Google (Dark Web). Also need evaluation of repositories, what their content is, and how to get access to them. Second, need to make services coming out of R&D sustainable, including identifying business models. Third, need to capitalise on untapped feedback from users, and exchange.
* Engineering 2: Teaching the wrong thing because of lack of connection between teaching, and learning content. Learning content has much context; need to disambiguate this information to improve relations between content and users. Portfolio of assets must be aligned with student needs: need all information in one place. Don't want learning resources to be out of date or inaccurate.
* If you have 1 mill, get 100 users together and observe them, and find out what their requirements are. Don't just survey people, they'll say "Sure", and then not use the new repository because it has the wrong workflows.

Group 4:
* Research. Biggest thing needed is federating the various repository options out there
* Analysis of end user needs: both consumers and producers of content; both need refinement over what is currently available to them
* Systems interfaces to exchange content in multiple systems and multiple ways, including unanticipated uses: open-ended content exchange
* User feedback mechanisms: easier and faster collecting of metadata

Group 5: (My group)
* User Anonymity and security, for
* User Context to
* Drive discovery, which
* Needs data model for user context: typology, taxonomy
Crucial insight is: we're no longer doing discovery, we're going to push out the right learning content to users (suggestion systems), based on all the user behaviour data we and others are gathering—and aggregating it. The Amazon approach on recommending books, finding similar behaviours from other users. Becomes an issue of which social leader or expert to follow in the recommendations. (This is what repositories are *not* already doing: it's the next step forward)
Balanced against security concerns—stuff in firewalls, stuff outside, less reliable stuff in Cloud, etc
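The "Amazon approach" our group gestured at can be illustrated with a toy user-based collaborative filter. This is a minimal sketch: the users, resource names, and usage data below are all invented, and a real recommender would weight by far richer behavioural signals (and by which social leader or expert a user follows, as noted above).

```python
# Toy user-based collaborative filtering over hypothetical usage data:
# which learning resources each (invented) user has accessed.
usage = {
    "alice": {"algebra-intro", "fractions-101", "geometry-basics"},
    "bob":   {"algebra-intro", "fractions-101", "calculus-prep"},
    "carol": {"poetry-workshop", "essay-writing"},
}

def similarity(a, b):
    """Jaccard similarity between two users' resource sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(user, data):
    """Rank resources used by similar users but not yet by this user."""
    scores = {}
    for other, resources in data.items():
        if other == user:
            continue
        sim = similarity(data[user], resources)
        for res in resources - data[user]:
            scores[res] = scores.get(res, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("alice", usage))  # 'calculus-prep' ranks first
```

Because bob's behaviour overlaps alice's, bob's other resource is pushed to her ahead of carol's unrelated material — the push-out-the-right-content pattern rather than search.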

Group 6:
* Not everyone needs a repository: what is it for?
* Life cycle maintenance of content: don't focus on just the publishing stage
* Rethink metadata: too much focus on formal metadata, there's a lot of useful informal user feedback/paradata, much can be inferred
* Rethink search; leverage existing search capabilities: The Web itself is an information environment, explore deeper context-based searches (e.g. driven by competencies)
* What will motivate people to work together? (business drivers)
* Standards: how to minimise the set? (not all are appropriate to all communities)
* Exposing content as a service (e.g. sharing catalogues—good alternative to focus on registries, which is premature)
* Focus on domain-specific communities of practice (DoD business model not applicable to academic, constraints on reuse)
* Look at existing research on Web 2.0 + Repository integration

ADLRR2010: Panel: Vendors

Gary Sikes, Giunti Labs

Publishers are restricted from the repository, and can't see what their content is doing there. ADL can have one repository for publishers outside the firewall, and one to publish into within the firewall.
More middleware use in repositories, web services and some API
User-based tagging (folksonomies) and ratings
Corporate education: providing access to digital markets, making content commercially reusable (resell)
Collaborationware and workflow tools, e.g. version comparison, shared workspaces
Workflows including project management roles and reviewing
Content access reporting: who is viewing, what versions are being viewed
Varying interface to repository by role
Challenges: security (publishers outside firewall, users within the firewall). Defining role-based interface. Interoperability. One-Stop Shops being asked for by client. For new implementations: how metadata deals with legacy data.
Standards also important for future-proofing content

John Alonso, OutStart

They provide tools, not knowledge.
Confusion from vendors: what counts as a repository? Google isn't one (referatory/repository confusion)
If we build it, they will not come; they will only come if it is important to them and has value. If there is too much cost and no return on getting to the content, they will go elsewhere.
The clients are not telling him they want their stuff in the repository exposed and searchable.
Some great successes within the confines of the firewalls --- McDonald's corporate info is exposed well to McDonald's local franchises, motivated by cost efficiency and not mandates.
We welcome standards—that people want to use: they lower the cost of entry. Vendors should not be driving the definitions of standards, they just want the business requirements. The buyers don't understand the standards themselves, they just treat them as checkboxes: buyers should articulate the business value of why they are requiring the standard in the first place: there is no business value to implementing the standard, so it never gets verified—or used.
Repositories vs registries: ppl use the terms interchangeably, hence the confusion. Trend is to abstract search, so that back end repositories can be swapped out. But I shouldn't have to write 10 different custom plugins to do so!

Ben Graff, K12 Inc.

Big problem space, many ways of both defining and slicing the problems
It's expensive to do this right, even if you do agree on the problem space: content design, rights management, content formatting & chunking, metadata creation, distribution strategy
The Return On Investment isn't always immediate
Teachers & Profs needs: Applicability (content at right size for right context), Discoverability (find it quickly), Utility (I can make it work in my environment: teachers are pragmatists), Community (peer recommendations, feeding into peers), Satisfaction (best available), Quality (proven, authoritative, innovative)
Students needs: Relevance (interesting & engaging), Applicability (need help finding right thing right now -- though I may not admit it, and I don't know what I don't know: I'm a novice)
Everyone needs: Simplicity (if it's not easy, I'll walk)

Support & respect content wherever it comes from: better exposure of content, greater availability helps society
Improve discovery through author-supplied metadata, ratings, and patterns of efficacy across an ecosystem of use—what we know by analysing usage.
Demonstrate and educate about ROI at multiple levels: government, business, educator, student
Not everyone will need to, want to, or be able to play along for years to come: keep breaking down barriers
Please have *a* standard, not a different standard for each client! content creation and publishing both become bad experiences: each standard becomes its own requirement set

Eric Shepherd, Questionmark

Author once, schedule once, single results set from a student, deliver anywhere, distributed authoring, management of multilingual translations; blended delivery—paper, secure browsers, regular browsers, transformation on the fly: autosense the target platform for delivery. Often embed Questionmark assessment in portals, blogs, wiki, Facebook.
Need to defer to accessibility experts to see if got accessibility right.
Analytics, once data anonymised, to establish quality and reliability of the software
*a* standard is utopian: different standards are targeted at different problems
Driver should not be the standard but the business need; but a vendor cannot survive without standards

ADLRR2010: Panel: Social Media, Alternative Technologies

Susan van Gundy, NSDL

Cooperation with NSF & OSTP: STEM Exchange
NSDL has existed for a decade; digital library functionality, central metadata repository, community of grantees and resource providers, R&D for new tech and strategies
Aim now is to understand the education impact of these materials; hence the STEM Exchange: what are educators doing with NSDL resources? What knowledge do educators add to the resources, which can be fed back to resource developers?
This is about contextualisation & dissemination. Teachers know what NSDL is by now; now they want to know what other teachers are doing with it
Metadata is limited: labour intensive, expensive; limited use to end users beyond search & browse, though it is still important for content management: metadata is essential but not sufficient
"The evolving power of context": capture context of use of resources
Web service, open API: datastreams from NSDL straight into local repositories; teachers assemble resources online on the local repositories, generating resource profiles; this is paradata being fed back into NSDL (including favouriting of resources)
METANOTE: kudos for correct use of para- in paradata meaning "supporting context, circumstances"; cf. paratext
Generates data feeds of paradata: what others think and do with the resource. Akin to the use of hashtag in capturing usage.
Applies to open reuse subset of NSDL; will integrate into current social networking tools (e.g. RSS)
Now establishing working groups on how this will work
Are looking at folksonomies and pushing that back into NSDL formal metadata
People don't volunteer this data, need incentives: there will be automatic capture of the paradata in aggregate
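A paradata record of the kind being described might look like the sketch below. All field names and values are invented for illustration; NSDL's actual paradata format will differ.

```python
import json

# Hypothetical aggregated paradata record: context of use fed back to the
# library, as distinct from formal descriptive metadata. Captured
# automatically in aggregate, not volunteered per-individual.
paradata_entry = {
    "resource": "http://example.org/resources/photosynthesis-sim",
    "action": "favorited",  # what educators did with the resource
    "context": "grade-7 biology unit on plant cells",
    "count": 42,            # aggregate count across the reporting period
}

feed = json.dumps([paradata_entry], indent=2)
print(feed)
```

A feed of such records is what lets other teachers see "what others think and do with the resource", in the way a hashtag captures usage.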

Jon Phipps, Jes & Co

Interop isn't about using each other's interfaces any more—profusion of standards! Now we need to *understand* each other's interfaces
Linked Data: opportunity to share understanding, semantics of metadata
The 4 principles of Linked Data from Tim Berners-Lee
Jes & Co are making tools to support Master Data: central authoritative open data used throughout a system—in this case, the entire learning community
(Tends to be RDF, but doesn't have to be)
Given that, can start developing relationships between URIs; map understanding across boundaries
This enhances discoverability: ppl agree in advance on the vocabulary, more usefully and more ubiquitously—can aggregate data from disparate sources more effectively (semantic web)
e.g. map US learning objectives to AUS learning objectives for engineering learning resources. Not a common set of standards, but a commonly understood set of standards
RDF: there's More Than One Way To Do It: that's chaos, but not necessarily a bad thing
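The objective-mapping idea can be sketched with plain triples. The URIs and objective codes below are invented placeholders, and skos:exactMatch here stands in for whatever mapping predicate a real vocabulary service would define; a real system would use full RDF URIs.

```python
# Triples as (subject, predicate, object) tuples; in a real Linked Data
# deployment these would be RDF statements with dereferenceable URIs.
EXACT_MATCH = "skos:exactMatch"

triples = [
    # Hypothetical mappings from US learning objectives to AUS equivalents.
    ("us:objective/linear-equations", EXACT_MATCH, "aus:objective-123"),
    ("us:objective/photosynthesis",   EXACT_MATCH, "aus:objective-456"),
]

def map_objective(uri, graph):
    """Follow exactMatch links to find equivalent objectives elsewhere."""
    return [o for s, p, o in graph if s == uri and p == EXACT_MATCH]

print(map_objective("us:objective/linear-equations", triples))
# ['aus:objective-123']
```

Once both communities agree (in advance) on what the predicate means, aggregating and querying across the two jurisdictions' data becomes mechanical — the "commonly understood set of standards" point above.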


Can't really liveblog myself talking; I'm going through my position paper, and I've recorded myself (19.7 MB MP3, 21 mins).

Sarah Currier, consultant

"Nick said everything I wanted to say" :-)
Others have been high level, big strategic initiatives. This is microcosm education community, addressing compelling need of their own.
14 months of purely Web 2.0 based repository with community, "faux-pository", ad hoc repository
How do edu communities best use both formal repositories and Web 2.0 to share resources? How can repository developers support them using Web 2.0
Is a Diigo group a repository? Netvibes is: http://www.netvibes.com/Employability
The community wanted a website powered by a repository (whatever that is; they weren't techo); £40k budget. They went Web 2.0: though repositories were being built that were similar to that need, nothing the community could just jump in and use. (And the repositories that were built don't provide RSS!)
"Must not be driven by traditional project reporting outputs": more important to develop a good site than a project report!
Ppl needed both private and public comms spaces, and freely available software.
Paedagogy, social, organisational aspects of communities have not been involved in repository development, and are the major barriers now.
Everyone thinks their repository is somewhere everyone goes to. You're competing with Email, Google, Facebook: no, the repository is not the one-stop shop, push content to where people actually do go
There is a SWORD widget for Netvibes, but it's still rudimentary
Put edu communities at the heart of your requirements gathering and ongoing planning!
You *must* support newsfeeds, including recommendations and commentary, and make sure they work in all platforms
Need easy deposit tools, which can work from desktop and web 2.0 tools
Allow ppl to save this resource to Web 2.0 tools like Facebook; don't make your own Facebook


ADLRR2010: Breakout Groups: What are the problems we've identified so far with repositories?

Understanding users and user needs: reuse is not as simple as hitting a button on iTunes.
Mindshare: how do you get enough resources to compete with Google, esp. as Google is defining user expectations: standards are left wagging the tail of the vendor dogs.
Complexity of systems, metadata, and policy.
Lack of high-quality metadata and tools.

Discussion mostly on organisational and social issues.
Need for ways for authors to connect to repositories, reuse at point of authoring.
Parochial development --- "not developed here", barrier to reuse.
Difficult to get ppl to create metadata.
Network enclaves, access restrictions
Organisational inertia

Security: identity management
Scale: scaling up
Building repositories: is that the right answer? (what is a repository anyway?) What would repository standards cover?
Are repositories solving yesterday's problems? Do we need more? we don't know yet.
Connectivity between repositories -- virtual silos
User-centric view of repositories
Is reuse relevant driver? Is there authority to reuse? Is content authoritative?
Optimising repositories to kinds of content
Manual metadata is too expensive
Getting discovered content: too hard, too costly
Sharing is key driver for repositories

More Incentives needed for using a repository, rather than more standards. Developing app profiles just as dangerous as developing more standards: they are very time consuming, and difficult to maintain.
Trust: single sign on. Security of data: needs trust, common security model.
Need common terminology, still stumbling on repositories vs registries
Quality assurance and validity of control.
Must focus on community and user requirements before looking at technology or content procurement; this has been a wrong focus.

Organisations may have bought a repository but be unaware of the investment; need registry of repositories.
Every agency builds silo, need mandate to unify repositories.
Holy Grail is reusable data, reusable at every level. Many business challenges to that: how to learn from failed efforts? Outside schoolhouses, difficult to get right, and much harder than it seems.
Search needs to be brokered.
What apps are needed for easy registering.
What models will incentivise innovation but not impede progress?
Bottom-up approach makes it difficult to get a shared vision.
Difficult to set up repository technically. Could use turnkey repositories.
Lack of best practice guides for leveraging repositories, or to get answers to questions from community on how best to do things.

Searching is not finding: may want different granularities, content can be pushed out according to curriculum structures.
Should search be exact or sloppy? Sloppy search is very useful for developing paedagogy.
Process of metadata generation is iterative: the user perspective can be tapped to inform subsequent attempts to search.
User generated and computer generated metadata is better than none.
Interoperability is a problem across repositories (application profiles, granularity). The interoperability layer of a repository is more important than its underlying technology.


We're missing the users as a constituency in this summit! Hard to draw conclusions without them.
We're also missing the big social networking players like Google & Facebook: they're not interested in engaging despite multiple attempts.
We're missing the publishers. Some had been invited...
Repositories' relation to the web: repositories must not be closed off from the web, growing realisation over past 8 years that the Web is the knowledge environment.

No one wants more specs here
There is no great new saviour tech, but some new techs are interesting and ready for prime time
"What's special about learning" in this? How do we differentiate, esp if we go down the path of social media?
Have we addressed the business models of how to make this all work?
When do we have enough metadata? Google works because page ranking is complex, *and* critical mass of users. If could gather and share all our analytic data from all our repositories, and share it, could we supplant metadata and start on machine learning? Open question
Building our own competitor to Google & Facebook, our own social tool: is it such a good idea?
Open Source drives innovation, but the highest profile innovation recently has been the closed-source iPhone. Are things moving towards closed source after all? If so, how do repositories play in the Apple-based world?

ADLRR2010: Tech, Interop, Models Panels:

Joel Thierstein, Rice Uni

Connexions: Rice repository platform. 16k modules, 1k collections
Started as elec eng content local to rice, now k-12, community college, lifelong learning, all disciplines
modularised structure: all content can be reused; more freedom at board of studies level, building on a common core
module makes for more efficient updating
"Lenses": social software to do peer review of content
Permanent versioning -- there will be multiple answers given by the source
CC-BY licensing, can buy hard copy as well as getting online pdf or epub.
Can be customised as platform: local branding; k-12 can zone off content for their own purposes
Want to make it available to the world

David Massart, EU Schoolnet

Federation of 30 learning object repositories in EU
Move content to user, not user to content: so very hard to control user access
Driven by metadata, to access content and arrange access to services
Tech interop: most components are in place -- search protocols, harvest and push protocols, metadata profiles; still need a repository of repositories to discover repositories, with an associated registry description, including autoconfiguration of service access. At most, need to establish best practice.
The problem now is semantic interop: meaningful queries.
Though theoretically everything is LOM, lots of application profiles, so need repositories of application profiles as well. With that done, can start recording each profile's controlled vocabularies, then crosswalks between the vocabularies, then transformations from one application profile to another.
ASPECT project is trying to do this all now: vocabulary bank, crosswalk, transformation service; trying to work out what would go into an application profile registry.
Dramatic knowledge building: some national repositories were not even up on LOM at the start
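The vocabulary-bank / crosswalk / transformation pipeline Massart describes can be sketched as a simple lookup chain. The profile names and vocabulary terms below are invented; a real crosswalk service (such as ASPECT's) would operate on registered controlled vocabularies.

```python
# Hypothetical crosswalk between two application profiles' subject vocabularies.
crosswalk = {
    ("profile-A", "Mathematics"):      ("profile-B", "Maths"),
    ("profile-A", "Natural Sciences"): ("profile-B", "Science"),
}

def transform(record, source_profile, target_profile):
    """Rewrite a record's subject term from one application profile to another.
    Terms with no crosswalk entry are passed through unchanged."""
    key = (source_profile, record["subject"])
    mapped_profile, mapped_term = crosswalk.get(key, (source_profile, record["subject"]))
    if mapped_profile == target_profile:
        return {**record, "subject": mapped_term}
    return record

rec = {"title": "Intro to Algebra", "subject": "Mathematics"}
print(transform(rec, "profile-A", "profile-B"))  # subject becomes 'Maths'
```

With vocabularies registered per profile, crosswalks recorded between them, and transformations built on the crosswalks, a query phrased in one profile can be answered by a repository that uses another.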

Valerie Smothers, MedBiquitous

MedEdPortal: not just learning objects, but learning plans, curricula: structure.
They routinely partner with other repositories. This has had blockers: no standard for packaging content (IMS not applicable to them.)
Peer review, and professional credit for submissions; but this means reviewers need to access files, different downloads every week.
Taxonomies are big in medicine, but don't cover medical education well.
They need fed search into MedEdPortal from other collections; they are reluctant to import other collections or refer out to them, because of how stringent they are.
LOM is profiled. Tracking reuse, and identifying reasons for reuse. Off the shelf products don't support profiles.
Interest in harnessing social networking, and Friend Of A Friend information.

Christophe Blanchi, CNRI

Identifiers are key to digital infrastructure. IDs have to be usable by systems as well as humans, and provide the client what they need in different contexts.
Identifiers are often not interoperable. Syntactic interoperability has been addressed with standards; the problem now is different communities using different, non-native identifiers. Semantic interoperability: how to tell whether they mean the same thing? Functional interoperability: what can I do with the identifier? You don't always know what you'll get when you act on the identifier. Community interoperability: policy, the site of the most silo'ing of identifiers. Persistence: interoperability with the future.
Want to provide users with granular access. Recommendation: identifiers should provide user a glimpse of what the resource is. Identifiers resolving to self-defining descriptions. Identifiers must be polymorphic. Identifiers must be mapped to their intrinsic behaviours (typing, cf. MIME).

ADLRR2010: Repository Initiatives

Dan Rehak

Registries and repositories.

Dan and others have been drawing pictures of what systems for content discovery should look like.
So what? People don't understand what these diagrams communicate.
Underlying all this are: models. User workflow models. Business models. Service models. Data models. Technical model. The models interact.
Try to constrain the vocabularies in each of the models.
Needs: provide discovery, access, delivery, management; support user expectations, diverse collections, policies, diverse tech, scaling.
Do we want the single Google of learning? Do we want portals? Do we want to search (and sift through), or rather to discover relevant content? Social paradigm: pushes content out.
How to get there? People do things the web 2.0 way. (Iditarod illustration of embrace of web 2.0.)

Panel: Initiatives.

Larry Lannom, CNRI.

ADL did interoperability by coming up with SCORM. Registry to encourage reuse of content within DoD: content stays in place, persistent identification, searchable.
The registry works, although policy took a lot of negotiation. The tech has been taken up in other projects: GENI, M-FASR, a commercial product currently embargoed.
Problems: limited adoption. Not clear short-term gain, metadata is hard and expensive, reuse presupposes right granularity of what to reuse.
Tech challenges: quality metadata: tools to map to required schemas, create metadata as close to creating content as possible. Federation across heterogeneous data sets, including vocabulary mapping -- intractable as there are always different ways of thinking about the world, so need balance between system interop and semantic incompatibility. Lots of tech, but still no coherence.
Future: Need transparent middleware to ingest content. Need default repository service for those who don't have one. Gaming & virtual worlds registry. Internationalisation. Simple metadata for more general use. Need turnkey registry for easier deployment. Need to revisit CORDRA.
Difference between push and pull is implementation detail, should be transparent to user.

Frans van Assche, Ariadne Foundation.

GLOBE: the largest group of repositories in the world.
Ariadne: a federation. Services: harvesting, automatic metadata generation, ranking. Six expert centres count as a success. Lots of providers in the federation.
Problems: exchange between the GLOBE partners (there are 15) means an n² connection matrix, plus language problems. Need a central collection registry, rather than have everyone connect to everyone.
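The arithmetic behind that complaint, as a quick sketch: a full mesh among n partners needs n(n-1)/2 links, while a central registry needs only n.

```python
def full_mesh_links(n: int) -> int:
    """Every partner connects to every other partner."""
    return n * (n - 1) // 2

def hub_links(n: int) -> int:
    """Every partner connects once to a central collection registry."""
    return n

# With the 15 GLOBE partners:
print(full_mesh_links(15))  # 105
print(hub_links(15))        # 15
```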
Ariadne is a broker between providers; still need to engage end users.
Tech challenges: scaling up across all of GLOBE. Ministries had been disclosing very small numbers of resources; now they are deluging them.
Need to serve users better, with performant discovery mechanisms; broken links, duplicates, and ranking are particular problems in a federation. Alternative knowledge sources such as SlideShare and iTunes U mean you can't get away from federated search.
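One way to handle duplicates and broken links when merging federated results (a sketch under assumed inputs, not Ariadne's actual algorithm): round-robin across each federate's ranked result list, skipping URLs already seen or known to be broken.

```python
def merge_results(sources, broken=frozenset()):
    """Round-robin merge of per-source ranked URL lists.

    sources: list of lists of URLs, each ordered best-first by its federate.
    broken:  set of URLs known to be dead links.
    """
    seen, merged = set(), []
    for rank in range(max(map(len, sources))):
        for source in sources:
            if rank < len(source):
                url = source[rank]
                if url not in seen and url not in broken:
                    seen.add(url)
                    merged.append(url)
    return merged

print(merge_results(
    [["a", "b", "c"], ["b", "d"]],
    broken={"c"},
))  # ['a', 'b', 'd']
```

Round-robin preserves each federate's own ranking without needing comparable scores across federates, which is exactly what a federation usually lacks.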
Need social metadata, but will have to wait until basic infrastructure in place.
Ultimately want discovery uniquely tailored to user needs.
Multilingual issues pressing in Europe, need mappings between vocabularies: managing 23 languages is difficult.

Sarah Currier, consultancy.

UK higher-ed repositories, CETIS. Policy and community analysis around repositories.
This was the first time they reached the broad community, not just the early adopters: the work reflects what users needed and their sense of community, and it got non-techie users from a Web 1.0 to a Web 2.0 mindset on how to use and reuse resources. None of the funding went into tech (which is good).
Their success is the end users; but often the repository content could not be exposed via Netvibes or widgets, which shocked her. Lots of work by a small group of people, so Tragedy of the Commons; hard to retain engagement with some users, though tech this time was not the barrier.
"Fly under the radar": IP, metadata profiles, tech -- got quick outcomes because didn't have to bother with that; the cost is, no influence on repository policy to get them to play along.
Still need to start from users (a wide range); what we currently have online in Web 2.0 is very user-friendly. Web 2.0 tools are mostly interoperable and can be backed up, so sustainability is not as much of an issue as it used to be. Lack of interoperability from repositories to Web 2.0 is still major trouble; until DuraSpace provides Web 2.0 feeds, you can't build on it.
This is not creating own Facebook on top of Fedora: this is about using existing tools on top of Fedora.

Thornton Staples, DuraSpace

Durability goes hand in hand with distribution. DuraCloud is their move into the cloud space, providing trust there.
Fedora, DSpace, Mulgara triplestore.
Fedora is used around the world, now including government agencies with open data. People are now using Fedora in interesting ways: not just as archives, but as graphs of interrelated resources, relating also to external resources.
Fedora is no longer grant-funded, but a self-standing open source project.
Problem: communicating what Fedora is and is intended to do, so people just expected their own shiny object out of it. Fedora is a complicated product, sitting between library timescales and IT timescales; a user-oriented app should have been put out much earlier than the base infrastructure. It took much longer to happen (only in the past couple of years), and that blocked adoption.
Tech challenge: scaling. The number of objects in a repository affects access, discovery, etc.; the size of objects also matters. The Data Conservancy is pushing the limits of Fedora: they are adding new kinds of datastreams to deal with such data more effectively.

Jim Martino, Johns Hopkins Data Conservancy

NSF funded. Data curation as means to address challenges in science.
Came about from astronomers wanting to offload data curation onto library. Has broadened in coverage and use.
Driven by the complex needs of science and disparate data sets. Will analyse how data is used, including when not to preserve data.
Data is getting more sizeable.


ADLRR2010: US Govt Perspectives

Paul Jesukiewicz, ADL

Lots of tech, but not a lot of uptake. There are lots of approaches out there to take stock of. Administration: we still don't have good ways of finding content across government portals. Need systems that can work for everyone and for varied needs, which is difficult.
Under the previous administration there was not a lot of inter-agency collaboration; that is now happening again.
White House wants to know where things are up to; lots of money for content development & assessment. "Why not Amazon/iTunes/Google experience?"
The technical side is further along than the policy side. There is a push to transparent government, so open; but the system must support both closed and open content.
Will have to have system of systems, each system dealing with different kind of requirements.

Karen Cator, Dept of Ed

National Edu Tech Plan. Move to digital content.
* Learning: largest area, creating engaging and ubiquitous content.
* Assessment: embedded, multiple kinds including simulations; needs context such as "what's next", discoverable, should be ultimately pushable to student.
* Teaching: how to make teachers more effective, making sure they're connected to data and experts.
* Infrastructure: broadband everywhere, mobile access.
* Productivity: cost efficiencies day to day. Personalised learning is very participatory.

States are collaborating on standards; this is a microcosm of what is possible.

Bonus section: R&D. What more needs to be invented? Textbooks addressing the full range of standards, not just the easy-to-test ones. Content interoperability and aggregation.

Student Data interoperability others are working on, including data anonymisation; but content interop is expedient priority for them now.

Open Source: the world is using it so we have to.

Teacher portals are all ad hoc; priority to get content interop there. New business models can arise given interoperable content, but this needs open models.

Content will have to come from everywhere—globally.

Frank Olken, NSF

Works on: Knowledge integration, semantic web, data mining.
National Science Digital Library: a long-term program. Now built on Fedora, over RDF, with the Mulgara triplestore.
RDF enables faceted search, because multiple hierarchies are possible over same resource.
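A toy illustration of that point (resource names and predicates invented here): because each triple places the same resource in a different hierarchy, faceted search is just an intersection over predicate/value pairs.

```python
# Invented RDF-style triples: each resource sits in multiple hierarchies
# (a subject hierarchy and an educational-level hierarchy) at once.
TRIPLES = [
    ("res:lesson1", "dc:subject", "Physics"),
    ("res:lesson1", "edu:level", "Secondary"),
    ("res:lesson2", "dc:subject", "Physics"),
    ("res:lesson2", "edu:level", "Primary"),
]

def facet_search(**facets):
    """Intersect the resources matching each facet (predicate suffix, value)."""
    hits = None
    for pred, value in facets.items():
        matching = {s for s, p, o in TRIPLES if p.endswith(pred) and o == value}
        hits = matching if hits is None else hits & matching
    return hits or set()

print(facet_search(subject="Physics", level="Secondary"))  # {'res:lesson1'}
```

With a single fixed hierarchy, a resource could live in only one branch; triples let it be reached from any facet.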
Big vocabularies (especially in the medical field) are being built with description logics and OWL; NSF is not currently using them. RDF has been maturing quickly; the description logic engines and the rule systems are less mature, but the most important part of all of them is the conceptual map.
Most work on the semantic web is in Europe through EU support; some US work is being commercialised, but there is not much US support for logic-based approaches.

Can users contribute to the taxonomy (i.e. a folksonomy)? They are doing research on turning folksonomies into rigorous taxonomies: open research over the past two years, but no smashing success so far. See the NSDL metadata registry project.
Mappings between taxonomies need to preserve order, to keep hierarchies internally consistent; this is active research.
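A sketch of what the order-preservation requirement means (the taxonomies and names are toy examples, not NSDL's): every parent/child edge in the source taxonomy must map to an ancestor relation, or to the same node, in the target.

```python
def ancestors(node, parent):
    """All ancestors of a node in a taxonomy given as a child->parent map."""
    out = set()
    while node in parent:
        node = parent[node]
        out.add(node)
    return out

def order_preserving(src_parent, tgt_parent, mapping):
    """True iff every source parent/child edge maps to an ancestor
    relation in the target (nodes are allowed to collapse together)."""
    for child, p in src_parent.items():
        if mapping[p] not in ancestors(mapping[child], tgt_parent) \
           and mapping[p] != mapping[child]:
            return False
    return True

src = {"dog": "mammal", "mammal": "animal"}
tgt = {"Canine": "Mammalia", "Mammalia": "Animalia"}
good = {"dog": "Canine", "mammal": "Mammalia", "animal": "Animalia"}
print(order_preserving(src, tgt, good))  # True
```

An inverted mapping, e.g. sending "dog" above "mammal" in the target, would fail the check; that is the internal inconsistency the talk warns about.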

ADLRR2010: Notes from ADL Learning Content Registries and Repositories Summit

The following posts are notes from the ADL Learning Content Registries and Repositories Summit, Alexandria VA, 2010-04-13 to 2010-04-14 (ADLRR2010).

[EDIT: ADLRR2010 series of posts]