2008-06-19

TILE Workshop

TILE Reference Group Meeting

Phil Nicholls in attendance (who has already named me as an "acolyte of Kerry"); he has produced SUMs for the Library 2.0 requirements of data mining for User Context data (strip-mining logs, I guess), and identifying content as relevant to a given user context (correlating courses to reading lists and library loans). In other words: how to extract user context from users like me, to recommend content to me; and how to datamine that user context into existence in the first place. Reading lists, loans records, enrolments, repository logs, user feedback.

Larger context: the Read Write Web: recommendation engines --- a key tech in 2008 on the web. The space for TILE is already populated by non-library providers. The domain is responding to the requirement: libraries talk to Amazon, have borrowing suggestions at Huddersfield. MESUR project has farmed ginormous amounts of data on loans and citations. Higher ed libraries have huge amounts of data that can be capitalised on for resource discovery, and which is comparatively well defined.

Going e-framework to get synergies with the other domains e-framework is in (e-learning, e-research).

Tension of e-framework specificity and leveraging/reuse vs. "constant beta", flexible software development, which is contra rigid specs. Need to experiment for a length of time before fixing things down in e-framework. The approach seen as more questionable at a local institutional level than in a national context.

Need shared vocab, not just shared software, to move forward in the library field --- enable dialogue between participants in the national context; e-framework can help build up the vocabulary again.

Sidestepping researcher identity as feeding into this: too hard for now (not familiar with the domain), quite diverse in interface and sparsely populated. The student data is rich and uniform, so working with that as a priority.

Pain points: why isn't your uni library catalogue already like Amazon? How do you get the bits of the uni to talk to each other to deliver this? Why, and when, do people want an Amazon experience on their library catalogue?

Students already compare notes informally about their reading, which *might* motivate this kind of recommendation structure. But libraries are worried about data privacy; and US are even more touchy. Data will be anonymised; but Student Admin will ask questions once data is aggregated by individual subject, let alone grades awarded.

Peak use of loan recommendations in the existing prototype (Huddersfield) is a month after start of term, when students start exploring beyond their prescribed reading lists.
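
To make the loan-recommendation idea concrete for myself: a minimal sketch of "people who borrowed this also borrowed", counting how often two items turn up on the same borrower's record. This is not the Huddersfield implementation, which wasn't described in any detail; the function names, the `(borrower_id, item_id)` input shape, and the assumption that borrower ids have already been anonymised are all mine.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_loans(loans):
    """Count, for each pair of items, how many borrowers took out both.

    `loans` is an iterable of (borrower_id, item_id) pairs. The ids are
    assumed (hypothetically) to be anonymised before they reach this code.
    """
    items_by_borrower = defaultdict(set)
    for borrower, item in loans:
        items_by_borrower[borrower].add(item)

    co_loans = defaultdict(Counter)
    for items in items_by_borrower.values():
        # Every unordered pair on one borrower's record counts once.
        for a, b in combinations(sorted(items), 2):
            co_loans[a][b] += 1
            co_loans[b][a] += 1
    return co_loans


def recommend(co_loans, item, n=3):
    """Return up to n items most often borrowed alongside `item`."""
    counts = co_loans.get(item, Counter())
    return [other for other, _ in counts.most_common(n)]


if __name__ == "__main__":
    sample = [
        ("u1", "A"), ("u1", "B"),
        ("u2", "A"), ("u2", "B"), ("u2", "C"),
        ("u3", "A"), ("u3", "C"),
        ("u4", "A"), ("u4", "B"),
    ]
    print(recommend(build_co_loans(sample), "A"))
```

Raw co-occurrence counts favour items that everyone borrows anyway (the prescribed reading), so a real service would presumably need some weighting to surface the "beyond the reading list" material noted above.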

Reading lists are useful inputs, but not necessarily useful outputs: they are fixed by academics for the once.

e-portfolio a more important parameter for driving this tech than transitory external social networks like Facebook.

Contexts are multiple: can be institutional as well as individual ("what are our students reading?"), and people have multiple identities (Facebook vs. enrolment record): context needs to be tied down, to work out what to harvest. Students are also enrolled in more than one institution! If recommendations should be driven by a learner-centred approach, then the learner should have control of how their recommendations are used.

--- But if we just throw information into the open, without prescribing context, then contexts will form themselves around what data is available: users will drive it. (Web 2.0 thinking: no prescribed service definition, but data-centric driving.)

Systems need to be able to capitalise on this data to improve e.g. discovery (clustering).

***

Architectures of participation: the efforts of the many can improve the experience of the individual. Need to articulate benefits to users to motivate them to crowdsource.

Deduplication is key to users of catalogues. But surely that shouldn't mean JISC implements its own search engine?

OPACs not very good for discovery (no stemming, spellcheck); don't do task support (e.g. suggest new search terms).
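
The "suggest new search terms" kind of task support can be roughed out with nothing beyond the Python standard library: a sketch that offers near-miss corrections for a mistyped query by fuzzy-matching it against a list of known catalogue terms. The `suggest_terms` name and the toy vocabulary are my own assumptions, not anything presented at the meeting, and this covers spelling suggestions only, not stemming.

```python
import difflib


def suggest_terms(query, vocabulary, n=3, cutoff=0.6):
    """Return up to n vocabulary terms that closely resemble the query.

    `vocabulary` stands in for whatever index of subject headings or
    title words a catalogue could expose (a hypothetical input here).
    Results come back best match first.
    """
    return difflib.get_close_matches(query.lower(), vocabulary, n=n, cutoff=cutoff)


if __name__ == "__main__":
    terms = ["chemistry", "chemical", "history", "biochemistry"]
    print(suggest_terms("chemsitry", terms))
```

`difflib.get_close_matches` ranks by sequence similarity only; it knows nothing about what other users searched for next, which is where the harvested context data would come in.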

Impediments: control, cultural imperatives; user base --- include lifelong learners?; trust, data quality (tag noise); data longevity; task/workflow support (may not support full workflows, which are not well understood, but can support defined tasks); cost; granularity of annotation target.

Unis are already silo'ing in their learning environments (Blackboard), and that's where they put their reading lists: how do you get information outside the silo?

JISC build a search engine? No, JISC get providers to open up their data, so the existing open source etc. efforts can inform their own search engines with the providers' contextual information.
