A page to record development and release issues from the Collections Strand. This is a section of the Phase2 Collections strand synthesis.
Unless otherwise stated, all references are to project final reports.
How are different means of making OERs discoverable within disciplines effective?
The aim of the collections strand was to enhance the discovery and reuse of OER materials by building collections of materials around particular thematic areas. All projects were required, as a condition of funding, to provide a static collection of resources at first year undergraduate level, and a dynamic collection, automatically collected from a number of sources, at undergraduate years 2 and 3 and at postgraduate level.
Appropriate dynamic system
All collections projects required a system that would support their dynamic collections. The majority used Wordpress because of its open ethos (C-SAP), strong developer community (C-SAP, Triton), usability (C-SAP) and large existing user base (Delores) C-SAP: Methods Triton: Politics Inspires Delores
- “WordPress is being considered as a front-end because of its open source origins, ease of use and wide popularity. The structure of WordPress as a CMS and the ease of adding OER ‘widgets’ is of particular relevance together with its reliance on RSS feeds that facilitate use with dynamic collections.” (C-SAP interim report)
- "Wordpress blog offered good combination of web2.0 features and accurate information retrieval " (C-SAP final report)
- Wordpress was chosen for plug-in architecture which allows relatively easy extension of its functionality (Delores final report)
- The Oerbital project, who were hosting their dynamic collection in mediawiki, explored using a Wordpress blog in addition, attracted by its syndication opportunities, but found little use for it
- Mediawiki was used by the Oerbital project (Oerbital)
- OF (GEES) has developed its own FERC system with map interface
Using a hook to draw users in
Pull (search) and push discovery
- Projects generally found that search (ie. pull techniques) was very time consuming for little reward.
- Search for OERs is very difficult to automate because of the lack of standardisation and machine readability of licence information, and the lack of explicit badging as OER (even MIT opencourseware). “...for full realization of the potential of OERs in general, adoption by providers of a more standard way of representing information about a resource would be beneficial. The data encoded should be machine-readable and at least include information that explicitly identifies the material as being an OER, the legitimate use of the resource based on licensing information and, crucially, various dimensions of the resource relating to subject, syllabus, curriculum and such like. Examples of such data can be seen encoded in the classifications used in Delores Selections and Delores Extensions.” (Delores final report)
- "API and metadata issues mean dynamic collection does not return the “google-like” relevance that users expect " (Triton final report)
- Search tools need to be appropriate
- Google-like search tools and results: google is a more effective tool than OER repositories for finding teaching materials (C-SAP user survey report). Frustration with OER repositories because of lack of relevant results and a host of usability issues “In the survey, Google and Google Scholar were cited by 81% and 75% of respondents respectively. This was followed by YouTube (40%) and Wikipedia (32%). Intute was the only specifically academic site mentioned by more than 20% of respondents.”(C-SAP final report)
- Search engine optimisation to improve visibility of blog: “The blog has had a number of features added to improve the visibility in Google including improved URLs that reflect the blog posts, clear use of categories and exposure of as much metadata as possible.” (Triton final report)
- For push discovery standards are also needed, eg. standard rss descriptions of OER. The Delores Project devised an xml schema for rss from its sux0r Bayesian filter into Waypoint software. Setting up sux0r for Delores Extensions
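As an illustration of the kind of machine-readable RSS description argued for above, the following minimal Python sketch builds a feed item that states OER status, licence and subject explicitly. It is not the schema devised by the Delores Project; the use of a plain "OER" category as a badge and the helper names `build_oer_item` and `is_explicit_oer` are assumptions made for this example. The Dublin Core namespace is a real, widely used one.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace
ET.register_namespace("dc", DC_NS)


def build_oer_item(title, link, licence_uri, subjects):
    """Build one RSS <item> that carries OER status, licence and subjects."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # Assumed convention for this sketch: a plain category badges the item as OER.
    ET.SubElement(item, "category").text = "OER"
    # Licence given as a URI so a harvester can compare it without parsing prose.
    ET.SubElement(item, f"{{{DC_NS}}}rights").text = licence_uri
    for subject in subjects:
        ET.SubElement(item, f"{{{DC_NS}}}subject").text = subject
    return item


def is_explicit_oer(item):
    """Return True if the item is badged as OER and names a CC licence URI."""
    categories = [c.text for c in item.findall("category")]
    rights = item.findtext(f"{{{DC_NS}}}rights", default="")
    return "OER" in categories and "creativecommons.org/licenses/" in rights


item = build_oer_item(
    "Gear design notes",
    "http://example.org/gears",
    "http://creativecommons.org/licenses/by/3.0/",
    ["gear", "machining"],
)
print(ET.tostring(item, encoding="unicode"))
```

A harvester receiving such items could filter on `is_explicit_oer` before any subject classification, which is the step the projects reported having to do by hand.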
Ensuring a critical mass of resources
However good the search facilities, a critical mass of potential resources is needed, otherwise search results are sparse and users revert to google.
- OF (GEES): Non-OERs, indicated by traffic light system: “There are not many “open” resources currently. It would be remiss to ignore very good resources that are publicly available but not under a cc licence + we’d only have a small number of pins on the map. By broadening the scope we can engage colleagues who are not aware of OER and CC and also provide a collection that might be interesting enough / have enough critical mass for our community to want to look after and keep contributing to” (OF (GEES) final report) traffic light system: sample FERC results showing traffic lights in use
- C-SAP : reviewers encouraged to review “grey OERs” as well as OERs. Template for reviews has a place for discussing the licencing information
- "In order for OERs to be ‘discovered’ it seems that far more attention needs to be given to making full use of these ‘grey OERs’ both for the value they have and as a means of directing people to explicitly OER materials.” (C-SAP interim report)
- Delores: work with and support authors in repositories without clear CC licences, to provide these (eg. SEED Curriculum; CDEN-Ryerson University )
Feed in and Feed out
Projects have developed automatic feeds in to their sites, to keep them up to date and fresh, drawing users in by acting as an information portal rather than just a repository.
- OER ‘widgets’ for Wordpress (Triton, C-SAP) Triton picture finder
- Podcasts feeder for Wordpress, developed by Triton project, which automatically selects most relevant podcast to blog currently being read
- Newsfeed for Wordpress, developed by Triton project: “Initial feedback from site users indicated a desire for up to date news from various newspapers. The newsfeed block of code brings in several RSS Feeds and modifies them to present them on the site. This helps to keep the site content fresh” News sites chosen in response to student feedback (Triton final report)
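The newsfeed idea (bring in several RSS feeds and present them together) can be sketched roughly as below. The Triton code itself ran inside Wordpress and is not reproduced here; this is a Python illustration only. The function names are hypothetical, the feeds are passed in as already-downloaded text, and every item is assumed to carry an RFC 822 `pubDate` with an explicit time zone.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def parse_items(rss_text, source):
    """Extract title, link and publication date from one RSS document."""
    root = ET.fromstring(rss_text)
    items = []
    for node in root.iter("item"):
        published = node.findtext("pubDate")
        if published is None:
            continue  # undated items cannot be ordered, so they are skipped
        items.append({
            "source": source,
            "title": (node.findtext("title") or "").strip(),
            "link": (node.findtext("link") or "").strip(),
            "published": parsedate_to_datetime(published),
        })
    return items


def merge_feeds(feeds, limit=10):
    """Merge several feeds (name -> RSS text) into one list, newest first."""
    merged = []
    for source, rss_text in feeds.items():
        merged.extend(parse_items(rss_text, source))
    merged.sort(key=lambda entry: entry["published"], reverse=True)
    return merged[:limit]
```

Keeping the source name on each entry lets the page show where a headline came from, which matters when the feeds were chosen in response to student feedback.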
Projects have developed feeds out from their sites, raising awareness and placing OERs where stakeholders are
- automatic broadcasting of new blog posts through twitter and facebook (Triton)
- developing [mainly user controlled] widgets for rss feeds out from resource descriptions on static site for dissemination to other interested sites (Delores)
Consistent and appropriate metadata
- Provision of consistent metadata – use of a social sciences taxonomy from the NCRM to classify resources (C-SAP "Tags, Taxonomies and the Collections Site")
- Metadata that meets user needs, as well as those of repository designers and curators: “Our expert workshop and survey both identified a need amongst methods teachers for clearly defined criteria so that they know who has created a resource, when, which level it is appropriate for, and a description of what the resource is. A central theme of the Collections Project has been how to adopt meta-data that reflects user-needs as opposed to the technical perspective of the repository designers....For now we have to continue attempting to balance the demands of a consistent searchable database with the meta-data users are willing and able to contribute.” C-SAP " the perils of describing digital resources"
- Issues of providing consistent metadata, especially how to describe video resources; existing guidelines are designed for conventional video rather than the type found on YouTube. Much metadata on sites such as YouTube applies to the uploader, rather than the originator. C-SAP "the perils of describing digital resources"; C-SAP "Notes on Visibility of Licensing in video uploading sites"
What issues arise in linking social technology-based marketing/community portals to resources from a number of institutions
Attribution and quality status
- Making the original attribution, and the quality-checked (or not checked) status, of collected resources clearly visible
- C-SAP user testing showed that the Politics Inspires/Triton site was liked by users, “However, there was an (inaccurate) assumption that because of the Oxford/Cambridge affiliation that the OER resources generated through the Xpert widget had been reviewed by the institutions” (C-SAP final report)
- "OER sites that stressed the ease of which it was possible to submit materials caused a general lack of trust amongst those unfamiliar with the concept of submitting open materials. In particular the slogan ‘Creating content in Connexions is as easy as 1, 2, 3’, caused both amusement but also mistrust of the site. One participant commented, ‘you could be anybody’. In discussion it was suggested that information about submitting resources should not be on the homepage." (C-SAP expert group in final report)
- Politics Inspires page showing link to author profile
Different OER release models among partners/institutions/repositories
Oerbital: different OER release models among partners: Oerbital collection Oerbital community discussion of repositories
Development of a common style
Evolution of common style among partners from different institutions (for blog entries describing a resource collection) (Oerbital final report) Examples of Delores common style in use
Negotiating access rights
Negotiating access rights to the blog hosted by one institution but with partners from another. The solution was for an access policy generated by an oversight group comprising members from both institutions setting clear standards and guidelines, with access rights managed by Wordpress (Triton final report appendix 3) Triton Statement of purpose Triton Comments policy Triton Editorial guidelines Triton Legal Notice Triton Privacy Policy Triton Accessibility Policy Triton Takedown Policy
What issues arise in collecting and making OERs available dynamically?
The majority of issues projects encountered in collecting and making OERs available dynamically centred around the lack of consistent metadata in existing resources, provision of consistent metadata about resources on collections sites, the variety of repository APIs, and lack of clear or any licensing information attached to resources.
Metadata creation and exposure for enabling discovery
Collection – locating suitable resources
- Automatic collection
- issues noted by Delores project in final report:
- Limited basis for and implementation of standards for OER provision
- No basis for standardization of means of automatic exchange of OER material.
- Inappropriateness of standard library approach to the provision of OERs for the purposes of easy discovery
- difficult because of limited amount of machine-readable information provided: "The [user] experience suggested that using general purpose search tools (e.g. Google) in site-specific searches restricted to URLs containing OERs could be a more fruitful approach. ... More generally, a repository of spiderable URLs at which OERs may be found would be a very valuable resource." " ... most successful approach... using bespoke Google search scripts to encode a set of search terms. These terms are case specific; for example, in the Delores Project discovery was attempted using such terms as: gear, machining and other keywords more or less specialized to engineering and engineering design. ... Using this approach had the added benefit of providing all the returns in a single defined format, making subsequent processing straightforward." Once a resource is found, it has to be inspected (manually) to discover the schema used for the resource description, and a script then written to extract resource data automatically from that repository. Scripts were written for MITOCW, OER Commons, and JORUM. (Delores final report)
- C-SAP custom search
- Automatic collection would be easier if resources were provided in xml or other format (eg. rdf) intended for machine reading (MIT provides in xml) (Delores final report)
- An easy first step would be to label all OERs explicitly as OERs (even if only in text) (Delores final report)
- Interrogating xml output from open repositories: "XML resource information that contains fieldwork terms is added to the collection. Next the FERC technology establishes whether the fieldwork resource is location specific or not. Location is looked for in two ways. First the description is checked to see if it contains decimal degree coordinates in square brackets . If it does the pin appears on the map at this point. If no coordinates are detected then a second method employs geoparsing" (OF (GEES) final report, OF (GEES) blog on geoparsing )
- "Multiple sources are searched using PHP to query APIs or access and then cache RSS Feeds. Sources are grouped into tabs and then presented accordingly to the page viewer." (Triton final report)
- Broken links returned from aggregation services (Triton final report)
- Deduplication needed (commented upon in Delores final report and evident in use of Xpert search widget by Triton, though not commented on explicitly in Triton final report)
- Difficulty of finding appropriate (manual) search terms to locate relevant OERs (OF (GEES) final report)
- Usability issues
- Lack of relevant results (and appearance of irrelevant ones in search returns) (C-SAP final report)
- Interface difficulties – users wanted filtering options rather than forced choice at outset – filters such as
- Tag clouds popular (C-SAP final report)
- user frustration if had to download from repositories rather than getting resources directly from a web link, and user suggestion of going direct to author rather than via repository: user comment, “Need to be able to scan resources quickly and not have to download” “Suggest contacting authors of JO resources in order to make direct links not using JO” (OF (GEES) final report)
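The first of the two location checks described in the OF (GEES) quote above (decimal degree coordinates in square brackets, with geoparsing as the fallback) might look something like the following Python sketch. The exact bracket format used by FERC is not given in the source, so the `[lat, lon]` pattern here is an assumption, as are the function names; the geoparser is left as an optional callable rather than implemented.

```python
import re

# Assumed format: "[lat, lon]" in decimal degrees, for example "[54.5, -3.15]".
COORD_PATTERN = re.compile(
    r"\[\s*(-?\d{1,2}(?:\.\d+)?)\s*,\s*(-?\d{1,3}(?:\.\d+)?)\s*\]"
)


def find_coordinates(description):
    """Return (lat, lon) if the description contains bracketed coordinates."""
    match = COORD_PATTERN.search(description)
    if match is None:
        return None
    lat, lon = float(match.group(1)), float(match.group(2))
    # Reject bracketed number pairs that cannot be a position on Earth.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return lat, lon


def locate(description, geoparse=None):
    """Try explicit coordinates first, then fall back to a geoparser if given."""
    coords = find_coordinates(description)
    if coords is not None:
        return coords
    if geoparse is not None:
        return geoparse(description)  # hypothetical place-name lookup
    return None
```

A resource for which `locate` returns `None` would be treated as not location specific and so would not receive a pin on the map.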
Collection - API issues
- Variety/lack of APIs of different aggregation services: There are many aggregation services we can use to bring in content to Politics in Spires. Often these aggregation services are without API (Folksemantic, OER Commons, MERLOT) and as such we need to fake requests to generate RSS content which we can then search. Services with APIs (Xpert and Jorum) present a problem by having different approaches to querying. Xpert can be searched for a keyword, whereas JORUM has a mandated list of keywords to be searched against. Xpert sadly was the slowest API, which did lead to issues with timing problems. This all demonstrates that to query five repositories you would need at minimum three pieces of code. It is likely aggregators in future will increase this language diversity, and so complicate algorithms. At present it is useful to have a service which can alter the metadata returned (such as with the flickr API), but effectively proprietary API formats do not assist (Triton final report)
- Filtering options for extracting dynamically for Jorum not yet available – may be solved by the new Jorum API but not able to test within the project timescale because of ongoing API development (OF (GEES) final report)
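The point that each style of service needs its own piece of code can be illustrated with a small adapter sketch in Python. The two classes below are in-memory stand-ins, not clients for Xpert, Jorum or any other real service: one accepts any free-text keyword, the other only accepts terms from a fixed list, so search terms must first be mapped onto that list. All class, parameter and field names are hypothetical.

```python
class FreeKeywordSource:
    """Stand-in for a service that can be searched with any keyword."""

    def __init__(self, name, records):
        self.name = name
        self._records = records

    def search(self, term):
        term = term.lower()
        return [r for r in self._records if term in r["title"].lower()]


class ControlledKeywordSource:
    """Stand-in for a service that only accepts a mandated list of keywords."""

    def __init__(self, name, records, allowed, synonyms):
        self.name = name
        self._records = records
        self._allowed = allowed      # the mandated keyword list
        self._synonyms = synonyms    # free-text term -> mandated keyword

    def search(self, term):
        keyword = self._synonyms.get(term.lower(), term.lower())
        if keyword not in self._allowed:
            return []  # the term cannot be expressed in this vocabulary
        return [r for r in self._records if r["keyword"] == keyword]


def search_all(sources, term):
    """Query every source through the same interface and tag each result."""
    results = []
    for source in sources:
        for record in source.search(term):
            results.append({"source": source.name, **record})
    return results
```

The calling code stays the same as sources are added, but each new querying style still costs a new adapter, which is the maintenance burden the Triton report describes.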
Collection - metadata issues
- Variety of metadata from different aggregators:
- "the metadata returned was also different. Xpert is unique in the sense that it returns the actual URL of the item, and not a holding page. As aggregators are likely to have aggregated the same content, this introduces a problem of aggregation duplication, and transfers the problem back to the consumers. Most feeds also failed to return licensing or provenance information.. Dealing with a large block of consistent metadata (such as the Oxford Podcast RSS Feeds) led to much simpler and faster algorithms as consistent data could be expected and so code could be simplified. An excellent, fast API from Mendeley is the only call on the journals tab of the Dynamic Collection and it is noticeable this requires by far the least code" (Triton final report)
- harvesting from open repositories requires some standardisation of description , especially for location-specific resources (OF (GEES) final report)
- Poor description of resource in metadata:
- “It was originally planned that the “dynamic collection” would operate by using Sux0r to discriminate broadly between more and less relevant resources using RSS feeds; the more relevant resources would then be classified using rule-based classification techniques, again based on the RSS feeds. It has become clear that the feeds may not be sufficiently descriptive of the resources to be useful in this regard, and that it may be more important to use the full-text content of the resources (this may also be necessary for more general search purposes).“ (Delores final report)
- limitations of metadata currently assigned to resources (C-SAP final report)
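Where aggregators return the actual item URL, the duplication problem noted above can be reduced by normalising URLs and merging records that point at the same item. The Python sketch below is one simple way to do this and is not taken from any of the project codebases; the record fields (`url`, `title`, `licence`) and the normalisation rules are assumptions. It does nothing for services that return holding pages, since two different holding pages for the same resource have unrelated URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url):
    """Reduce trivially different spellings of a URL to one comparison key."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    # The fragment is dropped; the query string is kept because it may
    # identify the item.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


def deduplicate(records):
    """Keep the first record per URL and fill its empty fields from later ones."""
    seen = {}
    for record in records:
        key = normalise_url(record["url"])
        kept = seen.get(key)
        if kept is None:
            seen[key] = dict(record)
            continue
        for field, value in record.items():
            if not kept.get(field):
                kept[field] = value  # e.g. licence supplied by only one feed
    return list(seen.values())
```

Merging rather than discarding duplicates means that licensing or provenance information supplied by only one aggregator is not lost.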
Collections - licencing issues
Deriving classification schema
Devising suitable classification schema was core to all projects, and particularly discussed by Delores and Triton:
“The project's dynamic collections rely heavily on a fixed set of subject categories chosen by the department and subject librarian to duplicate the module structures in the 3-year degree course. A typical category is "Diplomacy" or "Political Theory". Although categories are a fixed set and cannot be modified by a non-admin user, the blogger has a free choice to add labels such as names and places to their post by adding a "Tag".” (Triton final report)
Creation of an appropriate interface
- Creating an attractive and stylish interface (developing Wordpress themes) (Triton final report)
- Specific Wordpress theme (Carrington) chosen and developed to display clearly the author, pub date, CC-licence URI, other rights, source and URI of the resource, contained in the Wordpress custom fields (Delores “Wordpress development needed” blog post)
- WordPress theme chosen for best balance of static and dynamic resources (Triton interim report)
- Map-based interface for fieldwork resources, based on the google maps API, for searching resources OF (GEES) map interface
Available expertise
Wordpress used partly because technical expertise available from Triton project (C-SAP final report)
What selection and quality processes are appropriate for dynamic and dynamically collected OERs?
Selection and quality processes depend on questions of
- What is relevant – establishing the scope of the collection
- Processes for selecting what is relevant, especially automatically
- What is meant by “quality”
- Processes to assure quality – which broke down into assuring academic/pedagogic, technical, and legal quality
Establishing the scope of the collection
- Some projects mention overarching selection criteria:
- Delores: fundamental selection requirements:
1. Subject appropriateness
2. A specified quality
3. Alignment with target audience needs
4. Explicitness of conditions of use which define an OER
Item 4 could be relaxed provided that the legitimate use provided for would be of benefit to the target user group. (Delores final report)
- “Relevance, adaptability, and level of interest/engagement are particularly important when choosing an online fieldwork resource.” (OF (GEES) expert group in final report)
- “Important characteristics of a ‘good quality’ fieldwork resource include accessibility, up-to-date, and ease of use” (OF (GEES) community consultation survey in final report)
- Projects used their expert groups and communities to help establish the scope of the collection and quality criteria:
- Reviewers have addressed issues related to the granularity and adaptability of research methods as well as issues involved in using generic vs. subject specific methods teaching resources. This has helped the project team to refine the scope of the research methods domain and the scope of potential collection (C-SAP interim report) C-SAP mindmap of scope of collection
- Oerbital expert community discussions of repositories and quality
- "The community consultation should help us out in this respect, by clarifying what resources GEES practitioners are most likely to search for, and why " (OF (GEES) blog post "What is a fieldwork OER" )
- “respondents also indicated that the original disciplinary context of a resource was unimportant if it could be repurposed to meet their own needs” (OF (GEES) community consultation survey in final report)
- There was some discussion in the OF (GEES) project over whether the age of a resource was a quality issue. The expert group was reported as not, in general, concerned, unless for some reason the actual content becomes out of date (eg. because a natural disaster changes the environment), whereas being up to date was an important characteristic in the community consultation survey (OF (GEES) final report; blog post, “Can an OER go out of date?”)
- In defining scope, projects made various assumptions and decisions:
- the static collection was, "... based upon the assumption that the core knowledge requirements of engineering design students in the early years of study will be substantially unchanging over an unspecified but extended period of time" (Delores final report)
- The OF (GEES) project decided to accept all offered contributions subject to quality control, because if offered then assumed that they will be, in some way, useful for fieldwork (OF (GEES) final report)
- The Delores project argued that requiring a CC licence is too restrictive if learners are included in the users of OER: "Whilst students might in many situations benefit from the unrestricted use of material – allowing them to borrow and re-use in the manner of the teacher – as learners, often all they need is ‘read-only’ access to the material from which they can then draw new knowledge. Even the most restrictive licence would not prevent their entirely legitimate use of the resource as study material and so, it follows, that, for the purposes of an individual learner OER provision, non-restrictive licences should not be a prerequisite. Material which carries no licence at all and which has clear, and restrictive, copyright may for the student at least be of equal value to that carrying the least restrictive of licences." (Delores final report)
Selection processes
- For their static collections, projects generally relied on the judgement of their expert groups for selection and quality control (eg. Oerbital, Triton)
- Automatic collection presented challenges:
- At present, entirely reliable automatic judgement-making [of appropriateness, quality, etc] is probably out of reach. However, work in progress may make this more attainable in the short term. Such work includes, for example, the Learning Registry initiative which is seeking to facilitate the collection and provision of the information that would be necessary. Likewise, work on the assessment of information value (including measures of quality) will support this sort of judgement making (see, Darlington, et al., 2010). (Delores final report) Darlington, M. J., Culley, S. J., Zhao, Y., Austin, S. A. and Tang, L.C.M., 2009. Defining a framework for the evaluation of information. International Journal of Information Quality, 2 (2), pp. 115-132.
- The Delores dynamic collection is based on "core material identified and selected manually for [static collection]... augmented by using ‘conventional’ web-based discovery techniques including RSS feeds, crawling, spidering and scraping". Then filtered (Delores final report)
- Delores: filtering for dynamic collection either by user vote of relevance to the search, and/or by Sux0r [which] provides aggregation of RSS feeds which can be passed through a filtering system using naive Bayesian categorization. This function allows appropriate training sets to be used to train the sux0r implementation to distinguish between classes of input. Once trained, the sux0r filter retains or discards resources based on the system ‘knowledge’ http://deloresoer.wordpress.com/2011/07/22/training-sux0r-to-recognise-design-engineering/ http://deloresoer.wordpress.com/2011/04/17/setting-up-sux0r-for-delores-extensions/
- Delores dynamic collection: filtered materials then fed to the Waypoint classification software Details of the Sux0r interface with Waypoint
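The train-then-filter behaviour described for sux0r can be illustrated with a very small naive Bayes text classifier. The Python sketch below is not sux0r's code and makes no claim about its internals; it is a generic naive Bayes filter with add-one smoothing, and the class name, labels and training sentences are invented for the example.

```python
import math
import re
from collections import Counter


def tokenise(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesFilter:
    """Minimal multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocabulary = set()

    def train(self, text, label):
        words = tokenise(text)
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def log_score(self, text, label):
        counts = self.word_counts[label]
        total_words = sum(counts.values())
        vocab_size = len(self.vocabulary)
        # Start from the log prior, then add one log likelihood per word.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in tokenise(text):
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        return score

    def classify(self, text):
        return max(self.doc_counts, key=lambda label: self.log_score(text, label))


# Invented training set: feed descriptions labelled by hand.
bayes = NaiveBayesFilter()
bayes.train("gear machining tolerance design", "relevant")
bayes.train("shaft bearing gear design", "relevant")
bayes.train("poetry novel literature", "other")
bayes.train("history of medieval poetry", "other")

incoming = ["gear tolerance analysis", "medieval poetry"]
retained = [text for text in incoming if bayes.classify(text) == "relevant"]
```

As the Delores report notes elsewhere, a filter of this kind is only as good as the text it is given, so sparse RSS descriptions limit what training can achieve.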
Quality assurance
- Quality assurance needed to encompass pedagogy, technical and licencing issues
- User rating
Assuring quality academic/pedagogic
- Assuring academic and pedagogic quality was particularly important for the Triton project as resources were associated by users with the Oxford/Cambridge brand
- Validation/quality is very important and assumed if the Ox/Cambs logo are present
- Filtering is important – fewer, high quality results better than lots of noise
- International coverage is very important – they are very outward looking
- Contributor profiles are essential – helps the user assess value
Hence:
- Guidance documents were produced (editorial guidelines, guidance for the Oversight Team)
- Documents making clear the responsibilities of contributors plus actions that would be taken if content was inappropriate were made public on the site (legal notice, comments policy, takedown policy)
- The site Administrator and Oversight Team acted as gatekeepers to ensure the reputation of the department and university was upheld, particularly ensuring that sensitive topics were treated with care
- Recommend/like/ratings functions were added to the Dynamic Collections and Learning Pathways to encourage the community to assist with curation of the materials brought into the site (Triton final report)
- Triton Statement of purpose Triton Comments policy Triton Editorial guidelines Triton Legal Notice Triton Privacy Policy Triton Accessibility Policy Triton Takedown Policy
Assuring quality – technical & legal
- Assurance of technical and legal quality for both dynamic and static collections entailed at least some degree of manual control:
- The Delores project used initial selection of likely source sites of quality OERs, to which automatic collection can be pointed
- Widgets developed by the Triton project were controlled by blogsite owners, and directed at reputable OER collections [implicit rather than explicit in the Triton final report]
- Projects took different approaches to the inclusion of resources that were not explicitly CC licenced:
- “It was felt that excluding fieldwork resources that did not have an open licence but were in the public domain missed out a valuable segment of resources.” The solution adopted was a traffic light system for indicating different levels of permitted usage, to encompass non CC licensed resources (OF (GEES) final report) sample FERC page showing traffic light system; description of traffic light system
- inclusion of some non-CC-licenced material, with licencing information clearly stated in resource descriptions (Delores final report; Oerbital final report)
- "if similar resources come to light that are quasi-OERs i.e. OERs by nature but have not been declared as such, probably because the provider was not aware of the licence process, then we will seek to have these added to the collection if we can obtain OER release with zero to minimal effort” (Oerbital final report)
- exclude aggregation services that return non-CC licensed material and/or broken links (Triton final report)
- Projects provided advice to contributors on licencing resources (eg. OF (GEES)) and/or required contributors to register in order to ensure compliance with CC licencing (eg. Triton)
How are different ways of organising, and guiding users to and through resources effective?
Overview
Project sites provide a portal to static and dynamic OER collections and sometimes other information.
- The majority of the projects are using blogs (generally Wordpress) as the basis of the portal but take different approaches to linking the resources to the blog posts
- A wiki (mediawiki) is being used by the Oerbital project to provide a catalogue of the (static) collection and associated discussion with feeds in/out. Wiki catalogue entries describe individual resources and suggest pedagogic uses, http://heabiowiki.leeds.ac.uk/oerbital/index.php/Main_Page . They are accessed from the main page by category links beside the names of the expert panel.
- OF (GEES) provides multiple ways into its collection, via map-based or text search, word cloud, or browsing categories. "[user survey] findings provide empirical support for the development of a map-based interface for discovering fieldwork resources, and for the need to include multiple search/criteria.” “[word-cloud] provides the user with suggestions for possible search terms, together with an indication of whether the search is likely to return few or many resources” user comment, “Map search is great” (OF (GEES) final report) Catalogue entries describe individual resources with clear licencing information http://www.openfieldwork.org.uk/api/ ; http://www.openfieldwork.org.uk/api/map.php
Classification
Projects used a combination of classification into categories for browsing and filtering results, with author tagging.
Projects derived their categories by a variety of means:
Usability
Guidance through the resources was an area in which projects conducted extensive user testing
- Searches need to be “google-like”
- search/discovery needs to be “google-like” (or google): “(1) the search and browse capabilities of OER repositories and indexes are not particularly effective in helping users to find resources (2) the OER descriptions in RSS feeds are not particularly helpful to repositories or to users (e.g. very often the descriptions are of a course overall rather than the particular resources (lecture notes, examples, etc.) made available through the course).” (Delores final report)
- issues of granularity of classification:
- in the static collection, how do we select and provide descriptions at the fine level of granularity that Chris wants while also keeping the valuable information of the original course context of the resource; will the quality of the syndicated metadata be good enough for the Bayesian filtering to work; can we supplement this by using information from the course/resource webpage; what use can we make of customised Google searches? (Delores blog post "finding OERs")
- how to enter multi-site resources on the map interface in the OF (GEES) project http://openfieldwork.org.uk/?p=246
- Other usability issues
- OF (GEES): user testing resulted in Home page with general information, rather than top level map interface
- OF (GEES): users expressed a preference for pins on the map interface to convey information about the type of resource rather than the openness of the resource
- Triton learner focus group:
- The students preferred a simpler design with less cluttered menus.
- Contributor profiles are essential as they help the user assess the value of their content. They were very interested in anything that is connected with academics they’ve heard of.
- Students love audio and video lectures but need them to be presented in a very obvious way.
- They were less interested in widgets/ categories so we need to adopt simpler approaches for presenting OER. Filtering was important; fewer, high quality results are better than lots of noise.
- Articles and posts require more links to introductory materials and to other materials by the contributor.
- Students seem quite sussed on finding and rating quality material but don’t know about alerts and feeds
http://blogs.oucs.ox.ac.uk/openspires/2011/03/09/what-do-students-want/
- Triton: author information was prominently attached to blog (student user testing showed author was one of most commonly used search methods)