OpenRefine (formerly Google Refine) is a powerful tool for working with large data sets: cleaning them, transforming them from one format into another, extending them with web services, and exploring them with ease.
The JournalTOCs API is a RESTful web service that provides access to the full dataset collected by the JournalTOCs Project since 2009. This dataset contains the metadata for over 22,000 journals and for more than two million articles published during that time.
Ted has used the RDF Refine extension for OpenRefine to link local data stored in VIVO as RDF with other sources on the Web. OpenRefine allowed him to query a reconciliation service to match local strings to entities from another source and the RDF Extension enabled him to export those entities as RDF.
Ted wanted to interlink the metadata describing the work of university researchers with the venues in which their research is published. Because JournalTOCs is a good source of metadata about academic journals and articles, he used a demo reconciliation service developed by Michael Stephens as a model and put together a basic reconciliation service for the JournalTOCs data. It queries the JournalTOCs API and translates the response into the format that OpenRefine expects. The service can be run locally, and OpenRefine will query it without problems. Ted has open-sourced his code, which is available on GitHub, and it looks like a good option for librarians and researchers working with similar data sets.
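The translation step can be illustrated with a minimal sketch. This is not Ted's code: the function name, the type identifier and the string-similarity scoring are assumptions made for illustration, and the input is a plain list of journal records standing in for a parsed JournalTOCs API response. The output follows the general shape of an OpenRefine reconciliation result (a "result" list of candidates with id, name, type, score and match).

```python
from difflib import SequenceMatcher

# Hypothetical type descriptor; a real service would advertise its own types.
JOURNAL_TYPE = {"id": "/journaltocs/journal", "name": "Journal"}


def to_candidates(query, journals, limit=3):
    """Turn journal records into OpenRefine reconciliation candidates.

    `journals` is a list of dicts with "issn" and "title" keys, standing in
    for whatever the JournalTOCs API returned for `query` (an assumption).
    """
    candidates = []
    for journal in journals:
        # Similarity in [0, 1]; scoring by string similarity is an
        # illustrative choice, not necessarily what the real service does.
        ratio = SequenceMatcher(
            None, query.lower(), journal["title"].lower()
        ).ratio()
        candidates.append({
            "id": journal["issn"],
            "name": journal["title"],
            "type": [JOURNAL_TYPE],
            "score": round(ratio * 100, 1),
            # Only an exact (case-insensitive) title is flagged as a match.
            "match": ratio == 1.0,
        })
    # Best candidates first, truncated to the requested number.
    candidates.sort(key=lambda c: c["score"], reverse=True)
    return {"result": candidates[:limit]}
```

A reconciliation service would serialise this dictionary as JSON and return it to OpenRefine for each query string it receives.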
Developers can use the JournalTOCs API to embed JournalTOCs’ metadata and search functionality within their own web services. Anyone with access to an RSS reader can also benefit from the JournalTOCs API. Most JournalTOCs API calls are free and only require a simple registration process. The API responses are returned in RSS 1.0 format, which you can then parse and use in your own web application, RSS reader or institutional web page. Further information on the JournalTOCs API can be found here.
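As a rough illustration of the parsing step, the sketch below reads an RSS 1.0 document with Python's standard library. The sample feed is made up, and the choice of fields (title, link, dc:creator) is an assumption; which elements a real JournalTOCs response carries should be checked against the API documentation.

```python
import xml.etree.ElementTree as ET

# Namespaces used by RSS 1.0 (RDF Site Summary) and Dublin Core.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up sample feed, used only to exercise the parser.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://example.org/feed">
    <title>Example feed</title>
    <link>http://example.org/</link>
    <description>Made-up sample</description>
  </channel>
  <item rdf:about="http://example.org/article/1">
    <title>An example article</title>
    <link>http://example.org/article/1</link>
    <dc:creator>A. Author</dc:creator>
  </item>
</rdf:RDF>"""


def parse_rss10(xml_text):
    """Return one dict per <item> in an RSS 1.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    # In RSS 1.0 the items are direct children of the rdf:RDF root.
    for item in root.findall("rss:item", NS):
        items.append({
            "title": item.findtext("rss:title", default="", namespaces=NS).strip(),
            "link": item.findtext("rss:link", default="", namespaces=NS).strip(),
            # Missing optional elements fall back to an empty string.
            "creator": item.findtext("dc:creator", default="", namespaces=NS).strip(),
        })
    return items


if __name__ == "__main__":
    for article in parse_rss10(SAMPLE):
        print(article["title"], article["link"])
```

The same function should work on the body of an API response once it has been downloaded, provided the response follows the RSS 1.0 layout shown in the sample.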
More Information on OpenRefine and JournalTOCs:
From time to time, we receive questions about RefMan. For that reason, although Adept have no plans to release new versions of RefMan, we have prepared this short guide for users who need to export search results from JournalTOCs to RefMan.
Let’s assume that your search query is:
Optical Coherence Tomography intravascular coronary imaging
- Sign in from http://www.journaltocs.ac.uk/index.php?action=signIn
- If you are a Premium user, select the Articles Tab and enter your search query as shown below:
- If you are a Free user, enter your search query and tick the “for Articles by Keywords” option, as shown below:
- Hit Go to execute your search
- The results listing the articles found for your search will be displayed as shown below:
- When you click on the title of an article, the system will display its full citation, and you will be able to tick the checkbox next to its title to save it in your Articles to Export page, as shown in the following example:
- Repeat the previous step for all the articles you want to export.
- Go to your Articles to Export page (http://www.journaltocs.ac.uk/savedArticles.php) and click the Export to EndNote link to produce an EndNote®-compatible RIS file.
- Follow the instructions at http://www.refman.com/support/faqs/import/faq3.asp to import the RIS file into your RefMan database.
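Before importing, it can be useful to check that the exported file contains the records you expect. The sketch below reads RIS text into a list of records; it is an illustration only, the sample record is made up, and the exact tags that the JournalTOCs export writes are an assumption.

```python
import re

# A RIS line is a two-character tag, two spaces, a hyphen, then the value.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  -(?: (.*))?$")

# Made-up record, used only to exercise the parser.
SAMPLE_RIS = """TY  - JOUR
AU  - Author, A.
AU  - Writer, B.
TI  - An example article
JO  - Example Journal
PY  - 2013
ER  -
"""


def parse_ris(text):
    """Return a list of records; each maps a RIS tag to its list of values."""
    records, current = [], {}
    for line in text.splitlines():
        match = TAG_LINE.match(line.rstrip())
        if not match:
            continue  # skip blank lines and anything that is not a tag line
        tag, value = match.group(1), (match.group(2) or "").strip()
        if tag == "ER":
            # ER marks the end of a reference.
            if current:
                records.append(current)
            current = {}
        else:
            # Tags such as AU may repeat, so values are kept in a list.
            current.setdefault(tag, []).append(value)
    return records


if __name__ == "__main__":
    for record in parse_ris(SAMPLE_RIS):
        print(record.get("TI", ["(no title)"])[0], "-", len(record.get("AU", [])), "author(s)")
```

Comparing the number of parsed records with the number of articles you ticked is a quick way to confirm that nothing was lost in the export.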
Overall, Nature, Science and the New England Journal of Medicine are the most followed journals at JournalTOCs (Top Journals). However, among the Open Access (OA) journals (which are carefully selected by JournalTOCs), D-Lib Magazine, the Journal of Information Literacy and the Journal of Library & Information Science are the most followed. Below we list the top 100 most followed OA journals:
Every day, we create approximately 5,000 new records; most of them are metadata for journal articles published in the previous 24 hours. The following image represents the metadata that JournalTOCs has collected so far.
The table on the left-hand side is a sample of the data source for this big metadata. It shows the number of new articles per day found in the journal TOC RSS feeds in March 2013.
Roughly 70% of that metadata was gathered in the last two years alone, since JournalTOCs was launched as a public service in May 2011. As of today, this metadata covers 1,795 publishers, 10,200 Premium users from licensed institutions, 22,050 journals, over 100,000 tracked research interests collected from followed journals that are frequently visited by any user (free and Premium registrations), and nearly 8 million articles published in the last 5 years. This big metadata is more than a matter of size. It can be an opportunity to find insights into new and emerging types of research, to support or create library management systems, and to help answer questions regarding research publications. JournalTOCs offers ways to take advantage of this opportunity. It uses web services and standard harvesting protocols to open the door to the possibilities offered by this big metadata, including:
+ Harvesting the metadata of all the journals indexed by JournalTOCs, which includes title, ISSNs, access rights, subject classification, publisher, number of followers, date of the last issue published, the URL of the journal RSS feeds and the journal homepage
+ Harvesting the complete database of the metadata of 8 million articles, including all the content collected from their RSS feeds
+ Querying the metadata of specific journals by ISSN or keywords in the journal title
+ Searching for articles in the current issues or the backfile issues
Recently, two of our licensed institutional users have received a project grant and a prestigious award respectively, both involving the use of JournalTOCs Premium.
1. Award to develop an automated e-TOCs current awareness service at the NYMC
The Health Sciences Library of the New York Medical College (NYMC), in partnership with the Health Sciences Library System of the University of Pittsburgh, has been awarded a grant to develop an automated Electronic Table of Contents Current Awareness Service using RSS Feeds. The project has been funded with Federal funds from the National Library of Medicine, National Institutes of Health and the Department of Health and Human Services of the United States, under Contract No. HHS-N-276-2011-00003-C.
Partial results of the project were presented by Marie Ascher, the Associate Director of NYMC Library, at the 11th International Congress on Medical Librarianship (ICML), Boston, USA. ICML is the premier event in Health Sciences Information, sponsored this year by JAMA, Elsevier, EBSCO and Wolters Kluwer, among other publishers of medical literature. Marie presented the poster “Development of an Automated Electronic Table of Contents Current Awareness Service Using RSS Feeds and the Library Blog” on Tuesday 7th May during the ICML Poster Session 4.
The objective of the NYMC project is to develop a fully automated e-TOCs current awareness service to replace the physical daily journals shelf. As at many other libraries, researchers used to visit the library regularly to browse the daily journal receipts. However, since the print journal collection has shrunk drastically in favour of electronic journals, NYMC recognized the need for a new way to view the latest journal content and embraced the metaphor of the Virtual New Journals Shelf to develop a fully automated e-TOCs system that would push content from JournalTOCs to a “New Journal TOCs” webpage or a posting on the library’s blog.
We congratulate the Health Sciences Library on their creative use of JournalTOCs Premium.
2. IFLA award for the best library marketing project (5th place) to the VSSC
A Commendable Work award was given to the Indian Vikram Sarabhai Space Centre (VSSC) for the project “Inspiring Library Patrons”. VSSC took 5th place in the prestigious IFLA International Marketing Award for 2013. The winners will be announced officially at the IFLA press conference in Singapore in August 2013. Eileen Breen, Senior Publisher at Emerald, the sponsor of the award this year, commented: “This year’s winners of the IFLA International Marketing Awards illustrate perfectly Emerald’s endeavours to support global initiatives that benefit society. Once again the IFLA International Marketing Awards prove inspirational to the whole information community and we congratulate these worthy winners.”
VSSC Library received the award for conducting an “open book quiz” programme to make its research staff aware of its services and encourage them to use the products subscribed to by VSSC. About 900 users participated and 688 completed the quiz. The programme was a success: it was rated the best programme of 2012 at VSSC, all the users appreciated the work, and it was well supported by VSSC management. The last question of the quiz asked users to list 3 favourite journals from a list of journals with customised links to JournalTOCs. N. Narayanan Kutty, the VSSC Periodicals Head, said, “If they had asked the users directly to provide their favourite titles in the normal way, only very few would have sent their responses.”
We congratulate the VSSC Library on its effectiveness in making users aware of the library services.