Apache NiFi: automating data flows between software systems

Patrick Hochstenbach (Ghent University Library)

Maximum number of participants: 15

Audience: Programmers, data managers, digital architects, technologists

Apache NiFi (nifi.apache.org) is a software project designed to automate the flow of data between software systems. It can send, receive, transform, and store data in an automated and configurable way. Data can be ingested via various protocols (FTP, SFTP, HTTP, JMS, UDP, …), transformed, and stored in many types of database systems, all while keeping data provenance and tracking data flows from beginning to end.
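NiFi flows are normally assembled from configurable processors in its web UI rather than written as code, but the underlying idea can be sketched in plain Python: independent processing steps (ingest, transform, store) chained into a pipeline, with provenance recorded at each step. The sketch below illustrates the concept only; it is not NiFi's actual API, and the processor names are invented.

```python
import json

def ingest(records):
    """Stand-in for a source processor (e.g. an FTP/HTTP ingest step)."""
    for record in records:
        yield {"payload": record, "provenance": ["ingest"]}

def transform(flowfiles):
    """Stand-in for a transform processor; notes itself in the provenance."""
    for ff in flowfiles:
        ff["payload"] = ff["payload"].upper()
        ff["provenance"].append("transform")
        yield ff

def store(flowfiles, sink):
    """Stand-in for a sink processor (e.g. a database PUT)."""
    for ff in flowfiles:
        sink.append(ff)
        ff["provenance"].append("store")

# Wire the three steps into one flow and run two records through it.
sink = []
store(transform(ingest(["title: nifi", "title: flows"])), sink)
print(json.dumps(sink, indent=2))
```

Each record carries its own provenance trail, which is the property that lets a dataflow system trace data from beginning to end.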

Given this potential, we want to investigate what Apache NiFi can do for us, and which types of library applications could benefit from data flow programming.

This workshop will not be a lecture on Apache NiFi programming. We will give a short introduction to the tool, then open the floor to try out the software and discuss how NiFi (and data flow programming in general) can be integrated into our daily library automation workflows.

Workshop outcomes: This workshop should give us all a better insight into data flow programming, its benefits and risks, and how libraries could use these types of tools.


The International Image Interoperability Framework

Leander Seige (Leipzig University Library), Mustafa Dogan (Göttingen State and University Library), Johannes Baiter (Bavarian State Library, Munich)

Maximum number of participants: 40

Audience: IT personnel, system librarians and digital strategists who are interested in implementing IIIF or have done so already

Workshop outcomes: All participants should learn from each other about the challenges of implementing IIIF and gather information on how to design, build or improve IIIF services in terms of reliability, performance and interoperability.

In recent years, the International Image Interoperability Framework (IIIF, https://iiif.io) has become a widely used standard for making digitised material from libraries, museums and archives accessible online. IIIF is characterised by a high degree of interoperability and user-friendliness.

The implementation of IIIF-compatible services confronts interested institutions with a variety of questions and tasks that can be solved in different ways. The workshop will provide a forum for an open discussion of issues related to the implementation of IIIF. Participants are invited to report on their own experiences, recommendations and solutions, to contribute questions and to point out difficulties. Participants who do not yet have any experience with IIIF may outline their current situation and learn from other participants about the different solutions and obstacles encountered when implementing IIIF. Participants should find out which steps their institutions may have to take to implement IIIF and how much effort to expect.

The workshop begins with a short introduction to the topic (15 minutes) given by the workshop leaders. A list of the questions to be discussed will be compiled by the organisers and participants in advance of the workshop.

Presumably the main subject of the workshop will be the two most important IIIF APIs: the Presentation API and the Image API. The purpose of the Presentation API is to provide a JSON-LD service that describes objects in manifests and groups them into collections. Typical questions arise from the modelling and resolving of URIs, the parallel operation of different API versions, the embedding or linking of metadata, as well as corresponding software such as repositories, tools and other infrastructural components. In order to provide citable services, identifiers and attributes must be stable and must not change arbitrarily. The Image API requires the operation of an image server, which is usually equipped with preprocessed image files. The selection of different file formats and tools, as well as general performance considerations, could be discussed.

Other topics may include the implementation of institutional viewers and workspaces, the creation of user-generated content, and other IIIF APIs such as search, authentication and discovery.
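To make the discussion concrete, the sketch below assembles a minimal Presentation API 2.x manifest as JSON-LD: one manifest, one sequence, one canvas, and one painting annotation whose image resource points at an Image API service. All example.org URIs are placeholders for an institution's own stable identifiers, and real manifests carry considerably more metadata.

```python
import json

# Minimal IIIF Presentation API 2.x manifest; every example.org URI is a
# placeholder, not a real service.
manifest = {
    "@context": "http://iiif.io/api/presentation/2/context.json",
    "@id": "https://example.org/iiif/book1/manifest",
    "@type": "sc:Manifest",
    "label": "Example digitised book",
    "sequences": [{
        "@type": "sc:Sequence",
        "canvases": [{
            "@id": "https://example.org/iiif/book1/canvas/p1",
            "@type": "sc:Canvas",
            "label": "p. 1",
            "height": 2000,
            "width": 1500,
            "images": [{
                "@type": "oa:Annotation",
                "motivation": "sc:painting",
                "on": "https://example.org/iiif/book1/canvas/p1",
                "resource": {
                    # Image API URI pattern: {id}/{region}/{size}/{rotation}/{quality}.{format}
                    "@id": "https://example.org/iiif/images/p1/full/full/0/default.jpg",
                    "@type": "dctypes:Image",
                    "service": {
                        "@context": "http://iiif.io/api/image/2/context.json",
                        "@id": "https://example.org/iiif/images/p1",
                        "profile": "http://iiif.io/api/image/2/level1.json",
                    },
                },
            }],
        }],
    }],
}
print(json.dumps(manifest, indent=2))
```

The stability requirement discussed above applies precisely to these `@id` values: once published, viewers and aggregators will dereference them, so they must not change arbitrarily.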


Making Sense of Artificial Intelligence

Harrison Dekker (University of Rhode Island)

Maximum number of participants: 30

Audience: Librarians and library technologists

This workshop will introduce library professionals to basic concepts of artificial intelligence and allow them to gain experience with running machine learning and other AI applications in the cloud. The exercises will use Jupyter notebooks and will not require any prior programming experience or specific software installation.

In addition to the hands-on component, participants will form teams to brainstorm on the implications (opportunities, threats, etc.) of AI usage in libraries, knowledge creation, and/or information seeking.

Workshop outcomes:

  • Hands-on experience with AI programming
  • Better understanding of the broad categories of AI technologies
  • Better understanding of the implications of AI on knowledge creation and information seeking
  • Better understanding of possible applications of AI technology in libraries
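To give a flavour of the hands-on component, the toy example below implements a 1-nearest-neighbour classifier from scratch in pure Python. It is only an illustration of a basic machine learning idea, not the workshop's actual notebook material, and the two-feature "dataset" is invented.

```python
import math

def predict(train, point):
    """Return the label of the training example closest to `point`
    (1-nearest-neighbour classification by Euclidean distance)."""
    features, label = min(train, key=lambda ex: math.dist(ex[0], point))
    return label

# Invented toy data: (features, label). Imagine the two features are
# page count and shelf height, scaled to comparable ranges.
train = [
    ((1.0, 1.0), "short article"),
    ((1.2, 0.9), "short article"),
    ((8.0, 9.0), "monograph"),
    ((7.5, 8.5), "monograph"),
]

print(predict(train, (7.9, 9.1)))  # nearest training example is a monograph
```

Even this tiny example surfaces the questions the brainstorming teams will discuss: where does the labelled data come from, and what happens when a new item resembles no training example?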


The OpenAIRE Content Provider Dashboard: Open Science services for Content Providers

Leonard Mack (Jisc); Pedro Principe (University of Minho); Paolo Manghi (CNR-ISTI); André Vieira (University of Minho)

Maximum number of participants: 20

The transition to Open Science is a major challenge for all organisations involved in publishing, preserving, and reusing research outputs. Open Science relies not only on the reliable documentation and exchange of highly varied research artefacts, but also on data about the linkages between them. Only then can Open Science make research radically more transparent and easier to evaluate and reproduce.

To meet this challenge, the OpenAIRE-Connect project is developing a new generation of scientific communication services on top of the existing OpenAIRE infrastructure. These out-of-the-box services support the exchange of research products (publications, datasets, software, and packages of research artefacts) and the links between them across the whole ecosystem of research communities and content providers. To enrich collections, the OpenAIRE knowledge graph (with more than 25m objects) and the new OpenAIRE-Connect Content Provider Dashboard offer a unique service for literature and data repositories as well as publishers.

This hands-on workshop will introduce repository managers and other content providers to the Content Provider Dashboard. Participants will explore its rich functionality in a live demo, including:

  • monitoring of publications in OpenAIRE-compatible collections and repositories
  • viewing and retrieving of results for related research objects from the OpenAIRE graph
  • subscribing to notifications sourced from the OpenAIRE graph to enrich collections
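The "subscribe and enrich" idea in the last bullet can be sketched as follows: a repository subscribes to notifications about its own records and merges any new links into them. Note that the record and message shapes below are entirely hypothetical, invented for illustration; they do not reflect the actual OpenAIRE notification format.

```python
# Hypothetical shapes only: neither the record layout nor the message
# format below is the real OpenAIRE API.
local_record = {
    "id": "oai:repo.example.org:1234",
    "title": "An example publication",
    "related": [],
}

notification = {
    "target": "oai:repo.example.org:1234",
    "links": [{"type": "dataset", "pid": "doi:10.1234/example-dataset"}],
}

def enrich(record, message):
    """Append any links from the message that the record does not yet have."""
    if message["target"] != record["id"]:
        return record
    known = {link["pid"] for link in record["related"]}
    for link in message["links"]:
        if link["pid"] not in known:
            record["related"].append(link)
    return record

enrich(local_record, notification)
print(local_record["related"])
```

The deduplication by persistent identifier makes the enrichment idempotent, so replayed notifications do not bloat the record.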

In the second half, the workshop will turn into a highly collaborative, agile sprint. Participants will imagine extensions to the new tools to address future challenges. In particular, we aim to explore the pain points content providers anticipate in their transition, and to come up with rapidly developed prototypes to tackle them.

Workshop outcomes: After the workshop, participants will be well-informed about the OpenAIRE Content Provider Dashboard, with a clear understanding of its functions and how it can help them to transition their collections towards Open Science.
Furthermore, participants will have thought critically about what Open Science means for them and the role that repository managers can play in the Open Science services ecosystem, including strategic and operational challenges. This relates particularly to the question of how processes can or should be automated, and which data (types) would be required for this.
Lastly, participants will also have developed initial ideas to address some of these challenges in a highly interactive, collaborative workshop format.


The library in the patrons’ workflow

Johan Tilstra, Peter van Boheemen

About the presenters: Johan is CEO and founder of Lean Library, now a SAGE Publishing company, which offers a browser extension for patrons of academic and research libraries. Before founding Lean Library, he worked as a programme manager and innovator at Utrecht University Library, where he focused on improving the end-user experience of library services.

Peter works as an IT consultant/software developer in the "Multi Disciplinary Team Library" of Wageningen University and Research, a team of library staff and IT staff located at the library to support installed services and develop new ones.

Audience: System librarians, UX librarians

We recognize that the library has no real control over where on the web users discover information. We admit that users do not search the catalogue to find what they want. We know users mostly find publications via Google, Google Scholar, closed or open bibliographies, etc.

Given this situation, how can we make users aware of the services the library can offer once they have discovered a publication or piece of information anywhere on the web?

We would like to discuss:

  • How link resolvers can be extended with extra services.
  • How link resolvers could appear on 'any web page', not only in the bibliographies where we have configured them.
  • What the browser plugins appearing on the market are doing, and whether they can help us with this problem.
  • How we can ensure privacy when users install these browser extensions, which can potentially gather lots of user information.
  • Any other great ideas you can come up with.
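On the first two points: link resolvers are conventionally invoked via OpenURL (NISO Z39.88), so a browser extension that recognises a citation on an arbitrary page could, in principle, construct such a link itself. The sketch below assumes a placeholder resolver base URL and an invented citation; any real extension would extract these metadata from the page.

```python
from urllib.parse import urlencode

# Placeholder for an institution's own link resolver endpoint.
BASE = "https://resolver.example.org/openurl"

# Invented citation metadata, expressed as OpenURL (Z39.88) key/value pairs
# for the scholarly journal article format.
citation = {
    "url_ver": "Z39.88-2004",
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    "rft.atitle": "Example article title",
    "rft.jtitle": "Example Journal",
    "rft.date": "2018",
}

link = BASE + "?" + urlencode(citation)
print(link)
```

The privacy question in the fourth point starts exactly here: every such link the extension generates reveals to some service what the user is reading.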