An openly usable web index for search is the goal of OpenWebSearch.EU
An online search engine designed specifically for science and research, for economic indicators and environmental statistics, or for art and culture: anyone who has ever failed to find answers to specific questions on Google, Bing, Baidu or Yandex, and found none on social media channels either, would love to have such alternatives. But they do not exist, at least not yet. The European research project OpenWebSearch.EU is developing an open web index and a decentralised IT infrastructure that will run on the systems of at least five independent organisations over the next three years. For OpenWebSearch.EU, 14 research organisations from seven European countries are cooperating, among them the Leibniz Supercomputing Centre (LRZ). The project has received 8.5 million euros in funding from the Horizon Europe programme (project number: 101070014) and is set to begin its work after a kick-off meeting of all parties involved.
Alternatives to existing search engines need an open index
The open web index and decentralised infrastructure form the basis for innovative search engines and thus for real alternatives when searching for information. New search engine business models also contribute to increasing Europe's sovereignty with regard to search engine technologies: "An open index is the prerequisite for the emergence of a new ecosystem for search and discovery applications," Michael Granitzer states. He is a computer science professor at the University of Passau who coordinates the various work projects developing OpenWebSearch.EU. "Free, open and unbiased access to information: we have lost these basic principles in web search and urgently need to restore them. As a user, I would like to choose my search engine the way I choose my newspaper, based on my personal preferences."
The first ideas for the ambitious IT project emerged in a grassroots movement of researchers and technology specialists. In 2016, they founded the Open Search Foundation (OSF), which today is also a project partner, and formed working groups under its umbrella that have since developed transparent algorithms and prepared the necessary networks and systems for OpenWebSearch.EU with open-access software and technology. The LRZ has been a partner of the OSF from the beginning, and some staff members have been involved in its working groups. Like the other data centres, it provides backend systems and fast, accessible servers as well as storage systems for the European project. On these, the open web index can be calculated and the necessary workflows and interfaces can be developed and tested. "An important goal of the OpenWebSearch consortium is to develop a decentralised, cross-organisational infrastructure that will be operated by different data centres," explains Stephan Hachinger, PhD physicist and head of the Research Data Management team at the LRZ. "Together, we want to quantify the costs associated with their operation through extensive load tests." Founders, entrepreneurs and data centre operators need such data to develop new business models or services with the open web index. "The open web search infrastructure benefits us all," says Granitzer. "It will finally give us a real choice in selecting search engines."
Stimulating new business with tenders
The data management specialists at the LRZ can draw on the experience they gained during the EU project LEXIS. That project involved setting up a pan-European platform based on existing data centre cloud systems, networking various supercomputers and enabling the evaluation of research data and big data from different locations. As with LEXIS, the LRZ will work with its partners to develop interfaces as well as a managed data repository based on EUDAT systems, which will provide tools for describing and exchanging information. "We will probably use REST APIs, message queues and other modern IT techniques," Hachinger says of his team's plans. "This will allow big data tasks to be conveniently orchestrated and automated."
A total of 14 data centres, universities and organisations, with around 75 researchers, are working together on OpenWebSearch.EU, which presents quite a challenge in terms of coordination, communication and management. The LRZ's Research Coordination & Support (RCS) team is also involved: it supports the project leaders in organising working groups and in handling the planned tenders. Fifteen percent of the funding is reserved for development work and contributions from external IT service providers or agencies.