By Majda Seghir (Conservatoire National des Arts et Métiers, CNAM)
Evidence-based research aimed at improving social progress in Europe requires high-quality, relevant and timely data. The European Statistical System (ESS) plays the major role in providing such reliable information: it sets up procedures to collect data from Member States on a harmonised basis, prepares the data and implements data access processes. While aggregate statistics are readily available through the Eurostat web portal, access to microdata, which consist of records containing information on individual respondents or business entities, is restricted and requires an application. KIOSK is a new development that grants remote access to microdata through a secure connection. This promises progress in evidence-based research.
What is the current practice?
The European Commission, via Eurostat, may grant access to microdata only for scientific purposes, subject to full compliance with the following guidelines[1]:
- the research entity requesting access must be recognised,
- an appropriate research proposal must be submitted,
- the requested confidential datasets must be described,
- the safeguards in place to ensure the security of the confidential data must be specified,
- the research proposal must be approved by the relevant national statistical authority, which provides the microdata.
Depending on the access type requested, the microdata are usually provided either by sending the principal researcher files containing partially anonymised data (i.e. Scientific-Use Files) and/or by authorising access to non-anonymised data (i.e. Secure-Use Files), which are only de-identified and can be consulted in Eurostat's safe centre in Luxembourg. The full potential of microdata was, however, hampered by the anonymisation rules applied to Scientific-Use Files to ensure confidentiality. Secure-Use Files had the advantage of being protected only against direct identification, but the drawback of being accessible only in Eurostat's safe centre in Luxembourg. Having to work on site to access non-anonymised microdata was a strong obstacle for most researchers.
The long process to KIOSK
The awareness that progress in knowledge depends on easier access to confidential microdata pushed the European Commission to change the legal framework in 2013 (223-2013 CE Regulation) and to explore technical and administrative solutions for decentralised access under the ESSnet project DARA[2] (Decentralised and Remote Access to confidential data in the ESS). After the feasibility study conducted in 2013, it took almost a decade to implement a solution that fully guarantees the security of confidential microdata access in each Member State. The lockdown period due to the pandemic and the concerns regarding decarbonisation have demonstrated, if proof were needed, the urgency of moving forward.
Roxane Silberman[3], who played an active role in the discussions to change the legal framework and put remote access in place, provided details on the IT solution implemented by Eurostat: KIOSK. KIOSK allows access, via a secure connection, to the confidential microdata stored in Eurostat's safe centre. Researchers will therefore no longer need to travel to Luxembourg to use the secured microdata, which will be available remotely through a system in which researchers cannot download the data but can work on them within a secure research environment. In addition, access will be granted to researchers directly from their research institute and not from their local National Statistical Institute, as initially envisaged. Once accredited, research institutes will have to follow the guidelines for microdata access and sign an additional commitment amending the Eurostat standard confidentiality undertaking. Researchers will then be identified through two gates (EU Login and KIOSK), each with separate passwords and authentication devices. This solution is further combined with output checking by Eurostat, which verifies research outputs for confidentiality before they are released outside the secure access facilities.
A bumpy road ahead
Remote access to European microdata is by far the best and most cost-effective way to allow the research community to take full advantage of EU microdata. Notwithstanding the progress made toward decentralised EU access to microdata, remote access is currently operational for only three files: the Community Innovation Survey (CIS), the European Structure of Earnings Survey (SES) and the Micro-Moments database (MMD). The obstacle to opening the remaining files is not technical but one of understaffing at Eurostat: for each dataset, competent staff have to check that all output results respect confidentiality. An automatic research output-checking tool is currently only a proof of concept, and Eurostat is planning to extend training on output checking to the research community. Last but not least, the required approval of each Member State to use its microdata is a constraint, since it may lengthen the delay in accessing the microdata. Researchers also risk being denied access to a Member State's data if that Member State is not convinced by the submitted research proposal. However, we can hope that the concerns still expressed by some National Statistical Institutes will be allayed as the use of these Secure-Use Files develops and the research findings show their impact on society.
[1] https://ec.europa.eu/eurostat/web/microdata
[2] DARA project end report: https://ec.europa.eu/eurostat/cros/system/files/final_report_ESSnet_DARA_20140321_publishable.pdf
[3] Roxane Silberman is a scientific advisor at the CASD (Centre d'Accès Sécurisé aux Données), a French consortium established to organise and implement secure access services for confidential data for non-profit research purposes.