Administrative data

One branch of research analyses and links administrative datasets from different parts of government. The term ‘administrative data’ refers to information collected when people use public services, such as education, the health service, the courts or the benefits system. Government collates this data to deliver services. Administrative data is a public resource that aids the provision of public services, but it can also be used to generate understanding that helps with government decision making and policy development. Administrative datasets are valuable for research because they can generate population-level insights when analysed and linked with datasets from other parts of government, for example GP, hospital or education records.

ADR Wales

Welsh Government is part of the Administrative Data Research (ADR) Wales partnership, which aims to deliver research that has a clear public benefit. ADR Wales is a collaboration of academics at Swansea and Cardiff universities, statisticians, social researchers and data scientists at Welsh Government, and staff at the SAIL Databank based at Swansea University. ADR Wales works in partnership to access, analyse and generate insights from data-based research to provide evidence to support Welsh Government. ADR Wales is part of the UK-wide Administrative Data Research UK (ADR UK) investment by the Economic and Social Research Council, which is part of UK Research and Innovation.

SAIL Databank

The SAIL Databank holds anonymised data about the population of Wales. It is a secure research platform and a trusted research environment accredited under the Digital Economy Act 2017, where de-identified administrative datasets are held and can be made available to accredited researchers working on approved data linking projects for the public benefit. SAIL is internationally recognised for the robust, secure storage and use of anonymised person-based data for research to improve health, well-being and services.

Data linking

Data linking is a process of joining sources of information from different datasets. The process can join information relating to the same person, family, place or event. To allow this research to take place, Welsh Government works with its trusted third party, Digital Health and Care Wales (DHCW), and with SAIL in a process known as the ‘split file process’. The split file process is explained in the diagram below.

Figure 1: data linking (the split file process)

Description of figure 1: this diagram shows how datasets are separated into two files for processing, to allow for data linkage and to ensure that data is de-identified before it enters the SAIL Databank. In particular, the diagram shows how identifying information (such as name and address) in a data collection is separated from the administrative response data (which contains information such as service usage or health outcomes). The two sets of information are sent to different recipients so that no organisation receives all the information together. DHCW receives the identifiers file and SAIL receives the response data. DHCW removes personal information and adds a reference number and a linking field before the files are recombined, making them ready for linkage to other datasets for approved analysis projects. Further information on the trusted third party and the anonymisation and linkage process can be found on the SAIL website.
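As an illustration of what "linkage to other datasets" means in practice, the sketch below joins two de-identified datasets on a shared linking field. It is a minimal example under stated assumptions, not a description of how SAIL performs linkage: the function name `link_on_field`, the field name `linking_field` and all of the example records are hypothetical.

```python
from collections import defaultdict


def link_on_field(left, right, field="linking_field"):
    """Inner-join two lists of dict records on a shared de-identified key.

    `field` is a hypothetical name for the anonymised linking field.
    """
    # Index the right-hand dataset by its linking field.
    index = defaultdict(list)
    for row in right:
        index[row[field]].append(row)

    # Keep only records whose linking field appears in both datasets.
    linked = []
    for row in left:
        for match in index.get(row[field], []):
            linked.append({**row, **match})
    return linked


# Made-up, de-identified example records: no names, addresses or dates of birth.
education = [
    {"linking_field": "a1", "key_stage": 4},
    {"linking_field": "b2", "key_stage": 3},
]
hospital = [
    {"linking_field": "a1", "admissions": 2},
    {"linking_field": "c3", "admissions": 1},
]

linked = link_on_field(education, hospital)
```

Only the record with linking field `a1` appears in both example datasets, so it is the only one in the linked result.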

Welsh Government data access panel

All requests to access Welsh Government data are reviewed by a Welsh Government Data Access Panel. The Panel has been established to provide governance for access to individual-level and linked datasets for research or statistics. All Welsh Government data that meets this definition is within the scope of the Panel (with the sole exception of highly sensitive and specialised data). The Panel’s purpose is to assess questions such as the public benefit of proposed projects and to identify whether projects are feasible and analytically robust. It also reviews ethics documentation and assesses whether further ethics approvals are required.

More information

Data and information used for administrative data research is collected during the routine provision and delivery of services across the range of Welsh Government policy areas. Welsh Government staff clean, standardise and verify the data and consider what data is useful and appropriate for further research in SAIL. Suitable datasets are prepared in the correct file format so they can be transferred to DHCW and SAIL in accordance with the split file process described above.

Administrative data sent to the SAIL Databank is used for research purposes for the public benefit only, to produce research and journal reports. The reports are about the population of Wales in general, not about individual people.

ADR Wales publications can be found on the ADR Wales website.

Further information about the SAIL Databank can be found on the SAIL Databank website.

Further information about the Data Access Panel can be obtained by emailing: ADRWales@gov.wales

What personal data we hold and where we get this information

Personal data is defined under the UK General Data Protection Regulation (UK GDPR) as ‘any information relating to an identifiable person who can be directly or indirectly identified by reference to an identifier’.

Personal data that is used for administrative data research is collected during the routine provision and delivery of services across the range of Welsh Government policy areas. As this data is already collected for service delivery, administrative data research does not require the collection of additional personal data from you. Once it has been determined that administrative datasets are suitable for deposit into the SAIL Databank, the ADR Wales team in Welsh Government prepares them for the split file process.

This process means that identifiable data (‘File 1’ data) is separated from the administrative data fields. File 1 data includes:

  • NHS number
  • surname  
  • forename
  • address and postcode
  • date of birth
  • sex

Not all the data listed above is collected for every service across all the policy areas, but as many of these fields as are collected will be transferred to DHCW to allow data linking to take place.

How your data is processed

The Welsh Government has a data processing agreement with DHCW which sets out that the Welsh Government is the data controller and DHCW is the data processor. File 1 data is sent to DHCW (our trusted third party) as part of the split file process for matching and anonymisation as outlined in the diagram (figure 1) above. DHCW uses an automated matching algorithm and then removes identifiers and personal data and replaces them with an anonymised linkage field and reference number.

File 2 data relates to the administrative content or service aspect of the data (for example, in relation to education or health). File 2 data is separated from the personal data (in File 1) and sent directly to SAIL. This split file process means that DHCW will only ever see a list of people but never any of the administrative data about those people, and SAIL only ever sees de-identified datasets with no personal data included.

Once processed, the anonymised demographic elements of the datasets, which contain a reference number and an anonymised linking field, are sent to the SAIL Databank. There they are recombined with the File 2 content component of the dataset, making the dataset ready for linkage to other datasets.
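The split file process described above can be sketched in code. This is an illustrative sketch only, not the method used by Welsh Government, DHCW or SAIL. All function and field names are hypothetical, the example record is made up, and a keyed hash (HMAC) stands in for DHCW's matching algorithm, which this notice does not describe.

```python
import hashlib
import hmac

# Hypothetical column names for the File 1 identifiers listed in this notice.
IDENTIFIER_FIELDS = {
    "nhs_number", "surname", "forename", "postcode", "date_of_birth", "sex",
}


def split_file(records):
    """Data owner: split each record into File 1 (identifiers) and File 2 (content).

    Both halves carry the same reference number so they can be rejoined later.
    """
    file1, file2 = [], []
    for ref, record in enumerate(records, start=1):
        ids = {k: v for k, v in record.items() if k in IDENTIFIER_FIELDS}
        content = {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}
        file1.append({"ref": ref, **ids})
        file2.append({"ref": ref, **content})
    return file1, file2


def anonymise(file1, secret_key):
    """Trusted third party: replace identifiers with a linking field.

    ASSUMPTION: an HMAC over the identifiers is a stand-in, not the real algorithm.
    """
    anonymised = []
    for row in file1:
        identity = "|".join(str(row.get(k, "")) for k in sorted(IDENTIFIER_FIELDS))
        digest = hmac.new(secret_key, identity.encode("utf-8"), hashlib.sha256)
        anonymised.append({"ref": row["ref"], "linking_field": digest.hexdigest()})
    return anonymised


def recombine(anonymised, file2):
    """Research platform: rejoin the linking field with File 2 via the reference number."""
    by_ref = {row["ref"]: row["linking_field"] for row in anonymised}
    return [{"linking_field": by_ref[row["ref"]], **row} for row in file2]


# Made-up example record.
records = [{
    "nhs_number": "0000000000", "surname": "Example", "forename": "Alex",
    "postcode": "ZZ99 9ZZ", "date_of_birth": "2000-01-01", "sex": "F",
    "service": "education", "outcome": "attended",
}]

file1, file2 = split_file(records)                             # prepared by the data owner
anonymised = anonymise(file1, secret_key=b"hypothetical-key")  # sees File 1 only
deidentified = recombine(anonymised, file2)                    # sees no identifiers
```

In this sketch, `anonymise` never receives the service content and `recombine` never receives a name, address or date of birth, which mirrors the separation of roles described above.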

Using this process, your data is then linked anonymously to other data sources in the SAIL Databank for non‑commercial research purposes for the public benefit only. Once researchers have completed their analysis, all outputs produced using data held in SAIL are reviewed and checked to ensure safe, legal use of anonymised data for research. This review has two main goals: to prevent the release of potentially disclosive results (that could reveal information about an individual), and to ensure that the data is used in accordance with all policies and any specific approvals in place.
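One common way to reduce the risk of disclosive results is to suppress small counts in published tables. The sketch below illustrates that general technique only. This notice does not state which checks are applied to SAIL outputs, so the threshold, the function name and the example figures are all hypothetical.

```python
MIN_CELL_COUNT = 5  # hypothetical threshold, not a stated SAIL or Welsh Government rule


def suppress_small_counts(table, threshold=MIN_CELL_COUNT):
    """Return a copy of a {group: count} table with counts below `threshold` masked.

    Masked cells are set to None so they cannot be mistaken for real values.
    """
    return {
        group: (count if count >= threshold else None)
        for group, count in table.items()
    }


# Made-up aggregate counts by area.
counts = {"area_a": 120, "area_b": 3, "area_c": 48}
safe_counts = suppress_small_counts(counts)
```

Here the count for `area_b` falls below the hypothetical threshold and is masked, while the other two counts are left unchanged.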

The lawful basis for using your data

The lawful basis for processing information in this data collection exercise is our public task; that is, exercising our official authority to undertake the core role and functions of the Welsh Government. Some of the data across all administrative data collected are called ‘special category data’ (information about ethnicity, religion, sexual orientation and health). The lawful basis for processing this information is that it is for statistical or research purposes. However, this data is split from its identifying elements during the split file process and so is no longer associated with an individual.

Research studies using administrative data and data linking are important because they enable us to make decisions based on sound evidence. They allow the Welsh Government to gather population-level insights and evidence that can inform government priorities. Administrative data research has been used to:

  • identify early indicators to help safeguard victims of domestic abuse
  • help evaluate school-based interventions to prevent youth homelessness
  • inform policy and planning around active travel to school

The security of your personal data

Personal data processed and prepared for DHCW and SAIL by Welsh Government is held on secure servers. It is prepared in restricted areas and folders with access limited to defined members in the Welsh Government ADR Wales team. The information processed for SAIL and DHCW will only be used for research purposes and to draw conclusions about the general population, not at the level of individual people.

The research results are reported in an anonymised form. Reports will not contain any information that makes individuals identifiable.

The SAIL Databank is at the forefront of technology for providing microdata securely to researchers. This is shown in its accreditation to prepare and provide data to researchers under the Digital Economy Act 2017 (UK Statistics Authority). There are other platforms providing this service throughout the UK, but none with the wealth of Welsh data already available in the SAIL Databank. Both DHCW and SAIL hold ISO 27001 certification, an internationally recognised certification covering the effective management of information security risks and data.

De-identified administrative data is made available to researchers via SAIL. However, officially recognised researchers, such as academics and NHS researchers, and other public sector bodies in Wales may also request access to unlinked data. These requests are scrutinised by the Welsh Government data access panel. If approved, they are governed by data access agreements issued by the Welsh Government. These formal agreements set out strict requirements for the processing and safekeeping of personal data. The researcher is given access to the data via a secure platform under the terms of the agreement and will not be given access to any personal data.

The Welsh Government has procedures to deal with any suspected data security breaches. If a suspected breach occurs, the Welsh Government will notify you and any applicable regulator where we are legally required to do so.

How long we keep your personal data for

The ADR Wales team in Welsh Government deletes its copy of the data once the SAIL team confirms that the split file process is complete and that the data has been transferred successfully and quality checked. The only exception to this is where data copies need to be periodically retained to compare and improve the SAIL transfer process. The original datasets held by the Welsh Government collection team remain on the Welsh Government secure servers for service delivery, statistical and monitoring purposes.

In the split file process, DHCW has access to some of the File 1 personal data, including name, postal address, sex and date of birth (as noted above), solely to allow data linking with the SAIL Databank. Personal data is not retained once the anonymisation and linking process is finished and is automatically deleted by DHCW after processing.

The data in File 2 (non-identifiable event component) is retained in the SAIL Databank so that researchers can apply to access it as part of the de-identified datasets. Researchers will only be able to access the dataset for a limited period agreed during the project proposal process.

Individual rights

Under UK GDPR, you have the following rights in relation to the personal information you provide as part of administrative data research. Specifically, you have the right:

  • to access a copy of your own data
  • for us to rectify inaccuracies in that data
  • to object to or restrict processing (in certain circumstances)
  • for your data to be ‘erased’ (in certain circumstances)
  • to lodge a complaint with the Information Commissioner’s Office (ICO) who is our independent regulator for data protection

The contact details for the Information Commissioner’s Office are:

Information Commissioner’s Office,
Wycliffe House,
Water Lane,
Wilmslow,
Cheshire,
SK9 5AF

Phone: 0303 123 1113
Website: www.ico.org.uk

Further information

If you have any further questions about how the data provided as part of administrative data research will be used by the Welsh Government, or wish to exercise your rights under the UK General Data Protection Regulation, please contact:

The ADR Wales team
E-mail address: ADRWales@gov.wales

The Welsh Government’s Data Protection Officer can be contacted at:

Data Protection Officer,
Welsh Government,
Cathays Park,
Cardiff,
CF10 3NQ

Email: DataProtectionOfficer@gov.wales