Data explained
Sourcing data
What data do I need?
A list of the data collections currently available is provided on each of the data linkage unit websites. A list of the core datasets (those that are routinely linked) can be accessed here. Researchers are often encouraged to talk to either the custodians of the data collections they are requesting or the client services officer. These discussions cover the type of information held in the data collection, its quality, and whether the proposed research question is likely to be answerable with the data collections requested.
Who has the data?
Within the PHRN, the data linkage units are not data repositories and do not receive content data. The content data required for a research project is held by data custodians. A data custodian is the organisation or agency responsible for the collection, use and disclosure of information in a data collection. The data custodian is also responsible for contributing to the guidelines and approval processes on the use of the data, including involvement with ethics committees and input to the protocols surrounding data use.
In some cases the data linkage unit will act on the data custodians’ behalf and request that researchers contact the data linkage unit rather than the data custodians. The contact person for each of the core data collections is provided here. For datasets not listed, please contact the client services officer from the data linkage unit to determine the most appropriate person to contact.
Data flow
Who will I get the data from?
The researcher (contact investigator) will receive the data from each of the data custodians from which they requested data, from the data linkage unit, or from a combination of both. Some data linkage units assist with the preparation of data prior to release to researchers. The tasks associated with this service include pre-merge checking of data extracts, addition of derived variables to data extracts, merging of data extracts, post-merge checking prior to making data available to researchers, and provision of data to researchers.
How will I get my data?
There are currently several data transfer methods that researchers can use to send files to, and receive files from, data custodians and data linkage units. Data extracts can be transferred to the researcher via a secure transfer service, for example by being made accessible through SURE. Data should not be sent by e-mail.
Data format
What will my data look like when I get it?
As a researcher you will receive only the Project Person Numbers (PPN), Project Event Numbers (PPE) and their associated content variables, as listed in your approved application.
The amount of data researchers receive and how it is structured depend on the number of data files and fields requested, the temporal scope, and the size of the requested cohort.
Depending on the data linkage unit involved, the data may be provided to the researcher already merged. In most cases the researcher will receive the data as multiple files and be required to merge the data themselves. A separate file is usually provided for each data collection in each year. For example, a researcher applying for data from the birth registry, perinatal data collection and admitted patient data collection, for the date range 2000-2009, would typically receive 30 files in total (three data collections, each with ten yearly files).
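The file count and the merge step described above can be sketched in Python. This is only an illustration, not a PHRN-provided tool: it assumes tab-delimited extracts, and the column names (`PPN`, `PPE`, `birth_year`, `admit_year`), the sample rows and the function names are all hypothetical. Actual file layouts depend on the data linkage unit and data collection involved.

```python
import csv
import io
from collections import defaultdict


def expected_file_count(collections, first_year, last_year):
    """One file per data collection per year (the usual case described above)."""
    return len(collections) * (last_year - first_year + 1)


def stack_yearly_extracts(yearly_files):
    """Concatenate the rows of one data collection's yearly tab-delimited files."""
    rows = []
    for handle in yearly_files:
        rows.extend(csv.DictReader(handle, delimiter="\t"))
    return rows


def link_collections(collections, key="PPN"):
    """Group rows from every collection under the person identifier.

    `collections` maps a collection name to its stacked rows. The result maps
    each identifier to {collection name: [rows for that person]}.
    """
    linked = defaultdict(lambda: defaultdict(list))
    for name, rows in collections.items():
        for row in rows:
            linked[row[key]][name].append(row)
    return {ppn: dict(by_name) for ppn, by_name in linked.items()}


# Three collections over 2000-2009 inclusive.
requested = ["birth registry", "perinatal data collection",
             "admitted patient data collection"]
n_files = expected_file_count(requested, 2000, 2009)

# Hypothetical extracts, held in memory so the sketch is self-contained.
births_2000 = io.StringIO("PPN\tbirth_year\n001\t2000\n")
births_2001 = io.StringIO("PPN\tbirth_year\n002\t2001\n")
admissions_2001 = io.StringIO(
    "PPN\tPPE\tadmit_year\n001\tE1\t2001\n001\tE2\t2001\n"
)

linked = link_collections({
    "births": stack_yearly_extracts([births_2000, births_2001]),
    "admissions": stack_yearly_extracts([admissions_2001]),
})
```

Grouping by person rather than joining into one wide table keeps repeated events (for example several admissions for one person) as separate rows, which is usually what event-level analysis needs.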
The data will be delivered in a variety of formats, depending on the data linkage unit and data collection involved. Some data linkage units may deliver the data in a standardised format that can be easily read into any statistical analysis software, e.g. tab-delimited text files. In addition to the data files, researchers will also be given metadata for each corresponding data collection, including a data dictionary. The data dictionary provides coding information to assist researchers in interpreting the data.
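As a rough illustration of how a data dictionary helps interpret a tab-delimited extract, the Python sketch below decodes coded values into labels. The dictionary layout (columns `variable`, `code`, `label`), the variable `sex` and its codes are assumptions made for this example only. Real data dictionaries differ between data collections and are often supplied as documents rather than machine-readable files.

```python
import csv
import io


def load_data_dictionary(handle):
    """Build {variable: {code: label}} from a tab-delimited dictionary file."""
    lookup = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        lookup.setdefault(row["variable"], {})[row["code"]] = row["label"]
    return lookup


def decode_rows(handle, lookup):
    """Read a tab-delimited extract and replace known codes with their labels.

    Values with no matching dictionary entry are left unchanged.
    """
    decoded = []
    for row in csv.DictReader(handle, delimiter="\t"):
        decoded.append({
            variable: lookup.get(variable, {}).get(value, value)
            for variable, value in row.items()
        })
    return decoded


# Hypothetical dictionary and extract, held in memory for the example.
dictionary_file = io.StringIO(
    "variable\tcode\tlabel\nsex\t1\tMale\nsex\t2\tFemale\n"
)
extract_file = io.StringIO("PPN\tsex\n001\t2\n002\t9\n")

lookup = load_data_dictionary(dictionary_file)
decoded = decode_rows(extract_file, lookup)
```

Leaving unmatched codes untouched, rather than dropping them, makes it easier to spot values that the dictionary does not document and to raise them with the data custodian.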