Data Management Planning
Metadata, most simply defined as “data about data,” is structured information that describes content and represents relationships. Metadata makes it easier for you and others to identify and reuse data correctly at a later date. Basic metadata generally includes:
- Who created the data
- What the data file contains
- When the data were generated
- Why the data were generated
- How the data were generated
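
The five items above can be kept in a small machine-readable file stored next to the data. The following is a minimal sketch, not a prescribed format: the `write_sidecar` helper, the `.metadata.json` suffix, the field names, and every value are hypothetical and chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_sidecar(data_file, metadata):
    """Write metadata beside the data file as '<file name>.metadata.json'."""
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar


# Placeholder answers to the who/what/when/why/how questions (all invented).
basic_metadata = {
    "who": "Jane Researcher, Example University",
    "what": "Hourly stream temperature readings, one row per reading",
    "when": "2018-06-01",
    "why": "Baseline measurements for a hypothetical water-quality study",
    "how": "Logged by a submerged temperature sensor and exported as CSV",
}

# Demonstrate in a temporary folder so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "stream_temps.csv"
    data_file.write_text("timestamp,temp_c\n2018-06-01T00:00,14.2\n", encoding="utf-8")
    sidecar = write_sidecar(data_file, basic_metadata)
    print(sidecar.name)
    print(sidecar.read_text(encoding="utf-8"))
```

Keeping the description in a file that shares the data file's name is one simple way to make sure the two travel together when the data are copied or shared.
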
Formal metadata standards, unlike basic documentation, allow research data from many researchers to be searched and aggregated. In other words, they make your data easier to find and to group with similar data. Which metadata standard is right for your data depends on the type, scale, and discipline of your research project. The UK's Digital Curation Centre maintains a list of metadata standards by discipline.
Some examples of common metadata standards are:
- Content Standard for Digital Geospatial Metadata: Geographic metadata standard.
- Darwin Core: Biodiversity metadata standard.
- Data Documentation Initiative: Metadata standard for surveys and other observational methods in social, behavioral, economic, and health sciences.
- Dublin Core: Broad and generic metadata standard, usable for describing a wide range of resources.
- Ecological Metadata Language: Earth, environmental, and ecological science metadata standard.
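
To give a sense of what a record in a formal standard looks like, here is a minimal sketch that builds a simple Dublin Core description as XML using Python's standard library. The element names (`title`, `creator`, `subject`, and so on) and the namespace URI come from the Dublin Core element set. The `metadata` wrapper element, the `build_dc_record` helper, and all of the values are assumptions made for illustration; a repository may expect a different container or additional fields.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML record; 'fields' maps an element name to a list of values."""
    record = ET.Element("metadata")  # wrapper element name is an assumption
    for name, values in fields.items():
        for value in values:
            element = ET.SubElement(record, "{%s}%s" % (DC_NS, name))
            element.text = value
    return record


# Invented example values; Dublin Core elements may be repeated (see 'subject').
fields = {
    "title": ["Stream temperature readings (example dataset)"],
    "creator": ["Researcher, Jane"],
    "subject": ["hydrology", "water temperature"],
    "date": ["2018-06-01"],
    "format": ["text/csv"],
    "language": ["en"],
    "rights": ["Placeholder rights statement"],
}

record = build_dc_record(fields)
if hasattr(ET, "indent"):  # pretty-printing helper, available in Python 3.9+
    ET.indent(record)
print(ET.tostring(record, encoding="unicode"))
```

Because every record uses the same element names, records from many researchers can be searched and combined by software, which is the main advantage a formal standard has over free-form notes.
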
If your field or data type has no formal standard, or if you just need a simpler system to keep track of your own data, consider the general guidelines below.
Source: Adapted from UCLA Libraries
Basics of Documenting Your Data
In all likelihood, you are already capturing much of the basic metadata about your research. Your lab notebooks, research files, and codebooks hold much of this information, such as:
- Researcher name
- Details of the experiment/analysis being run, including the purpose and methods used
- Sources of other data used in the experiment/analysis
It is important to begin documenting your data at the very beginning of your research project, even before data collection. Doing so makes documentation easier and reduces the likelihood that you will forget aspects of your data later in the project. Make sure you link all metadata to the data files themselves. If there are no specific metadata standards for your data, make sure to record the following elements.
|Element|Description|
|---|---|
|Title|Name of the dataset or of the research project that produced it|
|Creator|Names and addresses of the organizations or people who created the data|
|Identifier|Number used to identify the data, even if it is just an internal project reference number|
|Subject|Keywords or phrases describing the subject or content of the data|
|Funders|Organizations or agencies that funded the research|
|Rights|Any known intellectual property rights held for the data|
|Access information|Where and how other researchers can access your data|
|Language|Language(s) of the intellectual content of the resource, when applicable|
|Dates|Key dates associated with the data, including project start and end dates, release date, time period covered by the data, and other dates in the data lifespan (e.g., maintenance cycle, update schedule)|
|Location|Spatial coverage of the data, when the data relate to a physical location|
|Methodology|How the data were generated, including equipment or software used, experimental protocol, and anything else you might record in a lab notebook|
|Data processing|How the data have been altered or processed, recorded as you go|
|Sources|Citations for data derived from other sources, including where the source data are held and how they were accessed|
|List of file names|All data files associated with the project, with their names and file extensions (e.g., 'NWPalaceTR.WRL', 'stone.mov')|
|File formats|Format(s) of the data (e.g., FITS, SPSS, HTML, JPEG) and any software required to read the data|
|File structure|Organization of the data file(s) and the layout of the variables, when applicable|
|Variable list|List of variables in the data files, when applicable|
|Code lists|Explanation of codes or abbreviations used in the file names or in the variables in the data files (e.g., '999 indicates a missing value in the data')|
|Versions|Date/time stamp for each file, plus a separate ID for each version (see file organization)|
|Checksums|Values used to test whether a file has changed over time (see data backup)|
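
Several of the elements above (list of file names, versions, and checksums) can be generated automatically. The following is a minimal sketch, assuming SHA-256 as the checksum algorithm and an in-memory list of dictionaries as the manifest; the helper names `sha256_of`, `build_manifest`, and `changed_files` are hypothetical, and the guide itself does not prescribe a particular algorithm or layout.

```python
import hashlib
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """List every file under 'folder' with its size, modification time, and checksum."""
    manifest = []
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            stat = path.stat()
            modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
            manifest.append({
                "file": path.relative_to(folder).as_posix(),
                "bytes": stat.st_size,
                "modified_utc": modified.isoformat(),
                "sha256": sha256_of(path),
            })
    return manifest


def changed_files(old, new):
    """Return names of files that are new or whose checksum differs from 'old'."""
    old_sums = {entry["file"]: entry["sha256"] for entry in old}
    return [e["file"] for e in new if old_sums.get(e["file"]) != e["sha256"]]


# Demonstrate with two invented files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "readings.csv").write_text("id,value\n1,999\n", encoding="utf-8")
    (folder / "notes.txt").write_text("999 = missing value\n", encoding="utf-8")
    before = build_manifest(folder)
    # Simulate a later edit to one of the data files.
    (folder / "readings.csv").write_text("id,value\n1,12\n", encoding="utf-8")
    after = build_manifest(folder)
    print(changed_files(before, after))  # ['readings.csv']
```

Saving a manifest like this alongside each backup lets you compare checksums later and confirm that files have not been altered or corrupted in storage.
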
Source: Adapted from UCLA Libraries
- Last Updated: Jun 25, 2018 9:17 AM
- URL: http://libraryguides.fullerton.edu/DMP