A data management plan (DMP) is a formal plan describing how research data are managed throughout the lifecycle of a research project. Plans cover topics such as data collection, metadata and documentation, data sharing, and preservation. Portage DMP Assistant is a tool for preparing DMPs, with the option to initiate project plans, collaborate with multiple researchers, and connect with local data management guidance and support at your academic institution.
For additional help with data management planning, contact a local representative.
Creating a data management plan saves time in the long run by integrating processes within and after the life of a project. A good plan minimizes the need to reorganize, reformat, or attempt to remember details about data when it comes time to disseminate and share it with others.
Internationally, many funding agencies have required data management planning for some time now (e.g., NSF and NIH in the United States). In Canada, major funders are also developing research data management policies. The Tri-agencies (CIHR, NSERC, and SSHRC), for instance, have adopted a Statement of Principles on Digital Data Management that promotes excellence in digital data management practices and data stewardship for agency-funded research. CIHR already requires deposit of certain data types, and it is expected that all three agencies will establish clear data management requirements in the near future.
Start early. Clear, comprehensive documentation to support data reuse and descriptive metadata to enhance usability are essential to the long-term utility of your data. Ideally, documentation should be produced from the outset of a project and enhanced throughout the data lifecycle. Similarly, good data result from good planning: considering ahead of time how data will be collected, coded, and stored will save time in the long run.
Data professionals can provide valuable guidance and support at the initial stages of the research project. This can significantly reduce the time and money needed to ensure long-term access. Contact a local representative.
Portage respects the privacy of researchers and their intellectual property throughout the entire data management process and the broader research lifecycle. This extends to use of the Portage online data management planning tool, the DMP Assistant.
Information added to the DMP Assistant is only accessible to the researcher and those with whom the researcher chooses to share access. Administrators of the DMP Assistant do not have access to researchers’ plans.
In some instances, it is clear that data management plans are ultimately intended to be shared – as part of grant applications to funding agencies, for example. In other instances, there may be no external motivation to share a plan. It is hoped that many researchers will be willing to share their completed data management plans for a variety of reasons including helping to broaden Portage’s understanding of how the tool is being used and to improve it for future researchers.
Data management plans are formal plans that are meant to be shared openly to inform others about planning for a particular project. If for some reason a DMP cannot be shared, you may still want to export a DMP to save locally or attach to a report or application of some kind to record the planning of your data management.
DMP Assistant, the tool for preparing data management plans (DMPs), allows for plans to be shared with others, including research collaborators to whom you have given permission to view or edit a DMP. This makes it easy for multiple researchers to collaborate on a DMP and invite reviewers to contribute. You can also export your DMP as a PDF from the DMP Assistant tool, to save locally or attach to an application or reference.
In future iterations of the DMP Assistant, a DMP repository will be set up and integrated to support easy publishing of DMPs to a central repository for others to find and view DMPs from across Canada. This will improve understanding of DMPs and specifically help others to learn more about the data management process surrounding your research.
Data specialists (often librarians) understand the management of data across project-level activities and the stewardship of research data after a project finishes and into the future. A data specialist can be particularly helpful when preparing a data management plan. They can provide advice on tools that integrate metadata with data production; recommend efficient workflow practices for producing data; identify services that can help with data management; and suggest strategies for choosing a data format, a data security protocol, an appropriate persistent identifier, a data repository, or a data license template, among many other data management issues.
Data are rarely self-explanatory. As such, all research data should be accompanied by documentation, or metadata (information that describes the data according to community best practices). Metadata standards vary across disciplines. All generally describe who created the data and when; how the data were created; the quality, accuracy, and precision of the data; as well as other characteristics necessary to facilitate data discovery, understanding, and reuse.
Any restrictions on use of the data must be explained in the metadata, along with information about obtaining approval for access to the data, where possible.
Consult with your local data specialist to identify an appropriate metadata standard for your research.
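As an illustration of the kind of information described above, the following minimal sketch assembles a small descriptive record and serializes it as JSON so it can be stored alongside a data file. The field names and all example values are hypothetical and do not correspond to any particular metadata standard; a data specialist can point you to the standard used in your discipline.

```python
import json
from datetime import date


def build_metadata(title, creators, created, methods, quality_notes, access_restrictions):
    """Assemble a small descriptive record.

    The field names are illustrative assumptions, not a formal metadata standard.
    """
    return {
        "title": title,
        "creators": creators,                    # who created the data
        "date_created": created.isoformat(),     # when (YYYY-MM-DD)
        "methods": methods,                      # how the data were created
        "quality_notes": quality_notes,          # quality, accuracy, precision
        "access_restrictions": access_restrictions,  # any limits on use
    }


# Hypothetical example values.
record = build_metadata(
    title="Example stream temperature survey",
    creators=["A. Researcher"],
    created=date(2016, 5, 1),
    methods="Hourly logger readings (hypothetical example)",
    quality_notes="Sensor accuracy +/- 0.2 C (hypothetical)",
    access_restrictions="None",
)

# Serialize to text that can be saved next to the data file.
metadata_json = json.dumps(record, indent=2, sort_keys=True)
print(metadata_json)
```

Keeping the record in a plain-text format such as JSON means it can be read without specialized software.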
Use open (non-proprietary) file formats whenever possible. Proprietary file formats that require specialized software or hardware are not recommended but may be necessary for certain types of data collection or analysis. Using open file formats or industry-standard formats, especially those widely used by a given community, is preferred whenever possible. Many software packages support exporting data in non-proprietary formats.
Recommended file formats include plain text (‘.txt’) and character-separated (‘.csv’) files; these are software- and hardware-independent and are therefore considered preservation friendly. Keep in mind that converting proprietary files into more open formats may result in the loss of some information (e.g., converting an uncompressed TIFF file to a compressed JPG file discards some image data, and macros and formatting in Excel files are not retained when spreadsheets are converted to .csv format). As such, changes to file formats should be documented and the original files saved alongside the preservation-friendly versions.
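A minimal sketch of documenting a format change: it converts tab-separated text to comma-separated text using Python's standard csv module and produces a short note recording what was converted and what may have been lost. The function names, file names, and the wording of the note are assumptions made for illustration only.

```python
import csv
import io


def tsv_to_csv(tsv_text):
    """Convert tab-separated text to comma-separated text.

    Returns the CSV text and the number of rows converted (header included).
    """
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    out = io.StringIO()
    # The writer quotes any field that itself contains a comma.
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue(), len(rows)


def conversion_note(source_name, target_name, row_count, known_losses):
    """Build a one-line record of the format change for the project documentation."""
    return (
        f"Converted {source_name} to {target_name}; "
        f"{row_count} rows; known losses: {known_losses}"
    )


# Hypothetical input data and file names.
tsv_text = "site\tnote\nA1\tclear, cold\n"
csv_text, n_rows = tsv_to_csv(tsv_text)
note = conversion_note("survey.tsv", "survey.csv", n_rows, "none expected")
print(csv_text)
print(note)
```

The original file would be kept unchanged next to the converted copy, with the note added to the project documentation.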
In summary, consider file formats that are:
(Louise Corti 2014)
Name files and folders clearly to ensure your data can be uniquely identified and made accessible for future uses.
Organizing and labeling data files and folders systematically will:
When developing file-naming conventions, consider the following three criteria:
(Derived in part from: http://datalib.edina.ac.uk/mantra/organisingdata/) (EDINA and Data Library – University of Edinburgh 2016)
Writing dates as YYYY-MM-DD helps avoid ambiguity, supports sorting, and follows the international standard ISO 8601; http://www.iso.org/iso/home/standards/iso8601.htm
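The effect of ISO 8601 dates on sorting can be sketched as follows. The naming pattern used here (project, description, date, version) is a hypothetical convention chosen for illustration, not one prescribed by this guidance.

```python
import re
from datetime import date


def build_filename(project, description, on, version, extension):
    """Build a file name from hypothetical parts, with the date in ISO 8601 form."""
    parts = [project, description, on.isoformat(), f"v{version:02d}"]
    # Replace spaces and special characters with hyphens to keep names portable.
    safe = [re.sub(r"[^A-Za-z0-9-]+", "-", part).strip("-") for part in parts]
    return "_".join(safe) + "." + extension


# Hypothetical project and file descriptions.
december = build_filename("streamtemp", "site A raw", date(2016, 12, 1), 1, "csv")
may = build_filename("streamtemp", "site A raw", date(2016, 5, 1), 1, "csv")

# With YYYY-MM-DD dates, alphabetical order matches chronological order.
ordered = sorted([december, may])
print(ordered)
```

Because the year comes first and months and days are zero-padded, an ordinary alphabetical sort in a file browser lists the files in date order.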
Planning how research data will be stored and backed up throughout and beyond a research project is a critical component of data security and integrity. Appropriate storage and backup not only helps protect research data from catastrophic losses (due to hardware or software failures, viruses, hackers, natural disasters, human error, etc.), but also facilitates appropriate access by current and future researchers.
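The risk of losing data due to human error, natural disasters, or other mishaps can be mitigated by following the 3-2-1 backup rule: keep at least three copies of your data, on at least two different types of storage media, with at least one copy stored offsite.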
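A minimal sketch of checking a backup inventory against the 3-2-1 rule as it is commonly stated (three copies, two types of storage media, one copy offsite). The Copy class, its fields, and the example inventory are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the data (hypothetical inventory entry)."""
    label: str
    medium: str     # e.g. "internal disk", "external drive", "cloud"
    offsite: bool   # True if stored away from the primary location


def satisfies_3_2_1(copies):
    """Return True if there are >= 3 copies, >= 2 media types, and >= 1 offsite copy."""
    media = {c.medium for c in copies}
    return len(copies) >= 3 and len(media) >= 2 and any(c.offsite for c in copies)


# Hypothetical inventory: a working copy plus two backups.
inventory = [
    Copy("working copy", "internal disk", offsite=False),
    Copy("lab backup", "external drive", offsite=False),
    Copy("institutional storage", "cloud", offsite=True),
]

print(satisfies_3_2_1(inventory))      # all three conditions met
print(satisfies_3_2_1(inventory[:2]))  # only two copies, none offsite
```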
Reputable repositories already exist in some disciplines. Data from these disciplines should be deposited accordingly. Individual institutions may also have data repositories, and these may be appropriate in some instances. An excellent online registry of data repositories, re3data.org, provides the identity and location of both disciplinary and institutional repositories. Subject librarians at your university may be able to provide further assistance. Contact information for university libraries that have dedicated data management services or librarians is available here.
Reputable data repositories offer several benefits for those seeking to deposit data; these include:
Derived from: Corti, L., Van den Eynden, V., Bishop, L., and Woollard, M. (2014). Managing and Sharing Research Data: A Guide to Good Practice. Sage.
The key here is citation and unique IDs. Research data should be cited for the same reasons that journal articles, books, and other scholarly works are cited: to acknowledge the original author or producer and to help other researchers find the resource. This entails making data both findable and citable.
The use of unique and persistent identifiers is particularly helpful in this regard. For example, accounts on ORCiD, ImpactStory, ResearchGate, or Google Scholar can be adapted for use with research data. Furthermore, unique Digital Object Identifiers (or DOIs) can be assigned to data files, just as with scholarly publications.
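As a rough illustration of how a DOI makes a dataset citable, the sketch below assembles a citation string from a few descriptive elements. The creator, year, title, publisher, identifier pattern is an assumption chosen for illustration; the example values, including the DOI, are placeholders, and the repository or journal you use may prescribe its own citation format.

```python
def format_data_citation(creators, year, title, publisher, doi):
    """Build a simple data citation ending in a resolvable DOI link.

    The element order is an illustrative assumption, not a mandated style.
    """
    authors = "; ".join(creators)
    return f"{authors} ({year}). {title}. {publisher}. https://doi.org/{doi}"


# Placeholder values; the DOI below does not refer to a real dataset.
citation = format_data_citation(
    creators=["Researcher, A.", "Colleague, B."],
    year=2016,
    title="Example stream temperature survey",
    publisher="Example Data Repository",
    doi="10.5072/example",
)
print(citation)
```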
While data sharing contributes to the visibility and impact of research, it has to be balanced with the legitimate desire of researchers to maximize their research outputs. Equally important is the need to protect the privacy of respondents and to handle sensitive data properly.
Consider where, how, and with whom sensitive data will be made available and how it will be preserved and accessed in the future. Is it possible to create a public version of sensitive data, while retaining the usefulness of the file? These decisions should align with Research Ethics Board requirements. The methods used to share data will be dependent on a number of factors such as the type, size, complexity, and degree of sensitivity of data.
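One simple step toward a public version of a sensitive file is to drop columns that identify respondents directly. The sketch below is illustrative only: the column names and data are hypothetical, and removing direct identifiers does not by itself guarantee that respondents cannot be re-identified, so any such approach should be reviewed against Research Ethics Board requirements.

```python
import csv
import io


def make_public_version(csv_text, direct_identifiers):
    """Return CSV text with the named identifier columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    keep = [name for name in reader.fieldnames if name not in direct_identifiers]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the columns that are not in `keep`.
    writer = csv.DictWriter(out, fieldnames=keep, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical survey data with two direct identifiers.
restricted = "name,postal_code,age_group,response\nJ. Doe,A1A 1A1,30-39,agree\n"
public = make_public_version(restricted, {"name", "postal_code"})
print(public)
```

The restricted original would be retained under appropriate access controls, with the public file shared more widely.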