How to estimate the costs of data management?
Description
Data management incurs costs. Expenses may include staff time, tools, and services needed to manage the research data throughout the life cycle of the project. Estimate these costs and address the required resources in your data management plan.
Considerations
- Budgeting and costing for data management often depends on local and temporal circumstances, institutional resources, services, and policies. Take into account items such as investments (site services, infrastructure), operations (network, electricity, maintenance), and personnel costs.
- Most research funders will cover justifiable research data management (RDM) costs. In funding applications, remember that there are typically two types of eligible costs: direct costs, such as staff time and equipment, and indirect costs, such as administrative and financial management.
- Note that data may incur costs even after the end of the project.
- Planning and implementing data management well can help you avoid unexpected costs later in your research project.
- The bigger your project (or, e.g., infrastructure) is and the more partners are involved, the more useful it is to consider the measures needed to implement and operationalise data management. For example, do you need a dedicated data manager? Should roles and responsibilities for the various data management activities be allocated? Do you need training, other resources, or extra time? Take all of these aspects into consideration when addressing data management costs.
Solutions
- To get an overview of possible costs in your research project, go through the research life cycle phases and the activities specific to your project.
- Some organisations have created tools to help their users formulate and budget data management costs, such as the Data Stewardship Wizard Storage Costs Evaluator, the UK Data Service Data Management costing Tool, and the TU Delft data management costing tool. These tools can help to budget for personnel costs and/or additional costs of preserving and sharing research data beyond a research project.
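Going through the life cycle phases and their activities can be sketched as a simple activity-based calculation. The following is a minimal illustration, not the method used by any of the tools named above. The phase names, activities, hours, hourly rate, and other costs are all hypothetical figures chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    phase: str                # research life cycle phase, e.g. "collect"
    name: str                 # data management activity
    hours: float              # estimated staff time
    hourly_rate: float        # assumed cost per staff hour
    other_costs: float = 0.0  # tools, services, equipment


def cost_per_phase(activities):
    """Sum staff time and other costs for each life cycle phase."""
    totals = {}
    for a in activities:
        cost = a.hours * a.hourly_rate + a.other_costs
        totals[a.phase] = totals.get(a.phase, 0.0) + cost
    return totals


# Hypothetical figures, for illustration only.
activities = [
    Activity("collect", "set up file structure and templates", 16, 50.0),
    Activity("process", "pseudonymisation", 40, 50.0, other_costs=500.0),
    Activity("process", "documentation", 24, 50.0),
    Activity("share", "deposit in repository", 8, 50.0, other_costs=200.0),
]

totals = cost_per_phase(activities)
for phase, cost in totals.items():
    print(f"{phase}: {cost:.2f}")
print(f"total: {sum(totals.values()):.2f}")
```

Listing activities one by one makes it easier to see which phases drive the budget and to justify each item in a funding application.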
Costs for data stewards
- Personnel costs for data stewards are an eligible cost in many projects, although there may be limits on the number of full-time equivalents (FTE). Check if this cost is eligible in your grant.
- Consider which data management tasks will be carried out by a data steward, and how many person-months will be needed for that.
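The person-month estimate can be turned into a budget line and an average FTE figure with simple arithmetic. The sketch below assumes a flat monthly cost per full-time data steward and a funder limit on FTE; the values `monthly_cost_per_fte`, `max_fte`, and the project length are hypothetical and should be replaced with figures from your institution and grant.

```python
def steward_cost(person_months, monthly_cost_per_fte):
    """Total personnel cost for the planned person-months."""
    return person_months * monthly_cost_per_fte


def average_fte(person_months, project_months):
    """Average fraction of a full-time position over the project."""
    return person_months / project_months


# Hypothetical figures, for illustration only.
person_months = 9              # planned data steward effort
project_months = 36            # project duration
monthly_cost_per_fte = 6000.0  # assumed full monthly cost of one steward
max_fte = 0.5                  # assumed funder limit

cost = steward_cost(person_months, monthly_cost_per_fte)
fte = average_fte(person_months, project_months)

print(f"data steward cost: {cost:.2f}")
print(f"average FTE: {fte:.2f} (within limit: {fte <= max_fte})")
```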
Costs for data collection
- Collecting and reusing data: Collecting data sometimes involves equipment or services that may incur costs. On the other hand, if you reuse data from a data repository, it is worth checking whether using the service involves any costs.
- Granularity of data: When collecting data it can be tempting to collect more data than is required to answer the research question. It is therefore important to consider what data is needed: the more data you have, the more expensive it gets to store, clean, and transfer.
- Organizing and formatting data: Keeping data organized from the start of the project will help to manage the data later. Keep an up-to-date data catalogue or registry of data provenance, and plan a clear file structure, file names, and templates beforehand to lower the cost and time spent on organizing the data.
Costs for data processing and data documentation
- Anonymisation & pseudonymisation: When working with sensitive or confidential data, consider the possibility of anonymising or pseudonymising the data for controlled or public release. Costs may arise either from the time spent anonymising the data or from buying the service from an expert.
- Digitisation: Some project data can be in a paper-based or analogue format that needs to be digitised. Consider if special services, equipment or software are needed and if results need to be manually checked. These may incur direct costs or take time.
- Data documentation: To make data understandable, it needs to be documented, which means creating information that enables others to interpret the data correctly and independently. Describing the data context, methodology, and creation process, what the variables and abbreviations mean, and how the data was processed and quality controlled takes time.
- Data cleaning: Cleaning data files and verifying data take time and care. Cleaning data files for sharing at the end of the project takes much more time than keeping the data well organised while collecting and processing it during the project.
Costs for data storage, access control and security measures
- Data storage and back-up: Regardless of the types of data your research project has, the data needs to be stored in a secure place with adequate access control. Consider your needs for active data storage during the lifetime of the project, as well as archiving needs beyond the end of the study. Ask your local service provider about storage costs.
- Access and security: To protect data from unauthorised use, consider what kind of access control and security measures your data needs. Especially if you work with sensitive or confidential data, some solutions may incur costs, for example if you need specially protected servers or services, or software for encrypting data.
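Storage needs during and after the project can be estimated with a short calculation once you know the price from your service provider. The sketch below assumes a simple price per GB per year and counts back-up copies as additional stored volume; real pricing models may differ (tiers, flat fees, egress charges), and all volumes and prices here are hypothetical.

```python
def storage_cost(volume_gb, price_per_gb_year, years, copies=1):
    """Cost of keeping `copies` copies of `volume_gb` for `years` years."""
    return volume_gb * copies * price_per_gb_year * years


# Hypothetical figures, for illustration only.
# Active storage during the project: primary copy plus one back-up.
active = storage_cost(volume_gb=2000, price_per_gb_year=0.5, years=3, copies=2)

# Archiving after the project: a smaller curated subset, kept for 10 years.
archive = storage_cost(volume_gb=500, price_per_gb_year=0.25, years=10)

print(f"active storage: {active:.2f}")
print(f"archive storage: {archive:.2f}")
print(f"total: {active + archive:.2f}")
```

Separating active storage from archiving makes the costs that continue beyond the end of the project visible in the budget.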
Costs for making data available (publishing/sharing) and preserving
- Data sharing: Publishing and sharing data is highly recommended, and there are many services available for that. Depending on which service suits your needs, the service provider may charge for keeping the data available, for example a sum per GB or per year. Also take into account that cleaning the data into a format in which it can be shared, as well as creating discovery metadata, will take time. The most straightforward and often most cost-effective way to make data available is to deposit it in a public data repository service.
- Data reuse: When making your research data available to others, you have to consider if other parties hold copyright in the data and if you need to seek copyright clearance before sharing data. Costs may arise from legal advice and the time required to seek copyright clearance. Read more about licensing research data.
- File formats: Data analysis sometimes involves data conversion that changes the structure of the data. Converting data to a different (open or standard) format for sharing or preservation can incur costs, so it is important to ask: Is additional software or hardware required for data conversion? How much additional space is required to store the new files? Does the converted data need to be stored elsewhere, and do the source files need to be transferred too?
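The questions about additional space and repository charges can be explored with a rough estimate. The sketch below assumes that the size of converted files can be approximated by a ratio measured on a sample, and that the repository charges an optional one-off fee plus a price per GB per year. The ratio, fees, volumes, and retention period are hypothetical, and actual repository pricing may follow a different model.

```python
def converted_volume_gb(source_gb, size_ratio):
    """Estimated size after conversion.

    size_ratio is converted size divided by source size, an assumption
    best measured by converting a small sample of the data.
    """
    return source_gb * size_ratio


def deposit_cost(volume_gb, fee_per_gb_year, years, one_off_fee=0.0):
    """Repository charge for keeping `volume_gb` available for `years` years."""
    return one_off_fee + volume_gb * fee_per_gb_year * years


# Hypothetical figures, for illustration only.
source_gb = 400
converted_gb = converted_volume_gb(source_gb, size_ratio=1.5)

keep_source_files = True  # deposit the source files alongside the converted ones
total_gb = converted_gb + (source_gb if keep_source_files else 0)

cost = deposit_cost(total_gb, fee_per_gb_year=0.25, years=10, one_off_fee=100.0)

print(f"converted volume: {converted_gb:.0f} GB")
print(f"volume to deposit: {total_gb:.0f} GB")
print(f"deposit cost: {cost:.2f}")
```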
More information
Tools and resources on this page
Tool or resource | Description |
---|---|
Data Stewardship Wizard Storage Costs Evaluator | This service provides simple estimation of storage costs based on desired properties and local/actual configuration. |
TU Delft data management costing tool | TU Delft costing tool helps to budget for data management personnel costs in proposals. |
UK Data Service Data Management costing Tool | UK Data Service activity-based costing tool. |