Save and share data

The regular backup of research data plays a central role in research data management. It is the responsibility of the researchers, who are supported by the IT Service Center with the following services.

Need more storage for your research project?

The ITS provides central storage services. As a rule, you access the storage simply by mounting it as a network drive in the file system of your own computer. In addition to the personal "home directory", which is created automatically with the university account, you can request a "group resource" that gives a group of university-internal users, defined by you, access to the data. Both storage areas are automatically included in the nightly data backup.

SharePoint can help you collaborate on your research project, even with project members external to the university. It offers groupware, document and project management functionalities, but the data remains on servers of the University of Kassel.

(More information can be found here: SharePoint ITS)

Hessenbox can also help you collaborate on your research project. It supports sharing, versioning and synchronization between different end devices while ensuring security, confidentiality and access protection.

The IT Service Center operates a Linux cluster for scientific applications with high CPU and memory requirements. The Linux cluster is a network of computers running a Linux operating system. It consists of an access computer and computers for the actual job processing, on which the application programs run.

For special tasks, you can rent virtual servers at the IT Service Center (hosting) or place and operate your own server hardware (housing). Please contact the IT Service Desk for more information.

For regular backups of your data on workstations and servers within the university's data network, you can use the TSM (Tivoli Storage Manager) backup system operated by ITS.

FAQ

During the course of work, a large number of data sets are often created, and each of them may exist in several versions that reflect different stages of modification. To support efficient work, coordinated collaboration, long-term traceability and, where applicable, internal or external reuse, it is advisable to define specific conventions for naming and versioning data sets. It may also make sense to define additional folder structures according to the degree of processing. The conventions themselves should be documented.

Naming conventions may look very different depending on the specifics of the research area and the data. They should reflect what type of data file (original data / raw data, cleaned files, analysis files) or what file form (working file, results file, etc.) is involved. This differentiation can also be made via versioning conventions. Uniformity, unambiguity and meaningfulness are important.

Examples of meaningful file names:

  • [sediment]_[sample]_[instrument]_[YYYYMMDD].dat
  • [experiment]_[reagent]_[instrument]_[YYYYMMDD].csv
  • [experiment]_[experimental design]_[subject]_[YYYYMMDD].sav
  • [observation]_[location]_[YYYYMMDD].mp4
  • [interviewee]_[interviewer]_[YYYYMMDD].mp3

To ensure compatibility between different operating systems, special characters (except underscores and hyphens) and umlauts should be avoided. File names should not exceed 21 characters.
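The naming rules above can be checked automatically before files are shared. The following is a minimal sketch, not an official ITS tool: the function names `build_name` and `check_name` are hypothetical, and it assumes that the 21-character limit applies to the complete file name including the extension and that only unaccented Latin letters, digits, underscores and hyphens are allowed.

```python
import re
from datetime import date

# Assumption: a name consists of ASCII letters, digits, underscores and
# hyphens, followed by a single alphanumeric extension.
_ALLOWED = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+")


def build_name(parts, day, extension):
    """Join the name parts and a YYYYMMDD date with underscores."""
    return "_".join(list(parts) + [day.strftime("%Y%m%d")]) + "." + extension


def check_name(name, max_length=21):
    """Return a list of problems; an empty list means the name passes."""
    problems = []
    if not _ALLOWED.fullmatch(name):
        problems.append("disallowed characters")
    if len(name) > max_length:
        problems.append(f"longer than {max_length} characters")
    return problems


if __name__ == "__main__":
    name = build_name(["s01", "xrf"], date(2024, 3, 5), "dat")
    print(name, check_name(name))
```

Since `max_length` is a parameter, a project that documents a different limit in its own conventions can pass that value instead.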

Read-only versions should be created at various stages of modification (e.g., original data, cleaned data, analysis-ready data). Further edits should only be made to copies of these master files.
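One possible way to create such a master file is to copy the working file into a separate folder and remove the write permissions from the copy. The sketch below assumes a local or mounted file system that honours POSIX permission bits; the function name `freeze_copy` and the folder layout are hypothetical.

```python
import shutil
import stat
from pathlib import Path


def freeze_copy(source, archive_dir):
    """Copy `source` into `archive_dir` and make the copy read-only."""
    source = Path(source)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    target = archive_dir / source.name
    if target.exists():
        # Never overwrite an existing master file silently.
        raise FileExistsError(f"{target} already exists")

    shutil.copy2(source, target)  # copies content and timestamps

    # Clear the write bits for owner, group and others.
    mode = target.stat().st_mode
    target.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
    return target
```

Further work then continues on the original working file or on a new copy, while the frozen file documents the state at that stage.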

A well-known concept of versioning, based on the Data Documentation Initiative (DDI) standard, is:

Starting from version "v1-0-0", the following digits are incremented:

1. the first digit, if multiple cases, variables, waves or samples have been added or deleted

2. the second digit, if data are corrected so that the analysis is affected

3. the third digit, when simple revisions are made without relevance to meaning.
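The three rules above can be sketched as a small helper. This is an illustration only: the function name `bump` and the change labels "structure", "correction" and "revision" are hypothetical, and resetting the lower digits to zero when a higher digit changes is an assumption that the scheme described here does not state explicitly.

```python
import re

# Version labels of the form "v1-0-0".
_PATTERN = re.compile(r"v(\d+)-(\d+)-(\d+)")


def bump(version, change):
    """Return the next version label for the given kind of change.

    change: "structure"  -> cases, variables, waves or samples added/deleted
            "correction" -> data corrected so that the analysis is affected
            "revision"   -> simple revision without relevance to meaning
    """
    match = _PATTERN.fullmatch(version)
    if match is None:
        raise ValueError(f"not a valid version label: {version!r}")
    first, second, third = (int(group) for group in match.groups())

    if change == "structure":
        # Assumption: lower digits restart at zero.
        return f"v{first + 1}-0-0"
    if change == "correction":
        return f"v{first}-{second + 1}-0"
    if change == "revision":
        return f"v{first}-{second}-{third + 1}"
    raise ValueError(f"unknown kind of change: {change!r}")
```

For example, `bump("v1-0-0", "correction")` yields `"v1-1-0"`.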

Conventions should always be adapted to subject-specific or project-specific needs. If, for example, versions are not in a linear relationship to each other, the relationships can be defined via special metadata schemas (such as the DataCite Metadata Schema) using relation types such as "IsDerivedFrom" and "IsSourceOf".

Versioning can also be supported by appropriate software (e.g. Git).


These solutions only relate to the (temporary) backup of your work files and are limited in scope and function. They can be combined with other storage options (such as external storage media). However, they are not sufficient for storing data in accordance with good scientific practice. Cf.: ↗Where do I archive my data in the long term?


Unless otherwise noted, all texts on this site and its subpages are licensed under a Creative Commons Attribution 4.0 International License.