It is important to organize your data so that others can easily use and understand it. Key practices for organizing your data include maintaining a consistent folder and file structure, using recommended file formats, and using a file naming convention (FNC). Tools such as LabArchives (an Electronic Lab Notebook) can help you keep your data organized and managed. Document and describe your folder structure in your readme.txt file.
LabArchives is the world’s leading ELN (Electronic Laboratory Notebook), with 700,000 scientists and more than 80,000 students using the LabArchives Platform this year. The University of Rochester is currently in the process of acquiring institutional access to LabArchives.
Please go to the River Campus Libraries Guide on LabArchives for more information.
Taguette is a free and open-source tool for qualitative research. You can import your research materials, highlight and tag quotes, and export the results.
It is imperative that you think carefully about the file formats you use to manage, share, and preserve your data, as technology is always changing, and software can become obsolete.
According to the DMPTool, formats likely to be accessible in the future are:
Examples of preferred format choices include:
Another good resource to use to learn more about file formats is UK Data Service Guidance on Recommended Formats.
├── bin                <- Your compiled model code can be stored here (not tracked by git)
├── config             <- Configuration files, e.g., for doxygen or for your model if needed
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
├── docs               <- Documentation, e.g., doxygen or scientific papers (not tracked by git)
├── notebooks          <- IPython or R notebooks
├── reports            <- For a manuscript source, e.g., LaTeX, Markdown, etc., or any project reports
│   └── figures        <- Figures for the manuscript or reports
└── src                <- Source code for this project
    ├── data           <- Scripts and programs to process data
    ├── external       <- Any external source code, e.g., pull other git projects, or external libraries
    ├── models         <- Source code for your own model
    ├── tools          <- Any helper scripts go here
    └── visualization  <- Scripts for visualization of your results, e.g., matplotlib, ggplot2 related.
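As an illustration only, a folder layout like the one above could be created with a short script so that every project starts from the same structure. This is a minimal sketch, not a required tool: the function name `create_project_skeleton` and the list `PROJECT_DIRS` are hypothetical, and placing `external`, `interim`, `processed`, and `raw` under a parent folder called `data` is an assumption.

```python
from pathlib import Path

# Folder names follow the example layout. Grouping external/interim/
# processed/raw under a parent folder named "data" is an assumption.
PROJECT_DIRS = [
    "bin",
    "config",
    "data/external",
    "data/interim",
    "data/processed",
    "data/raw",
    "docs",
    "notebooks",
    "reports/figures",
    "src/data",
    "src/external",
    "src/models",
    "src/tools",
    "src/visualization",
]


def create_project_skeleton(root):
    """Create every folder in PROJECT_DIRS under root and return the paths."""
    root = Path(root)
    created = []
    for relative in PROJECT_DIRS:
        path = root / relative
        # parents=True builds intermediate folders such as "data" or "src";
        # exist_ok=True makes it safe to run the function more than once.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_project_skeleton("my_project")` would create the folders under a hypothetical `my_project` directory. Running it again leaves existing folders and their contents untouched.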
A file naming convention (FNC) is a framework for naming your files in a way that describes what they are and how they relate to other files. It is important to create the FNC at the very beginning of the project. Make sure everyone involved in the research project is aware of the FNC and uses it consistently. Record the FNC in your readme.txt file and in the data documentation section of your research data management and sharing plan.
General rules to follow include:
Information to consider including in your FNC:
Record the formula for the FNC in your readme.txt file, along with the meanings of any acronyms used in the FNC.
|Date||The date the interview was taken in YYYYMMDD format.|
|Interviewee||Pseudonym of the interviewee.|
|Document Type||Which document type this is: Notes (raw notes taken by the interviewer during the interview process) or Transcript (transcript created from the audio file of the interview).|
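To show how the interview elements above (date, interviewee pseudonym, and document type) might be combined into a file name, here is a minimal sketch. The element order, the underscore separator, the default `txt` extension, and the function name `interview_file_name` are all assumptions made for illustration; your own FNC formula may differ.

```python
from datetime import date

# Document types listed in the interview example.
DOCUMENT_TYPES = {"Notes", "Transcript"}


def interview_file_name(interview_date, interviewee, document_type, extension="txt"):
    """Build a name of the assumed form YYYYMMDD_Interviewee_DocumentType.ext."""
    if document_type not in DOCUMENT_TYPES:
        raise ValueError(f"Unknown document type: {document_type!r}")
    # Keep the pseudonym free of spaces and punctuation so the
    # underscore separator stays unambiguous.
    if not interviewee.isalnum():
        raise ValueError(f"Pseudonym must be alphanumeric: {interviewee!r}")
    # The %Y%m%d format code yields the YYYYMMDD form used in the convention.
    return f"{interview_date:%Y%m%d}_{interviewee}_{document_type}.{extension}"
```

For example, `interview_file_name(date(2024, 3, 5), "Maple", "Transcript")` returns `20240305_Maple_Transcript.txt`, where "Maple" is a made-up pseudonym. Generating names from one function, rather than typing them by hand, is one way to keep all project members consistent.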
|Location||The location where the sample was taken: ERI (Lake Erie) or ONT (Lake Ontario).|
|Date||The date the sample was taken in YYYYMMDD format.|
|Version Number||The version number of the table. Record as vXX.|
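A convention like the sample example above can also be checked automatically. The sketch below parses a file name into its location, date, and version and rejects names that do not conform. It assumes the elements appear in the order location, date, version, are separated by underscores, and that the version uses exactly two digits (reading `vXX` literally); the names `SAMPLE_NAME` and `parse_sample_file_name` are hypothetical.

```python
import re
from datetime import datetime

# Location codes taken from the sample example.
LOCATIONS = {"ERI": "Lake Erie", "ONT": "Lake Ontario"}

# Assumed formula: LOCATION_YYYYMMDD_vXX.extension
SAMPLE_NAME = re.compile(
    r"^(?P<location>[A-Z]{3})_(?P<date>\d{8})_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def parse_sample_file_name(name):
    """Split a sample file name into its elements, or raise ValueError."""
    match = SAMPLE_NAME.match(name)
    if match is None:
        raise ValueError(f"Name does not follow the convention: {name!r}")
    code = match["location"]
    if code not in LOCATIONS:
        raise ValueError(f"Unknown location code: {code!r}")
    # strptime raises ValueError for impossible dates such as month 13.
    sampled = datetime.strptime(match["date"], "%Y%m%d").date()
    return {
        "location": LOCATIONS[code],
        "date": sampled,
        "version": int(match["version"]),
    }
```

With these assumptions, `parse_sample_file_name("ERI_20230814_v02.csv")` yields the location "Lake Erie", the date 14 August 2023, and version 2, while a name with an unlisted location code or an impossible date raises `ValueError`.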