tutorials:file_management
Differences
This shows you the differences between two versions of the page.
| Next revision | Previous revision | ||
| tutorials:file_management [2023/08/20 23:36] – created chkuo | tutorials:file_management [2025/09/09 20:05] (current) – chkuo | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== File Management ====== | ====== File Management ====== | ||
| + | * Guidelines that Chih-Horng Kuo (chk@gate.sinica.edu.tw) developed for our group members. Suggestions are welcome. | ||
| + | * Related information: | ||
| + | * [[tutorials: | ||
| + | * [[tutorials: | ||
| ===== Backup ===== | ===== Backup ===== | ||
| + | * <color # | ||
| + | * File loss happens. Not __IF__, but __WHEN__. | ||
| * Establish an automatic backup plan; manual backup plans are not practical | * Establish an automatic backup plan; manual backup plans are not practical | ||
| - | | + | |
| - | * Linux: rsync scripts | + | * Linux: |
| + | ===== Working with collaborators ===== | ||
| + | * Discuss and set up a workflow | ||
| + | * Avoid emailing files as attachments; | ||
| + | * Setting up a file server vs. using a commercial service | ||
| + | * File server: | ||
| + | * Pros: Full control, large storage, no vendor lock-in | ||
| + | * Cons: Requires hardware, setup, and maintenance | ||
| + | * File sharing service: | ||
| + | * Pros: Easy, low cost | ||
| + | * Cons: Limited space, less control over structure, potential data leakage | ||
| - | * Naming and version control | + | ===== File directory structure ===== |
| - | * File names | + | * Make it easy to identify the parts that need to be included in backup |
| - | * Do: short and informative. Include information such as project id, keyword (e.g. manuscript, report, keyword of figure/table), version or date | + | * Example for personal desktops/laptops |
| - | * Avoid: long names that are difficult to read (e.g., | * A few top-level folders with clear naming for easy backup/ | ||
| - | * Avoid: short names that are not informative (e.g., manuscript.docx, | + | * Example for shared servers |
| - | * Avoid: space or special characters | + | * Depends on the research group, for example: data/, lab_doc/, project/, conference/ |
| - | * Avoid: " | + | * Clear rules, communication, and enforcement |
| - | * Versioning | + | * A proper directory hierarchy helps to keep files well-organized, even when short file names are used |
| - | * Version number should be the last part of the file name; two digits should be sufficient | + | |
| - | * Optional: Add initials if multiple people are involved in the project | + | |
| - | * When in doubt, save as a new version | + | |
| - | * Recommended: | + | |
| - | * Benefits | + | |
| - | * Write as much as possible without worrying if those parts will be kept in later versions | + | |
| - | * Throw away as much as needed to make a good story; you can always go back to previous versions to retrieve the deleted materials | + | |
| - | * Project progress report | + | |
| - | * In the beginning of the document, include: | + | |
| - | * Name (__who__ prepared this report?) | + | |
| - | * Date (__when__ was it prepared? | + | |
| - | * Project id (__what__ is it about?) | + | |
| - | * Directory of relevant files (on lab servers for people to get raw data files if needed) | + | |
| - | * For long reports, prepare a short summary section | + | |
| - | * Branching | + | |
| - | * When multiple people | + | |
| - | * It is important | + | |
| - | * Example | + | |
| - | * For project " | + | |
| - | * ABC then saved the file as " | + | |
| - | * DEF creates " | + | |
| - | * GHI creates " | + | |
| - | * ABC being the project leader, should be responsible to set the deadline, collect the files, | + | |
| - | * Figure/ | + | |
| - | * Before finalizing the order, use names without fig/table number and with a keyword (e.g., " | + | |
| - | * After finalizing the order, put all early versions into a separate folder (e.g., " | + | |
| - | * If the order changed, put the previous versions into another separate folder (e.g., " | + | |
| - | | + | ===== Naming and version control ===== |
| - | * For Word files: better | + | |
| - | * For Google Docs: all changes are automatically tracked. Manually name and download | + | * Do: short and informative. Include information such as project id, keyword (e.g. manuscript, report, keyword of figure/ |
| + | * Avoid: | ||
| + | * Long names that are difficult | ||
| + | * Short names that are not informative (e.g., manuscript.docx, | ||
| + | * Spaces or special characters (can cause problems across systems) | ||
| + | * Names such as "xxx_final.docx", " | ||
| + | * Versioning | ||
| + | * Version number should be the last part of the file name; two digits should be sufficient | ||
| + | * Optional: Add initials if multiple people are involved in the project | ||
| + | * When in doubt, save as a new version | ||
| + | * Recommended: | ||
| + | * Benefits | ||
| + | * Write as much as possible | ||
| + | * Throw away as much as needed | ||
| + | * Project progress report | ||
| + | * At the beginning of the document, include: | ||
| + | * Name (__who__ prepared this report?) | ||
| + | * Date (__when__ was it prepared? | ||
| + | * Project id (__what__ is it about?) | ||
| + | * Directory of relevant files (on lab servers for people to get raw data files if needed) | ||
| + | * For long reports, prepare a short summary section | ||
| + | * Branching and merging | ||
| + | * When multiple people are involved, branches may be created for each person to work on a different part | ||
| + | * It is important that everyone agree with the leader on when and how to merge the branches | ||
| + | * Example | ||
| + | * For project | ||
| + | * ABC then saved the file as " | ||
| + | * DEF creates " | ||
| + | * GHI creates " | ||
| + | * ABC, as the project leader, is responsible for setting the deadline, collecting the files, then merging them and creating " | ||
| + | * Figure/ | ||
| + | * Before finalizing the order, use names without fig/table number and with a keyword (e.g., " | ||
| + | * After finalizing the order, put all early versions into a separate folder (e.g., " | ||
| + | * If the order changed, put the previous | ||
| + | * It is very important to save the files in editable formats (e.g., .ai, not just .jpg) | ||
| - | | + | ===== Tracking changes ===== |
| - | * Discuss | + | |
| - | * Avoid emailing | + | * For Google Docs: All changes are automatically tracked. Manually download the major versions |
| - | * Pros and cons of setting up a file server vs. using a commercial service. | + | |
| + | ===== Metadata ===== | ||
| + | | ||
| + | * Examples | ||
| + | * For the raw data file of a gel image (20250509_1428.jpg), | ||
| + | * For a set of raw data files (e.g., Sanger sequencing results), or a file folder, provide | ||
| - | * File directory structure | ||
| - | * Make it easy to identify the parts that need to be included in backup | ||
| - | * Example for personal desktops/ | ||
| - | * Example for shared servers | ||
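
The naming and versioning rules in the current revision (no spaces or special characters, no "final" in the name, a two-digit version number as the last part of the file name) could be checked mechanically. The following is a minimal Python sketch, not part of the original guidelines. The pattern `<project>_<keyword>_<NN>.<ext>`, the example file names, and the helper names `check_name` and `next_version` are all assumptions made for illustration.

```python
import re

# Hypothetical convention (an assumption, not the page's exact format):
# underscore-separated parts, then a two-digit version, then the extension.
NAME_PATTERN = re.compile(
    r"^(?P<stem>[A-Za-z0-9-]+(?:_[A-Za-z0-9-]+)*)_(?P<version>\d{2})\.(?P<ext>[A-Za-z0-9]+)$"
)


def check_name(filename: str) -> list[str]:
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    # Anything other than letters, digits, dot, underscore, or hyphen
    # is treated as a space or special character.
    if re.search(r"[^A-Za-z0-9._-]", filename):
        problems.append("contains spaces or special characters")
    # Names such as "xxx_final.docx" are discouraged by the guidelines.
    if re.search(r"final", filename, re.IGNORECASE):
        problems.append('uses "final" instead of a version number')
    if not NAME_PATTERN.match(filename):
        problems.append("does not end with a two-digit version number")
    return problems


def next_version(filename: str) -> str:
    """Return the file name with its two-digit version increased by one."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a versioned file name: {filename!r}")
    version = int(match["version"]) + 1
    if version > 99:
        raise ValueError("two-digit version number exhausted")
    return f"{match['stem']}_{version:02d}.{match['ext']}"


if __name__ == "__main__":
    # Illustrative names only; "p01" is a made-up project id.
    for name in ["p01_manuscript_03.docx", "my paper final.docx", "manuscript.docx"]:
        print(name, "->", check_name(name) or "ok")
    print(next_version("p01_manuscript_03.docx"))
```

Optional initials fit this sketch as one more underscore-separated part before the version number. A group using a different layout would only need to adjust `NAME_PATTERN`.
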
tutorials/file_management.1692545810.txt.gz · Last modified: by chkuo