====== File Management ======

  * Guidelines for file management that Chih-Horng Kuo (chk@gate.sinica.edu.tw) developed for our group members. Suggestions are welcome.
  * Related information:
    * [[tutorials:scientific_presentation|Scientific Presentations]]
    * [[tutorials:scientific_writing|Scientific Writing]]

===== Backup =====

  * Critical!
  * You will experience file loss. Not __IF__, but __WHEN__.
  * Establish an automatic backup plan; manual backup plans are not practical
    * Mac: Time Machine
    * Linux: rsync scripts

===== Working with collaborators =====

  * Discuss and set up a workflow
  * Avoid emailing files as attachments; use a file server or file sharing service (e.g., Google Drive) instead.
  * Pros and cons of setting up a file server vs. using a commercial service.

===== File directory structure =====

  * Make it easy to identify the parts that need to be included in backup
  * Example for personal desktops/laptops
  * Example for shared servers

===== Naming and version control =====

  * File names
    * Do: short and informative. Include information such as project id, keyword (e.g., manuscript, report, keyword of figure/table), and version or date
    * Avoid: long names that are difficult to read (e.g., full title of the manuscript)
    * Avoid: short names that are not informative (e.g., manuscript.docx, figure1.ai)
    * Avoid: spaces or special characters
    * Avoid: "xxx_final.docx", "xxx_final_revised.docx", "xxx_final_revised_typofixed.docx", etc.
  * Versioning
    * The version number should be the last part of the file name; two digits should be sufficient
    * Optional: Add initials if multiple people are involved in the project
    * When in doubt, save as a new version
    * Recommended: In early versions of manuscript drafts, add a short section at the beginning of the document to explain the major changes made
    * Benefits
      * Write as much as possible without worrying whether those parts will be kept in later versions
      * Throw away as much as needed to make a good story; you can always go back to previous versions to retrieve the deleted materials
  * Project progress report
    * At the beginning of the document, include:
      * Name (__who__ prepared this report?)
      * Date (__when__ was it prepared?)
      * Project id (__what__ is it about?)
      * Directory of relevant files (on lab servers, so that people can get the raw data files if needed)
    * For long reports, prepare a short summary section
  * Branching and merging
    * When multiple people are involved, branches may be created for each person to work on a different part
    * It is important that everyone agree with the leader on when and how to merge the branches
    * Example
      * For project "agro38", start the main manuscript file as "agro38_ms_v01.docx"
      * ABC then saves the file as "agro38_ms_v02_ABC.docx" to work on Introduction
      * DEF creates "agro38_ms_v02_DEF.docx" to work on Materials and Methods
      * GHI creates "agro38_ms_v02_GHI.docx" to work on Figure Legend.
      * ABC, being the project leader, is responsible for setting the deadline, collecting the files, and then merging them to create "agro38_ms_v03.docx" as the starting point for the next iteration.
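The naming convention in the example above can be checked with a short script. The following is only a minimal sketch in Python: the pattern ''NAME_PATTERN'' and the helpers ''parse_name'' and ''merged_name'' are hypothetical names introduced for illustration, and the pattern assumes names of the form "project_keyword_vNN[_INITIALS].ext" as in the "agro38" example.

```python
import re
from typing import List, NamedTuple, Optional

# Hypothetical pattern (an assumption, not a group standard):
# project id, keyword, two-digit version, optional initials, extension.
NAME_PATTERN = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)_(?P<keyword>[A-Za-z0-9]+)"
    r"_v(?P<version>\d{2})(?:_(?P<initials>[A-Z]+))?\.(?P<ext>[A-Za-z0-9]+)$"
)


class VersionedName(NamedTuple):
    project: str
    keyword: str
    version: int
    initials: Optional[str]
    ext: str


def parse_name(filename: str) -> Optional[VersionedName]:
    """Split a file name into its parts; return None if it breaks the convention."""
    m = NAME_PATTERN.match(filename)
    if m is None:
        return None
    return VersionedName(
        m["project"], m["keyword"], int(m["version"]), m["initials"], m["ext"]
    )


def merged_name(branches: List[str]) -> str:
    """Name of the merged file that starts the next iteration.

    All branch files must share project id, keyword, version, and extension.
    """
    parsed = [parse_name(b) for b in branches]
    if not parsed or any(p is None for p in parsed):
        raise ValueError("every branch file must follow the naming convention")
    first = parsed[0]
    key = (first.project, first.keyword, first.version, first.ext)
    if any((p.project, p.keyword, p.version, p.ext) != key for p in parsed):
        raise ValueError("branch files do not belong to the same version")
    # Drop the initials and increase the version number by one.
    return f"{first.project}_{first.keyword}_v{first.version + 1:02d}.{first.ext}"
```

For instance, passing the three "v02" branch names from the example to ''merged_name'' yields "agro38_ms_v03.docx", while ''parse_name("manuscript_final.docx")'' returns ''None'' because the name carries no version number.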
  * Figure/table files
    * Before finalizing the order, use names without a figure/table number and with a keyword (e.g., "fig_phylogeny_v02.ai", "table_accession_v05.xlsx")
    * After finalizing the order, put all early versions into a separate folder (e.g., "figure_stage1"), then add figure/table numbers to the file names (e.g., "fig1_phylogeny_v15.ai")
    * If the order changes, put the previous versions into another separate folder (e.g., "figure_stage2"), then update the figure/table numbers (e.g., "fig3_phylogeny_v16.ai")

===== Tracking changes =====

  * For Word files: it is better to use the built-in "Track Changes" function; the "Compare Documents" function can also be used later.
  * For Google Docs: all changes are automatically tracked. Manually name and download the major versions.
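The figure/table renaming step described under "Naming and version control" (adding or updating the number once the order is settled) can be sketched in Python. This is only an illustration under stated assumptions: ''PREFIX_PATTERN'' and ''set_number'' are hypothetical names, the names are assumed to start with "fig" or "table", and moving earlier versions into a folder such as "figure_stage1" is left to the user.

```python
import re

# Hypothetical pattern (an assumption): "fig" or "table", an optional
# number, an underscore, then the rest of the name (keyword, version, extension).
PREFIX_PATTERN = re.compile(r"^(?P<kind>fig|table)(?P<number>\d*)_(?P<rest>.+)$")


def set_number(filename: str, number: int) -> str:
    """Return the file name with its figure/table number added or replaced."""
    m = PREFIX_PATTERN.match(filename)
    if m is None:
        raise ValueError(f"not a figure/table file name: {filename!r}")
    return f"{m['kind']}{number}_{m['rest']}"
```

For instance, ''set_number("fig_phylogeny_v15.ai", 1)'' gives "fig1_phylogeny_v15.ai", and ''set_number("fig1_phylogeny_v16.ai", 3)'' gives "fig3_phylogeny_v16.ai". The function only computes the new name; the actual rename is kept as a separate, deliberate step.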