Backup script for IMD files
I'm not sure what methods you are all using to back up the data from the Helios / CyTOF, but here is the method I have settled on. I hope it's useful, or that others can suggest better / faster methods.
I'm using 7-Zip and (optionally) some additional codecs. The codecs aren't strictly necessary, but BROTLI gives me about a 10% speed increase over LZMA2 from the standard 7-Zip install.
EDIT - Thanks to Samuel for helping to tidy the code and for removing the need to add 7-Zip to the environment variables!
EDIT 2 - Here's a GitHub link
Here's the batch file:
- Code:
@ECHO OFF
REM Set the path to 7z.exe
SET "ZIPPER=C:\Program Files\7-Zip\7z.exe"
REM Check that 7-Zip is installed at that path
IF NOT EXIST "%ZIPPER%" (
    Echo Error: 7-Zip was not found at "%ZIPPER%"!
    Goto END
)
REM Tune the compression options (BROTLI requires the additional codecs)
REM These 7z options were tested and selected for speed
Set ZIPOPTIONS=-mm=BROTLI -mx=2 -mmt24
REM -mm=BROTLI - use the BROTLI codec. Without the additional codecs, use LZMA2, which is slightly slower (~10%) but compresses slightly better (~4%).
REM -mx=2 - compression level (higher = better compression, but slower). Use -mx=0 if using LZMA2.
REM -mmt24 - use 24 threads; set this to the number of virtual cores on your machine
REM Root folder to backup goes here:
SET WORKDIR=E:\User_Data\
REM Check if workdir exists
IF NOT EXIST "%WORKDIR%" (
Echo Error: working directory does not exist; check "%WORKDIR%"!
Goto END
)
REM change to working directory
CD /D "%WORKDIR%"
REM Loop across files recursively
REM Zip all IMD files in the WORKDIR directory (and subdirs)
REM but only if they haven't already been zipped
REM %%f = each matching file (full path)
REM %%~dpnf = drive, path and filename (excluding extension)
FOR /R %%f in (*.imd) DO (
    Echo Processing %%f
    IF NOT EXIST "%%~dpnf.7z" (
        "%ZIPPER%" a %ZIPOPTIONS% "%%~dpnf.7z" "%%f"
    )
)
Echo Compression finished.
:END
Here's what it does:
1) Change to the directory I want to compress.
2) Find all the IMD files in that directory (and subdirectories).
3) See if they have already been compressed.
4) If they haven't, it will create new, compressed files.
5) Each IMD is compressed into its own identically-named 7z file (lossless compression).
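Since the 7z files are the backup, it may be worth testing each archive before the original IMD is moved or deleted. The following is a minimal sketch, not part of the script above: it assumes the same 7-Zip location and root folder, and uses 7-Zip's "t" (test) command on every archive it finds. Archives made with BROTLI can only be tested on a machine that has the additional codecs installed.

```batch
@ECHO OFF
REM Sketch only: test every 7z archive under WORKDIR with 7-Zip's "t" command.
REM Assumes the same 7-Zip location and root folder as the main script.
SET "ZIPPER=C:\Program Files\7-Zip\7z.exe"
SET "WORKDIR=E:\User_Data\"
REM Stop if the root folder cannot be entered
CD /D "%WORKDIR%" || GOTO :EOF
FOR /R %%f IN (*.7z) DO (
    REM 7z returns a non-zero exit code if the archive fails the test
    "%ZIPPER%" t "%%f" >NUL
    IF ERRORLEVEL 1 Echo Integrity check FAILED: %%f
)
Echo Verification finished.
```

Only failures are printed, so a run that ends with just "Verification finished." means every archive passed.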
After much experimentation, this was as fast as I could get it. For example, a 55 GB IMD file compressed to about 900 MB (roughly 60:1) in ~40 seconds.
-mm=BROTLI - this will use the BROTLI codec
-mx=2 - this is the compression quality (higher = better compression, but slower)
-mmt24 - this will use 24 threads. I experimented with values from 1 to 256 and found that, on our Helios machine (24 virtual cores), this was the best value for speed.
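If the script is shared between machines, the thread count could be taken from Windows rather than hard-coded. This is a small sketch under that assumption: NUMBER_OF_PROCESSORS is a standard Windows environment variable holding the number of logical processors, but whether that value is also the fastest setting on a given machine would still need testing.

```batch
REM Sketch only: take the thread count from Windows instead of hard-coding 24.
REM NUMBER_OF_PROCESSORS is set by Windows to the number of logical processors.
Set ZIPOPTIONS=-mm=BROTLI -mx=2 -mmt%NUMBER_OF_PROCESSORS%
Echo Using options: %ZIPOPTIONS%
```

On a machine with 24 virtual cores this produces the same options as the line in the script.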
I then back up the 7z files to a data server (using more code in the batch file to execute an rsync backup over SSH).
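For anyone wanting a starting point for that step, here is a rough sketch of what an rsync call over SSH might look like. It is not the code from my batch file. It assumes rsync and ssh are on the PATH (for example from Cygwin, which is why the source path is written in /cygdrive form), and BACKUP_SRC, BACKUP_DEST, the user name and the host are hypothetical placeholders to replace with your own.

```batch
REM Sketch only: copy the 7z archives (not the raw IMD files) to a server with rsync over SSH.
REM Assumes rsync and ssh are on the PATH (for example from Cygwin).
REM BACKUP_SRC, BACKUP_DEST, the user name and the host are hypothetical placeholders.
SET "BACKUP_SRC=/cygdrive/e/User_Data/"
SET "BACKUP_DEST=backupuser@dataserver.example.org:/backups/helios/"
REM Include every directory and every .7z file, exclude everything else,
REM and skip directories that end up empty.
rsync -av --prune-empty-dirs --include="*/" --include="*.7z" --exclude="*" -e ssh "%BACKUP_SRC%" "%BACKUP_DEST%"
IF ERRORLEVEL 1 Echo Error: rsync reported a problem; check the output above.
```

The include/exclude filters keep the transfer limited to the compressed files, so the large IMD files never leave the instrument PC.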
Hope this helps someone. I'd welcome improvements, or suggestions of other methods you use!