

How do you manage your data (archive, etc.)?


#1 ryanha


    Viking 1

  • topic starter
  • Posts: 920
  • Joined: 05 Aug 2020

Posted 20 September 2021 - 10:08 AM

For those of you that image a lot, how do you manage all your data?


I am fortunate to have clear skies close to 200 days a year and a semi-permanent setup, so I am able to image a lot. I have 1TB of storage on the NUC at the scope, and I generally copy files off manually in the morning (drag and drop in File Explorer).  Recently I started imaging unbinned with my ASI294mm, and the files are pretty big; with broadband imaging I shoot 60s subs, so this generates a good bit of data (~35 GB/night).


In terms of backup, there is too much data to send it all to cloud storage (especially when you count all the pre-processed interim files), but I want to back up more than just my final images. So I am starting to think about how to organize my file system to make backups easier.



Here is what I am thinking:


Backup Strategy


Tier 1: Master stacks and processed finals (back up to cloud)
Master stacks and the processed final (plus intermediate processing steps) get backed up to the cloud.


Tier 2: Source data: subs and flats (back up to removable HD and keep a copy on my PC)
Source data should be saved, but maybe just to local backup (archived).


Tier 3: Calibration and preprocessed subs (back up to removable HD, or discard?)
I guess I'll hold on to these too, at least until the volume becomes a problem.
I want to keep them separate from everything else, though.
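The three tiers could be encoded as a small lookup so a backup script knows where each top-level folder goes. A sketch follows; the folder-to-tier mapping is my reading of the tiers above, and the destination names are placeholders, not real backup targets:

```python
# Tier table: which top-level folders go to which backup destinations.
# Destinations ("cloud", "removable_hd", "pc") are illustrative labels.
TIERS = {
    "tier1": {"folders": ["PROCESSED"], "dest": ["cloud"]},
    "tier2": {"folders": ["LIGHTS", "FLATS", "DARKS"],
              "dest": ["removable_hd", "pc"]},
    "tier3": {"folders": ["PREPROCESSED", "FLATMASTERS", "DARKMASTERS"],
              "dest": ["removable_hd"]},
}

def backup_destinations(path: str) -> list[str]:
    """Return the backup destinations for a file, based on its top-level folder."""
    top = path.replace("\\", "/").lstrip("/").split("/")[0].upper()
    for tier in TIERS.values():
        if top in tier["folders"]:
            return tier["dest"]
    return []  # unknown folders get no backup by default
```

A backup script can then walk the tree once and route each file to the right place.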



Calibration Data Organization
I want the flat masters and dark masters organized somewhat.
I want it to be clear which image train, dates, gain, etc. each one belongs to.
Keep the flat/dark subs separate from the masters.



Folder Structure (draft?)
So here is what I am thinking:


  • LIGHTS (this is just the raw LIGHT subs)
    • <TARGETx>
  • DARKS (this is just the raw DARK subs)
    • ASI294mm
    • Bin1
      • Gain 121
      • Gain 300
    • Bin2
  • FLATS (this is just the raw FLAT subs)
    • WO_unknown
    • WO925
    • WO666
  • PREPROCESSED (this is calibrated/registered, etc.)
    • <TARGETx>
  • PROCESSED (this is the master stacks and post-processing)
    • <TARGETx>
  • FLATMASTERS (this is the stacked master flats, by image train, dated)
    • WO_unknown
    • WO925
    • WO666
  • DARKMASTERS (stacked master darks by camera/bin/gain)
    • ASI294mm
      • Bin1
        • Gain 121
        • Gain 300
      • Bin2



Interested in anyone's thoughts or input into this!




#2 JTank70


    Vostok 1

  • Posts: 146
  • Joined: 20 Apr 2015
  • Loc: Maine

Posted 20 September 2021 - 10:15 AM

I was headed here this morning to research the same question.  Thanks. 

  • ryanha likes this

#3 rgsalinger



  • Moderators
  • Posts: 9,487
  • Joined: 19 Feb 2007
  • Loc: Carlsbad Ca

Posted 20 September 2021 - 11:17 AM

I don't see much point in keeping the data that comes from intermediate steps. I only permanently store the data that I actually acquired, the resulting master images, and the final finished product. Given the way I image, it takes me no more than 30 minutes (using WBPP in PixInsight) to rebuild a new set of masters from the original data.


I use Google Drive and Google Drive for desktop. That way, the data I'm capturing is stored within minutes on my local computer - the one I use for processing. So by the time I get up and have coffee, my night's data is available for processing. Of course, that only works for long-exposure (mine are 3-5 minutes) DSO imaging.


I have 4 OTA's and 4 cameras, so my pattern may differ quite a bit from what other folks use. What I've settled on, for the highest level, is to classify the images by camera and then within camera by temperature. Then I have sub-directories for lights, targets, and dark/bias frames. 


I have subdirectories within the lights by target, and within the flats and dark/bias by date. I use naming conventions within the subdirectories for any further differentiation, like date and/or gain.


I decided a while back that I'd rather take longer exposures for NB imaging than use higher gain. I did that to simplify my need for multiple dark libraries, and because the imaging software that I use doesn't properly record the gain and offset and doesn't let me run fully automated while changing gain/offset for different targets.


I just use a 4TB USB drive to back up my really old images. Sooner or later, I'll buy another one and make a backup copy. Newegg has one for under 100 dollars. I don't really go back and reprocess things from more than a couple of years ago so I'm fine with that. Again, you may want to pay up and get a big cloud drive that gives you automatic backup. 



#4 vehnae


    Vostok 1

  • Posts: 185
  • Joined: 17 May 2013
  • Loc: Finland

Posted 20 September 2021 - 11:19 AM

I copy the raw data from my imaging computer directly to my NAS that has mirrored disks and cloud backup. The directory hierarchy I use is /<season>/<target>/<date>/.


Each date directory has both the light frames and the complete calibration masters (darks and flats) required to calibrate them. The master darks and flats are produced on the imaging computer and automatically distributed as needed (according to exposure time, gain, filters, polar angle) to accompany every light frame shot that night. Yes, I end up with quite a few copies of the same master bias/dark/flat frames lying around, but it's easier than trying to match the correct calibration frames later on. When I start processing, it's easy to just throw a whole directory at the WBPP script and let it run, one night at a time.


Additionally, for each target I have a registration reference image copied as 'framing.fit'. All the intermediate files (calibrated, registered, etc) are only on the processing computer and will be destroyed after I'm done with the image. At the end of the processing the stacked images and other .xisf/.psd/etc files (basically the 'master' directory) are copied to the NAS.


For mosaics, I consider each panel to be a separate target (e.g. "M42-1", "M42-2") and then establish a separate "M42" directory for storing processing artifacts.


So in the end the directory looks something like this:


  • 2020-2021/
    • M42/
      • 2020-10-14/
        • master-dark-25x300s-gain0-10C.fit
        • flat-master-L-PA001.fit
        • M42_2020-10-14_Light_PA1.4_L_300sec_1x1_0001.. etc
      • master/
        • L.xisf /* etc */
        • M42-final.psd
      • framing.fit


  ++ Jari

Edited by vehnae, 20 September 2021 - 11:22 AM.

  • lambermo likes this

#5 whwang



  • Posts: 3,725
  • Joined: 20 Mar 2013

Posted 20 September 2021 - 11:54 AM

I keep all raw files, calibrated stacks, and final processed results permanently.  I keep intermediate products for a year or two.


I store everything on a NAS, on an external backup drive for the NAS, and in cloud storage.  On the computer I use for processing, I keep only the files I need, and back them up to an external drive constantly.


I follow professional observatories' file-naming convention: don't name the file, just give it a serial number.  The FITS headers and nightly observing logs are used to keep track of which file is which.  File names are just long serial numbers.  Using file names to keep track of things is messy, in my opinion.

#6 Rasfahan


    Viking 1

  • Posts: 920
  • Joined: 12 May 2020
  • Loc: Hessen, Germany

Posted 20 September 2021 - 01:28 PM

I use Voyager to automate taking dusk and dawn flats, so I have fresh flats for all imaging nights. This works better with the open optics of the RCs. The imaging NUC has 2TB of storage. I transfer files from the NUC to an SSD-based NAS in the morning (this is not automatic yet; I should probably automate it). I keep everything fairly recent (6 months) on the SSD-based NAS for easy access. This backs up to a disk-based NAS and to cloud storage. After that time, raw image data and raw calibration frames are moved over to another disk-based NAS (which also backs up to the first disk-based NAS and to cloud storage). Masters and finished images stay on the SSD NAS. Intermediate calibration results are discarded.


My files are named. I don't have 200 clear nights a year, so there's little danger of things getting too complicated with many targets during one night. I also find that having the target name and some metadata in the filename makes it much easier to write quick scripts to reorganise things or fix problems (especially since WBPP can now read metadata from the filename - nifty if you want to automatically match nightly flats to the corresponding lights for a multi-session calibration, or if you reprocess old files and want to make sure the correct old darks are used).
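Pulling metadata back out of such filenames is a one-regex job, in the spirit of WBPP's filename-pattern matching. The TARGET_DATE_FILTER_EXPsec scheme below is an invented example, not necessarily anyone's actual naming:

```python
# Extract target/date/filter/exposure from a descriptive filename.
# The naming scheme (TARGET_YYYY-MM-DD_FILTER_NNNsec) is a made-up example.
import re

PATTERN = re.compile(
    r"(?P<target>[^_]+)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_"
    r"(?P<filter>[A-Za-z]+)_"
    r"(?P<exp>\d+)sec"
)

def parse_sub(filename: str) -> dict:
    """Return the metadata fields embedded in a sub's filename, or {}."""
    m = PATTERN.search(filename)
    return m.groupdict() if m else {}
```

With the fields parsed, a short script can re-bin files by date or filter, or pair each night's lights with that night's flats.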

