Realtime image registration/stack (17 frames/sec)
Posted 10 June 2014 - 05:20 PM
Currently the CCD camera does 15 fps maximum, and the unoptimised software is keeping up by taking 0.057 seconds (about 17 FPS) to perform FFT-based image registration and stacking. I expect 30-60 fps to be achievable, as the current image is LRGB rather than mono - each channel being processed separately.
Longer exposures open up DSOs; shorter, high-FPS exposures open up solar and planetary - including the addition of on-the-fly drizzle.
I have just got this working - it uses the GPU to perform parallel computation rather than the CPU, hence the speed. I hope the weekend will be just as clear outside; in the meantime, here is a YouTube video of the no-lens test. You can see the realtime processing, and on a couple of frames the registration reacts to the change in light caused by hand movement (gaps at the top and right of the frame):
I hope that this doesn't descend into VvsEA; however, I hope to show over the coming months how realtime EA can be used on solar and planetary viewing (with in-field views, I promise!). If that's acceptable!
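For anyone wondering what "FFT based image registration" boils down to, here is a rough sketch of one common approach, phase correlation, in plain Python. This is not the GPU code from the app: the naive DFT stands in for a real FFT, the function names are made up for illustration, and it only recovers whole-pixel translation (no rotation) on a tiny synthetic frame.

```python
import cmath
import random

def dft1(vec, inverse=False):
    """Naive 1D DFT, O(n^2); a slow stand-in for a GPU FFT."""
    n = len(vec)
    sign = 1 if inverse else -1
    out = [sum(v * cmath.exp(sign * 2j * cmath.pi * u * k / n)
               for k, v in enumerate(vec)) for u in range(n)]
    return [v / n for v in out] if inverse else out

def dft2(img, inverse=False):
    """2D DFT: transform every row, then every column."""
    rows = [dft1(row, inverse) for row in img]
    cols = [dft1(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def phase_correlate(ref, img):
    """Return (dy, dx): how far img is shifted relative to ref."""
    h, w = len(ref), len(ref[0])
    F, G = dft2(ref), dft2(img)
    cross = []
    for frow, grow in zip(F, G):
        row = []
        for f, g in zip(frow, grow):
            c = g * f.conjugate()
            # Keep only the phase; drop components with no signal.
            row.append(c / abs(c) if abs(c) > 1e-12 else 0j)
        cross.append(row)
    corr = dft2(cross, inverse=True)
    # The correlation peak sits at the shift (modulo the image size).
    _, py, px = max((corr[y][x].real, y, x)
                    for y in range(h) for x in range(w))
    dy = py - h if py > h // 2 else py
    dx = px - w if px > w // 2 else px
    return dy, dx

def shift_back(img, dy, dx):
    """Undo a (dy, dx) shift (with wrap-around) so img lines up with ref."""
    h, w = len(img), len(img[0])
    return [[img[(y + dy) % h][(x + dx) % w] for x in range(w)]
            for y in range(h)]

# Synthetic test: shift a random 8x8 frame by 2 rows down and 3 columns left.
rng = random.Random(0)
ref = [[rng.random() for _ in range(8)] for _ in range(8)]
frame = [[ref[(y - 2) % 8][(x + 3) % 8] for x in range(8)] for y in range(8)]

dy, dx = phase_correlate(ref, frame)
aligned = shift_back(frame, dy, dx)
# Stacking is then just adding the aligned frame onto the reference.
stack = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(ref, aligned)]
print(dy, dx)
```

On a GPU the same steps map onto a forward FFT, a per-pixel multiply and normalise, an inverse FFT and a peak search, which is why it parallelises so well.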
Posted 15 June 2014 - 06:03 AM
Adding a real camera is now giving me 10 FPS (15 FPS max with this CCD) - the slow point here is that the code isn't multi-threaded, so it can't download the next image whilst processing the last.
Here's the 10FPS pointing out the back of the garden (just a few seconds):
Here's the 30FPS of the pseudo test camera (a file loaded from HD and given as the image - no waiting on USB download!):
1. It looks like a 1920s film - the vignetting is the Hann window applied to prevent FFT registration against the image boundaries. The full image can then be rotated/stacked, so you'll not see this in the full version.
2. It's red - because it's using one channel (red) of the GPU's RGBA. Adding a small amount of code to the GPU stacking means this could be any colour you like - including white. Something to add later.
3. The flickering - the reference frame is renewed every 16 frames, which clears the stack. Hence the stack builds in brightness before being cleared.
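To show where the "old film" vignetting in point 1 comes from, here is a minimal sketch of a 2D Hann window in plain Python. The function names are just for illustration and this is CPU code, not the GPU version in the app.

```python
import math

def hann(n):
    """1D Hann window of length n: 0 at both ends, 1 in the middle."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def apply_hann(img):
    """Multiply an image by the separable 2D Hann window."""
    h, w = len(img), len(img[0])
    wy, wx = hann(h), hann(w)
    return [[img[y][x] * wy[y] * wx[x] for x in range(w)] for y in range(h)]

# A flat white 5x5 frame fades to black towards every edge.
flat = [[1.0] * 5 for _ in range(5)]
windowed = apply_hann(flat)
for row in windowed:
    print(" ".join(f"{v:.2f}" for v in row))
```

Because the window pulls the borders to zero, the hard image edges no longer show up as strong features in the FFT, so the registration locks onto the scene rather than the frame boundary.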
I still have some optimisations - adding a thread for I/O would be the more efficient way and would recover a considerable amount of FPS.. I think that will be next.
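The I/O thread idea is essentially a producer/consumer queue. A minimal sketch in Python, assuming a hypothetical `camera` callable that blocks until a frame has downloaded and a hypothetical `process` function that does the registration and stacking:

```python
import queue
import threading

def download_frames(camera, frames, count):
    """Producer: pull frames from the camera and queue them for processing."""
    for _ in range(count):
        frames.put(camera())   # blocks when the queue is full
    frames.put(None)           # sentinel: no more frames

def run_pipeline(camera, process, count, depth=2):
    """Consumer: process frames while the next one downloads in the background."""
    frames = queue.Queue(maxsize=depth)
    worker = threading.Thread(target=download_frames,
                              args=(camera, frames, count), daemon=True)
    worker.start()
    results = []
    while True:
        frame = frames.get()
        if frame is None:
            break
        results.append(process(frame))
    worker.join()
    return results

# Stand-ins for the real camera and the real registration/stacking step.
counter = iter(range(5))
results = run_pipeline(lambda: next(counter), lambda f: f * 2, count=5)
print(results)
```

The small queue depth keeps memory bounded: the download thread can only get a frame or two ahead of the processing.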
Posted 15 June 2014 - 12:01 PM
Posted 05 July 2014 - 01:20 PM
You can see the realtime alignment occurring - as I press against the scope the border of the image changes.
I still need to reoptimise - this will return speed and accuracy.
It's quite relaxing with the water in the background
Posted 05 July 2014 - 01:54 PM
So starting with that, it seems to me that a couple of things could be done. First, depending on the camera capabilities (FPS and download speed), we would want to take the quickest shots possible and then discard those that are not good enough to be used. Next, we take those that we like and line them up prior to stacking. Finally, they would be added to the current stack, discarding the oldest frames. There are several methods for doing this, and I'm not really sure which would be best.
One thing to note is the number of frames discarded will have a lot to do with the seeing conditions and of course the settings.
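As one possible way to do the "add to the current stack, discarding the oldest frames" step, here is a small sketch of a rolling stack in Python. The class name and interface are made up for illustration, and frames are assumed to be already graded and aligned.

```python
from collections import deque

class RollingStack:
    """Keeps a running sum of the most recent max_frames aligned frames."""

    def __init__(self, max_frames):
        self.max_frames = max_frames
        self.frames = deque()
        self.total = None

    def _accumulate(self, frame, sign):
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                self.total[y][x] += sign * value

    def add(self, frame):
        if self.total is None:
            self.total = [[0.0] * len(frame[0]) for _ in frame]
        self.frames.append(frame)
        self._accumulate(frame, +1)
        if len(self.frames) > self.max_frames:
            # Subtract the oldest frame instead of re-summing everything.
            self._accumulate(self.frames.popleft(), -1)

    def mean(self):
        n = len(self.frames)
        return [[v / n for v in row] for row in self.total]

stack = RollingStack(max_frames=2)
for value in (1.0, 3.0, 5.0):
    stack.add([[value]])
print(stack.mean())
```

Subtracting the outgoing frame keeps the cost per new frame constant, whatever the stack depth.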
Posted 05 July 2014 - 02:15 PM
1. If the rotation or translation delta is outside of a threshold. This indicates that the image is not being aligned correctly.
2. Sharpness of the image - the sharpness of slopes would be a good thing to grade. Then use a histogram of steepness: if the majority drops too low, you know the image isn't particularly sharp and it can be dropped.
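The two rejection tests above could be sketched roughly like this in Python. The thresholds, units (pixels and degrees) and function names are all assumptions for illustration, and the "histogram" is reduced to its simplest form: the fraction of gradients that fall in the "steep" bin.

```python
import math

def gradient_magnitudes(img):
    """Slope steepness at each pixel from simple forward differences."""
    h, w = len(img), len(img[0])
    mags = []
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]
            gy = img[y + 1][x] - img[y][x]
            mags.append(math.hypot(gx, gy))
    return mags

def sharpness_grade(img, steep=0.1):
    """Fraction of gradients steeper than `steep` (0.0 = soft, 1.0 = sharp)."""
    mags = gradient_magnitudes(img)
    return sum(m > steep for m in mags) / len(mags)

def accept_frame(img, shift, rotation,
                 max_shift=20.0, max_rotation=2.0, min_grade=0.2):
    """Reject frames that registered badly or are not sharp enough."""
    dy, dx = shift
    if math.hypot(dx, dy) > max_shift or abs(rotation) > max_rotation:
        return False   # test 1: alignment delta outside the threshold
    return sharpness_grade(img) >= min_grade   # test 2: sharpness

checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]
blurry = [[0.5] * 4 for _ in range(4)]
print(accept_frame(checker, (1, 1), 0.0))
print(accept_frame(checker, (100, 0), 0.0))
print(accept_frame(blurry, (0, 0), 0.0))
```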
Another alignment idea is to use an optic flow - detecting the parts of the image that have moved, if the parts of the image have moved beyond the threshold then that *part* of the image can be masked out (alpha=0.0).
The result is you keep the good data but throw away the bad. You don't have the complexity, so you maintain a higher frame rate.
The extension of the optic flow is a bit more involved - making a distortion vector space and then rebuilding the images according to the vector space. This is like RegiStax - although you'd not have the chance to pick the features manually but rather set the options for filtering.
One thing that can be done with GPUs - Apple's "Mavericks" allows multiple GPUs, as OpenCL does now.. this means you can perform some tasks on one and some on the other, with only small data (i.e. variables) being transferred between them.
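A sketch of the masking side of this in Python, assuming the optic flow magnitude per pixel has already been computed by some other step (that part is not shown, and the function names here are hypothetical). Each pixel carries its own weight, so masked regions simply don't contribute to the stack.

```python
def motion_mask(flow_magnitude, threshold):
    """alpha = 0.0 where the local motion exceeds the threshold, else 1.0."""
    return [[0.0 if m > threshold else 1.0 for m in row]
            for row in flow_magnitude]

def accumulate(total, weight, frame, alpha):
    """Add a frame into the running sums, weighted per pixel by alpha."""
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            total[y][x] += alpha[y][x] * value
            weight[y][x] += alpha[y][x]

def stacked_mean(total, weight):
    """Per-pixel average over however many frames contributed there."""
    return [[t / w if w > 0 else 0.0 for t, w in zip(trow, wrow)]
            for trow, wrow in zip(total, weight)]

total = [[0.0, 0.0]]
weight = [[0.0, 0.0]]
# Frame 2 has a badly distorted right-hand pixel (flow of 5.0 px).
frames = [([[10.0, 10.0]], [[0.1, 0.1]]),
          ([[12.0, 90.0]], [[0.2, 5.0]])]
for frame, flow in frames:
    accumulate(total, weight, frame, motion_mask(flow, threshold=1.0))
print(stacked_mean(total, weight))
```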
I want to add pre-processing, darks and flats, this can be processed in one go without any issues (not that FFT has a problem with noise).
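For reference, the usual dark/flat calibration is a per-pixel operation, which is why it can be done in one pass. A minimal Python sketch, assuming the flat has already had its own dark subtracted (the function name is illustrative):

```python
def calibrate(raw, dark, flat):
    """Subtract the dark frame, then divide by the flat normalised to mean 1."""
    h, w = len(raw), len(raw[0])
    flat_mean = sum(map(sum, flat)) / (h * w)
    return [[(raw[y][x] - dark[y][x]) * flat_mean / flat[y][x]
             if flat[y][x] > 0 else 0.0
             for x in range(w)] for y in range(h)]

# The right-hand pixel only receives half the light (vignetting).
raw = [[110.0, 60.0]]
dark = [[10.0, 10.0]]
flat = [[1.0, 0.5]]
print(calibrate(raw, dark, flat))
```

Each output pixel depends only on the same pixel in the three inputs, so on a GPU this is a single trivially parallel kernel.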
Going further is the analysis of a set of frames and the positions of stars within - as they move over time, the PSF can be used to reconstruct the image details. However this is likely to be seconds per frame..
edit: I forgot to say - I'll start rearchitecting the threading model of the app shortly. As you've indicated there's space for multiple threads to perform tasks whilst the GPU is busy aligning - this includes using the other smaller Intel GPU (or CPU even) to perform grading or other tasks.
The one GPU limitation that's a pain is that they're not built for multi-tasking (although newer nVidia GPUs are starting to be) - so you can't overlap uploading a new image to the GPU whilst it's processing data.
Posted 13 July 2014 - 12:30 PM
As discussed I've implemented an on-the-fly grading system. You can set both the parameters for grading and the minimum grade level. In short it returns a really good result.
Posted 13 July 2014 - 12:34 PM
I've also started work on a PSF deconvolution. I should also start work on the sub-pixel alignment.
Posted 13 July 2014 - 12:44 PM
Posted 13 July 2014 - 01:15 PM
This is the image before:
Posted 13 July 2014 - 01:17 PM
This is with my little 4" APO - hence it will never end up as detailed as an SCT
Posted 13 July 2014 - 01:42 PM
Posted 15 July 2014 - 03:05 PM
This thread seems to be the continuation of something else. I am not sure: what is the software you are working on? Does your software have a webpage or any other good way to familiarise oneself with it?
Another question. Does your software absolutely need GPU processing? If yes, which GPUs does it work on?
Posted 16 July 2014 - 04:02 PM
The pipeline doing this currently sits inside the driver test application I originally wrote for testing my ATIK OSX drivers - it continues on there as I have ATIK cameras (I don't work for them - the full-time job is in the mobile-enterprise industry).
My background is software (BSc Software Engineering with parallel and distributed systems). I've been playing with GPUs since 2005, and since then I've done stuff with image processing - specifically realtime. This intersects my interests of astro, image processing, and not spending time slaving over a mass of subs..
As I don't have much time to play around with AP.. I wanted something I can just plug and go.. hence writing this application as an extension to pull together the fast techniques that aren't focused on "pretty" but, rather, focused on _now_. Usually I have about 1-2 hours in a session.. and I can count the sessions on one hand since January..
I have a few things I want to build into the experimental pipeline - then I'll rehome it.. either by open sourcing it or making it a bit more portable. The code is portable, although slightly Apple orientated as it's a continuation of the test app.. adapted to experiment.
It's a considerable effort to write the UI for an app.. and I get an hour or so each day on the commute.. attempting to port the code base would probably be too much - hence considering the open source option, as there are lots of new OSX, Linux and Windows capture apps that have done the hard work of Qt based development and got the cameras working over multiple platforms.
I have an interest in getting this working for solar, planetary and DSO - regardless of AP for DSO.. simply because I can see more with the kit I have in the short period of time I have.
When will this be available.. well, I may release this in the current "example app" just so people believe it exists. Seriously though - I will probably continue developing the pipeline as there's plenty more interesting things to look at .. rather than reinventing the GUI wheel..
I'm currently debugging the PSF deconv. So you can either create a Bessel-based PSF integrated from 350nm to 850nm (currently this is using the pixel size and focal ratio too) or even use a portion of the image as a PSF (so if you take a bright star - that's a PSF!). I also want to add upscale/sub-pixel - i.e. drizzle based stacking using a 1:10 accuracy.
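To give an idea of what a Bessel-based PSF looks like, here is a sketch in plain Python using the textbook Airy pattern for an unobstructed circular aperture, I = (2*J1(x)/x)^2 with x = pi*r/(wavelength*focal ratio). It is not the app's code: it samples at pixel centres, weights all wavelengths from 350nm to 850nm equally, and the function names and the 11-step wavelength sampling are assumptions. The 5.4um pixels and f/32 are the figures mentioned in this thread.

```python
import math

def bessel_j1(x, samples=200):
    """J1(x) via Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt."""
    step = math.pi / samples
    total = sum(math.cos(t - x * math.sin(t))
                for t in ((i + 0.5) * step for i in range(samples)))
    return total * step / math.pi

def airy(r_um, wavelength_um, focal_ratio):
    """Airy intensity at radius r_um in the focal plane, peak = 1."""
    x = math.pi * r_um / (wavelength_um * focal_ratio)
    if x < 1e-9:
        return 1.0
    return (2.0 * bessel_j1(x) / x) ** 2

def airy_psf(size, pixel_um, focal_ratio, lo_nm=350, hi_nm=850, steps=11):
    """size x size PSF kernel, averaged over wavelength and normalised to 1."""
    c = size // 2
    wavelengths = [(lo_nm + (hi_nm - lo_nm) * i / (steps - 1)) / 1000.0
                   for i in range(steps)]
    psf = [[sum(airy(pixel_um * math.hypot(x - c, y - c), wl, focal_ratio)
                for wl in wavelengths)
            for x in range(size)] for y in range(size)]
    total = sum(map(sum, psf))
    return [[v / total for v in row] for row in psf]

psf = airy_psf(size=9, pixel_um=5.4, focal_ratio=32)
for row in psf:
    print(" ".join(f"{v:.4f}" for v in row))
```

Using a crop around a bright star as the PSF, as mentioned above, would replace `airy_psf` with a normalised copy of that crop.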
Here's my buggy output from the current deconv .. It's *almost there* .. umm experimentally
Edit: found one bug.. the image below was on the Titan 7.6um pixels, not 5.4um pixels, so the new image looks a little clearer..
Posted 17 July 2014 - 04:33 AM
Posted 17 July 2014 - 11:46 AM
Thank you for explaining this interesting project.
About three years ago I did some work in CUDA and the speedup gained through parallel processing was mind blowing. But those were big boards with 1024 GPU cores each, which was about the max at that time. I am sure that technology has advanced since then and now multiple GPUs are common in laptops.
Good luck with your project and keep us posted of your progress.
Posted 17 July 2014 - 12:03 PM
With the later GPUs this code will go faster.
I've now got the PSF based deconv working. Here's an example - two cropped PNGs incoming as the effect is very subtle:
Posted 17 July 2014 - 12:05 PM
Now using the same data - just with the sharpening switched on with a 650nm Bessel PSF tuned to the scope and the camera - look at the planet bands:
Posted 17 July 2014 - 02:07 PM
Posted 17 July 2014 - 02:42 PM
Any subject would be good to be honest - although the GPU is feeling it with the 17.1MB images.
I've located a bug - I had a patch over it initially but the deconv has meant mapping out the impact and correcting it.. in the end it will be faster internally.
Noisy images welcome - no need to flatten or use darks (I don't normally!).
Posted 20 July 2014 - 03:06 PM
* PSF deconv now working nicely
* linear stretch and basic histogram added
* sub-pixel alignment without second upscaled alignment
* drizzle upscale stacking (4x4 example attached, but this will go up to 10+x)
As Chris noted in another thread - I'm using "bi-linear interpolation", which is fast but doesn't give the same quality results as more sophisticated techniques. The benefit to me is that bi-linear is supported in hardware on the GPU, making it even faster. I can add a more sophisticated technique and allow switching between the two.
I'll leave you all alone for a bit. The attached is a cropped screenshot of the 4x4 to fit the 200Kb limit - click for full resolution:
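For anyone not familiar with it, bilinear interpolation just blends the four nearest pixels by distance. A small CPU sketch in Python (the GPU does the equivalent in its texture hardware; the function names here are illustrative, and the plain `upscale` is only a resample, not a full drizzle):

```python
def bilinear(img, y, x):
    """Sample img at fractional (y, x), clamping to the image edges."""
    h, w = len(img), len(img[0])
    y = min(max(y, 0.0), h - 1)
    x = min(max(x, 0.0), w - 1)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def upscale(img, factor):
    """Resample img onto a grid `factor` times finer in each direction."""
    h, w = len(img), len(img[0])
    return [[bilinear(img, Y / factor, X / factor)
             for X in range(w * factor)] for Y in range(h * factor)]

img = [[0.0, 10.0],
       [20.0, 30.0]]
print(bilinear(img, 0.5, 0.5))
for row in upscale(img, 2):
    print(row)
```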
Posted 22 July 2014 - 04:09 AM
Here's an 8x8 upscale drizzle using the current pipeline (PSF for the 3350mm f/32 on the little 4" APO that has 670mm native fl!) - this is a screenshot that's been cropped to fit the 200Kb limit to keep all the detail (helps if you stand back from the monitor on this one!):
Posted 26 July 2014 - 06:13 AM