FITSFlow : Automatic Image Flow Estimation for FITS Files
İSky Coyote, 2 Feb 2007


FITSFlow is a Macintosh OS X program that estimates the flow between 2 FITS images. This flow is calculated at a regularly spaced grid of points, and uses an iterated method to produce a vector field of flow direction and magnitude at each point in the grid. The general procedure for estimating the flow is similar to those discussed in Evans (1) and in Wu (2) listed in the references below. FITSFlow can estimate the flow for a number of different 'flow element' geometries, which consist of user-defined rectangular subregions in each of the two images. Vectors can be estimated to the nearest pixel, or to subpixel resolutions using an iterative image interpolation scheme. FITSFlow can be run manually, interactively, from the GUI, or automatically and repeatedly by the use of batch command files. FITSFlow can save the vector field in both text and image files after each run. FITSFlow requires a G4, G5, or Intel Macintosh computer running OS X 10.3.9 (PPC only), 10.4.8, or greater. FITSFlow is free for all non-commercial users.

To download FITSFlow, click the following links:

PPC version, or download from http://www.skycoyote.com/sky/FITSFlow/FITSFlowPPC.31Mar07.tar.gz.
Intel version, or download from http://www.skycoyote.com/sky/FITSFlow/FITSFlow86.31Mar07.tar.gz.
Example images, or download from http://www.skycoyote.com/sky/FITSFlow/FITSFlowImages.tar.gz.

You can see examples of using FITSFlow on synthetic test and distorted images here.


NEW: FITSMap/FITSFlow update 28 Mar 07: Mercator projection mapping and flow compensation.

(Click for larger image)


Reading FITS Images

FITSFlow reads two FITS files of the same size. These images show the state of a 2d system (e.g. clouds on Venus) at two different times. The vector field produced represents the absolute flow in pixels from image 1 to image 2 at a grid of evenly spaced points. Images submitted to FITSFlow should be >= 0 at every pixel, and should be preprocessed and aligned. Only a single monochrome 2d image in the first HDU of each FITS file is supported. All values are converted to double precision on input. Images can be masked (= 0) in regions where the flow should not be estimated.

To read in 2 images, launch FITSFlow and type the image paths into the fields 'File 1' and 'File 2'. Click the mouse in the first path and press 'enter' to load the first file, then click the mouse in the second path and press 'enter' to load the second file. Alternately, you can click on the file buttons to use a dialog to browse for and load the image files. The first image will be displayed in the plot window. You can reload/overload images at any time.

The two images used in this example are of Venus taken by Eliot Young and Mark Bullock on 7/13/04 with the NASA Infrared Telescope Facility on Mauna Kea, Hawaii. Each image is a composite of several raw images which have been preprocessed, aligned, stacked, averaged, sharpened, enlarged, and masked by me using custom software. They are about 1h13m apart. The steps required to create each of these composite images are shown here.

You can use the 'Image Scale' buttons and fields to adjust the brightness of the image displayed in the plot window. To change the maximum and minimum brightness settings, type a new value into the field and press 'enter'. These settings only affect the image if the scale mode is set to 'manual'. Changes to the image scale have an immediate effect.

FITSFlow Settings

The settings which contribute to the creation and display of a vector flow field are:

Grid: The X and Y grid spacing in pixels.

Domain: The size, in pixels, of the rectangular regions in the first
    image that are used to look for matches in the second image.  Each
    domain is centered on a grid point.
       
Range: The size, in pixels, of the rectangular regions in the second
    image that are searched for occurrences of the domains/templates
    from the first image.
      
Range offset: The distance in pixels from each grid point to the
    center of the domain.

NOTE: After changing the grid, domain, or range settings, press the
    'Refresh' button to redraw the flow element geometry in the small
    figure at right.  Domains and ranges need not be concentric.  Here
    is an example of a range offset from its domain and grid point:

    By using rectangular domains and ranges, and a range offset, you 
    can control where FITSFlow searches for matches between images.

Plot flow only: Plot only the estimated flow vector at each grid
    point.

Plot everything: In addition, plot the domain and range rectangles, and
    all candidate flow vectors (local maxima of cross correlation
    surface).

Labels: Show/hide text information in the plot.
     
Vector length: Multiply the length of each flow vector by this value
    in the plot.  This value can be a real number >, =, or < 1.
               
Vector offset: Add these values to every vector before plotting. 
    These can be used to remove an average vector value from the plot.
               
Vector scale and vector max: Sets the maximum value of vector
    magnitude in the color bar.

Clip to max: Clips all vectors longer than the manual maximum
    magnitude.
              
NOTE: After changing any of the plot/vector settings, press the
    'Refresh' button to redraw the plot.

Neighborhood: The radius, in pixels, around each grid point which is
    used to select additional flow elements to iterate the flow by the
    procedure described below.  In order to use the iterated method,
    you must ensure that the neighborhood radius is at least as large,
    or larger, than the grid spacing.
              
Sigma: The difference, in pixels, between a candidate flow and another
    flow in its surrounding neighborhood, for which the vector
    correlation coefficient weight falls to 1/e.  This setting
    determines how fast the iteration changes the flow field.  A
    larger value of sigma will change the flow a smaller amount each
    iteration.
       
Resolution: The resolution, in pixels, of the resulting vectors.  Note
    that performing a flow estimation to a resolution of less than
    one pixel will require considerably more time and memory than
    integer resolution, depending on your other settings.  In general,
    it is not suggested that you use a resolution of less than 1/8
    pixel.
      
Postprocess: After flow calculation, perform statistics on the
    neighborhood of each grid point and replace flows which are
    outside of the given threshold (in standard deviations) with the
    mean for the neighborhood.  Note that this can change the result
    to a vector which is not one of the actual flow candidates.
            
Thresh: Threshold in standard deviations for vector substitution.

FITSFlow Controls

The control butons are:

Reset: Stop computations, discard all calculations, and redraw the
    image.

Run: Reset the calculations and perform an initial estimation of the
    flow using the method described below.  After the initial
    estimation, perform additional iterations on the flow if the
    'iterations' field is > 0.  Perform post processing on the flow if
    the 'postprocess' button is checked.
    
    NOTE: The 'run' button always performs a reset before starting 
    calculations.
    
Step: After a flow has been estimated, perform additional iterations
    and display the results, if the 'iterations' field is > 0. 
    Perform post processing on the flow if the 'postprocess' button is
    checked.
     
Stop: Stop computations, but do not discard calculations or redraw the
    plot.  This button is intended primarily as a way to exit the
    thread performing the calculations, if they appear to be taking
    too long.
     
Refresh: Redraw the plot using the new values of the plot settings
    above.

NOTE: There is currently a thread synchronization problem which may not 
    cause the plot window to be redrawn at the end of a run.  It is 
    probably best to manually refresh the plot after each run if there 
    is any doubt about the display.

The 'neighborhood', 'sigma', 'resolution', 'postprocess', and 'thresh' settings only take effect when the 'Run' or 'Step' button is clicked. The image file and scale settings take effect immediately. All other settings take effect when the 'Refresh' button is clicked.

You can simultaneously create as many FITSFlow 'documents' as you like. Each document has an independent set of parameters and plot windows, flow calculations and vectors. Choose the 'File->New' menu item to create a new document.

Calculating a Flow

To perform an initial, basic, estimate of the flow between the images loaded above:

1. Set the grid size to 16, domain to 32, and range to 96 pixels.
2. Set the neighborhood to 32.
3. Uncheck the 'postprocess' button.
4. Set the 'Run iterations' to 0.
5. Resize the plot window to show the entire image.
6. Click the 'Run' button.

To perform an initial flow estimate, FITSFlow uses the domain region at each grid point in the first image to look for matches within the range region around each grid point in the second image. Matches are performed using the parametric correlation coefficient:

where f is the domain subimage at grid point (x, y), and g is each range candidate subimage at offset (u, v). All points {u, v} in the range which are local non-negative maxima in the correlation surface produce candidate flow vectors at offsets {u, v}. To see some of the domains, ranges, and candidate vectors for these two images:

1. Set the grid size to 100.
2. Set the 'Plot' buttons to 'Everything'.
2. Set the 'Vector length' to 2.
3. Click the 'Run' button.

The two gray squares at each grid point show the domains and ranges of the flow elements. All vectors within each range are candidates for the flow at that point, in that they are peaks in the correlation surface. The single vector chosen as the flow estimate at each point is the candidate with the highest correlation value. To go back to the single-flow calculation and plot, reverse the changes you just made:

1. Set the grid size back to 16.
2. Set the 'Plot' button to 'Flow only'.
2. Set the 'Vector length' to 1.
3. Click the 'Run' button.

Iterating a Flow

Note that many of the flow vectors appear to be in the wrong directions, and that there is generally more variation to the overall flow than would be expected by viewing these two images in rapid succession in a movie. This is possibly because the initial flow estimation picked the wrong candidate vectors at some grid points, simply because they had higher raw correlation values than other candidates. This can be corrected for by iterating the flow and recomputing the vector candiate probabilities based on the candidates in the surrounding neighborhoods. To do this, click on the 'step' button:

You can continue iterating the flow by clicking the 'step' button additional times. Here are flows for the 2nd and 5th iterations:

Note that with additional iterations, the flow becomes smoother and there is less random variation. After only a few iterations, the flow will stabilize and change very little with further iterations. Also, in this case, the flow magnitude has decreased in both its maximum and average magnitudes, with most vectors in the range of about 6 pixels, which is confirmed by a visual inspection of the two images.

Also note that the iteration calculation does not change any of the flow vectors, it is only selecting different candidate vectors from the correlation surface peaks which are more compatible with the candidate vectors in other flows around a given grid point (the flow 'neighborhood'). All candidates are local maxima in the correlation surface, but will vary in their absolute correlation values. Because of noise and distortion in the two images, the raw correlation values may not accurately represent the actual flow between the images, whereas selecting different peaks in the correlation surface produces a more realistic flow estimate. In the plot window, the 'nchanges' label shows the number of flow vectors which were changed to a different candidate by the current iteration. Generally, this number will decrease toward and reach zero with additional iterations, although asymptotically there may be a few vectors which are changed at every iteration (e.g. slight limit cycles), especially if more than one candidate vector has the same correlation value. In this example, the number of changes dropped from 606 at the first iteration to 59 at the fifth iteration.

Iteration Algorithm

Here is the actual iteration algorithm:

For each flow element at (x, y)
-------------------------------

0. Compute parametric correlation surface, peaks, and candidate vectors as described above.
   Initial vector values are just raw PCC values from 0 to 1 (negative values are ignored).
   
1. Initiallize candidate probabilities to

     pr(i) = pcc(i) / sum(pcc(i)), for all vectors i
   
2. Update candidate probabilities by

     pr(i) = pr(i) * vcc(i) / sum(pr(i) * vcc(i)), for all vectors i
   
   where the 'vector correlation coefficient' is
   
     vcc(i) = sum(sum(pr(k) * exp(-|v(i) - v(k)|^2 / sigma^2) * (grid / dist(j))))
     
              for all vectors i,
              for all neighbors j (do not use self),
              for all vectors k in j,
              sigma is a difference parameter (in pixels),
              grid is gridsize (hypotenuse in pixels),
              dist(j) is distance from (x, y) to neighbor j (in pixels)
              
3. Sort {pr(i)} and select vector with highest value as flow.

4. Repeat 1-3 for each additional iteration.

The key to this process is vcc(i), the 'vector correlation coefficient', which measures the similarity between two candidate vectors, one in the current flow element, and one in a neighboring flow element. This coefficient is based on two weights: the Cartesian difference between the two vectors, and the distance between the two flow elements. If v(i) = v(k), then the exponential term is = 1, otherwise it falls to zero as the difference between the vectors increases. Similarly, for flow elements which are one grid unit apart, grid / dist(j) = 1, and this falls to zero as the flows are further apart. These two weights are multiplied by each vector's probability, and the results are summed and normalized. Note that other candidates in the current flow are not used in this calculation, since these are competitors to the current candidate.

The two weight terms have the following functional form (f1 and f3 below):

f2 is a possible alternate form, but it falls off faster with difference between the two vectors. In both cases, the exponential term falls to 1/e when the magnitude of the difference between the two vectors is equal to sigma. Using a smaller value of sigma penalizes vector differences more. The neighbor distance term simply weights each neighbor inversely as its distance from the flow, so that all neighbors at every constant distance contribute equally to the calculation.

Varying Neighborhood and Sigma

The following image shows the effect of varying the neighborhood size from 10 to 50 for a gridsize of 10. This is a single iteration of an initial flow. The larger neighborhood has the effect of smoothing the flow somewhat like a spatial lowpass filter, and results in more vector candidates being changed at each iteration. The larger neighborhood iteration, however, took somewhat longer to calculate.

The next image shows the effect of varying sigma from 25 to 75 for a single iteration of the same initial flow. The smaller value of sigma causes the flow to change (dampen) faster, similar to applying a lowpass filter in time, with the result that more candidate vectors were changed. Both values of sigma took the same time to compute. In general however, it is expected that larger values of sigma will produce better convergence of the flow (at the expense of overall time), with less chance of cycling, although this claim is yet to be proven.

Sub-pixel Flow Estimation

Although all initial and iterated flow candidate vectors have integer dx and dy components, you can choose to estimate the final flow field to fractional pixel values by setting the 'Resolution' menu to a value less than 1 pixel. In this case, after both the initial estimate and all iterations have been performed, the final flow candidates will be enhanced to sub-pixel values by using an iterative image interpolation scheme. You can see how this process works here. Using sub-pixel estimation does not affect the candidate vectors, only the final flow choice. Sub-pixel estimation requires significantly more memory and time to compute, so that in general it is not recommended to use less than 1/8 pixel accuracy for most purposes. Sub-pixel estimation does not greatly affect the visual display, but it will affect the output of the vector field in a text file.

Postprocessing a Flow

You can postprocess the results of a computation (initial estimate or iteration) by checking the 'Postprocess' button and entering a threshold value in units of standard deviations. During postprocessing, the mean vector and standard deviation distance from the mean are calculated for each grid point neighborhood. (grid / dist) weights are applied to both the mean and standard deviation. Then, those flow vectors which are outside of the threshold for the neighborhood are replaced by the mean for the neighborhood. Again, this is like applying a spacial lowpass filter to the flow field. In the following figure, postprocessing with a threshold of 2.5 (left) and 0.5 (right) standard deviations was applied to the initial flow estimation (no iterations):

In each plot, the 'pchanges' label shows how many flow vectors were replaced by their neighborhood means during postprocessing. This count is independent of the number of changes resulting from an iteration (the 'nchanges' label). Postprocessing, if selected, is applied at the end of the run cycle (initial calculation plus any immediate iterations) and at the end of the step cycle (1 or more additional iterations). Note that postprocessing does not change any of the candidate vectors, it only replaces the displayed flow vector with the mean. Therefore it will not affect subsequent iteration calculations. However, postprocessing may produce flow vectors which are not any of the candidate vectors.

Saving a Flow

You can save the vector flow as a text file by choosing the 'File->Save Vectors...' menu item. You will be prompted for a name and folder, and the extension '.txt' will be appended to the file name.

Vector files are saved in the following format:

Number of vectors (lines following)
X-coordinate  Y-coordinate  DX-difference  DY-difference  Value
X-coordinate  Y-coordinate  DX-difference  DY-difference  Value
X-coordinate  Y-coordinate  DX-difference  DY-difference  Value
etc...

All coordinates are in pixels. DX and DY will be in fractional values if sub-pixel resolution has been performed. The value column is the raw parametric correlation coefficient for initial flows, or the normalized vector probability for iterated flows. If you save the vectors while the plot is showing 'everything', all candidate vectors for each (x, y) grid coordinate will be saved in addition to the chosen flow vector, sorted in descending value. Note that there will not in general be the same number of candidate vectors for each grid point, nor will there always be any candidates, depending on the specific shape and values of the correlation surface for that grid point, and whether the domain for that grid point contains any non-zero pixels.

Saving a Plot

You can save the vector plot as an image file (compressed TIFF) by choosing the 'File->Save TIFF...' menu item. You will be prompted for a name and folder, and the extension '.tiff' will be appended to the file name. Note that to save the entire plot, you must first resize the plot window to see the whole plot, otherwise only the visible area plus the scrollbars and window title will be saved, for some reason. Saving multiple TIFF images allows you to create a movie of the flows using QuickTime Pro. Here is an example of a 2 frame movie made from the Venus images shown above, plus one additional composite image from later in the observation the same day: movie.

Setting FITSFlow Defaults

You can create a file called 'FITSFlow.defaults.txt' in the same folder as FITSFlow.app and use this file to set the default parameters for a new FITSFlow document. This file must be saved in Unix format (e.g. with newlines ('\n') at the end of each line of text). If this file exists, it will be read and used to initialize the settings in every new document. The syntax for this file is as follows. Spelling is not case-sensitive, and spaces are optional:

#              comment line (must be first char on line)
//             comment line (must be first chars on line)

file1          = file-path
file2          = file-path
gridX          = integer-value
gridY          = integer-value
domainX        = integer-value
domainY        = integer-value
rangeX         = integer-value
rangeY         = integer-value
rangeOffsetX   = integer-value
rangeOffsetY   = integer-value
imageScale     = {0: automatic, 1: manual}
imageMax       = real-value
imageMin       = real-value
plotMode       = {0: flow only, 1: everything}
plotLabels     = {0: no, 1: yes}
vectorLength   = real-value
vectorOffsetX  = real-value
vectorOfffsetY = real-value
vectorScale    = {0: automatic, 1: manual}
vectorMax      = real-value
clipMode       = {0: no, 1: yes}
neighborhood   = real-value
sigma          = real-value
resolution     = {0: 1 px, 1: 1/2 px, 2: 1/4 px, 3: 1/8 px, 4: 1/16 px, 5: 1/32 px, 6: 1/64 px}
postProcess    = {0: no, 1: yes}
thresh         = real-value
runIters       = integer-value
stepIters      = integer-value

Batch File Commands

In addition to reading default settings from FITSFlow.defaults.txt, you can read settings and commands from a batch file by choosing 'File->Run batch file...'. Batch files are text file using the same syntax as above. In addition, they also support the following commands:

run
step
savevectors    file-path
savetiff       file-path
quit

You can place any number of commands in a batch file. Typically you would make a batch file something like the following:

// file, display, and run settings
file1          = file-path
etc...

run
savevectors    file-path
savetiff       file-path

// more file, display, and run settings
file1          = file-path
etc...

run
savevectors    file-path
savetiff       file-path

etc...

quit

Batch commands can also be placed in FITSFlow.defaults.txt, so that the program will automatically run them from startup.

Tips for Use

1. Input images must be good quality, and free from artifacts and distortion. All defects will affect the flow calculation. Ideally, both images should have the same range of values, with 0 value indicating a mask region to be ignored.

2. Start with a large grid size to speed calculations and give an idea of the shape of the flow. If a smaller grid size makes the vectors too crowded to see, decrease their size with the 'length' field.

3. The domain size and shape should represent the actual features in the images. For the Venus images used above, domain sizes of 16-32 pixels are reasonable. Larger domains take longer to calculate. There should be some obvious variation in image brightness within each domain. Smaller domains will find more false matches, larger domains (with varying signal) will find fewer matches, but will not localize the flow as precisely.

4. Range sizes, shapes, and offsets should also represent actual features and expected flows. Ranges should be larger than domains by an amount at least equal to the expected flow magnitude. Flow candidates will only be found in the interior of ranges, not at the edges.

WARNING: If range sizes are too small, or in the wrong locations, to find a match to displaced domains, the resulting flow will be wrong and likely random. Peaks (vectors) are not found at range edges. Larger ranges take much longer to compute. Larger ranges will also find more false matches, especially if small domains are used.

5. For the iterated method to do anything, the neighborhood size must be at least equal to the grid size (or grid points will have 0 neighbors). Using a neighborhood radius of 1-3 grids (minimum dimension) is probably prudent, although this should ideally be based on some physical constraint concerning how neighboring flows actually affect one another. Larger neighborhoods take considerably longer to compute.

6. Sigma should be based partly on the expected magnitudes of nominal flows, and partly on the desired speed of flow convergence. Sigma is the distance in flow difference (magnitude in pixels) for which the vector correlation coefficient between two vectors drops to 1/e (maximum of 1). The maximum expected difference between neighboring flows is probably a good lower bound, with larger values changing the flows less per iteration. You can gauge how fast the iterations are converging by watching the 'nchanges' label in the plot.

7. You probably don't want to plot all vectors ('everything') for small grid sizes.

8. Labels in the plot can be turned on/off by checking/unchecking the button and then clicking 'refresh'.

9. Image scale can be set at any time. You can darken an image (by raising the maximum value in manual mode) to see the vectors better.

10. Stop and Reset may take a few seconds to have an effect if the calculations are currently running (especially sub-pixel calculations). If stop or reset takes more than a few seconds, you will probably need to kill the program using the command-option-escape keys.

11. Run always does a reset before starting new calculations. Reset always does a stop.

12. When in doubt about the plot, click 'refresh' to redraw it.

13. You can run FITSFlow from the Unix command line with something like the following:

   open ~/FITS/FITSFlow/FITSFlow86.app

This will also cause FITSFlow.defaults.txt to be read and any batch settings and commands to be processed.

References

1. GLACIER SURFACE MOTION COMPUTATION FROM DIGITAL IMAGE SEQUENCES, A.N. Evans, Department of Electronic and Electrical Engineering, University of Bath, Claverton Down, Bath. BA2 7AY, United Kingdom. To appear in IEEE Transaction on Geoscience and Remote Sensing

2. A Correlation-Relaxation-Labeling Framework for Computing Optical Flow - Template Matching from a New Perspective, Qing X. Wu, Member, IEEE, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 17, No. 8, September 1995.



Please send bug reports, comments, corrections, wish-lists, etc..., to skycoyote@comcast.net.
İSky Coyote 2007