The style of this post is going to be a bit different from the usual PA posts, but I thought this journey was worth documenting somewhere on the off chance it’s helpful.
So here we go: how to pick random ROIs within an original area.
It all started one fateful night, when I came across this post on the ImageJ Forum. The original poster wanted to pick two sampling ROIs from within an image with a selection ROI already drawn on.
In my original reply, this was the first (slightly harebrained) idea that came to mind. Assuming a shape with low levels of concavity, you can find the centre of mass of the original selection, then draw a line to a point on the circumference of the selection ROI.
You can then randomly pick a point along that line between the centroid and the selection ROI circumference and draw your ROI. Boom! Random sampling in an amorphous shape.
In order to make sure subsequent ROIs are non-intersecting, I suggested calculating the predicted area of the ROIs (assuming we would make them circular), then comparing this to the measured area of the logical OR of the two ROIs. Pretty smart eh?
Turns out: not really.
I actually spent an evening writing a script to perform this analysis. You can find it here.
One of the interesting problems I came across (which we’ll revisit later) is how ImageJ measures selections and overlays versus pixels. Here’s a quick example. If you want to play along, run this code in the script editor:
newImage("Untitled", "8-bit black", 256, 256, 1); run("Specify...", "width=100 height=100 x=128 y=128 oval centered"); run("Measure");
This simply draws a circle selection in the middle of a new image. With a radius of 50 pixels, we can predict what the Area will be with A=π·r² and it comes out to 7853.98. That’s all well and good, except if you followed along above, you’ll know that the result we got was slightly different: 7860.
Oh. To find out what’s happening, you can run one further command: run("Fill", "slice"); This should fill your circle. If you now use the magic wand to select the object and hit Measure again… you guessed it: 7860.
What this means is that when you have a vector selection (like a circle), what is actually being measured are the underlying pixels (which makes sense really).
Unfortunately, this means a slight detour when trying to calculate the predicted ROI area versus the logically combined area. You can see how I went about it here (TL;DR: make a single ROI, measure it, then multiply this by however many ROIs you have).
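To see the same effect outside ImageJ, here is a small standalone Python sketch (not the macro from the repo). It counts the pixels whose centres fall inside a circle, then uses that pixel count, rather than π·r², as the predicted area when checking whether equal-sized circles overlap. The function names and the pixel-centre rule are my assumptions for illustration. ImageJ's own rasterisation may differ, so don't expect this to reproduce 7860 exactly.

```python
import math

def circle_pixels(cx, cy, r):
    """Return the set of (col, row) pixels whose centres lie inside the circle."""
    pixels = set()
    for row in range(math.floor(cy - r), math.ceil(cy + r) + 1):
        for col in range(math.floor(cx - r), math.ceil(cx + r) + 1):
            # Assumed rule: a pixel counts when its centre is inside the circle.
            if (col + 0.5 - cx) ** 2 + (row + 0.5 - cy) ** 2 <= r * r:
                pixels.add((col, row))
    return pixels

def any_overlap(circles):
    """circles: list of (cx, cy, r) with integer centres and one shared radius.

    Predicted area = pixel area of a single circle * number of circles.
    If the combined (OR) area is smaller than that, some pixels were shared.
    """
    single_area = len(circle_pixels(*circles[0]))
    predicted = single_area * len(circles)
    combined = set()
    for circle in circles:
        combined |= circle_pixels(*circle)
    return len(combined) < predicted

analytic_area = math.pi * 50 ** 2                 # 7853.98...
pixel_area = len(circle_pixels(128, 128, 50))     # an integer pixel count
print(f"analytic: {analytic_area:.2f}, pixels: {pixel_area}")
print(any_overlap([(50, 50, 10), (60, 50, 10)]))  # centres 10 apart, radius 10
print(any_overlap([(50, 50, 10), (80, 50, 10)]))  # centres 30 apart, radius 10
```

Keeping the centres on integer coordinates means every circle of the same radius covers the same number of pixels, which is what makes "measure one, multiply by the count" a fair prediction.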
Back to the Question
This is all well and good but it always nagged me that the centroid to circumference sampling technique would bias the sampling towards the middle of the selection ROI. I’m not trained in maths at all (assuming you don’t count that GCSE) so I have no idea how to show theoretically that this is not a uniform sampling. So I resort to my favourite approach: brute-force empiricism.
I modified the code a bit and created 10,000 single ROIs in the same selection ROI (example1.tif in the project repo). It looked something like this:
Now the important part. If you plot the distribution of coordinates in X and Y you should (I think for a roughly symmetrical shape) get a uniform distribution (like rolling a single die). Hmm.
OK so clearly the distribution highly favours the middle of the object. Is this something to do with the amorphous shape of the object? Let’s try again with a circle:
And the distribution?
Well at least it was a nice ride.
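For the circle case, the same brute-force check can be sketched in a few lines of standalone Python (again, not the macro from the repo, and all names here are made up for illustration). It ignores the ROI size and just samples points. A circle with half the radius covers a quarter of the area, so uniform sampling should put roughly 25% of points inside it. Picking a uniform position along a centroid-to-edge line puts roughly 50% there, because half of every line lies within the inner circle.

```python
import math
import random

def centroid_to_edge_point(radius, rng):
    """Uniform angle, then a uniform position along the centroid-to-edge line."""
    angle = rng.uniform(0, 2 * math.pi)
    dist = rng.uniform(0, radius)
    return dist * math.cos(angle), dist * math.sin(angle)

def bounding_box_point(radius, rng):
    """Uniform X and Y in the bounding box, retried until the point is inside."""
    while True:
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            return x, y

def fraction_in_inner_half(sampler, radius=100.0, n=10000, seed=1):
    """Fraction of n sampled points that land within radius / 2 of the centre."""
    rng = random.Random(seed)
    inner = 0
    for _ in range(n):
        x, y = sampler(radius, rng)
        if math.hypot(x, y) <= radius / 2:
            inner += 1
    return inner / n

line_fraction = fraction_in_inner_half(centroid_to_edge_point)
box_fraction = fraction_in_inner_half(bounding_box_point)
print(f"centroid-to-edge: {line_fraction:.3f}")  # expected to be near 0.5
print(f"bounding box:     {box_fraction:.3f}")   # expected to be near 0.25
```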
Doing it right
Perhaps the more obvious way to go about this is simply to randomly select coordinates in X and Y then do two checks:
- Does the whole ROI, when created at a random centroid fall within the original selection?
- Does the ROI overlap with any previous ROIs created?
I think the first idea seemed more appealing because it superficially looks more likely to produce ROIs within the original shape (only in exceptionally convoluted shapes would you ever have to make the first check). However, computation is cheap, so I spent another evening coding this second approach.
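As a rough illustration of the two checks, here is a standalone Python sketch with circular ROIs. It is not the macro from the repo: the ellipse standing in for the selection, the function names and the ROI radius are all assumptions. The "whole ROI inside" check only tests the centre plus a ring of points on the ROI edge, so it is an approximation.

```python
import math
import random

def roi_inside_selection(contains, cx, cy, r, n_points=36):
    """Check 1 (approximate): centre and sampled edge points lie in the selection."""
    if not contains(cx, cy):
        return False
    for k in range(n_points):
        a = 2 * math.pi * k / n_points
        if not contains(cx + r * math.cos(a), cy + r * math.sin(a)):
            return False
    return True

def overlaps_existing(cx, cy, r, placed):
    """Check 2: equal circles overlap when their centres are closer than 2r."""
    return any(math.hypot(cx - px, cy - py) < 2 * r for px, py in placed)

def in_ellipse(x, y):
    """Hypothetical selection: ellipse centred at (128, 128), semi-axes 100 and 60."""
    return ((x - 128) / 100) ** 2 + ((y - 128) / 60) ** 2 <= 1

rng = random.Random(0)
roi_radius = 8
placed = []
for _ in range(500):  # a fixed number of candidates, so this loop always ends
    cx = rng.uniform(28, 228)  # bounding box of the ellipse in X
    cy = rng.uniform(68, 188)  # bounding box of the ellipse in Y
    if not roi_inside_selection(in_ellipse, cx, cy, roi_radius):
        continue
    if overlaps_existing(cx, cy, roi_radius, placed):
        continue
    placed.append((cx, cy))

print(len(placed), "ROIs placed")
```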
Here are 10,000 ROIs and already you can see the distribution looks to have much more even coverage.
And the distribution confirms this too.
If you want to play you can find the code here.
Mo Value, Mo Problems
Even though it’s not particularly complex code, I’m quite pleased with the way it’s been implemented, especially as there are some things that tripped me up along the way.
For example, there are edge cases where the code would get stuck in an infinite loop. One such case is when the region ROI is bigger than the original selection ROI. Of course we would never exit the loop that prevents ROIs being outside the original, because the region ROI would always extend beyond the original bounds. The same goes for posing an unsolvable packing problem. If you are asking for more ROIs than can fit inside the original shape without overlapping, you’ll continually create ROIs that will be discarded.
Both of these problems are solved with a ‘tries’ counter (one for inside vs outside and one for creating non-overlapping ROIs). The variable (numTries_inside in this example) is initialised with a value of zero. Every time we start a loop, we check the variable and if it’s above a certain number, we exit the macro with a message. Importantly, you need two other things:
- Every time you fail the ‘valid ROI’ test, you need to increment the counter.
- Every time the test succeeds (i.e. you create a valid ROI), the counter is reset.
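The counter pattern can be sketched in standalone Python as below. This is an illustration rather than the macro itself: it uses a single counter instead of two, raises an error instead of exiting a macro, and the names (place_rois, num_tries, max_tries) and the 1-D toy problem are all assumptions.

```python
import random

def place_rois(n_wanted, make_candidate, is_valid, max_tries=1000):
    """Collect n_wanted valid candidates, giving up after too many failures in a row."""
    placed = []
    num_tries = 0                    # the counter starts at zero
    while len(placed) < n_wanted:
        if num_tries > max_tries:    # checked at the start of every pass
            raise RuntimeError(
                f"Gave up after {num_tries} consecutive failed attempts"
            )
        candidate = make_candidate()
        if is_valid(candidate, placed):
            placed.append(candidate)
            num_tries = 0            # success: reset the counter
        else:
            num_tries += 1           # failure: increment the counter
    return placed

# Toy 1-D problem: segments of length 10 with centres between 5 and 95.
# At most 10 of them can fit without overlapping.
rng = random.Random(0)

def make_candidate():
    return rng.uniform(5, 95)

def is_valid(centre, placed):
    return all(abs(centre - p) >= 10 for p in placed)

easy = place_rois(3, make_candidate, is_valid)
try:
    place_rois(50, make_candidate, is_valid)  # unsolvable packing problem
    gave_up = False
except RuntimeError:
    gave_up = True

print(len(easy), gave_up)
```

Resetting on success matters because the limit is meant to catch a run of failures for one ROI, not to cap the total number of attempts across the whole job.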
Another really obvious point, but we only select the random X and Y coordinates within the bounding box of the original shape. No point in spending time testing positions outside of that.
Finally, always be mindful when dealing with calibrated vs uncalibrated images. My initial testing was all done on uncalibrated images. Then, when I tried it out on a calibrated image, it all broke because anything that’s measured (including the bounding box early on in the code) is returned in calibrated units. I took the lazy way out in this regard: I uncalibrated the image at the start, did all the processing in pixels, then re-calibrated at the end.
So there it is. The moral of the story is that sometimes you have to go away and think about a problem to realise that the first solution you come up with is not necessarily the best one (or even a particularly good one).
The example images and code are all in the repo, so go and have a play! With any luck the workflow and application are helpful, or at least interesting to a few people.
…also, if you value regular sleeping patterns: don’t read the ImageJ forum just before bed.