Skip to content

tckgen: Default streamline count fix#809

Closed
Lestropie wants to merge 1 commit intomasterfrom
tckgen_default_number_fix
Closed

tckgen: Default streamline count fix#809
Lestropie wants to merge 1 commit intomasterfrom
tckgen_default_number_fix

Conversation

@Lestropie
Copy link
Copy Markdown
Member

Only revert to default 1000 streamlines if neither -number nor -maxnum provided.

As discussed here.

@jdtournier
Copy link
Copy Markdown
Member

So as I argued in my response to this thread, I'm not convinced I agree with the rationale behind this change...

@Lestropie
Copy link
Copy Markdown
Member Author

But it seems many users do expect a fixed density of seeds, irrespective of how many seeds may fail. There are situations where such an approach might be appropriate (e.g. tractography-based parcellation of the thalamus), but these are catered for explicitly with the -seed_per_voxel options.

This requires knowledge of the number of voxels in the seed mask, so that one can set the appropriate number of seeds per voxel to get the desired total number of seeds. Easy I know, but may be indirect depending on experimental design.

I'm not convinced I can think of any other use cases where a fixed number of attempts makes sense: it implies that the number of resulting streamlines is being interpreted in some way, which is not considered a good idea.

That's a fair inferential leap. For instance they might simply be trying to match experimental design across software platforms.

... while I can appreciate that users may expect that setting -maxnum would override -number, I can't think of many use cases where that would be the recommended approach

Even if it isn't the recommended approach, surely it's better than this behaviour?

tckgen ... -act ACT_5tt.nii -backtrack -crop_at_gmwmi CSD.mif track_1.tck -maxnum 500000
tckgen: [100%]   162618 generated,     1000 selected

That to me is a hidden default value infiltrating through despite a user request having been made. It's not 'overriding -number', it's disabling the hidden default for -number if -maxnum is provided. Admittedly the current solution isn't quite there yet (e.g. pretty sure setting -number 0 -maxnum 0 will revert to default values with this solution, whereas with master code that's a flag to track indefinitely), but it makes more sense to me than leaving it as it is and trusting users to read and understand the documentation.

@thijsdhollander
Copy link
Copy Markdown
Contributor

it's disabling the hidden default for -number if -maxnum is provided

Hmm, it's subtle, but I'm not sure about this: if you say a "hidden default" is disabled, you imply there is an (even more hidden/implied) default that becomes the default if the default is disabled. I'd say, the current default isn't hidden: it's very explicit in the manual / command help. Yes, users do need to read it to know it, but why else do we have manuals?

About this overall issue: I have the feeling a lot could be resolved by keeping the behaviour as is (on master), but maybe re-thinking the names of these options. Admittedly, if I would come in as a naive new user, I may be deceived by the name -maxnum being the number of tracks that I will end up with. It seems the emphasis needs to be more on the "generated" versus "selected" tracks. This could be achieved either in the name of the option itself (but that's an annoying interface/behaviour change), or maybe just in a better description of these options?

@thijsdhollander
Copy link
Copy Markdown
Contributor

Ok, just checked the actual command help: it seems the actual default (1000) is not mentioned. That one should definitely be added to the help text of the -number option.

@jdtournier
Copy link
Copy Markdown
Member

OK, I can see there are arguments for and against. Personally, I tend to also think this is more a matter of terminology than anything else. I stand by my earlier comment that in the overwhelming majority of cases, the current setup is appropriate - yes, there may be cases where users want to have a fixed number of seeds without using the explicit -seed_per_voxel options, but they should really be a minority. The particular example mentioned:

For instance they might simply be trying to match experimental design across software platforms.

is a case in point: this would only affect a very small minority of users, and if they genuinely want to match across software platforms, they will definitely need to read the docs very carefully. Besides, I suspect MRtrix3 is the only platform that uses random uniform seeding by default, I expect most other packages out there will seed on a uniform grid, potentially with multiple seeds per location. This can only be matched using a non-default seeder anyway, for which the option doesn't apply.

I'm not saying that there can't be any legitimate use cases for this, just that I struggle to think of any that genuinely can't or shouldn't be addressed differently.

Onto what we can do about it: I've been thinking along similar lines to @thijsdhollander: the terminology is confusing and not explicit enough. We definitely need to document what the default is for -number (it used to be documented, I wonder what happened to that...?). But more to the point, we need to convey the sense of generated vs. selected streamlines in a way that is sufficiently succinct and obvious that people can usefully reason about it. One option might be to use 'seeds' for the first of these terms? And we could be more explicit about the selection. We could also add an option to achieve a set number of seeds explicitly (even though this can already be achieved using a combination of the previous two...). This would give us these options:

  • -select number: set the desired number of streamlines to be selected by tckgen, after all selection criteria have been applied (i.e. inclusion/exclusion ROIs, min/max length, etc). tkcgen will keep seeding streamlines until this number of streamlines have been selected, or the maximum allowed number of seeds has been exceeded (see -maxseeds option). By default, this is set to 1000 streamlines.
  • -maxseeds number: set the maximum number of seeds that tckgen will attempt to track from. This is used to prevent the program from running indefinitely when no streamlines can be found that match the selection criteria. By default, this is set to 100× the number of selected streamlines. Set to zero to disable.
  • -numseeds number: set the number of seeds that tckgen will attempt to track from. This overrides both -select and -maxseeds. Use this option if you genuinely need a constant number of seeds rather than selected streamlines. However, note that in most cases, the -seed_per_voxel options are likely to be more appropriate.

I think this should cover it. I'm not convinced about the name for the -select option, but I can't think of anything else that's as specific and succinct. Assuming we go ahead with this, we'll need to go through the doc page to make sure the terminology is consistent. Obviously this amounts to a pretty big interface change, and will totally break everyone's scripts... Happy to go along with that as long as it's part of a larger update...

Thoughts?

@thijsdhollander
Copy link
Copy Markdown
Contributor

It took me a little while to let those proposed names sink in, but I think I like where that logic is going. If we go this way: would it make sense to actually put -maxseeds and -numseeds in the "Tractography seeding options" category? Right now they are in the "Streamlines tractography options" category, together with a bunch of options that are actually much more about certain properties/constraints relating to the final tracks/tractogram. This may actually be part of the reason why people get confused. After all, as the proposed description for -maxseeds reads, it is in principle only intended to be a safeguard against running indefinitely, and is not really promoted or encouraged to be a core mechanism to define a property of the final tractogram (to make it a bit "safer" in this respect, we could even increase its default to 1000x selected streamlines; decreases chances that most users even have to bother with this option). If we would move these 2 options to the seeding category, the remaining -select should probably also become the first entry in its (current) category, so -step, -angle are closer to -maxlength, etc... the latter all being properties that define the properties of individual tracks.

@jdtournier
Copy link
Copy Markdown
Member

Happy with @thijsdhollander's suggestions for reorganising the options into the right sections, makes total sense. Going back to the original idea, I think changing the terminology from 'attempts' to 'seeds' will help reduce confusion massively in the long run. Still not sold on the -select option though, somehow its name is not clear - although its use on the command line would be fine: -select 1M is I think very expressive. But in isolation, it's hard to guess what -select might be about. Thoughts...?

@Lestropie
Copy link
Copy Markdown
Member Author

Another gotcha: There's a difference between the number of 'seeds' and the number of 'attempts', since sometimes the seed generator will nominate a point (and possibly direction) but the tracking algorithm init() function will fail. May not matter in the case of random seeding within a mask (since you just generate another candidate seed point), but when you request a fixed number of seeds per voxel there's a difference between the two.

Though how these seeding mechanisms work may or may not change in the future. shifty eyes

@jdtournier
Copy link
Copy Markdown
Member

Another gotcha: There's a difference between the number of 'seeds' and the number of 'attempts', since sometimes the seed generator will nominate a point (and possibly direction) but the tracking algorithm init() function will fail. May not matter in the case of random seeding within a mask (since you just generate another candidate seed point), but when you request a fixed number of seeds per voxel there's a difference between the two.

That's funny, I thought about this but came to the opposite conclusion: when you have a fixed number of seeds per voxel, there is a one-to-one relationship between seeds and attempts. However, I can see that things aren't all that simple: seeding may fail because there genuinely is nothing to track there (in which case this still classes as an attempt), but also because the random direction we picked didn't have sufficent FOD amplitude along it (in which case we should really be trying a bit harder to find a viable direction at that seed location before we can genuinely call it an attempt). So maybe we need to think this through a bit more carefully, but probably more in terms of how things are implemented than the terms used?

Though how these seeding mechanisms work may or may not change in the future. shifty eyes

You little teaser... Now WTF are you about to do?

@Lestropie
Copy link
Copy Markdown
Member Author

That's funny, I thought about this but came to the opposite conclusion: when you have a fixed number of seeds per voxel, there is a one-to-one relationship between seeds and attempts.

Logically yes, they're the same. But programmatically, if Method::init() fails, it won't register an 'attempt' with the Tractography::Writer.

Also another aside point: If the command-line options are to change, should the corresponding Tractography::Properties entries (and posibly also count / total_count) correspondingly change (bearing in mind they're currently not consistent anyway)?

... but also because the random direction we picked didn't have sufficent FOD amplitude along it (in which case we should really be trying a bit harder to find a viable direction at that seed location before we can genuinely call it an attempt).

Began playing with this yesterday. The current code for iFODx just selects the first direction it samples that's above threshold. I'm trying to get it using calibration & rejection sampling much like the tracking itself, so that seeding in a direction of large FOD amplitude is more probable. What I have works, except for the case of iFOD2 with large FOD power because the distribution gets sharpened so much.

The code also now logs the number of cases where init() succeeded but next() failed to propagate in either direction, and running with -info will report it.

You little teaser... Now WTF are you about to do?

Oh nothing too evil. Would just like to have seed mechanisms that set the number of successful streamlines per voxel rather than the number of attempts per voxel as is currently provided. Haven't actually started on it though. Currently I'm achieving the same by just generating single-voxel seeds and looping over tckgen runs, but doing it all in the one execution might help cut down on initialisation time / early cache misses.

@thijsdhollander
Copy link
Copy Markdown
Contributor

thijsdhollander commented Feb 7, 2017

Since this one (not the current commit, but where the discussion led to) seems to converge towards quite an important change (renaming) of interface to one of the core options of tckgen, it may be good to get it over with as part of tag_0.3.16. But I would probably wait with any further bigger changes (on any front) to tag_0.3.16 until repo_fhs_restructure is merged in... for the sake of not overcomplicating that particular merge... 🙂

@jdtournier jdtournier mentioned this pull request Feb 7, 2017
@jdtournier
Copy link
Copy Markdown
Member

Where do we stand with the changes we'd discussed here? I agree with @thijsdhollander that if we want to make such drastic changes, tag_0.3.16 is the best opportunity for this.

@thijsdhollander
Copy link
Copy Markdown
Contributor

thijsdhollander commented Feb 27, 2017

I've done the initial changes to get this going: see #921. The implementation of the -num_seeds functionality is still a TODO, but the interface should be fully functional and self-consistent...

@thijsdhollander
Copy link
Copy Markdown
Contributor

This has now been superseded by #921 .

@thijsdhollander thijsdhollander deleted the tckgen_default_number_fix branch March 9, 2017 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants