
speeding up get_design_matrix4triplet? #649

@scottstanie


Description of the desired feature

I was using the phase-closure matrix built in get_design_matrix4triplet, and it took longer than I expected. For reference, I had ~5000 interferograms from 100-120 SAR images, and it took around a minute to create the matrix (which was sized around (200000, 5000)).

Is your feature request related to a problem? Please describe

MintPy/mintpy/objects/stack.py

Lines 1244 to 1246 in 6745249

    triangle_idx.append([date12_list.index(ifgram1),
                         date12_list.index(ifgram2),
                         date12_list.index(ifgram3)])

This part of the matrix construction does 3 linear-time lookups (list.index) in the inner loop, one to find the position of each of the 3 interferograms in the closure loop.

The main idea of the speedup is to do something like this:

    # Create an inverse map from tuple(date1, date2) -> index in ifg list
    ifg_to_idx = {ifg: idx for idx, ifg in enumerate(date12_tuples)}

This lets you look up each position in constant time, which leads to around a 10x speedup for networks of 4000-5000 interferograms.
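To make the idea concrete, here is a minimal sketch of a closure-matrix builder that uses the inverse map. It is not the code on my branch or MintPy's implementation: the function name `build_closure_matrix_sketch` is hypothetical, the sign convention (+1 for date1_date2, +1 for date2_date3, -1 for date1_date3) is an assumption, and the row order may differ from what get_design_matrix4triplet returns.

```python
import itertools

import numpy as np


def build_closure_matrix_sketch(date12_list):
    """Hypothetical helper: one row per closed triplet of interferograms.

    Assumes each entry looks like 'YYYYMMDD_YYYYMMDD' with date1 < date2.
    """
    date12_tuples = [tuple(d12.split('_')) for d12 in date12_list]

    # Inverse map from (date1, date2) -> index in the interferogram list
    ifg_to_idx = {ifg: idx for idx, ifg in enumerate(date12_tuples)}

    # YYYYMMDD strings sort chronologically
    dates = sorted(set(itertools.chain.from_iterable(date12_tuples)))

    triangle_idx = []
    for d1, d2, d3 in itertools.combinations(dates, 3):
        # Three constant-time dict lookups replace three list.index calls
        i12 = ifg_to_idx.get((d1, d2))
        i23 = ifg_to_idx.get((d2, d3))
        i13 = ifg_to_idx.get((d1, d3))
        if i12 is None or i23 is None or i13 is None:
            continue  # this triplet is not fully present in the network
        triangle_idx.append((i12, i23, i13))

    # Assumed sign convention: phi12 + phi23 - phi13
    C = np.zeros((len(triangle_idx), len(date12_list)), dtype=np.float32)
    for row, (i12, i23, i13) in enumerate(triangle_idx):
        C[row, i12] = 1
        C[row, i23] = 1
        C[row, i13] = -1
    return C
```

For a fully connected network of n dates this still visits all n-choose-3 date triplets, but each visit costs three hash lookups instead of three scans over the interferogram list.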

Describe the solution you'd like

In [1]: from mintpy.objects import ifgramStack

In [2]: stack_obj = ifgramStack()
In [3]: start_date = datetime.date(2020,1,1)
In [4]: dates = [start_date + datetime.timedelta(days=i) for i in range(100)]
In [5]: dates[:3]
Out[5]:
[datetime.date(2020, 1, 1),
 datetime.date(2020, 1, 2),
 datetime.date(2020, 1, 3)]

In [6]: dates_str = [d.strftime("%Y%m%d") for d in dates]
In [7]: import itertools
In [10]: date12_tuples = list(itertools.combinations(dates_str, 2))
In [11]: date12_list = ['_'.join(tup) for tup in date12_tuples]
In [12]: date12_list[:4]
Out[12]:
['20200101_20200102',
 '20200101_20200103',
 '20200101_20200104',
 '20200101_20200105']

In [13]: len(date12_list)
Out[13]: 4950

In [14]: %time C1 = stack_obj.get_design_matrix4triplet(date12_list)
CPU times: user 40.5 s, sys: 668 ms, total: 41.2 s
Wall time: 41.3 s

# Tested my implementation in this script
In [15]: import ts_utils
In [16]: %time C2 = ts_utils.build_closure_matrix(date12_list)
CPU times: user 970 ms, sys: 1.71 s, total: 2.68 s
Wall time: 2.69 s


In [17]: np.allclose(C1, C2)
Out[17]: True

I've got the example implementation on a branch of my fork: scottstanie@d5a654e

Additional context

I'm not sure what the common use cases are of people currently using MintPy; if people have only a few hundred interferograms, this won't be a noticeable speedup. If people are doing things with 100+ SAR dates and 5000-10000 interferograms, this will cut off a few minutes.

Are you willing to help implement and maintain this feature? Yes/No
