Support for adjacency matrices in autoregressive models #54

Merged
francois-rozet merged 24 commits into probabilists:master from adrianjav:adj-matrix
Jul 30, 2024
Conversation

@adrianjav
Contributor

@adrianjav adrianjav commented Jul 19, 2024

This is an implementation of #53, in which I add support for providing an adjacency matrix to the autoregressive models, instead of relying on the ordering between variables.

I have added some asserts to cover all corner cases (e.g., providing both an ordering and an adjacency matrix), and added tests to verify that the zero entries of the adjacency matrix indeed correspond to zero entries in the Jacobian matrix.

I might have missed something, and I am happy to fix it if that's the case.

PS: I read the policy about commit messages a bit too late; we can squash all commits into a single one and write an appropriate commit message.

@francois-rozet
Member

francois-rozet commented Jul 22, 2024

Hello @adrianjav, thank you for this PR. I quickly went through the modifications which looked good to me. I'll do a proper review this week. I just have a question:

Enabling to construct a MaskedAutoregressiveTransform directly from an adjacency matrix makes perfect sense. However, I am not sure if a series of masked autoregressive transformations (i.e. a MAF) with the same adjacency matrix makes sense. Could you explain in what case having several transformations with the same adjacency would be used?

@adrianjav
Contributor Author

Hi @francois-rozet, thanks for already taking a look at the PR!

Could you explain in what case having several transformations with the same adjacency would be used?

While we chose to have a single flow layer, we consider these types of architectures in our article (Table 1); the idea is that you get finer control over the way information flows through the network.

I implemented it this way so that the adjacency matrix could be used directly in all the derived flows (i.e., MAF, NSF, ...) without having to add any more code, but I agree that it can be seen as rather arbitrary. I would propose two alternatives:

  1. Add a check so that if an adjacency matrix is used, then the number of layers needs to be 1.
  2. Allow specifying either a single adjacency matrix or a list of them (one per layer), so that the user has even finer control over the way the flows are structured.

What do you think?

@francois-rozet
Member

francois-rozet commented Jul 24, 2024

Hello @adrianjav,

The current philosophy of Zuko regarding pre-built flows (MAF, NSF, NAF, ...) is to make them very simple and avoid overloading them with many features. We don't want to confuse beginners. If the user needs more control, they are expected to build their own custom architecture using lower-level components such as transformations and distributions.

In this case, we would expect the user to instantiate the list of MaskedAutoregressiveTransform themselves given the adjacency matrices and then combine the transformations as a Flow. Note that this would be necessary anyway to build a generative (instead of abductive) flow. For example, for a single-layer abductive flow:

flow = zuko.flows.Flow(
    transform=MaskedAutoregressiveTransform(
        features=5,
        context=8,
        adjacency=adjacency,  # 5x5 matrix
    ),
    base=base,
)

Therefore, I think that this PR should only modify the MaskedAutoregressiveTransform class. In addition,

  • There are constraints on the adjacency $A$ for the transformation to be invertible. I think checking that the adjacency satisfies these constraints would be great. For example, $A$ should be lower-triangular (with zeros on the diagonal) with respect to the variable ordering.

  • The number of "passes" for the inverse transformation can be inferred from the adjacency $A$. If there is a computationally efficient way to do it, it would also be a nice addition.
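As an illustration of the second point, one computationally simple way to infer the number of passes is to take powers of the adjacency: the length of the longest dependency chain is the number of sequential passes the inverse needs. This is only a sketch I am adding for clarity, not code from the PR, and it assumes a strictly lower-triangular (i.e., acyclic, zero-diagonal) boolean adjacency:

```python
import torch

def num_passes(adjacency: torch.BoolTensor) -> int:
    """Number of sequential passes needed to invert the transform, i.e. the
    length of the longest dependency chain in the adjacency.

    Assumes the graph is acyclic (the loop would not terminate otherwise).
    """
    A = adjacency.long()
    P = A.clone()
    passes = 1
    # (A^k)[i, j] != 0 iff there is a dependency path of length k from i to j,
    # so we keep multiplying until all paths are exhausted.
    while P.any():
        P = (P @ A).clamp(max=1)
        passes += 1
    return passes
```

For a full lower-triangular adjacency over n features this returns n (fully sequential inversion), while an all-zero adjacency returns 1 (all features invertible in parallel).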

@adrianjav
Contributor Author

adrianjav commented Jul 26, 2024

Hi,

Yeah, I fully understand and respect that philosophy. I will revert the changes I made to MAF and keep them local to MaskedAutoregressiveTransform. As for the other two points you suggested:

  • The matrix $A + I$ should be invertible, not $A$, right? (assuming a zeroed diagonal).
  • If I understand correctly, the number of passes is just the diameter of the graph defined by $A$, right? I can add code to compute that, but it only saves unnecessary computations, doesn't it?

I will add the necessary checks for the adjacency matrix.

PS: By the way, what is the policy with respect to adding dependencies? I am thinking that using https://networkx.org/ for the checks would make the code much simpler and reliable.

@adrianjav
Contributor Author

I have added the necessary changes and reverted those involving MAF.

As a summary, it now checks whether the matrix has cycles, and infers the order (for the string representation) as well as the number of passes (the diameter of the graph). There is no need to check invertibility, since we assume ones on the diagonal and no cycles.

For the algorithm, I have simplified the code from the networkx package to handle our specific case.

Let me know if I missed something.
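The generations-based check described above could be sketched roughly as a Kahn-style pass, similar to networkx's topological_generations (this is an illustrative simplification, not the PR's actual code):

```python
def topological_generations(adjacency):
    """Groups features into 'generations': each generation depends only on
    features from earlier generations. `adjacency[i][j]` is truthy if feature
    i depends on feature j. Raises if the graph has a cycle."""
    n = len(adjacency)
    indegree_map = {i: sum(map(bool, adjacency[i])) for i in range(n)}
    this_generation = [i for i, d in indegree_map.items() if d == 0]
    for i in this_generation:
        del indegree_map[i]
    all_generations = []
    while this_generation:
        all_generations.append(this_generation)
        next_generation = []
        for child in list(indegree_map):
            # Count dependencies of `child` resolved by the current generation.
            resolved = sum(bool(adjacency[child][parent]) for parent in this_generation)
            indegree_map[child] -= resolved
            if indegree_map[child] == 0:
                next_generation.append(child)
                del indegree_map[child]
        this_generation = next_generation
    assert not indegree_map, "The adjacency matrix contains cycles"
    return all_generations
```

The number of generations is the number of passes, and concatenating the generations yields a topological order.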

Member

@francois-rozet francois-rozet left a comment


Hello @adrianjav, thank you for implementing the adjacency checks. This is surprisingly simple (and efficient)! I read the PR carefully and left a few comments.

Comment on lines +92 to +95
order: LongTensor = None,
univariate: Callable[..., Transform] = MonotonicAffineTransform,
shapes: Sequence[Size] = ((), ()),
adjacency: BoolTensor = None,
Member


Suggested change
order: LongTensor = None,
univariate: Callable[..., Transform] = MonotonicAffineTransform,
shapes: Sequence[Size] = ((), ()),
adjacency: BoolTensor = None,
order: LongTensor = None,
adjacency: BoolTensor = None,
univariate: Callable[..., Transform] = MonotonicAffineTransform,
shapes: Sequence[Size] = ((), ()),

Move adjacency just below order. (also in the docstring)

Comment on lines +112 to +117
assert (order is None) or (
adjacency is None
), "Parameters `order` and `adjacency_matrix` are mutually exclusive."
assert (passes == features) or (
adjacency is None
), "When using `adjacency_matrix`, `passes` has to be the number of features."
Member


Does not match the docstring.

Contributor Author


Could you expand on this? I am not sure I understand what you mean

Member


The docstring says that passes and order are ignored when adjacency is provided. That is not what happens here.

Contributor Author


Oh yes, I see now, thanks!

# Hyper network
self.hyper = MaskedMLP(adjacency, **kwargs)

def _check_adjacency(self, adjacency: BoolTensor) -> None:
Member

@francois-rozet francois-rozet Jul 26, 2024


Could you add a small description of what this function does? (e.g. checks that the graph is acyclic and returns the diameter + link to networkx).

Comment on lines +171 to +172
self.passes = len(all_generations) # Graph diameter
self.order = torch.tensor(reduce(list.__add__, all_generations))
Member

@francois-rozet francois-rozet Jul 26, 2024


This function should return the diameter instead of modifying it in-place.

Also, currently, the order argument is a topological order, but the attribute self.order is not. It represents the "computation" order, or in this code the "generation" of each feature. So I think we should not assign the topological order to self.order, even though it is an interesting quantity.

I think simply setting self.order = None when adjacency is provided is good.

Contributor Author


The adjacency matrix describes the DAG that the data-generating process follows, so any of its topological orders is indeed a generation order (i.e., if you group the variables that are generated simultaneously), isn't it? When the ordering is provided to the constructor, there is no grouping to be made (the adjacency matrix is a full lower-triangular matrix), so there is only one topological order. Or did I get something wrong?

Member


I think you misunderstood my comment. Currently, the attribute self.order is not a topological order so you should not assign a topological order to it.

For example, when passes=2 and features=5, self.order is [0,0,0,1,1], meaning that the 3 first features can be computed together (they are in the same generation).
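To make this semantics concrete, here is a small illustrative sketch (not from the PR) that builds such a computation-order vector from a list of generations:

```python
import torch

# Hypothetical generations for 5 features and 2 passes: features 0, 1, 2 are
# independent (first generation), features 3 and 4 depend on them (second).
all_generations = [[0, 1, 2], [3, 4]]

# order[j] is the generation level of feature j.
order = torch.zeros(5, dtype=torch.long)
for level, generation in enumerate(all_generations):
    order[generation] = level

print(order.tolist())  # [0, 0, 0, 1, 1]
```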

Contributor Author


Yes, I did misunderstand it, my bad.
If that's the case, we could also compute the order in a Pythonic way by just checking which generation level each variable is in:

self.order = torch.tensor(
    [i for j in range(adjacency.size(0)) for i, x in enumerate(all_generations) if j in x]
)

But of course it is your call, I have no strong opinions on this matter.

Member

@francois-rozet francois-rozet Jul 26, 2024


I would prefer setting self.order to None. It is not used in the class anyway.

adjacency = out_order[:, None] > in_order
else:
order = torch.as_tensor(order)
adjacency.mul_( # Remove the diagonal (if it is non-zero)
Member


The class should raise an error if the diagonal is non-zero, not silently "fix" it.

Contributor Author


I opted for doing it silently since "in reality" the diagonal is non-zero (because of the transformer), and it needs to be zeroed for the conditioner. I figured that fixing it would be less confusing, but I can also just raise an error.

Member

@francois-rozet francois-rozet Jul 26, 2024


I see what you mean, but then we should impose that the matrix has ones on the diagonal, and raise an error if it does not.

The reason why I prefer raising an error is that if a user makes a mistake (e.g. they permute the rows but they forget to permute the columns), they might miss it if we "fix" it silently.

Contributor Author


That is actually a pretty good reason to check it. I'll raise an error then 👍🏼

del indegree_map[child]
all_generations.append(this_generation)

assert not indegree_map, "The adjacency matrix contains cycles"
Member


Suggested change
assert not indegree_map, "The adjacency matrix contains cycles"
assert all(d == 0 for d in indegree), "The graph contains cycles."

def extra_repr(self) -> str:
base = self.univariate(*map(torch.randn, self.shapes))
order = self.order.tolist()
order = self.order
Member


Suggested change
order = self.order

Comment on lines +221 to +222
adjacency: Adjacency matrix that all layers of the flow should follow.
If different from :py:`None`, then `randperm` should be :py:`False`.
Member


Remove adjacency in MAF.

return '\n'.join([
f'(base): {base}',
f'(order): {order}',
f'(adjacency): {adjacency}',
Member

@francois-rozet francois-rozet Jul 26, 2024


I think we should either show order or adjacency, not both.

@adrianjav
Contributor Author

Thanks a lot for the review. I think all changes are more or less clear and I will start implementing them. I would actually appreciate it if, after the next changes, you could put the final touches (e.g., adjust the __repr__ function to your liking), as it may be quicker that way.

PS: I am going on vacation tomorrow for a week, so I will be a bit less active.

@francois-rozet
Member

Sure, I'll finish the PR! Tell me when I can take over.

@adrianjav
Contributor Author

adrianjav commented Jul 27, 2024

I just made most of the changes suggested in the review. As I said, I'd be grateful if you could take over and implement the string representation to your liking.

Another comment, regarding the diagonal of adjacency: it should be zero for the code to work, but I think the reason you gave is a very valid one, so I have opted to force the user to fill it with ones, which I then replace with zeros internally. Feel free to change it if you think another way is better.

@francois-rozet
Member

francois-rozet commented Jul 29, 2024

Hello @adrianjav, I have made a few modifications to __init__, extra_repr and the tests. For the tests, I replaced the fixed adjacency matrices with random ones, which can help to detect more issues. For the representation, I chose to display the number of passes when the adjacency is provided.

For the __init__, I chose to revert to zeros on the diagonal as this is the convention in your paper and in Wehenkel et al. (2021). However, we still test that the diagonal is full of zeros, to prevent user mistakes.

I also really liked the way you added columns to the adjacency for the context and duplicated rows for the output so I made this part common for both order and adjacency.
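The extension mentioned here could look roughly like this. To be clear, this is a hedged sketch of the idea, not the PR's actual code; the shapes, the `total` outputs-per-feature parameter, and the use of repeat_interleave are my assumptions:

```python
import torch

features, context, total = 3, 2, 2  # `total`: conditioner outputs per feature, e.g. shift and scale

# Feature-to-feature dependencies, here a full lower-triangular matrix with
# ones on the diagonal (the convention settled on above).
adjacency = torch.tril(torch.ones(features, features, dtype=torch.bool))

# Every feature may depend on every context variable: append `context` columns of ones.
adjacency = torch.cat((adjacency, torch.ones(features, context, dtype=torch.bool)), dim=1)

# Each feature has `total` conditioner outputs: duplicate each row accordingly.
adjacency = torch.repeat_interleave(adjacency, total, dim=0)

print(adjacency.shape)  # torch.Size([6, 5])
```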

If everything is good for you, I am happy to merge the PR!

@francois-rozet
Member

francois-rozet commented Jul 30, 2024

In the end I have reverted to ones on the diagonal. As you said, I think it is easier to understand that adjacency describes the transformation graph rather than the conditioning graph.

I will now merge this PR. Thank you for this contribution!

@francois-rozet francois-rozet merged commit 146bb4b into probabilists:master Jul 30, 2024
@adrianjav
Contributor Author

Glad to contribute! ☺️

Development

Successfully merging this pull request may close these issues.

Add support for custom adjacency matrices in autoregressive models