Ability to blacklist ip ranges for federation traffic#5043
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #5043 +/- ##
===========================================
+ Coverage 61.75% 61.76% +0.01%
===========================================
Files 336 336
Lines 34639 34658 +19
Branches 5692 5694 +2
===========================================
+ Hits 21390 21407 +17
- Misses 11721 11723 +2
Partials 1528 1528 |
richvdh
left a comment
There was a problem hiding this comment.
generally looks sane. A few suggestions
synapse/config/server.py
Outdated
| # - syd.example.com | ||
|
|
||
| # Prevent federation requests from being sent to the following | ||
| # blacklist IP address CIDR ranges. |
There was a problem hiding this comment.
please could you explain what the default is (even if it's that nothing is blacklisted)?
synapse/config/server.py
Outdated
| # Prevent federation requests from being sent to the following | ||
| # blacklist IP address CIDR ranges. | ||
| # | ||
| #federation_ip_range_blacklist: |
There was a problem hiding this comment.
do you think we should uncomment this so that we blacklist the RFC1918 addresses by default for new deployments?
There was a problem hiding this comment.
Seeing as it would be a security benefit for the vast majority of use cases, I would argue yes.
There was a problem hiding this comment.
so would I. Please can you make it so?
There was a problem hiding this comment.
Done so, but requires having sytest override it so that tests don't fail.
synapse/http/client.py
Outdated
| "Dropped %s from DNS resolution to %s" % (ip_address, hostname) | ||
| ) | ||
| raise SynapseError(403, "IP address blocked by IP blacklist entry") | ||
| # Only raise a 403 if this request originated from a |
There was a problem hiding this comment.
I'm struggling to remember now. I remember I worked with @erikjohnston on this.
Something will passing tests perhaps?
There was a problem hiding this comment.
T'was actually due to special-casing for unit tests. We've now removed this, as the existing implementation didn't make sense anyways.
synapse/config/server.py
Outdated
| self.federation_domain_whitelist[domain] = True | ||
|
|
||
| self.federation_ip_range_blacklist = config.get( | ||
| "federation_ip_range_blacklist", None, |
There was a problem hiding this comment.
If we make the default [] rather than None we can save ourselves a bunch of special-casing?
There was a problem hiding this comment.
I suppose so since this is a blacklist and an empty one is the same as not having one at all.
tests/http/test_fedclient.py
Outdated
| self.assertNoResult(test_d) | ||
|
|
||
| # Make sure treq is trying to connect | ||
| # Make sure the req is trying to connect |
tests/http/test_fedclient.py
Outdated
| def test_client_ip_range_blacklist(self): | ||
| """Ensure that Synapse does not try to connect to blacklisted IPs""" | ||
| # Set up the ip_range blacklist | ||
| from netaddr import IPSet |
* develop: (34 commits) Add a default .m.rule.tombstone push rule (#4867) Fix infinite loop in presence handler changelog more logging improvements remove extraneous exception logging Clarify logging when PDU signature checking fails Changelog Add --no-pep-517 to README instructions set PIP_USE_PEP517 = False for tests Fix handling of SYNAPSE_NO_TLS in docker image (#5005) Config option for verifying federation certificates (MSC 1711) (#4967) Remove log error for .well-known/matrix/client (#4972) Prevent "producer not unregistered" message (#5009) add gpg key fingerprint Don't crash on lack of expiry templates Update debian install docs for new key and repo (#5074) Add management endpoints for account validity Send out emails with links to extend an account's validity period Make sure we're not registering the same 3pid twice Newsfile ...
tests/http/test_fedclient.py
Outdated
| d = do_request() | ||
| self.pump() | ||
|
|
||
| # Nothing has happened yet |
There was a problem hiding this comment.
please remember to test that these deferreds eventually get resolved one way or the other, otherwise we'll end up with requests blocking forever.
There was a problem hiding this comment.
Still not 100% sure on how to do this. yield'ing results in the function timing out (though maybe that's what we want to test?).
There was a problem hiding this comment.
yield'ing results in the function timing out
yes. The thing to note here is that the test function itself is running synchronously: it starts at the beginning, and doesn't return until it is finished. Note that is different to a Deferred-returning function, which only does part of its work synchronously, and leaves the rest to be done later (at which point, the Deferred will resolve).
Now, the way that we test the asynchronous code from a synchronous function is something like:
- run a function that returns a deferred
- satisfy the condition that the function is waiting for (eg: pretend that some time has passed; connect a TCP socket; spoof some results from the database).
- Once we satisfy the awaited condition, the second half of the function should complete, and hence resolve its deferred.
- Check that the deferred has now resolved with the expected result.
Often, all that is needed is to "pretend some time has passed". We do this by using a fake Reactor where we can manipulate its idea of the current time. "Pretending time has passed" is then called "advancing the reactor", and is what self.pump does.
[It may be helpful to note that this means that there are actually two reactors in play, since the test runner (trial) is a Twisted application which runs on a regular reactor: so we end up with the main reactor which is running trial and the test reactor which is running the code under test. But in practice, as long as your test functions are synchronous, you can forget about the trial reactor.]
So, long story short: how do we test that the deferreds eventually get resolved? well, somewhere in the test we ought to be able to call self.successResultOf(d) or self.failureResultOf(d, cls) [iirc]. If that assertion fails, it means that the deferred isn't completing, and that is a bad thing!
richvdh
left a comment
There was a problem hiding this comment.
this is taking shape!
One thing I would say is that I think this PR is starting to sprawl a bit, and I would suggest splitting out the changes to the existing blacklisting code so that we can land them independently of the federation blacklist.
synapse/http/client.py
Outdated
| ): | ||
| logger.info( | ||
| "Blocking access to %s because of blacklist" % (ip_address,) | ||
| "Blocking access to %s because of blacklist. Returning 0 results" % |
There was a problem hiding this comment.
Is it true to say we are returning 0 results here? It certainly doesn't match my way of thinking about it.
synapse/http/client.py
Outdated
| (ip_address,) | ||
| ) | ||
| e = SynapseError(403, "IP address blocked by IP blacklist entry") | ||
| e = SynapseError(404, "No results found") |
There was a problem hiding this comment.
I'm guessing this is changing to keep it in sync with https://github.com/matrix-org/synapse/pull/5043/files#diff-3c2efbd259756c1c1dc5dbdf467b5570R341, for which see my comments there, however:
On the one hand I see the value of keeping the behaviour consistent between literal IPs, and hostnames. On the other, since we have the ability to give a better response like a 403 here, I'm inclined to do so.
| logger.warn("Error downloading %s: %r", url, e) | ||
|
|
||
| if isinstance(e, DNSLookupError): | ||
| # DNS lookup returned no results |
There was a problem hiding this comment.
there are other reasons that we could get a DNSLookupError
| # FIXME: pass through 404s and other error messages nicely | ||
| logger.warn("Error downloading %s: %r", url, e) | ||
|
|
||
| if isinstance(e, DNSLookupError): |
There was a problem hiding this comment.
Think this would be better handled with a separate except DNSLookupError as e clause.
| # Note: This will also be the case if the found IP address | ||
| # is blacklisted | ||
| raise SynapseError( | ||
| 404, "No results found", Codes.UNKNOWN |
There was a problem hiding this comment.
It's not obvious to me that this is the correct thing to return. "No results found" is a pretty useless error, and why a 404?
https://matrix.org/docs/spec/client_server/unstable.html#get-matrix-media-r0-preview-url is annoying unhelpful about how to handle this situation, which would be a nice thing to rectify, but I think a 502 is the normal way for a proxy-like thing (which we are) to say "I couldn't connect to the the thing I'm supposed to be proxying".
| tls_client_options_factory, | ||
| ) | ||
|
|
||
| # Prevent direct connections to blacklisted IP addresses |
There was a problem hiding this comment.
I don't find "direct connections" very clear here. How about:
"Use a BlacklistingAgentWrapper to prevent circumventing the IP blacklist via IP literals in server names"
tests/http/test_fedclient.py
Outdated
| d = do_request() | ||
| self.pump() | ||
|
|
||
| # Nothing has happened yet |
There was a problem hiding this comment.
yield'ing results in the function timing out
yes. The thing to note here is that the test function itself is running synchronously: it starts at the beginning, and doesn't return until it is finished. Note that is different to a Deferred-returning function, which only does part of its work synchronously, and leaves the rest to be done later (at which point, the Deferred will resolve).
Now, the way that we test the asynchronous code from a synchronous function is something like:
- run a function that returns a deferred
- satisfy the condition that the function is waiting for (eg: pretend that some time has passed; connect a TCP socket; spoof some results from the database).
- Once we satisfy the awaited condition, the second half of the function should complete, and hence resolve its deferred.
- Check that the deferred has now resolved with the expected result.
Often, all that is needed is to "pretend some time has passed". We do this by using a fake Reactor where we can manipulate its idea of the current time. "Pretending time has passed" is then called "advancing the reactor", and is what self.pump does.
[It may be helpful to note that this means that there are actually two reactors in play, since the test runner (trial) is a Twisted application which runs on a regular reactor: so we end up with the main reactor which is running trial and the test reactor which is running the code under test. But in practice, as long as your test functions are synchronous, you can forget about the trial reactor.]
So, long story short: how do we test that the deferreds eventually get resolved? well, somewhere in the test we ought to be able to call self.successResultOf(d) or self.failureResultOf(d, cls) [iirc]. If that assertion fails, it means that the deferred isn't completing, and that is a bad thing!
| for domain in federation_domain_whitelist: | ||
| self.federation_domain_whitelist[domain] = True | ||
|
|
||
| self.federation_ip_range_blacklist = config.get( |
There was a problem hiding this comment.
we ought to follow the example of #5134 and include 0.0.0.0 and ::, whether they were explicitly listed or not.
synapse/http/client.py
Outdated
| def resolveHostName(self, recv, hostname, portNumber=0): | ||
|
|
||
| r = recv() | ||
| d = defer.Deferred() |
There was a problem hiding this comment.
can you merge #5155 so that we can get this stuff out of the diff?
tests/http/test_fedclient.py
Outdated
| # Nothing happened yet | ||
| self.assertNoResult(d) | ||
|
|
||
| self.pump(120) |
There was a problem hiding this comment.
hrm, having to wait 120 seconds for this doesn't sound right. Why does it take so long?
There was a problem hiding this comment.
Oh, it doesn't. I've just selected that value arbitrarily. Have decreased.
* develop: (45 commits) URL preview blacklisting fixes (#5155) Revert 085ae34 Add a DUMMY stage to captcha-only registration flow Make Prometheus snippet less confusing on the metrics collection doc (#4288) Set syslog identifiers in systemd units (#5023) Run Black on the tests again (#5170) Add AllowEncodedSlashes to apache (#5068) remove instructions for jessie installation (#5164) Run `black` on per_destination_queue Limit the number of EDUs in transactions to 100 as expected by receiver (#5138) Fix bogus imports in tests (#5154) add options to require an access_token to GET /profile and /publicRooms on CS API (#5083) Do checks on aliases for incoming m.room.aliases events (#5128) Remove the requirement to authenticate for /admin/server_version. (#5122) Fix spelling in server notices admin API docs (#5142) Fix sample config 0.99.3.2 include disco in deb build target list changelog Debian: we now need libpq-dev. ...
There was a problem hiding this comment.
this could really have done with being outside the try/catch, but nm
Closes #3953
This PR adds a new config option that functions similar to
url_preview_ip_range_blacklist, but provides a blacklist for federation traffic. It does this by making use of ourIPBlacklistResolverandBlacklistingAgentWrapperclasses.Sytest PR: matrix-org/sytest#607