
[Validator] Add a HaveIBeenPwned password validator#27738

Merged
fabpot merged 1 commit into symfony:master from dunglas:haveibeenpwned on Apr 1, 2019

Conversation

@dunglas
Member

@dunglas dunglas commented Jun 27, 2018

Q A
Branch? master
Bug fix? no
New feature? yes
BC breaks? no
Deprecations? no
Tests pass? yes
Fixed tickets n/a
License MIT
Doc PR todo

This PR adds a new Pwned validation constraint to prevent users from choosing passwords that have been leaked in public data breaches.
The validator uses the https://haveibeenpwned.com/ API. The implementation is similar to the one used by Firefox Monitor: it avoids exposing the full password hash by using a k-anonymity model. The specific implementation for HaveIBeenPwned has been described in depth by Cloudflare.
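As a sketch of that k-anonymity scheme: the `/range/{prefix}` endpoint and the `SUFFIX:COUNT` response format follow the public Pwned Passwords API, but the `countBreaches` helper and the injected `$fetchRange` callable are illustrative names, not part of this PR.

```php
<?php
// Illustrative sketch: only the first 5 hex characters of the SHA-1
// hash ever leave the machine; the full hash is matched locally.
function countBreaches(string $plainPassword, callable $fetchRange): int
{
    $hash = strtoupper(sha1($plainPassword));
    $prefix = substr($hash, 0, 5);
    $suffix = substr($hash, 5);

    // $fetchRange is expected to GET
    // https://api.pwnedpasswords.com/range/{$prefix} and return the
    // body: one "HASH_SUFFIX:COUNT" pair per line.
    foreach (explode("\r\n", $fetchRange($prefix)) as $line) {
        [$candidateSuffix, $count] = explode(':', $line);
        if ($candidateSuffix === $suffix) {
            return (int) $count; // number of breach occurrences
        }
    }

    return 0; // suffix absent from the range: not known to be pwned
}
```

In a test, `$fetchRange` can be stubbed with a canned response instead of hitting the network.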

Usage:

// Rejects the password if it is present, any number of times, in any data breach
class User
{
    /** @Pwned */
    public $plainPassword;
}

// Rejects the password if it is present more than 5 times in data breaches
class User
{
    /** @Pwned(maxCount=5) */
    public $plainPassword;
}

// Customize the error message
class User
{
    /** @Pwned(message="Please select another password, this one has already been hacked.") */
    public $plainPassword;
}

@stof
Member

stof commented Jun 27, 2018

This constraint already exists as a third-party one here: https://github.com/rollerworks/PasswordStrengthValidator/blob/v1.1.3/src/Validator/Constraints/P0wnedPassword.php

I'm wondering whether we need to have it in core.

@dunglas
Member Author

dunglas commented Jun 27, 2018

I wasn't aware of this bundle! Thanks @stof for pointing it out.
IMO that's a good reason to move this kind of very important feature into core :)

@stof
Member

stof commented Jun 27, 2018

the threshold is not yet implemented there (it is always the equivalent of your maxCount=1), but there is already an issue about it: rollerworks/PasswordStrengthValidator#22

@stloyd
Contributor

stloyd commented Jun 27, 2018

I was thinking about adding something similar to Symfony, but after rethinking the idea of having it in core I decided to abandon it and will probably propose it as a documentation tutorial or something like that, mostly because it depends on an external implementation.

WDYT?

@linaori
Contributor

linaori commented Jun 27, 2018

Great idea! I'm pretty sure that despite depending on a vendor, this would be a nice addition to the core.

@dunglas
Member Author

dunglas commented Jun 27, 2018

I was thinking about adding something similar to Symfony, but after rethinking the idea of having it in core I decided to abandon it and will probably propose it as a documentation tutorial or something like that, mostly because it depends on an external implementation.

I had the same feeling, but if it's good enough to be included in Firefox, with its huge user base, I guess it's good enough for us too. It's also used by 1Password.

@nicolas-grekas nicolas-grekas added this to the next milestone Jun 27, 2018
Contributor

@Majkl578 Majkl578 left a comment


I believe this violates the license / ToS of the service:

In other words, you're welcome to use the public API to build other services, but you must identify Have I Been Pwned as the source of the data . Clear and visible attribution with a link to haveibeenpwned.com should be present anywhere data from the service is used including when searching breaches or pastes and when representing breach descriptions. It doesn't have to be overt, but the interface in which Have I Been Pwned data is represented should clearly attribute the source per the Creative Commons Attribution 4.0 International License.

By design, Symfony, as a server-side framework (and Validator, as a server-side component), can't fulfill any of those requirements.

Also hardcoding HTTP requests to some URL sounds like a really bad idea (should the service be compromised, all users are doomed and you are in serious security trouble).

@ChangePlaces

ChangePlaces commented Jun 27, 2018

Please don't turn Symfony into Laravel. The owners of this site could easily 'omit' certain passwords and would know which sites have such passwords.

@stof
Member

stof commented Jun 27, 2018

Also hardcoding HTTP requests to some URL sounds like a really bad idea (should the service be compromised, all users are doomed and you are in serious security trouble).

I don't understand what you mean here. If you don't call the URL, you cannot use the service at all.

@stof
Member

stof commented Jun 27, 2018

@Majkl578 the paragraph you pasted is titled "License — breach & paste APIs". This validator is not using either of those two APIs, but the "Pwned Passwords" one.

@dunglas
Member Author

dunglas commented Jun 27, 2018

Please don't turn symfony into laravel.

This kind of comment isn't constructive and creates a bad atmosphere. That being said, and even though I personally think we have a lot to learn from Laravel, I don't see what the relation between this PR and Laravel is. AFAIK, they don't provide an integration with HaveIBeenPwned (yet).
If "turning Symfony into Laravel" helps make the web a safer place, I'm 100% for it.

The owners of this site could easily 'omit' certain passwords and will know what sites have such passwords.

The password isn't sent to the API (only the first 5 characters of its hash are), so the API cannot know the password used. You can refer to the Cloudflare paper about k-anonymity I linked in the PR description.

Anyway, this site is run by a well-known security expert from Microsoft and trusted by internet heavyweights such as Firefox and Cloudflare. This feature is 100% opt-in: if you don't trust this service, don't use it.

@Majkl578 In addition to what @stof said, I sent a mail to Troy Hunt to be sure we're in the clear from a legal PoV. I think it's the websites that use this validator that need to comply with the ToS, not Symfony. We'll make that bold in the documentation.

@stof
Member

stof commented Jun 27, 2018

Licensing is not an issue: https://twitter.com/troyhunt/status/1012066309251592192

@goetas
Contributor

goetas commented Jun 27, 2018

Not really happy to see a third-party API landing in the Symfony core (which has been getting bigger and bigger recently).

@teohhanhui
Contributor

teohhanhui commented Jun 28, 2018

maxCount should be minCount, i.e. it's considered pwned if it's present at least a minimum number of times.

@teohhanhui
Contributor

teohhanhui commented Jun 28, 2018

IMO we could add an optional dependency on a third-party HTTP client. Since there's no PSR, the closest thing is https://github.com/php-http/httplug

Edit: There's a draft PSR-18: https://github.com/php-fig/fig-standards/blob/master/proposed/http-client/http-client.md

protected function createValidator()
{
    $httpClient = function (string $url) {
        if ('https://api.pwnedpasswords.com/range/3EF27' === $url) {

String instead of constant? Just asking. :)
private const API_ERROR_VALIDATOR_STRING or something like that, maybe?

@dunglas
Member Author

dunglas commented Jun 28, 2018

@teohhanhui I think it's safer to wait for PSR-18 and to use something lightweight like the current signature for now. It will be easy to switch when PSR-18 is stable.
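The "lightweight signature" mentioned here is just a callable taking a URL and returning the response body, which keeps the service trivial to stub in tests. A minimal sketch (the class and method names below are mine, not the PR's):

```php
<?php
// Minimal sketch of a service that accepts any
// callable(string $url): string as its HTTP client.
final class RangeApiClient
{
    /** @var callable(string): string */
    private $httpClient;

    public function __construct(callable $httpClient)
    {
        $this->httpClient = $httpClient;
    }

    public function fetchRange(string $hashPrefix): string
    {
        return ($this->httpClient)('https://api.pwnedpasswords.com/range/'.$hashPrefix);
    }
}

// In tests, swap the network call for a canned response:
$client = new RangeApiClient(fn (string $url) => "ABCDEF1234:42");
```

Swapping in a PSR-18 client later would only mean changing the constructor's accepted type and the single invocation line.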

@dunglas
Member Author

dunglas commented Jun 28, 2018

@teohhanhui I planned to switch to a threshold as suggested by @stof. WDYT?

@ChangePlaces

Also wanted to point out: the Twitter stream of the guy responsible for this service has many messages of him talking about how much it costs to host it. Never a good sign.

nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 7, 2019
| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Sooner or later, we'll need to send HTTP calls to APIs,
e.g. Symfony Mailer or symfony#27738 already need it.

Common existing HTTP clients for PHP rely on PSR-7,
which is orthogonal to the way Symfony is designed.

More reasons we need this in core are the package principles:
if we want to be able to keep our BC+deprecation promises, we
have to build on more stable and more abstract dependencies than
Symfony itself.

So here we are. This PR introduces a new `Http` namespace in contracts,
and a new `HttpClient` component.

Its surface is by design very simple to use, while still flexible
enough to cover more advanced use cases thanks to streaming+laziness.

Two full implementations are provided:
 - `NativeHttpClient` is based on PHP's HTTP stream wrapper. It's the
   most portable one but relies on a blocking `fopen()`.
 - `CurlHttpClient` relies on the curl extension. It supports full
   concurrency and HTTP/2.

Here are some examples that work with both clients.

For common cases, all the methods on responses are synchronous:
```php
$client = new NativeHttpClient();

$response = $client->get('https://google.com');

$statusCode = $response->getStatusCode();
$headers = $response->getHeaders();
$content = $response->getContent();
```

When several responses need to be fetched concurrently,
clients provide a `complete()` method, e.g.:

```php
$client = new CurlHttpClient();
$pool = [];

for ($i = 0; $i < 379; ++$i) {
    $pool[] = $client->get("https://http2.akamai.com/demo/tile-$i.png");
}

foreach ($client->complete($pool) as $r) {
    $r->getHeaders();
}
```

The `complete()` method accepts a second `$timeout` argument that defaults
to `ini_get('default_socket_timeout')`, typically 60s. The `foreach` will
skip responses that are *inactive* for longer than the timeout.

Providing `0` as timeout allows iterating over already completed responses
in a non-blocking way.

When a next response depends on the result of a previous one, it's also
possible to schedule new responses for completion. Doing so is incompatible
with `foreach` and requires using the `Generator` interface directly,
its `send()` method especially:

```php
$client = new CurlHttpClient();

// [...] populate $pool of responses for some first requests as above

$responses = $client->complete($pool);

while ($responses->valid()) {
    $completedResponse = $responses->current();
    $completedKey = $responses->key(); // the key of the response in $pool

    // [...] compute $moreResponses based e.g. on $completedResponse/Key

    if ($moreResponses) {
        $responses->send($moreResponses);
    } else {
        $responses->next();
    }
}
```

By default, clients follow redirections. When the redirection limit is
reached, or when a `4xx` or a `5xx` happens, an exception is thrown.

An array of options allows adjusting the behavior when sending requests.
They are documented in `HttpClientTrait`.

TODO:
 - [ ] validate the design
 - [ ] add tests
 - [ ] split in several PRs?

Implemented:
 - flexible contracts for HTTP clients
 - fopen()+curl-based clients
 - gzip compression enabled when possible
 - fetch multiple responses concurrently
 - progress function able to cancel the request
 - flexible timeout management

Help wanted (can be done after merge):
 - handle standard proxy-related env vars when using NativeHttpClient
 - logger integration
 - a TraceableHttpClient and integration with the profiler
 - FrameworkBundle integration: autowireable alias + semantic configuration for default options
 - help clarify the contracts where needed
 - implement bridges with existing HTTP clients in contrib packages
 - mock client

Other ideas for the future:
 - HTTP/2 push as a temporary cache in complete()
 - cookie jar as a new "cookie_jar" option
 - a MIME multipart stream builder
 - HTTP cache as a decorating client
 - etc.
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 8, 2019
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 8, 2019
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 8, 2019
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 8, 2019
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 8, 2019
| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Sooner that later, we'll need to send HTTP calls to APIs,
e.g. Symfony Mailer or symfony#27738 already need it.

Common existing HTTP clients for PHP rely on PSR-7,
which is orthogonal to the way Symfony is designed.

More reasons we need this in core are the package principles:
if we want to be able to keep our BC+deprecation promises, we
have to build on more stable and more abstract dependencies than
Symfony itself.

So here we are. This PR introduces a new `Http` namespace in contracts,
and a new `HttpClient` component.

Its surface is by design very simple to use, while still flexible
enough to cover more advanced use cases thanks to streaming+laziness.

Two full implementations are provided:
 - `NativeHttpClient` is based on PHP's HTTP stream wrapper. It's the
   most portable one but relies on a blocking `fopen()`.
 - `CurlHttpClient` relies on the curl extension. It supports full
   concurrency and HTTP/2.

Here are some examples that work with both clients.

For common cases, all the methods on reponses are synchronous:
```php
$client = new NativeHttpClient();

$response = $client->get('https://google.com');

$statusCode = $response->getStatusCode();
$headers = $response->getHeaders();
$content = $response->getContent();
```

When several responses need to be fetched concurrently,
clients provide a `complete()` method, e.g.:

```php
$client = new CurlHttpClient();
$pool = [];

for ($i = 0; $i < 379; ++$i) {
    $pool[] = $client->get("https://http2.akamai.com/demo/tile-$i.png");
}

foreach ($client->complete($pool) as $r) {
    $r->getHeaders();
}
```

The `complete()` method accepts a second `$timeout` argument that defaults
to `ini_get('default_socket_timeout')`, typically 60s. The `foreach` will
skip responses that are *inactive* for longer than the timeout. These
inactive responses are available by calling `$responses->getReturn()`
after iterating is done.

Providing `0` as timeout allows iterating over already completed responses
in a non-blocking way.

When a next response depends on the result of a previous one, it's also
possible to schedule new responses for completion. Doing so is incompatible
with `foreach` and requires using the `Generator` interface directly,
especially its `send()` method:

```php
$client = new CurlHttpClient();

// [...] populate $pool of responses for some first requests as above

$responses = $client->complete($pool);

while ($responses->valid()) {
    $completedResponse = $responses->current();
    $completedKey = $responses->key(); // the key of the response in $pool

    // [...] compute $moreResponses based e.g. on $completedResponse/Key

    if ($moreResponses) {
        $responses->send($moreResponses);
    } else {
        $responses->next();
    }
}
```

By default, clients follow redirections. When the redirection limit is
reached, or when a `4xx` or `5xx` response is received, an exception is thrown.

An array of options allows adjusting the behavior when sending requests.
They are documented in `HttpClientTrait`.
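For illustration, a hedged sketch of what passing options could look like — the option names below (`headers`, `timeout`, `max_redirects`) are assumptions for the example, not confirmed by this description; `HttpClientTrait` remains the authoritative reference:

```php
$client = new CurlHttpClient();

// Option names here are illustrative assumptions;
// the documented list lives in HttpClientTrait.
$response = $client->get('https://example.com/api', [
    'headers' => ['Accept' => 'application/json'],
    'timeout' => 2.5,     // inactivity timeout, in seconds
    'max_redirects' => 3, // beyond this, an exception is thrown
]);
```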

TODO:
 - [ ] validate the design
 - [ ] add tests
 - [ ] split in several PRs?

Implemented:
 - flexible contracts for HTTP clients
 - `fopen()`+`curl`-based clients
 - gzip compression enabled when possible
 - fetch multiple responses concurrently
 - progress function able to cancel the request
 - flexible timeout management
 - public key pinning

Help wanted (can be done after merge):
 - handle standard proxy-related env vars when using `NativeHttpClient`
 - logger integration
 - `TraceableHttpClient` and integration with the profiler
 - `FrameworkBundle` integration: autowireable alias + semantic configuration for default options
 - help clarify the contracts where needed
 - implement bridges with existing HTTP clients in contrib packages
 - mock client

More ideas:
 - HTTP/2 push as a temporary cache in `complete()`
 - cookie jar with a new `"cookie_jar"` option
 - a MIME multipart stream builder
 - helpers to send JSON/forms/uploads via a special type in the `"body"` option
 - make `NativeResponse::getAttribute()` return attributes inspired by `curl_getinfo()`
 - use raw sockets instead of the HTTP stream wrapper
 - HTTP/HSTS cache
 - etc.
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 8, 2019
| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Sooner that later, we'll need to send HTTP calls to APIs,
e.g. Symfony Mailer or symfony#27738 already need it.

Common existing HTTP clients for PHP rely on PSR-7,
which is orthogonal to the way Symfony is designed.

More reasons we need this in core are the package principles:
if we want to be able to keep our BC+deprecation promises, we
have to build on more stable and more abstract dependencies than
Symfony itself.

So here we are. This PR introduces a new `Http` namespace in contracts,
and a new `HttpClient` component.

Its surface is by design very simple to use, while still flexible
enough to cover more advanced use cases thanks to streaming+laziness.

Two full implementations are provided:
 - `NativeHttpClient` is based on PHP's HTTP stream wrapper. It's the
   most portable one but relies on a blocking `fopen()`.
 - `CurlHttpClient` relies on the curl extension. It supports full
   concurrency and HTTP/2.

Here are some examples that work with both clients.

For common cases, all the methods on reponses are synchronous:
```php
$client = new NativeHttpClient();

$response = $client->get('https://google.com');

$statusCode = $response->getStatusCode();
$headers = $response->getHeaders();
$content = $response->getContent();
```

When several responses need to be fetched concurrently,
clients provide a `complete()` method, e.g.:

```php
$client = new CurlHttpClient();
$pool = [];

for ($i = 0; $i < 379; ++$i) {
    $pool[] = $client->get("https://http2.akamai.com/demo/tile-$i.png");
}

foreach ($client->complete($pool) as $r) {
    $r->getHeaders();
}
```

The `complete()` method accepts a second `$timeout` argument that defaults
to `ini_get('default_socket_timeout')`, typically 60s. The `foreach` will
skip responses that are *inactive* for longer than the timeout. These
inactive responses are available by calling `$responses->getReturn()`
after iterating is done.

Providing `0` as timeout allows iterating over already completed responses
in a non-blocking way.

When a next response depends on the result of a previous one, it's also
possible to schedule new responses for completion. Doing so is incompatible
with `foreach` and requires using the `Generator` interface directly,
its `send()` method especially:

```php
$client = new CurlHttpClient();

// [...] populate $pool of responses for some first requests as above

$responses = $client->complete($pool);

while ($responses->valid()) {
    $completedResponse = $responses->current();
    $completedKey = $responses->key(); // the key of the response in $pool

    // [...] compute $moreResponses based e.g. on $completedResponse/Key

    if ($moreResponses) {
        $responses->send($moreResponses);
    } else {
        $responses->next();
    }
}
```

By default, clients follow redirections. When the redirection limit is
reached, or when a `4xx` or a `5xx` happens, an exception is thrown.

An array of options allows adjusting the behavior when sending requests.
They are documented in `HttpClientTrait`.

TODO:
 - [ ] validate the design
 - [ ] add tests
 - [ ] split in several PRs?

Implemented:
 - flexible contracts for HTTP clients
 - `fopen()`+`curl`-based clients
 - gzip compression enabled when possible
 - fetch multiple responses concurrently
 - progress function able to cancel the request
 - flexible timeout management
 - public key pinning

Help wanted (can be done after merge):
 - handle standard proxy-related env vars when using `NativeHttpClient`
 - logger integration
 - `TraceableHttpClient` and integration with the profiler
 - `FrameworkBundle` integration: autowireable alias + semantic configuration for default options
 - help clarify the contracts where needed
 - implement bridges with existing HTTP clients in contrib packages
 - mock client

More ideas:
 - HTTP/2 push as a temporary cache in `complete()`
 - cookie jar with a new `"cookie_jar"` option
 - a MIME multipart stream builder
 - make `NativeResponse::getAttribute()` return attributes inspired from `curl_getinfo()`
 - HTTP/HSTS cache
 - etc.
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 8, 2019
| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Sooner that later, we'll need to send HTTP calls to APIs,
e.g. Symfony Mailer or symfony#27738 already need it.

Common existing HTTP clients for PHP rely on PSR-7,
which is orthogonal to the way Symfony is designed.

More reasons we need this in core are the package principles:
if we want to be able to keep our BC+deprecation promises, we
have to build on more stable and more abstract dependencies than
Symfony itself.

So here we are. This PR introduces a new `Http` namespace in contracts,
and a new `HttpClient` component.

Its surface is by design very simple to use, while still flexible
enough to cover more advanced use cases thanks to streaming+laziness.

Two full implementations are provided:
 - `NativeHttpClient` is based on PHP's HTTP stream wrapper. It's the
   most portable one but relies on a blocking `fopen()`.
 - `CurlHttpClient` relies on the curl extension. It supports full
   concurrency and HTTP/2.

Here are some examples that work with both clients.

For common cases, all the methods on reponses are synchronous:
```php
$client = new NativeHttpClient();

$response = $client->get('https://google.com');

$statusCode = $response->getStatusCode();
$headers = $response->getHeaders();
$content = $response->getContent();
```

When several responses need to be fetched concurrently,
clients provide a `complete()` method, e.g.:

```php
$client = new CurlHttpClient();
$pool = [];

for ($i = 0; $i < 379; ++$i) {
    $pool[] = $client->get("https://http2.akamai.com/demo/tile-$i.png");
}

foreach ($client->complete($pool) as $r) {
    $r->getHeaders();
}
```

The `complete()` method accepts a second `$timeout` argument that defaults
to `ini_get('default_socket_timeout')`, typically 60s. The `foreach` will
skip responses that are *inactive* for longer than the timeout. These
inactive responses are available by calling `$responses->getReturn()`
after iterating is done.

Providing `0` as timeout allows iterating over already completed responses
in a non-blocking way.

When a next response depends on the result of a previous one, it's also
possible to schedule new responses for completion. Doing so is incompatible
with `foreach` and requires using the `Generator` interface directly,
its `send()` method especially:

```php
$client = new CurlHttpClient();

// [...] populate $pool of responses for some first requests as above

$responses = $client->complete($pool);

while ($responses->valid()) {
    $completedResponse = $responses->current();
    $completedKey = $responses->key(); // the key of the response in $pool

    // [...] compute $moreResponses based e.g. on $completedResponse/Key

    if ($moreResponses) {
        $responses->send($moreResponses);
    } else {
        $responses->next();
    }
}
```

By default, clients follow redirections. When the redirection limit is
reached, or when a `4xx` or a `5xx` happens, an exception is thrown.

An array of options allows adjusting the behavior when sending requests.
They are documented in `HttpClientTrait`.

TODO:
 - [ ] validate the design
 - [ ] add tests
 - [ ] split in several PRs?

Implemented:
 - flexible contracts for HTTP clients
 - `fopen()`+`curl`-based clients
 - gzip compression enabled when possible
 - fetch multiple responses concurrently
 - progress function able to cancel the request
 - flexible timeout management
 - public key pinning

Help wanted (can be done after merge):
 - handle standard proxy-related env vars when using `NativeHttpClient`
 - logger integration
 - `TraceableHttpClient` and integration with the profiler
 - `FrameworkBundle` integration: autowireable alias + semantic configuration for default options
 - help clarify the contracts where needed
 - implement bridges with existing HTTP clients in contrib packages
 - mock client

More ideas:
 - HTTP/2 push as a temporary cache in `complete()`
 - cookie jar with a new `"cookie_jar"` option
 - a MIME multipart stream builder
 - make `NativeResponse::getAttribute()` return attributes inspired from `curl_getinfo()`
 - HTTP/HSTS cache
 - etc.
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 10, 2019
| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Sooner that later, we'll need to send HTTP calls to APIs,
e.g. Symfony Mailer or symfony#27738 already need it.

Common existing HTTP clients for PHP rely on PSR-7,
which is orthogonal to the way Symfony is designed.

More reasons we need this in core are the package principles:
if we want to be able to keep our BC+deprecation promises, we
have to build on more stable and more abstract dependencies than
Symfony itself.

So here we are. This PR introduces a new `Http` namespace in contracts,
and a new `HttpClient` component.

Its surface is by design very simple to use, while still flexible
enough to cover more advanced use cases thanks to streaming+laziness.

Two full implementations are provided:
 - `NativeHttpClient` is based on PHP's HTTP stream wrapper. It's the
   most portable one but relies on a blocking `fopen()`.
 - `CurlHttpClient` relies on the curl extension. It supports full
   concurrency and HTTP/2.

Here are some examples that work with both clients.

For common cases, all the methods on reponses are synchronous:
```php
$client = new NativeHttpClient();

$response = $client->get('https://google.com');

$statusCode = $response->getStatusCode();
$headers = $response->getHeaders();
$content = $response->getContent();
```

When several responses need to be fetched concurrently,
clients provide a `complete()` method, e.g.:

```php
$client = new CurlHttpClient();
$pool = [];

for ($i = 0; $i < 379; ++$i) {
    $pool[] = $client->get("https://http2.akamai.com/demo/tile-$i.png");
}

foreach ($client->complete($pool) as $r) {
    $r->getHeaders();
}
```

The `complete()` method accepts a second `$timeout` argument that defaults
to `ini_get('default_socket_timeout')`, typically 60s. The `foreach` will
skip responses that are *inactive* for longer than the timeout. These
inactive responses are available by calling `$responses->getReturn()`
after iterating is done.

Providing `0` as timeout allows iterating over already completed responses
in a non-blocking way.

When a next response depends on the result of a previous one, it's also
possible to schedule new responses for completion. Doing so is incompatible
with `foreach` and requires using the `Generator` interface directly,
its `send()` method especially:

```php
$client = new CurlHttpClient();

// [...] populate $pool of responses for some first requests as above

$responses = $client->complete($pool);

while ($responses->valid()) {
    $completedResponse = $responses->current();
    $completedKey = $responses->key(); // the key of the response in $pool

    // [...] compute $moreResponses based e.g. on $completedResponse/Key

    if ($moreResponses) {
        $responses->send($moreResponses);
    } else {
        $responses->next();
    }
}
```

By default, clients follow redirections. When the redirection limit is
reached, or when a `4xx` or a `5xx` happens, an exception is thrown.

An array of options allows adjusting the behavior when sending requests.
They are documented in `HttpClientTrait`.

TODO:
 - [ ] validate the design
 - [ ] add tests
 - [ ] split in several PRs?

Implemented:
 - flexible contracts for HTTP clients
 - `fopen()`+`curl`-based clients
 - gzip compression enabled when possible
 - fetch multiple responses concurrently
 - progress function able to cancel the request
 - flexible timeout management
 - public key pinning

Help wanted (can be done after merge):
 - handle standard proxy-related env vars when using `NativeHttpClient`
 - logger integration
 - `TraceableHttpClient` and integration with the profiler
 - `FrameworkBundle` integration: autowireable alias + semantic configuration for default options
 - help clarify the contracts where needed
 - implement bridges with existing HTTP clients in contrib packages
 - mock client

More ideas:
 - HTTP/2 push as a temporary cache in `complete()`
 - cookie jar with a new `"cookie_jar"` option
 - helpers to send JSONs/forms/uploads via special type in "body" option
 - make `NativeResponse::getAttribute()` return attributes inspired from `curl_getinfo()`
 - use raw sockets instead of the HTTP stream wrapper
 - HTTP/HSTS cache
 - etc.
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 10, 2019
| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Sooner that later, we'll need to send HTTP calls to APIs,
e.g. Symfony Mailer or symfony#27738 already need it.

Common existing HTTP clients for PHP rely on PSR-7,
which is orthogonal to the way Symfony is designed.

More reasons we need this in core are the package principles:
if we want to be able to keep our BC+deprecation promises, we
have to build on more stable and more abstract dependencies than
Symfony itself.

So here we are. This PR introduces a new `Http` namespace in contracts,
and a new `HttpClient` component.

Its surface is by design very simple to use, while still flexible
enough to cover more advanced use cases thanks to streaming+laziness.

Two full implementations are provided:
 - `NativeHttpClient` is based on PHP's HTTP stream wrapper. It's the
   most portable one but relies on a blocking `fopen()`.
 - `CurlHttpClient` relies on the curl extension. It supports full
   concurrency and HTTP/2.

Here are some examples that work with both clients.

For common cases, all the methods on reponses are synchronous:
```php
$client = new NativeHttpClient();

$response = $client->get('https://google.com');

$statusCode = $response->getStatusCode();
$headers = $response->getHeaders();
$content = $response->getContent();
```

When several responses need to be fetched concurrently,
clients provide a `complete()` method, e.g.:

```php
$client = new CurlHttpClient();
$pool = [];

for ($i = 0; $i < 379; ++$i) {
    $pool[] = $client->get("https://http2.akamai.com/demo/tile-$i.png");
}

foreach ($client->complete($pool) as $r) {
    $r->getHeaders();
}
```

The `complete()` method accepts a second `$timeout` argument that defaults
to `ini_get('default_socket_timeout')`, typically 60s. The `foreach` will
skip responses that are *inactive* for longer than the timeout. These
inactive responses are available by calling `$responses->getReturn()`
after iterating is done.

Providing `0` as timeout allows iterating over already completed responses
in a non-blocking way.

When a next response depends on the result of a previous one, it's also
possible to schedule new responses for completion. Doing so is incompatible
with `foreach` and requires using the `Generator` interface directly,
its `send()` method especially:

```php
$client = new CurlHttpClient();

// [...] populate $pool of responses for some first requests as above

$responses = $client->complete($pool);

while ($responses->valid()) {
    $completedResponse = $responses->current();
    $completedKey = $responses->key(); // the key of the response in $pool

    // [...] compute $moreResponses based e.g. on $completedResponse/Key

    if ($moreResponses) {
        $responses->send($moreResponses);
    } else {
        $responses->next();
    }
}
```

By default, clients follow redirections. When the redirection limit is
reached, or when a `4xx` or a `5xx` happens, an exception is thrown.

An array of options allows adjusting the behavior when sending requests.
They are documented in `HttpClientTrait`.

TODO:
 - [ ] validate the design
 - [ ] add tests
 - [ ] split in several PRs?

Implemented:
 - flexible contracts for HTTP clients
 - `fopen()` + `curl`-based clients
 - gzip compression enabled when possible
 - fetch multiple responses concurrently
 - progress function able to cancel the request
 - flexible timeout management
 - public key pinning

Help wanted (can be done after merge):
 - handle standard proxy-related env vars when using `NativeHttpClient`
 - logger integration
 - `TraceableHttpClient` and integration with the profiler
 - `FrameworkBundle` integration: autowireable alias + semantic configuration for default options
 - help clarify the contracts where needed
 - mock client

More ideas:
 - HTTP/2 push as a temporary cache in `complete()`
 - cookie jar with a new `"cookie_jar"` option
 - helpers to send JSONs/forms/uploads via special type in "body" option
 - make `NativeResponse::getAttribute()` return attributes inspired from `curl_getinfo()`
 - use raw sockets instead of the HTTP stream wrapper
 - HTTP/HSTS cache
 - implement bridges with existing HTTP clients
 - etc.
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 10, 2019
| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Sooner that later, we'll need to send HTTP calls to APIs,
e.g. Symfony Mailer or symfony#27738 already need it.

Common existing HTTP clients for PHP rely on PSR-7,
which is orthogonal to the way Symfony is designed.

More reasons we need this in core are the package principles:
if we want to be able to keep our BC+deprecation promises, we
have to build on more stable and more abstract dependencies than
Symfony itself.

So here we are. This PR introduces a new `Http` namespace in contracts,
and a new `HttpClient` component.

Its surface is by design very simple to use, while still flexible
enough to cover more advanced use cases thanks to streaming+laziness.

Two full implementations are provided:
 - `NativeHttpClient` is based on PHP's HTTP stream wrapper. It's the
   most portable one but relies on a blocking `fopen()`.
 - `CurlHttpClient` relies on the curl extension. It supports full
   concurrency and HTTP/2.

Here are some examples that work with both clients.

For common cases, all the methods on reponses are synchronous:
```php
$client = new NativeHttpClient();

$response = $client->get('https://google.com');

$statusCode = $response->getStatusCode();
$headers = $response->getHeaders();
$content = $response->getContent();
```

When several responses need to be fetched concurrently,
clients provide a `complete()` method, e.g.:

```php
$client = new CurlHttpClient();
$pool = [];

for ($i = 0; $i < 379; ++$i) {
    $pool[] = $client->get("https://http2.akamai.com/demo/tile-$i.png");
}

foreach ($client->complete($pool) as $r) {
    $r->getHeaders();
}
```

The `complete()` method accepts a second `$timeout` argument that defaults
to `ini_get('default_socket_timeout')`, typically 60s. The `foreach` will
skip responses that are *inactive* for longer than the timeout. These
inactive responses are available by calling `$responses->getReturn()`
after iterating is done.

Providing `0` as timeout allows iterating over already completed responses
in a non-blocking way.

When a next response depends on the result of a previous one, it's also
possible to schedule new responses for completion. Doing so is incompatible
with `foreach` and requires using the `Generator` interface directly,
its `send()` method especially:

```php
$client = new CurlHttpClient();

// [...] populate $pool of responses for some first requests as above

$responses = $client->complete($pool);

while ($responses->valid()) {
    $completedResponse = $responses->current();
    $completedKey = $responses->key(); // the key of the response in $pool

    // [...] compute $moreResponses based e.g. on $completedResponse/Key

    if ($moreResponses) {
        $responses->send($moreResponses);
    } else {
        $responses->next();
    }
}
```

By default, clients follow redirections. When the redirection limit is
reached, or when a `4xx` or a `5xx` happens, an exception is thrown.

An array of options allows adjusting the behavior when sending requests.
They are documented in `HttpClientTrait`.

TODO:
 - [ ] validate the design
 - [ ] add tests
 - [ ] split in several PRs?

Implemented:
 - flexible contracts for HTTP clients
 - `fopen()` + `curl`-based clients
 - gzip compression enabled when possible
 - fetch multiple responses concurrently
 - progress function able to cancel the request
 - flexible timeout management
 - public key pinning

Help wanted (can be done after merge):
 - handle standard proxy-related env vars when using `NativeHttpClient`
 - logger integration
 - `TraceableHttpClient` and integration with the profiler
 - `FrameworkBundle` integration: autowireable alias + semantic configuration for default options
 - help clarify the contracts where needed
 - mock client

More ideas:
 - HTTP/2 push as a temporary cache in `complete()`
 - cookie jar with a new `"cookie_jar"` option
 - helpers to send JSONs/forms/uploads via special type in "body" option
 - make `NativeResponse::getAttribute()` return attributes inspired from `curl_getinfo()`
 - use raw sockets instead of the HTTP stream wrapper
 - HTTP/HSTS cache
 - implement bridges with existing HTTP clients
 - etc.
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 10, 2019
| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Sooner that later, we'll need to send HTTP calls to APIs,
e.g. Symfony Mailer or symfony#27738 already need it.

Common existing HTTP clients for PHP rely on PSR-7,
which is orthogonal to the way Symfony is designed.

More reasons we need this in core are the package principles:
if we want to be able to keep our BC+deprecation promises, we
have to build on more stable and more abstract dependencies than
Symfony itself.

So here we are. This PR introduces a new `Http` namespace in contracts,
and a new `HttpClient` component.

Its surface is by design very simple to use, while still flexible
enough to cover more advanced use cases thanks to streaming+laziness.

Two full implementations are provided:
 - `NativeHttpClient` is based on PHP's HTTP stream wrapper. It's the
   most portable one but relies on a blocking `fopen()`.
 - `CurlHttpClient` relies on the curl extension. It supports full
   concurrency and HTTP/2.

Here are some examples that work with both clients.

For common cases, all the methods on reponses are synchronous:
```php
$client = new NativeHttpClient();

$response = $client->get('https://google.com');

$statusCode = $response->getStatusCode();
$headers = $response->getHeaders();
$content = $response->getContent();
```

When several responses need to be fetched concurrently,
clients provide a `complete()` method, e.g.:

```php
$client = new CurlHttpClient();
$pool = [];

for ($i = 0; $i < 379; ++$i) {
    $pool[] = $client->get("https://http2.akamai.com/demo/tile-$i.png");
}

foreach ($client->complete($pool) as $r) {
    $r->getHeaders();
}
```

The `complete()` method accepts a second `$timeout` argument that defaults
to `ini_get('default_socket_timeout')`, typically 60s. The `foreach` will
skip responses that are *inactive* for longer than the timeout. These
inactive responses are available by calling `$responses->getReturn()`
after iterating is done.

Providing `0` as timeout allows iterating over already completed responses
in a non-blocking way.

When a next response depends on the result of a previous one, it's also
possible to schedule new responses for completion. Doing so is incompatible
with `foreach` and requires using the `Generator` interface directly,
its `send()` method especially:

```php
$client = new CurlHttpClient();

// [...] populate $pool of responses for some first requests as above

$responses = $client->complete($pool);

while ($responses->valid()) {
    $completedResponse = $responses->current();
    $completedKey = $responses->key(); // the key of the response in $pool

    // [...] compute $moreResponses based e.g. on $completedResponse/Key

    if ($moreResponses) {
        $responses->send($moreResponses);
    } else {
        $responses->next();
    }
}
```

By default, clients follow redirections. When the redirection limit is
reached, or when a `4xx` or a `5xx` happens, an exception is thrown.

An array of options allows adjusting the behavior when sending requests.
They are documented in `HttpClientTrait`.

TODO:
 - [ ] validate the design
 - [ ] add tests
 - [ ] split in several PRs?

Implemented:
 - flexible contracts for HTTP clients
 - `fopen()` + `curl`-based clients
 - gzip compression enabled when possible
 - fetch multiple responses concurrently
 - progress function able to cancel the request
 - flexible timeout management
 - public key pinning

Help wanted (can be done after merge):
 - handle standard proxy-related env vars when using `NativeHttpClient`
 - logger integration
 - `TraceableHttpClient` and integration with the profiler
 - `FrameworkBundle` integration: autowireable alias + semantic configuration for default options
 - help clarify the contracts where needed
 - mock client

More ideas:
 - HTTP/2 push as a temporary cache in `complete()`
 - cookie jar with a new `"cookie_jar"` option
 - helpers to send JSONs/forms/uploads via special type in "body" option
 - make `NativeResponse::getAttribute()` return attributes inspired from `curl_getinfo()`
 - use raw sockets instead of the HTTP stream wrapper
 - HTTP/HSTS cache
 - implement bridges with existing HTTP clients
 - etc.
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 10, 2019
| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Sooner that later, we'll need to send HTTP calls to APIs,
e.g. Symfony Mailer or symfony#27738 already need it.

Common existing HTTP clients for PHP rely on PSR-7, which is orthogonal
to the way Symfony is designed and thus we don't want to push in core.

More reasons we need this in core are the package principles:
if we want to be able to keep our BC+deprecation promises, we
have to build on more stable and more abstract dependencies than
Symfony itself.

So here we are. This PR introduces a new `Http` namespace in contracts,
and a new `HttpClient` component.

Its surface is by design very simple to use, while still flexible
enough to cover more advanced use cases thanks to streaming+laziness.

Two full implementations are provided:
 - `NativeHttpClient` is based on PHP's HTTP stream wrapper. It's the
   most portable one but relies on a blocking `fopen()`.
 - `CurlHttpClient` relies on the curl extension. It supports full
   concurrency and HTTP/2.

Here are some examples that work with both clients.

For common cases, all the methods on reponses are synchronous:
```php
$client = new NativeHttpClient();

$response = $client->get('https://google.com');

$statusCode = $response->getStatusCode();
$headers = $response->getHeaders();
$content = $response->getContent();
```

When several responses need to be fetched concurrently,
clients provide a `complete()` method, e.g.:

```php
$client = new CurlHttpClient();
$pool = [];

for ($i = 0; $i < 379; ++$i) {
    $pool[] = $client->get("https://http2.akamai.com/demo/tile-$i.png");
}

foreach ($client->complete($pool) as $r) {
    $r->getHeaders();
}
```

The `complete()` method accepts a second `$timeout` argument that defaults
to `ini_get('default_socket_timeout')`, typically 60s. The `foreach` will
skip responses that are *inactive* for longer than the timeout. These
inactive responses are available by calling `$responses->getReturn()`
after iterating is done.

Providing `0` as timeout allows iterating over already completed responses
in a non-blocking way.

When a next response depends on the result of a previous one, it's also
possible to schedule new responses for completion. Doing so is incompatible
with `foreach` and requires using the `Generator` interface directly,
its `send()` method especially:

```php
$client = new CurlHttpClient();

// [...] populate $pool of responses for some first requests as above

$responses = $client->complete($pool);

while ($responses->valid()) {
    $completedResponse = $responses->current();
    $completedKey = $responses->key(); // the key of the response in $pool

    // [...] compute $moreResponses based e.g. on $completedResponse/Key

    if ($moreResponses) {
        $responses->send($moreResponses);
    } else {
        $responses->next();
    }
}
```

By default, clients follow redirections. When the redirection limit is
reached, or when a `4xx` or a `5xx` happens, an exception is thrown.

An array of options allows adjusting the behavior when sending requests.
They are documented in `HttpClientTrait`.

TODO:
 - [ ] validate the design
 - [ ] add tests
 - [ ] split in several PRs?

Implemented:
 - flexible contracts for HTTP clients
 - `fopen()` + `curl`-based clients
 - gzip compression enabled when possible
 - fetch multiple responses concurrently
 - progress function able to cancel the request
 - flexible timeout management
 - public key pinning

Help wanted (can be done after merge):
 - handle standard proxy-related env vars when using `NativeHttpClient`
 - logger integration
 - `TraceableHttpClient` and integration with the profiler
 - `FrameworkBundle` integration: autowireable alias + semantic configuration for default options
 - help clarify the contracts where needed
 - mock client

More ideas:
 - HTTP/2 push as a temporary cache in `complete()`
 - cookie jar with a new `"cookie_jar"` option
 - helpers to send JSONs/forms/uploads via special type in "body" option
 - make `NativeResponse::getAttribute()` return attributes inspired from `curl_getinfo()`
 - use raw sockets instead of the HTTP stream wrapper
 - HTTP/HSTS cache
 - implement bridges with existing HTTP clients
 - etc.
nicolas-grekas added a commit to nicolas-grekas/symfony that referenced this pull request Jan 10, 2019
| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | -
| License       | MIT
| Doc PR        | -

Sooner or later, we'll need to send HTTP requests to APIs;
e.g. Symfony Mailer or symfony#27738 already need it.

Common existing HTTP clients for PHP rely on PSR-7, which is complex and
orthogonal to the way Symfony is designed.

Another reason we need this in core is the package principles:
if we want to keep our BC and deprecation promises, we have
to build on dependencies that are more stable and more abstract
than Symfony itself.

So here we are. This PR introduces a new `Http` namespace in contracts,
and a new `HttpClient` component.

Its surface is by design very simple to use, while still flexible
enough to cover more advanced use cases thanks to streaming+laziness.

Two full implementations are provided:
 - `NativeHttpClient` is based on PHP's HTTP stream wrapper. It's the
   most portable one but relies on a blocking `fopen()`.
 - `CurlHttpClient` relies on the curl extension. It supports full
   concurrency and HTTP/2.

return;
}

$httpClient = $this->httpClient;
Member

Seems unnecessary

$hashPrefix = substr($hash, 0, 5);
$url = sprintf(self::RANGE_API, $hashPrefix);

$result = $httpClient->request('GET', $url)->getContent();
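For context, the range endpoint returns one `SUFFIX:COUNT` line per leaked hash that shares the queried 5-character SHA-1 prefix, so only the prefix ever leaves the application. A minimal sketch of the matching logic (`countLeaks` is a hypothetical helper, shown with a hard-coded response body instead of a real HTTP call):

```php
// Hypothetical sketch of the k-anonymity check: hash the password with SHA-1,
// query the API with the first 5 hex chars only, then look for the remaining
// 35 chars among the returned "SUFFIX:COUNT" lines.
function countLeaks(string $password, string $rangeApiBody): int
{
    $hash = strtoupper(sha1($password));
    $suffix = substr($hash, 5);

    foreach (explode("\n", $rangeApiBody) as $line) {
        [$candidateSuffix, $count] = explode(':', trim($line)) + [1 => '0'];
        if ($candidateSuffix === $suffix) {
            return (int) $count;
        }
    }

    return 0; // suffix absent: the password is not in the breach corpus
}

// Fake response body for demonstration (real bodies contain hundreds of lines).
$hash = strtoupper(sha1('secret123'));
$body = substr($hash, 5).":42\nFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:7";

echo countLeaks('secret123', $body); // prints 42
echo countLeaks('other-password', $body); // prints 0
```

A non-zero count is what the constraint compares against its threshold/maxCount option.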
Member

If the endpoint is unavailable (500/503/... for instance), what should we do?

Member Author

Currently, we throw. But this has been discussed at some point; I'll add an option to ignore this constraint when the API is down (disabled by default).

public function testThresholdNotReached()
{
    $constraint = new NotPwned(['threshold' => 10]);
    $this->validator->validate(self::PASSWORD_LEAKED, $constraint);
Contributor

Suggested change
- $this->validator->validate(self::PASSWORD_LEAKED, $constraint);
+ $this->validator->validate(self::PASSWORD_LEAKED, new NotPwned(['threshold' => 10]));

@dunglas
Member Author

dunglas commented Mar 21, 2019

  • Added a new option to not throw in case of error
  • Fixed issues raised in comments

Should be ready to be merged

return;
}

throw $e;
Member

I still don't think that throwing here by default makes sense. What should be configurable is whether a HTTP failure will make the validator create a constraint violation or just skip.

Member Author

I still don't think that throwing here by default makes sense.

The system must be as secure as possible by default. If the service has an outage, I'd rather retry later to create an account for one of my company's users than let them pick something like "mum", or a password that has already leaked. This behavior can now be changed using a simple attribute.

What should be configurable is whether a HTTP failure will make the validator create a constraint violation or just skip.

It's exactly what the new attribute does, or am I missing something?

Member

Throwing an exception will not create a constraint violation, but will lead to a server error. From the user's point of view that's the worst that could happen, as they won't get any feedback about what went wrong or whether there is anything they could do.

Member Author

But if we don't throw, how will the monitoring system detect the ongoing issue? It should be very exceptional and should probably trigger an alert.

Alternatively, I can change the attribute to accept three values: throw (default), skip or fail. WDYT?

Member

I think we should log by default and add a scream option (defaulting to false) to allow opting in.

Contributor

If there is an exception, it basically means the third party is down, i.e. an unrecoverable error. If you (as a system) decided that any entered password must NOT have been "pwned", then at this point we should throw an exception, not log something. If you would rather tolerate a possibly "pwned" password, just set skipOnError to true and you're covered.
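The trade-off under discussion can be sketched as follows (a hypothetical validator skeleton, with `$skipOnError` standing in for the constraint option and `$fetchLeakCount` for the HTTP lookup; not the merged implementation):

```php
// Hypothetical sketch of the behavior under discussion: by default an API
// failure bubbles up as an exception (secure default, visible to monitoring);
// with skipOnError enabled, validation silently passes instead.
function validateNotPwned(callable $fetchLeakCount, string $password, bool $skipOnError = false): bool
{
    try {
        $count = $fetchLeakCount($password);
    } catch (\RuntimeException $e) {
        if ($skipOnError) {
            return true; // treat the password as valid when the API is down
        }

        throw $e; // default: surface the outage as a server-side error
    }

    return 0 === $count; // valid only if the password never leaked
}

// Stub simulating an outage of the lookup service.
$down = function (string $password): int {
    throw new \RuntimeException('HIBP API unavailable');
};

var_dump(validateNotPwned($down, 'p4ssw0rd', true)); // bool(true): skipped
```

Without `skipOnError`, the same call would rethrow the `\RuntimeException`, which is the "secure by default" stance argued for above.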

Member

I'm on the same side as @sroze

@fabpot
Member

fabpot commented Mar 31, 2019

This one should be in 4.3 :) Let's talk on Slack about the best way to finish it.

Contributor

@sroze sroze left a comment

Looks good to me 👍

@fabpot
Member

fabpot commented Apr 1, 2019

Thank you @dunglas.


protected static $errorNames = [self::PWNED_ERROR => 'PWNED_ERROR'];

public $message = 'This password has been leaked in a data breach, it must not be used. Please use another password.';
Contributor

should be added in validators.en.xlf + any language you know :)

not sure we'll do another round of good first issues for the remaining locales :}

use Symfony\Component\Validator\Constraint;

/**
* Checks if a password has been leaked in a data breach.
Contributor

perhaps clarify the password should NOT be leaked :/
