Iashim [internetarchive shim for http://box/archive with NGINX] #2120
georgejhunt wants to merge 4 commits into iiab:master
| src: /etc/apache2/sites-available/internetarchive.conf | ||
| path: /etc/apache2/sites-enabled/internetarchive.conf | ||
| state: link | ||
| - name: Install nginx config for nternetarchive for short URL http://box/archive (if debuntu and internetarchive_enabled) |
| - name: Install nginx config for nternetarchive for short URL http://box/archive (if debuntu and internetarchive_enabled) | |
| - name: Install nginx config for internetarchive for short URL http://box/archive (if debuntu and internetarchive_enabled) |
| - name: Restart to enable/disable http://box/archive (not just http://box:{{ internetarchive_port }}) | ||
| systemd: | ||
| name: "{{ apache_service }}" # httpd or apache2 | ||
| name: nginx # httpd or apache2 |
| name: nginx # httpd or apache2 | |
| name: nginx |
|
@mitra42 can we ask you to please have your Node web server listen on IPv4? Let us know if you have any questions! Then we can merge this or similar. |
|
Hi @holta - no idea what this is about. I've certainly not intentionally constrained anything to listen on IPv6; the servers in all other configurations run standalone (i.e. not through Apache or Nginx) and are certainly listening on IPv4 (no way I would remember their IPv6 address to key it in!). I'm actually surprised that a proxypass works, since dweb-mirror can redirect URLs itself. But it must, as AFAIK IA is working on IIAB (I only have one RPI4 and it's set up with a different configuration at the moment). Anyway, it should at least get you the first page; problems, if there were any, would come after that point. But as I said, I'm definitely not doing anything about using IPv6. |
|
@georgejhunt can you clarify a bit more what you need from @mitra42? Thanks! |
|
@georgejhunt you should be able to confirm what I'm saying by just pointing a browser at http://[IPv4-address-of-yourbox]:4244 from any other machine. |
|
Yes, I can confirm that from a browser 172.18.96.1:4244 works.
But getting nginx to proxy that, such that http://172.18.96.1/archive works
... that is something I'm still working on.
In the following listing, I'd like to see node listening on 127.0.0.1:4244.
netstat returns:
root@box:/opt/iiab/internetarchive/node_modules/@internetarchive/dweb-mirror# netstat -natp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address      Foreign Address      State        PID/Program name
tcp        0      0 0.0.0.0:80         0.0.0.0:*            LISTEN       9416/nginx: master
tcp        0      0 127.0.0.1:53       0.0.0.0:*            LISTEN       5929/dnsmasq
tcp        0      0 172.18.96.1:53     0.0.0.0:*            LISTEN       5929/dnsmasq
tcp        0      0 0.0.0.0:22         0.0.0.0:*            LISTEN       842/sshd
tcp        0      0 0.0.0.0:3000       0.0.0.0:*            LISTEN       30671/kiwix-serve
tcp        0      0 127.0.0.1:8090     0.0.0.0:*            LISTEN       2869/apache2
tcp        0      0 0.0.0.0:9090       0.0.0.0:*            LISTEN       1540/uwsgi
tcp        0      0 0.0.0.0:8008       0.0.0.0:*            LISTEN       29929/python2.7
tcp        0      0 127.0.0.1:3306     0.0.0.0:*            LISTEN       26809/mysqld
tcp        0    196 10.10.123.180:22   10.10.123.191:55345  ESTABLISHED  7478/sshd: root@pts
tcp6       0      0 :::4244            :::*                 LISTEN       11845/node
tcp6       0      0 :::4949            :::*                 LISTEN       973/perl
tcp6       0      0 :::22              :::*                 LISTEN       842/sshd
root@box:/opt/iiab/internetarchive/node_modules/@internetarchive/dweb-mirror#
... and we have essentially turned off ipv6 wherever we can. I think that the
kernel must be equating ipv6 and ipv4 for a port that it has open.
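One way to check a listing like this for a single port is to filter the rows by local port and read the Proto column. The sketch below is illustrative only: `listenersOnPort` is a hypothetical helper (not part of IIAB or dweb-mirror), and the sample text is three rows from the listing, written one socket per line.

```javascript
// Hypothetical helper (not part of IIAB or dweb-mirror): list the LISTEN
// sockets that `netstat -natp` reports for one local port.
function listenersOnPort(netstatText, port) {
  return netstatText
    .split("\n")
    .map((line) => line.trim().split(/\s+/))
    // Columns: Proto Recv-Q Send-Q Local Foreign State PID/Program
    .filter((cols) => cols.length >= 7 && cols[5] === "LISTEN")
    .filter((cols) => cols[3].endsWith(":" + port))
    .map((cols) => ({
      proto: cols[0],
      local: cols[3],
      // Program names may contain spaces ("nginx: master"), so rejoin them
      program: cols.slice(6).join(" "),
    }));
}

// Three rows taken from the netstat listing, one socket per line.
const sample = [
  "tcp   0 0 0.0.0.0:80      0.0.0.0:*  LISTEN 9416/nginx: master",
  "tcp   0 0 127.0.0.1:8090  0.0.0.0:*  LISTEN 2869/apache2",
  "tcp6  0 0 :::4244         :::*       LISTEN 11845/node",
].join("\n");

console.log(listenersOnPort(sample, 4244));
```

For port 4244 the only row returned has proto `tcp6`, which is the `:::4244` entry owned by node.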
|
|
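There's some chat on the topic here nodejs/node#18041 regarding a change in node's default behavior, which suggests a change to this line in mirrorHttp:
const server = app.listen(config.apps.http.port);
In dweb-mirror, the problem is that I'm running a different setup and don't have a way to know whether any change I made actually fixed your problem. Editing that line and then doing service internetarchive restart should work fine. I'd try
const server = app.listen(config.apps.http.port, "127.0.0.1");
or
const server = app.listen(config.apps.http.port, "0.0.0.0");
If you find a combination that works for your setup, I can look at testing in other setups, and will put it in the config if that's needed. |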
|
Your suggestion succeeded in making port 4244 visible to the "netstat"
function, but did not resolve the lack of function at port 4244.
I played with the browser on the test machine using the URL that
works, "http://localhost:4244", and saw that there is an intermediate
redirect to:
http://127.0.0.1:4244/archive/archive.html?mirror=127.0.0.1%3A4244&transport=HTTP&identifier=local;
So I tried the following nginx conf:
location /archive {
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_set_header Host $http_host;
    proxy_pass http://127.0.0.1:4244/archive/archive.html?mirror=127.0.0.1%3A4244&transport=HTTP&identifier=local;
}
Which in turn returned a link labeled "Skip to main Content" -- which
seems most likely to be a response issued from the dweb code in node.
So maybe you know of a URL that would be correct for proxy_pass?
…On Fri, Jan 10, 2020 at 1:51 PM Mitra Ardron ***@***.***> wrote:
There's some chat on the topic here nodejs/node#18041 regarding change in node's default behavior, which suggests a change to the line in mirrorHttp
const server = app.listen(config.apps.http.port);
In dweb mirror, problem is that I'm running a different setup and don't have a way to know if any change i made actually fixed your problem.
Editing that line and then doing service internetarchive restart should work fine.
I'd try
const server = app.listen(config.apps.http.port, "127.0.0.1");
or
const server = app.listen(config.apps.http.port, "0.0.0.0");
if you find a combination that works for your setup, I can look at testing in other setups, and will put it in the config if that's needed.
|
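To see what the second argument to `listen` changes, the sketch below starts a throwaway Node `http` server on an explicit host and reports the address it was actually bound to. It uses plain `http.createServer` rather than dweb-mirror's `app`, and `probeListen` is a hypothetical helper, so treat it as an illustration of the host argument only, not as the dweb-mirror code.

```javascript
const http = require("http");

// Hypothetical helper: bind a temporary server to `host`, report the address
// the OS gave it, then shut the server down again.
function probeListen(host) {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => res.end("ok"));
    server.once("error", reject);
    // Port 0 asks the OS for any free port, so this cannot clash with 4244.
    server.listen(0, host, () => {
      const bound = server.address(); // { address, family, port }
      server.close(() => resolve(bound));
    });
  });
}

probeListen("127.0.0.1").then((bound) => {
  console.log("bound to", bound.address, "port", bound.port);
});
```

Passing "127.0.0.1" or "0.0.0.0" asks for an IPv4 socket. Node's documentation states that when the host is omitted the server listens on the unspecified IPv6 address (::) if IPv6 is available, and on 0.0.0.0 otherwise, which would be consistent with the `:::4244` row in the netstat listing.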
|
I don't actually know what URL would work for proxypass on the new version; in fact I'm surprised it worked on the old version! The server uses redirects, so it's most comfortable when it's addressed directly, e.g. having nginx redirect to https://:4244. I remember we struggled with this when we first got Internet Archive working on IIAB, but that was a while back (I thought the problem was that under Apache with the rewrite module turned off (the previous IIAB config), and unlike under Nginx, you couldn't figure out what the IP address of your host was). Anyway, that proxypass line you've given won't do what I think you are trying to achieve, because I believe in nginx you do a "rewrite" of the URL and then the proxypass; for example I had this block for a different service. But as I said, I do not think proxypass will work at all: once the browser has the code, it's going to try to access URLs that won't work (like '/details/foo'). My best guess is that you want a block like Once it gets there, you should see a redirect to ... If that doesn't work then I may have to replicate your setup to figure this out; can you give me (abbreviated) instructions to build the same setup that is failing for you? (I have a RPI4 I'll use for this.) |
|
@mitra42 fyi IIAB 7.1's release was deferred until February (likely the 2nd half of February) so that NGINX and similar infra are much more solid and clear for all. That means we have many weeks ahead of us to get things unstuck here. But still give me a shout if you and @georgejhunt need a short call to accelerate this work by early/mid-February at the latest? |
|
On the call this morning, I learned that apache was not able to proxy
/archive either.
If that is the case, I'm thinking this is a "won't fix".
|
|
Ok - I'm thinking it should be a redirect, from /archive/xxx to http://$host:4244/xxx |
|
Hi @georgejhunt -
Anyway .... if we are moving to nginx, why not add something like ....
location /archive/ {
rewrite ^(/archive/(.*)$ http://$host:4244/$1 permanent;
}
I'd propose it as a PR, but I've got one of these boxes up and am more than a little confused by having both Apache and nginx on it! I can't figure out how they are related (what handles what, and how they hand off between them); in particular, I can only find an Apache rule set for archive and it doesn't look like it is being used. |
|
When I pasted your location rewrite clause, nginx complained about a
missing ")". I took out the "(" but still got an error.
Can you suggest where the ")" belongs?
…On Wed, Jan 29, 2020 at 9:35 PM Mitra Ardron ***@***.***> wrote:
Hi @georgejhunt -
Anyway .... if we are moving to nginx, why not add something like ....
location /archive/ {
rewrite ^(/archive/(.*)$ http://$host:4244/$1 permanent;
}
I'd propose it as a PR but I've got one of these boxes up, and am more than a little confused by having both apache and nginx on it ! I can't figure out how they are related - what handles what and how they hand off between them, in particular I can only find an apache rule set for archive and it doesnt look like this is being used.
|
|
I would have taken out the first '('. Doesn't $1 refer to the 2nd parens? |
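On the question of which parentheses `$1` refers to: capture groups are numbered by the position of their opening parenthesis. The sketch below uses a JavaScript regex as a stand-in for nginx's PCRE patterns (the numbering rule is the same in both); the variable names and the sample path are only for illustration.

```javascript
// JavaScript regex used as a stand-in for an nginx (PCRE) rewrite pattern.
const requestPath = "/archive/details/foo";

// Two balanced groups: the outer one is group 1, the inner one is group 2.
const nested = requestPath.match(/^(\/archive\/(.*))$/);

// With the first "(" taken out, the only group left is group 1.
const single = requestPath.match(/^\/archive\/(.*)$/);

console.log(nested[1]); // /archive/details/foo
console.log(nested[2]); // details/foo
console.log(single[1]); // details/foo
```

So with the first '(' removed, `$1` would be the part of the path after /archive/.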
|
Arggh - sorry, I cut and pasted another rule and mis-edited. Should work.... Note the other half of the question was that I couldn't figure out where this rule went, so I couldn't test it, given that both Apache and Nginx appear to be running in the current version (presumably because the switchover is half done). |
Yeah please read https://github.com/iiab/iiab/blob/master/roles/nginx/README.md so that you understand why both Apache and NGINX are on. Hit us back @mitra42 if you have any questions, Thanks! Obviously this isn't the final word (as the doc above explains, some apps/services cannot easily be made to run under NGINX yet...at this point and possibly for the coming year or so!) |
|
Thanks for the pointer - it answered most of my questions. Point 2 doesn't parse: is FastCGI not available on nginx, or on Apache? I see it in the config for mediawiki, and I'm not sure what "validates Ngix" means. Does it mean it validates that app for NGINX, or is it a typo meaning "Invalidates for NGINX"? List (iv) is really services that run their own HTTP server (neither Apache nor NGINX), i.e. they don't need handling in nginx, just forwarding (a redirect for internetarchive; not sure which for kalite). |
@georgejhunt I think you wrote that sentence, so hopefully you can explain more?
Exactly. The point is that many (not all, of course) educators and users out there are very fond of mnemonics like http://box/archive and http://box/kalite, if we can make these possible. They give very rapid access to that day's lesson, without students/all getting overly distracted surfing around into other topics/apps off of IIAB's main page, e.g. http://box |
|
Understood - and it should be easier with nginx because you have the $host variable, which makes redirects to the same host MUCH easier, as in the line above: http://box/archive/details/foo would redirect to http://box:4244/details/foo, which would work and access its own internal URLs correctly (which it won't do after a proxypass). You can do this on Apache, but IIAB was running with the rewrite module disabled, which disables access to that variable. |
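The difference between a redirect and a proxypass for internal URLs can be illustrated with standard URL resolution. This is a sketch under assumptions: `resolveFrom` is a hypothetical helper, the page path /archive/archive.html is borrowed from the redirect seen earlier in the thread, and '/details/foo' is the example link mentioned above; it shows how a browser would resolve such a root-relative link, not what dweb-mirror itself does.

```javascript
// Illustrative only: resolve a root-relative link against the URL of the page
// that contains it, the way a browser does.
function resolveFrom(pageUrl, link) {
  return new URL(link, pageUrl).href;
}

// Page served through a proxy on port 80: the link stays on port 80.
const viaProxy = resolveFrom("http://box/archive/archive.html", "/details/foo");

// Page served after a redirect to port 4244: the link stays on port 4244.
const viaRedirect = resolveFrom("http://box:4244/archive/archive.html", "/details/foo");

console.log(viaProxy);    // http://box/details/foo
console.log(viaRedirect); // http://box:4244/details/foo
```

In the proxied case the follow-up request lands on the port 80 server without the /archive prefix, so a `location /archive` block would not match it; after a redirect the browser talks to port 4244 directly.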
Missed translating apache to nginx for internetarchive.
This still fails because internetarchive uses node, and is currently configured to listen on ipv6, but we've disabled ipv6 on IIAB.
The configuration that needs changing may be in dweb, but I cannot find it.