I'm playing with a Suse 10.2 installation on my new hardware. It's all basically working locally but routing/DNS is broken.
I can traceroute to anywhere, I can ping anywhere but I can't actually communicate with anywhere - i.e. I can't connect Firefox to any web sites and I can't ssh to anywhere.
So what's set up wrong? /etc/resolv.conf has the right things in it (same as my other system anyway) and the default gateway is the same as my other system. I'm stumped.
On Thu, Oct 05, 2006 at 10:46:29PM +0100, cl@isbd.net wrote:
I'm playing with a Suse 10.2 installation on my new hardware. It's all basically working locally but routing/DNS is broken.
I can traceroute to anywhere, I can ping anywhere but I can't actually communicate with anywhere - i.e. I can't connect Firefox to any web sites and I can't ssh to anywhere.
So what's set up wrong? /etc/resolv.conf has the right things in it (same as my other system anyway) and the default gateway is the same as my other system. I'm stumped.
You say routing/DNS is broken, but if you can traceroute/ping then routing sounds fine.
Are you tracerouting/pinging an IP or a hostname?
If an IP, can you ssh to an IP address?
Do you have any firewalling rules?
J.
On Thu, Oct 05, 2006 at 10:58:50PM +0100, Jonathan McDowell wrote:
On Thu, Oct 05, 2006 at 10:46:29PM +0100, cl@isbd.net wrote:
I'm playing with a Suse 10.2 installation on my new hardware. It's all basically working locally but routing/DNS is broken.
I can traceroute to anywhere, I can ping anywhere but I can't actually communicate with anywhere - i.e. I can't connect Firefox to any web sites and I can't ssh to anywhere.
So what's set up wrong? /etc/resolv.conf has the right things in it (same as my other system anyway) and the default gateway is the same as my other system. I'm stumped.
You say routing/DNS is broken, but if you can traceroute/ping then routing sounds fine.
Well yes, I sort of realised that, but *something* is awry with routing as I have no useful internet access at all.
Are you tracerouting/pinging an IP or a hostname?
A hostname. E.g. I can 'traceroute shell.x-1.net' and I can 'ping shell.x-1.net' but I can't 'ssh shell.x-1.net'. The ssh just hangs for ever.
If an IP, can you ssh to an IP address?
No, I tried that anyway, ssh to an IP address just fails too.
Do you have any firewalling rules?
Possibly, but this is what I hate about modern distributions, it's all buried in the depths of GUI configuration utilities somewhere. I did try 'firewall off' in yast2 but that seemed to have no effect.
As you say it's not really a DNS problem, it's some sort of routing issue I think.
Do ping and traceroute use UDP? If so it would seem that UDP is working OK but that TCP isn't.
On Fri, Oct 06, 2006 at 08:46:40AM +0100, cl@isbd.net wrote:
Do ping and traceroute use UDP? If so it would seem that UDP is working OK but that TCP isn't.
Ping is ICMP, traceroute is UDP. Grab tcptraceroute... if that doesn't work then you've got a TCP problem. This is *not* routing, though: ICMP and UDP can traverse, which limits it to a firewall, and I'd assume that it's your gateway device.
Thanks,
On Fri, Oct 06, 2006 at 09:10:33AM +0100, Brett Parker wrote:
On Fri, Oct 06, 2006 at 08:46:40AM +0100, cl@isbd.net wrote:
Do ping and traceroute use UDP? If so it would seem that UDP is working OK but that TCP isn't.
Ping is ICMP, traceroute is UDP, grab tcptraceroute... if that doesn't work then you've got a TCP problem, this is *not* routing, though, as ICMP and UDP can traverse, this limits it to a firewall, and I'd assume that it's your gateway device.
The gateway is a router, already in use by two or three other systems on the network with no problems. That was my first thought in fact so I checked that the router firewall setup was the same for the new Linux box as for the other systems - it is.
I have checked repeatedly that the system with a problem has got the router's IP address as its default gateway too. (Anyway as non-TCP works surely the gateway must be set right)
Everything is on the same 192.168.1.0 subnet, the router (i.e. gateway) is 192.168.1.254. There are happily working Windows and Linux systems at 192.168.1.11, 192.168.1.1 and 192.168.1.4. The system that is not working is at 192.168.1.64. I have had other Linux distributions working OK in the same box at the same IP address so it's almost certainly something adrift in the installation itself I suspect.
(N.B. I have DHCP turned off for everything and fixed IP for all my systems)
On Fri, Oct 06, 2006 at 09:33:06AM +0100, cl@isbd.net wrote:
On Fri, Oct 06, 2006 at 09:10:33AM +0100, Brett Parker wrote:
On Fri, Oct 06, 2006 at 08:46:40AM +0100, cl@isbd.net wrote:
Do ping and traceroute use UDP? If so it would seem that UDP is working OK but that TCP isn't.
Ping is ICMP, traceroute is UDP, grab tcptraceroute... if that doesn't work then you've got a TCP problem, this is *not* routing, though, as ICMP and UDP can traverse, this limits it to a firewall, and I'd assume that it's your gateway device.
The gateway is a router, already in use by two or three other systems on the network with no problems. That was my first thought in fact so I checked that the router firewall setup was the same for the new Linux box as for the other systems - it is.
OK - alarm bells just went off in my head... just a simple, really easy question here... is this a 2.6.17 kernel? If so, can you try the following: echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
If your networking then works, I know what the problem is... if not, then there's something else at play here ;)
Cheers,
On Fri, Oct 06, 2006 at 09:40:09AM +0100, Brett Parker wrote:
On Fri, Oct 06, 2006 at 09:33:06AM +0100, cl@isbd.net wrote:
On Fri, Oct 06, 2006 at 09:10:33AM +0100, Brett Parker wrote:
On Fri, Oct 06, 2006 at 08:46:40AM +0100, cl@isbd.net wrote:
Do ping and traceroute use UDP? If so it would seem that UDP is working OK but that TCP isn't.
Ping is ICMP, traceroute is UDP, grab tcptraceroute... if that doesn't work then you've got a TCP problem, this is *not* routing, though, as ICMP and UDP can traverse, this limits it to a firewall, and I'd assume that it's your gateway device.
The gateway is a router, already in use by two or three other systems on the network with no problems. That was my first thought in fact so I checked that the router firewall setup was the same for the new Linux box as for the other systems - it is.
OK - alarm bells just went off in my head... just a simple, really easy question here... is this a 2.6.17 kernel? If so, can you try the following: echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
It's a 2.6.18 kernel (rc5 I think), but the above *does* fix the problem - what arcane art is that?! :-)
Brilliant, thanks.
If your networking then works, I know what the problem is... if not, then there's something else at play here ;)
I'm expecting this sort of problem really, I'm pushing the edge of drivers available in Linux which is why I'm using a 2.6.18 kernel, the SuSE 10.2 alpha (it'll be beta very soon I think) was one of the few that has the drivers I want in it.
On Fri, Oct 06, 2006 at 09:47:50AM +0100, cl@isbd.net wrote:
On Fri, Oct 06, 2006 at 09:40:09AM +0100, Brett Parker wrote:
OK - alarm bells just went off in my head... just a simple, really easy question here... is this a 2.6.17 kernel? If so, can you try the following: echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
It's a 2.6.18 kernel (rc5 I think), but the above *does* fix the problem - what arcane art is that?! :-)
Brilliant, thanks.
OK - the basic problem is that in the 2.6.17 kernel the window scaling was actually "fixed" to conform to the specification, and now uses a percentage of available RAM for the window sizes... the unfortunate part is that half the routers in the world are completely broken wrt the spec, which is a pain...
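A rough sketch of how a larger buffer limit turns into a larger window scale factor. It assumes the kernel chooses the factor by halving the largest receive buffer until it fits in the 16-bit window field, capped at 14; that is a simplified reading of the behaviour, not a quote from the kernel source. `wscale_for` is a hypothetical helper name, and the 4 MB figure is only an illustrative guess at a RAM-derived maximum.

```shell
#!/bin/sh
# wscale_for MAX_RCVBUF_BYTES
# Prints the number of halvings needed to bring the buffer size
# down to something a 16-bit window field (max 65535) can express.
# Assumption: this mirrors how the scale factor is picked; the cap
# of 14 is the largest shift the TCP window scale option allows.
wscale_for() {
    space=$1
    wscale=0
    while [ "$space" -gt 65535 ] && [ "$wscale" -lt 14 ]; do
        space=$((space / 2))
        wscale=$((wscale + 1))
    done
    echo "$wscale"
}

wscale_for 174760     # old default maximum from this thread; prints 2
wscale_for 4194304    # hypothetical 4 MB maximum; prints 7
```

With the old maximum the factor stays small. With a multi-megabyte maximum it grows, and a router that mangles or ignores the option then misjudges the advertised window by that many bits, which would fit connections that open and then stall.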
So, the "long term" fix for now is to add to /etc/sysctl.conf the following:
    net.ipv4.tcp_rmem=4096 87380 174760
    net.ipv4.tcp_wmem=4096 16384 131072
Which are the old defaults for the window sizes.
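A minimal sketch of making that edit repeatable, assuming a plain key=value sysctl.conf. `set_sysctl_conf` is a hypothetical helper, not something shipped by any distribution. It replaces an existing line for the key or appends one, and takes the file path as an optional third argument so it can be tried on a scratch copy first.

```shell
#!/bin/sh
# set_sysctl_conf KEY VALUE [FILE]
# Ensures FILE contains exactly one "KEY=VALUE" line for KEY.
# FILE defaults to /etc/sysctl.conf (writing there needs root).
set_sysctl_conf() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    # Escape the dots so they match literally in the grep pattern.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    # Keep every line that does not already assign this key.
    # grep exits 1 when nothing is left over, which is fine here.
    grep -v -E "^[[:space:]]*${key_re}[[:space:]]*=" "$file" > "$tmp" 2>/dev/null || true
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write through cat so the original file keeps its owner and mode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, run `set_sysctl_conf net.ipv4.tcp_rmem "4096 87380 174760"` and the matching `tcp_wmem` call as root, then `sysctl -p` to load the file without rebooting.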
To test that you can do:
    sysctl -w net.ipv4.tcp_rmem="4096 87380 174760"
    sysctl -w net.ipv4.tcp_wmem="4096 16384 131072"
    sysctl -w net.ipv4.tcp_window_scaling=1
If you want to check that these are set, cat the corresponding files in /proc/sys, so for example: cat /proc/sys/net/ipv4/tcp_window_scaling
And check that's back to 1, and the other 2 are set as I've put in here... and long live the tcp connection :)
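The check described above could be scripted along these lines. This is only a sketch: `check_sysctl` is a hypothetical helper, and the optional third argument (an alternative root in place of /proc/sys) is there purely so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# check_sysctl KEY EXPECTED [ROOT]
# Compares the live value of a sysctl key with the expected string.
# Returns 0 on match, 1 on mismatch, 2 if the key cannot be read.
check_sysctl() {
    key=$1
    expected=$2
    root=${3:-/proc/sys}
    # net.ipv4.tcp_rmem -> /proc/sys/net/ipv4/tcp_rmem
    path="$root/$(printf '%s' "$key" | tr . /)"
    if [ ! -r "$path" ]; then
        echo "MISSING $key ($path)"
        return 2
    fi
    # Unquoted on purpose: word splitting collapses the tabs the
    # kernel puts between the three values into single spaces.
    actual=$(echo $(cat "$path"))
    if [ "$actual" = "$expected" ]; then
        echo "ok   $key = $actual"
    else
        echo "DIFF $key = $actual (expected $expected)"
        return 1
    fi
}
```

Typical use would be `check_sysctl net.ipv4.tcp_window_scaling 1`, followed by `check_sysctl net.ipv4.tcp_rmem "4096 87380 174760"` and `check_sysctl net.ipv4.tcp_wmem "4096 16384 131072"`.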
I'm expecting this sort of problem really, I'm pushing the edge of drivers available in Linux which is why I'm using a 2.6.18 kernel, the SuSE 10.2 alpha (it'll be beta very soon I think) was one of the few that has the drivers I want in it.
I'm fairly sure that 2.6.18 has been released, Noodles probably knows better than I... *looks at kernel.org* - yup, it's been released, 2006-09-20 was the release date :)
Cheers,
On Fri, Oct 06, 2006 at 09:57:53AM +0100, Brett Parker wrote:
On Fri, Oct 06, 2006 at 09:47:50AM +0100, cl@isbd.net wrote:
On Fri, Oct 06, 2006 at 09:40:09AM +0100, Brett Parker wrote:
OK - alarm bells just went off in my head... just a simple, really easy question here... is this a 2.6.17 kernel? If so, can you try the following: echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
It's a 2.6.18 kernel (rc5 I think), but the above *does* fix the problem - what arcane art is that?! :-)
Brilliant, thanks.
OK - the basic problem is that in the 2.6.17 kernel the window scaling was actually "fixed" to conform to the specification, and now uses a percentage of available ram for the window sizes... the unfortunate part is that half the routers in the world are completely broken wrt the spec, which is a pain...
So, the "long term" fix for now is to add to /etc/sysctl.conf the following:
    net.ipv4.tcp_rmem=4096 87380 174760
    net.ipv4.tcp_wmem=4096 16384 131072
Which are the old defaults for the window sizes.
To test that you can do:
    sysctl -w net.ipv4.tcp_rmem="4096 87380 174760"
    sysctl -w net.ipv4.tcp_wmem="4096 16384 131072"
    sysctl -w net.ipv4.tcp_window_scaling=1
If you want to check that these are set, cat the corresponding files in /proc/sys, so for example: cat /proc/sys/net/ipv4/tcp_window_scaling
And check that's back to 1, and the other 2 are set as I've put in here... and long live the tcp connection :)
Excellent, thank you, I've done all that now and I'm just doing a reboot remotely - I'm not totally convinced it'll come back up for me to check now but that doesn't really matter.
I'm expecting this sort of problem really, I'm pushing the edge of drivers available in Linux which is why I'm using a 2.6.18 kernel, the SuSE 10.2 alpha (it'll be beta very soon I think) was one of the few that has the drivers I want in it.
I'm fairly sure that 2.6.18 has been released, Noodles probably knows better than I... *looks at kernel.org* - yup, it's been released, 2006-09-20 was the release date :)
Yes, I know that, but there's no distribution with the released version of the 2.6.18 kernel in it yet - at least not a 'major' distribution. I need to get something up and running in order to enable me to build a released 2.6.18 kernel.
While I've been waffling away here the system *has* rebooted successfully and now I can make TCP connections to the outside world!
:-) Thanks very much indeed.
On Fri, Oct 06, 2006 at 11:19:27AM +0100, cl@isbd.net wrote:
On Fri, Oct 06, 2006 at 09:57:53AM +0100, Brett Parker wrote:
I'm fairly sure that 2.6.18 has been released, Noodles probably knows better than I... *looks at kernel.org* - yup, it's been released, 2006-09-20 was the release date :)
Yes, I know that, but there's no distribution with the released version of the 2.6.18 kernel in it yet - at least not a 'major' distribution. I need to get something up and running in order to enable me to build a released 2.6.18 kernel.
Debian Unstable has it... That's a rather major distribution... OK - so it's not a released distribution... ;) I believe that Ubuntu Edgy ships at the end of the month with the 2.6.18 kernel too...
While I've been waffling away here the system *has* rebooted successfully and now I can make TCP connections to the outside world!
:-) Thanks very much indeed.
Not a problem, as I said, was bitten by this on someone's Gentoo system a while ago, and on a couple of our servers a while after that... so it came to mind :)
Cheers,
On Fri, Oct 06, 2006 at 11:36:48AM +0100, Brett Parker wrote:
On Fri, Oct 06, 2006 at 11:19:27AM +0100, cl@isbd.net wrote:
On Fri, Oct 06, 2006 at 09:57:53AM +0100, Brett Parker wrote:
I'm fairly sure that 2.6.18 has been released, Noodles probably knows better than I... *looks at kernel.org* - yup, it's been released, 2006-09-20 was the release date :)
Yes, I know that, but there's no distribution with the released version of the 2.6.18 kernel in it yet - at least not a 'major' distribution. I need to get something up and running in order to enable me to build a released 2.6.18 kernel.
Debian Unstable has it... That's a rather major distribution... OK - so it's not a released distribution... ;) I believe that Ubuntu Edgy ships at the end of the month with the 2.6.18 kernel too...
Yes, OK, Ubuntu Edgy is one of my two 'front runners' at the moment. I want a distribution that's officially supported by VMware. The Ubuntu Edgy beta does pretty well at installing itself actually as they have already put the JMicron PATA patch into it and thus it can see my CD drive which is more than any other distribution I've tried so far.
The advantage of Suse is that it's rather easier to customise the way I want it (or so it seems so far), I can get to run an FVWM2 based desktop with very little effort.
I could also, of course, go for Slackware 11 which would make it very easy to transport all my customisations across from my old Slackware 10.x system. Slackware 11 comes with 2.6.18 as an optional kernel. It's just a bit messier to install VMware, that's all.
While I've been waffling away here the system *has* rebooted successfully and now I can make TCP connections to the outside world!
:-) Thanks very much indeed.
Not a problem, as I said, was bitten by this on someone's Gentoo system a while ago, and on a couple of our servers a while after that... so it came to mind :)
It's one of those things which is trivial to fix but *very* difficult to diagnose. It did point pretty conclusively at the router when I found I could ssh from the new Linux box to the old one on the same subnet.
This whole process has been quite an education in (not?) buying bleeding edge hardware. It's something I have always avoided in the past, mostly on the basis that "just a little behind bleeding edge" is usually better value for money.
However this time everything and everyone said that the Intel Core 2 Duo processor was the one to go for as it's a significant jump ahead of the best AMD 64x2 processors at the moment. The price differential wasn't that significant compared with the AMD processors so I went for it.
I don't regret the decision really, I have an existing system (well, two systems) that continues to work OK so I can waste time on the new system without any real pressure to get it working quickly. The initial problem that pushed me to a new system (my win2k desktop was failing to boot more and more frequently) turned out to be a flaky power supply and replacing that has made it 100% reliable again.
I have (hopefully) a workaround for the JMicron IDE problem on the way, I've ordered a SATA CD/DVD drive. I should then be able to install a stable version of either SuSE or Ubuntu without hassle.
I have already worked around the Realtek 8168 NIC problem by putting a cheap NIC in one of the PCI slots.
Well, I'm really moving the topic to Penguins (as we should on a Linux list) ...
On 06-Oct-06 cl@isbd.net wrote:
This whole process has been quite an education in (not?) buying bleeding edge hardware. It's something I have always avoided in the past, mostly on the basis that "just a little behind bleeding edge" is usually better value for money.
I remember (a long time ago) reading about how penguins get up in the morning, and waddle off to the edge of their ice-floe. There they all line up, looking down into the sea (into which they all wish to plunge), and wait, doubtfully, shuffling their feet.
Eventually, one of them will shuffle a bit too far, and onto the slippery slope. It slides inexorably down, and splashes into the sea. The others watch, wide-eyed.
After a while, if the one that fell does not get eaten by a seal, the others drop forward onto their bellies and toboggan joyfully down to the sea for breakfast.
However this time everything and everyone said that the Intel Core 2 Duo processor was the one to go for as it's a significant jump ahead of the best AMD 64x2 processors at the moment. The price differential wasn't that significant compared with the AMD processors so I went for it.
Watching this wide-eyed! Congratulations on evading that leopard seal!
Cheers, Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@nessie.mcc.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 06-Oct-06                                       Time: 13:18:50
------------------------------ XFMail ------------------------------
On 06-Oct-06 cl@isbd.net wrote:
On Thu, Oct 05, 2006 at 10:58:50PM +0100, Jonathan McDowell wrote:
Are you tracerouting/pinging an IP or a hostname?
A hostname. E.g. I can 'traceroute shell.x-1.net' and I can 'ping shell.x-1.net' but I can't 'ssh shell.x-1.net'. The ssh just hangs for ever.
If an IP, can you ssh to an IP address?
No, I tried that anyway, ssh to an IP address just fails too.
It looks as though it's not a routing/DNS problem at all.
I'm wondering if it may have to do with ssh -- maybe the host you're trying to ssh to doesn't like being accosted with ssh (may even not have it available) and so simply refuses to respond, and therefore hangs. Or maybe your connection is coming in from an IP address that it doesn't want to respond to.
Have you tried simple telnet?
Chris, I'll drop you a private mail with an IP address that will certainly respond to telnet with a login prompt, provided you can connect to it at all.
If you get that far, then it's not routing.
If it hangs, then you're not getting to it.
Cheers, Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@nessie.mcc.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 06-Oct-06                                       Time: 09:20:33
------------------------------ XFMail ------------------------------
On Fri, Oct 06, 2006 at 09:20:36AM +0100, Ted Harding wrote:
On 06-Oct-06 cl@isbd.net wrote:
On Thu, Oct 05, 2006 at 10:58:50PM +0100, Jonathan McDowell wrote:
Are you tracerouting/pinging an IP or a hostname?
A hostname. E.g. I can 'traceroute shell.x-1.net' and I can 'ping shell.x-1.net' but I can't 'ssh shell.x-1.net'. The ssh just hangs for ever.
If an IP, can you ssh to an IP address?
No, I tried that anyway, ssh to an IP address just fails too.
It looks as though it's not a routing/DNS problem at all.
I'm wondering if it may have to do with ssh -- maybe the host you're trying to ssh to doesn't like being accosted with ssh (may even not have it available) and so simply refuses to respond, and therefore hangs. Or maybe your connection is coming in from an IP address that it doesn't want to respond to.
It's not just ssh not working and I can ssh to the same place from another Linux box on the same network/subnet as the one that doesn't work.
HTTP doesn't work either, just sits with the thingy going round and round but no connection.
Have you tried simple telnet?
Chris, I'll drop you a private mail with an IP address that will certainly respond to telnet with a login prompt, provided you can connect to it at all.
I have tried a telnet to the same place and it appears to connect but doesn't produce a prompt. I also tried turning the verbosity of ssh up somewhat and it too appears to connect but nothing gets returned.
It's as if the system is blocking *all* incoming packets, even those that are coming back down the same TCP stream.
I can ssh *in* to the system in question and (just tried it) I can ssh *from* the system into the other Linux box on the same subnet. What I can't do is ssh to the outside world - weird!!
Wow - I just tried running yast2 on the system in question having logged into it from here at work via two ssh connections. It's popped up the GUI YaST control centre on my Sun X desktop here at work! Most things *must* be working pretty well then!