Zero RX and TX Packets
My new server had been running for about 10 days. Running Slackware 14.1 32-bit. No connectivity problems. Excellent iperf results. No problems streaming videos to my media player.
Then I was checking something (I now forget what) when I ran ifconfig on the server. All zeroes for RX and TX packets. Likewise for /proc/net/dev. Likewise for vnstat.
The vnstat history indicated the defect existed the moment I went live with the server. I never had much reason to notice because of the great connectivity.
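For anyone wanting to check the same thing without ifconfig, here is a minimal sketch that pulls the RX and TX packet counters for one interface straight from /proc/net/dev. The function name iface_packets and its optional file argument are made up for illustration.

```sh
# Print "RX_PACKETS TX_PACKETS" for one interface.
# Usage: iface_packets IFACE [FILE]   (FILE defaults to /proc/net/dev)
iface_packets() {
    # The counters follow "iface:". Some kernels print no space after the
    # colon, so turn the colon into a separator before splitting fields.
    sed 's/:/ /' "${2:-/proc/net/dev}" |
        awk -v ifc="$1" '$1 == ifc { print $3, $11 }'
}

# Example: iface_packets eth0
```

On a healthy driver both numbers climb with traffic. With the defect described here, both stay at zero.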
I use the ifconfig transfer stats in a script to decide when to power down at night. I have been using this script for several years since I started using my office desktop for pseudo server tasks. The RX and TX stats indicate current connection transfers, such as client systems streaming music and videos, or downloading package updates through cron jobs. (I also use netstat -an to test client connections but that is unrelated to this topic.)
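The script itself is not shown here, so the following is only a sketch of the general idea, not the real thing: take two samples of the packet counters some seconds apart and treat the machine as idle when few packets moved in between. The function name traffic_is_idle and the threshold are hypothetical.

```sh
# Succeed (exit 0) when fewer than THRESHOLD packets moved between samples.
# Usage: traffic_is_idle "RX1 TX1" "RX2 TX2" THRESHOLD
traffic_is_idle() {
    set -- $1 $2 "$3"      # split both samples into $1..$4, threshold in $5
    delta=$(( ($3 - $1) + ($4 - $2) ))
    [ "$delta" -lt "$5" ]
}

# Example with made-up numbers: 8 packets in the interval, threshold 50.
#   traffic_is_idle "100 200" "105 203" 50 && echo "idle"
```

A check like this is only as good as the counters feeding it. When the driver reports zeroes, every interval looks idle.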
The server ASRock N68C-GS4 FX motherboard uses an Atheros QCA8171 Gigabit chip set, which uses the alx kernel module driver. Lesson number one: be more diligent when buying motherboards. Actually I had been diligent when I researched the board. I read some customer reports that the Atheros chip worked in Linux. I did not pause to think that the statement needed qualification as to which kernel versions, or that quirks existed.
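A quick way to confirm which driver is bound to an interface is to follow the driver symlink in sysfs. This is a small sketch. The function name nic_driver and the optional sysfs root argument are illustrative only.

```sh
# Print the kernel driver bound to a network interface.
# Usage: nic_driver IFACE [SYSFS_ROOT]   (SYSFS_ROOT defaults to /sys)
nic_driver() {
    link=$(readlink "${2:-/sys}/class/net/$1/device/driver") || return 1
    basename "$link"
}

# Example: nic_driver eth0
```

Running ethtool -i eth0 reports the driver name as well.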
Some web surfing indicated a lack of traffic counter support with that module.
There were reports of wake-on-lan not working with some newer kernel versions. I was having no problems using wake-on-lan with the motherboard. All I needed was the network stats.
Slackware 14.1 was released with the 3.10.17 kernel. The latest 3.10 version is 3.10.94.
More surfing indicated that the alx driver network stats had been fixed, at least in the 3.14 kernel. Perhaps the related patches had been backported to the 3.10 series.
A related kernel bug report contained a link to the original maintainer’s reason for not supporting traffic counters.
This is classic WTF material. Year of the Linux Desktop? Not likely with this kind of attitude.
I have not compiled a kernel in several years. I still had my script wrapper from a few years ago when I regularly compiled kernels on Slackware.
I interrupted my day to fix Yet Another Linux Related Bug.
I copied the Slackware 3.10.17 kernel config file to my kernel build directory. After a few nominal adjustments to my old script, and with a mild shake of my head in wonderment, I was on my way to compiling a new kernel.
The days of 15 minute kernel compiles are long gone. Compiling the 3.10.94 kernel took an hour on my 2.3 GHz dual core system.
While compiling I looked further into the problem. After learning the stats counter problem was fixed in the 3.14 series kernel, I downloaded a copy of the 3.14.58 sources and compared them to the 3.10.94 sources. The net stats counter support had not been backported into the 3.10.94 kernel. I was wasting my time compiling.
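A comparison like that can be done with grep and diff. The sketch below assumes both source trees are unpacked side by side under the directory names shown, and it assumes get_stats64 is a useful string to search for (the kernel's stats hook for network drivers is named ndo_get_stats64). Both are assumptions for illustration, not a record of the exact commands used.

```sh
# Succeed when any file under DIR mentions PATTERN.
# Usage: tree_mentions DIR PATTERN
tree_mentions() {
    grep -rq -- "$2" "$1"
}

# Hypothetical usage, with both trees unpacked in the current directory:
#   drv=drivers/net/ethernet/atheros/alx
#   tree_mentions linux-3.14.58/$drv get_stats64 && echo "3.14.58: stats hook present"
#   tree_mentions linux-3.10.94/$drv get_stats64 || echo "3.10.94: stats hook missing"
#   diff -ru linux-3.10.94/$drv linux-3.14.58/$drv | less
```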
I next thought I could find a patch to merge with the 3.10.17 sources. I did not find anything specific but I ran into something called the kernel backports. Seems this is what I needed.
I browsed the alx sources and found references to stat counting. Good.
The general approach is nice:
- Download the latest version of backports.
- untar the sources.
- cd to that directory.
- Run make menuconfig.
As far as I can tell the default config is packaged such that nothing or very little is enabled. The trick is to enable only the needed modules and save the config file.
- Run make.
- Run make install.
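The steps above can be sketched as a small shell function. The tarball and directory names are placeholders, and the DRY_RUN switch is an addition for illustration that only prints the commands, which makes it easy to review the sequence before running it for real.

```sh
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Usage: build_backports TARBALL SRCDIR
# The body runs in a subshell so the caller's working directory is untouched.
build_backports() (
    run tar xf "$1" &&
    run cd "$2" &&
    run make menuconfig &&   # enable only the needed driver, then save
    run make &&
    run make install         # typically needs root
)

# Example (placeholder names):
#   DRY_RUN=1; build_backports backports.tar.xz backports-src
```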
When I finished running make there were two kernel modules in the source tree:
compat/compat.ko
drivers/net/ethernet/atheros/alx/alx.ko
Running make install installed the new modules to /lib/modules/`uname -r`/updates and ran depmod.
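Before reloading, it can be worth confirming that modprobe will pick up the new module rather than the stock one. The command modinfo -n alx prints the file the module would be loaded from. The helper below is a hypothetical sketch that simply checks whether such a path sits under an updates directory.

```sh
# Succeed when a module file path lies under the "updates" directory
# of some kernel release, e.g. the path printed by `modinfo -n alx`.
in_updates_dir() {
    case "$1" in
        /lib/modules/*/updates/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the server (assumes the module is named alx):
#   in_updates_dir "$(modinfo -n alx)" && echo "backported alx will be loaded"
```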
Next I held my breath:
rmmod alx && modprobe alx && /etc/rc.d/rc.inet1 restart
The one-liner was quick enough that my office system did not miss a beat and continued playing tunes from the server. I ran two scripts, one to test my ISP connection route and then another to sync files from the server. Finally I had ifconfig eth0 network stats.
Yay. A short-lived yay. A cynical and sour yay.
I lost wake-on-lan support, which was working beautifully before the updated driver. Reboots and shut downs did not fix the bug.
Running ethtool verified the loss. There was no wake-on-lan support at all.
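The relevant lines in the ethtool output are "Supports Wake-on:" and "Wake-on:". As a sketch, a small filter (the name wol_modes is made up) can pull both values out and print "none" where a line is missing. In the ethtool letter codes, g means wake on magic packet and d means disabled.

```sh
# Read `ethtool IFACE` output on stdin and print the supported
# wake-on-lan modes followed by the active mode, e.g. "pumbg g".
wol_modes() {
    awk -F': *' '
        /Supports Wake-on:/ { sup = $2 }
        /^[ \t]*Wake-on:/   { cur = $2 }
        END {
            if (sup == "") sup = "none"
            if (cur == "") cur = "none"
            print sup, cur
        }
    '
}

# Example: ethtool eth0 | wol_modes
```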
I found a reverse patch in this bug report. The bug report indicates the wake-on-lan support was ripped from the driver and has yet to be restored. I checked the alx driver in the backports sources and confirmed the lack of support. The reverse patch was my only hope.
I merged the patch and recompiled the backports sources.
I verified network stats with ifconfig and wake-on-lan support with ethtool. A suspend to ram and a magic packet from my office computer verified wake-on-lan was again truly working.
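For reference, a magic packet is nothing exotic: six 0xff bytes followed by the target MAC address repeated 16 times. The sketch below only builds that payload as a hex string. The function name is hypothetical, it does not check that the address contains valid hex digits, and it does not send anything. Tools such as ether-wake or wakeonlan build and send the packet for you.

```sh
# Print a wake-on-lan magic packet as a hex string:
# six 0xff bytes followed by the target MAC address repeated 16 times.
# Usage: magic_packet_hex aa:bb:cc:dd:ee:ff
magic_packet_hex() {
    mac=$(printf '%s' "$1" | tr -d ':-')
    [ "${#mac}" -eq 12 ] || return 1    # expect 6 bytes = 12 hex digits
    pkt=ffffffffffff
    i=0
    while [ "$i" -lt 16 ]; do
        pkt=$pkt$mac
        i=$((i + 1))
    done
    printf '%s\n' "$pkt"
}
```

The result is 204 hex digits, or 102 bytes on the wire.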
That is how people using Linux waste, er, consume, er, spend an entire afternoon fixing bugs. Someone once defined insanity as repeating the same act and expecting different results. We Linux users are insane. We keep repeating the same act of acting as though this stuff just works.
Does that alx kernel maintainer receive a paycheck for this kind of unprofessional work?
Posted: Usability Tagged: General