Monday, September 23, 2013

Debugging Problems with the Linux sunxi rtl8188eu Wireless Driver

New: Updates are at the bottom!

I have a problem with the rtl8188eu wireless driver.  I develop the OS for the MICILE tablet system, and sometimes the wireless module fails to initialize on boot or reboot.  Needless to say, this makes for unhappy customers.  This post is a preliminary analysis of the problem with a possible path to a solution.

What do I currently know of the problem?
  • Seems to happen when a lot of wifi networks are present
  • When it happens, the entire USB bus that the module is on fails to appear when lsusb is called
  • The rtl8188eu kernel module cannot be unloaded, so there is only one chance to debug per reboot

WIFI Failed to Initialize!

So how can I solve such a problem?

Some items that need to be done before a solution can be found are:

1.  Replicate the problem reliably
2.  Sniff the USB power lines and the data packets
3.  Fix the rtl8188eu driver so it can be unloaded

To solve #1, I plan on finding every old wireless router I can and hooking them all up in the same room to replicate an apartment with wifi on all sides.  I think nine routers should do it, as that should form a grid of nine networks representing an apartment surrounded by wifi routers on all sides.  In addition, some kind of automated reboot loop will have to be written to power cycle the tablet until the problem occurs.  The problem shows up in only a small percentage of boots (on the order of 5 percent), so waiting for it to happen by hand is not practical.

Crowded Apartment WIFI Configuration
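
The automated reboot loop described above could be sketched roughly as follows.  This is only an illustration, not the script actually used on the tablet.  It assumes Python 3.7 or newer is available, that lsusb and reboot are on the PATH, and that the module enumerates as USB ID 0bda:8179 (an assumption to verify against lsusb output on a good boot).  The function names and the log path are made up for the example.

```python
import re
import subprocess

# Assumed USB vendor:product ID for the RTL8188EU module; verify it against
# lsusb output on a boot where the module did enumerate.
WIFI_USB_ID = "0bda:8179"


def wifi_present(lsusb_output, usb_id=WIFI_USB_ID):
    """Return True if a device with the given ID appears in lsusb output."""
    pattern = re.compile(r"\bID\s+" + re.escape(usb_id) + r"\b", re.IGNORECASE)
    return any(pattern.search(line) for line in lsusb_output.splitlines())


def check_and_reboot(log_path="/var/log/wifi-boot-test.log"):
    """Log one boot result; reboot on success, stop on failure."""
    output = subprocess.run(
        ["lsusb"], capture_output=True, text=True, check=False
    ).stdout
    ok = wifi_present(output)
    with open(log_path, "a") as log:
        log.write("PASS\n" if ok else "FAIL\n")
    if ok:
        # The module came up, so reboot and try again.
        subprocess.run(["reboot"], check=False)
    # On failure, return without rebooting so the tablet stays in the
    # bad state and can be probed.
    return ok
```

Calling check_and_reboot() from a boot-time script would keep the tablet cycling until the first failed boot.  Since the whole USB bus disappears from lsusb when the problem hits, a missing device ID is enough to detect it.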

To solve #2, the tablet will be opened up and wires will be soldered to the onboard USB wifi module.  A logic analyzer will monitor the USB bus and an oscilloscope will monitor the voltage rail.  Since the USB voltage is software controlled, the failure could be a bug in the way the power is brought up from a standby state.

USB Debug Points

To solve #3, I don't really care about the functionality of the driver, because once the wifi module comes up it works just fine.  I just need enough functionality in the driver to bring up the USB bus and the wifi module, and then enough fixes that the driver can be unloaded cleanly.  To that end, everything in between those two steps will be deleted from the driver.

Wish me luck that the answer comes easily!

Day 2 Update!

Wires have been soldered to the USB pins.

USB Debug Wires
And interesting information has been found!  It was as I suspected!  When rebooting the tablet, the USB bus that the WIFI adapter is attached to does not power cycle correctly.

I instrumented the kernel with printk statements that clearly show the power to the USB bus should be turned off for several seconds during the boot sequence, but that is clearly not happening, as shown by the logic analyzer hooked up to the USB power line.  On a cold boot, however, it takes several seconds for the USB power line to come up, so that gives hope that it is in fact controlled by the AXP power controller and can be toggled.  My thinking is that if the power to the USB bus is turned off correctly during a reboot sequence, then the USB WIFI module will reset and come back properly every time.  Without the power cycle, the USB WIFI module can be left in a bad state between reboots and then fail to enumerate itself on the USB bus on the next boot.
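
To make the "did the power line actually drop?" check repeatable across many reboots, the logic analyzer capture can be tested in software.  The sketch below is a hypothetical helper, not part of my actual setup.  It assumes the capture has been exported as (time in seconds, level) pairs sorted by time, with level 0 meaning the rail is off, and the one-second threshold is an arbitrary placeholder.

```python
def longest_low_interval(samples):
    """Return the longest time in seconds that the line stayed low.

    samples: iterable of (time_seconds, level) pairs sorted by time,
    where level is 0 (rail off) or 1 (rail on).
    """
    longest = 0.0
    low_start = None
    last_time = None
    for t, level in samples:
        if level == 0 and low_start is None:
            low_start = t                      # falling edge: low period begins
        elif level != 0 and low_start is not None:
            longest = max(longest, t - low_start)
            low_start = None                   # rising edge: low period ends
        last_time = t
    # A low period still open at the end of the capture is counted
    # up to the last sample.
    if low_start is not None and last_time is not None:
        longest = max(longest, last_time - low_start)
    return longest


def power_cycled(samples, min_off_seconds=1.0):
    """True if the rail was off for at least min_off_seconds (assumed threshold)."""
    return longest_low_interval(samples) >= min_off_seconds
```

A reboot capture where the rail never drops gives a longest low interval of zero, which is the behavior the logic analyzer is showing now.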

From looking at the reference design schematics, it looks like the USB WIFI power should be hooked up to the LDO3 of the AXP209 power controller.  Let's hope the factory didn't deviate from the reference schematic.

USB WIFI Power Schematic

After some careful probing with a multimeter I can confirm that the power rail of the USB WIFI module is indeed connected to pin 41 (LDO3) of the AXP209 power controller chip.

I think we are almost there!  Now we just need to figure out how to turn the LDO3 output of the AXP209 on and off from the kernel.  Most likely at this point it will turn out to be a wrong setting in the script.bin configuration file for the tablet and will need no code changes to the kernel or kernel drivers.  The datasheet for the AXP209 is in Chinese, so it should be fun trying to figure out the proper I2C commands to turn the LDO3 output on and off.  The worst part is that I just desoldered the very fine wires to the 0603 resistors used for spying on the I2C bus.  I'm not going to push my luck and try to reattach them, so I will see if we can live without seeing what is happening on the bus.
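
As a starting point for experimenting with the I2C commands, here is a rough sketch of the read-modify-write that toggling LDO3 would probably involve.  Everything specific in it is an assumption that still has to be checked against the datasheet: the I2C address 0x34, the power output control register 0x12, and bit 6 as the LDO3 enable bit.  The function names are invented for the example, and the bus access is passed in as two callables so the logic can be tried without hardware.

```python
AXP209_I2C_ADDR = 0x34     # assumed 7-bit I2C address of the AXP209
REG_PWR_OUT_CTRL = 0x12    # assumed power output control register
LDO3_ENABLE_BIT = 1 << 6   # assumed LDO3 enable bit in that register


def with_ldo3(reg_value, enable):
    """Return the register value with only the LDO3 bit set or cleared."""
    if enable:
        return (reg_value | LDO3_ENABLE_BIT) & 0xFF
    return reg_value & ~LDO3_ENABLE_BIT & 0xFF


def set_ldo3(read_byte, write_byte, enable):
    """Read-modify-write the control register so other outputs are untouched.

    read_byte(addr, reg) -> int and write_byte(addr, reg, value) are
    supplied by the caller, for example bound methods of an SMBus object.
    """
    old = read_byte(AXP209_I2C_ADDR, REG_PWR_OUT_CTRL)
    new = with_ldo3(old, enable)
    if new != old:
        write_byte(AXP209_I2C_ADDR, REG_PWR_OUT_CTRL, new)
    return new
```

With the python-smbus bindings, the callables could be bus.read_byte_data and bus.write_byte_data.  The read-modify-write matters because the same register presumably also controls the other regulator outputs, and clearing the wrong bit could power down the whole tablet.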

More to come later!