Is there a known problem with USB ports?

I’m not sure where else to turn at this point. I have been testing gl-inet devices for the past few days, mostly the ar150m and the ar300m, to see whether they can be used with a temperature sensor that I need for customers.

No matter what I do, however, I keep getting CRC errors in the communications.
The software is custom and is being built for me by someone else.
We are using the following steps.

opkg update
opkg install kmod-usb-serial kmod-usb-serial-ftdi kmod-usb-uhci
modprobe ftdi_sio
echo 0260 00a4 > /sys/bus/usb-serial/drivers/ftdi_sio/new_id

./tmpsensor --serial-port=/dev/ttyUSB0

All of the gl-inet devices I’ve tested show CRC errors. I compared this to some Linux devices I have (mini PCs) and they do not show these errors.

Wondering how I could find the reason for these problems.

Not sure if this can help.

D:response all packet 393, bad packet 22
D:request sleep 1000000 microseconds
D:request request latest data…

D:request next_pkt_length before read 0
D:request read nread 58
D:request pkt_length 58, nread 58
E:response packet crc check error
packet:
52 42 36 00 01 21 50 B0 0E 0A 12 09 02 00 58 9F
0C 00 65 15 97 00 71 05 3E 1B 28 07 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 0A 25
D:response all packet 394, bad packet 23

Can anyone share how I could monitor the communications to get some idea of where these errors are coming from?

Are these USB CRC errors or a higher-level protocol?

Are your mini-PCs x86 or a non-x86 architecture?

Is your code endian-clean? Like most non-x86 devices, the MIPS CPUs in Atheros SoCs are big-endian.

The only time I’ve had a problem with a USB device on a GL.iNet product, I eventually traced the root cause to the device itself: the candleLight_fw firmware (usbd_gs_can.c at master in the candle-usb/candleLight_fw repository on GitHub) hasn’t implemented support for big-endian hosts yet.


As Entropy512 pointed out, there are a huge number of possibilities here since you’re talking about a custom piece of code, talking to a custom hardware device.

Comparing to code running on a mini PC is not really relevant, other than proving that the device/software combination works in that specific scenario. I have seen problems arise with serial communications on MIPS hardware simply because the interface was written poorly and the timing the device expected was not being delivered, due to the lower CPU power/processing speed of the MIPS board. You also have the whole slew of “cross-compiled using an entirely different toolchain” issues to consider. Your mini PC build may be compiling against glibc, while your GL.iNet firmware uses musl or whatever other option it ships with. In theory they provide the same functionality, but there are all kinds of tweaks and differences between them that you may be running into. Or your coder is relying on GNU assumptions or bugs that aren’t present or don’t behave the same way.
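
One common form of the timing problem described above is a "short read": on a slower board, a single read call may return only part of a packet, and code that assumes it got the whole thing then runs its CRC over incomplete data. Whether that is what happened in this thread is not known. The sketch below is only an illustration in Python, with hypothetical names (read_exact, demo_short_reads), and it uses a pipe to stand in for the serial port.

```python
import os
import threading
import time


def read_exact(fd, length):
    """Read exactly `length` bytes from `fd`, looping over short reads."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.read(fd, remaining)  # may return fewer bytes than asked for
        if not chunk:
            # End of stream. On a serial port configured with a read timeout,
            # an empty read can also mean "nothing arrived yet"; this sketch
            # simply treats both cases as an error.
            raise EOFError("stream ended with %d bytes still missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo_short_reads():
    """Simulate a 58-byte packet that arrives in two pieces with a pause."""
    r, w = os.pipe()

    def writer():
        os.write(w, b"a" * 20)
        time.sleep(0.05)  # the "device" is slow to send the rest
        os.write(w, b"b" * 38)
        os.close(w)

    t = threading.Thread(target=writer)
    t.start()
    try:
        return read_exact(r, 58)
    finally:
        t.join()
        os.close(r)


if __name__ == "__main__":
    packet = demo_short_reads()
    print(len(packet), "bytes received")
```

A single os.read(fd, 58) in the demo would typically return only the first 20 bytes; the loop keeps reading until the full packet length has arrived before anything is handed to the CRC check.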

So many possibilities here. If I were in your shoes, I would start with:

  1. Since it looks like your custom software is communicating over an FTDI USB->Serial interface to the hardware, try ditching your software and using a standard terminal app. Minicom comes to mind. Install it, run commands/do whatever you think your software should be doing, and see if the errors occur there.

If they do, then it could be the device or a hardware issue. If not, then obviously something about the software or the tool chain/build process needs to be examined and tested in more detail. As Entropy mentioned, it could be Endian-ness, it could be underlying library versions (if you rely on any), or just misconceived code/code not designed to be run on an embedded MIPS device.

  2. Get your “software person” a device and buildroot setup to work with and have them troubleshoot it. If they are developing the solution for you, they should know best how to debug it and see exactly what’s going on.
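
If you capture the same exchange with a known-good tool (or on the mini PC) and again on the router, comparing the raw bytes shows whether data is being corrupted, shifted, or truncated. Here is a rough Python sketch of that comparison. The helper names are hypothetical, the sample is the packet from the log in the first post, and the "reference" capture is invented for the demonstration since the thread does not include one.

```python
# Packet exactly as printed in the log from the first post.
LOGGED_PACKET = """
52 42 36 00 01 21 50 B0 0E 0A 12 09 02 00 58 9F
0C 00 65 15 97 00 71 05 3E 1B 28 07 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 0A 25
"""


def parse_hex_dump(text):
    """Turn whitespace-separated hex pairs into a bytes object."""
    return bytes(int(token, 16) for token in text.split())


def diff_offsets(a, b):
    """Return every offset where two captures differ.

    Offsets past the end of the shorter capture count as differences too.
    """
    common = min(len(a), len(b))
    diffs = [i for i in range(common) if a[i] != b[i]]
    diffs.extend(range(common, max(len(a), len(b))))
    return diffs


if __name__ == "__main__":
    bad = parse_hex_dump(LOGGED_PACKET)
    print("length:", len(bad))  # matches "nread 58" in the log

    # Hypothetical reference capture: a copy with one byte changed, standing
    # in for a real capture taken on a machine where the CRC check passes.
    reference = bytearray(bad)
    reference[10] ^= 0xFF
    print("differing offsets:", diff_offsets(bad, bytes(reference)))
```

If the two captures differ at a few scattered offsets, suspect corruption on the wire or in the driver; if everything after some offset differs, suspect dropped or shifted bytes; if they are identical, the bytes are arriving intact and the problem is in how the software interprets them.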

As a bit more detail, I had some recent experience with “weird” issues with custom devices on USB ports. Eventually, I narrowed it down to two issues with two of my use cases:

  1. See above regarding one of the USB-CAN adapters I have here, which incompletely clones the interface of another USB-CAN adapter. The “original” adapter handles endianness on the adapter side by detecting the host’s byte order. The “clone” has that detection, and acting on its result, marked as “TODO”. End result: the adapter fails very badly on big-endian Linux hosts.
  2. Another USB-CAN adapter I have does play nicely with big-endian hosts, but it turns out that Wireshark SSH remote capture is also NOT endian-clean: the captured data is garbage if a big-endian capture server is feeding a little-endian Wireshark instance. Saving the .pcapng file on the GL.iNet device and copying it over SSH after the capture completes works fine.

Just to add, OpenWrt uses some pretty extreme GCC optimization flags to make the code as small as possible.
Some years ago, a dev I was talking to who worked on Arduino was applying similar optimizations to the build of his programs. He even found some GCC bugs that way, on top of 1000 crashes in his code for no “apparent” reason.

@jolouis is totally right: if you write a program for, say, Ubuntu, you can’t expect it to work exactly the same on OpenWrt, given the differences in GCC optimizations, glibc, and all the surrounding libraries. Test the USB port using tools that ship with OpenWrt and are confirmed to work, then compare the results against your program.

Wow, thank you for so many leads and ideas.

Of course you’re right, there are so many possible reasons, but after staring at it for so long, sometimes it’s good to post what you have because the feedback leads you in the right direction.

In this case, it ended up being the way that the code was communicating and now we have cleaned up all of the CRC errors so things are perfect.

My concern was that perhaps there is/was some known problem with the gl-inet devices when it comes to using USB-to-serial methods. Sometimes there are buggy versions, things like that, so I wanted to eliminate whatever I could.

I don’t know enough to explain what the dev did but when I suggested that perhaps it was the way we were communicating, he seemed to find a solution.

Thanks again.