Switch reboot issue gl-ap1300

Hi,

I'm using a GL-AP1300 with OpenWrt 22.03.2. When the WAN Ethernet cable is connected and I issue the ‘reboot’ command, the boot hangs at the U-Boot init for a while (1-2 minutes), then continues with a lot of kernel warnings and finally comes up without working Ethernet.
Note that this seems to depend on the Ethernet hardware the GL-AP1300 is connected to: on some networks the boot is fine, on other networks the issue happens.

Any suggestions?

  • is there a fix (maybe a newer bootloader?)
  • is it possible to trigger a power cycle from the command line (instead of a reboot)?
  • can I somehow reset the switch or the Ethernet PHYs completely after the reboot? I tried:
swconfig dev switch0 set linkdown 1
swconfig dev switch0 set apply

as well as

cd /sys/devices/platform/soc/c000000.ess-switch/driver/
echo c000000.ess-switch > unbind
echo c000000.ess-switch > bind

but this re-triggers the same warnings. Neither approach brings the Ethernet PHYs back.

  • can I somehow shut down the switch or the Ethernet PHYs completely before the reboot?
  • any debug info that can help?

thnx

Tim

Serial console log (more complete logs in attachment seriallogs.zip (11.3 KB))

reboot
root@Ottie:/# [   55.969047] ath10k_ahb a800000.wifi: peer-unmap-event: unknown peer id 1
<...>
Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.BF.3.1.1-00120
S - IMAGE_VARIANT_STRING=DAABANAZA
S - OEM_IMAGE_VERSION_STRING=CRM
S - Boot Config, 0x00000021
S - Reset status Config, 0x00000010
S - Core 0 Frequency, 0 MHz
B -       261 - PBL, Start
B -      1339 - bootable_media_detect_entry, Start
B -      1678 - bootable_media_detect_success, Start
B -      1692 - elf_loader_entry, Start
B -      5069 - auth_hash_seg_entry, Start
B -      7211 - auth_hash_seg_exit, Start
B -    577089 - elf_segs_hash_verify_entry, Start
B -    694458 - PBL, End
B -    694482 - SBL1, Start
B -    785493 - pm_device_init, Start
D -         7 - pm_device_init, Delta
B -    786939 - boot_flash_init, Start
D -     52820 - boot_flash_init, Delta
B -    843898 - boot_config_data_table_init, Start
D -      3839 - boot_config_data_table_init, Delta - (419 Bytes)
B -    851109 - clock_init, Start
D -      7578 - clock_init, Delta
B -    863161 - CDT version:2,Platform ID:8,Major ID:1,Minor ID:1,Subtype:0
B -    866575 - sbl1_ddr_set_params, Start
B -    871672 - cpr_init, Start
D -         2 - cpr_init, Delta
B -    876055 - Pre_DDR_clock_init, Start
D -         4 - Pre_DDR_clock_init, Delta
D -     13177 - sbl1_ddr_set_params, Delta
B -    889794 - pm_driver_init, Start
D -         2 - pm_driver_init, Delta
B -    959926 - sbl1_wait_for_ddr_training, Start
D -        27 - sbl1_wait_for_ddr_training, Delta
B -    975546 - Image Load, Start
D -    152350 - QSEE Image Loaded, Delta - (297752 Bytes)
B -   1128325 - Image Load, Start
D -      1446 - SEC Image Loaded, Delta - (2048 Bytes)
B -   1138738 - Image Load, Start
D -    222252 - APPSBL Image Loaded, Delta - (454580 Bytes)
B -   1361388 - QSEE Execution, Start
D -        60 - QSEE Execution, Delta
B -   1367609 - SBL1, End
D -    675235 - SBL1, Delta
S - Flash Throughput, 2008 KB/s  (754799 Bytes,  375847 us)
S - DDR Frequency, 537 MHz


U-Boot 2012.07 [Chaos Calmer 15.05.1,1619990+r49254] (Mar 24 2021 - 17:28:16)

smem ram ptable found: ver: 1 len: 3
DRAM:  256 MiB
machid : 0x8010100
NAND:  spi_nand: spi_nand_flash_probe SF NAND ID 0:c2:12:c2
SF: Detected MX35LFxGE4AB with page size 2 KiB, total 128 MiB
SF: Detected W25Q32 with page size 4 KiB, total 4 MiB
ipq_spi: page_size: 0x100, sector_size: 0x1000, size: 0x400000
132 MiB
MMC:   
In:    serial
Out:   serial
Err:   serial
machid: 8010100
flash_type: 0

Net:   MAC0 addr:94:83:c4:22:da:5e
PHY ID1: 0x4d
PHY ID2: 0xd0b1

<stays stuck here for 1-2 min>

Then the boot continues and the kernel starts, but prints a lot of warnings:

ipq40xx_ess_sw_init done
eth0
Hit "gl" key to stop booting:  0
<...>
Starting kernel ...

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 5.10.146 (builder@buildhost) (arm-openwrt-linux-muslgnueabi-gcc (OpenWrt GCC 11.2.0 r19803-9a599fee93) 11.2.0, GNU ld (GNU Binutils) 2.37) #0 SMP Fri Oct 14 22:44:41 2022
[    0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=10c5387d
[    0.000000] CPU: div instructions available: patching division code
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] OF: fdt: Machine model: GL.iNet GL-AP1300
[    0.000000] Memory policy: Data cache writealloc
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000080000000-0x000000008fffffff]
[    0.000000]   HighMem  empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000080000000-0x0000000087dfffff]
[    0.000000]   node   0: [mem 0x0000000087e00000-0x0000000087ffffff]
[    0.000000]   node   0: [mem 0x0000000088000000-0x000000008fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000080000000-0x000000008fffffff]
[    0.000000] percpu: Embedded 15 pages/cpu s30860 r8192 d22388 u61440
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 64960
[    0.000000] Kernel command line: ubi.mtd=rootfs root=mtd:ubi_rootfs rootfstype=squashfs rootwait ubi.mtd=ubi root=/dev/ubiblock0_1
[    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
[    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 247172K/262144K available (6446K kernel code, 608K rwdata, 952K rodata, 1024K init, 246K bss, 14972K reserved, 0K cma-reserved, 0K highmem)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] 	Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
[    0.000000] arch_timer: cp15 timer(s) running at 48.00MHz (virt).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xb11fd3bfb, max_idle_ns: 440795203732 ns
[    0.000008] sched_clock: 56 bits at 48MHz, resolution 20ns, wraps every 4398046511096ns
[    0.000025] Switching to timer-based delay loop, resolution 20ns
[    0.000326] Calibrating delay loop (skipped), value calculated using timer frequency.. 96.00 BogoMIPS (lpj=480000)
[    0.000352] pid_max: default: 32768 minimum: 301
[    0.000536] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.000556] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.001588] CPU: Testing write buffer coherency: ok
[    0.001975] qcom_scm: convention: smc legacy
[    0.002938] Setting up static identity map for 0x80300000 - 0x8030003c
[    0.003100] rcu: Hierarchical SRCU implementation.
[    0.003372] dyndbg: Ignore empty _ddebug table in a CONFIG_DYNAMIC_DEBUG_CORE build
[    0.003745] smp: Bringing up secondary CPUs ...
[    0.007312] smp: Brought up 1 node, 4 CPUs
[    0.007338] SMP: Total of 4 processors activated (384.00 BogoMIPS).
[    0.007349] CPU: All CPU(s) started in SVC mode.
[    0.012119] VFP support v0.3: implementor 41 architecture 2 part 30 variant 7 rev 5
[    0.012289] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.012316] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[    0.012566] pinctrl core: initialized pinctrl subsystem
[    0.014377] NET: Registered protocol family 16
[    0.014761] DMA: preallocated 256 KiB pool for atomic coherent allocations
[    0.015861] thermal_sys: Registered thermal governor 'step_wise'
[    0.016301] cpuidle: using governor ladder
[    0.016359] cpuidle: using governor menu
[    0.041765] cryptd: max_cpu_qlen set to 1000
[    0.046083] usbcore: registered new interface driver usbfs
[    0.046157] usbcore: registered new interface driver hub
[    0.046224] usbcore: registered new device driver usb
[    0.046277] pps_core: LinuxPPS API ver. 1 registered
[    0.046290] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    0.046318] PTP clock support registered
[    0.048262] clocksource: Switched to clocksource arch_sys_counter
[    0.049232] NET: Registered protocol family 2
[    0.049487] IP idents hash table entries: 4096 (order: 3, 32768 bytes, linear)
[    0.050507] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 6144 bytes, linear)
[    0.050573] TCP established hash table entries: 2048 (order: 1, 8192 bytes, linear)
[    0.050614] TCP bind hash table entries: 2048 (order: 2, 16384 bytes, linear)
[    0.050664] TCP: Hash tables configured (established 2048 bind 2048)
[    0.050794] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.050838] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
[    0.051130] NET: Registered protocol family 1
[    0.051180] PCI: CLS 0 bytes, default 64
[    0.053658] workingset: timestamp_bits=14 max_order=16 bucket_order=2
[    0.058158] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.058183] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    0.186047] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 248)
[    0.189559] bam-dma-engine 8e04000.dma: num-channels unspecified in dt
[    0.189582] bam-dma-engine 8e04000.dma: num-ees unspecified in dt
[    0.190282] tcsr 1949000.tcsr: setting wifi_glb_cfg = 41000000
[    0.190382] tcsr 194b000.tcsr: setting usb hs phy mode select = e700e7
[    0.190474] tcsr 1953000.ess_tcsr: setting ess interface select = 0
[    0.190562] tcsr 1957000.tcsr: setting wifi_noc_memtype_m0_m2 = 2222222
[    0.190881] Serial: 8250/16550 driver, 16 ports, IRQ sharing enabled
[    0.192708] msm_serial 78af000.serial: msm_serial: detected port #0
[    0.192757] msm_serial 78af000.serial: uartclk = 1843200
[    0.192817] 78af000.serial: ttyMSM0 at MMIO 0x78af000 (irq = 33, base_baud = 115200) is a MSM
[    0.192848] msm_serial: console setup on port #0
[    0.742097] printk: console [ttyMSM0] enabled
[    0.747242] msm_serial: driver initialized
[    0.756320] loop: module loaded
[    0.757383] spi_qup 78b5000.spi: IN:block:16, fifo:64, OUT:block:16, fifo:64
[    0.759578] spi-nor spi0.0: w25q32 (4096 Kbytes)
[    0.765623] 8 fixed-partitions partitions found on MTD device spi0.0
[    0.770262] OF: Bad cell count for /soc/spi@78b5000/flash@0/partitions
[    0.776532] OF: Bad cell count for /soc/spi@78b5000/flash@0/partitions
[    0.783406] Creating 8 MTD partitions on "spi0.0":
[    0.789387] 0x000000000000-0x000000040000 : "SBL1"
[    0.794741] 0x000000040000-0x000000060000 : "MIBIB"
[    0.799557] 0x000000060000-0x0000000c0000 : "QSEE"
[    0.804163] 0x0000000c0000-0x0000000d0000 : "CDT"
[    0.809071] 0x0000000d0000-0x0000000e0000 : "DDRPARAMS"
[    0.813804] 0x0000000e0000-0x0000000f0000 : "APPSBLENV"
[    0.818852] 0x0000000f0000-0x000000170000 : "APPSBL"
[    0.824032] 0x000000170000-0x000000180000 : "ART"
[    0.833028] spi-nand spi0.1: Macronix SPI NAND was found.
[    0.833368] spi-nand spi0.1: 128 MiB, block size: 128 KiB, page size: 2048, OOB size: 64
[    0.839111] 1 fixed-partitions partitions found on MTD device spi0.1
[    0.846908] Creating 1 MTD partitions on "spi0.1":
[    0.853286] 0x000000000000-0x000008000000 : "ubi"
[    1.290322] ESS reset ok!
[    1.364585] ESS reset ok!
[    1.948842] PHY 4 single test PSGMII issue happen!
[    2.033119] PHY4 test see issue!
[    2.107376] ESS reset ok!
[    2.685792] PHY 4 single test PSGMII issue happen!
[    2.770111] PHY4 test see issue!
[    2.844387] ESS reset ok!
[    3.422785] PHY 4 single test PSGMII issue happen!
[    3.507094] PHY4 test see issue!
<...>
<boots to working command line>

But after this, eth1 (which was connected and working before the reboot) does not come up:

Tue Jan 10 08:58:47 2023 kern.info kernel: [  115.550263] ess_edma c080000.edma eth1: Link is Down
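
The two symptoms visible in the logs above (the repeated PSGMII self-test failures and the eth1 link-down message) can be checked for from the command line after each reboot. A minimal sketch, assuming the log lines look exactly like the ones pasted in this post; the function names `count_psgmii_failures` and `last_link_event` are made up for illustration and are not part of OpenWrt:

```shell
#!/bin/sh
# Hypothetical helpers; the match strings are copied from the serial log above.

# Count PSGMII self-test failures in kernel log text read from stdin.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# function usable under "set -e" while still printing 0.
count_psgmii_failures() {
    grep -c 'PSGMII issue happen' || true
}

# Print the most recent "Link is Up/Down" message for the interface
# named in $1, taken from log text read from stdin.
last_link_event() {
    grep "$1: Link is " | tail -n 1
}
```

On the device these could be fed from the kernel and system logs, for example `dmesg | count_psgmii_failures` and `logread | last_link_event eth1`. Recording both values per reboot and per network may help narrow down which link partners trigger the problem.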

Does anyone have an idea how to fix this?
Ethernet not working after (soft) reboot…

Does the same problem occur if the device connected to the WAN port (the link partner) is replaced by another device? Can you provide the model of the link partner?

Please add this patch.

Hi, it does not happen with all peer devices. We saw it on 2 out of the 50 networks we tried. It seems to happen with 2 different routers (I don’t have the model details).