[Macchiato] EDKII grub boot fails with PCIe init
Frederik Lotter
frederik.lotter at netronome.com
Tue Mar 13 07:54:03 GMT 2018
On Tue, Mar 13, 2018 at 8:48 AM, Marcin Wojtas <mw at semihalf.com> wrote:
> 2018-03-13 6:49 GMT+01:00 Frederik Lotter <frederik.lotter at netronome.com>:
> > On 12 Mar 2018 6:45 PM, "Marcin Wojtas" <mw at semihalf.com> wrote:
> >
> > Yes, it's the latest - I've just pushed a fix for OS variables write,
> > so you can fetch. I have a couple of questions:
> >
> > 0. What version of the board are you using (is the PCIE-reset fix there)?
> >
> >
> > We have v1.3 revision boards.
> >
> >
> > 1. What kind of pcie device are you using?
> >
> >
> > We have Smart NIC cards, but this board has nothing plugged in.
> >
> >
> > 2. Does the boot from u-boot always succeed with the pcie device?
> >
> >
> > Yes. It appears 100% reliable in comparison with EFI boot.
> >
> > So my conclusion is either the EDK Phy setup is different, or the DT data
> > is somehow different.
>
> Phy setup should be aligned; fortunately it will soon be common. DT is
> a bit different as it uses a 512MB MMIO32 region, but this should not
> have any impact. Overall this is strange, I've booted kernel hundreds
> of times without issues. However I usually had at least e1000 NIC
> plugged in my setup (now also x4 GPU card on the second board). So I
> have a couple of requests, so that I can better understand the issue:
>
> 1. Can you provide me with full boot log (beginning from first lines
> after power-on/reset) from both u-boot and uefi?
>
> 2. Can you plug the smart nic and try efi boot to see if the problem
> persists?
>
> 3. Can you try cross-checking the DT
> - take armada-8040-mcbin.dtb from Platforms/Marvell/Armada and boot
> your linux from u-boot?
> - take the DT from u-boot and substitute above file in uefi, rebuild
> your image and boot?
>
I will do the above, but note that I have already replaced the
armada-8040-mcbin.dtb in the edk2-platforms with the files generated by the
kernel build:

DTS: ~/work/edk2-open-platform/Platforms/Marvell/Armada/armada-8040-db.dts
     <-- linux/arch/arm64/boot/dts/marvell/.armada-8040-mcbin.dtb.dts.tmp
DTB: ~/work/edk2-open-platform/Platforms/Marvell/Armada/armada-8040-db.dtb
     <-- linux/arch/arm64/boot/dts/marvell/.armada-8040-mcbin.dtb

I ran md5sum on both sets and the checksums match on both sides.
Let me know if you still want me to try something here.
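
For anyone who wants to repeat the checksum comparison without md5sum, here is
a minimal Python sketch. It is only an illustration: the helper names md5_of
and same_contents are made up for this example, and the file paths are
whatever you pass on the command line, not the exact ones from my tree.

```python
import hashlib
import sys


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if both files hash to the same MD5 digest."""
    return md5_of(path_a) == md5_of(path_b)


if __name__ == "__main__":
    # Hypothetical usage: python compare_dtb.py <edk2 copy> <kernel build copy>
    if len(sys.argv) == 3:
        a, b = sys.argv[1], sys.argv[2]
        print(md5_of(a), a)
        print(md5_of(b), b)
        print("match" if same_contents(a, b) else "MISMATCH")
```

A matching digest only shows that the two files are byte-identical; it does
not say anything about which DTB the firmware actually hands to the kernel.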
> Thanks,
> Marcin
>
> >
> >
> > Thanks,
> > Marcin
> >
> >
> > 2018-03-12 17:31 GMT+01:00 Frederik Lotter <frederik.lotter at netronome.com>:
> >> 150e1b404d2e3b49574ad57e485827be12270ab9
> >>
> >> I believe this is HEAD, committed on 6th of March?
> >>
> >> On Mon, Mar 12, 2018 at 6:23 PM, Marcin Wojtas <mw at semihalf.com> wrote:
> >>>
> >>> Hi Frederik,
> >>>
> >>> Which commit ID of marvell-armada-wip are you building your binary from?
> >>>
> >>> Thanks,
> >>> Marcin
> >>>
> >>> 2018-03-12 17:22 GMT+01:00 Ard Biesheuvel <ard.biesheuvel at linaro.org>:
> >>> > On 12 March 2018 at 16:15, Frederik Lotter
> >>> > <frederik.lotter at netronome.com> wrote:
> >>> >> Hi,
> >>> >> I am getting CPU stall warnings when booting up using the EFI route. I
> >>> >> suspect the PCIe interface, as the stall warning sometimes contains the
> >>> >> probe function. Other times it seems to get further than PCIe init, but
> >>> >> still stalls in interrupt handling.
> >>> >> Here are some facts around my observation:
> >>> >>
> >>> >> I have two sdcards for my MACCHIATObin board. They have identical kernels
> >>> >> (4.16 rc5) with Ubuntu 16.04 rootfs. The first sdcard uses a u-boot, DT
> >>> >> and kernel boot. The second sdcard has an EDKII, grub, kernel boot. The
> >>> >> EDKII build includes the device tree DTB (and DTS, which I believe is
> >>> >> unused) from the one used on the u-boot sdcard.
> >>> >>
> >>> >> EFI stub: Booting Linux Kernel...
> >>> >> EFI stub: Using DTB from configuration table
> >>> >> EFI stub: Exiting boot services and installing virtual address map...
> >>> >> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd081]
> >>> >> [ 0.000000] Linux version 4.16.0-rc5-mbcin-netronome-2-dirty (root at mcb1-cpt) (gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)) #2 SMP PREEMPT Mon Mar 12 14:40:25 UTC 2018
> >>> >> [ 0.000000] Machine model: Marvell 8040 MACHIATOBin
> >>> >> [ 0.000000] efi: Getting EFI parameters from FDT:
> >>> >> [ 0.000000] efi: EFI v2.70 by EDK II
> >>> >> [ 0.000000] efi: SMBIOS 3.0=0xbfd00000 ACPI 2.0=0xb6760000 MEMATTR=0xb8973418 RNG=0xbffdbf98
> >>> >> [ 0.000000] random: fast init done
> >>> >> [ 0.000000] efi: seeding entropy pool
> >>> >> :
> >>> >>
> >>> >> (I am using the latest EDKII master, the Marvell edk2-open-platform 17.10
> >>> >> branch, with all the latest mv-ddr / atf / etc.)
> >>> >>
> >>> >> The DT data appears to be there in the EFI boot, but the PCIe interface
> >>> >> fails, and results (I believe) in the CPU stall warnings:
> >>> >>
> >>> >> [ 717.453025] INFO: rcu_preempt self-detected stall on CPU
> >>> >> :
> >>> >> :
> >>> >> [ 717.589783] armada8k_pcie_probe+0x140/0x240
> >>> >> :
> >>> >>
> >>> >> Other times, the pcie gets further:
> >>> >>
> >>> >> [ 3.312127] PCI: OF: host bridge /cp0/pcie at f2600000 ranges:
> >>> >> [ 3.317740] PCI: OF: IO 0xf9000000..0xf900ffff -> 0xf9000000
> >>> >> [ 3.323692] PCI: OF: MEM 0xc0000000..0xdfffffff -> 0xc0000000
> >>> >> [ 3.328915] random: crng init done
> >>> >> [ 4.326158] armada8k-pcie f2600000.pcie: phy link never came up
> >>> >> [ 4.332109] armada8k-pcie f2600000.pcie: Link not up after reconfiguration
> >>> >> [ 4.339056] armada8k-pcie f2600000.pcie: PCI host bridge to bus 0000:00
> >>> >
> >>> >
> >>> > To be brutally honest, the armada8k-pcie driver is a piece of junk,
> >>> > and you're much better off using the generic ECAM driver, which now
> >>> > includes special handling for the missing root port on Synopsys IP.
> >>> >
> >>> > It also allows you to have both MMIO32 and MMIO64 regions, which can
> >>> > be useful with some PCIe cards with large BARs
> >>> >
> >>> > Could you try
> >>> >
> >>> > compatible = "marvell,armada8k-pcie-ecam";
> >>> >
> >>> > in the DT node, please?
> >>> >
> >>> > (Before you do that, please check whether UEFI recognizes your PCI
> >>> > hardware using the 'pci' command in the shell)
> >>> >
> >>> >
> >>> >> [ 4.345705] pci_bus 0000:00: root bus resource [bus 00-ff]
> >>> >> [ 4.351217] pci_bus 0000:00: root bus resource [io 0x0000-0xffff] (bus address [0xf9000000-0xf900ffff])
> >>> >> [ 4.360741] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xdfffffff]
> >>> >> [ 4.367659] pci 0000:00:00.0: [11ab:0110] type 01 class 0x060400
> >>> >> [ 4.373708] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit]
> >>> >> [ 4.380562] pci 0000:00:00.0: supports D1 D2
> >>> >> [ 4.384853] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
> >>> >> [ 4.390697] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> >>> >> [ 4.398771] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
> >>> >> [ 4.405427] pci 0000:00:00.0: BAR 0: assigned [mem 0xc0000000-0xc00fffff 64bit]
> >>> >> [ 4.412776] pci 0000:00:00.0: PCI bridge to [bus 01]
> >>> >> [ 4.725111] pcieport 0000:00:00.0: Signaling PME with IRQ 56
> >>> >> [ 4.730842] pcieport 0000:00:00.0: AER enabled with IRQ 56
> >>> >>
> >>> >> but then CPUs are still stalled on some incoming IRQ
> >>> >>
> >>> >> [ 87.352768] dump_backtrace+0x0/0x150
> >>> >> [ 87.356445] show_stack+0x14/0x20
> >>> >> [ 87.359773] sched_show_task+0x14c/0x170
> >>> >> [ 87.363711] dump_cpu_task+0x40/0x50
> >>> >> [ 87.367300] rcu_dump_cpu_stacks+0x94/0xd8
> >>> >> [ 87.371412] rcu_check_callbacks+0x7ac/0x980
> >>> >> [ 87.375700] update_process_times+0x2c/0x58
> >>> >> [ 87.379900] tick_sched_handle.isra.5+0x30/0x50
> >>> >> [ 87.384449] tick_sched_timer+0x40/0x90
> >>> >> [ 87.388301] __hrtimer_run_queues+0x124/0x198
> >>> >> [ 87.392676] hrtimer_interrupt+0xe4/0x240
> >>> >> [ 87.396701] arch_timer_handler_phys+0x30/0x40
> >>> >> [ 87.401163] handle_percpu_devid_irq+0x78/0x130
> >>> >> [ 87.405712] generic_handle_irq+0x24/0x38
> >>> >> [ 87.409738] __handle_domain_irq+0x5c/0xb8
> >>> >> [ 87.413850] gic_handle_irq+0x58/0xb0
> >>> >> [ 87.417526] el1_irq+0xb0/0x128
> >>> >> [ 87.420678] __do_softirq+0xb0/0x228
> >>> >> [ 87.424267] irq_exit+0xbc/0xf0
> >>> >> [ 87.427421] __handle_domain_irq+0x60/0xb8
> >>> >> [ 87.431533] gic_handle_irq+0x58/0xb0
> >>> >> [ 87.435209] el1_irq+0xb0/0x128
> >>> >>
> >>> >> Is anyone aware of any issue like this?
> >>> >>
> >>> >> Regards,
> >>> >> Fred
> >>> >>
> >>> >>
> >>> >> _______________________________________________
> >>> >> Macchiato mailing list
> >>> >> Macchiato at lists.einval.com
> >>> >> https://lists.einval.com/cgi-bin/mailman/listinfo/macchiato
> >>> >
> >>
> >>
> >
> >
>