[Macchiato] RE: EDKII grub boot fails with PCIe init

Marcin Wojtas mw at semihalf.com
Tue Mar 13 07:16:38 GMT 2018


Frederik,

I forgot one important piece of information: yesterday I pushed the DT alignment to v4.16-rc4 on top of marvell-armada-wip. Please use it and let me know how it works in your setup.

Thanks,
Marcin

From: Marcin Wojtas
Sent: Tuesday, 13 March 2018 07:48
To: Frederik Lotter
Cc: macchiato at lists.einval.com; Ard Biesheuvel
Subject: Re: [Macchiato] EDKII grub boot fails with PCIe init

2018-03-13 6:49 GMT+01:00 Frederik Lotter <frederik.lotter at netronome.com>:
> On 12 Mar 2018 6:45 PM, "Marcin Wojtas" <mw at semihalf.com> wrote:
>
> Yes, it's the latest - I've just pushed a fix for OS variables write,
> so you can fetch. I have a couple of questions:
>
> 0. What version of the board are you using (is the PCIE-reset fix there)?
>
>
> We have v1.3 revision boards.
>
>
> 1. What kind of pcie device are you using?
>
>
> We have Smart NIC cards, but this board has nothing plugged in.
>
>
> 2. Does the boot from u-boot always succeed with the pcie device?
>
>
> Yes. It appears 100% reliable in comparison with EFI boot.
>
> So my conclusion is either the EDK Phy setup is different, or the DT data is
> somehow different.

The PHY setup should be aligned, and fortunately it will soon be common.
The DT is a bit different, as it uses a 512MB MMIO32 region, but this
should not have any impact. Overall this is strange; I've booted the
kernel hundreds of times without issues. However, I usually had at least
an e1000 NIC plugged into my setup (now also an x4 GPU card on the second
board). I have a couple of requests so that I can better understand the
issue:

1. Can you provide me with the full boot logs (beginning from the first
lines after power-on/reset) from both u-boot and UEFI?

2. Can you plug in the Smart NIC and try an EFI boot to see if the problem persists?

3. Can you try cross-checking the DT:
- take armada-8040-mcbin.dtb from Platforms/Marvell/Armada and boot
your Linux from u-boot?
- take the DT from u-boot, substitute it for the above file in UEFI,
rebuild your image and boot?
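
Before swapping the DTBs, it may also help to compare the two blobs
directly. The following is only a rough sketch, not something taken from
the tree: it assumes the `dtc` tool is installed, and the function names
and default file labels are made up for illustration. It decompiles each
DTB to DTS text (sorted, so node ordering does not add noise) and prints
a unified diff.

```python
import difflib
import subprocess
import sys


def decompile_dtb(dtb_path):
    """Decompile a DTB to sorted DTS text using dtc (assumed to be installed)."""
    result = subprocess.run(
        ["dtc", "-I", "dtb", "-O", "dts", "-s", dtb_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def diff_dts(dts_a, dts_b, name_a="uboot.dts", name_b="uefi.dts"):
    """Return the unified-diff lines between two DTS texts (hypothetical labels)."""
    return list(difflib.unified_diff(
        dts_a.splitlines(), dts_b.splitlines(),
        fromfile=name_a, tofile=name_b, lineterm="",
    ))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python dtdiff.py <u-boot.dtb> <uefi.dtb>
    for line in diff_dts(decompile_dtb(sys.argv[1]), decompile_dtb(sys.argv[2])):
        print(line)
```

Differences in the pcie node (compatible string, ranges, reg) would be
the first thing to look at in the output.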

Thanks,
Marcin

>
>
> Thanks,
> Marcin
>
>
> 2018-03-12 17:31 GMT+01:00 Frederik Lotter <frederik.lotter at netronome.com>:
>> 150e1b404d2e3b49574ad57e485827be12270ab9
>>
>> I believe this is HEAD, committed on 6th of March?
>>
>> On Mon, Mar 12, 2018 at 6:23 PM, Marcin Wojtas <mw at semihalf.com> wrote:
>>>
>>> Hi Frederik,
>>>
>>> Which commit ID of marvell-armada-wip are you building your binary from?
>>>
>>> Thanks,
>>> Marcin
>>>
>>> 2018-03-12 17:22 GMT+01:00 Ard Biesheuvel <ard.biesheuvel at linaro.org>:
>>> > On 12 March 2018 at 16:15, Frederik Lotter
>>> > <frederik.lotter at netronome.com> wrote:
>>> >> Hi,
>>> >> I am getting CPU stall warnings when booting via the EFI route. I
>>> >> suspect the PCIe interface, as the stall warning sometimes contains the
>>> >> probe function. Other times it seems to get further than PCIe init, but
>>> >> still stalls in interrupt handling.
>>> >> Here are some facts around my observation:
>>> >>
>>> >> I have two SD cards for my MACCHIATObin board. They have identical
>>> >> kernels (4.16-rc5) with an Ubuntu 16.04 rootfs. The first SD card uses
>>> >> a U-Boot, DT and kernel boot. The second SD card uses an EDKII, GRUB
>>> >> and kernel boot. The EDKII build includes the device tree DTB (and DTS,
>>> >> which I believe is unused) from the one used on the U-Boot SD card.
>>> >>
>>> >> EFI stub: Booting Linux Kernel...
>>> >> EFI stub: Using DTB from configuration table
>>> >> EFI stub: Exiting boot services and installing virtual address map...
>>> >> [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd081]
>>> >> [    0.000000] Linux version 4.16.0-rc5-mbcin-netronome-2-dirty (root@mcb1-cpt) (gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)) #2 SMP PREEMPT Mon Mar 12 14:40:25 UTC 2018
>>> >> [    0.000000] Machine model: Marvell 8040 MACHIATOBin
>>> >> [    0.000000] efi: Getting EFI parameters from FDT:
>>> >> [    0.000000] efi: EFI v2.70 by EDK II
>>> >> [    0.000000] efi:  SMBIOS 3.0=0xbfd00000  ACPI 2.0=0xb6760000  MEMATTR=0xb8973418  RNG=0xbffdbf98
>>> >> [    0.000000] random: fast init done
>>> >> [    0.000000] efi: seeding entropy pool
>>> >> :
>>> >>
>>> >> (I am using the latest EDKII master, the Marvell edk2-open-platform
>>> >> 17.10 branch, with all the latest mv-ddr, ATF, etc.)
>>> >>
>>> >> The DT data appears to be present in the EFI boot, but the PCIe
>>> >> interface fails and results (I believe) in the CPU stall warnings:
>>> >>
>>> >> [  717.453025] INFO: rcu_preempt self-detected stall on CPU
>>> >> :
>>> >> :
>>> >> [  717.589783]  armada8k_pcie_probe+0x140/0x240
>>> >> :
>>> >>
>>> >> Other times, the pcie gets further:
>>> >>
>>> >> [    3.312127] PCI: OF: host bridge /cp0/pcie@f2600000 ranges:
>>> >> [    3.317740] PCI: OF:    IO 0xf9000000..0xf900ffff -> 0xf9000000
>>> >> [    3.323692] PCI: OF:   MEM 0xc0000000..0xdfffffff -> 0xc0000000
>>> >> [    3.328915] random: crng init done
>>> >> [    4.326158] armada8k-pcie f2600000.pcie: phy link never came up
>>> >> [    4.332109] armada8k-pcie f2600000.pcie: Link not up after reconfiguration
>>> >> [    4.339056] armada8k-pcie f2600000.pcie: PCI host bridge to bus 0000:00
>>> >
>>> >
>>> > To be brutally honest, the armada8k-pcie driver is a piece of junk,
>>> > and you're much better off using the generic ECAM driver, which now
>>> > includes special handling for the missing root port on Synopsys IP.
>>> >
>>> > It also allows you to have both MMIO32 and MMIO64 regions, which can
>>> > be useful with some PCIe cards with large BARs
>>> >
>>> > Could you try
>>> >
>>> > compatible = "marvell,armada8k-pcie-ecam";
>>> >
>>> > in the DT node, please?
>>> >
>>> > (Before you do that, please check whether UEFI recognizes your PCI
>>> > hardware using the 'pci' command in the shell)
>>> >
>>> >
>>> >> [    4.345705] pci_bus 0000:00: root bus resource [bus 00-ff]
>>> >> [    4.351217] pci_bus 0000:00: root bus resource [io  0x0000-0xffff] (bus address [0xf9000000-0xf900ffff])
>>> >> [    4.360741] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xdfffffff]
>>> >> [    4.367659] pci 0000:00:00.0: [11ab:0110] type 01 class 0x060400
>>> >> [    4.373708] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit]
>>> >> [    4.380562] pci 0000:00:00.0: supports D1 D2
>>> >> [    4.384853] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
>>> >> [    4.390697] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
>>> >> [    4.398771] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
>>> >> [    4.405427] pci 0000:00:00.0: BAR 0: assigned [mem 0xc0000000-0xc00fffff 64bit]
>>> >> [    4.412776] pci 0000:00:00.0: PCI bridge to [bus 01]
>>> >> [    4.725111] pcieport 0000:00:00.0: Signaling PME with IRQ 56
>>> >> [    4.730842] pcieport 0000:00:00.0: AER enabled with IRQ 56
>>> >>
>>> >> but then CPUs are still stalled on some incoming IRQ
>>> >>
>>> >> [   87.352768]  dump_backtrace+0x0/0x150
>>> >> [   87.356445]  show_stack+0x14/0x20
>>> >> [   87.359773]  sched_show_task+0x14c/0x170
>>> >> [   87.363711]  dump_cpu_task+0x40/0x50
>>> >> [   87.367300]  rcu_dump_cpu_stacks+0x94/0xd8
>>> >> [   87.371412]  rcu_check_callbacks+0x7ac/0x980
>>> >> [   87.375700]  update_process_times+0x2c/0x58
>>> >> [   87.379900]  tick_sched_handle.isra.5+0x30/0x50
>>> >> [   87.384449]  tick_sched_timer+0x40/0x90
>>> >> [   87.388301]  __hrtimer_run_queues+0x124/0x198
>>> >> [   87.392676]  hrtimer_interrupt+0xe4/0x240
>>> >> [   87.396701]  arch_timer_handler_phys+0x30/0x40
>>> >> [   87.401163]  handle_percpu_devid_irq+0x78/0x130
>>> >> [   87.405712]  generic_handle_irq+0x24/0x38
>>> >> [   87.409738]  __handle_domain_irq+0x5c/0xb8
>>> >> [   87.413850]  gic_handle_irq+0x58/0xb0
>>> >> [   87.417526]  el1_irq+0xb0/0x128
>>> >> [   87.420678]  __do_softirq+0xb0/0x228
>>> >> [   87.424267]  irq_exit+0xbc/0xf0
>>> >> [   87.427421]  __handle_domain_irq+0x60/0xb8
>>> >> [   87.431533]  gic_handle_irq+0x58/0xb0
>>> >> [   87.435209]  el1_irq+0xb0/0x128
>>> >>
>>> >> Is anyone aware of any issue like this?
>>> >>
>>> >> Regards,
>>> >> Fred
>>> >>
>>> >>
>>> >> _______________________________________________
>>> >> Macchiato mailing list
>>> >> Macchiato at lists.einval.com
>>> >> https://lists.einval.com/cgi-bin/mailman/listinfo/macchiato
>>> >
>>
>>
>
>
