[Macchiato] EDKII grub boot fails with PCIe init

Marcin Wojtas mw at semihalf.com
Tue Mar 13 14:51:00 GMT 2018


Hi Frederik,

2018-03-13 13:36 GMT+01:00 Frederik Lotter <frederik.lotter at netronome.com>:
> Hi Marcin,
>
> I am attaching text files, not sure how it works with the list - tell me if
> there is another way.
>
> On Tue, Mar 13, 2018 at 11:27 AM, Marcin Wojtas <mw at semihalf.com> wrote:
>>
>> 2018-03-13 8:54 GMT+01:00 Frederik Lotter <frederik.lotter at netronome.com>:
>> > On Tue, Mar 13, 2018 at 8:48 AM, Marcin Wojtas <mw at semihalf.com> wrote:
>> >>
>> >> 2018-03-13 6:49 GMT+01:00 Frederik Lotter
>> >> <frederik.lotter at netronome.com>:
>> >> > On 12 Mar 2018 6:45 PM, "Marcin Wojtas" <mw at semihalf.com> wrote:
>> >> >
>> >> > Yes, it's the latest - I've just pushed a fix for OS variables write,
>> >> > so you can fetch. I have a couple of questions:
>> >> >
>> >> > 0. What version of the board are you using (is the PCIE-reset fix
>> >> > there)?
>> >> >
>> >> >
>> >> > We have v1.3 revision boards.
>> >> >
>> >> >
>> >> > 1. What kind of pcie device are you using?
>> >> >
>> >> >
>> >> > We have Smart NIC cards, but this board has nothing plugged in.
>> >> >
>> >> >
>> >> > 2. Does the boot from u-boot always succeed with the pcie device?
>> >> >
>> >> >
>> >> > Yes. It appears 100% reliable in comparison with EFI boot.
>> >> >
>> >> > So my conclusion is either the EDK Phy setup is different, or the DT
>> >> > data is
>> >> > somehow different.
>> >>
>> >> Phy setup should be aligned, soon it will be common fortunately. DT is
>> >> a bit different as it uses 512MB MMIO32 region, but this should not
>> >> have any impact. Overall this is strange, I've booted kernel hundreds
>> >> of times without issues. However I usually had at least e1000 NIC
>> >> plugged in my setup (now also x4 GPU card on the second board). So I
>> >> have a couple of requests, so that I can better understand the issue:
>> >>
>> >> 1. Can you provide me with full boot log (beginning from first lines
>> >> after power-on/reset) from both u-boot and uefi?
>
>
> Attached:
>
>>
>> >>
>> >> 2. Can you plug the smart nic and try efi boot to see if the problem
>> >> persists?
>
>
> Will try tomorrow - I am working from home and cannot locate anything to
> plug in.
>
>>
>> >>
>> >> 3. Can you try cross-checking the DT
>> >> - take armada-8040-mcbin.dtb from Platforms/Marvell/Armada and boot
>> >> your linux from u-boot?
>> >> - take the DT from u-boot and substitute above file in uefi, rebuild
>> >> your image and boot?
>> >
>> >
>> > I will do the above, but I have replaced the armada-8040-mcbin.dtb in
>> > the
>> > edk2-platforms with the files generated by the kernel build.
>> >
>> > DTS:
>> > ~/work/edk2-open-platform/Platforms/Marvell/Armada/armada-8040-db.dts
>> > <-- linux/arch/arm64/boot/dts/marvell/.armada-8040-mcbin.dtb.dts.tmp
>> > DTB:
>> > ~/work/edk2-open-platform/Platforms/Marvell/Armada/armada-8040-db.dtb
>> > <-- linux/arch/arm64/boot/dts/marvell/.armada-8040-mcbin.dtb
>> >
>> > I did md5sum on both sets and they match on both sides.
>> >
>> > Let me know if you still want me to try something here.
>> >
>>
>> Yes, check it :)
>>
>> Also please boot unmodified marvell-armada-wip with updated DT:
>> Latest commit is (b316449f24fda)
>
>
> I updated to the latest commit for edk2-platforms.

Are you sure the log you sent is from the newest marvell-armada-wip?
In the problematic log I can see two things:
- The hang does not look PCIe-related at all (driver init passes, and
the link is down only because there is no card in the slot).
- The device tree looks old, judging by the RTC lines:

[    2.988908] armada38x-rtc f4284000.rtc: rtc core: registered f4284000.rtc as rtc0
[    2.996544] rtc-efi rtc-efi: rtc core: registered rtc-efi as rtc1
[...]
[    4.732965] armada38x-rtc f4284000.rtc: setting system clock to 2018-03-13 09:40:00 UTC (1520934000)

The new device tree fixes that, and the log should read:

[    1.956873] rtc-efi rtc-efi: rtc core: registered rtc-efi as rtc0
[...]
[    3.418276] rtc-efi rtc-efi: setting system clock to 2018-03-13 14:29:43 UTC (1520951383)

The new device tree also fixes OS-level access to the variables on
McBin, which previously resulted in RCU stalls.
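
If it helps, the RTC check can be scripted. This is only a rough
sketch, assuming the rule of thumb from this mail (rtc-efi registered
as rtc0 means the updated device tree is in use). The names
rtc0_driver and uses_updated_dt are made up for illustration and are
not part of any existing tool.

```python
import re

# Matches a kernel "rtc core: registered ... as rtc0" message.
# \s+ also spans newlines, so entries that a mail client wrapped
# across two lines still match.
RTC0_PATTERN = re.compile(
    r"\]\s+(\S+)\s+\S+:\s+rtc core:\s+registered\s+\S+\s+as\s+rtc0\b"
)


def rtc0_driver(log_text):
    """Return the name of the driver registered as rtc0, or None."""
    match = RTC0_PATTERN.search(log_text)
    return match.group(1) if match else None


def uses_updated_dt(log_text):
    """Heuristic (assumption from this thread): rtc-efi as rtc0
    indicates the updated device tree."""
    return rtc0_driver(log_text) == "rtc-efi"
```

Reading a saved boot log into a string and passing it to
uses_updated_dt() gives a quick yes/no before sending the log.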

Please do a clean build of marvell-armada-wip at b316449f24fda, boot
it, and send me the log.

Thanks,
Marcin


