[Macchiato] EDKII grub boot fails with PCIe init
Frederik Lotter
frederik.lotter at netronome.com
Fri Apr 13 10:59:09 BST 2018
Hi Marcin,
On Wed, Mar 28, 2018 at 12:23 PM, Frederik Lotter <
frederik.lotter at netronome.com> wrote:
>
>
> On Wed, Mar 28, 2018 at 12:11 PM, Marcin Wojtas <mw at semihalf.com> wrote:
>
>>
>>
>> 2018-03-28 12:00 GMT+02:00 Frederik Lotter <frederik.lotter at netronome.com
>> >:
>>
>>> Hi Marcin,
>>>
>>> On Fri, Mar 23, 2018 at 1:38 PM, Frederik Lotter <
>>> frederik.lotter at netronome.com> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Mar 23, 2018 at 1:02 PM, Marcin Wojtas <mw at semihalf.com> wrote:
>>>>
>>>>> Frederik,
>>>>>
>>>>>
>>>>> 2018-03-23 11:02 GMT+01:00 Marcin Wojtas <mw at semihalf.com>:
>>>>> > Hi Frederik,
>>>>> >
>>>>> > 2018-03-23 8:07 GMT+01:00 Frederik Lotter <
>>>>> frederik.lotter at netronome.com>:
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Fri, Mar 16, 2018 at 5:16 PM, Marcin Wojtas <mw at semihalf.com>
>>>>> wrote:
>>>>> >>>
>>>>> >>> Hi Frederik,
>>>>> >>>
>>>>> >>> 2018-03-16 14:04 GMT+01:00 Frederik Lotter <
>>>>> frederik.lotter at netronome.com>:
>>>>> >>> >
>>>>> >>> > On Wed, Mar 14, 2018 at 10:56 AM, Frederik Lotter <
>>>>> frederik.lotter at netronome.com> wrote:
>>>>> >>> >>
>>>>> >>> >> On Wed, Mar 14, 2018 at 10:13 AM, Frederik Lotter <
>>>>> frederik.lotter at netronome.com> wrote:
>>>>> >>> >>>
>>>>> >>> >>> On Tue, Mar 13, 2018 at 6:23 PM, Marcin Wojtas <
>>>>> mw at semihalf.com> wrote:
>>>>> >>> >>>>
>>>>> >>> >>>> Frederic,
>>>>> >>> >>>>
>>>>> >>> >>>> >>> Please use the size I suggested for the 'reg' property
>>>>> >>> >>>> >>
>>>>> >>> >>>> >>
>>>>> >>> >>>> >> Sorry I missed that:
>>>>> >>> >>>> >>
>>>>> >>> >>>> >> [ 1.463413] PCI: OF: host bridge /cp0/pcie@0xe0000000 ranges:
>>>>> >>> >>>> >> [ 1.463427] PCI: OF: IO 0xeff00000..0xeff0ffff -> 0x00000000
>>>>> >>> >>>> >> [ 1.463435] PCI: OF: MEM 0xc0000000..0xdfffffff -> 0xc0000000
>>>>> >>> >>>> >> [ 1.463442] PCI: OF: MEM 0x800000000..0x8ffffffff -> 0x800000000
>>>>> >>> >>>> >> [ 1.463481] pci-host-generic e0000000.pcie: ECAM at [mem 0xe0000000-0xefefffff] for [bus 00-fe]
>>>>> >>> >>>> >> [ 1.463525] pci-host-generic e0000000.pcie: PCI host bridge to bus 0000:00
>>>>> >>> >>>> >> [ 1.463531] pci_bus 0000:00: root bus resource [bus 00-fe]
>>>>> >>> >>>> >> [ 1.463536] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]
>>>>> >>> >>>> >> [ 1.463541] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xdfffffff]
>>>>> >>> >>>> >> [ 1.463547] pci_bus 0000:00: root bus resource [mem 0x800000000-0x8ffffffff]
>>>>> >>> >>>> >>
>>>>> >>> >>>> >>
>>>>> >>> >>>> >> So I assume this works, and I am super grateful. I will
>>>>> test it tomorrow
>>>>> >>> >>>> >> with our Smart NIC.
>>>>> >>> >>>> >>
>>>>> >>> >>>>
>>>>> >>> >>>> Please also check pure branch (unmodified DT) as requested in
>>>>> my
>>>>> >>> >>>> previous email . According to the bootlog the stall was
>>>>> observed after
>>>>> >>> >>>> pcie driver init and without latest patch I mentioned. So I'd
>>>>> like to
>>>>> >>> >>>> make sure that we're on the same side here.
>>>>> >>> >>>
>>>>> >>> >>>
>>>>> >>> >>> I attached two logs files. The only difference is the one also
>>>>> uses an initrd.
>>>>> >>> >>>
>>>>> >>> >>> I now have a pcie card in - and it looks much better.
>>>>> >>> >>>
>>>>> >>> >>> I am starting to feel like this whole exercise is going to
>>>>> discover something unrelated and stupid I have done, so I apologize in
>>>>> advance. Hopefully the journey will help someone else as well.
>>>>> >>> >>>
>>>>> >>> >>> Any idea what could be failing? There seems to be some issue
>>>>> >>> >>> mounting the root filesystem (or at the same time). In the initrd
>>>>> >>> >>> case it seems to complain about mmcblk0 (the onboard flash, which I
>>>>> >>> >>> do not use).
>>>>> >>> >>>
>>>>> >>> >>> I attach my grub file just in case - I am new to grub
>>>>> >>> >>> (note I hacked 4 entries towards the end).
>>>>> >>> >>
>>>>> >>> >>
>>>>> >>> >> Here are 3 more logs, I will not send more - but I thought
>>>>> perhaps this could complete the picture.
>>>>> >>> >>
>>>>> >>> >> 1. One without a PCIe card - successful PCIe init
>>>>> >>> >>
>>>>> >>> >> 2. One with a card, but with the PCIe probe causing the stall
>>>>> (this is not often seen)
>>>>> >>> >>
>>>>> >>> >> 3. The other stall, after successful PCIe init
>>>>> >>> >
>>>>> >>> >
>>>>> >>> > Hi Marcin,
>>>>> >>> >
>>>>> >>> > I am sure you are quite busy, and I really appreciate all the
>>>>> help I got from you.
>>>>> >>> >
>>>>> >>> > Please will you have a look at the last two emails (and the
>>>>> attachments) I've sent you, once you have time again. If there is anything
>>>>> I can do that will help you, just let me know. I really need to get this
>>>>> working reliably for us to proceed with the EFI boot route, and since we
>>>>> really need to support generic netboot/ISO installs and images for CentOS
>>>>> and Ubuntu, I think this must be the best way forward.
>>>>> >>> >
>>>>> >>>
>>>>> >>> I took a look at your logs and it all looks a bit strange. Is it
>>>>> pure
>>>>> >>> v4.16-rc5? If yes, can you avoid grub and boot directly from shell?
>>>>> >>
>>>>> >>
>>>>> >> Could you give me a hint how I do this? I am very new to EDK/grub?
>>>>> >
>>>>> > When booting, hit escape. Go to "Boot Manager" and then to "UEFI
>>>>> > Shell". Navigate to partition, which comprises your Linux Image
>>>>> (let's
>>>>> > assume it's FS0):
>>>>> > Shell> fs0:
>>>>> > FS0:\> Image <commandline arguments>
>>>>> >
>>>>> > Example:
>>>>> >
>>>>> > https://pastebin.com/rzk9uzAx
>>>>>
>>>>
>>> This does not work for me because EDK does not have a working alias for
>>> mmcblk1p3. In the past I had the EFI System partition last, but now I have
>>> the Linux Filesystem last (not sure if this is confusing EDK)
>>>
>>> Mapping table
>>> FS0: Alias(s):HD4c:;BLK5:
>>> VenHw(0D51905B-B77E-452A-A2C0-ECA0CC8D514A,000078F20000000000)/SD(0x0)/HD(2,GPT,82F4BCFA-CC74-4966-A5BB-3206A1D4B694,0x11000,0x64000)
>>> FS1: Alias(s):F0:;BLK7:
>>> VenMsg(06ED4DD0-FF78-11D3-BDC4-00A0C94053D1,0200000000000000)
>>> BLK0: Alias(s):
>>> VenHw(0D51905B-B77E-452A-A2C0-ECA0CC8D514A,00006EF00000000000)/eMMC(0x0)/Ctrl(0x0)
>>> BLK1: Alias(s):
>>> VenHw(0D51905B-B77E-452A-A2C0-ECA0CC8D514A,00006EF00000000000)/eMMC(0x0)/Ctrl(0x1)
>>> BLK2: Alias(s):
>>> VenHw(0D51905B-B77E-452A-A2C0-ECA0CC8D514A,00006EF00000000000)/eMMC(0x0)/Ctrl(0x2)
>>> BLK3: Alias(s):
>>> VenHw(0D51905B-B77E-452A-A2C0-ECA0CC8D514A,000078F20000000000)/SD(0x0)
>>> BLK4: Alias(s):
>>> VenHw(0D51905B-B77E-452A-A2C0-ECA0CC8D514A,000078F20000000000)/SD(0x0)/HD(1,GPT,3AA5DACC-F435-43FF-828F-A6374A093D21,0x1000,0x10000)
>>> BLK6: Alias(s):
>>> VenHw(0D51905B-B77E-452A-A2C0-ECA0CC8D514A,000078F20000000000)/SD(0x0)/HD(3,GPT,7E0D3DE0-AD08-4599-9AA3-164B3D83EA0D,0x75000,0x6EDBDF)
>>>
>>> The Image is on BLK6: which I cannot change to - nothing happens if I
>>> type "BLK6:".
>>>
>>
>> If Image is on EXT partition, EDK won't read it due to lack of support in
>> generic code (FAT is preferred). Any chance you take FAT-formatted USB
>> stick and put binaries there?
>>
>>
> OK I will do this and retest skipping grub.
>
> If you can have a look at the img I've sent, I would appreciate it a lot.
> Just to have a second pair of eyes on what is happening.
>
> I will upload another image with /boot mounted on FAT.
>
Running 'Image console=ttyS0,115200 root=/dev/mmcblk1p2 rw rootwait'
directly from the EDK UEFI Shell results in exactly the same failure.
However, there appear to be differences between the EDK and uboot boot
logs (see diff.pdf).
Can you please help me?
--------------------------
Please find the following files attached, and on Google Drive:
edk-fail.txt - Boot log when booting directly from EDK
edk-fail.txt.strip - Timestamp-stripped version you can use with kdiff3
uboot-pass.txt - Boot log when booting from uboot
uboot-pass.txt.strip - Timestamp-stripped version you can use with kdiff3
diff.pdf - PDF of the kdiff3 output
https://drive.google.com/drive/folders/1ra8UvI9vFXKlmgmyTd9njtr3v9ib8Ikg?usp=sharing
--------------------------
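For anyone who wants to reproduce the .strip files from their own boot logs, something along these lines should do. This is only a minimal sketch: the helper name strip_ts is made up for illustration, and it assumes every log line starts with the usual "[ seconds.microseconds]" kernel timestamp.

```shell
#!/bin/sh
# strip_ts is a hypothetical helper name, not part of any existing tool.
# It removes the leading "[    1.463413] " kernel timestamp from each line
# so that two boot logs can be diffed without timing noise.
# Lines without a timestamp are passed through unchanged.
strip_ts() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//' "$@"
}

# Example usage (file names as in the attachment list above):
#   strip_ts edk-fail.txt   > edk-fail.txt.strip
#   strip_ts uboot-pass.txt > uboot-pass.txt.strip
#   diff -u uboot-pass.txt.strip edk-fail.txt.strip
```

The stripped files can then be fed to kdiff3 or plain diff. Serial console captures sometimes carry extra characters before the bracket, in which case the pattern would need adjusting.
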
The link above also contains an image you can burn onto a 4G sdcard.
It contains:
- Latest EDK (branches as used on Macchiatobin website)
- Stable v4.16 Linux Kernel (using generic 'make defconfig')
- Ubuntu 18.04 squashfs based server image
The image has uboot on /dev/mmcblk1p1 (/boot/flash-image-uboot.img).
Just replace it with /boot/flash-image-efi.img (dd if=/boot/flash-image-efi.img
of=/dev/mmcblk1p1).
If you want to use grub, note that I have modified the entries under Advanced
so that they skip the initrd.
---------------------------
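The firmware swap on mmcblk1p1 can be wrapped in a small helper that also reads the data back. This is a rough sketch, not a tested procedure for this board: write_image is a made-up name, and it assumes GNU dd and cmp (bs=1M, conv=fsync and cmp -n are taken from GNU coreutils/diffutils).

```shell
#!/bin/sh
# write_image is a hypothetical helper name used only for this sketch.
# Usage: write_image <image file> <target partition or file>
write_image() {
    img=$1
    dev=$2
    [ -r "$img" ] || { echo "cannot read $img" >&2; return 1; }
    # notrunc leaves the length of the target alone; fsync makes dd flush
    # the data to the medium before it exits.
    dd if="$img" of="$dev" bs=1M conv=notrunc,fsync 2>/dev/null || return 1
    # The partition is normally larger than the image, so compare only
    # the first <image size> bytes.
    size=$(($(wc -c < "$img")))
    cmp -n "$size" "$img" "$dev"
}

# Example usage on the board (paths as described above, run as root):
#   write_image /boot/flash-image-efi.img /dev/mmcblk1p1
```

Swapping back to uboot would be the same call with /boot/flash-image-uboot.img.
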
>
>
>> Best regards,
>> Marcin
>>
>>
>>
>>>
>>>
>>>
>>>>
>>>>> >
>>>>> >
>>>>> >>
>>>>> >>>
>>>>> >>>
>>>>> >>> Can you share your grub? I'd like to test it in my setup.
>>>>>
>>>>
>>> I think the easiest will be if you flash this image on your sdcard.
>>>
>>> https://drive.google.com/open?id=1qqGfgFw-rcc1d09a2F-BoRgmBnmmqrw5
>>>
>>> So I have made the whole image from scratch:
>>>
>>> Userspace: 18.04 Ubuntu-server squashfs
>>> Kernel: mainline 4.16 rc7 (arm64 generic defconfig)
>>>
>>> mmcblk1p1: uboot/EDK image
>>> mmcblk1p2: ESP
>>> mmcblk1p3: Ubuntu userspace
>>>
>>> The way I did it was to get uboot based boot up, and then I do:
>>>
>>> apt-get install grub-efi-arm64
>>> update-grub (creates the grub.cfg)
>>> grub-install /dev/mmcblk1 --removable (without --removable I get some
>>> efibootmgr errors related to efivars)
>>>
>>> Either I am the only one trying EFI with custom images, or I am making
>>> an obvious mistake (which is likely). However, I have done it so many times
>>> now with different permutations, that I have exhausted all possible
>>> combinations.
>>>
>>> I would really appreciate it so much if you could just boot the image
>>> and inspect it locally.
>>>
>>> (The flash-image.bin was built again from vanilla sources, including the
>>> latest commits on the usual branches).
>>>
>>>
>>> >>
>>>>> >>
>>>>> >> Please find the content of my EFI partition and /boot/grub
>>>>> attached. I will really appreciate if you could give it a go.
>>>>> >>
>>>>> >> I got this grub through apt-get for Ubuntu.
>>>>> >
>>>>> > Thanks, will try it.
>>>>> >
>>>>>
>>>>> Where do you take your initrd images from? Any chance you expose your
>>>>> /boot directory as a tarball, I'd like to run your images as well.
>>>>>
>>>>
>>>> I really appreciate your effort in helping.
>>>>
>>>> Sadly I tried to update and start fresh, so I don't currently have a
>>>> working environment I can share, because I cannot get the vanilla mainline
>>>> kernel to work on the board (I posted another question about this).
>>>>
>>>> I will send you my entire setup ASAP.
>>>>
>>>>
>>>>> Thanks,
>>>>> Marcin
>>>>>
>>>>> >>
>>>>> >> apt-get install grub-efi-arm64
>>>>> >>
>>>>> >> Info below:
>>>>> >>
>>>>> >> root@localhost:/boot# apt-get install grub-efi-arm64
>>>>> >> Reading package lists... Done
>>>>> >> Building dependency tree
>>>>> >> Reading state information... Done
>>>>> >> grub-efi-arm64 is already the newest version (2.02~beta2-36ubuntu3.17).
>>>>> >> 0 upgraded, 0 newly installed, 0 to remove and 48 not upgraded.
>>>>> >> root@localhost:/boot# uname -a
>>>>> >> Linux localhost.localdomain 4.16.0-rc5-mbcin-netronome-2-dirty #2 SMP PREEMPT Mon Mar 12 14:40:25 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux
>>>>> >> root@localhost:/boot# cat /etc/os-release
>>>>> >> NAME="Ubuntu"
>>>>> >> VERSION="16.04.3 LTS (Xenial Xerus)"
>>>>> >> ID=ubuntu
>>>>> >> ID_LIKE=debian
>>>>> >> PRETTY_NAME="Ubuntu 16.04.3 LTS"
>>>>> >> VERSION_ID="16.04"
>>>>> >> HOME_URL="http://www.ubuntu.com/"
>>>>> >> SUPPORT_URL="http://help.ubuntu.com/"
>>>>> >> BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
>>>>> >> VERSION_CODENAME=xenial
>>>>> >> UBUNTU_CODENAME=xenial
>>>>> >>
>>>>> >>>
>>>>> >>>
>>>>> >>> Can you please download:
>>>>> >>> https://d-i.debian.org/daily-images/arm64/20180314-02:10/netboot/mini.iso
>>>>> >>> burn on stick and install? It's very easy and I use it in my GPU
>>>>> setup
>>>>> >>
>>>>> >>
>>>>> >> OK I will try this but first I want to redo my environment from
>>>>> scratch so make sure I have not messed up anything.
>>>>> >>
>>>>> >>>
>>>>> >>> on MacchiatoBin.
>>>>> >>>
>>>>> >>> Thanks,
>>>>> >>> Marcin
>>>>> >>>
>>>>> >>> >
>>>>> >>> >>
>>>>> >>> >>>
>>>>> >>> >>>>
>>>>> >>> >>>> >
>>>>> >>> >>>> > This doesn't tell you all that much, to be honest. But at
>>>>> least the
>>>>> >>> >>>> > numbers look sane now, and appear to match the UEFI
>>>>> configuration.
>>>>> >>> >>>> >
>>>>> >>> >>>> >> However, we are building a product that obviously requires
>>>>> long term
>>>>> >>> >>>> >> maintenance, so may I please get your input on a strategy
>>>>> with this?
>>>>> >>> >>>> >>
>>>>> >>> >>>> >> If we decide to stick with this driver, would it be easy
>>>>> for things to
>>>>> >>> >>>> >> become disjointed?
>>>>> >>> >>>> >>
>>>>> >>> >>>> >> The hope with going the EFI route is that we could boot "generic" Ubuntu and
>>>>> >>> >>>> >> CentOS installs, so I guess as long as we keep the DT and the EDKII snapshot
>>>>> >>> >>>> >> in sync on our side, the risk is low.
>>>>> >>> >>>> >>
>>>>> >>> >>>> >
>>>>> >>> >>>> > I'm afraid you are getting caught in the middle of a
>>>>> philosophical
>>>>> >>> >>>> > debate here: many engineers that are involved with the
>>>>> Marvell support
>>>>> >>> >>>> > in Linux feel that a device tree is not something that
>>>>> should be
>>>>> >>> >>>> > supported long term, and needs to be bundled with the OS.
>>>>> Over the
>>>>> >>> >>>> > last couple of kernel releases, the Marvell 8040 support
>>>>> was changed
>>>>> >>> >>>> > in a non-backward compatible manner numerous times.
>>>>> >>> >>>> >
>>>>> >>> >>>>
>>>>> >>> >>>> I think current DT should work with everything >= v4.12. So
>>>>> far
>>>>> >>> >>>> multiple users were able to install debian with recent fixes,
>>>>> I
>>>>> >>> >>>> suggest first making sure, what possibly can happen that your
>>>>> setup
>>>>> >>> >>>> behaves differently. Switching to a8k-ecam-pcie driver is a
>>>>> nice idea,
>>>>> >>> >>>> but I'm not sure the distros using DT have it.
>>>>> >>> >>>>
>>>>> >>> >>>> > This conflicts badly with the idea that the firmware
>>>>> provides the
>>>>> >>> >>>> > hardware description (using DT or ACPI), and that the
>>>>> contract with
>>>>> >>> >>>> > the OS is kept by both sides for longer than a single
>>>>> release.
>>>>> >>> >>>> >
>>>>> >>> >>>> > So I cannot really answer that question, unfortunately. If
>>>>> you don't
>>>>> >>> >>>> > intend to use the onboard network controller, you could go
>>>>> the ACPI
>>>>> >>> >>>> > route, I guess.
>>>>> >>> >>>>
>>>>> >>> >>>> FYI. on-board network ACPI support is being upstreamed to the
>>>>> Centos.
>>>>> >>> >>>>
>>>>> >>> >>>> >
>>>>> >>> >>>> > Another problem is that none of this UEFI/ACPI support is
>>>>> upstream in
>>>>> >>> >>>> > the Tianocore project, and trying random trees left and
>>>>> right doesn't
>>>>> >>> >>>> > really help when assessing whether a platform is suitable
>>>>> as a long
>>>>> >>> >>>> > term investment.
>>>>> >>> >>>> >
>>>>> >>> >>>>
>>>>> >>> >>>> There's only single branch recommended in the MacchiatoBin
>>>>> wiki, I
>>>>> >>> >>>> wouldn't call it 'random'. Entire branch is supposed to land
>>>>> >>> >>>> eventually in the Tianocore and become the only support.
>>>>> Before end of
>>>>> >>> >>>> year ~50 patches got there, still some bits are missing, but
>>>>> I think
>>>>> >>> >>>> we're not that far from desired point. I really want to push
>>>>> it but
>>>>> >>> >>>> still it requires time I'm personally short of, so I'll
>>>>> appreciate
>>>>> >>> >>>> understanding.
>>>>> >>> >>>>
>>>>> >>> >>>> Thanks,
>>>>> >>> >>>> Marcin
>>>>> >>> >>>>
>>>>> >>> >>>> >
>>>>> >>> >>>> >
>>>>> >>> >>>> >> For example, using the same DT with uboot, it fails:
>>>>> >>> >>>> >>
>>>>> >>> >>>> >> [ 0.294942] sysfs: cannot create duplicate filename '/bus/platform/devices/e0000000.pcie'
>>>>> >>> >>>> >> [ 0.294950] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc5-mbcin-netronome-2-dirty #2
>>>>> >>> >>>> >> [ 0.294952] Hardware name: Marvell 8040 MACHIATOBin (DT)
>>>>> >>> >>>> >> [ 0.294955] Call trace:
>>>>> >>> >>>> >> [ 0.294967] dump_backtrace+0x0/0x150
>>>>> >>> >>>> >> [ 0.294970] show_stack+0x14/0x20
>>>>> >>> >>>> >> [ 0.294976] dump_stack+0x98/0xbc
>>>>> >>> >>>> >> [ 0.294980] sysfs_warn_dup+0x60/0x78
>>>>> >>> >>>> >> [ 0.294983] sysfs_do_create_link_sd.isra.0+0xd8/0xe0
>>>>> >>> >>>> >> [ 0.294986] sysfs_create_link+0x20/0x40
>>>>> >>> >>>> >> [ 0.294990] bus_add_device+0x88/0x148
>>>>> >>> >>>> >> [ 0.294993] device_add+0x394/0x568
>>>>> >>> >>>> >> [ 0.294997] of_device_add+0x5c/0x70
>>>>> >>> >>>> >> [ 0.295000] of_platform_device_create_pdata+0x80/0xd0
>>>>> >>> >>>> >> [ 0.295003] of_platform_bus_create+0xdc/0x300
>>>>> >>> >>>> >> [ 0.295006] of_platform_bus_create+0x11c/0x300
>>>>> >>> >>>> >> [ 0.295008] of_platform_populate+0x4c/0xb0
>>>>> >>> >>>> >> [ 0.295014] of_platform_default_populate_init+0xa4/0xc0
>>>>> >>> >>>> >> [ 0.295017] do_one_initcall+0x38/0x120
>>>>> >>> >>>> >> [ 0.295020] kernel_init_freeable+0x134/0x1d4
>>>>> >>> >>>> >> [ 0.295025] kernel_init+0x10/0x100
>>>>> >>> >>>> >> [ 0.295028] ret_from_fork+0x10/0x18
>>>>> >>> >>>> >>
>>>>> >>> >>>> >> So I think this confirms that the pcie setup is different
>>>>> between EDKII and
>>>>> >>> >>>> >> uboot (unless I am doing something stupid here).
>>>>> >>> >>>> >>
>>>>> >>> >>>> >
>>>>> >>> >>>> > It looks like you have two copies of the pcie node here, no?
>>>>> >>> >>>
>>>>> >>> >>>
>>>>> >>> >>
>>>>> >>> >
>>>>> >>
>>>>> >>
>>>>>
>>>>
>>>>
>>>
>>
>