<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Mar 13, 2018 at 3:45 PM, Ard Biesheuvel <span dir="ltr"><<a href="mailto:ard.biesheuvel@linaro.org" target="_blank">ard.biesheuvel@linaro.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 13 March 2018 at 13:44, Frederik Lotter<br>
<div><div class="gmail-h5"><<a href="mailto:frederik.lotter@netronome.com">frederik.lotter@netronome.com</a><wbr>> wrote:<br>
> On Tue, Mar 13, 2018 at 2:55 PM, Ard Biesheuvel <<a href="mailto:ard.biesheuvel@linaro.org">ard.biesheuvel@linaro.org</a>><br>
> wrote:<br>
>><br>
>> On 13 March 2018 at 12:48, Frederik Lotter<br>
>> <<a href="mailto:frederik.lotter@netronome.com">frederik.lotter@netronome.com</a><wbr>> wrote:<br>
>> > On Tue, Mar 13, 2018 at 2:36 PM, Ard Biesheuvel<br>
>> > <<a href="mailto:ard.biesheuvel@linaro.org">ard.biesheuvel@linaro.org</a>><br>
>> > wrote:<br>
>> >><br>
>> >> On 13 March 2018 at 12:26, Frederik Lotter<br>
>> >> <<a href="mailto:frederik.lotter@netronome.com">frederik.lotter@netronome.com</a><wbr>> wrote:<br>
>> >> > Hi Ard,<br>
>> >> ><br>
>> >> > On Mon, Mar 12, 2018 at 6:22 PM, Ard Biesheuvel<br>
>> >> > <<a href="mailto:ard.biesheuvel@linaro.org">ard.biesheuvel@linaro.org</a>><br>
>> >> > wrote:<br>
>> >> >><br>
>> >> >> On 12 March 2018 at 16:15, Frederik Lotter<br>
>> >> >> <<a href="mailto:frederik.lotter@netronome.com">frederik.lotter@netronome.com</a><wbr>> wrote:<br>
>> >> >> > Hi,<br>
>> >> >> > I am getting CPU stall warnings when booting via the EFI route. I<br>
>> >> >> > suspect the PCIe interface, as the stall warning sometimes contains<br>
>> >> >> > the probe function. Other times it seems to get further than PCIe<br>
>> >> >> > init, but interrupt handling still stalls.<br>
>> >> >> > Here are some facts around my observation:<br>
>> >> >> ><br>
>> >> >> > I have two SD cards for my MACCHIATObin board. They have identical<br>
>> >> >> > kernels (4.16-rc5) with an Ubuntu 16.04 rootfs. The first SD card<br>
>> >> >> > boots via U-Boot, DT and kernel. The second boots via EDKII, GRUB<br>
>> >> >> > and kernel. The EDKII build includes the device tree DTB (and the<br>
>> >> >> > DTS, which I believe is unused) from the U-Boot SD card.<br>
>> >> >> ><br>
>> >> >> > EFI stub: Booting Linux Kernel...<br>
>> >> >> > EFI stub: Using DTB from configuration table<br>
>> >> >> > EFI stub: Exiting boot services and installing virtual address<br>
>> >> >> > map...<br>
>> >> >> > [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd081]<br>
>> >> >> > [ 0.000000] Linux version 4.16.0-rc5-mbcin-netronome-2-dirty<br>
>> >> >> > (root@mcb1-cpt) (gcc version 5.4.0 20160609 (Ubuntu/Linaro<br>
>> >> >> > 5.4.0-6ubuntu1~16.04.9)) #2 SMP PREEMPT Mon Mar 12 14:40:25 UTC 2018<br>
>> >> >> > [ 0.000000] Machine model: Marvell 8040 MACHIATOBin<br>
>> >> >> > [ 0.000000] efi: Getting EFI parameters from FDT:<br>
>> >> >> > [ 0.000000] efi: EFI v2.70 by EDK II<br>
>> >> >> > [ 0.000000] efi: SMBIOS 3.0=0xbfd00000 ACPI 2.0=0xb6760000<br>
>> >> >> > MEMATTR=0xb8973418 RNG=0xbffdbf98<br>
>> >> >> > [ 0.000000] random: fast init done<br>
>> >> >> > [ 0.000000] efi: seeding entropy pool<br>
>> >> >> > :<br>
>> >> >> ><br>
>> >> >> > (I am using the latest EDKII master and the Marvell edk2-open-platform<br>
>> >> >> > 17.10 branch, with all the latest mv-ddr, ATF, etc.)<br>
>> >> >> ><br>
>> >> >> > The DT data appears to be present in the EFI boot, but the PCIe<br>
>> >> >> > interface fails, which (I believe) results in the CPU stall warnings:<br>
>> >> >> ><br>
>> >> >> > [ 717.453025] INFO: rcu_preempt self-detected stall on CPU<br>
>> >> >> > :<br>
>> >> >> > :<br>
>> >> >> > [ 717.589783] armada8k_pcie_probe+0x140/0x240<br>
>> >> >> > :<br>
>> >> >> ><br>
>> >> >> > Other times, the pcie gets further:<br>
>> >> >> ><br>
>> >> >> > [ 3.312127] PCI: OF: host bridge /cp0/pcie@f2600000 ranges:<br>
>> >> >> > [ 3.317740] PCI: OF: IO 0xf9000000..0xf900ffff -> 0xf9000000<br>
>> >> >> > [ 3.323692] PCI: OF: MEM 0xc0000000..0xdfffffff -> 0xc0000000<br>
>> >> >> > [ 3.328915] random: crng init done<br>
>> >> >> > [ 4.326158] armada8k-pcie f2600000.pcie: phy link never came up<br>
>> >> >> > [ 4.332109] armada8k-pcie f2600000.pcie: Link not up after<br>
>> >> >> > reconfiguration<br>
>> >> >> > [ 4.339056] armada8k-pcie f2600000.pcie: PCI host bridge to bus<br>
>> >> >> > 0000:00<br>
>> >> >><br>
>> >> >><br>
>> >> >> To be brutally honest, the armada8k-pcie driver is a piece of junk,<br>
>> >> >> and you're much better off using the generic ECAM driver, which now<br>
>> >> >> includes special handling for the missing root port on Synopsys IP.<br>
>> >> >><br>
>> >> >> It also allows you to have both MMIO32 and MMIO64 regions, which can<br>
>> >> >> be useful with some PCIe cards with large BARs<br>
>> >> >><br>
>> >> >> Could you try<br>
>> >> >><br>
>> >> >> compatible = "marvell,armada8k-pcie-ecam";<br>
>> >> >><br>
>> >> >> in the DT node, please?<br>
>> >> >><br>
>> >> >> (Before you do that, please check whether UEFI recognizes your PCI<br>
>> >> >> hardware using the 'pci' command in the shell)<br>
>> >> ><br>
>> >> ><br>
>> >> > This exercise helped a lot. Thank you for the proposal.<br>
>> >> ><br>
>> >> > So now I can consistently boot using U-Boot and EFI.<br>
>> >> ><br>
>> >> > However, the PCIe driver init fails. I have provided the boot log and<br>
>> >> > my DT entry below. We need custom BAR ranges, and I am not sure<br>
>> >> > whether this driver understands everything.<br>
>> >> ><br>
>> >> > cp0_pcie0: pcie@f2600000 {<br>
>> >> > compatible = "marvell,armada8k-pcie-ecam", "snps,dw-pcie";<br>
>> >> > reg = <0 0xf2600000 0 0x10000>,<br>
>> >> > <0 ((0xf6000000 + (0 * 0x1000000)) + 0xf00000) 0 0x80000>;<br>
>> >> > reg-names = "ctrl", "config";<br>
>> >> > #address-cells = <3>;<br>
>> >> > #size-cells = <2>;<br>
>> >> > #interrupt-cells = <1>;<br>
>> >> > device_type = "pci";<br>
>> >> > dma-coherent;<br>
>> >> > msi-parent = <&gic_v2m0>;<br>
>> >> ><br>
>> >> > bus-range = <0 0xff>;<br>
>> >> > ranges =<br>
>> >> > <0x81000000 0 (0xf9000000 + (0 * 0x10000)) 0 (0xf9000000 + (0 * 0x10000)) 0 0x10000<br>
>> >> > 0x82000000 0 (0xf6000000 + (0 * 0x1000000)) 0 (0xf6000000 + (0 * 0x1000000)) 0 0xf00000>;<br>
>> >> > interrupt-map-mask = <0 0 0 0>;<br>
>> >> > interrupt-map = <0 0 0 0 &cp0_icu 0x0 22 4>;<br>
>> >> > interrupts = <0x0 22 4>;<br>
>> >> > num-lanes = <1>;<br>
>> >> > clocks = <&cp0_clk 1 13>;<br>
>> >> > status = "disabled";<br>
>> >> > };<br>
>> >> ><br>
>> >> ><br>
>> >> > Error:<br>
>> >> ><br>
>> >> > [ 1.396968] PCI: OF: host bridge /cp0/pcie@f2600000 ranges:<br>
>> >> > [ 1.396979] PCI: OF: IO 0xf9000000..0xf900ffff -> 0xf9000000<br>
>> >> > [ 1.396984] PCI: OF: MEM 0xc0000000..0xdfffffff -> 0xc0000000<br>
>> >> > [ 1.396998] pci-host-generic f2600000.pcie: ECAM area [mem<br>
>> >> > 0xf2600000-0xf260ffff] can only accommodate [bus 00-ffffffffffffffff]<br>
>> >> > (reduced from [bus 00-ff] desired)<br>
>> >> > [ 1.397002] pci-host-generic f2600000.pcie: ECAM ioremap failed<br>
>> >> > [ 1.397011] pci-host-generic: probe of f2600000.pcie failed with error -12<br>
>> >> ><br>
>> >> ><br>
>> >> > Thanks for the support.<br>
>> >> ><br>
>> >><br>
>> >> Please try the following config<br>
>> >><br>
>> >> cp0_pcie0: pcie@e0000000 {<br>
>> >> compatible = "marvell,armada8k-pcie-ecam", "snps,dw-pcie";<br>
>> >> reg = <0 0xe0000000 0 0xff00000>;<br>
>> >> #address-cells = <3>;<br>
>> >> #size-cells = <2>;<br>
>> >> #interrupt-cells = <1>;<br>
>> >> device_type = "pci";<br>
>> >> dma-coherent;<br>
>> >> msi-parent = <&gic_v2m0>;<br>
>> >><br>
>> >> bus-range = <0 0xfe>;<br>
>> >> ranges = <0x1000000 0x0 0x00000000 0x0 0xeff00000 0x0 0x00010000>,<br>
>> >> <0x2000000 0x0 0xc0000000 0x0 0xc0000000 0x0 0x20000000>,<br>
>> >> <0x3000000 0x8 0x00000000 0x8 0x00000000 0x1 0x00000000>;<br>
>> >><br>
>> >> interrupt-map-mask = <0 0 0 0>;<br>
>> >> interrupt-map = <0 0 0 0 &cp0_icu 0x0 22 4>;<br>
>> >> };<br>
>> ><br>
>> ><br>
>> > I am trying it now.<br>
>> ><br>
>> > Could you give me some insight into how the peripheral base address can<br>
>> > simply be changed like that?<br>
>> ><br>
>> > Is there a mapping change somewhere?<br>
>> ><br>
>><br>
>> All those addresses are configurable, and the default armada8k-pcie<br>
>> driver sets up all the translation windows from scratch (in a rather<br>
>> limited way, mind you)<br>
>><br>
>> The armada8k-pcie-ecam driver just reuses the configuration set by the<br>
>> firmware, allowing for a larger bus range and an additional 4 GB<br>
>> window for 64-bit MMIO<br>
><br>
><br>
> The new DTS extract:<br>
><br>
> cp0_pcie0: pcie@0xe0000000 {<br>
> compatible = "marvell,armada8k-pcie-ecam", "snps,dw-pcie";<br>
> reg = <0 0xe0000000 0 0x10000>;<br>
><br>
> #address-cells = <3>;<br>
> #size-cells = <2>;<br>
> #interrupt-cells = <1>;<br>
> device_type = "pci";<br>
> dma-coherent;<br>
> msi-parent = <&gic_v2m0>;<br>
><br>
> bus-range = <0 0xfe>;<br>
> ranges = <0x1000000 0x0 0x00000000 0x0 0xeff00000 0x0 0x00010000>,<br>
> <0x2000000 0x0 0xc0000000 0x0 0xc0000000 0x0 0x20000000>,<br>
> <0x3000000 0x8 0x00000000 0x8 0x00000000 0x1 0x00000000>;<br>
><br>
> interrupt-map-mask = <0 0 0 0>;<br>
> interrupt-map = <0 0 0 0 &cp0_icu 0x0 22 4>;<br>
> };<br>
><br>
> The result:<br>
><br>
> [ 1.463594] PCI: OF: host bridge /cp0/pcie@0xe0000000 ranges:<br>
> [ 1.463608] PCI: OF: IO 0xeff00000..0xeff0ffff -> 0x00000000<br>
> [ 1.463616] PCI: OF: MEM 0xc0000000..0xdfffffff -> 0xc0000000<br>
> [ 1.463622] PCI: OF: MEM 0x800000000..0x8ffffffff -> 0x800000000<br>
> [ 1.463638] pci-host-generic e0000000.pcie: ECAM area [mem<br>
> 0xe0000000-0xe000ffff] can only accommodate [bus 00-ffffffffffffffff]<br>
> (reduced from [bus 00-fe] desired)<br>
> [ 1.463646] pci-host-generic e0000000.pcie: ECAM ioremap failed<br>
> [ 1.463657] pci-host-generic: probe of e0000000.pcie failed with error -12<br>
><br>
><br>
<br>
</div></div>Please use the size I suggested for the 'reg' property<br></blockquote><div><br></div><div>Sorry I missed that:</div><div><br></div><div>[ 1.463413] PCI: OF: host bridge /cp0/pcie@0xe0000000 ranges:</div><div>[ 1.463427] PCI: OF: IO 0xeff00000..0xeff0ffff -> 0x00000000</div><div>[ 1.463435] PCI: OF: MEM 0xc0000000..0xdfffffff -> 0xc0000000</div><div>[ 1.463442] PCI: OF: MEM 0x800000000..0x8ffffffff -> 0x800000000</div><div>[ 1.463481] pci-host-generic e0000000.pcie: ECAM at [mem 0xe0000000-0xefefffff] for [bus 00-fe]</div><div>[ 1.463525] pci-host-generic e0000000.pcie: PCI host bridge to bus 0000:00</div><div>[ 1.463531] pci_bus 0000:00: root bus resource [bus 00-fe]</div><div>[ 1.463536] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]</div><div>[ 1.463541] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xdfffffff]</div><div>[ 1.463547] pci_bus 0000:00: root bus resource [mem 0x800000000-0x8ffffffff]</div><div><br></div><div><br></div><div>So I assume this works, and I am super grateful. 
I will test it tomorrow with our Smart NIC.</div><div><br></div><div>However, we are building a product that obviously requires long-term maintenance, so may I please get your input on a strategy for this?</div><div><br></div><div>If we decide to stick with this driver, would it be easy for things to drift out of sync?</div><div><br></div><div>The hope with going the EFI route is that we could boot "generic" Ubuntu and CentOS installs, so I guess as long as we keep the DT and the EDKII snapshot in sync on our side, the risk is low.</div><div><br></div><div>For example, using the same DT with U-Boot, it fails:<br></div><div><br></div><div><div>[ 0.294942] sysfs: cannot create duplicate filename '/bus/platform/devices/e0000000.pcie'</div><div>[ 0.294950] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc5-mbcin-netronome-2-dirty #2</div><div>[ 0.294952] Hardware name: Marvell 8040 MACHIATOBin (DT)</div></div><div><div>[ 0.294955] Call trace:</div><div>[ 0.294967] dump_backtrace+0x0/0x150</div><div>[ 0.294970] show_stack+0x14/0x20</div><div>[ 0.294976] dump_stack+0x98/0xbc</div><div>[ 0.294980] sysfs_warn_dup+0x60/0x78</div><div>[ 0.294983] sysfs_do_create_link_sd.isra.0+0xd8/0xe0</div><div>[ 0.294986] sysfs_create_link+0x20/0x40</div><div>[ 0.294990] bus_add_device+0x88/0x148</div><div>[ 0.294993] device_add+0x394/0x568</div><div>[ 0.294997] of_device_add+0x5c/0x70</div><div>[ 0.295000] of_platform_device_create_pdata+0x80/0xd0</div><div>[ 0.295003] of_platform_bus_create+0xdc/0x300</div><div>[ 0.295006] of_platform_bus_create+0x11c/0x300</div><div>[ 0.295008] of_platform_populate+0x4c/0xb0</div><div>[ 0.295014] of_platform_default_populate_init+0xa4/0xc0</div><div>[ 0.295017] do_one_initcall+0x38/0x120</div><div>[ 0.295020] kernel_init_freeable+0x134/0x1d4</div><div>[ 0.295025] kernel_init+0x10/0x100</div><div>[ 0.295028] ret_from_fork+0x10/0x18</div></div><div><br></div><div>So I think this confirms that the PCIe setup is different between EDKII and U-Boot (unless 
I am doing something stupid here).</div><div><br></div><div><br></div><div><br></div></div><br></div></div>
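<div>For reference, the 'reg' size arithmetic behind the failing and the working probe in this thread can be sketched as follows. This is a minimal sketch, assuming the standard ECAM layout of 1 MiB of config space per bus; the function and constant names are hypothetical and are not kernel identifiers.</div>
<pre>
```python
# Sketch only: names are hypothetical, not kernel identifiers.
# Assumption: standard ECAM layout, 1 MiB of config space per bus
# (32 devices x 8 functions x 4 KiB each).
ECAM_BYTES_PER_BUS = 0x100000


def ecam_window(base, size, first_bus=0):
    """Return (last_address, last_bus) for an ECAM 'reg' window."""
    buses = size // ECAM_BYTES_PER_BUS
    # Wrap like an unsigned 64-bit value: a window too small for even
    # one bus yields ffffffffffffffff, as in the failing log.
    last_bus = (first_bus + buses - 1) % 2**64
    return base + size - 1, last_bus


if __name__ == "__main__":
    # The two 'reg' sizes tried at base 0xe0000000 in this thread.
    for size in (0x10000, 0xff00000):
        end, last_bus = ecam_window(0xe0000000, size)
        print(f"size {size:#x}: mem 0xe0000000-{end:#x}, bus 00-{last_bus:02x}")
```
</pre>
<div>With size 0x10000 the window holds zero whole buses, which matches the "can only accommodate [bus 00-ffffffffffffffff]" warning; with size 0xff00000 it covers 0xe0000000-0xefefffff and buses 00-fe, which matches the successful probe.</div>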