[Macchiato] success (first boot)

Marcin Wojtas mw at semihalf.com
Tue Jan 30 11:31:49 GMT 2018


Hi Marc,

2018-01-30 12:26 GMT+01:00 Marc Zyngier <marc.zyngier at arm.com>:
>
> On 30/01/18 00:17, Marcin Wojtas wrote:
> > Hi guys,
> >
> > I just recalled that, for the SLES12 SP3 requirements, I agreed to
> > 'downgrade' armada-8040-mcbin.dtb, so that it could use older DT
> > bindings.
>
> Really? Do we now have per-distribution firmware?

No, the old binding used to work with the newest mainline and SLE12
SP3, so it was a temporary compromise.

>
> > Maybe this is the reason for the observed inconsistencies? It's
> > enough to take v4.15 dtb, put it in
> > Platforms/Marvell/Armada/armada-8040-mcbin.dtb, rebuild and reflash the
> > image. I'm curious if it can help anything. When I boot the mainline
> > kernel with mainline dtb (or ACPI) everything seems fine to me, although
> > I admit that, apart from iperf3, I didn't stress the network much
> > (however all apt install succeeded).
>
> Current mainline sort of works, but see the splat you quoted below, which
> has nothing to do with DT (and may have to do with the skb being freed
> too early).

Agreed, but what if it's some clock/IRQ corner case that does not
appear with the new binding? Just guessing here; it seems a bit
unrelated to me as well.

Best regards,
Marcin

>
> Thanks,
>
>         M.
>
> >
> > Best regards,
> > Marcin
> >
> > 2018-01-25 13:44 GMT+01:00 Marc Zyngier <marc.zyngier at arm.com>:
> >
> >     On 24/01/18 09:43, Marcin Wojtas wrote:
> >     > +Antoine
> >     >
> >     > Hi Marc,
> >     >
> >     > 2018-01-24 10:25 GMT+01:00 Marc Zyngier <marc.zyngier at arm.com>:
> >     >> On 23/01/18 23:15, Ard Biesheuvel wrote:
> >     >>> On 23 January 2018 at 23:12, Steve McIntyre <steve at einval.com> wrote:
> >     >>>> On Tue, Jan 23, 2018 at 11:07:53PM +0000, Matthew Bentham wrote:
> >     >>>>> Hi,
> >     >>>>> Thanks. I've got it showing a Linux terminal using an old
> >     >>>>> Radeon 5570 now :-)
> >     >>>>>
> >     >>>>> My problem now is that the network device driver is causing a
> >     >>>>> kernel panic when doing "apt install task-gnome-desktop". I get a
> >     >>>>> lot of warning messages first along the lines of "detected wrong
> >     >>>>> cpu at end of Tx processing".
> >     >>>>
> >     >>>> Aha! Not just me then! I've been seeing the same 2 things for a
> >     >>>> while, but Leif and Stuart have not.
> >     >>>>
> >     >>>> I was wondering if it's memory corruption in the driver, but I've
> >     >>>> not had much chance to dig into it yet.
> >     >>>>
> >     >>>
> >     >>> I think this may be a known issue. Marc?
> >     >>
> >     >> I fixed something along those lines[1]. But this driver is being
> >     >> randomly changed and almost never tested in a production
> >     >> environment, so it breaks on almost every single effin' change.
> >     >>
> >     >> 4.14 should have the fix though... If it still breaks, I'll dust
> >     >> off the board and give it a spin.
> >     >>
> >     >>         M.
> >     >>
> >     >> [1]
> >     >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/ethernet/marvell/mvpp2.c?id=13c249a94f525fe4c757d28854049780b25605c4
> >     >> --
> >     >
> >     > I saw your patch is already part of v4.14... Two days ago a similar
> >     > bug report was sent to the lists and Antoine said it would be
> >     > investigated.
> >
> >     I've just booted with 4.15-rc9, and I don't see the TX issue any more,
> >     which is what I expected with the above fix.
> >
> >     What I am seeing is this:
> >
> >     [   11.849032] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
> >     [   12.720977] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
> >     [   12.809757] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready
> >     [   13.762656] mvpp2 f2000000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
> >     [   13.770787] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> >     [   14.639318] BUG: Bad page state in process swapper/0  pfn:e6804
> >     [   14.645296] page:ffff7fd8249a0100 count:0 mapcount:0 mapping: (null) index:0xfffff609268049c0
> >     [   14.654659] flags: 0xfffe00000000100(slab)
> >     [   14.658781] raw: 0fffe00000000100 0000000000000000 fffff609268049c0 0000000000400000
> >     [   14.666563] raw: ffff7fd8232e5420 fffff60927bf0610 fffff6092b546000 0000000000000000
> >     [   14.674348] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> >     [   14.680818] bad because of flags: 0x100(slab)
> >     [   14.685191] Modules linked in:
> >     [   14.688271] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc9-00124-g9f55a2093a18 #8070
> >     [   14.696568] Hardware name: Marvell 8040 MACHIATOBin (DT)
> >     [   14.701900] Call trace:
> >     [   14.704362]  dump_backtrace+0x0/0x1c0
> >     [   14.708039]  show_stack+0x24/0x30
> >     [   14.711367]  dump_stack+0x8c/0xb0
> >     [   14.714695]  bad_page+0xf0/0x158
> >     [   14.717935]  free_pages_check_bad+0x80/0xa0
> >     [   14.722134]  __free_pages_ok+0x24c/0x340
> >     [   14.726071]  page_frag_free+0x78/0x88
> >     [   14.729749]  skb_free_head+0x38/0x48
> >     [   14.733337]  skb_release_data+0x140/0x158
> >     [   14.737362]  skb_release_all+0x30/0x40
> >     [   14.741126]  consume_skb+0x38/0x110
> >     [   14.744630]  arp_process.constprop.4+0x158/0x778
> >     [   14.749264]  arp_rcv+0xf0/0x118
> >     [   14.752418]  __netif_receive_skb_core+0x5ac/0x8a8
> >     [   14.757140]  __netif_receive_skb+0x28/0x78
> >     [   14.761252]  netif_receive_skb_internal+0x30/0x120
> >     [   14.766060]  napi_gro_receive+0x15c/0x188
> >     [   14.770085]  mvpp2_poll+0x37c/0x678
> >     [   14.773586]  net_rx_action+0x108/0x390
> >     [   14.777349]  __do_softirq+0x120/0x370
> >     [   14.781025]  irq_exit+0x90/0xb0
> >     [   14.784179]  __handle_domain_irq+0x90/0xf8
> >     [   14.788291]  gic_handle_irq+0x60/0xb0
> >     [   14.791966]  el1_irq+0xb0/0x128
> >     [   14.795119]  arch_cpu_idle+0x28/0x1c8
> >     [   14.798796]  do_idle+0x138/0x1f0
> >     [   14.802036]  cpu_startup_entry+0x28/0x30
> >     [   14.805973]  rest_init+0xd4/0xe0
> >     [   14.809216]  start_kernel+0x3a4/0x3b8
> >     [   14.812911] Disabling lock debugging due to kernel taint
> >     [   15.957032] mvpp2 f4000000.ethernet eth2: Link is Up - 1Gbps/Full - flow control rx/tx
> >     [   15.965006] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
> >     [   46.036434] device eth0 entered promiscuous mode
> >
> >     which happens if you attach a macvtap to the network interface
> >     (in this case eth0). As this is the only system where this happens,
> >     I assume the Ethernet driver is doing something pretty bad when
> >     it comes to managing the skb life cycle. None of the non-Marvell
> >     systems I have access to explode this way...
> >
> >     Thanks,
> >
> >             M.
> >     --
> >     Jazz is not dead. It just smells funny...
> >
> >
>
>
> --
> Jazz is not dead. It just smells funny...


