[Macchiato] success (first boot)

Marcin Wojtas mw at semihalf.com
Tue Jan 30 00:17:47 GMT 2018


Hi guys,

I just recalled that, to meet the SLES12 SP3 requirements, I agreed to
'downgrade' armada-8040-mcbin.dtb so that it could use the older DT bindings.
Maybe this is the reason for the observed inconsistencies? It's enough to
take the v4.15 dtb, put it in Platforms/Marvell/Armada/armada-8040-mcbin.dtb,
then rebuild and reflash the image. I'm curious whether it helps. When I
boot the mainline kernel with the mainline dtb (or ACPI), everything seems
fine to me, although I admit that, apart from iperf3, I didn't stress the
network much (all apt installs succeeded, however).
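
For anyone who wants to script the dtb swap, a minimal sketch could look
like the one below. Only the target path comes from this message. The
function name, the ".orig" backup suffix and the magic-number sanity check
are assumptions made for illustration, and the rebuild and reflash steps
depend on the local firmware build setup, so they are not shown.

```python
import shutil
from pathlib import Path

# Location of the bundled DTB inside the firmware source tree,
# as named in the message above.
DTB_REL_PATH = Path("Platforms/Marvell/Armada/armada-8040-mcbin.dtb")

# Every flattened device tree blob starts with this big-endian magic.
FDT_MAGIC = b"\xd0\x0d\xfe\xed"


def replace_firmware_dtb(new_dtb: Path, firmware_tree: Path) -> Path:
    """Copy new_dtb over the DTB in firmware_tree (hypothetical helper).

    The previous file, if any, is kept next to it with an ".orig" suffix.
    Returns the path of the replaced DTB.
    """
    if not new_dtb.is_file():
        raise FileNotFoundError(f"no such DTB: {new_dtb}")

    # Cheap sanity check so a .dts source file is not copied by mistake.
    with new_dtb.open("rb") as f:
        if f.read(4) != FDT_MAGIC:
            raise ValueError(f"{new_dtb} does not look like a compiled DTB")

    target = firmware_tree / DTB_REL_PATH
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.exists():
        # Keep the old ('downgraded') DTB so the change is easy to revert.
        shutil.copy2(target, target.with_name(target.name + ".orig"))
    shutil.copy2(new_dtb, target)
    return target
```

It would be called with the dtb built from a v4.15 kernel tree and the
root of the firmware sources, for example
replace_firmware_dtb(Path("armada-8040-mcbin.dtb"), Path("path/to/firmware/tree")),
where both paths are placeholders.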

Best regards,
Marcin

2018-01-25 13:44 GMT+01:00 Marc Zyngier <marc.zyngier at arm.com>:

> On 24/01/18 09:43, Marcin Wojtas wrote:
> > +Antoine
> >
> > Hi Marc,
> >
> > 2018-01-24 10:25 GMT+01:00 Marc Zyngier <marc.zyngier at arm.com>:
> >> On 23/01/18 23:15, Ard Biesheuvel wrote:
> >>> On 23 January 2018 at 23:12, Steve McIntyre <steve at einval.com> wrote:
> >>>> On Tue, Jan 23, 2018 at 11:07:53PM +0000, Matthew Bentham wrote:
> >>>>> Hi,
> >>>>> Thanks. I've got it showing a Linux terminal using an old Radeon 5570 now :-)
> >>>>>
> >>>>> My problem now is that the network device driver is causing a kernel panic when
> >>>>> doing "apt install task-gnome-desktop". I get a lot of warning messages first
> >>>>> along the lines of "detected wrong cpu at end of Tx processing".
> >>>>
> >>>> Aha! Not just me then! I've been seeing the same 2 things for a while,
> >>>> but Leif and Stuart have not.
> >>>>
> >>>> I was wondering if it's memory corruption in the driver, but I've not
> >>>> had much chance to dig into it yet.
> >>>>
> >>>
> >>> I think this may be a known issue. Marc?
> >>
> >> I fixed something along those lines[1]. But this driver is being
> >> randomly changed and almost never tested in a production environment, so
> >> it breaks on almost every single effin' change.
> >>
> >> 4.14 should have the fix though... If it still breaks, I'll dust off the
> >> board and give it a spin.
> >>
> >>         M.
> >>
> >> [1]
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/ethernet/marvell/mvpp2.c?id=13c249a94f525fe4c757d28854049780b25605c4
> >> --
> >
> > I saw that your patch is already part of v4.14... Two days ago a similar
> > bug report was sent to the lists, and Antoine said it will be
> > investigated.
>
> I've just booted with 4.15-rc9, and I don't see the TX issue any more,
> which is what I expected with the above fix.
>
> What I am seeing is this:
>
> [   11.849032] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
> [   12.720977] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
> [   12.809757] IPv6: ADDRCONF(NETDEV_UP): eth2: link is not ready
> [   13.762656] mvpp2 f2000000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
> [   13.770787] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [   14.639318] BUG: Bad page state in process swapper/0  pfn:e6804
> [   14.645296] page:ffff7fd8249a0100 count:0 mapcount:0 mapping: (null) index:0xfffff609268049c0
> [   14.654659] flags: 0xfffe00000000100(slab)
> [   14.658781] raw: 0fffe00000000100 0000000000000000 fffff609268049c0 0000000000400000
> [   14.666563] raw: ffff7fd8232e5420 fffff60927bf0610 fffff6092b546000 0000000000000000
> [   14.674348] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> [   14.680818] bad because of flags: 0x100(slab)
> [   14.685191] Modules linked in:
> [   14.688271] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc9-00124-g9f55a2093a18 #8070
> [   14.696568] Hardware name: Marvell 8040 MACHIATOBin (DT)
> [   14.701900] Call trace:
> [   14.704362]  dump_backtrace+0x0/0x1c0
> [   14.708039]  show_stack+0x24/0x30
> [   14.711367]  dump_stack+0x8c/0xb0
> [   14.714695]  bad_page+0xf0/0x158
> [   14.717935]  free_pages_check_bad+0x80/0xa0
> [   14.722134]  __free_pages_ok+0x24c/0x340
> [   14.726071]  page_frag_free+0x78/0x88
> [   14.729749]  skb_free_head+0x38/0x48
> [   14.733337]  skb_release_data+0x140/0x158
> [   14.737362]  skb_release_all+0x30/0x40
> [   14.741126]  consume_skb+0x38/0x110
> [   14.744630]  arp_process.constprop.4+0x158/0x778
> [   14.749264]  arp_rcv+0xf0/0x118
> [   14.752418]  __netif_receive_skb_core+0x5ac/0x8a8
> [   14.757140]  __netif_receive_skb+0x28/0x78
> [   14.761252]  netif_receive_skb_internal+0x30/0x120
> [   14.766060]  napi_gro_receive+0x15c/0x188
> [   14.770085]  mvpp2_poll+0x37c/0x678
> [   14.773586]  net_rx_action+0x108/0x390
> [   14.777349]  __do_softirq+0x120/0x370
> [   14.781025]  irq_exit+0x90/0xb0
> [   14.784179]  __handle_domain_irq+0x90/0xf8
> [   14.788291]  gic_handle_irq+0x60/0xb0
> [   14.791966]  el1_irq+0xb0/0x128
> [   14.795119]  arch_cpu_idle+0x28/0x1c8
> [   14.798796]  do_idle+0x138/0x1f0
> [   14.802036]  cpu_startup_entry+0x28/0x30
> [   14.805973]  rest_init+0xd4/0xe0
> [   14.809216]  start_kernel+0x3a4/0x3b8
> [   14.812911] Disabling lock debugging due to kernel taint
> [   15.957032] mvpp2 f4000000.ethernet eth2: Link is Up - 1Gbps/Full - flow control rx/tx
> [   15.965006] IPv6: ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
> [   46.036434] device eth0 entered promiscuous mode
>
> which happens if you attach a macvtap to the network interface
> (in this case eth0). As this is the only system where this happens,
> I assume the Ethernet driver is doing something pretty bad when
> it comes to managing the skb life cycle. None of the non-Marvell
> systems I have access to explode this way...
>
> Thanks,
>
>         M.
> --
> Jazz is not dead. It just smells funny...
>