[Macchiato] [EXT] Network load balancing interrupt issue?

Stefan Chulski stefanc at marvell.com
Wed Jan 31 08:56:50 GMT 2018


Hi Frederik,

The network driver has two interrupt modes:

1. Single queue mode:

a. Only one CPU/interrupt handles RX traffic.

b. The other four interrupts are per-CPU interrupts and handle TX done (release of transmitted resources).

c. This mode is the default.

2. Multi queue mode:

a. In this mode you can enable the RSS feature: https://www.kernel.org/doc/Documentation/networking/scaling.txt

b. In this mode traffic is distributed across all four CPUs by hashing the 5-tuple in the packet header.
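A quick way to confirm which mode the driver came up in (a sketch; the sysfs path assumes the mvpp2x module exposes queue_mode as a readable parameter, which may differ per build):

        cat /sys/module/mvpp2x/parameters/queue_mode   # 0 = single queue, 1 = multi queue
        cat /proc/cmdline                               # should contain mvpp2x.queue_mode=1 after the U-Boot step below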

So, to utilize all four CPUs for traffic, do the following (a verification sketch follows the steps):

1. In U-Boot:
   - Reset the board and stop at the boot prompt by pressing any key
   - Set multi queue mode:

         setenv extra_params mvpp2x.queue_mode=1
         saveenv

   - Run boot to start Linux

2. In Linux:
   - Bring the interfaces up:

         ifconfig eth2 up
         ifconfig eth0 up

   - Enable RSS hash distribution:

         ethtool -K eth2 rxhash on
         ethtool -K eth0 rxhash on

   - Set the indirection table weights across the 4 cores:

         ethtool -X eth2 weight 1 1 1 1
         ethtool -X eth0 weight 1 1 1 1
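To verify the configuration took effect (a sketch, using the same interface names as above):

        # Show the RX flow hash indirection table programmed by ethtool -X:
        ethtool -x eth2
        # With traffic running, the per-CPU RX interrupt counters for eth2
        # should all increase, not just a single column:
        grep eth2 /proc/interrupts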


In Ixia you should create streams with different 5-tuples, for example different source IP addresses.

Streams with different 5-tuples will be distributed to different CPUs.
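If no Ixia is available, a rough software substitute (assuming iperf3 is installed on both ends; <receiver-ip> below is a placeholder) is to run parallel TCP streams, since each stream gets its own source port and therefore its own 5-tuple:

        # On the receiver:
        iperf3 -s
        # On the sender: 8 parallel streams, each with a distinct source port:
        iperf3 -c <receiver-ip> -P 8 -t 30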

Best Regards.

From: Macchiato [mailto:macchiato-bounces at lists.einval.com] On Behalf Of Frederik Lotter
Sent: Tuesday, January 30, 2018 6:41 PM
To: macchiato at lists.einval.com
Subject: [EXT] [Macchiato] Network load balancing interrupt issue?

Hi guys,

I apologize if I have this wrong.


I am using an IXIA packet generator to test the networking interfaces (kernel 4.4.52).

The kernel never seems to spend more than ~25% of its time processing packet reads via pcap/tcpdump. I have tried multiple tcpdump instances at once, and my PCAP read loop is threaded across 4 threads.



The basic issue is that only one processor is at 100% system (kernel) utilization, while the other 3 processors are mostly running userspace threads (let’s say at 10% utilization each).
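For reference, the per-CPU split can be watched live with mpstat (assuming the sysstat package is installed):

        # Per-CPU breakdown of %usr vs. %sys, refreshed every second:
        mpstat -P ALL 1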

This led me to look at the interrupt handling:

cat /proc/interrupts

116: 2 0 0 0 ICU 129 Level f2000000.ppv22.eth0.link
117: 0 0 0 0 ICU 39 Level f2000000.ppv22.eth0.cpu0
118: 0 0 0 0 ICU 43 Level f2000000.ppv22.eth0.cpu1
119: 0 0 0 0 ICU 47 Level f2000000.ppv22.eth0.cpu2
120: 0 0 0 0 ICU 51 Level f2000000.ppv22.eth0.cpu3
121: 0 0 0 0 ICU 55 Level f2000000.ppv22.eth0.rx_shared
122: 5 0 0 0 ICU 129 Level f4000000.ppv22.eth1.link
123: 0 0 0 0 ICU 39 Level f4000000.ppv22.eth1.cpu0
124: 0 0 0 0 ICU 43 Level f4000000.ppv22.eth1.cpu1
125: 0 0 93 0 ICU 47 Level f4000000.ppv22.eth1.cpu2
126: 0 0 0 14 ICU 51 Level f4000000.ppv22.eth1.cpu3
127: 26 0 0 0 ICU 55 Level f4000000.ppv22.eth1.rx_shared
128: 2 0 0 0 ICU 128 Level f4000000.ppv22.eth2.link
129: 256721 0 0 0 ICU 40 Level f4000000.ppv22.eth2.cpu0
130: 0 88767 0 0 ICU 44 Level f4000000.ppv22.eth2.cpu1
131: 0 0 12367 0 ICU 48 Level f4000000.ppv22.eth2.cpu2
132: 0 0 0 20693 ICU 52 Level f4000000.ppv22.eth2.cpu3
133: 1118351 0 0 0 ICU 56 Level f4000000.ppv22.eth2.rx_shared
134: 0 0 0 0 ICU 127 Level f4000000.ppv22.eth3.link
135: 0 0 0 0 ICU 41 Level f4000000.ppv22.eth3.cpu0
136: 0 0 0 0 ICU 45 Level f4000000.ppv22.eth3.cpu1
137: 0 0 0 0 ICU 49 Level f4000000.ppv22.eth3.cpu2
138: 0 0 0 0 ICU 53 Level f4000000.ppv22.eth3.cpu3
139: 0 0 0 0 ICU 57 Level f4000000.ppv22.eth3.rx_shared
140: 0 0 0 0 MSI 0 Edge PCIe PME, aerdrv
141: 0 0 0 0 pMSI 1024 Edge f26a0000.dma_xor
142: 0 0 0 0 pMSI 1280 Edge f26c0000.dma_xor
143: 0 0 0 0 pMSI 1536 Edge f46a0000.dma_xor
144: 0 0 0 0 pMSI 1792 Edge f46c0000.dma_xor
145: 1 1 0 0 MSI 524288 Edge enp1s0np0-lsc
146: 0 0 0 0 MSI 524289 Edge enp1s0np0-exn
147: 86 0 0 0 MSI 524290 Edge enp1s0np0-rxtx-0
148: 0 11 0 0 MSI 524291 Edge enp1s0np0-rxtx-1
149: 0 0 14 0 MSI 524292 Edge enp1s0np0-rxtx-2
150: 0 0 0 0 MSI 524293 Edge enp1s0np0-rxtx-3
IPI0: 62499 15118972 10318390 12673769 Rescheduling interrupts

The thing to note here is that eth2 is the management port and is enabled at startup. The IXIA tests run on the eth1 (Marvell) and enp1s0np0 (proprietary) NICs. It does not matter which NIC I exercise: the respective NIC generates almost no interrupts, and interrupt traffic due to networking always ends up on:

133: 1118351 0 0 0 ICU 56 Level f4000000.ppv22.eth2.rx_shared

and also:

IPI0: 62499 15118972 10318390 12673769 Rescheduling interrupts

The rescheduling interrupts are somewhat load balanced (although, watching it live, the load is not really actively balanced but rather switched periodically), while RX traffic appears to all come in on IRQ 133.

This, I believe, is causing a serious networking bottleneck.

Any ideas or comments on my observation?

Regards,
Fred