From: Stephen Hemminger
To: dev@dpdk.org
Cc: Stephen Hemminger
Subject: [PATCH v3 5/7] net/tap: use libbpf to load new BPF program
Date: Thu, 8 Feb 2024 09:41:29 -0800
Message-ID: <20240208175051.326550-6-stephen@networkplumber.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240208175051.326550-1-stephen@networkplumber.org>
References: <20240130034925.44869-1-stephen@networkplumber.org>
 <20240208175051.326550-1-stephen@networkplumber.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

There were multiple issues in the RSS queue support in the TAP
driver. This required extensive rework of the BPF support.

Change the BPF loading to use bpftool to create a skeleton header
file, and load it with libbpf. The BPF program is always compiled
from source, so there is less chance that the source and the
instructions diverge. This also resolves the issue where libbpf and
the source get out of sync. The program is only loaded once, so if
multiple rules are created, only one BPF program is loaded in the
kernel.

The new BPF program only needs a single action.
There is no need for a separate action and re-classification step.

It also fixes the missing bits from the original:
    - supports setting the RSS key per flow
    - level of hash can be L3 or L3/L4

Signed-off-by: Stephen Hemminger
---
 drivers/net/tap/bpf/tap_bpf_program.o | Bin 0 -> 28080 bytes
 drivers/net/tap/meson.build           |  26 +-
 drivers/net/tap/rte_eth_tap.c         |   2 +
 drivers/net/tap/rte_eth_tap.h         |   9 +-
 drivers/net/tap/tap_flow.c            | 381 +++++++-------------------
 drivers/net/tap/tap_flow.h            |  11 +-
 drivers/net/tap/tap_rss.h             |   3 +
 drivers/net/tap/tap_tcmsgs.h          |   4 +-
 8 files changed, 113 insertions(+), 323 deletions(-)
 create mode 100644 drivers/net/tap/bpf/tap_bpf_program.o

diff --git a/drivers/net/tap/bpf/tap_bpf_program.o b/drivers/net/tap/bpf/tap_bpf_program.o
new file mode 100644
index 0000000000000000000000000000000000000000..db016eaadefc63e0bbc45b3115f497a465438d33
GIT binary patch
literal 28080
[base85-encoded object data omitted]

[...]nlsk_fd);
 		internals->nlsk_fd = -1;
+		tap_flow_bpf_destroy(internals);
 	}

 	for (i = 0; i < RTE_PMD_TAP_MAX_QUEUES; i++) {
@@ -1959,6 +1960,7 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, const char *tap_name,
 	strlcpy(pmd->name, tap_name, sizeof(pmd->name));
 	pmd->type = type;
 	pmd->ka_fd = -1;
+	pmd->rss = NULL;
 	pmd->nlsk_fd = -1;
 	pmd->gso_ctx_mp = NULL;

diff --git a/drivers/net/tap/rte_eth_tap.h b/drivers/net/tap/rte_eth_tap.h
index 5ac93f93e961..0cf2b30bb03b 100644
--- a/drivers/net/tap/rte_eth_tap.h
+++ b/drivers/net/tap/rte_eth_tap.h
@@ -79,12 +79,11 @@ struct pmd_internals {
 	int flow_isolate; /* 1 if flow isolation is enabled */
 	int flower_support; /* 1 if kernel supports, else 0 */
 	int flower_vlan_support; /* 1 if kernel supports, else 0 */
-	int rss_enabled; /* 1 if RSS is enabled, else 0 */
 	int persist; /* 1 if keep link up, else 0 */
-	/* implicit rules set when RSS is enabled */
-	int map_fd; /* BPF RSS map fd */
-	int bpf_fd[RTE_PMD_TAP_MAX_QUEUES];/* List of bpf fds per queue */
-	LIST_HEAD(tap_rss_flows, rte_flow) rss_flows;
+
+	struct tap_rss *rss; /* BPF program */
+	uint16_t bpf_flowid; /* next BPF class id */
+
 	LIST_HEAD(tap_flows, rte_flow) flows; /* rte_flow rules */
 	/* implicit rte_flow rules set when a remote device is active */
 	LIST_HEAD(tap_implicit_flows, rte_flow) implicit_flows;
diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
index 94436af55ce8..bc817bdb4fd3 100644
--- a/drivers/net/tap/tap_flow.c
+++ b/drivers/net/tap/tap_flow.c
@@ -16,24 +16,16 @@
 #include
 #include

-#include
 #include
 #include
-
-/* RSS key management */
-enum bpf_rss_key_e {
-	KEY_CMD_GET = 1,
-	KEY_CMD_RELEASE,
-	KEY_CMD_INIT,
-	KEY_CMD_DEINIT,
-};
-
-enum key_status_e {
-	KEY_STAT_UNSPEC,
-	KEY_STAT_USED,
-	KEY_STAT_AVAILABLE,
-};

+#ifdef HAVE_BPF_RSS
+/* Workaround for warning in bpftool generated skeleton code */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wcast-qual"
+#include "tap_rss.skel.h"
+#pragma GCC diagnostic pop
+#endif

 #define ISOLATE_HANDLE 1
 #define REMOTE_PROMISCUOUS_HANDLE 2
@@ -41,8 +33,7 @@ enum key_status_e {
 struct rte_flow {
 	LIST_ENTRY(rte_flow) next; /* Pointer to the next rte_flow structure */
 	struct rte_flow *remote_flow; /* associated remote flow */
-	int bpf_fd[SEC_MAX]; /* list of bfs fds per ELF section */
-	uint32_t key_idx; /* RSS rule key index into BPF map */
+	uint16_t flowid;
 	struct nlmsg msg;
 };

@@ -72,6 +63,7 @@ struct action_data {
 		} skbedit;
 		struct bpf {
 			struct tc_act_bpf bpf;
+			uint16_t classid;
 			int bpf_fd;
 			const char *annotation;
 		} bpf;
@@ -112,13 +104,12 @@ tap_flow_isolate(struct rte_eth_dev *dev,
 		 int set,
 		 struct rte_flow_error *error);

-static int bpf_rss_key(enum bpf_rss_key_e cmd, __u32 *key_idx);
-static int rss_enable(struct pmd_internals *pmd,
-			const struct rte_flow_attr *attr,
-			struct rte_flow_error *error);
+#ifdef HAVE_BPF_RSS
+static int rss_enable(struct pmd_internals *pmd, struct rte_flow_error *error);
 static int rss_add_actions(struct rte_flow *flow, struct pmd_internals *pmd,
 			   const struct rte_flow_action_rss *rss,
 			   struct rte_flow_error *error);
+#endif

 static const struct rte_flow_ops tap_flow_ops = {
 	.validate = tap_flow_validate,
@@ -865,6 +856,9 @@ add_action(struct rte_flow *flow, size_t *act_index, struct action_data *adata)
 		tap_nlattr_add(&msg->nh, TCA_ACT_BPF_PARMS,
 			   sizeof(adata->bpf.bpf),
 			   &adata->bpf.bpf);
+		tap_nlattr_add(&msg->nh, TCA_BPF_CLASSID,
+			       sizeof(adata->bpf.classid),
+			       &adata->bpf.classid);
 	} else {
 		return -1;
 	}
@@ -1101,8 +1095,7 @@ priv_flow_process(struct pmd_internals *pmd,
 					},
 				};

-				err = add_actions(flow, 1, &adata,
-						  TCA_FLOWER_ACT);
+				err = add_actions(flow, 1, &adata, TCA_FLOWER_ACT);
 			}
 		} else if (actions->type == RTE_FLOW_ACTION_TYPE_QUEUE) {
 			const struct rte_flow_action_queue *queue =
@@ -1129,6 +1122,7 @@ priv_flow_process(struct pmd_internals *pmd,
 				err = add_actions(flow, 1, &adata,
 						  TCA_FLOWER_ACT);
 			}
+#ifdef HAVE_BPF_RSS
 		} else if (actions->type == RTE_FLOW_ACTION_TYPE_RSS) {
 			const struct rte_flow_action_rss *rss =
 				(const struct rte_flow_action_rss *)
@@ -1137,13 +1131,14 @@ priv_flow_process(struct pmd_internals *pmd,
 			if (action++)
 				goto exit_action_not_supported;

-			if (!pmd->rss_enabled) {
-				err = rss_enable(pmd, attr, error);
+			if (pmd->rss == NULL) {
+				err = rss_enable(pmd, error);
 				if (err)
 					goto exit_action_not_supported;
 			}
 			if (flow)
 				err = rss_add_actions(flow, pmd, rss, error);
+#endif
 		} else {
 			goto exit_action_not_supported;
 		}
@@ -1239,26 +1234,17 @@ tap_flow_set_handle(struct rte_flow *flow)
  *
  */
 static void
-tap_flow_free(struct pmd_internals *pmd, struct rte_flow *flow)
+tap_flow_free(struct pmd_internals *pmd __rte_unused, struct rte_flow *flow)
 {
-	int i;
-
 	if (!flow)
 		return;

-	if (pmd->rss_enabled) {
-		/* Close flow BPF file descriptors */
-		for (i = 0; i < SEC_MAX; i++)
-			if (flow->bpf_fd[i] != 0) {
-				close(flow->bpf_fd[i]);
-				flow->bpf_fd[i] = 0;
-			}
-
-		/* Release the map key for this RSS rule */
-		bpf_rss_key(KEY_CMD_RELEASE, &flow->key_idx);
-		flow->key_idx = 0;
-	}
-
+#ifdef HAVE_BPF_RSS
+	struct tap_rss *rss = pmd->rss;
+	if (rss)
+		bpf_map__delete_elem(rss->maps.rss_map, &flow->flowid,
+				     sizeof(flow->flowid), 0);
+#endif
 	/* Free flow allocated memory */
 	rte_free(flow);
 }
@@ -1725,14 +1711,18 @@ tap_flow_implicit_flush(struct pmd_internals *pmd, struct rte_flow_error *error)
 	return 0;
 }

-#define MAX_RSS_KEYS 256
-#define KEY_IDX_OFFSET (3 * MAX_RSS_KEYS)
-#define SEC_NAME_CLS_Q "cls_q"
-
-static const char *sec_name[SEC_MAX] = {
-	[SEC_L3_L4] = "l3_l4",
-};
+/**
+ * Cleanup when device is closed
+ */
+void tap_flow_bpf_destroy(struct pmd_internals *pmd __rte_unused)
+{
+#ifdef HAVE_BPF_RSS
+	tap_rss__destroy(pmd->rss);
+	pmd->rss = NULL;
+#endif
+}

+#ifdef HAVE_BPF_RSS
 /**
  * Enable RSS on tap: create TC rules for queuing.
  *
@@ -1747,226 +1737,62 @@ static const char *sec_name[SEC_MAX] = {
  *
  * @return 0 on success, negative value on failure.
  */
-static int rss_enable(struct pmd_internals *pmd,
-			const struct rte_flow_attr *attr,
-			struct rte_flow_error *error)
+static int rss_enable(struct pmd_internals *pmd, struct rte_flow_error *error)
 {
-	struct rte_flow *rss_flow = NULL;
-	struct nlmsg *msg = NULL;
-	/* 4096 is the maximum number of instructions for a BPF program */
-	char annotation[64];
-	int i;
-	int err = 0;
-
-	/* unlimit locked memory */
-	struct rlimit memlock_limit = {
-		.rlim_cur = RLIM_INFINITY,
-		.rlim_max = RLIM_INFINITY,
-	};
-	setrlimit(RLIMIT_MEMLOCK, &memlock_limit);
-
-	/* Get a new map key for a new RSS rule */
-	err = bpf_rss_key(KEY_CMD_INIT, NULL);
-	if (err < 0) {
-		rte_flow_error_set(
-			error, EINVAL, RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
-			"Failed to initialize BPF RSS keys");
-
-		return -1;
-	}
-
-	/*
-	 * Create BPF RSS MAP
-	 */
-	pmd->map_fd = tap_flow_bpf_rss_map_create(sizeof(__u32), /* key size */
-				sizeof(struct rss_key),
-				MAX_RSS_KEYS);
-	if (pmd->map_fd < 0) {
-		TAP_LOG(ERR,
-			"Failed to create BPF map (%d): %s",
-				errno, strerror(errno));
-		rte_flow_error_set(
-			error, ENOTSUP, RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
-			"Kernel too old or not configured "
-			"to support BPF maps");
+	int err;

-		return -ENOTSUP;
+	/* Load the BPF program (defined in tap_bpf.h from skeleton) */
+	pmd->rss = tap_rss__open_and_load();
+	if (pmd->rss == NULL) {
+		TAP_LOG(ERR, "Failed to load BPF object: %s", strerror(errno));
+		rte_flow_error_set(error, errno, RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"BPF object could not be loaded");
+		return -errno;
 	}

-	/*
-	 * Add a rule per queue to match reclassified packets and direct them to
-	 * the correct queue.
-	 */
-	for (i = 0; i < pmd->dev->data->nb_rx_queues; i++) {
-		pmd->bpf_fd[i] = tap_flow_bpf_cls_q(i);
-		if (pmd->bpf_fd[i] < 0) {
-			TAP_LOG(ERR,
-				"Failed to load BPF section %s for queue %d",
-				SEC_NAME_CLS_Q, i);
-			rte_flow_error_set(
-				error, ENOTSUP, RTE_FLOW_ERROR_TYPE_HANDLE,
-				NULL,
-				"Kernel too old or not configured "
-				"to support BPF programs loading");
-
-			return -ENOTSUP;
-		}
-
-		rss_flow = rte_zmalloc(__func__, sizeof(struct rte_flow), 0);
-		if (!rss_flow) {
-			TAP_LOG(ERR,
-				"Cannot allocate memory for rte_flow");
-			return -1;
-		}
-		msg = &rss_flow->msg;
-		tc_init_msg(msg, pmd->if_index, RTM_NEWTFILTER, NLM_F_REQUEST |
-			    NLM_F_ACK | NLM_F_EXCL | NLM_F_CREATE);
-		msg->t.tcm_info = TC_H_MAKE(0, htons(ETH_P_ALL));
-		tap_flow_set_handle(rss_flow);
-		uint16_t group = attr->group << GROUP_SHIFT;
-		uint16_t prio = group | (i + PRIORITY_OFFSET);
-		msg->t.tcm_info = TC_H_MAKE(prio << 16, msg->t.tcm_info);
-		msg->t.tcm_parent = TC_H_MAKE(MULTIQ_MAJOR_HANDLE, 0);
-
-		tap_nlattr_add(&msg->nh, TCA_KIND, sizeof("bpf"), "bpf");
-		if (tap_nlattr_nested_start(msg, TCA_OPTIONS) < 0)
-			return -1;
-		tap_nlattr_add32(&msg->nh, TCA_BPF_FD, pmd->bpf_fd[i]);
-		snprintf(annotation, sizeof(annotation), "[%s%d]",
-			SEC_NAME_CLS_Q, i);
-		tap_nlattr_add(&msg->nh, TCA_BPF_NAME, strlen(annotation) + 1,
-			   annotation);
-		/* Actions */
-		{
-			struct action_data adata = {
-				.id = "skbedit",
-				.skbedit = {
-					.skbedit = {
-						.action = TC_ACT_PIPE,
-					},
-					.queue = i,
-				},
-			};
-			if (add_actions(rss_flow, 1, &adata, TCA_BPF_ACT) < 0)
-				return -1;
-		}
-		tap_nlattr_nested_finish(msg); /* nested TCA_OPTIONS */
-
-		/* Netlink message is now ready to be sent */
-		if (tap_nl_send(pmd->nlsk_fd, &msg->nh) < 0)
-			return -1;
-		err = tap_nl_recv_ack(pmd->nlsk_fd);
-		if (err < 0) {
-			TAP_LOG(ERR,
-				"Kernel refused TC filter rule creation (%d): %s",
-				errno, strerror(errno));
-			return err;
-		}
-		LIST_INSERT_HEAD(&pmd->rss_flows, rss_flow, next);
+	/* Attach the maps defined in BPF program */
+	err = tap_rss__attach(pmd->rss);
+	if (err < 0) {
+		TAP_LOG(ERR, "Failed to attach BPF object: %d", err);
+		rte_flow_error_set(error, -err, RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"BPF object could not be attached");
+		tap_flow_bpf_destroy(pmd);
+		return err;
 	}

-	pmd->rss_enabled = 1;
-	return err;
+	return 0;
 }

-/**
- * Manage bpf RSS keys repository with operations: init, get, release
- *
- * @param[in] cmd
- *   Command on RSS keys: init, get, release
- *
- * @param[in, out] key_idx
- *   Pointer to RSS Key index (out for get command, in for release command)
- *
- * @return -1 if couldn't get, release or init the RSS keys, 0 otherwise.
- */
-static int bpf_rss_key(enum bpf_rss_key_e cmd, __u32 *key_idx)
-{
-	__u32 i;
-	int err = 0;
-	static __u32 num_used_keys;
-	static __u32 rss_keys[MAX_RSS_KEYS] = {KEY_STAT_UNSPEC};
-	static __u32 rss_keys_initialized;
-	__u32 key;
-
-	switch (cmd) {
-	case KEY_CMD_GET:
-		if (!rss_keys_initialized) {
-			err = -1;
-			break;
-		}
-
-		if (num_used_keys == RTE_DIM(rss_keys)) {
-			err = -1;
-			break;
-		}
-
-		*key_idx = num_used_keys % RTE_DIM(rss_keys);
-		while (rss_keys[*key_idx] == KEY_STAT_USED)
-			*key_idx = (*key_idx + 1) % RTE_DIM(rss_keys);
-
-		rss_keys[*key_idx] = KEY_STAT_USED;
-
-		/*
-		 * Add an offset to key_idx in order to handle a case of
-		 * RSS and non RSS flows mixture.
-		 * If a non RSS flow is destroyed it has an eBPF map
-		 * index 0 (initialized on flow creation) and might
-		 * unintentionally remove RSS entry 0 from eBPF map.
-		 * To avoid this issue, add an offset to the real index
-		 * during a KEY_CMD_GET operation and subtract this offset
-		 * during a KEY_CMD_RELEASE operation in order to restore
-		 * the real index.
-		 */
-		*key_idx += KEY_IDX_OFFSET;
-		num_used_keys++;
-		break;
-
-	case KEY_CMD_RELEASE:
-		if (!rss_keys_initialized)
-			break;
-
-		/*
-		 * Subtract offset to restore real key index
-		 * If a non RSS flow is falsely trying to release map
-		 * entry 0 - the offset subtraction will calculate the real
-		 * map index as an out-of-range value and the release operation
-		 * will be silently ignored.
-		 */
-		key = *key_idx - KEY_IDX_OFFSET;
-		if (key >= RTE_DIM(rss_keys))
-			break;
-		if (rss_keys[key] == KEY_STAT_USED) {
-			rss_keys[key] = KEY_STAT_AVAILABLE;
-			num_used_keys--;
-		}
-		break;
-
-	case KEY_CMD_INIT:
-		for (i = 0; i < RTE_DIM(rss_keys); i++)
-			rss_keys[i] = KEY_STAT_AVAILABLE;
+/* Choose next flow id to use for BPF action */
+static int tap_rss_flow_assign(struct pmd_internals *pmd, uint16_t *flow_id)
+{
+	struct rte_flow *flow;
+	uint16_t id;

-		rss_keys_initialized = 1;
-		num_used_keys = 0;
-		break;
+	id = pmd->bpf_flowid;

-	case KEY_CMD_DEINIT:
-		for (i = 0; i < RTE_DIM(rss_keys); i++)
-			rss_keys[i] = KEY_STAT_UNSPEC;
+next_id:
+	/* Skip 0xffff and 0 as id's */
+	if (++id == UINT16_MAX)
+		id = 1;

-		rss_keys_initialized = 0;
-		num_used_keys = 0;
-		break;
+	/* Wrapped around, all id's have been used */
+	if (id == pmd->bpf_flowid)
+		return -1;

-	default:
-		break;
+	/* Make sure this id has not been used already */
+	for (flow = LIST_FIRST(&pmd->flows); flow; flow = LIST_NEXT(flow, next)) {
+		if (flow->flowid == id)
+			goto next_id;
 	}

-	return err;
+	/* Record starting point for next time */
+	pmd->bpf_flowid = id;
+	*flow_id = id;
+	return 0;
 }

-
 /* Default RSS hash key also used by mlx devices */
 static const uint8_t rss_hash_default_key[] = {
 	0x2c, 0xc6, 0x81, 0xd1,
@@ -2050,34 +1876,34 @@ static int rss_add_actions(struct rte_flow *flow, struct pmd_internals *pmd,
 	else if (rss->types & (RTE_ETH_RSS_IPV6 | RTE_ETH_RSS_FRAG_IPV6 | RTE_ETH_RSS_IPV6_EX))
 		hash_type |= RTE_BIT32(HASH_FIELD_IPV6_L3);

-	/* Get a new map key for a new RSS rule */
-	err = bpf_rss_key(KEY_CMD_GET, &flow->key_idx);
+
+	/* Choose new flow id, which is used as index into the BPF map */
+	err = tap_rss_flow_assign(pmd, &flow->flowid);
 	if (err < 0) {
 		rte_flow_error_set(
 			error, EINVAL, RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
-			"Failed to get BPF RSS key");
+			"Failed to get BPF flowid");

 		return -1;
 	}

-	/* Update RSS map entry with queues */
-	rss_entry.nb_queues = rss->queue_num;
-	for (i = 0; i < rss->queue_num; i++)
-		rss_entry.queues[i] = rss->queue[i];
-
 	rss_entry.hash_fields = hash_type;
 	rte_convert_rss_key((const uint32_t *)key_in, (uint32_t *)rss_entry.key,
 			    TAP_RSS_HASH_KEY_SIZE);

+	/* Update RSS map entry with queues */
+	rss_entry.nb_queues = rss->queue_num;
+	for (i = 0; i < rss->queue_num; i++)
+		rss_entry.queues[i] = rss->queue[i];

 	/* Add this RSS entry to map */
-	err = tap_flow_bpf_update_rss_elem(pmd->map_fd,
-					   &flow->key_idx, &rss_entry);
-
+	err = bpf_map__update_elem(pmd->rss->maps.rss_map,
+				   &flow->flowid, sizeof(uint16_t),
+				   &rss_entry, sizeof(rss_entry), 0);
 	if (err) {
 		TAP_LOG(ERR,
 			"Failed to update BPF map entry #%u (%d): %s",
-			flow->key_idx, errno, strerror(errno));
+			flow->flowid, errno, strerror(errno));
 		rte_flow_error_set(
 			error, ENOTSUP, RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
 			"Kernel too old or not configured "
@@ -2086,33 +1912,16 @@ static int rss_add_actions(struct rte_flow *flow, struct pmd_internals *pmd,
 		return -ENOTSUP;
 	}

-
-	/*
-	 * Load bpf rules to calculate hash for this key_idx
-	 */
-
-	flow->bpf_fd[SEC_L3_L4] =
-		tap_flow_bpf_calc_l3_l4_hash(flow->key_idx, pmd->map_fd);
-	if (flow->bpf_fd[SEC_L3_L4] < 0) {
-		TAP_LOG(ERR,
-			"Failed to load BPF section %s (%d): %s",
-				sec_name[SEC_L3_L4], errno, strerror(errno));
-		rte_flow_error_set(
-			error, ENOTSUP, RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
-			"Kernel too old or not configured "
-			"to support BPF program loading");
-
-		return -ENOTSUP;
-	}
-
 	/* Actions */
 	{
+		const struct bpf_program *rss_prog = pmd->rss->progs.rss_flow_action;
 		struct action_data adata[] = {
 			{
 				.id = "bpf",
 				.bpf = {
-					.bpf_fd = flow->bpf_fd[SEC_L3_L4],
-					.annotation = sec_name[SEC_L3_L4],
+					.annotation = "tap_rss",
+					.classid = flow->flowid,
+					.bpf_fd = bpf_program__fd(rss_prog),
 					.bpf = {
 						.action = TC_ACT_PIPE,
 					},
@@ -2120,13 +1929,13 @@ static int rss_add_actions(struct rte_flow *flow, struct pmd_internals *pmd,
 			},
 		};

-		if (add_actions(flow, RTE_DIM(adata), adata,
-			TCA_FLOWER_ACT) < 0)
+		if (add_actions(flow, 1, adata, TCA_FLOWER_ACT) < 0)
 			return -1;
 	}

 	return 0;
 }
+#endif

 /**
  * Get rte_flow operations.
diff --git a/drivers/net/tap/tap_flow.h b/drivers/net/tap/tap_flow.h
index 240fbc3dfaef..41f9833619a1 100644
--- a/drivers/net/tap/tap_flow.h
+++ b/drivers/net/tap/tap_flow.h
@@ -9,7 +9,6 @@
 #include
 #include
 #include
-#include

 /**
  * In TC, priority 0 means we require the kernel to allocate one for us.
@@ -41,10 +40,6 @@ enum implicit_rule_index {
 	TAP_REMOTE_MAX_IDX,
 };

-enum bpf_fd_idx {
-	SEC_L3_L4,
-	SEC_MAX,
-};

 int tap_dev_flow_ops_get(struct rte_eth_dev *dev,
 			 const struct rte_flow_ops **ops);
@@ -57,10 +52,6 @@ int tap_flow_implicit_destroy(struct pmd_internals *pmd,
 int tap_flow_implicit_flush(struct pmd_internals *pmd,
 			    struct rte_flow_error *error);

-int tap_flow_bpf_cls_q(__u32 queue_idx);
-int tap_flow_bpf_calc_l3_l4_hash(__u32 key_idx, int map_fd);
-int tap_flow_bpf_rss_map_create(unsigned int key_size, unsigned int value_size,
-			unsigned int max_entries);
-int tap_flow_bpf_update_rss_elem(int fd, void *key, void *value);
+void tap_flow_bpf_destroy(struct pmd_internals *pmd);

 #endif /* _TAP_FLOW_H_ */
diff --git a/drivers/net/tap/tap_rss.h b/drivers/net/tap/tap_rss.h
index 6009be7031b0..51b7ff0d007e 100644
--- a/drivers/net/tap/tap_rss.h
+++ b/drivers/net/tap/tap_rss.h
@@ -9,6 +9,9 @@
 #define TAP_MAX_QUEUES 16
 #endif

+/* Size of the map from BPF classid to queue table */
+#define TAP_RSS_MAX TAP_MAX_QUEUES
+
 /* Fixed RSS hash key size in bytes. */
 #define TAP_RSS_HASH_KEY_SIZE 40

diff --git a/drivers/net/tap/tap_tcmsgs.h b/drivers/net/tap/tap_tcmsgs.h
index a64cb29d6fa8..00a0f22e3108 100644
--- a/drivers/net/tap/tap_tcmsgs.h
+++ b/drivers/net/tap/tap_tcmsgs.h
@@ -6,7 +6,6 @@
 #ifndef _TAP_TCMSGS_H_
 #define _TAP_TCMSGS_H_

-#include
 #include
 #include
 #include
@@ -14,9 +13,8 @@
 #include
 #include
 #include
-#ifdef HAVE_TC_ACT_BPF
 #include
-#endif
+
 #include
 #include

-- 
2.43.0
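
The flow id allocation performed by tap_rss_flow_assign() in the patch can be
illustrated in isolation. What follows is a minimal standalone sketch, not the
driver code: struct flowid_pool, flowid_assign() and flowid_release() are
hypothetical names introduced for illustration, and a flag array stands in for
the walk over the pmd->flows list. It keeps the same id range (0 and 0xffff are
never handed out) and the same "resume after the last id" behaviour, but bounds
the search with a counter rather than a wrap-around comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver state: one flag per possible id. */
struct flowid_pool {
	uint16_t last;          /* last id handed out; 0 means none yet */
	bool used[UINT16_MAX];  /* used[id] is true while id is allocated */
};

/*
 * Hand out the next free id in the range 1..0xfffe, starting the search
 * just after the previously assigned id.
 * Returns 0 and stores the id in *out, or -1 when every id is taken.
 */
static int flowid_assign(struct flowid_pool *pool, uint16_t *out)
{
	uint16_t id;
	uint32_t tries;

	assert(pool != NULL && out != NULL);
	id = pool->last;

	/* There are UINT16_MAX - 1 valid ids, so try each one at most once. */
	for (tries = 0; tries < UINT16_MAX - 1; tries++) {
		/* Skip 0xffff and 0, as the patch does. */
		if (++id == UINT16_MAX)
			id = 1;

		if (!pool->used[id]) {
			pool->used[id] = true;
			pool->last = id;  /* record starting point for next time */
			*out = id;
			return 0;
		}
	}
	return -1;
}

/* Return an id to the pool; reserved values are ignored. */
static void flowid_release(struct flowid_pool *pool, uint16_t id)
{
	assert(pool != NULL);
	if (id != 0 && id != UINT16_MAX)
		pool->used[id] = false;
}
```

A caller would keep one zero-initialized pool per device, call flowid_assign()
when an RSS rule is created, and call flowid_release() when the rule is freed.
The flag array trades about 64 KiB of memory for a constant-time "is this id in
use" check; walking the flow list, as the patch does, needs no extra storage.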