DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages
@ 2014-12-10  2:25 Michael Qiu
  2014-12-10 10:41 ` Bruce Richardson
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Qiu @ 2014-12-10  2:25 UTC (permalink / raw)
  To: dev

When the first address is the compared address in the loop,
it will also do memory copy, which is meaningless,
worse more, when hugepg_tbl is mostly in order. This should
be a big deal in large hugepage memory systerm(like hunderd
or thousand GB).

Meanwhile smallest_idx never be a value of -1,so remove this check.

This patch also includes some coding style fix.

Signed-off-by: Michael Qiu <michael.qiu@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index e6cb919..700aba2 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -678,14 +678,13 @@ error:
 static int
 sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
 {
-	unsigned i, j;
-	int compare_idx;
+	unsigned i, j, compare_idx;
 	uint64_t compare_addr;
 	struct hugepage_file tmp;
 
 	for (i = 0; i < hpi->num_pages[0]; i++) {
 		compare_addr = 0;
-		compare_idx = -1;
+		compare_idx = i;
 
 		/*
 		 * browse all entries starting at 'i', and find the
@@ -704,11 +703,9 @@ sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
 			}
 		}
 
-		/* should not happen */
-		if (compare_idx == -1) {
-			RTE_LOG(ERR, EAL, "%s(): error in physaddr sorting\n", __func__);
-			return -1;
-		}
+		/* avoid memory copy when the first entry is the compared */
+		if (compare_idx == i)
+			continue;
 
 		/* swap the 2 entries in the table */
 		memcpy(&tmp, &hugepg_tbl[compare_idx],
-- 
1.9.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages
  2014-12-10  2:25 [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages Michael Qiu
@ 2014-12-10 10:41 ` Bruce Richardson
  2014-12-10 10:59   ` Qiu, Michael
  0 siblings, 1 reply; 8+ messages in thread
From: Bruce Richardson @ 2014-12-10 10:41 UTC (permalink / raw)
  To: Michael Qiu; +Cc: dev

On Wed, Dec 10, 2014 at 10:25:41AM +0800, Michael Qiu wrote:
> When the first address is the compared address in the loop,
> it will also do memory copy, which is meaningless,
> worse more, when hugepg_tbl is mostly in order. This should
> be a big deal in large hugepage memory systerm(like hunderd
> or thousand GB).

I actually doubt the time taken by this sorting is a significant part of the
initialization time taken, even for hundreds of GBs of memory. Do you have
any measurements to confirm this?
[However, that's not to say that we can't merge in this patch]


> 
> Meanwhile smallest_idx never be a value of -1,so remove this check.
> 
> This patch also includes some coding style fix.
> 
> Signed-off-by: Michael Qiu <michael.qiu@intel.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal_memory.c | 13 +++++--------
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index e6cb919..700aba2 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -678,14 +678,13 @@ error:
>  static int
>  sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
>  {
> -	unsigned i, j;
> -	int compare_idx;
> +	unsigned i, j, compare_idx;
>  	uint64_t compare_addr;
>  	struct hugepage_file tmp;
>  
>  	for (i = 0; i < hpi->num_pages[0]; i++) {
>  		compare_addr = 0;
> -		compare_idx = -1;
> +		compare_idx = i;
>  
>  		/*
>  		 * browse all entries starting at 'i', and find the
> @@ -704,11 +703,9 @@ sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
>  			}
>  		}
>  
> -		/* should not happen */
> -		if (compare_idx == -1) {
> -			RTE_LOG(ERR, EAL, "%s(): error in physaddr sorting\n", __func__);
> -			return -1;
> -		}
> +		/* avoid memory copy when the first entry is the compared */
> +		if (compare_idx == i)
> +			continue;
>  
>  		/* swap the 2 entries in the table */
>  		memcpy(&tmp, &hugepg_tbl[compare_idx],
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages
  2014-12-10 10:41 ` Bruce Richardson
@ 2014-12-10 10:59   ` Qiu, Michael
  2014-12-10 11:08     ` Ananyev, Konstantin
  0 siblings, 1 reply; 8+ messages in thread
From: Qiu, Michael @ 2014-12-10 10:59 UTC (permalink / raw)
  To: Richardson, Bruce; +Cc: dev

On 12/10/2014 6:41 PM, Richardson, Bruce wrote:
> On Wed, Dec 10, 2014 at 10:25:41AM +0800, Michael Qiu wrote:
>> When the first address is the compared address in the loop,
>> it will also do memory copy, which is meaningless,
>> worse more, when hugepg_tbl is mostly in order. This should
>> be a big deal in large hugepage memory systerm(like hunderd
>> or thousand GB).
> I actually doubt the time taken by this sorting is a significant part of the
> initialization time taken, even for hundreds of GBs of memory. Do you have
> any measurements to confirm this?
> [However, that's not to say that we can't merge in this patch]

I've no hardware env of so huge memory, so I haven't do measurements on
this.

This is not a big improvement, but indeed it may do lots of useless
memory copy in initialize stat.

It could, at least could  save time :)

Thanks,
Michael
>
>
>> Meanwhile smallest_idx never be a value of -1,so remove this check.
>>
>> This patch also includes some coding style fix.
>>
>> Signed-off-by: Michael Qiu <michael.qiu@intel.com>
>> ---
>>  lib/librte_eal/linuxapp/eal/eal_memory.c | 13 +++++--------
>>  1 file changed, 5 insertions(+), 8 deletions(-)
>>
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> index e6cb919..700aba2 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
>> @@ -678,14 +678,13 @@ error:
>>  static int
>>  sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
>>  {
>> -	unsigned i, j;
>> -	int compare_idx;
>> +	unsigned i, j, compare_idx;
>>  	uint64_t compare_addr;
>>  	struct hugepage_file tmp;
>>  
>>  	for (i = 0; i < hpi->num_pages[0]; i++) {
>>  		compare_addr = 0;
>> -		compare_idx = -1;
>> +		compare_idx = i;
>>  
>>  		/*
>>  		 * browse all entries starting at 'i', and find the
>> @@ -704,11 +703,9 @@ sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
>>  			}
>>  		}
>>  
>> -		/* should not happen */
>> -		if (compare_idx == -1) {
>> -			RTE_LOG(ERR, EAL, "%s(): error in physaddr sorting\n", __func__);
>> -			return -1;
>> -		}
>> +		/* avoid memory copy when the first entry is the compared */
>> +		if (compare_idx == i)
>> +			continue;
>>  
>>  		/* swap the 2 entries in the table */
>>  		memcpy(&tmp, &hugepg_tbl[compare_idx],
>> -- 
>> 1.9.3
>>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages
  2014-12-10 10:59   ` Qiu, Michael
@ 2014-12-10 11:08     ` Ananyev, Konstantin
  2014-12-10 17:58       ` Jay Rolette
  0 siblings, 1 reply; 8+ messages in thread
From: Ananyev, Konstantin @ 2014-12-10 11:08 UTC (permalink / raw)
  To: Qiu, Michael, Richardson, Bruce; +Cc: dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Qiu, Michael
> Sent: Wednesday, December 10, 2014 10:59 AM
> To: Richardson, Bruce
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages
> 
> On 12/10/2014 6:41 PM, Richardson, Bruce wrote:
> > On Wed, Dec 10, 2014 at 10:25:41AM +0800, Michael Qiu wrote:
> >> When the first address is the compared address in the loop,
> >> it will also do memory copy, which is meaningless,
> >> worse more, when hugepg_tbl is mostly in order. This should
> >> be a big deal in large hugepage memory systerm(like hunderd
> >> or thousand GB).
> > I actually doubt the time taken by this sorting is a significant part of the
> > initialization time taken, even for hundreds of GBs of memory. Do you have
> > any measurements to confirm this?
> > [However, that's not to say that we can't merge in this patch]
> 
> I've no hardware env of so huge memory, so I haven't do measurements on
> this.
> 
> This is not a big improvement, but indeed it may do lots of useless
> memory copy in initialize stat.
> 
> It could, at least could  save time :)

I wonder why we do need to write our own bubble sort procedure?
Why we can't use standard qsort() here?
Konstantin

> 
> Thanks,
> Michael
> >
> >
> >> Meanwhile smallest_idx never be a value of -1,so remove this check.
> >>
> >> This patch also includes some coding style fix.
> >>
> >> Signed-off-by: Michael Qiu <michael.qiu@intel.com>
> >> ---
> >>  lib/librte_eal/linuxapp/eal/eal_memory.c | 13 +++++--------
> >>  1 file changed, 5 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
> >> index e6cb919..700aba2 100644
> >> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> >> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> >> @@ -678,14 +678,13 @@ error:
> >>  static int
> >>  sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
> >>  {
> >> -	unsigned i, j;
> >> -	int compare_idx;
> >> +	unsigned i, j, compare_idx;
> >>  	uint64_t compare_addr;
> >>  	struct hugepage_file tmp;
> >>
> >>  	for (i = 0; i < hpi->num_pages[0]; i++) {
> >>  		compare_addr = 0;
> >> -		compare_idx = -1;
> >> +		compare_idx = i;
> >>
> >>  		/*
> >>  		 * browse all entries starting at 'i', and find the
> >> @@ -704,11 +703,9 @@ sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
> >>  			}
> >>  		}
> >>
> >> -		/* should not happen */
> >> -		if (compare_idx == -1) {
> >> -			RTE_LOG(ERR, EAL, "%s(): error in physaddr sorting\n", __func__);
> >> -			return -1;
> >> -		}
> >> +		/* avoid memory copy when the first entry is the compared */
> >> +		if (compare_idx == i)
> >> +			continue;
> >>
> >>  		/* swap the 2 entries in the table */
> >>  		memcpy(&tmp, &hugepg_tbl[compare_idx],
> >> --
> >> 1.9.3
> >>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages
  2014-12-10 11:08     ` Ananyev, Konstantin
@ 2014-12-10 17:58       ` Jay Rolette
       [not found]         ` <2601191342CEEE43887BDE71AB977258213BED0C@IRSMSX105.ger.corp.intel.com>
  0 siblings, 1 reply; 8+ messages in thread
From: Jay Rolette @ 2014-12-10 17:58 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev

On Wed, Dec 10, 2014 at 5:08 AM, Ananyev, Konstantin <
konstantin.ananyev@intel.com> wrote:

> I wonder why we do need to write our own bubble sort procedure?
> Why we can't use standard qsort() here?
>

Sadly, even bubble sort would be better than the selection sort being used
here. It's guaranteed to be O(n^2) in all cases.

I just got through replacing that entire function in my repo with a call to
qsort() from the standard library last night myself. Faster (although
probably not material to most deployments) and less code.

Jay

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages
       [not found]         ` <2601191342CEEE43887BDE71AB977258213BED0C@IRSMSX105.ger.corp.intel.com>
@ 2014-12-10 18:39           ` Ananyev, Konstantin
  2014-12-10 21:36             ` Jay Rolette
  0 siblings, 1 reply; 8+ messages in thread
From: Ananyev, Konstantin @ 2014-12-10 18:39 UTC (permalink / raw)
  To: rolette; +Cc: dev


 
> From: Jay Rolette [mailto:rolette@infiniteio.com]
> Sent: Wednesday, December 10, 2014 5:58 PM
> To: Ananyev, Konstantin
> Cc: Qiu, Michael; Richardson, Bruce; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages
> 
> 
> On Wed, Dec 10, 2014 at 5:08 AM, Ananyev, Konstantin <konstantin.ananyev@intel.com> wrote:
> I wonder why we do need to write our own bubble sort procedure?
> Why we can't use standard qsort() here?
> 
> Sadly, even bubble sort would be better than the selection sort being used here. It's guaranteed to be O(n^2) in all cases.

Ah yes, your right it is a selection sort here.

> 
> I just got through replacing that entire function in my repo with a call to qsort() from the standard library last night myself. Faster
> (although probably not material to most deployments) and less code.

If you feel like it is worth it, why not to submit a patch? :)
Thanks
Konstantin

> 
> Jay

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages
  2014-12-10 18:39           ` Ananyev, Konstantin
@ 2014-12-10 21:36             ` Jay Rolette
  2014-12-11  1:44               ` Qiu, Michael
  0 siblings, 1 reply; 8+ messages in thread
From: Jay Rolette @ 2014-12-10 21:36 UTC (permalink / raw)
  To: Ananyev, Konstantin; +Cc: dev

On Wed, Dec 10, 2014 at 12:39 PM, Ananyev, Konstantin <
konstantin.ananyev@intel.com> wrote:

> > I just got through replacing that entire function in my repo with a call
> to qsort() from the standard library last night myself. Faster
> > (although probably not material to most deployments) and less code.
>
> If you feel like it is worth it, why not to submit a patch? :)


On Haswell and IvyBridge Xeons, with 128 1G huge pages, it doesn't make a
user-noticeable difference in the time required for
rte_eal_hugepage_init(). The reason I went ahead and checked it in my repo
is because:

a) it eats at my soul to see an O(n^2) case for something where qsort() is
trivial to use
b) we will increase that up to ~232 1G huge pages soon. Likely doesn't
matter at that point either, but since it was already written...

What *does* chew up a lot of time in init is where the huge pages are being
explicitly zeroed in map_all_hugepages().

Removing that memset() makes find_numasocket() blow up, but I was able to
do a quick test where I only memset 1 byte on each page. That cut init time
by 30% (~20 seconds in my test).  Significant, but since I'm not entirely
sure it is safe, I'm not making that change right now.

On Linux, shared memory that isn't file-backed is automatically zeroed
before the app gets it. However, I haven't had a chance to chase down
whether that applies to huge pages or not, much less how hugetlbfs factors
into the equation.

Back to the question about the patch, if you guys are interested in it,
I'll have to figure out your patch submission process. Shouldn't be a huge
deal other than the fact that we are on DPDK 1.6 (r2).

Cheers,
Jay

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages
  2014-12-10 21:36             ` Jay Rolette
@ 2014-12-11  1:44               ` Qiu, Michael
  0 siblings, 0 replies; 8+ messages in thread
From: Qiu, Michael @ 2014-12-11  1:44 UTC (permalink / raw)
  To: Jay Rolette, Ananyev, Konstantin; +Cc: dev

On 12/11/2014 5:37 AM, Jay Rolette wrote:
> On Wed, Dec 10, 2014 at 12:39 PM, Ananyev, Konstantin <
> konstantin.ananyev@intel.com> wrote:
>
>>> I just got through replacing that entire function in my repo with a call
>> to qsort() from the standard library last night myself. Faster
>>> (although probably not material to most deployments) and less code.
>> If you feel like it is worth it, why not to submit a patch? :)
>
> On Haswell and IvyBridge Xeons, with 128 1G huge pages, it doesn't make a
> user-noticeable difference in the time required for
> rte_eal_hugepage_init(). The reason I went ahead and checked it in my repo
> is because:
>
> a) it eats at my soul to see an O(n^2) case for something where qsort() is
> trivial to use
> b) we will increase that up to ~232 1G huge pages soon. Likely doesn't
> matter at that point either, but since it was already written...
>
> What *does* chew up a lot of time in init is where the huge pages are being
> explicitly zeroed in map_all_hugepages().
>
> Removing that memset() makes find_numasocket() blow up, but I was able to
> do a quick test where I only memset 1 byte on each page. That cut init time
> by 30% (~20 seconds in my test).  Significant, but since I'm not entirely
> sure it is safe, I'm not making that change right now.
>
> On Linux, shared memory that isn't file-backed is automatically zeroed
> before the app gets it. However, I haven't had a chance to chase down
> whether that applies to huge pages or not, much less how hugetlbfs factors
> into the equation.
>
> Back to the question about the patch, if you guys are interested in it,
> I'll have to figure out your patch submission process. Shouldn't be a huge
> deal other than the fact that we are on DPDK 1.6 (r2).

Go ahead and post it :)

Thanks,
Michael
> Cheers,
> Jay
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-12-11  1:45 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-10  2:25 [dpdk-dev] [PATCH] Avoid possible memory cpoy when sort hugepages Michael Qiu
2014-12-10 10:41 ` Bruce Richardson
2014-12-10 10:59   ` Qiu, Michael
2014-12-10 11:08     ` Ananyev, Konstantin
2014-12-10 17:58       ` Jay Rolette
     [not found]         ` <2601191342CEEE43887BDE71AB977258213BED0C@IRSMSX105.ger.corp.intel.com>
2014-12-10 18:39           ` Ananyev, Konstantin
2014-12-10 21:36             ` Jay Rolette
2014-12-11  1:44               ` Qiu, Michael

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git