Wikipedia:Reference desk/Archives/Computing/2018 December 21

Computing desk
< December 20 << Nov | December | Jan >> December 22 >
Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


December 21 edit

memory management and a large hash table edit

I'm using what is basically a large hash table. I'm trying to get as much data as possible into a Windows computer with 128GB of RAM. I'm using Delphi. The number of rows (the hash values) is fixed but the number of items for each hash varies from 0 to over 65000 (and is not known in advance).

My first approach was to use a list of lists. But when I added a new item to the list, it seemed to double the amount of memory allocated, which got to taking up too much memory.

I switched to a dynamic array of dynamic arrays, with the idea of when I needed more data for a particular hash, I'd increase the corresponding array by a certain percentage (I tried 10%, 20%, and 25%). But initially it seems to be using a lot more memory than expected. In the test I'm running now, (has been running for a few hours and will finish overnight), the memory use according to task manager got up to 97%, 98%, and then 99%. It is constantly adding data as it runs. But after a while, the memory use dropped to 83%, then to 70%, and now it is at 62% (even though it is rapidly adding data).

So what I'm wondering is, when I increase the size of a dynamic array by a few percent, is it actually allocating double that (the way it does with lists), and then as it runs, the system is smart enough to realize that it isn't using a lot of this memory and releasing it? Bubba73 You talkin' to me? 04:04, 21 December 2018 (UTC)[reply]

Now it is down to using only 55%, even though it adds about 200,000 8-byte pieces of data per second. You might think that I could let it use more memory to begin with and then let it scale back, but that causes it to swap to HD, which brings it to a crawl. Bubba73 You talkin' to me? 05:51, 21 December 2018 (UTC)[reply]
To answer such a question about how the internal memory of the dynamic array are managed, you will need to tell us exactly what version of Delphi compiler and runtime support libraries you are using.
For example, if you're using Embarcadero Delphi, the official product documentation wiki describes the platform memory manager. On Windows, the 'FastMM' system backs your arrays, and it uses a block-based allocation algorithm, rounding the amount of memory up to the nearest block size.
Modern memory management algorithms are complicated by confounding details of the internal implementation. I would posit that the actual implementation details so complicated that it would be prohibitively difficult for us to give you a very detailed answer. You could empirically measure and benchmark the performance, or you could get in touch with your software vendor if it's really important; or you could write your memory-sensitive code in a language that lets you get closer to the operating-system and hardware memory details, without putting a friendly runtime support library between you and the actual memory layout.
Nimur (talk) 17:30, 21 December 2018 (UTC)[reply]
I'm using the almost-current version 10.2.3 (on 64-bit Windows 100. The program finished a few minutes ago. When I got up today, it was back to using 99% of the memory, but it didn't seem to be swapping to HD (at least I didn't hear any of the constant thrashing to the HD). Bubba73 You talkin' to me? 18:51, 21 December 2018 (UTC)[reply]
It says "For Win32 and Win64, the Memory Manager employs an algorithm that anticipates future block reallocations, reducing the performance impact usually associated with such operations. The reallocation algorithm also helps reduce address space fragmentation." So that must be why it seems to double the allocated space for both lists and dynamic arrays. But I didn't see anything saying that it releases unused memory. The Task Manager reported that consistant dropping of memory usage even as the memory was being filled up. Bubba73 You talkin' to me? 19:00, 21 December 2018 (UTC)[reply]
 
Diagram of a symmetric multiprocessing system

Rent time on Google Compute Engine or one of its competitors?

Buy a used 256GB server for a couple of thousand bucks? https://www.amazon.com/gp/offer-listing/B075XR4G18/

--Guy Macon (talk) 00:09, 22 December 2018 (UTC)[reply]

I'd like to do that. I bought a used HP Z420 workstation with 128GB last month. Bubba73 You talkin' to me? 00:15, 22 December 2018 (UTC)[reply]
Question: with a dual CPU system like that, will each CPU have access to all of the memory or does each of the two chips get half of it? Bubba73 You talkin' to me? 00:18, 22 December 2018 (UTC)[reply]
Each CPU access all of RAM. They each have their own cache so as to minimize waiting for memory. --Guy Macon (talk) 16:16, 22 December 2018 (UTC)[reply]
Thanks, then it looks like I have to have more than one CPU chip to get more memory slots. And a related question: I have 16GB sticks of DDR3 ECC in my HP Z420, even though the docs say a maximum of 8GB sticks (which were probably the maximum at the time). Are there sticks with more capacity that would work? (I found 32GB sticks, but I'm not sure if they are right. The documentation says DDR3 ECC or non-ECC, unbuffered.) Bubba73 You talkin' to me? 03:14, 23 December 2018 (UTC)[reply]