Memory Tuning - How OS Page Sizes Can Impact Oracle Performance
Theory
Modern operating systems provide each process its own virtual address space that is potentially larger than physical memory. They do this by mapping virtual addresses onto actual physical memory pages. The following diagram shows how large virtual address spaces are mapped onto smaller physical memory.
The virtual-to-physical relationship is maintained in mapping tables. Linux maintains a separate map for each process. The map contains one entry for each page, where the size of a page is set by the operating system. The CPU speeds up the translation of virtual addresses to physical addresses by caching a subset of the mapping entries in an internal cache, the translation lookaside buffer (TLB). Modern large-memory systems, and the resulting large processes, have grown the memory mapping tables faster than CPU cache sizes have grown. The translation cache is very small, often only around 100 entries on many processors. This mismatch causes cache misses that can have significant performance implications.
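To make the mapping concrete, here is a minimal sketch of how a virtual address splits into a page number (the index into the mapping table) and an offset within the page, assuming the common 4KB page size. The address value is purely illustrative:

```python
PAGE_SIZE = 4 * 1024            # 4KB pages, the Linux default on x86-64

virtual_address = 0x7F3A1C2B4D5E    # an arbitrary example address

page_number = virtual_address // PAGE_SIZE   # which entry in the mapping table
offset = virtual_address % PAGE_SIZE         # byte position within that page

print(hex(page_number), hex(offset))
```

Every distinct page number needs its own mapping-table entry, which is why larger pages mean dramatically fewer entries.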
System Analysis
Linux uses physical memory to maintain the per-process memory mapping tables, and swaps those tables in and out of the processor cache on context switches and on page-table faults. That physical memory is unavailable for other uses, effectively reducing the amount of memory available to the rest of the system. The amount of memory used for page tables is each process's virtual size divided by the page size, multiplied by the size of a page table entry, summed across all processes. So a system with a 20GB Oracle SGA uses roughly 50MB of physical memory for page tables just for the SGA:
- 20GB process size / 4KB page size * 10-byte entry size = 50MB (5 million entries)
- 20GB process size / 4KB page size * 10-byte entry size * 200 connections = 10GB (1 billion page table entries)
Linux provides the ability to change the page size from 4KB to 2MB, potentially returning large amounts of memory to the swappable pool. Calculations show that converting just the Oracle SGA in the previous example would save almost 10GB.
- 20GB process size / 2MB page size * 10-byte entry size * 200 connections = 20MB (2 million entries)
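The two calculations above can be sketched in a few lines of Python. This uses the same assumed 10-byte page table entry and 200 connections as the text; only the page size changes between the two cases:

```python
GB = 1024 ** 3
ENTRY_BYTES = 10        # assumed per-entry cost, as in the calculations above
SGA_BYTES = 20 * GB
CONNECTIONS = 200

def page_table_bytes(page_size):
    """Page-table memory across all connections mapping the SGA."""
    entries_per_process = SGA_BYTES // page_size   # one entry per mapped page
    return entries_per_process * ENTRY_BYTES * CONNECTIONS

small = page_table_bytes(4 * 1024)          # 4KB pages
huge = page_table_bytes(2 * 1024 * 1024)    # 2MB huge pages

print(small / GB, huge / (1024 ** 2))       # roughly 10GB vs roughly 20MB
```

The roughly 500-fold drop comes directly from the 4KB-to-2MB page size ratio (512:1).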
Huge (2MB) pages differ from normal 4KB pages in that they are locked into memory and do not swap. This means extreme care must be taken when selecting the number and size of the huge page pool. The huge page space should be large enough to hold the Oracle SGA while leaving enough other memory to run the non-SGA client code, the operating system, and other processes. Teams must understand their memory needs so that they can set the correct number of huge pages.
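Sizing the pool is simple arithmetic once the SGA size is known. A minimal sketch, assuming 2MB huge pages and an illustrative 36GB SGA (your own SGA size goes here); the result is the value you would set as `vm.nr_hugepages` via sysctl on Linux:

```python
GB = 1024 ** 3
HUGE_PAGE_BYTES = 2 * 1024 ** 2     # 2MB huge pages

sga_bytes = 36 * GB                 # illustrative SGA size, not from the text

# Round up so the whole SGA always fits in the locked huge page pool.
nr_hugepages = -(-sga_bytes // HUGE_PAGE_BYTES)   # ceiling division

print(nr_hugepages)
```

Because huge pages are locked and unswappable, deliberately over-allocating this number starves the rest of the system; size it to the SGA, not to total RAM.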
Production Example
I was talking with a colleague about their system and we did some back-of-the-napkin calculations. They had an Oracle RAC cluster with two 72GB servers, each carrying 500 database connections. It ran a little slower than expected when both machines were up, but their real problem was that they could not fail over to a single node even though they had plenty of CPU capacity. The following calculations show why.
Before
- Page table entries (SGA) = 35GB / 4KB = 8.75 million per process
- Page table entries (1000 connections) = 1000 * 8.75 million = 8.75 billion
- Memory (1000 connections) = 10 bytes * 8.75 billion = 87.5GB
After
- Page table entries (SGA) = 35GB / 2MB = 17,500 per process
- Page table entries (1000 connections) = 1000 * 17,500 = 17.5 million
- Memory (1000 connections) = 10 bytes * 17.5 million = 175MB
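The before/after figures can be reproduced with the same formula used earlier. This sketch uses decimal units (35GB = 35 × 10⁹ bytes), matching the rounding in the napkin math above:

```python
def page_table_memory(sga_bytes, page_size, connections, entry_bytes=10):
    """Total page-table memory for all connections mapping the SGA."""
    entries_per_process = sga_bytes // page_size
    return entries_per_process * connections * entry_bytes

SGA = 35 * 10**9            # 35GB SGA, decimal units as in the text
before = page_table_memory(SGA, 4 * 10**3, 1000)    # 4KB pages
after = page_table_memory(SGA, 2 * 10**6, 1000)     # 2MB huge pages

print(before / 10**9, "GB before;", after / 10**6, "MB after")
```

87.5GB of page tables on a 72GB server explains why a single node could not absorb the full connection load, regardless of spare CPU.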
The team converted 35GB of memory to huge pages so that the entire SGA fit into them. This returned 40GB of free memory to each machine and made it possible to fall back to a single machine during planned or unplanned outages of one server.
A smaller system at another site recovered 7GB of physical memory on its 32GB servers by converting a 20GB Oracle SGA to huge pages. This ended all of their system paging.
Recap
Intelligent conversion from small memory pages to large memory pages on modern large-RAM systems can have significant performance implications:
- Larger page sizes result in fewer page table entries in the CPU cache. This results in fewer cache misses and less CPU stalling.
- The memory savings from the roughly 500-fold reduction in page table entries are multiplied by the number of processes sharing the same shared memory.