Re: [lvs-users] ipvs-dr and ip_vs_conn

To: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [lvs-users] ipvs-dr and ip_vs_conn
From: "calculator@xxxxxxx" <calculator@xxxxxxx>
Date: Fri, 06 Nov 2009 16:59:47 +0300

Simon Horman wrote:
> On Thu, Nov 05, 2009 at 11:22:54PM +0300, calculator@xxxxxxx wrote:
>   
>> Strange, I can't see my previous answer. Trying again.
>>     
>>> On Tue, Nov 03, 2009 at 11:54:07AM +0300, calculator@xxxxxxx wrote:
>>>       
>>>> Graeme Fowler:
>>>>         
>>>>> Hi there
>>>>>
>>>>> On Sun, 2009-11-01 at 11:10 +0300, calculator@xxxxxxx wrote:
>>>>>
>>>>>           
>>>>>> Similar problem post on
>>>>>> http://lists.graemef.net/pipermail/lvs-users/2005-May/013820.html
>>>>>>
>>>>>>             
>>>>> Do you mean you have a director which is running out of memory?
>>>>>
>>>>>           
>>>> No, my mistake.
>>>> Records in /proc/net/ip_vs_conn are always removed eventually, and that's ok,
>>>> but with default settings I sometimes get "IPVS: ip_vs_conn_new: no
>>>> memory available." even with 4G of memory...
>>>> What I don't understand is: why are there records in /proc/net/ip_vs_conn
>>>> at all if I use the ipvs-dr scheme with the sh scheduler?
>>>>
>>>> Is this because we do not want to drop "not fully opened"/"not
>>>> fully closed" connections, so we keep the sh scheduler's _routing table_ fresh?
>>>>         
>>> I guess that technically it ought to be possible to do without
>>> /proc/net/ip_vs_conn entries when the sh-scheduler is in use.
>>> But from an implementation point of view this is quite
>>> a departure from the way LVS works internally. That is,
>>> the scheduler is called to choose a real-server for new connections
>>> and an entry is created in /proc/net/ip_vs_conn which is used
>>> for the rest of the life of the connection.
>>>
>>>       
>>>> And one more question. If I set 'ipvsadm --set 1 1 1', will it affect the
>>>> whole of ipvs (are packets that do not complete the whole cycle within 1 second
>>>> dropped?), or will it only affect the number of records in ip_vs_conn?
>>>>         
>>> I doubt that timeouts that short will have any effect on the system
>>> as there are other, per-connection-state timeouts which are significantly
>>> larger. I imagine that the smallest workable timeout for TCP would
>>> be around 2 minutes.
>>>       
>> Am I right that with lvs-dr+sh it is the timeouts on the real server that
>> matter, not those in ip_vs_conn?
>> Are packets always delivered to the real server according to the sh _route
>> table_? Or not?
>>     
>
> I'm not sure what you are asking.
>
>   
>> simple test:
>> LVS ~ # ipvsadm --set 1 1 1
>> http_test ~ # telnet host 80
>> ...
>> (sleep 5 sec)
>> GET / HTTP/1.0
>> HTTP/1.1 302 Moved Temporarily
>> ...
>> Connection closed by foreign host.
>>
>> LVS ~ # cat /proc/net/ip_vs_conn
>> Pro FromIP FPrt ToIP TPrt DestIP DPrt State Expires
>> TCP D949C8D6 A77D D949C8DE 0050 0A680466 0050 ESTABLISHED 0
>> (sleep 5 sec)
>> LVS ~ # cat /proc/net/ip_vs_conn
>> Pro FromIP FPrt ToIP TPrt DestIP DPrt State Expires
>> TCP D949C8D6 A77D D949C8DE 0050 0A680466 0050 CLOSE 8
>>
>> All ok.
>>     
>
> Do you notice any difference in memory usage when you set time-outs to 1 1 1?
>   
Yes.
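For reference, the timeouts in effect can be verified with ipvsadm's timeout
listing (a minimal check; I would expect output roughly like the below, though
the exact format may vary between versions):

LVS ~ # ipvsadm -L --timeout
Timeout (tcp tcpfin udp): 1 1 1
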
> Have you observed what state the entries in ip_vs_conn are in? Profiling
> that may help in working out how to tune things - or why there are so many
> entries.
>
>   
On the test station, with timeouts set to 1 1 1, I have ~1000 ip_vs_conn objects
after a 2 minute test; with default settings: ~55000.
Profiling would be good, but my competence in profiling is not good :-(
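Maybe something simple like this would do for profiling the states? (a rough
sketch; it just counts the State column of /proc/net/ip_vs_conn and prints one
line per state):

LVS ~ # awk 'NR > 1 { print $8 }' /proc/net/ip_vs_conn | sort | uniq -c | sort -rn

That should show how many entries sit in ESTABLISHED, FIN_WAIT, TIME_WAIT and
so on, which is probably what matters for tuning the timeouts.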
>>> Are you sure that you are exhausting ~4Gbytes of memory
>>> through the entries in /proc/net/ip_vs_conn? To put this
>>> in perspective, this implies that you have in the
>>> order of 30 million connections in /proc/net/ip_vs_conn.
>>>       
>> I have stat of current memory usage:
>>
>> IP Virtual Server version 1.2.1 (size=262144)
>> Prot LocalAddress:Port CPS InPPS OutPPS InBPS OutBPS
>> -> RemoteAddress:Port
>> TCP XXX.XXX.XXX.XXX:80 17446 92686 0 13826264 0
>> ...
>> slabtop:
>>
>> OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
>> 2041710 1976868 96% 0.12K 68057 30 272228K ip_vs_conn
>> 150300 149100 99% 0.25K 10020 15 40080K ip_dst_cache
>> 8482 8408 99% 2.00K 4241 2 16964K size-2048
>> 65726 65689 99% 0.06K 1114 59 4456K inet_peer_cache
>> 13720 8647 63% 0.19K 686 20 2744K skbuff_head_cache
>>     
>
> Unless I'm reading things wrong, it looks like you have about 272Mbytes
> worth of ip_vs_conn entries, or about 2 million entries. I must say
> that I am surprised that the number is that high. It implies
> that you are routing 2 million simultaneous connections through LVS.
>   
This comes from the large expire values (like 900) on some connections.
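A rough back-of-envelope check (assuming the 17446 CPS figure from the stats
above is typical): 2,041,710 entries / 17,446 new connections per second gives
an average lifetime of roughly 117 seconds per ip_vs_conn entry, so even a
modest share of connections sitting in long-expire states could explain the
~2 million entries. And at 128 bytes (0.12K) per entry, filling 4G really would
take on the order of 30 million entries, as you say.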
>   
>> Sometimes the server goes down when CPS is around 45k:
>>     
>
> I'm still a little surprised that you're running out of memory,
> the numbers above don't seem to add up to anything close to 4Gb.
> Perhaps the kernel only has access to a portion of that memory?
>   
Maybe. How can I test this?
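Would something like this show it? On a 32-bit kernel with HIGHMEM (which the
log below looks like), the slab cache can only live in low memory, so I guess
comparing low and high memory should tell:

LVS ~ # grep -E 'LowTotal|LowFree|HighTotal|HighFree' /proc/meminfo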
>   
>> 1 Time(s): Oct 29 14:16:36 lb1-2 kernel: [ 565.355342] IPVS: 
>> ip_vs_conn_new: no memory available.
>> 1 Time(s): Oct 29 14:23:49 lb1-2 kernel: [ 997.509759] printk: 3365 
>> messages suppressed.
>> 1 Time(s): Oct 29 14:23:49 lb1-2 kernel: [ 997.509774] heartbeat invoked 
>> oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
>> 1 Time(s): Oct 29 14:23:49 lb1-2 kernel: [ 997.509798] [<c045489e>] 
>> out_of_memory+0x1ae/0x1e0
>> 1 Time(s): Oct 29 14:23:50 lb1-2 kernel: [ 997.509818] [<c04561a6>] 
>> __alloc_pages+0x276/0x2e0
>> 1 Time(s): Oct 29 14:23:50 lb1-2 kernel: [ 997.509834] [<c045622e>] 
>> __get_free_pages+0x1e/0x40
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.509848] [<c04a68f8>] 
>> proc_file_read+0x98/0x280
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.509863] [<c0635527>] 
>> sys_recv+0x37/0x40
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.509877] [<c0473661>] 
>> vfs_read+0xa1/0x160
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.509968] [<c04729bc>] 
>> vfs_llseek+0x3c/0x50
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.510056] [<c04a6860>] 
>> proc_file_read+0x0/0x280
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.510181] [<c0473bd1>] 
>> sys_read+0x41/0x70
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.510270] [<c040525d>] 
>> sysenter_past_esp+0x56/0x79
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.510362] 
>> =======================
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.510446] Mem-info:
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.510448] DMA per-cpu:
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.510533] cpu 0 hot: high 
>> 0, batch 1 used:0
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.510618] cpu 0 cold: high 
>> 0, batch 1 used:0
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.510704] cpu 1 hot: high 
>> 0, batch 1 used:0
>> 1 Time(s): Oct 29 14:23:51 lb1-2 kernel: [ 997.510789] cpu 1 cold: high 
>> 0, batch 1 used:0
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.510875] DMA32 per-cpu: empty
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.510960] Normal per-cpu:
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511046] cpu 0 hot: high 
>> 186, batch 31 used:32
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511178] cpu 0 cold: high 
>> 62, batch 15 used:51
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511266] cpu 1 hot: high 
>> 186, batch 31 used:167
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511350] cpu 1 cold: high 
>> 62, batch 15 used:54
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511434] HighMem per-cpu:
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511516] cpu 0 hot: high 
>> 186, batch 31 used:21
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511600] cpu 0 cold: high 
>> 62, batch 15 used:7
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511684] cpu 1 hot: high 
>> 186, batch 31 used:140
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511768] cpu 1 cold: high 
>> 62, batch 15 used:10
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511853] Free pages: 
>> 2142380kB (2137456kB HighMem)
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.511940] Active:17594 
>> inactive:2652 dirty:0 writeback:38 unstable:0 free:535595 slab:219589 
>> mapped-file:2869 mapped-anon:14553 pagetables:596
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.512151] DMA free:3548kB 
>> min:68kB low:84kB high:100kB active:4kB inactive:0kB present:16384kB 
>> pages_scanned:6 all_unreclaimable? yes
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.512322] lowmem_reserve[]: 
>> 0 0 880 3055
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.512415] DMA32 free:0kB 
>> min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB 
>> pages_scanned:0 all_unreclaimable? no
>> 1 Time(s): Oct 29 14:23:52 lb1-2 kernel: [ 997.512583] lowmem_reserve[]: 
>> 0 0 880 3055
>> 1 Time(s): Oct 29 14:23:53 lb1-2 kernel: [ 997.512677] Normal 
>> free:1376kB min:3756kB low:4692kB high:5632kB active:112kB 
>> inactive:180kB present:901120kB pages_scanned:265 all_unreclaimable? no
>> 1 Time(s): Oct 29 14:23:53 lb1-2 kernel: [ 997.512850] lowmem_reserve[]: 
>> 0 0 0 17400
>> 1 Time(s): Oct 29 14:23:53 lb1-2 kernel: [ 997.512943] HighMem 
>> free:2137456kB min:512kB low:2832kB high:5156kB active:70224kB 
>> inactive:10464kB present:2227200kB pages_scanned:0 all_unreclaimable? no
>> 1 Time(s): Oct 29 14:23:53 lb1-2 kernel: [ 997.513136] lowmem_reserve[]: 
>> 0 0 0 0
>> 1 Time(s): Oct 29 14:23:53 lb1-2 kernel: [ 997.513228] DMA: 1*4kB 1*8kB 
>> 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB 
>> = 3548kB
>> 1 Time(s): Oct 29 14:23:53 lb1-2 kernel: [ 997.513417] DMA32: empty
>> 1 Time(s): Oct 29 14:23:53 lb1-2 kernel: [ 997.513498] Normal: 0*4kB 
>> 4*8kB 0*16kB 2*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 
>> 0*4096kB = 1376kB
>> 1 Time(s): Oct 29 14:23:53 lb1-2 kernel: [ 997.513686] HighMem: 182*4kB 
>> 59*8kB 78*16kB 105*32kB 39*64kB 14*128kB 2*256kB 2*512kB 0*1024kB 
>> 0*2048kB 519*4096kB = 2137456kB
>> 1 Time(s): Oct 29 14:23:53 lb1-2 kernel: [ 997.513878] Swap cache: add 
>> 0, delete 0, find 0/0, race 0+0
>> 1 Time(s): Oct 29 14:23:53 lb1-2 kernel: [ 997.513964] Free swap = 0kB
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.514045] Total swap = 0kB
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.514163] Free swap: 0kB
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520251] 786176 pages of RAM
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520253] 556800 pages of 
>> HIGHMEM
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520254] 8061 reserved pages
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520255] 26709 pages shared
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520256] 0 pages swap cached
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520257] 0 pages dirty
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520258] 38 pages writeback
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520259] 2869 pages mapped
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520260] 219589 pages slab
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520261] 596 pages pagetables
>> 1 Time(s): Oct 29 14:23:54 lb1-2 kernel: [ 997.520298] Out of memory: 
>> Killed process 2291 (bash).
>>
>> Maybe there is a kernel misconfiguration?
>>
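If I read the Mem-info above correctly: 219589 slab pages * 4 kB is about
858 MB of slab, which is roughly the whole Normal (lowmem) zone (present:
901120 kB), while the 2137456 kB shown as free are almost all HighMem that the
slab cache cannot use. So maybe it is not the full 4G that runs out, only
lowmem? If so, a 64-bit kernel (or shorter timeouts) would presumably avoid
that limit.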

_______________________________________________
Please read the documentation before posting - it's available at:
http://www.linuxvirtualserver.org/

LinuxVirtualServer.org mailing list - lvs-users@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-users-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://lists.graemef.net/mailman/listinfo/lvs-users
