Hi All,
After giving it a lot of thought, we have found a solution to the problem
mentioned previously, so I am giving it back to the community.
1. The observation : Some websites assume that the IP address of a given
client stays the same for the whole browsing session, because many recent
websites use the incoming IP address as a key for session generation and
other application environment specifics... This kind of problem is mainly
observed with a loadbalanced PROXY topology, because proxy selection is not
sticky for a given client.
2. The need : We have a two-level PROXY topology => SQUID proxy and VIRUS
proxy. Each SQUID proxy (SPi) must have a VIRUS proxy as its parent (PPi).
We have a pool of proxies of each kind.
3. Solutions : There are several ways to stick a specific client to a
PROXY :
3.1. Fixing one PPi per SPi, with PPi failover handled at the SPi level
This setup is summarized by the following sketch :
              +----------------------------+
              |          Internet          |
              +----------------------------+
                  |         |         |
        +---------+         |         +---------+
        |                   |                   |
+----------------+  +----------------+  +----------------+
| PROXY-PARENT-1 |  | PROXY-PARENT-2 |  | PROXY-PARENT-3 |
+----------------+  +----------------+  +----------------+
        |                   |                   |
+---------------+   +---------------+   +---------------+
| SQUID-PROXY-1 |   | SQUID-PROXY-2 |   | SQUID-PROXY-3 |
+---------------+   +---------------+   +---------------+
        |                   |                   |
        +---------+         |         +---------+
                  |         |         |
              +----------------------------+
              |        LoadBalancer        |
              +----------[ VIP ]-----------+
=> All web browsers on the LAN use the loadbalancer VIP as their default
proxy
=> The loadbalancer uses persistent SPi selection for each LAN client's
requests
=> Each SPi uses a specific/dedicated PPi (PPi <-> SPi)
=> In PPi failover mode, if PPi fails then SPi selects the next parent,
PP[(i mod 3)+1]
on SP1 :
cache_host PROXY-PARENT-1 parent 3128 8080 no-query
cache_host PROXY-PARENT-2 parent 3128 8080 no-query
cache_host PROXY-PARENT-3 parent 3128 8080 no-query
on SP2 :
cache_host PROXY-PARENT-2 parent 3128 8080 no-query
cache_host PROXY-PARENT-3 parent 3128 8080 no-query
cache_host PROXY-PARENT-1 parent 3128 8080 no-query
on SP3 :
cache_host PROXY-PARENT-3 parent 3128 8080 no-query
cache_host PROXY-PARENT-1 parent 3128 8080 no-query
cache_host PROXY-PARENT-2 parent 3128 8080 no-query
=> If the previously failed PPi comes back, SPi switches back to PPi
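
The parent rotation in the configuration above can be sketched in a few
lines of Python. This is only an illustration, assuming that an SPi tries
its parents in the order they are listed and returns to the first one as
soon as it is alive again, as described above. The function names are
hypothetical and not part of Squid.

```python
from typing import List, Optional, Set


def parent_order(i: int, n: int) -> List[int]:
    """1-based parent indices for SPi: PPi first, then the rotation."""
    return [((i - 1 + k) % n) + 1 for k in range(n)]


def cache_host_lines(i: int, n: int) -> List[str]:
    """Config lines for SPi, using the same directive and ports as above."""
    return [
        f"cache_host PROXY-PARENT-{p} parent 3128 8080 no-query"
        for p in parent_order(i, n)
    ]


def active_parent(i: int, n: int, alive: Set[int]) -> Optional[int]:
    """First parent in SPi's order that is alive, or None if all are down."""
    for p in parent_order(i, n):
        if p in alive:
            return p
    return None


if __name__ == "__main__":
    for i in (1, 2, 3):
        print(f"on SP{i} :")
        print("\n".join(cache_host_lines(i, 3)))
```

For example, active_parent(2, 3, {1, 3}) gives 3 (PP2 is down, so SP2 moves
to PP3), and with all parents alive it gives 2 again.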
Conclusion on this solution : This solution works fine but suffers from
some problems :
=> Scalability is limited, since adding a new SP to the loadbalanced SP
pool implies installing a new PP, because we define a one-to-one relation
between SP & PP
=> Convergence when SPi switches back from the failover PPi : We tried to
measure the convergence time during this transition. It is not constant
and ranged from 30s to 2min in our tests.
3.2. Using a multi-level loadbalanced topology with NAT
This setup is summarized by the following sketch :
              +----------------------------+
              |          Internet          |
              +----------------------------+
                            |
              +----------------------------+
              |           router           |
              +----------------------------+
                  |         |         |
        +---------+         |         +---------+
        |                   |                   |
+----------------+  +----------------+  +----------------+
| PROXY-PARENT-1 |  | PROXY-PARENT-2 |  | PROXY-PARENT-3 |
+----------------+  +----------------+  +----------------+
        |                   |                   |
        +---------+         |         +---------+
                  |         |         |
              +----------------------------+
              |       LoadBalancer 2       |
              +---------[ VIP-PP ]---------+
                            |
        +-------------------+-------------------+
        |                   |                   |
+---------------+   +---------------+   +---------------+
| SQUID-PROXY-1 |   | SQUID-PROXY-2 |   | SQUID-PROXY-3 |
+---------------+   +---------------+   +---------------+
        |                   |                   |
        +---------+         |         +---------+
                  |         |         |
              +----------------------------+
              |       LoadBalancer 1       |
              +---------[ VIP-SP ]---------+
=> All web browsers on the LAN use the loadbalancer VIP-SP as their
default proxy
=> No persistence/stickiness is used on the loadbalancers.
=> Each SPi uses the VIP-PP as its default parent
=> The upstream routing equipment NATs the traffic coming from all PPi, so
websites see the same source address whichever PPi handled the request
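
To illustrate why NAT removes the need for stickiness, here is a small
Python simulation. It is only a sketch: all addresses are hypothetical
(documentation ranges), the router is assumed to translate everything to a
single address, and LoadBalancer 2 is modelled as a plain random choice,
which is not how any particular loadbalancer works.

```python
import random

# Hypothetical addresses, taken from the RFC 5737 documentation ranges.
PARENT_ADDRS = ["192.0.2.11", "192.0.2.12", "192.0.2.13"]  # PP1..PP3
NAT_ADDR = "203.0.113.1"  # assumed single address used by the NATing router


def source_seen_by_site(pp_addr: str, nat: bool) -> str:
    """Source address the website sees for a request leaving via pp_addr."""
    return NAT_ADDR if nat else pp_addr


def browse(requests: int, nat: bool, rng: random.Random) -> set:
    """Distinct source addresses a website sees during one client's session.

    The SPi chosen by LoadBalancer 1 is not modelled, because it never
    appears as the source address on the Internet side.
    """
    seen = set()
    for _ in range(requests):
        pp_addr = rng.choice(PARENT_ADDRS)  # LoadBalancer 2: no stickiness
        seen.add(source_seen_by_site(pp_addr, nat))
    return seen


if __name__ == "__main__":
    print("without NAT:", sorted(browse(200, False, random.Random(0))))
    print("with NAT   :", sorted(browse(200, True, random.Random(0))))
```

Without NAT the session shows up under several PPi addresses, while with
NAT the website only ever sees the router address.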
Conclusion : We have chosen to implement this setup because of :
=> Scalability : The PROXY pools (SP pool & PP pool) are completely
independent, so increasing PP capacity does not require increasing SP
capacity. This is useful because the PPi (VIRUS stream checkers) are more
heavily loaded than the SPi, since they inspect every stream coming from
the Internet. Their processing capacity must therefore be decoupled from
the SP pool.
=> Failover convergence is handled at the layer 3/4 level, so convergence
time depends on routing.
=> This solution did make us think about the intrinsic limits of NAT,
because the NATing router will carry all the PPi flows... This is the only
limitation/potential problem I can see.
Hope it will be useful,
Best regards,
Alexandre