Re: AW: AW: SSL accelerators and LVS by Peter Baitz

To: Joseph Mack <mack.joseph@xxxxxxx>
Subject: Re: AW: AW: SSL accelerators and LVS by Peter Baitz
Cc: "LinuxVirtualServer.org users mailing list." <lvs-users@xxxxxxxxxxxxxxxxxxxxxx>
Cc: MKrauss@xxxxxxxxxxxxxx
From: Julian Anastasov <ja@xxxxxx>
Date: Wed, 12 Mar 2003 00:13:28 +0200 (EET)
        Hello,

On Mon, 11 Mar 2003, Joseph Mack wrote:

> > My experience shows
> > that the user-space model needs many threads just for private keys
>
> private keys are kept in threads rather than in a table?

        You have to use the following sequence for an incoming
SSL connection (in user space):

SSL_accept
loop:
        SSL_read, SSL_write to client
        read, write to real server

The SSL_accept operation is the bottleneck for software-only SSL
processing: SSL_accept performs the private key operation, which is
very expensive. The hardware accelerator cards offload at least this
step (the engine is used internally by SSL_accept), but we continue
to call SSL_read and SSL_write without using the hardware engine.
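
        As a rough illustration only (this is my sketch, not code
from this thread): the sequence above written with OpenSSL, where a
hardware engine is registered for RSA so that only the private key
work inside SSL_accept() is offloaded, while SSL_read()/SSL_write()
stay on the CPU. The engine id "cswift", the file names and the
simplified half-duplex relay loop are placeholders/assumptions.

#include <unistd.h>
#include <openssl/ssl.h>
#include <openssl/engine.h>

static SSL_CTX *make_ctx(void)
{
    SSL_CTX *ctx;
    ENGINE *e;

    SSL_library_init();
    SSL_load_error_strings();

    /* Register the accelerator for RSA only; the symmetric crypto
     * done by SSL_read/SSL_write still runs on the CPU. */
    ENGINE_load_builtin_engines();
    e = ENGINE_by_id("cswift");            /* placeholder engine id */
    if (e && ENGINE_init(e))
        ENGINE_set_default_RSA(e);

    ctx = SSL_CTX_new(SSLv23_server_method());
    SSL_CTX_use_certificate_file(ctx, "server.crt", SSL_FILETYPE_PEM);
    SSL_CTX_use_PrivateKey_file(ctx, "server.key", SSL_FILETYPE_PEM);
    return ctx;
}

/* Relay one client connection to one real server connection. */
static void handle_conn(SSL_CTX *ctx, int client_fd, int rs_fd)
{
    char buf[4096];
    int n;
    SSL *ssl = SSL_new(ctx);

    SSL_set_fd(ssl, client_fd);

    /* Phase 1: private key operation -- offloaded to the card,
     * but this thread blocks here until the card answers. */
    if (SSL_accept(ssl) <= 0)
        goto out;

    /* Phase 2: data encryption/decryption -- done by the CPU. */
    while ((n = SSL_read(ssl, buf, sizeof(buf))) > 0) {
        if (write(rs_fd, buf, n) < 0)              /* to real server */
            break;
        n = read(rs_fd, buf, sizeof(buf));         /* from real server */
        if (n <= 0)
            break;
        SSL_write(ssl, buf, n);                    /* back to client */
    }
out:
    SSL_shutdown(ssl);
    SSL_free(ssl);
}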

        Considering the above sequence, we have two phases which
repeat for every incoming connection:

- waiting for the card driver to finish the private key operation.
That means one thread blocked in SSL_accept until it finishes.

- doing the I/O for the connection (this includes data encryption
and decryption and everything else).

        What we want is, while the card is busy with processing
(there is no PCI I/O during this processing), to use the CPU not
for the idle kernel process but for encryption and decryption of
other connections that are not waiting on the engine. So, the goal
is to keep the accelerator's request queue full and to use the CPU
at the same time for other processing. As a result, the accelerator
reaches its designed limit of RSA keys/sec and we don't waste CPU
time waiting on a single SSL_accept for results.
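
        A minimal sketch of that threading model (again my own
illustration, not code from this thread): one thread per accepted
connection, so that while some threads are blocked in SSL_accept()
and queued on the card, others keep the CPU busy with SSL_read()
and SSL_write(). handle_conn() is the helper from the previous
sketch and connect_to_real_server() is a hypothetical helper.

#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/socket.h>
#include <openssl/ssl.h>

struct conn { SSL_CTX *ctx; int client_fd; };

static void *conn_thread(void *arg)
{
    struct conn *c = arg;
    int rs_fd = connect_to_real_server();   /* hypothetical helper */

    /* Blocks in SSL_accept() (card), then does crypto I/O (CPU). */
    handle_conn(c->ctx, c->client_fd, rs_fd);

    close(rs_fd);
    close(c->client_fd);
    free(c);
    return NULL;
}

/* Accept loop: as long as enough threads are running, some are
 * always queued on the card while others encrypt/decrypt on the
 * CPU, so neither side sits idle. */
static void serve(SSL_CTX *ctx, int listen_fd)
{
    for (;;) {
        struct conn *c = malloc(sizeof(*c));
        pthread_t tid;

        c->ctx = ctx;
        c->client_fd = accept(listen_fd, NULL, NULL);
        pthread_create(&tid, NULL, conn_thread, c);
        pthread_detach(tid);
    }
}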

> > to keep the accelerator busy
>
> if there was only one thread, the accelerator would be a bottle neck
> or it wouldn't work at all?

        The CPU is idle waiting for SSL_accept (i.e. for the card),
then the card is idle waiting for the CPU to encrypt/decrypt data.
The result could be 20% usage of the card, 30% usage of the CPU,
and a happy idle process: 70% of the CPU.
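
        As a back-of-envelope illustration of those numbers (the
per-phase timings below are assumptions chosen only to reproduce
the percentages above, not measurements):

#include <stdio.h>

int main(void)
{
    /* One connection at a time: the three phases are serialized. */
    double t_card = 2.0;  /* ms blocked in SSL_accept (RSA on the card) */
    double t_cpu  = 3.0;  /* ms of SSL_read/SSL_write crypto on the CPU */
    double t_net  = 5.0;  /* ms waiting for client / real server I/O    */
    double total  = t_card + t_cpu + t_net;

    printf("card busy: %.0f%%\n", 100.0 * t_card / total);          /* 20% */
    printf("CPU  busy: %.0f%%\n", 100.0 * t_cpu / total);           /* 30% */
    printf("CPU  idle: %.0f%%\n", 100.0 * (total - t_cpu) / total); /* 70% */
    return 0;
}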

> > and the rest
>
> of the CPU time?
>
> > is spent for CPU encryption
> > and decryption of the data. I don't know if this is true
> > for all cards. But even with accelerator, the using of SSL costs 3-4
> > times more than just the plain HTTP.
>
> 3-4 times the number of CPU cycles?

        Yes, handling SSL-encrypted HTTP traffic with a hardware
accelerator is 3-4 times slower than handling the same HTTP traffic
without SSL. Of course, without a hardware accelerator this
difference could be 20 times, depending on the CPU model used. What
I want to say is that it is better to do this processing only at
the place where it is needed: if the SSL traffic has to be
decrypted for scheduling and persistence reasons, then it should be
done in the director, but if SSL is used only as a secure transport
and not for the above reasons, then it is better to buy one or more
hardware accelerator cards for the real servers. Loading the
director should be avoided if possible.

        As for any tricks to include LVS in the SSL processing, I
don't know how that can be done without kernel-space SSL hardware
acceleration support. And even then, LVS cannot be used; maybe
ktcpvs can perform URL switching and cookie management.

> thanks Joe

Regards

--
Julian Anastasov <ja@xxxxxx>
