On Sun, Nov 11, 2007 at 11:47:02PM +0200, Rumen Bogdanovski wrote:
> Hi,
> I have been thinking about the best way to handle the
> synchronization created connections and I decided (since
> there was no comment on that) that the best way to do so
> is as they are locally created. The only difference is that
> synced connections will have IP_VS_CONN_F_SYNC set. The
> reasons are:
>
> 1. If the sync connections are treated as inactive, on
> fail-over the task schedulers will use them with lower
> weight making the scheduling not quite adequate.
>
> 2. Introducing new counter for synced connection means
> that schedulers should be altered to take these connections
> in to account. And the existing tools should be changed too
> in order to show the real connection status.
>
> 3. No backward compatibility issues, no new data to be sent
> to the user space. All active connections will be listed as
> active and all inactive will be listed as inactive regardless
> of the connection origin.
>
> 4. The previous patch I have sent several days ago makes it
> possible to distinguish the locally created from sync-created
> connections via /proc/net/ip_vs_conn_sync.
>
> 5. It is less intrusive and much easier to implement.
>
> 6. Active connections are active no matter where they were created.
> On fail-over they will load the new director the same way the local
> connections do. And their life cycle will be just the same as if they
> were locally created. So why should they be treated differently?
> Any reason for that? I could not think of any.
>
> So I am sending again the patch that
> implements /proc/net/ip_vs_conn_sync. And a new one that creates the
> synced connections according to their
> real state allowing the backup director, in case of fail-over, to handle
> them in more appropriate way, avoiding scheduling and compatibility
> issues. Well In case somebody wants to know which connections are
> created locally which are sync-created he is always welcome to look in
> /proc/net/ip_vs_conn_sync.
>
> Any comments please!
> Simon, if you are happy with this let me know and I will provide a brief
> description and sign-off.
Hi Rumen,
sorry to be slow.
I think that your approach is sound. The patches seem to work in
such a way that they won't break existing tools. But tools could
be enhanced to take advantage of these new features. Likewise
for the schedulers.
The patches look fine. Though I'd like to request two things.
1) could you run thr patches through ./scripts/checkpatch.pl (in the
kernel tree) and fix the things that it complains about.
2) If at all possible, could you make the patches against the
net-2.6 tree that you can get using git from git.kernel.org.
See http://git.kernel.org/ or just run
git clone git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
I have also put some minor comments inline.
>
> Regards,
> Rumen
>
>
>
> On Wed, 2007-11-07 at 15:36 +0900, Simon Horman wrote:
> > On Mon, Nov 05, 2007 at 05:25:16PM +0200, Rumen Bogdanovski wrote:
> > > I was thinking about some cosmetics by using currently unused
> > > connection flag IP_VS_CONN_F_SYNC and introducing dest.syncconns (sync
> > > connection counter). This way we should introduce additional field in
> > > /proc/net/ip_vs and make it show dest.syncconns in addition to
> > > inactconns and activeconns. Also a new field in /proc/net/ip_vs_conn
> > > to designate whether it is a local or sync connection should be added.
> > > IMHO this way it should be much nicer and more informative to the
> > > user. However changing this will not be well accepted (I guess), so I
> > > have other idea: to introduce /proc/net/ip_vs_sync and
> > > /proc/net/ip_vs_conn_sync to show only syncconns the same way
> > > /proc/net/ip_vs and /proc/net/ip_vs_conn does. I can work on this in
> > > my free time if one of the ideas is accepted.
> > >
> > > Do you like the idea? I am waiting for comments.
> >
> > Hi,
> >
> > yes, I like the idea. But for compatibility reasons I think that for
> > compatibility reasons we will have to go with the /proc/net/ip_vs_sync
> > and /proc/net/ip_vs_conn_sync version of the idea.
> >
> diff -Naur linux-2.6.23.1_patch/net/ipv4/ipvs/ip_vs_conn.c
> linux-2.6.23.1_patch1/net/ipv4/ipvs/ip_vs_conn.c
> --- linux-2.6.23.1_patch/net/ipv4/ipvs/ip_vs_conn.c 2007-11-11
> 14:40:25.000000000 -0500
> +++ linux-2.6.23.1_patch1/net/ipv4/ipvs/ip_vs_conn.c 2007-11-11
> 14:30:38.000000000 -0500
> @@ -390,9 +390,14 @@
>
> /* Increase the refcnt counter of the dest */
> atomic_inc(&dest->refcnt);
> -
I think that this blank line should stay.
> /* Bind with the destination and its corresponding transmitter */
> - cp->flags |= atomic_read(&dest->conn_flags);
> + if ((cp->flags & IP_VS_CONN_F_SYNC) &&
> + (!(cp->flags & IP_VS_CONN_F_TEMPLATE)))
> + /* if the connection is normal and created by sync,
> + preserve the activity flag. */
I think that instead of "normal" it would be clearer to say
"not a template". Also, I believe that the usual format for
comments in kernel networking code is as follows, though
most of the lvs code uses a different style:
/* blah
* blah
*/
> + cp->flags |= atomic_read(&dest->conn_flags) &
> (~IP_VS_CONN_F_INACTIVE);
> + else
> + cp->flags |= atomic_read(&dest->conn_flags);
> cp->dest = dest;
>
> IP_VS_DBG(7, "Bind-dest %s c:%u.%u.%u.%u:%d v:%u.%u.%u.%u:%d "
> @@ -411,7 +416,11 @@
> /* It is a normal connection, so increase the inactive
> connection counter because it is in TCP SYNRECV
> state (inactive) or other protocol inacive state */
> - atomic_inc(&dest->inactconns);
> + if ((cp->flags & IP_VS_CONN_F_SYNC) &&
> + (!(cp->flags & IP_VS_CONN_F_INACTIVE)))
> + atomic_inc(&dest->activeconns);
> + else
> + atomic_inc(&dest->inactconns);
> } else {
> /* It is a persistent connection/template, so increase
> the peristent connection counter */
> diff -Naur linux-2.6.23.1_patch/net/ipv4/ipvs/ip_vs_sync.c
> linux-2.6.23.1_patch1/net/ipv4/ipvs/ip_vs_sync.c
> --- linux-2.6.23.1_patch/net/ipv4/ipvs/ip_vs_sync.c 2007-11-11
> 14:40:25.000000000 -0500
> +++ linux-2.6.23.1_patch1/net/ipv4/ipvs/ip_vs_sync.c 2007-11-11
> 14:49:01.000000000 -0500
> @@ -326,6 +326,13 @@
> dest = ip_vs_find_dest(s->daddr, s->dport,
> s->vaddr, s->vport,
> s->protocol);
> + /* Set the approprite ativity flag */
> + if (s->protocol == IPPROTO_TCP) {
> + if (ntohs(s->state) != IP_VS_TCP_S_ESTABLISHED)
> + flags |= IP_VS_CONN_F_INACTIVE;
> + else
> + flags &= ~IP_VS_CONN_F_INACTIVE;
> + }
> cp = ip_vs_conn_new(s->protocol,
> s->caddr, s->cport,
> s->vaddr, s->vport,
> diff -Naur linux-2.6.23.1_patch/net/ipv4/ipvs/ip_vs_conn.c
> linux-2.6.23.1_patch1/net/ipv4/ipvs/ip_vs_conn.c
> --- linux-2.6.23.1_patch/net/ipv4/ipvs/ip_vs_conn.c 2007-11-08
> 09:21:30.000000000 -0500
> +++ linux-2.6.23.1_patch1/net/ipv4/ipvs/ip_vs_conn.c 2007-11-09
> 07:22:24.000000000 -0500
> @@ -783,6 +783,57 @@
> .llseek = seq_lseek,
> .release = seq_release,
> };
> +
> +const char *ip_vs_origin_name(unsigned flags)
> +{
> + if (flags & IP_VS_CONN_F_SYNC)
> + return "SYNC";
> + else
> + return "LOCAL";
> +}
Should ip_vs_origin_name() be static?
> +static int ip_vs_conn_sync_seq_show(struct seq_file *seq, void *v)
> +{
> +
> + if (v == SEQ_START_TOKEN)
> + seq_puts(seq,
> + "Pro FromIP FPrt ToIP TPrt DestIP DPrt State Origin
> Expires\n");
> + else {
> + const struct ip_vs_conn *cp = v;
> +
> + seq_printf(seq,
> + "%-3s %08X %04X %08X %04X %08X %04X %-11s %-6s %7lu\n",
> + ip_vs_proto_name(cp->protocol),
> + ntohl(cp->caddr), ntohs(cp->cport),
> + ntohl(cp->vaddr), ntohs(cp->vport),
> + ntohl(cp->daddr), ntohs(cp->dport),
> + ip_vs_state_name(cp->protocol, cp->state),
> + ip_vs_origin_name(cp->flags),
> + (cp->timer.expires-jiffies)/HZ);
> + }
> + return 0;
> +}
> +
> +static const struct seq_operations ip_vs_conn_sync_seq_ops = {
> + .start = ip_vs_conn_seq_start,
> + .next = ip_vs_conn_seq_next,
> + .stop = ip_vs_conn_seq_stop,
> + .show = ip_vs_conn_sync_seq_show,
> +};
> +
> +static int ip_vs_conn_sync_open(struct inode *inode, struct file *file)
> +{
> + return seq_open(file, &ip_vs_conn_sync_seq_ops);
> +}
> +
> +static const struct file_operations ip_vs_conn_sync_fops = {
> + .owner = THIS_MODULE,
> + .open = ip_vs_conn_sync_open,
> + .read = seq_read,
> + .llseek = seq_lseek,
> + .release = seq_release,
> +};
> +
> #endif
>
>
> @@ -942,6 +993,7 @@
> }
>
> proc_net_fops_create("ip_vs_conn", 0, &ip_vs_conn_fops);
> + proc_net_fops_create("ip_vs_conn_sync", 0, &ip_vs_conn_sync_fops);
>
> /* calculate the random value for connection hash */
> get_random_bytes(&ip_vs_conn_rnd, sizeof(ip_vs_conn_rnd));
> @@ -958,5 +1010,6 @@
> /* Release the empty cache */
> kmem_cache_destroy(ip_vs_conn_cachep);
> proc_net_remove("ip_vs_conn");
> + proc_net_remove("ip_vs_conn_sync");
> vfree(ip_vs_conn_tab);
> }
> diff -Naur linux-2.6.23.1_patch/net/ipv4/ipvs/ip_vs_sync.c
> linux-2.6.23.1_patch1/net/ipv4/ipvs/ip_vs_sync.c
> --- linux-2.6.23.1_patch/net/ipv4/ipvs/ip_vs_sync.c 2007-11-08
> 09:21:30.000000000 -0500
> +++ linux-2.6.23.1_patch1/net/ipv4/ipvs/ip_vs_sync.c 2007-11-09
> 05:01:01.000000000 -0500
> @@ -308,7 +308,7 @@
> unsigned flags;
>
> s = (struct ip_vs_sync_conn *)p;
> - flags = ntohs(s->flags);
> + flags = ntohs(s->flags) | IP_VS_CONN_F_SYNC;
> if (!(flags & IP_VS_CONN_F_TEMPLATE))
> cp = ip_vs_conn_in_get(s->protocol,
> s->caddr, s->cport,
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
LVS Development mailing list - lvs-devel@xxxxxxxxxxxxxxxxxxxxxx
Send requests to lvs-devel-request@xxxxxxxxxxxxxxxxxxxxxx
or go to http://lists.graemef.net/mailman/listinfo/lvs-devel
|