Currently we have an LVS load balancer running on an AMD Sempron 64 2800
with 1 GB of RAM and an Intel PRO/1000 MT dual-port Gigabit NIC. LVS
is configured in NAT mode, with 8 web servers behind it. During peak it
pushes up to 300 Mbit/s of outgoing traffic. Until recently we have not
had any problems with throughput.
Lately, whenever traffic goes above 300 Mbit/s, packet loss starts, and
sometimes we see the following message in the logs:
e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
Tx Queue <0>
TDH <a>
TDT <a>
next_to_use <a>
next_to_clean <1e>
buffer_info[next_to_clean]
time_stamp <238c523a>
next_to_watch <1e>
jiffies <238c6768>
next_to_watch.status <0>
NETDEV WATCHDOG: eth1: transmit timed out
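To put a number on the dump above, the gap between time_stamp and jiffies can be decoded with a short script. This is only a rough sketch: the helper names are made up for illustration, the tick rate (HZ) of this kernel is not stated anywhere in the log, so several common values are tried, and the 32-bit wraparound mask is an assumption based on the width of the printed values.

```python
def parse_hex_field(line):
    """Return the integer inside <...> on a dump line such as 'jiffies <238c6768>'."""
    start = line.index("<") + 1
    end = line.index(">", start)
    return int(line[start:end], 16)


def stall_in_jiffies(time_stamp_line, jiffies_line):
    """Ticks elapsed between the buffer's time_stamp and the current jiffies value."""
    delta = parse_hex_field(jiffies_line) - parse_hex_field(time_stamp_line)
    # Assumption: treat the printed values as 32-bit counters that may wrap.
    return delta & 0xFFFFFFFF


ticks = stall_in_jiffies("time_stamp <238c523a>", "jiffies <238c6768>")
print("stalled for", ticks, "jiffies")

# HZ is not known for this machine; these are common settings, not a measurement.
for hz in (100, 250, 1000):
    print("HZ=%d: about %.1f seconds" % (hz, ticks / hz))
```

Whichever HZ applies, the descriptor at next_to_clean had been waiting for several seconds when the driver reported the hang.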
Does it look like it's time for a hardware upgrade? Or could
something else be causing this problem? If a hardware upgrade is
recommended, what sort of hardware should we be using?
Any help is greatly appreciated.
Thanks!
Aaron