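Following Open Source Technology (Harbin, Heilongjiang): advocating enterprise-grade open source applications and exploring standards for IT solutions; integrating the many new achievements of open source and dispelling worries about open source solutions; promoting the low-cost open source bandwagon and welcoming visitors from all quarters to talk about open source; spreading open source knowledge to build momentum and welcoming open source companies to realize their ambitions.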

HAProxy

2008-03-14 14:16:23 / Category: Open Source Application TOOLS

   HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer 7 processing. Supporting tens of thousands of connections is clearly realistic with today's hardware. Its mode of operation makes its integration into existing architectures very easy and low-risk, while still offering the possibility of not exposing fragile web servers to the Net.

Currently, two major versions are supported :

  • version 1.1 - has kept critical sites online since 2002
    The most stable and reliable, has reached years of uptime. Receives no new features, dedicated to mission-critical usages only.
  • version 1.2 - opening the way to very high traffic sites
    The same as 1.1 with some new features such as poll/epoll support for very large numbers of sessions, IPv6 on the client side, application cookies, hot-reconfiguration, advanced dynamic load regulation, TCP keepalive, source hash, weighted load balancing, rbtree-based scheduler, and a nice Web status page. This code is still evolving but has significantly stabilized since 1.2.8.
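
Two of the 1.2 features named above, weighted load balancing and source hash, can be illustrated with a short Python sketch. This is not HAProxy's implementation: the server names, the weights, the naive slot expansion and the use of CRC32 are all assumptions made for the example.

```python
import zlib
from itertools import cycle

# Hypothetical backends and weights, chosen only for illustration.
SERVERS = {"web1": 3, "web2": 1}


def expand_slots(servers):
    """Repeat each server name as many times as its weight."""
    return [name for name, weight in servers.items() for _ in range(weight)]


def weighted_round_robin(servers):
    """Return an endless iterator that visits servers in proportion to weight."""
    return cycle(expand_slots(servers))


def source_hash(client_ip, servers):
    """Map a client IP to a server so the same client keeps the same server."""
    slots = expand_slots(servers)
    # CRC32 is stable across runs, unlike Python's built-in hash() for strings.
    return slots[zlib.crc32(client_ip.encode("ascii")) % len(slots)]


rr = weighted_round_robin(SERVERS)
first_cycle = [next(rr) for _ in range(4)]
print(first_cycle)  # ['web1', 'web1', 'web1', 'web2']
print(source_hash("192.0.2.10", SERVERS))
```

With these weights, web1 receives three of every four round-robin requests, while the source hash always returns the same server for a given client address as long as the server list does not change.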

Additionally, a third version 1.3 is under active development. New features include :

  • Content Switching : provides the ability to select a group of servers based on any part of the request such as the URI, the Host field, cookies, or anything else. There is a growing demand for this feature from large sites which separate dynamic and static content.
  • Full Transparent Proxy : it is possible to connect to the server with the client's IP address or even any other IP address. This is possible only on Linux 2.4/2.6 with the cttproxy patch. This feature also makes it possible to transparently handle part of the traffic for a particular server without changing any server's address.
  • New faster tree-based scheduler : versions up to 1.2.16 required that all timeouts were set to the same value to support tens of thousands of connections at full speed. With this new scheduler, this is no longer the case. I have backported it to 1.2.17.
  • Kernel TCP splicing : avoiding kernel-to-user then user-to-kernel data copies improves bandwidth and lowers CPU usage. HAProxy 1.3 supports Linux L7SW in order to achieve multi-gigabit performance on commodity hardware.
  • Connection Tarpitting : since the cost of keeping a connection open is low, it is sometimes desirable to "tarpit" attack bots, which means keeping their connections open to limit their capacity. This has been developed for a site crawling under a small DDoS with easily identifiable requests from a few thousand zombies.
  • Finer Header Processing : will make it easier to write header-based rules and to process parts of the URI.
  • Very Fast and reliable Header Parsing : full parsing and indexing of an average request typically takes less than 2 microseconds with fully RFC2616-compliant integrity checks.
  • Modular Design : allows more people to contribute to the project and makes it easier to debug. The pollers have been split, already making their development a lot easier. Other subsystems will be modularized soon.
  • Speculative I/O processing : try to access data on a socket before being notified about its readiness. The poller just speculates about what should be available and what should not, tries to guess, and if it wins, several expensive syscalls are saved. If it loses, those syscalls will have to be called anyway. A net overall gain of about 10% has been observed using Linux epoll().
  • ACLs : use any combination of criteria as a condition for any action.
  • More load balancing algorithms : right now, Weighted Round Robin, Weighted Source Hash and Weighted URL Hash are implemented. Weighted Least Conns is pending. Other algorithms may come later such as Weighted Measured Response Time.
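
The content switching and ACL items in the list above can be pictured with a small Python sketch: named conditions on a request are combined into rules that select a server group. It only models the idea and is not HAProxy's configuration syntax or code; the `Request` class, the ACL names, the hostnames and the group names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    host: str  # value of the Host header
    path: str  # path part of the URI


# An ACL in this sketch is a named predicate over a request.
ACLS = {
    "is_static_host": lambda r: r.host == "static.example.com",
    "is_image_path": lambda r: r.path.startswith("/img/"),
}

# Each rule lists the ACLs that must all match, then the server group to use.
# Rules are checked in order and the first match wins.
RULES = [
    (["is_static_host"], "static_servers"),
    (["is_image_path"], "static_servers"),
]
DEFAULT_GROUP = "dynamic_servers"


def choose_group(req):
    """Return the server group selected by the first rule whose ACLs all match."""
    for acl_names, group in RULES:
        if all(ACLS[name](req) for name in acl_names):
            return group
    return DEFAULT_GROUP


print(choose_group(Request("www.example.com", "/img/logo.png")))  # static_servers
print(choose_group(Request("www.example.com", "/cart")))          # dynamic_servers
```

Keeping the conditions separate from the rules is what lets one condition be reused across several actions, which is the point of the ACL item.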

Unlike other free "cheap" load-balancing solutions, this product is only used by a few hundred people around the world, but those people run very big sites serving several million hits and from several tens of gigabytes to several terabytes per day to hundreds of thousands of clients. They need 24x7 availability and have the in-house skills to take on the risk of maintaining a free software solution. Often, the solution is deployed for internal use and I only know about it when they send me some positive feedback or when they ask for a missing feature ;-)

http://haproxy.1wt.eu/

TAG: Open Source Application TOOLS

开源大讲堂_黄富强 黄富强, posted 2008-03-15 19:30:57
The layout is off and the post can no longer be edited; come to the blog instead.