Creating a High Load Ads Delivery System Based on OpenX

OpenX is an open source ad server for managing web-based campaigns and delivering ads to consumers. At present it is effectively the only open source ad solution that you can install on your own server and adapt to your needs. We reviewed and briefly demonstrated this product earlier; see Ads delivery on Openx.

In this article, we have tried to bring together the information available online on how to get the maximum performance out of an OpenX ad server.

We faced this task while building a system that had to deliver ads reliably at a peak of one million concurrent users. Before starting the design, we reduced this customer benchmark to more measurable indicators: the number of requests handled per second, or the average response time at a given number of concurrent connections. These figures are derived from estimated visit counts for the different pages and the types of advertising shown on them. Such estimates are rough, but at least they can be compared with load test results.
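As a rough illustration of that conversion, the sketch below turns a concurrent-user figure into a requests-per-second target. The page-view interval, the number of zones per page and the two requests per zone are hypothetical placeholders, not measurements from our project; substitute estimates from your own traffic statistics.

```python
def target_requests_per_second(concurrent_users, seconds_between_page_views,
                               ad_zones_per_page, requests_per_zone=2):
    """Estimate delivery requests per second at peak.

    requests_per_zone=2 is an assumption: one ad request plus one
    impression-logging request per zone; static images are left out.
    """
    page_views_per_second = concurrent_users / seconds_between_page_views
    return page_views_per_second * ad_zones_per_page * requests_per_zone


# Hypothetical inputs: 1,000,000 users, one page view per 60 s, 3 zones per page.
peak_rps = target_requests_per_second(1_000_000, 60, 3)
print(f"Peak load estimate: {peak_rps:.0f} requests/s")
```

The point of the exercise is not the exact number but having a figure that can be compared directly with what the load tests report.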

Measuring Performance

Before you start optimizing OpenX, and again at each iteration of refinement, it is useful to record system performance so that you can see the effect of your changes. For this purpose, OpenX developers recommend Apache JMeter, which supports fairly complex test scenarios and gives detailed statistics on each request within a test. The OpenX developer documentation offers a set of ready-made scripts for JMeter. Before testing, try to exclude every factor that affects the results but is not directly related to OpenX itself, such as the performance and utilization of the testing PC, or the utilization of the links between the OpenX servers and the test computer. The parameters you vary are the number of simultaneous connections and the total number of requests per test. To measure maximum system throughput, set the number of simultaneous connections equal to or slightly higher than the number of cores handling the requests. To see performance under real load, set the number of connections equal to the design capacity. It is important that the percentage of failed requests stays at or near 0.

We also recommend that, in addition to JMeter itself (version 2.4 was the latest at the time of writing), you download a set of JMeter plug-ins. For instance, Ultimate Thread Group lets you flexibly configure how the number of parallel connections changes during a test, while Response Times vs Threads and Transaction Throughput vs Threads present graphs based on the test results.
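If you only need the headline numbers (throughput, average response time, failure rate), they can be computed from a saved results file. The following is a minimal sketch, assuming the results were saved as CSV with a header row that contains timeStamp, elapsed and success columns; JMeter's CSV output normally uses these names, but check your own configuration. The sample data is invented.

```python
import csv
import io


def summarize_jtl(csv_file):
    """Return (requests_per_second, mean_elapsed_ms, failed_percent)."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        return 0.0, 0.0, 0.0
    starts = [int(r["timeStamp"]) for r in rows]   # request start, ms
    elapsed = [int(r["elapsed"]) for r in rows]    # response time, ms
    failed = sum(1 for r in rows if r["success"].lower() != "true")
    # Test duration: from the first request start to the last request end.
    ends = [s + e for s, e in zip(starts, elapsed)]
    duration_s = max((max(ends) - min(starts)) / 1000.0, 0.001)
    return (len(rows) / duration_s,
            sum(elapsed) / len(rows),
            100.0 * failed / len(rows))


# Invented sample: three requests, one of them failed.
sample = io.StringIO(
    "timeStamp,elapsed,label,responseCode,success\n"
    "1000,40,ad request,200,true\n"
    "1500,60,ad request,200,true\n"
    "1900,100,ad request,500,false\n"
)
rps, mean_ms, failed_pct = summarize_jtl(sample)
print(f"{rps:.1f} req/s, {mean_ms:.1f} ms average, {failed_pct:.1f}% failed")
```

Running the same summary after every configuration change gives a consistent basis for comparing iterations.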

If you are testing a system that is already running, it is useful to analyze the server load graphs produced by server monitoring systems such as Zabbix or Cacti. OpenX developers recommend Ganglia.

For a basic OpenX installation on an average server, different sources claim performance of anywhere from 30 to 100 requests per second, depending on settings. Based on this, you can estimate how many servers, and how many cores on each, you need to handle your requests. The spread between 30 and 100 is wide, though, so your actual figure will depend on the optimizations you apply.
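To show how much that spread matters for sizing, here is a small sketch of the arithmetic. The target of 2,000 requests per second and the 30% headroom are hypothetical values chosen for illustration only.

```python
import math


def servers_needed(target_rps, rps_per_server, headroom=0.3):
    """Number of delivery servers, keeping `headroom` spare capacity each."""
    usable_rps = rps_per_server * (1.0 - headroom)
    return math.ceil(target_rps / usable_rps)


# Hypothetical target of 2,000 requests/s at both ends of the quoted range.
for per_server in (30, 100):
    count = servers_needed(2000, per_server)
    print(f"{per_server} req/s per server -> {count} servers")
```

The more than threefold difference between the two results is the reason to measure your own installation rather than rely on published figures.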

Server Architecture

First of all, let’s consider the server architecture of our system. Fundamentally, OpenX performance is limited by the need to run PHP scripts and to log impression data. You should therefore design for horizontal scalability from the start, so that you can add servers easily to improve performance. OpenX supports this natively, helped by its open architecture and its use of widely adopted open source technologies. Moreover, it divides easily into three main functional parts: web-based advertising and statistics management, the ad engine (which decides what ads to deliver and logs the data), and the engine that processes statistics and prioritizes campaigns.

There are various options to install OpenX components on different servers. The developers propose distributed installation schematized below:

Here, the campaign management interface and the processing of raw statistics run on a separate server, while delivery is delegated to a set of servers that take on the main load in the system. Each delivery node has its own MySQL server configured to replicate data from the master database. In this case, the raw data logs are stored locally and periodically compiled by the statistics script. This script runs on the master server and saves statistics to the main database server. This way you can reduce the load on the main database server and distribute requests across delivery scripts running on any number of servers. For more detail on this option, please refer to the OpenX documentation and other resources [1, 2, 3].

We slightly modified this scheme, taking into account our existing hardware and the load test results. All incoming requests are handled by the frontend, which distributes delivery requests among the PHP backends and serves the administration interface itself. The statistics and prioritization engine and the Memcached server used to cache database information also run on this server. All these tasks require relatively little computing capacity. The main load is shared by the PHP backend servers. The database is replicated from the master server to two slave servers, to optimize performance and ensure data integrity. In this way we have improved on the previous scheme by moving the database server functions off the PHP backends. Now we can scale the different parts of the system (database servers, PHP backend servers) independently of each other.

A few words about how the server software is configured. All servers run Linux. The frontend runs Nginx, a high-performance HTTP server, which serves advertising graphics and other static data and proxies requests that execute PHP scripts. The backends run spawn-fcgi + PHP 5.1. PHP 5.3 brings several performance improvements, in particular integration of the php-fpm module, but OpenX does not yet fully support it.

When editing the software config files, we should eliminate all factors that may limit the number of parallel connections: the maximum number of connections and the number of running processes in Nginx, Linux limitations for the number of open connections and files (do not forget to set them on the load testing PC), Memcached and MySQL limits on the number of simultaneous connections, etc.

Here is the summary of key recommendations for the OS configuration, taken from various sources [1, 2, 3]:

  • Disable all unnecessary services and cron scripts.
  • For file systems, set the noatime option in /etc/fstab, to avoid updating last access time for each file.
  • In the /etc/sysctl.conf file, set the following parameters:
    <pre>
    # To automatically terminate incorrectly closed connections
    # in 10 seconds
    net.ipv4.tcp_fin_timeout = 10
    # To increase the number of available local ports for outgoing
    # connections
    net.ipv4.ip_local_port_range = 16384 65535
    </pre>
  • To raise the limit on the number of open files, specify the following in /etc/security/limits.conf:
    <pre>
    * soft nofile 65000
    </pre>

For spawn-fcgi, we recommend setting the number of threads slightly above the number of available cores. The number of PHP processes can be left at 1, although you can raise it slightly to improve reliability. When adjusting these settings, watch the load average of the fully loaded system: it should be approximately equal to the number of available cores.
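The load average check can be automated while a load test is running. Below is a minimal sketch; the 25% tolerance and the three status labels are our own illustrative choices, not values taken from any of the cited sources.

```python
import os


def load_status(load_average, cores, tolerance=0.25):
    """Classify a load average relative to the number of cores."""
    ratio = load_average / cores
    if ratio > 1.0 + tolerance:
        return "overloaded"   # more runnable processes than cores
    if ratio < 1.0 - tolerance:
        return "underused"    # under full load: worker count may be too low
    return "balanced"


try:
    one_minute = os.getloadavg()[0]   # Unix-like systems only
except (AttributeError, OSError):
    one_minute = None

if one_minute is not None:
    cores = os.cpu_count() or 1
    print(f"load {one_minute:.2f} on {cores} cores: "
          f"{load_status(one_minute, cores)}")
```

The result is only meaningful while the backend is under full test load; on an idle machine it will always report spare capacity.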

When configuring the database, enable replication and set the maximum number of connections. Be sure to enable persistent MySQL connections in the OpenX config file: in the [database] section, set persistent = 1. Other MySQL options (based on 1, 2, 3) are:

<pre>
[mysqld]
; the maximum number of connections
max_connections = 2048
</pre>

In the OpenX config file, increase the statistics sort buffers:

<pre>
[databaseMysql]
statisticsSortBufferSize = 536868864
</pre>

By default the tables are stored in MyISAM, as it is faster. The problem with MyISAM is that the whole table is locked on every write. Usually this does not affect performance, but on highly loaded systems the effect may be substantial, especially while the statistics script is running, so you have to monitor the script to make sure it does not run too long. For OpenX installations with a dedicated high-performance database server, InnoDB is advisable: although access to InnoDB tables is generally slower, they use row-level rather than table-level locking. In any case, you should run load tests, monitor the actual load for each option, and choose the one that suits you better.

The following settings were made for Nginx:

<pre>
# Number of worker processes is equal to the number of cores
worker_processes 16;
# Limitation on files open for each worker
worker_rlimit_nofile 8192;
# Specify the maximum number of connections
events {
    worker_connections 65000;
    use epoll;
}
# Enable reply compression to speed up
# their output and quickly release the connections
gzip on;
gzip_min_length 1100;
gzip_buffers 64 8k;
gzip_comp_level 3;
gzip_http_version 1.1;
gzip_proxied any;
gzip_types text/plain application/xml application/x-javascript
 text/css;
# Disable keep-alive connections
keepalive_timeout 0;
</pre>

Optimizing Delivery

To see where further optimization is possible, consider the main factors that affect system performance. To do this, you need to understand the basic request pattern. When a visitor opens a page with a banner, the following requests are made:

  1. Request of promotional materials for each zone or for all zones.
  2. Query of advertising graphics (static image). We are not factoring it in, as it makes a minor load on the servers and is easily processed by Nginx.
  3. Impression logging.
  4. Click logging.

At stage 1, a read request is sent to the database to determine the set of available banners and to select the banner to show, based on priorities and campaign settings. This can be cached, but the cache must be refreshed periodically to keep track of priorities. At stages 3 and 4, a counter is incremented in the raw statistics table with an INSERT ... ON DUPLICATE KEY UPDATE query. This cannot be cached, of course, but certain techniques still apply.
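The counter-increment pattern used for logging can be tried out without a MySQL server. The sketch below uses SQLite's INSERT ... ON CONFLICT ... DO UPDATE (available in SQLite 3.24 and later) as a stand-in for MySQL's INSERT ... ON DUPLICATE KEY UPDATE. The table and column names are hypothetical and do not reproduce the actual OpenX schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE raw_impressions (          -- hypothetical table name
        interval_start TEXT NOT NULL,
        banner_id      INTEGER NOT NULL,
        zone_id        INTEGER NOT NULL,
        impressions    INTEGER NOT NULL,
        PRIMARY KEY (interval_start, banner_id, zone_id)
    )
""")


def log_impression(interval_start, banner_id, zone_id):
    # MySQL would express the same idea as:
    #   INSERT ... ON DUPLICATE KEY UPDATE impressions = impressions + 1
    conn.execute(
        """
        INSERT INTO raw_impressions
            (interval_start, banner_id, zone_id, impressions)
        VALUES (?, ?, ?, 1)
        ON CONFLICT (interval_start, banner_id, zone_id)
        DO UPDATE SET impressions = impressions + 1
        """,
        (interval_start, banner_id, zone_id),
    )


# Three impressions of banner 7 and one of banner 8 in the same interval.
for _ in range(3):
    log_impression("2010-11-01 12:00:00", 7, 2)
log_impression("2010-11-01 12:00:00", 8, 2)

rows = conn.execute(
    "SELECT banner_id, impressions FROM raw_impressions ORDER BY banner_id"
).fetchall()
print(rows)
```

Every impression is a write to the same small set of rows, which is why write contention, rather than data volume, becomes the limiting factor.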

Here are the limiting factors:

  • Need to read from the database
  • Need to write to the database
  • PHP script performance.

Caching of read queries

To optimize database reads, cache all information requested from the database. OpenX developers have taken care of this by building in a decent caching system that is easily extended with simple plugins. The basic distribution includes a file cache plugin and a Memcached plugin. In the config file you can specify a sequence of plugins, so that if one fails, the system automatically fails over to the next. We therefore recommend enabling caching right away, and remembering to increase the maximum number of connections to the Memcached server.

<pre>
[delivery]
cacheExpire = 600
cacheStorePlugin = "deliveryCacheStore:oxMemcached:oxMemcached"
[oxMemcached]
memcachedServers = "10.2.35.2:11211"
memcachedExpireTime = 600
</pre>
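The failover behaviour described above can be pictured with the following sketch. It is not OpenX code: the DictStore class and the cached_fetch function are hypothetical stand-ins that only illustrate trying cache backends in order, honouring an expiry time (600 seconds, as in the config above) and falling back to the database.

```python
import time


class DictStore:
    """In-memory stand-in for a cache backend (Memcached, files, ...)."""

    def __init__(self, available=True):
        self.available = available
        self.data = {}

    def get(self, key):
        if not self.available:
            raise ConnectionError("cache backend unavailable")
        return self.data.get(key)

    def set(self, key, value):
        if not self.available:
            raise ConnectionError("cache backend unavailable")
        self.data[key] = value


def cached_fetch(key, stores, load_from_db, expire_seconds=600, now=time.time):
    """Try each store in order; query the database on a miss or expiry."""
    for store in stores:
        try:
            entry = store.get(key)
            if entry is not None and now() - entry[0] < expire_seconds:
                return entry[1]              # fresh cache hit
            value = load_from_db(key)        # miss or expired entry
            store.set(key, (now(), value))
            return value
        except ConnectionError:
            continue                         # fail over to the next store
    return load_from_db(key)                 # no cache backend reachable


db_calls = []


def load_zone(key):
    db_calls.append(key)                     # count real database queries
    return f"banners for {key}"


stores = [DictStore(available=False), DictStore()]   # first backend is down
first = cached_fetch("zone:1", stores, load_zone)
second = cached_fetch("zone:1", stores, load_zone)
print(first, "| database queries:", len(db_calls))
```

In this run the first backend is unreachable, so both calls are served through the second one and the database is queried only once.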

Once everything that can be cached is cached, requests to the cache themselves may take a substantial share of query processing time (about 20%). In view of this, it is recommended to configure the Memcached server to use the binary protocol and UDP.

Reducing the number of database queries

This may not seem relevant once everything is cached, but some requests still bypass the cache, and individual requests often carry other overheads even when cached. For example, if you write your own OpenX extensions that add new attributes to banners or other system entities, then from a development and design perspective it is more convenient to store each value in a separate table linked to the entity by ID. The value can then be retrieved by a dedicated query issued by a separate plugin. However, such architectural niceties may hurt performance, so the fastest option is to change the OpenX source code directly, add columns straight to the base tables, and so on. Another way to reduce the number of queries is to disable unnecessary delivery plugins.

Optimizing database writes

The distributed statistics architecture discussed earlier can partially overcome the slowdown caused by writing delivery logs. Another novel approach is proposed in the article http://hi-load.php.com.ua/topic/31/. Instead of writing statistics to the database, the authors save them to a file on disk, which is most likely less expensive. To do this, you change the source code of the system, namely the logging function OA_Dal_Delivery_logAction located in lib/OA/Dal/Delivery/mysql.php. This function is called by various delivery plugins.

Moreover, the authors managed to avoid opening a file for writing on each request by passing the logged data directly to the web server log, using the apache_setenv PHP function and special Apache settings. The logs are then periodically imported into the database with a simple LOAD DATA INFILE SQL query. The obvious downside of this approach is its dependency on the Apache web server. The principle could be developed further by abandoning storage of raw data in the database entirely. It would be interesting to experiment with a statistics script that stores and reads the data in a key-value store or one of the NoSQL servers that are currently gaining momentum.
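The general idea of logging to a file and aggregating in batches can be sketched as follows. This is an illustration only, not the cited article's implementation: the tab-separated record layout and the function names are hypothetical, and for simplicity the sketch opens the file on every call, which is exactly the cost the authors avoided by writing through the web server log.

```python
import collections
import os
import tempfile


def log_action(log_path, action, banner_id, zone_id):
    """Append one tab-separated record; no database round trip."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{action}\t{banner_id}\t{zone_id}\n")


def aggregate(log_path):
    """Periodic batch step: count records per (action, banner, zone)."""
    counts = collections.Counter()
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            action, banner_id, zone_id = line.rstrip("\n").split("\t")
            counts[(action, int(banner_id), int(zone_id))] += 1
    return counts


log_path = os.path.join(tempfile.mkdtemp(), "delivery.log")
for _ in range(3):
    log_action(log_path, "impression", 7, 2)
log_action(log_path, "click", 7, 2)

totals = aggregate(log_path)
print(totals[("impression", 7, 2)], "impressions,",
      totals[("click", 7, 2)], "clicks")
```

The aggregated totals are what a periodic job would then write to the database in a single bulk operation, instead of one write per impression.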

Improving PHP performance

After all of the above optimizations, the main (and quite tough) limitation is PHP script performance. To improve it, first install one of the PHP accelerators, which cache compiled PHP code. They all offer roughly the same gain, speeding up PHP several times over. We used eAccelerator. To configure it, set the following in your php.ini:

<pre>
; Shared memory cache size, MB
eaccelerator.shm_size = 512
; Only use cache in shared memory
eaccelerator.shm_only = 1
; Do not use compression for cached code
eaccelerator.compress = 0
</pre>

Developers have made a great effort to optimize the delivery scripts code. The delivery code is mostly independent from the remaining code and is well-optimized. It uses standard libraries for database access and does not use classes or object-oriented programming. Also, it minimizes the number of include and require directives. All the costly operations are saved in an external cache or in global variables, and run no more than once during script execution. The standard OpenX package has a special script to compile all includes in delivery scripts into a single file. All these rules should be considered by developers modifying the OpenX code or writing their own plugins. To identify bottlenecks in the code, use the execution time data obtained under real load using the built-in profiler of the XDebug extension for PHP. The profiler output files can be viewed using the WinCacheGrind utility.

Expert PHP developers might use real FastCGI (1, 2) and PHPDaemon to accelerate the delivery scripts. The benefit of this approach is a 30% reduction in execution time, because the initialization code runs only once, when the daemon is launched. The downsides are the potential need for significant changes to the delivery scripts, intricate memory leaks in PHP (if the PHP process has to be restarted periodically, the benefits of real FastCGI are lost), and the need to use PHP 5.3, which features a much improved garbage collector.

If you prefer not to tamper with the code, you should still disable all unnecessary plugins in the OpenX configuration file. You can get a decent effect by disabling the plugins used to log additional information ([deliveryHooks] section) and the deliveryLimitations rule groups of the Client plugin ([pluginGroupComponents] section). Initializing the browser restrictions takes a lot of time, because the external phpSniff library is used (incidentally, it runs even if you do not use these restrictions). You can also completely disable logging of various delivery aspects in the [logging] section.

Another approach focuses on reducing the number of requests needed to display the banners on a page (1, 2), which also improves page loading speed. In this case, a single request returns the ad code for all banners shown on the page, rather than one request per zone.

By default, OpenX runs the statistics script during execution of the delivery script. This is convenient for small systems where the administrator does not have cron access, and it simplifies installation. On a heavily loaded system, however, it is unacceptable, as banner delivery could degrade. The statistics script should be run by cron, preferably on a separate machine (1, 2). To enable this, specify the following parameter in the OpenX config file:

<pre>
[maintenance]
autoMaintenance = 0
</pre>

Then, add the following line to crontab:

<pre>
0 * * * * /path/to/php /path/to/openx/scripts/maintenance/maintenance.php www.mydomain.com
</pre>

Conclusion

We hope that this publication will help you optimize your OpenX server. In addition to the links provided in the text, we also used information from other OpenX resources.