July 05, 2008

Excerpt from Chapter 12 Performance Planning and Management

BI Accelerator

The BI accelerator is a recent addition to SAP BI and has shown impressive performance gains, with typical improvement factors between 10 and 100. It is based on the TREX search engine for unstructured data and applies that technology to the structured data stored in the InfoCubes of an SAP BI system. Figure 12.14 shows the architecture of the BI accelerator in the context of the overall SAP BW architecture. The BI accelerator replaces the MOLAP aggregate option available in previous releases of SAP BW.

Figure 12.14 BI accelerator architecture

The BI accelerator achieves these performance gains by creating special types of additional indexes — sometimes also called BI accelerator indexes — and using these indexes as the basis for massively parallel query execution. Both indexing and query execution run on a separate, highly parallel blade server.

The underlying technology is not new; it is part of the TREX search engine, which is itself part of the SAP NetWeaver suite. SAP BW uses a special instance of the TREX search engine, enhanced to better support searches in the structured data of a business intelligence system. Creating a BI accelerator index involves three steps:

   1.  Vertical decomposition decomposes the data to be indexed by attribute (or database column) instead of by record (or table row), as is done in traditional database indexes. This approach is known from search engines for unstructured data and is now applied to the structured information stored in SAP BW.

   2.  Smart compression involves recoding the attribute values found in the indexed data to smaller integer values using a directory, which is generated on-the-fly. Typical reductions in the size of the indexed data reach a factor of 10-20.

   3.  Horizontal partitioning divides the generated index into multiple partitions in such a way that query execution using these indexes can be run in parallel without having to share data.
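The three steps above can be illustrated with a small sketch. It is not the TREX implementation; the function names (`decompose`, `dictionary_encode`, `build_index`) and the round-robin partitioning scheme are assumptions chosen only to make the idea concrete.

```python
def decompose(rows, columns):
    """Vertical decomposition: one value list per column instead of one dict per row."""
    return {col: [row[col] for row in rows] for col in columns}


def dictionary_encode(values):
    """Recode values as small integers using a directory built on the fly."""
    directory = {}
    codes = []
    for value in values:
        if value not in directory:
            directory[value] = len(directory)  # next free integer code
        codes.append(directory[value])
    return directory, codes


def build_index(rows, columns, n_parts):
    """Decompose, compress, then split the encoded columns into n_parts partitions.

    Round-robin partitioning is an assumption of this sketch; the point is only
    that each partition holds a disjoint set of rows and can be scanned alone.
    """
    encoded = {col: dictionary_encode(vals)
               for col, vals in decompose(rows, columns).items()}
    directories = {col: directory for col, (directory, _) in encoded.items()}
    partitions = [
        {col: codes[p::n_parts] for col, (_, codes) in encoded.items()}
        for p in range(n_parts)
    ]
    return directories, partitions


rows = [
    {"region": "EMEA", "product": "A"},
    {"region": "APJ", "product": "A"},
    {"region": "EMEA", "product": "B"},
    {"region": "EMEA", "product": "A"},
]
directories, partitions = build_index(rows, ["region", "product"], n_parts=2)
```

In this toy example the repeated strings are replaced by the integers 0 and 1, which is where the size reduction comes from, and each of the two partitions can be processed without reference to the other.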

BI accelerator indexes are not stored in a database, but are stored in flat files residing on the BI accelerator server. Query execution using the BI accelerator again involves up to three steps:

   1.  Load index to memory. When starting the execution of a query the system checks if the corresponding index is already (or still) available in the main memory of the BI accelerator server. If the index is not available, it is loaded into main memory. Alternatively, loading critical indexes can be triggered by external processes to ensure that these are instantly available at all times.

   2.  Aggregation. BI accelerator indexes also contain key figure values corresponding to the requested characteristic values. These key figure values are used for highly parallel on-the-fly aggregations.

   3.  Merge and return results. The final step before returning query results to the BI server is to retrieve and merge all subresults from the different parallelized query execution processes.
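The query execution steps above can be sketched in the same spirit. Again, this is only an illustration under stated assumptions: the in-memory cache `_memory`, the `read_from_disk` callback, and the thread pool standing in for the blade server's parallel processes are all hypothetical and not part of any SAP API.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

_memory = {}  # hypothetical stand-in for the accelerator's main memory


def load_index(name, read_from_disk):
    """Step 1: load the index only if it is not already held in memory."""
    if name not in _memory:
        _memory[name] = read_from_disk(name)
    return _memory[name]


def aggregate_partition(partition, group_by, key_figure):
    """Step 2: sum a key figure per characteristic value within one partition."""
    totals = Counter()
    for key, value in zip(partition[group_by], partition[key_figure]):
        totals[key] += value
    return totals


def run_query(name, read_from_disk, group_by, key_figure):
    partitions = load_index(name, read_from_disk)
    # Each partition is aggregated independently; no data is shared between tasks.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(
            lambda part: aggregate_partition(part, group_by, key_figure),
            partitions,
        ))
    # Step 3: merge the sub-results into one result set.
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds values per key
    return dict(merged)


disk_reads = []


def fake_read(name):
    """Hypothetical loader returning two partitions of column data."""
    disk_reads.append(name)
    return [
        {"region": ["EMEA", "APJ"], "revenue": [10, 5]},
        {"region": ["EMEA"], "revenue": [7]},
    ]


first = run_query("sales", fake_read, "region", "revenue")
second = run_query("sales", fake_read, "region", "revenue")
```

The second call finds the index already in memory, so `fake_read` is invoked only once. Preloading critical indexes, as mentioned in step 1, would amount to calling `load_index` ahead of the first query.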

Using the BI accelerator has several advantages over traditional performance optimization methods. First of all, using it does not require any changes to existing information models or queries. Considerations around logical partitioning, line item dimensions, or additional database indexes could even be ignored in the development of information models for use with the BI accelerator, although you should still aim for the best possible information model. As opposed to traditional aggregates, one BI accelerator index serves all queries of an InfoCube, regardless of the granularity of the result set or the actual filters used. Therefore, the manual effort for tuning aggregates is kept to an absolute minimum while the system still provides stable, predictable query response times. Because BI accelerator indexes are not stored in a database, there is also no need to optimize database queries or parameter sets.

On the other hand, hardware resource requirements (especially for main memory) are too demanding to allow for using the BI accelerator for all InfoCubes of the overall information model. In many applications, classic aggregates will be sufficient to provide good query performance at much lower hardware requirements. Typically the BI accelerator will be deployed in scenarios with very high data volumes (hundreds of millions, or even billions, of records in the fact tables), unpredictable query requirements that are hard to optimize for, or very high expectations (for example, with hard service level agreements) regarding average or maximum query run times. The BI accelerator does not optimize the analytic engine itself, nor does it help to cut down network transmission times. It is “only” useful for optimizing database-intensive query execution.


 

