Stefan Krompass, Harumi Kuno, Umeshwar Dayal, and Alfons Kemper (2007) addressed the problem of workload management for business intelligence (BI) queries. Their paper, titled “Dynamic Workload Management for Very Large Data Warehouses: Juggling Feathers and Bowling Balls,” was presented at the 33rd International Conference on Very Large Data Bases in Vienna, Austria. The present paper summarizes that research.
The article begins by establishing the problem statement of the study, which concerns the challenges that arise when managing the workload of a large data warehouse. Krompass et al. (2007) outlined a typical scenario in such a warehouse: a customer submits a BI query to a workload management system, and the elapsed response time alone does not reveal whether the query is still executing normally or has turned out to be problematic.
This scenario led to the formulation of three challenges that defined the scope of the study: deciding how long to wait before a long-running query should be killed, prioritizing between new queries and old ones, and prioritizing queries based on the authority and rights of the submitter. The study proposed an approach in which queries are scheduled and managed automatically. The approach is based on three components: workload categorization according to service level objectives (SLOs), scheduling that accommodates the uncertainty of BI query execution times, and an execution management component that identifies and controls problem queries.
For SLO categorization, the study relied on interviews with database practitioners; as a result, the proposed model included only objectives that are driven by deadlines. A flexible system was modeled in which the objectives differ according to the type of job, i.e., interactive jobs and batch jobs. The execution control component, on the other hand, was implemented to support fuzzy logic-based rules (Krompass et al. 1109).
Accordingly, a fuzzy controller was implemented. Its use was motivated by its ability to address such issues as the classification of queries, the capture of management logic, and the absence of complete knowledge about the state of the system. Using fuzzy logic rules and linguistic variables, the controller is able to address each of the identified issues of overload situations in BI workloads (Krompass et al. 1109). In that regard, the execution controller gathers certain metrics, which are processed by the fuzzy controller and according to which one of three actions can be executed: reprioritize, kill, or resubmit.
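The way fuzzy rules can map query metrics to one of the three actions may be illustrated with a minimal Python sketch. It is not the controller described by Krompass et al.; the input metrics (`overrun`, `progress`, `priority`), the membership breakpoints, the three rules, and the firing threshold are all hypothetical and were chosen only to show the general technique of linguistic variables combined with a fuzzy AND (minimum).

```python
from dataclasses import dataclass


def ramp_up(x: float, lo: float, hi: float) -> float:
    """Membership degree that rises linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


@dataclass
class QueryMetrics:
    # All three metrics are hypothetical inputs, not taken from the paper.
    overrun: float   # elapsed time divided by estimated time
    progress: float  # fraction of work completed, 0.0 to 1.0
    priority: float  # 0.0 (low) to 1.0 (high)


def decide(m: QueryMetrics, threshold: float = 0.5) -> str:
    """Return 'kill', 'resubmit', 'reprioritize', or 'none'."""
    # Linguistic variables with assumed breakpoints.
    long_running = ramp_up(m.overrun, 1.0, 3.0)
    high_progress = ramp_up(m.progress, 0.2, 0.8)
    low_progress = 1.0 - high_progress
    high_priority = ramp_up(m.priority, 0.3, 0.7)
    low_priority = 1.0 - high_priority

    # Illustrative rules; fuzzy AND is modeled as the minimum.
    strengths = {
        "kill": min(long_running, low_progress, low_priority),
        "resubmit": min(long_running, low_progress, high_priority),
        "reprioritize": min(long_running, high_progress),
    }
    action, strength = max(strengths.items(), key=lambda kv: kv[1])
    # Act only when the strongest rule fires clearly enough.
    return action if strength >= threshold else "none"
```

Under these assumed rules, a low-priority query that has far exceeded its estimate with little progress would be killed, while a query that is still within its estimated time triggers no action. Raising or lowering `threshold` makes the sketch less or more aggressive.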
The study described an experiment in which the proposed system was tested for the impact of unexpected query behavior and for its ability to react to such behavior. The experiment showed that the execution controller triggered 119 actions for 75 problem queries without producing any false positives. To assess the impact of a more aggressive configuration, in which more queries are falsely identified as problematic, the system was recalibrated; the results showed that the controller might then invoke superfluous control actions against correct queries.
In such an aggressive mode, the penalties incurred by the controller outweighed the benefits of canceling actual problem queries (Krompass et al. 1114). Accordingly, the experiment highlighted the need for a progress indicator that might reduce the number of spontaneous actions (Krompass et al. 1114).
The study concluded that problem queries might significantly impact the performance of a database system. The execution controller modeled in the study was able to identify problem queries without an excessive number of false negatives or false positives (Krompass et al. 1114). The study also pointed out the need for qualitatively investigating the impact of problem queries on database systems and for conducting a larger number of experiments.
Works Cited
Krompass, Stefan, et al. “Dynamic Workload Management for Very Large Data Warehouses: Juggling Feathers and Bowling Balls.” Proceedings of the 33rd International Conference on Very Large Data Bases. VLDB Endowment, 2007. Print.