SQL Problems Requiring Cursors

SQL Problems Requiring Cursors

A cursor is a construct available in most implementations of SQL that allows the programmer to handle data in a row-by-row manner.

The trouble with cursor processing is that the row-by-row nature cannot easily be made to occur in parallel by database optimizers. This results in a failure of the cursor processing to take full advantage of the processing power available on most database platforms.

To overcome this problem, cursor logic can often be converted into set-based SQL queries. Set-based SQL uses the syntax "SELECT...FROM..." with various clauses to perform the data manipulation work. Database optimizers have little trouble dividing such queries into parallel threads, thus fully utilizing the database hardware.

The resulting performance gain can be literal orders of magnitude: hours become minutes, and minutes become seconds.

Despite the tremendous performce gains from set-based SQL, it can be challenging to find the SQL that replaces fairly straigh-forward cursors.

This article discusses examples of such conversions.

Constraints

In this article, the following constraints apply:
* The term "cursor" includes all cursor-like behavior. For example, using a loop in a shell script that loops across SQL queries or the output of SQL queries is cursor-like behavior and does not achieve the goal of true set-based processing within the database.
* All set-based SQL must be ANSI SQL. A number of vendors provide some extremely powerful proprietary extensions. The goal is to avoid such extensions in favor of ANSI SQL.
* The solution must be generalizable. In one or more examples below, specific values may be used for demonstration purposes, but any solution must scale to any number feasible within the power of the database software and machine resources.

Example: Insert rows based on a count in the table itself

Problem

The table below represents marbles. The four text columns represent four characteristics of marbles. Each characteristic has two values for a total of 16 types of marbles.

The "quantity" column represents how many of that marble we have. But instead of having a quantity, we want to have one row for each marble.

Thus, the target table would have the four text columns, and a total of 40 + 20 + 20 + 10 + ... + 10 + 5 = 270 rows.

Source table:

QUANTITY TEXTURE APPEARANCE SHAPE COLOR---------- ---------- ---------- ---------- ----- 40 smooth shiny round blue 20 smooth shiny warped blue 20 smooth dull round blue 10 smooth dull warped blue 20 rough shiny round blue 10 rough shiny warped blue 10 rough dull round blue 5 rough dull warped blue 40 rough dull warped red 20 rough dull round red 20 rough shiny warped red 10 rough shiny round red 20 smooth dull warped red 10 smooth dull round red 10 smooth shiny warped red 5 smooth shiny round red

Table to generate:

TEXTURE APPEARANCE SHAPE COLOR---------- ---------- ---------- -----smooth shiny round blue -- 1smooth shiny round blue -- 2... -- and so onsmooth shiny round blue -- 40smooth shiny warped blue -- 1smooth shiny warped blue -- 2... -- and so onsmooth shiny warped blue -- 20... -- and so onsmooth shiny round red -- 1smooth shiny round red -- 2smooth shiny round red -- 3smooth shiny round red -- 4smooth shiny round red -- 5

Solution in cursor form

Generating the target table with a cursor is fairly simple. There are a number of lines of code, but the logic is little more than a loop within a loop.

declare cursor c is select * from marbles_seed;begin for r in c loop for i in 1..r.quantity loop insert into marbles values ( r.texture, r.appearance, r.shape, r.color_actual, r.hits, r.color_predicted, r.accuracy, r.coverage ); end loop; end loop;end;

Solution in ANSI set-based SQL

Solving the problem with ANSI SQL is a bit more code, but requires a bit more creative thought than the nested loop approach of cursors.

Number Table

To reach the solution requires an intermediate table. The table has one column of type NUMBER that has the values 0 to whatever number of rows is needed. For this discussion, we'll limit it to one million rows. The code is as follows:

create table numbers_seed ( n number(1) );create table numbers ( n number(7));insert into numbers_seed values ( 0 );insert into numbers_seed values ( 1 );insert into numbers_seed values ( 2 );insert into numbers_seed values ( 3 );insert into numbers_seed values ( 4 );insert into numbers_seed values ( 5 );insert into numbers_seed values ( 6 );insert into numbers_seed values ( 7 );insert into numbers_seed values ( 8 );insert into numbers_seed values ( 9 );insert into numbersselect n6.n * 100000 + n5.n * 10000 + n4.n * 1000 + n3.n * 100 + n2.n * 10 + n1.n * 1 n from numbers_seed n1, numbers_seed n2, numbers_seed n3, numbers_seed n4, numbers_seed n5, numbers_seed n6

Solution Core

Assume the source table above is named marbles_seed and the target table is named marbles. The code that generates the needed 270 rows is:

insert into marblesselect m.texture, m.appearance, m.shape, m.color_actual, m.hits, m.color_predicted, m.accuracy, m.coverage from marbles_seed m, numbers n where m.quantity > n.n

Wikimedia Foundation. 2010.

Игры ⚽ Нужно сделать НИР?

Look at other dictionaries:

  • SQL — This article is about the database language. For the airport with IATA code SQL, see San Carlos Airport. SQL Paradigm(s) Multi paradigm Appeared in 1974 Designed by Donald D. Chamberlin Raymond F. Boyce Developer …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”