DB2 – How Index Pages Distributed on Database Partitions | Hadoop, Data Warehousing and Big Data

In a database partitioned environment (MPP), DB2 can store rows of a single table on multiple nodes.

For example, let’s create a sample table as follows:

CREATE TABLE cities
(
  id INT,
  name VARCHAR(30),
  state CHAR(2)
)
DISTRIBUTE BY HASH (id);

Now let’s insert a few rows:

INSERT INTO cities VALUES (1, 'Seattle', 'WA');
INSERT INTO cities VALUES (2, 'Boston', 'MA');
INSERT INTO cities VALUES (3, 'Los Angeles', 'CA');
COMMIT;

DISTRIBUTE BY clause specifies that rows of CITIES table are distributed among all nodes in the partition group by ID column value. Using DBPARTITIONNUM built-in function you can see on which nodes the rows are actually located:

SELECT name, DBPARTITIONNUM(id) FROM cities;

Seattle       8
Boston       24
Los Angeles  24

You can see that rows are stored on nodes 8 and 24 in DB2.

What about Indexes?

If you create an index on a table, will DB2 distribute its blocks to nodes different from the corresponding table row nodes? No, DB2 uses a local index, each node creates an index only for the table rows stored on the node.

That’s why DB2 does not allow you to create an unique index on columns that are not used in DISTRIBUTE BY:

CREATE UNIQUE INDEX idx_cities ON cities(name);

SQLSTATE 42997: Capability is not supported by this version of the DB2
application requester, DB2 application server, or the combination of the two.

Uniqueness must be enforced among all rows in the table while each node can only check uniqueness on local data only. If you insert one more row for city Seattle it can be stored on another node:

INSERT INTO cities VALUES (4, 'Seattle', 'WA');
COMMIT;

You can see that this row in stored on node 24:

SELECT name, DBPARTITIONNUM(id) FROM cities;

Boston                                  24
Los Angeles                             24
Seattle                                 24
Seattle                                  8

To ensure uniqueness on name column all nodes should communicate with each other on each INSERT statement, but DB2 LUW 9.7 just does not allow you to create such unique indexes.

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31

Hadoop, Data Warehousing and Big Data

by Dmitry Tolpeko

DB2 – How Index Pages Distributed on Database Partitions

Leave a comment Cancel reply

Hadoop, Data Warehousing and Big Data

by Dmitry Tolpeko

Share this:

Related

Leave a comment Cancel reply