Understanding Oracle Indexes

B-tree Index

Standard Oracle index is B-tree based.

A key is a column or expression on which you can build an index. Follow these guidelines for choosing keys to index:

  • Consider indexing keys that are used frequently in WHERE clauses.
  • Consider indexing keys that are used frequently to join tables in SQL statements. For more information on optimizing joins, see the section “Using Hash Clusters”.
  • Index keys that have high selectivity. The selectivity of an index is the percentage of rows in a table having the same value for the indexed key. An index’s selectivity is optimal if few rows have the same value.

    Note:

    Oracle automatically creates indexes, or uses existing indexes, on the keys and expressions of unique and primary keys that you define with integrity constraints.


    Indexing low selectivity columns can be helpful if the data distribution is skewed so that one or two values occur much less often than other values.

  • Do not use standard B-tree indexes on keys or expressions with few distinct values. Such keys or expressions usually have poor selectivity and therefore do not optimize performance unless the frequently selected key values appear less frequently than the other key values. You can use bitmap indexes effectively in such cases, unless a high concurrency OLTP application is involved where the index is modified frequently.
  • Do not index columns that are modified frequently. UPDATE statements that modify indexed columns and INSERT and DELETE statements that modify indexed tables take longer than if there were no index. Such SQL statements must modify data in indexes as well as data in tables. They also generate additional undo and redo.
  • Do not index keys that appear only in WHERE clauses with functions or operators. A WHERE clause that uses a function, other than MIN or MAX, or an operator with an indexed key does not make available the access path that uses the index except with function-based indexes.
  • Consider indexing foreign keys of referential integrity constraints in cases in which a large number of concurrent INSERT, UPDATE, and DELETE statements access the parent and child tables. Such an index allows UPDATEs and DELETEs on the parent table without share locking the child table.
  • When choosing to index a key, consider whether the performance gain for queries is worth the performance loss for INSERTs, UPDATEs, and DELETEs and the use of the space required to store the index. You might want to experiment by comparing the processing times of the SQL statements with and without indexes. You can measure processing time with the SQL trace facility.

Bitmap Indexes

1. 案例

  有张表名为table的表,由三列组成,分别是姓名、性别和婚姻状况,其中性别只有男和女两项,婚姻状况由已婚、未婚、离婚这三项,该表共有100w个记录。现在有这样的查询:     select * from table where Gender=‘男’ and Marital=“未婚”;

姓名(Name)

性别(Gender)

婚姻状况(Marital)

张三

已婚

李四

已婚

王五

未婚

赵六

离婚

孙七

未婚

1)不使用索引

  不使用索引时,数据库只能一行行扫描所有记录,然后判断该记录是否满足查询条件。

2)B树索引

  对于性别,可取值的范围只有’男’,’女’,并且男和女可能各站该表的50%的数据,这时添加B树索引还是需要取出一半的数据, 因此完全没有必要。相反,如果某个字段的取值范围很广,几乎没有重复,比如身份证号,此时使用B树索引较为合适。事实上,当取出的行数据占用表中大部分的数据时,即使添加了B树索引,数据库如oracle、mysql也不会使用B树索引,很有可能还是一行行全部扫描。

2. 位图索引出马

如果用户查询的列的基数非常的小, 即只有的几个固定值,如性别、婚姻状况、行政区等等。要为这些基数值比较小的列建索引,就需要建立位图索引。

对于性别这个列,位图索引形成两个向量,男向量为10100…,向量的每一位表示该行是否是男,如果是则位1,否为0,同理,女向量位01011。

RowId

1

2

3

4

5

1

0

1

0

0

0

1

0

1

1

  对于婚姻状况这一列,位图索引生成三个向量,已婚为11000…,未婚为00100…,离婚为00010…。

RowId

1

2

3

4

5

已婚

1

1

0

0

0

未婚

0

0

1

0

1

离婚

0

0

0

1

0

   当我们使用查询语句“select * from table where Gender=‘男’ and Marital=“未婚”;”的时候 首先取出男向量10100…,然后取出未婚向量00100…,将两个向量做and操作,这时生成新向量00100…,可以发现第三位为1,表示该表的第三行数据就是我们需要查询的结果。

RowId

1

2

3

4

5

1

0

1

0

0

and

未婚

0

0

1

0

1

结果

0

0

1

0

0

3.位图索引的适用条件

  上面讲了,位图索引适合只有几个固定值的列,如性别、婚姻状况、行政区等等,而身份证号这种类型不适合用位图索引。

  此外,位图索引适合静态数据,而不适合索引频繁更新的列。举个例子,有这样一个字段busy,记录各个机器的繁忙与否,当机器忙碌时,busy为1,当机器不忙碌时,busy为0。

  这个时候有人会说使用位图索引,因为busy只有两个值。好,我们使用位图索引索引busy字段!假设用户A使用update更新某个机器的busy值,比如update table set table.busy=1 where rowid=100;,但还没有commit,而用户B也使用update更新另一个机器的busy值,update table set table.busy=1 where rowid=12; 这个时候用户B怎么也更新不了,需要等待用户A commit。

  原因:用户A更新了某个机器的busy值为1,会导致所有busy为1的机器的位图向量发生改变,因此数据库会将busy=1的所有行锁定,只有commit之后才解锁。

Cluster Index

Oracle will create an index to police an unique constraint where no pre-existing index is suitable. Without the index, Oracle would need to serialize operations (such as a table lock) whenever someone tries to insert or delete a row (or update the PK).

Contrarily to MS-SQL Server, this index is not clustered on heap tables (default table organization), i.e. this index won’t change the underlying table structure and natural order. The rows won’t be reordered when Oracle creates the index. The index will be a B-tree index and will exist as a separate entity where each entry points to a row in the main table.

Oracle doesn’t have clustered index as MS SQL, however indexed-organized tables share some properties with cluster-indexed tables. The PK is an integral part of such tables and has to be specified during creation.

(Oracle also has table clusters, but they are a completely different concept).

Reference:

1. Understanding Indexes and Clusters

2. Indexes and Index-Organized Tables  

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s