Tryst with all things Oracle: No more, no less comparison

One of the queries I finally came up today is worth documenting; well at least I thought so.

Here is the scenario:

An input list of statenos is to be flagged for being same:

If and only if such an EXACT list (no more, no less statenos) is currently pending review.

The highlighted part was the most difficult to implement.

Here are the involved structures:

1. First is a parent table that is to store merely the flagging instances and a status. ‘NEW’ for yet to be reviewed

CREATE TABLE NYEHMS_DUPLICATE_STATENOS

(

DUPLICATION_ID NUMBER,

FLAGGED_BY NUMBER,

FLAGGED_DATE DATE,

STATUS_CD VARCHAR2(10 BYTE),

REQUEST_COMMENTS VARCHAR2(3000 BYTE),

REVIEW_COMMENTS VARCHAR2(3000 BYTE),

LAST_UPDATED_BY NUMBER,

LAST_UPDATED_DATE DATE

)

2. Second is the meat of the matter here, the child table stores a parent id and all the involved statenos in that flagging

CREATE TABLE NYEHMS_DUPLICATE_STATENOS_LIST

(

DUPLICATION_LIST_ID NUMBER,

DUPLICATION_ID NUMBER,

ORIGINAL_STATENO VARCHAR2(40 BYTE),

CURRENT_STATENO VARCHAR2(40 BYTE)

)

So the query must loop through the child table field Current_stateno and look for a complete set identified by Duplication_ID that is a one-one match with the input list.

If each member of the input list is found to be in some different sets (different duplication_id) then that must not be counted as a full match. This would be much easier but as stated above that is not what we need.

Here is the query. It makes use of Analytical functions in Oracle among others.

1 SELECT COUNT (*)

2 INTO l_exist_count

3 FROM ( SELECT duplication_id, COUNT (match) match_cnt

4 FROM (SELECT tbl.COLUMN_VALUE,

5 dup_list.current_stateno match,

6 dup_list.duplication_id

7 FROM TABLE (l_stateno_num_arr) tbl

8 LEFT OUTER JOIN

9 (SELECT duplication_id,

10 current_stateno,

11 COUNT (DISTINCT current_stateno)

12 OVER (PARTITION BY duplication_id)

13 list_cnt

14 FROM nyehms_new_flagged_same_vw) dup_list

15 ON dup_list.current_stateno = tbl.COLUMN_VALUE

16 AND dup_list.list_cnt = l_token_cnt)

17 GROUP BY duplication_id)

18 WHERE duplication_id IS NOT NULL AND match_cnt = l_token_cnt;

· TABLE (l_stateno_num_arr) tbl represents the input

· nyehms_new_flagged_same_vw represents the yet to be reviewed existing flagged sets

Dissection of key parts:

1. Line 7 to 14: First thing we need is a LEFT OUTER JOIN between the input list and the yet to be reviewed (YTBR). Even in the YTBR we are interested in only those sets that have as many distinct statenos as the input list did. (For instance upon merges 2 of the values in current_stateno under one set could end up being just the same stateno). That explains the portion from 9 to 14

2. Line 4 to 6: We select each input stateno and if a match existed the matched stateno (calling it match) and the set (duplication_id) in which it was found. Being a Left outer, if a match wasn’t found it would just pull NULL for the match and set id fields

3. Line 3 and 17: This result-set from previous 2 steps is now aggregated to give in which set (duplication_id) how many matches were found, the GROUP BY and COUNT(match)

4. Line 1, 2 and 18: Finally we now check if any results from step 3 (duplication id and number of matches) exists that had as many matches as the count of our input list (line 18)

Tryst with all things Oracle

Friday, May 24, 2013

No more, no less comparison

No comments:

Post a Comment