
Lesson 6 – Script Tips ‘n’ Tricks

Overview

This short lesson focuses on typical questions from day-to-day practice and their possible solutions.

Flagging duplicates

In practice it often happens that “duplicates” of one piece of information exist, but you only want to consider the “newest” record. To solve this problem, QlikView offers so-called “inter-record functions” in the script. These allow you to access previous or following records while a source is being loaded. The solution for our problem would look like this:

//in the first step we load the table unchanged
RawTable:
LOAD 
    *
FROM [$(PATH_DATASTAGING)TestSource.qvd];

//afterwards we sort it by our duplicate key and e.g. a date, newest first
SortedTable:
NOCONCATENATE LOAD
    *
RESIDENT RawTable
ORDER BY %DublettenId, %Date DESC;

//In order to prevent circular references, we remove the old table
DROP TABLE RawTable;

//afterwards we compare the keys and flag the first record of each duplicate group as valid
FinalTable:
NOCONCATENATE LOAD
    *
    ,IF(PREVIOUS(%DublettenId) = %DublettenId,0,1) AS FLAG_VALID
RESIDENT SortedTable;

//at last we remove the table again in order to prevent circular references
DROP TABLE SortedTable;

Resolving “multiple” keys

In practice it often happens that two tables are related via multiple fields; QlikView then builds a synthetic key between them. To improve performance and keep the data model “clean”, it is worth combining those key fields into one with a simple trick.

//Initial situation: both tables share two key fields
Table1:
LOAD
 %CostcenterId,
 %ProductId,
 ...
FROM [$(PATH_DATASTAGING)Table1.qvd];

Table2:
LOAD
 %CostcenterId,
 %ProductId,
 ...
FROM [$(PATH_DATASTAGING)Table2.qvd];

//we create a combined key to enhance the performance
CombinedTable1:
NOCONCATENATE LOAD
  %CostcenterId & 'xx' & %ProductId AS %Connection,
  ...
RESIDENT Table1;

DROP TABLE Table1;

The same is done for Table2. The single key fields %CostcenterId and %ProductId should then remain in only one of the two tables, so that only %Connection links them.
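If the combined key is only needed for linking and is never displayed, memory can be saved further with AutoNumber(), which replaces the string key with a compact sequential integer. A sketch following the same pattern (the table name CombinedTable1 is only illustrative):

```
//AutoNumber turns the string key into a small integer - cheaper in RAM
CombinedTable1:
NOCONCATENATE LOAD
  AutoNumber(%CostcenterId & 'xx' & %ProductId) AS %Connection,
  ...
RESIDENT Table1;

DROP TABLE Table1;
```

Note that AutoNumber keys are only comparable within one reload of one document, so this is unsuitable if the key must be stored in a QVD and matched later.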

Classification of data

ApplyMap function

Assigns information based on a “map”. The following requirements apply to the map:

  • The mapping table consists of exactly 2 columns; the first holds the lookup value, the second the future “class value”.
  • Both columns must be named, although the names are not relevant later on. If you forget to name the columns of e.g. an inline table, the first value will never generate a hit, because it is consumed as the header row.
  • When the mapping table is applied, the values of a certain field are compared; if there is a hit, the value is “replaced” by the mapped value, otherwise the original value persists (an optional third parameter can supply a different default instead).

If you are asking yourself why we don’t bring the data into the data model via a JOIN instead, the answer is point 3 above: a JOIN would enlarge the table, leave NULL where there is no match, and make it impossible to simply “keep” the original value.

Example with the use of an inline-table:

//at first we build our inline-table
[Discount Deadlines]:
LOAD * INLINE
[
  Discount, Deadline
  VZ1Y, 1 Year
  VZ2Y, 2 Year
  B1Y, 1 Year
];

//afterwards we build a mapping table
Discount_Mapping:
MAPPING LOAD
  Discount,
  Deadline
RESIDENT [Discount Deadlines];

//don't forget to delete the table
DROP TABLE [Discount Deadlines];

//the map thus created can now be applied to the orders
Orders:
LOAD 
  *,
  ApplyMap('Discount_Mapping', Discount, Discount) AS [Discount Deadline]
;
SQL SELECT * FROM Orders;

IntervalMatch function

Classifies values based on a lower and an upper bound. The function can also be used for more complex questions such as time-course analyses. To understand the principle behind it, it is easiest to take the example of a real estate agency that wants to classify its apartments by living space.

//at first we load the apartments
Apartments:
LOAD 
 ApartmentId,
 [Living space],
 Postcode;
SQL SELECT * FROM Apartments WHERE "Living space" > 0;

//Loading of the sizegroups
Sizegroups:
LOAD * INLINE
[
 VonQM, BisQM, SizeGroup
 0, 10, small
 10.0001, 40, intermediate
 40.0001, 100, big
 100.0001, 99999, gigantic 
];

//now we match the living space against the size groups
//(the intervals are closed, hence the .0001 offsets to avoid double matches)
IVMatch:
IntervalMatch([Living space])
LOAD DISTINCT VonQM, BisQM RESIDENT Sizegroups;
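The IVMatch bridge table only contains [Living space], VonQM and BisQM. To make SizeGroup directly usable in charts, a common follow-up step – a sketch under the assumptions of the example above – is to join the group name into the bridge and drop the helper table:

```
//pull the SizeGroup into the bridge table
LEFT JOIN (IVMatch)
LOAD VonQM, BisQM, SizeGroup
RESIDENT Sizegroups;

//the helper table is no longer needed
DROP TABLE Sizegroups;
```

This leaves a single link table between Apartments and the classification, which keeps the data model tidy.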

The function can also be used for events and then leads to the following data model in the background:

[Figure: the IntervalMatch data model]

Performance – Concatenating transaction tables

Concatenating transaction tables – even with different column counts and different fields – brings a big performance improvement. As long as all key fields are shared by the concatenated loads, this is harmless for later drilling into detail. If not all key fields overlap, concatenation is still possible, but drilling down can produce gaps in the detail data for which you might have to “explain” yourself.

Attention: QlikView automatically concatenates tables with the same structure. This can lead to big problems, but also creates great opportunities. The following example shows how to control this behaviour:

//automatic merge
LOAD a, b, c FROM Table1.csv;
LOAD a, c, b FROM Table2.csv;

//manual merge
LOAD a, b, c FROM Table1.csv;
CONCATENATE LOAD a, c FROM Table2.csv;

//manual "not" merge
LOAD a, b, c FROM Table1.csv;
NOCONCATENATE LOAD a, b, c FROM Table2.csv;

In the last case you should take one more step and rename the fields (or QUALIFY them), because two unlinked tables with identical field names produce a synthetic key over all shared fields, and there is a high risk of your data model “bursting”.
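One way to defuse the last case – a sketch, assuming the two tables really should stay unlinked; the field names a_T2 etc. are only illustrative – is to rename the fields of the second table so that no synthetic keys arise:

```
LOAD a, b, c FROM Table1.csv;

//renaming breaks the accidental field-name match
Table2:
NOCONCATENATE LOAD
  a AS a_T2,
  b AS b_T2,
  c AS c_T2
FROM Table2.csv;
```

Alternatively, a QUALIFY * statement before the second load prefixes every field with its table name and achieves the same effect.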

Execution of a function for every line of a table

QlikView allows you to define your own subroutines and to call them with parameters. The simplest example is writing a message to a log file. (This is just to demonstrate the technique – don’t question its usefulness!)

//Definition of the function
SUB Qvc.Log (_msg, _level)
 ...
END SUB;

//here we load the initial table
Persons:
LOAD 
  id,
  name
FROM Table1.csv;

//determine how many rows the table has
LET varNoOfRecords = NoOfRows('Persons');

//creation of a dummy table with a single record
LOAD 
 'F' as DummyField
AUTOGENERATE (1);

//start of a loop through the table
FOR ii = 0 TO $(varNoOfRecords) - 1

 LET varName = peek('name',$(ii),'Persons');
 CALL Qvc.Log('Hello $(varName)');

NEXT;

 
