Scripting for the long haul

You know how Qlik recommends creating custom properties to ease administration of security rules in Qlik Sense? In a conceptually similar way, I am realizing the wisdom of creating flags to encapsulate business rules in the transform layer, where I might otherwise have handled them differently.

For instance, at one customer, there are several business rules governing who should be included in HR Headcount — employee status, employee type, etc. We also have several versions of Headcount stored in QVDs, aggregated at different levels. Instead of rewriting that business rule every time we re-crunch the Employee data, I added a binary “Include in Headcount” flag one step upstream that could easily be used in each load when we created the different Headcount aggregations. I’m sure the downstream reloads are faster with simpler WHERE clauses, too, e.g. WHERE ExampleFlag = 1 instead of several conditions.
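
As a minimal sketch of the pattern — the table, field names, and rule conditions below are hypothetical stand-ins, and the real rules at the customer were more involved:

Employees:
LOAD *,
     If(EmployeeStatus = 'Active' and EmployeeType = 'Regular', 1, 0) as [Include in Headcount]
FROM Employees.qvd (qvd);

// Downstream aggregations then need only the simple flag test
HeadcountByDepartment:
LOAD Department,
     Count(DISTINCT EmployeeID) as Headcount
RESIDENT Employees
WHERE [Include in Headcount] = 1
GROUP BY Department;

If the rule changes, only the If() expression in the transform layer changes; every downstream aggregation picks it up automatically.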

Conversely, a while ago, the same customer asked us to exclude a few Items from one UI application, which seemed like a one-off business rule. As time went on, they asked for those same Items to be excluded from other applications, too, and in each case we rewrote the business rule independently. Oops. Now, if that list of Items changes, we will have to change a lot of applications to absorb it.

We could have used an include file or created an Item list in a QVD to specify the excluded Items in each application, but in retrospect I think the best and cleanest approach, again, would have been to just create a binary “exclude” flag in the Item dimension table, to be referenced in each application where it is relevant. The flag could still be defined using an externalized Item list, of course, but with only one place to change it in the script, that additional layer of complexity may not even be worth the trouble.
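
A sketch of that combined approach, with hypothetical file and field names — the flag lives in the Item dimension but is driven by an externalized list:

ExcludedItemsMap:
MAPPING LOAD ItemID,
     1 as Flag
FROM ExcludedItems.csv (txt, utf8, embedded labels, delimiter is ',');

Items:
LOAD *,
     ApplyMap('ExcludedItemsMap', ItemID, 0) as [Exclude from UI]
FROM Items.qvd (qvd);

Each application then just filters on [Exclude from UI] = 0, and the list is maintained in one place.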

Live and learn. Each project can be of higher quality than the previous. Consider the long-term implications of what you are doing, even if it seems like a bit of extra work, up front.

NB: Add a field comment explaining the exact business rule encapsulated in your flags, e.g. COMMENT FIELD ExampleFlag WITH 'A pretty sweet comment…'; That way, any time you load from the QVD, you can see what is baked into that field via the QVD’s XML header, without having to look back at the script where it was generated.


Squeezing more performance out of a large, concatenated fact table

DID YOU KNOW?

  1. Several years ago, I discovered that joining to a resident table in QlikView causes the resulting table to be ordered by the key field, in that field’s original load order. See the screenshot below. It is as if you had executed an ORDER BY with the field’s original load order as the ascending sort sequence.
  2. I also discovered that, similar to indexing a table in a database, ordering a table by a field commonly used in calculations, filters, and dimensions can improve the general calculation performance of an application. I found this to be especially true when ordering a table containing a dimension that was referenced in an Aggr() function. In general, though, for large applications with various point-in-time analyses, trend graphs, and no problematic Aggr usage, the central date field is a great candidate.

Preview of data model after joining Customer to Sales fact table on CustomerKey:

Customer Sort

Recently I got around to doing some more testing involving both of these variables, and to codifying the approach in a subroutine so it would be easy to leverage more widely to improve application performance.

To the naked eye, the three tested scenarios have identical data models, anchored with a 39M-row concatenated fact table: the only difference is in how that fact table was sorted in the final application.

  • “Field Grouping” is grouping like field values together, as would result from a join, not necessarily in a natural sequence for analysis. See #1 from above. The first scenario is what I believe the default would be if you simply built a concatenated fact table with no consideration given to table sorting at all. The second scenario is left joining the distinct Date values back to the fact table in order to group all rows for the same Date together in the final application.
  • “Field Ordering” is the result of doing a true ORDER BY at the end of the reload. See #2 above. In the third scenario, the like Date values are not just grouped together but are also in sequential numeric order. I thought this may have made a difference because it would make all of the days in the same month, year, etc. adjacent in the application’s final data model.
                                         Field Grouping:   Field Grouping:   Field Ordering:
                                         Row Type          Date              Date
Size on disk (compressed)                267MB             263MB             282MB
Size in RAM                              2,473MB           2,474MB           2,474MB
Initial page calc time (avg)             5.9s              3.9s              3.9s
Peak RAM consumption during subroutine   -                 3,738MB           4,645MB
Subroutine run time                      -                 29s               2m40s

As you can see, in my test, applying logic to the fact table to group like Date values together improved performance by a full third, lowering calculation time from 5.9 to 3.9 seconds. Not bad for so little additional work and reload time.

As you can also see, I did not observe better performance from ordering the dates sequentially (scenario 3) compared to just grouping the like values together (scenario 2). Because that script’s execution took significantly longer and used a lot more RAM (resident loading the entire concatenated fact table again, with an ORDER BY), I would stick with the lower overhead field grouping approach.

Here’s the subroutine script I used for testing, ready to be re-used. You just have to plug in the table name and the field whose values you want to group.

Sub Ax.IndexResidentTable( v_TableName, v_GroupFieldName )

	LET v_StartSubTimestamp = Now();
	TRACE [MESSAGE]:  Grouping like '$(v_GroupFieldName)' field values in table '$(v_TableName)' to improve calculation performance;

	// Joining the field's distinct values back to the table groups like
	// values together without a full resident reload and ORDER BY
	LEFT JOIN ([$(v_TableName)])
	LOAD FieldValue('$(v_GroupFieldName)', RecNo()) as [$(v_GroupFieldName)]
	AUTOGENERATE FieldValueCount('$(v_GroupFieldName)');

	LET v_SubDuration = Interval(Now() - v_StartSubTimestamp, 'h:mm:ss');
	TRACE [MESSAGE]:  Finished grouping like '$(v_GroupFieldName)' field values in table '$(v_TableName)' in $(v_SubDuration);

	// Clear the parameter and helper variables
	LET v_TableName = ;
	LET v_GroupFieldName = ;
	LET v_StartSubTimestamp = ;
	LET v_SubDuration = ;

End Sub
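
Calling it is a one-liner — here assuming a hypothetical concatenated fact table named Facts whose Date values you want to group:

CALL Ax.IndexResidentTable('Facts', 'Date');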

Here is what you see when you call the sub:

Sort Panel

That said, if you are already denormalizing the data model by combining tables to reduce hops between them, this subroutine would not even be needed. Based on what we have learned about joins, we need only to join the calendar table to the fact table last in the overall sequence of joins to group the like date values together in the final data model.
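
As a sketch of that join ordering, with hypothetical table and field names:

Facts:
LOAD OrderID, CustomerKey, DateKey, Amount
FROM Facts.qvd (qvd);

// Other dimension joins happen earlier in the sequence
LEFT JOIN (Facts)
LOAD CustomerKey, CustomerName, CustomerRegion
FROM Customer.qvd (qvd);

// Calendar joined last, so the final table comes back grouped by DateKey
LEFT JOIN (Facts)
LOAD DateKey, MonthName, Year
FROM Calendar.qvd (qvd);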


Lasso filtering in Qlik Sense Extensions

A few months ago I created a lasso plugin for D3 that mimics the lasso functionality available in Qlik Sense’s out-of-the-box charts. The ultimate goal in building the plugin was to use it in Qlik Sense Extensions to enable selections in the same way those charts do.

Recently, Brian Booden and Ralf Becher built an awesome hexagonal binning extension that utilizes the d3 lasso plugin to enable selections. Through this process they gave me some great feedback on issues they found. Having resolved those, I’ve updated the library and would like to demonstrate how others can integrate the lasso plugin with their own Qlik Sense Extensions.

Brian and Ralf's extension




US Choropleth Using QlikView Pivot Table

We had some conversations internally around the idea that it might be easier to read a choropleth if the regions on it (in this case US states) were all a uniform size and shape. It is possible to do this using QlikView’s standard pivot table. Here I’ll explain how to set this up and provide a sample application.

First, we start with a data set. We’ll need 2 tables. The first is our [State Data] that has the data we want to compare for each state. The other table, [Map Grid], has the row and column information that will tell us where to put each state on the map. I used a relatively simple data set for my [State Data] table – just a list of states and a random number between 0 and 1 to make it simple to use the ColorMix1 function on the front end. The [Map Grid] table will have 3 columns – the row, column and state name that will relate to our [State Data] table. We’ll still need to store rows for the blank spaces between states because otherwise there isn’t actually any data to show – there needs to be something there to represent a space.
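
A sketch of the two tables, loaded inline for illustration — the states, ratings, and grid positions shown are made up, and the real [Map Grid] would cover all states plus the blank filler cells:

[State Data]:
LOAD * INLINE [
State, Rating 1
Alabama, 0.37
Alaska, 0.82
];

// Row/Column place each state on the grid; rows with a blank State
// represent the empty spaces between states
[Map Grid]:
LOAD * INLINE [
Row, Column, State
1, 1, Alaska
1, 2,
7, 6, Alabama
];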

Once you have your data set, you’ll load that into QlikView. The [State Data] table and [Map Grid] table should be related on the state field.

US Choropleth - script US Choropleth - data model

On the front end, create a pivot table and add the Row and Column fields from the [Map Grid] table as dimensions. Set the background color and text color properties to argb(0,0,0,0) – this will essentially hide the Row and Column values from the user, as they are not useful anyway.

US Choropleth - dim background color US Choropleth - dim text color

Go to the Presentation tab and turn on the always fully expanded option. We’re using a pivot table, but we don’t actually want the user to be able to use it like a pivot table.

US Choropleth - always fully expanded

In your expressions, add the state name field. I’ve set it up so the expression is actually only({1} State), so the grid always shows all states. Selections will be highlighted with colors instead of hiding the state in the chart.

US Choropleth - expressions

In the background and text color properties, we’re going to add an expression that will color the cells based on the value of the metric being measured. In the background color, first we’re checking to see if there is a valid state in the cell – if there’s not, we’ll just use that argb(0,0,0,0) value again to make it transparent. If there is a state name, we’ll use the ColorMix1 function to return a color between my c_RiskLight and c_RiskDark variables (light blue and dark blue) based on the value of Rating 1, which from the first step is a random value between 0 and 1. The text color is just going to check for the value of Rating 1 and return white if it is greater than .5 and black if it is less than .5. This will make it easier to see the state name against the different colored backgrounds.
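
Sketched as chart expressions — the variable and field names follow the description above, but the exact conditions in the app may differ:

// Background color: transparent for filler cells, ColorMix1 gradient otherwise
if(len(only({1} State)) = 0,
   argb(0,0,0,0),
   ColorMix1(only({1} [Rating 1]), c_RiskLight, c_RiskDark))

// Text color: white on dark backgrounds, black on light ones
if(only({1} [Rating 1]) > 0.5, white(), black())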

US Choropleth - background color US Choropleth - text color

At this point, you have a standard looking pivot table, but it doesn’t look much like a map. You’ll want to take the Column dimension and drag that up towards the top over the State Name expression.

US Choropleth - unpivoted US Choropleth - pivoting US Choropleth - pivoted

There are a few touch-ups we can do to make this look a little nicer:

  • Make the column width on the Row and Column dimensions as small as you can, turn off the dropdown select, and make the labels just blank spaces.
  • On the Style tab, click the Background button at the bottom of the window and change the transparency to 100% (not the cell background color transparency – that would turn off all coloring on your states).
  • Soften the cell borders between states by changing the cell border transparency on the Style tab to 100% (or whatever you’re comfortable with).
  • On the Presentation tab, wrap the cell text to 2 or 3 lines and center the Data (Text) alignment for the State Name expression.
  • On the Sort tab, un-check all boxes for both the row and column dimensions so that when you select a state, it stays in place relative to the other states.

When you’re done, you should have something that looks like this:

US Choropleth - complete

You can see this on Qlik Branch here or on GitHub here.


Hands-On Workshop (Hollywood, FL): Unleash Your Intuition

Learn How to Bring Your Data to Life

Imagine an analytical tool so intuitive, anyone in your company could easily create personalized reports and dynamic dashboards to explore vast amounts of data to find meaningful insights.

That’s Qlik® Sense - a revolutionary self-service data visualization and discovery application designed for individuals, groups, and organizations.

Please join us for a half day, hands-on workshop where you can roll up your sleeves so that you can see the power of Business Discovery firsthand. During this session you will:

  • Gain an understanding of why associative, in-memory analysis is currently driving the change in the Business Intelligence market
  • See the power of visualization and how it enables your organization to make faster, smarter decisions
  • Construct a fully-functional dashboard application
  • Learn how to navigate your application and present it to your colleagues

Axis Group will also be demoing one of our visual analytics applications. Register today to reserve your seat! We look forward to seeing you.

June 18, 2015
11:30 a.m. – 4:00 p.m.
Cocktail reception to follow

Registration is from 11:30-1:00 p.m. Lunch will be provided

Hard Rock Hotel & Casino 
1 Seminole Way
Hollywood, FL 33314

Who Should Attend 
IT or BI professionals
Business Analysts

What is Required 
Laptop
Latest edition of Qlik Sense


Consistent font usage

“…Thirteen’s just obnoxious.”


We’re hiring QlikView consultants

Apologies for the lag between posts lately, but we at Axis Group are ridiculously busy…in a good way.

In fact, we could use the help of some additional experienced QlikView consultants, if you’re looking for an opportunity with our company. We’re hiring both engineer- and architect-level positions.

  • Required
    • QlikView knowledge
  • Recommended
    • Ability to use Tuftean in a sentence

See our job page for more information, but bear in mind that the location is negotiable if you are willing to travel.

We look forward to hearing from you.


d3vl extension and updates to senseUtils

d3vl
I’ve uploaded a new extension to GitHub called the d3 visualization library, or d3vl for short. It is a collection of 30 examples from the d3js.org gallery incorporated into a single Qlik Sense extension. The goal is to show how easily d3 can be integrated with Qlik Sense.

View a demo of the d3vl and its charts here.

The d3vl also demonstrates the concept of using an extension as a library of charts, rather than a single chart. Using an extension as a library of similar charts has a few advantages, such as an easy installation, minimal impact on Sense’s chart object list, and common themes across the charts. In the future I hope to see more libraries like these, as opposed to one-off extensions. A great example would be a network extension that contains charts related to networking, all with a common styling and notation.

senseUtils
The senseUtils library has also been updated with new functions. I will write up a future post about how to use them, but here is a quick rundown:

  • pageExtensionData: pages an extension’s data. Currently, the max page size for an extension is 10,000 cells, so anything beyond that will not be returned to your extension object; if you need more data than that, you can use this function.
  • flattenData: flattens pages of extension data into one array. Currently, the page arrays are contained in an outer array; this function concatenates them into a single array.
  • getDimLabel: gets the dimension labels for an extension.
  • getMeasureLabel: gets the measure labels for an extension.
  • setupContainer: sets up the container for an extension, which includes logic for emptying the container on redraw.
  • extendLayout: extends the layout object by adding some useful properties. An example is a property on the qMatrix row data with a pre-defined function for selecting that row’s values, which makes it easy to incorporate selections into visualization libraries like d3 without having to write new selection functions.

A post demonstrating how to use these new functions will be available soon. In the meantime, d3vl uses a lot of these new methods and can be used as an example.

-Speros


Mastering QlikView Book Review

Stephen Redmond is a QlikView expert, as indicated by his book Mastering QlikView in which he clearly illustrates myriad advanced techniques from data modeling and performance tuning to expression writing and visualization best practices. While there is a disclaimer in the beginning of the book that the content caters to more advanced QlikView developers, I believe that Mastering QlikView is a must-read for anyone endeavoring to create QlikView applications. Though having in-depth knowledge of the tool does help to provide more context to the optimization portions, Redmond does an excellent job of giving the reader a basic overview of each chapter’s content prior to delving into more detail; it doesn’t read at all like a technical manual, but more like a tutorial.

Mastering QlikView starts with a technically heavy performance tuning and optimization guide with detailed instructions on how to debug any QlikView application for slow behavior; the section on using JMeter for user simulation is particularly well-written and explains each step carefully without assuming foreknowledge of the tool itself. Redmond also explains how to most efficiently acquire test data and provides links to sources he uses when testing a QlikView environment. He then goes on to detail the process of data modeling, theories on data warehousing, and performance testing on the front end – he even includes a guide on how to use Expressor to optimize the ETL process.

What struck me most about this book is that it not only answers the question of “How?” but also the question of “Why?”. Redmond gives sound reasoning and empirical proof of the benefits of using a variety of techniques for each topic he covers. The section on data visualization is imperative for any developer to have in his or her back pocket in order to understand how and why users interact with QlikView applications (or really anything) the way that they do – he uses an example of a door with a handle that says “PUSH”, a situation to which I’m sure all of us can relate.

Mastering QlikView is a great addition to any QlikView developer’s bookshelf – even the most sophisticated developer would certainly come away with something new or at the very least a better understanding of QlikView. I will certainly be referring back to this book in my future engagements and would recommend that any QlikView developer, novice or expert, keep Mastering QlikView close at hand while creating QlikView applications.


Implementing the D3 General Update Pattern into a Qlik Sense Extension

Animation in data visualization is a relatively new area that we still don’t know much about. Like any visual component of a visualization, animations used in the wrong way can be unnecessarily flashy and distracting. However, there is evidence that animations can also add value to visualizations when used appropriately.

In this interactive example, Mike Bostock demonstrates the concept of object constancy:

a graphical element that represents a particular data point…can be tracked visually through the transition. This lessens the cognitive burden by using preattentive processing of motion rather than sequential scanning of labels.

Animation therefore can help users track changes in the data. When filtering, this can help the user discern what changes their filters caused, as opposed to having to reset their bearings on a newly appearing chart. When data in a chart is filtered, three events can happen: data elements either update with new properties, exit because they were filtered out of the data set, or enter because they were filtered into the data set. These are the main events we handle in typical Qlik Sense extensions. For more on animation possibilities, check out https://www.youtube.com/watch?v=vLk7mlAtEXI#t=54

