Archive for category Data Warehousing
The Hawthorne Effect: soft benefit of BI?
Posted by Mike S in Business Intelligence, Data Warehousing on July 6th, 2009
An excerpt from The Nike Experiment, in the most recent issue of Wired:
In the mid-1920s at Western Electric’s manufacturing plant in Cicero, Illinois, the management began an experiment. The lighting in an area occupied by one set of workers was increased so there was better illumination to help them see the telephone relays they were building. Perhaps not surprisingly, workers who had more light were able to assemble relays faster.
Other changes were then made: Employees were given rest breaks. Their productivity increased. They were allowed to work shorter hours. Again, they were more efficient during those hours.
But then something weird happened. The lighting was cut back to normal … and productivity still went up. In fact, just about every change the company made had only one effect: increased worker productivity. After months of tinkering, the work conditions were returned to the original state, and workers built more relays than they did in the exact same circumstances at the start of the experiment.
What was happening? Why was it that no matter what the Hawthorne plant managers did, the workers just performed better? Researchers puzzled over the results, and some still doubt the details of the experiment’s protocols. But the study gave rise to what’s known in sociology as the Hawthorne effect.
The gist of the idea is that people change their behavior—often for the better—when they are being observed (which is why it’s sometimes called the observer effect). Those workers at Western Electric didn’t build more relays because there was more or less light or because they had more or fewer breaks. The Hawthorne effect posits that they built more relays simply because they knew someone was keeping track of how many relays they built.
It can already be difficult to quantify the exact ROI of Business Intelligence, but imagine the potential increase in service quality from implementing a system in a call center that allows a company to study detailed metrics down to the person level. According to the Hawthorne effect, performance may improve not just from optimizing the number of staff working at peak times or identifying call topics with long average call times that may indicate a need for additional training, but simply from the employees knowing that they can be more effectively measured and, in turn, held accountable. How does one quantify that?

Very interesting stuff. Wired is already my favorite magazine, and the focus of several articles (and the cover) of the latest issue is data and measurement, so I highly recommend it.
“In God we trust; all others must bring data.” W. Edwards Deming
Posted by Mike S in Business Intelligence, Data Warehousing on June 17th, 2009
I came across that Deming quote in a book I’m currently reading, Competing on Analytics: The New Science of Winning.
I had never really heard of Deming, but he is certainly an interesting character, and several of his ideas about systems and management are easily applicable to Business Intelligence and Data Warehousing projects. For instance, his 14 Points for Management, from Out of the Crisis:
- Create constancy of purpose toward improvement of product and service, with the aim to become competitive and stay in business, and to provide jobs.
- Adopt the new philosophy. We are in a new economic age. Western management must awaken to the challenge, must learn their responsibilities, and take on leadership for change.
- Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place.
- End the practice of awarding business on the basis of price tag. Instead, minimize total cost. Move towards a single supplier for any one item, on a long-term relationship of loyalty and trust.
- Improve constantly and forever the system of production and service, to improve quality and productivity, and thus constantly decrease cost.
- Institute training on the job.
- Institute leadership. The aim of supervision should be to help people and machines and gadgets to do a better job. Supervision of management is in need of overhaul, as well as supervision of production workers.
- Drive out fear, so that everyone may work effectively for the company.
- Break down barriers between departments. People in research, design, sales, and production must work as a team, to foresee problems of production and in use that may be encountered with the product or service.
- Eliminate slogans, exhortations, and targets for the work force asking for zero defects and new levels of productivity. Such exhortations only create adversarial relationships, as the bulk of the causes of low quality and low productivity belong to the system and thus lie beyond the power of the work force.
- a.) Eliminate work standards (quotas) on the factory floor. Substitute leadership. b.) Eliminate management by objective. Eliminate management by numbers, numerical goals. Substitute leadership.
- a.) Remove barriers that rob the hourly worker of his right to pride of workmanship. The responsibility of supervisors must be changed from sheer numbers to quality. b.) Remove barriers that rob people in management and in engineering of their right to pride of workmanship. This means, inter alia, abolishment of the annual or merit rating and of management by objective.
- Institute a vigorous program of education and self-improvement.
- Put everyone in the company to work to accomplish the transformation. The transformation is everyone’s work. “Massive training is required to instill the courage to break with tradition. Every activity and every job is a part of the process.”

Check out his lengthy Wikipedia entry for more on his concepts and philosophies.
BI/DW and SaaS Stock Indexes
Posted by Mike S in Business Intelligence, Data Warehousing, SaaS on February 20th, 2009
Rick Sherman of the Data Doghouse has compiled Google spreadsheets tracking Business Intelligence/Data Warehousing- and Software as a Service-related stocks. Perhaps not surprisingly, they are faring better than the market as a whole.
(If you’re unfamiliar with the GoogleFinance functions, these spreadsheets actually refresh, unlike the ones on your desktop.)
I am not surprised for a few reasons.
- Business Intelligence is frequently cited as the top technological initiative in organizations, regardless of ups and downs in the market
- Well-run data warehousing implementations have high ROI and can help identify “found money”
- Business Intelligence expedites decisionmaking and empowers decisionmakers to find their own answers, both of which are critical in a volatile market
- SaaS is ostensibly a way to get BI for your company without adding significantly to your staff or spending money to train your existing staff on new technologies
- Perhaps there is even some anticipation of stimulus money making its way to these companies through IT-related initiatives (enormous stimulus visualization from the Washington Post here)
Nice work, Rick.
Mint.com: secretly exposing people to BI and Data Warehousing concepts
Posted by Mike S in Business Intelligence, Data Warehousing on February 18th, 2009

If you’re like me, explaining to friends, strangers, and your parents what you do for a living can be an arduous task. (No, I don’t store different companies’ data in a warehouse somewhere, nor do I fix people’s computers.) When I recently registered with Mint.com, however, I saw parallels that could make understanding what Business Intelligence and Data Warehousing are and how they benefit companies simpler, because what is Mint if not an online, personal finance data warehouse?
Staging disparate data in one place
What’s your net worth? Just add your checking and savings account balances, plus the values of any investments you have - 401(k), IRA, individual stocks - and subtract the current balances of your credit cards, mortgage, student loans, car loans, bookie, and any other debts you may have. That’s about a dozen web sites I would have to visit (and countless passwords to remember) to calculate my current net worth, and it’s actually easier for me, because I have gone to the trouble of setting up online access to those accounts. Otherwise, I would have to peruse dated, tree-killing statements to retrieve numbers that change daily. And as time-consuming as the task is, periodically repeating the process doesn’t make it any faster or easier.
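To make the arithmetic concrete, here is a minimal sketch of that calculation in Python. The account names and balances are made up for illustration; the only point is that net worth is total assets minus total liabilities, once every balance is in one place.

```python
# Hypothetical balances, invented for illustration only.
assets = {
    "checking": 2_500.00,
    "savings": 10_000.00,
    "401(k)": 42_000.00,
    "IRA": 8_000.00,
}
liabilities = {
    "credit card": 1_200.00,
    "student loan": 15_000.00,
    "car loan": 6_300.00,
}


def net_worth(assets, liabilities):
    """Return everything owned minus everything owed."""
    return sum(assets.values()) - sum(liabilities.values())


print(f"Net worth: ${net_worth(assets, liabilities):,.2f}")
```

The calculation itself is trivial. The tedious part is collecting the inputs, which is exactly the part that has to be repeated every time the numbers change.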
Mint allows you to register all of your accounts in one place - checking and savings, investments, loans, and credit cards - and refresh the balances on demand, even keeping track of historical values. Any question you have about your finances can be answered almost instantly, for faster, better informed decisions.
For a company - a commercial bank, for instance - the equivalent would be tracking multiple lines of business: deposit accounts, loans, credit cards, wire transfers, etc. Without a data warehouse, they would be adding the bottom lines of each of them manually, in a spreadsheet, the same way you would calculate your net worth, despite the fact that they have a few billion more in assets. It’s a tedious process that is dangerously prone to error and not conducive to “drilling” into the data, i.e. conducting detailed analyses of numbers that stand out to the report recipients, because the output is anything but dynamic.
Data warehouses can contain a company’s data from multiple source systems, plus custom data, like goals and projections, staged in a single place and organized in a way that is conducive to getting the data back out in the form of reports or dashboards. They keep historical data, and new rows from the source systems are added when the warehouse is refreshed, usually daily or weekly. At that point, the reports and dashboards can be smoothly updated, as they sit on a predictable data structure.
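As a rough illustration of that refresh cycle, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name, column names, source system names, and figures are all assumptions made up for this example, and a real warehouse would be far more elaborate. The idea it shows is simply that each refresh appends dated rows from several source systems instead of overwriting them, so reports can run against a stable structure.

```python
import sqlite3


def refresh(conn, snapshot_date, source_rows):
    """Append one snapshot of (source_system, account, balance) rows.

    Existing rows are never updated, so history is preserved.
    """
    conn.executemany(
        "INSERT INTO balances (snapshot_date, source_system, account, balance) "
        "VALUES (?, ?, ?, ?)",
        [(snapshot_date, src, acct, bal) for src, acct, bal in source_rows],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balances ("
    "snapshot_date TEXT, source_system TEXT, account TEXT, balance REAL)"
)

# Two daily refreshes pulling from two hypothetical source systems.
refresh(conn, "2009-02-17", [("deposits", "checking", 2500.0), ("cards", "visa", -1200.0)])
refresh(conn, "2009-02-18", [("deposits", "checking", 2350.0), ("cards", "visa", -1275.0)])

# A report query sits on a predictable structure: one total per snapshot.
totals = conn.execute(
    "SELECT snapshot_date, SUM(balance) FROM balances "
    "GROUP BY snapshot_date ORDER BY snapshot_date"
).fetchall()
print(totals)
```

Because the query only depends on the table's shape, it keeps working unchanged after every refresh, which is what lets reports and dashboards update smoothly.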
Data cleansing
How good is the data in Mint? It makes its best guess as to how to categorize your transactions, but if you want to use its budgeting capabilities, they had better be pretty accurate. For instance, you may think you have already exceeded your entertainment budget for the month, but then come to realize that your cable/internet bill was mistakenly classified in the Entertainment category. Mint lets you reclassify that transaction and have that rule persist for future transactions with the same description.
For data warehouses, data quality is almost always problematic. Transactions need to have valid reference data so they can be properly classified, addresses in customer or property data must be validated, and other cleansing/special rules must be applied with data cleansing tools or freehand code. In practice, when creating a data warehouse, this can be a lengthy process, as all parties and departments must agree on data definitions, what is valid, how metrics are to be computed, and against what those metrics will be compared, e.g. last year, last month, goals, projections, etc. And every department has a set of rules regarding their data that only they know - “these transactions don’t count towards the total, those transactions are treated differently” - like the ability in Mint to tag rows as reimbursable, tax-related, etc. Fitting everything together within a predictable, repeatable, accepted framework is one of the most challenging aspects of data warehousing. Remember that the processing will ultimately be done by computers, and a human cannot feasibly eyeball every transaction (not in any budgets I’ve seen).
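The cable bill example can be sketched as a tiny rule table in Python. To be clear, this is not how Mint or any cleansing tool actually works internally. The transaction description, the "Utilities" category, and the function names are all hypothetical, chosen only to show a correction being stored as a rule so it applies to later transactions with the same description.

```python
# Hypothetical rule table: transaction description -> category.
# The starting entry stands in for the tool's mistaken best guess.
rules = {"COMCAST CABLE": "Entertainment"}


def categorize(description, rules, default="Uncategorized"):
    """Look up a category by description, falling back to a default."""
    return rules.get(description, default)


def reclassify(description, category, rules):
    """Correct a category and persist the rule for future transactions."""
    rules[description] = category


before = categorize("COMCAST CABLE", rules)
reclassify("COMCAST CABLE", "Utilities", rules)
after = categorize("COMCAST CABLE", rules)  # next month's bill follows the new rule
print(before, "->", after)
```

The warehouse version of this is the same idea at scale: the rules have to be written down and agreed upon, because nobody will be there to fix each row by hand.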
Data visualization, Key Performance Indicators (KPIs), and alerts
Now that your data is all clean and organized, you can confidently look at it.
The main page is like a dashboard, showing your high-level balances, current amounts spent relative to targets, and any alerts you may have triggered for exceeding those targets. A BI/DW term commonly used to describe these is KPIs. A metric is anything that is being measured (total dollars spent), while a Key Performance Indicator is a metric with the added context of being measured against a specific target (amount of budget spent).
A trending page has more spending details, as well as your spending history. Oddly, the “trend” page largely utilizes pie charts, a poor choice for displaying trends (PDF). The transactions tab, pictured in the Data Cleansing section of this post, contains the row-level data.
An investments page shows how your current holdings are tracking relative to several common indices. As you can see, I have been killing the market the last few months, only losing about 20% of the total value of my portfolio.
Some screenshots from Lifehacker
The functions will be very familiar to anyone who uses BI tools: a high-level dashboard, various visual representations to help identify trends and outliers, and the ability to access row-level transactions. This is the realm where I spend much of my work time, personally.
A typical bank might have a dashboard containing high level information about revenue, profit, average balances, and new/lost customers, measured against targets for the current year to date and quarter to date. It may also have information about certain types of customers, regions, or services and products offered. A mature data warehouse might even have information about that bank compared to competitors or the market, but getting feeds of that data can be difficult, just as getting people to agree on targets is. Reports contain the granular data needed to investigate notable findings in the dashboard. Depending on the tool, some reports are dynamic, allowing for drilling and other interactivity, while others are static.
A bit more on KPIs in both Mint and BI tools: in both, the user can set rules that, if violated, trigger alerts that notify the appropriate parties. In Mint, if you exceed a budget, you can be notified via email or text message. Similarly, most BI tools have the ability to send notifications (email is most common) if business rules are violated, thereby enabling better exception management.
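Here is a minimal Python sketch of the metric-versus-KPI distinction and of a rule that triggers an alert. The class, the function names, and the budget figures are hypothetical, and the rule assumes a spending-style KPI where exceeding the target is the violation (for something like revenue, the comparison would run the other way). A real tool would deliver the message by email or text instead of returning a list.

```python
from dataclasses import dataclass


@dataclass
class KPI:
    """A metric (actual) given context by a target."""
    name: str
    actual: float
    target: float

    def pct_of_target(self):
        return self.actual / self.target * 100

    def is_violated(self):
        # Assumed rule for budgets: spending more than the target is a violation.
        return self.actual > self.target


def alerts(kpis):
    """Return one message per KPI whose rule has been violated."""
    return [
        f"{k.name}: {k.pct_of_target():.0f}% of target"
        for k in kpis
        if k.is_violated()
    ]


budget = [KPI("Entertainment", 180.0, 150.0), KPI("Groceries", 310.0, 400.0)]
messages = alerts(budget)
print(messages)
```

Only the exceptions generate messages, which is the whole appeal: nobody has to scan every number to find the ones that need attention.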
Data moves in one direction
When Mint alerts you that a credit card bill is due, it does not provide you the functionality to pay it, and you can’t reallocate your portfolio through their interface upon viewing your atrocious recent investment performance. You merely decide what to do by viewing their site, with no ability to manipulate the data.
Similarly, data in a warehouse only travels in one direction: from source systems into the warehouse, and then from the warehouse to various end user applications (reports, dashboards). It is extremely uncommon that an end user would be able to modify the data that they are viewing, as would be the case with data stored in a spreadsheet. The warehouse is the standard, and end users cannot write data back to it. In short, a warehouse informs and expedites decisionmaking, but it does not carry out the decisions themselves.
—
Mint can do more than I have enumerated here, as can a warehouse, and I’m sure I’ve missed some similarities, but they do have quite a bit in common. Both ideally grant the user better, more accurate, more timely data, affecting behavior positively and leading to faster, better informed decisions. One major area in which they part ways, however, is that Mint is free.




