Edit: This is the first of a series of articles (I hope!) in which I’m trying to teach myself about BI. Any articles I write that are related to this, starting with this one, will be preceded with “#BI101” in the title.
As I stated in a previous article, one topic about which I’m interested in learning more is business intelligence (BI). For those of you who are new to BI, it is a broad topic. In a nutshell, it can probably be described as “consuming and interpreting data so it can be used for business decisions and/or applications.”
I’ll admit that I don’t know a lot about BI (at least the fine details, anyway). I did work a previous job where I touched upon it; I was tasked with performing some data analysis, and I was introduced to concepts such as OLAP cubes and pivot tables. I’ve gotten better at creating pivot tables — I’ve done a few of them using MS Excel — but I’ll admit that I’m still not completely comfortable with building cubes. I suppose that’ll come as I delve further into this.
A while back, my friend, Paresh Motiwala, suggested that I submit a presentation for Boston SQL Saturday BI edition. At the time, I said to him, “the only thing I know about BI is how to spell it!” He said to me (something like), “hey, you know how to spell SQL, don’t you?” Looking back at the link, I might have been able to submit (I didn’t realize, at the time, that they were running a professional development track). That said, Paresh did indeed have a point. As I often tell people, I am not necessarily a SQL expert (I know enough SQL to be dangerous), but that does not stop me from applying to speak at SQL Saturday. Likewise, as I dive further into this topic, I’m finding that I probably know more about BI than I’ve led myself to believe. Still, there is always room for improvement.
To tackle this endeavor, I once again decided to jump in using a subject that I enjoy immensely: baseball. Baseball is my favorite sport, and it is a great source of data for stat-heads, mathematicians, and data geeks. I’ve always been of the opinion that if I’m going to learn something new, I should make it fun!
Besides, the use of statistical analysis in baseball has exploded. Baseball analytics has been a big deal ever since Bill James introduced sabermetrics (there is some debate as to whether James has enhanced or ruined baseball). So what better way to introduce myself to BI concepts?
For starters, I came across some articles (listed below, for my own reference as much as anything else):
- A Guide to Sabermetric Research
- Baseball Analytics with R
- Beginner’s guide to baseball analytics
- Sabermetrics (Wikipedia article)
I also posted a related question in the SSC forums. We’ll see what kind of responses (if any) I get to my query.
Let’s start with the basics — what is BI?
Since I’m using baseball to drive this concept, let’s use a baseball example to illustrate this.
Let’s say you’re (NY Yankees manager) Aaron Boone. You’re down by a run with two outs in the bottom of the 9th. You have Brett Gardner on first, Aaron Judge at bat, and you’re facing Craig Kimbrel on the mound.
What do you do? How does BI come into play here?
Let’s talk a little about what BI is. You have all these statistics available: Judge’s batting average, Kimbrel’s earned run average, Gardner’s stolen base percentage, and so on. In years BS (“before sabermetrics”), a manager likely would have “gone with his gut,” decided that Judge was his best bet to hit the game-winning home run, and let him swing away. But is this the best decision to make?
Let’s put this another way. You have a plethora of data available at your fingertips. BI represents the ability to analyze all this data and provide information that allows you to make a good decision.
If Aaron Boone (theoretically) had this data available at his fingertips (to my knowledge, Major League Baseball bans the use of electronic devices in the dugout during games), he could use the data to consider Kimbrel’s pitching tendencies, Judge’s career numbers against Kimbrel, and so on. BI enables Boone to make the best possible decision based upon the information he has at hand.
I do want to make one important distinction. In the above paragraphs, I used the words data and information. These two words are not interchangeable. Data refers to the raw numbers that are generated by the players. Information refers to the interpretation of that data. Therein lies the heart of what BI is — it is the process of generating information based upon data.
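To make the data-versus-information distinction a little more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the batter names, the at-bat outcomes, and the function name `batting_averages` are all my own assumptions, not real statistics or any particular BI tool. It also assumes that every record counts as an official at-bat (no walks, sacrifices, and so on). The list of raw at-bat records is the data; the batting average computed from it is the information.

```python
from collections import defaultdict

# Hypothetical raw data: one record per at-bat (batter, outcome).
# These names and outcomes are invented for illustration only.
AT_BATS = [
    ("Batter A", "single"),
    ("Batter A", "strikeout"),
    ("Batter A", "home_run"),
    ("Batter A", "groundout"),
    ("Batter B", "flyout"),
    ("Batter B", "double"),
    ("Batter B", "strikeout"),
    ("Batter B", "groundout"),
]

# Outcomes that count as a hit.
HIT_OUTCOMES = {"single", "double", "triple", "home_run"}


def batting_averages(at_bats):
    """Summarize raw at-bat records into a batting average per batter."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for batter, outcome in at_bats:
        totals[batter] += 1
        if outcome in HIT_OUTCOMES:
            hits[batter] += 1
    # Batting average = hits divided by at-bats.
    return {batter: hits[batter] / totals[batter] for batter in totals}


averages = batting_averages(AT_BATS)
for batter, avg in sorted(averages.items()):
    print(f"{batter}: {avg:.3f}")
```

The individual records say very little on their own. The summary is what a manager could actually act on, and that step from one to the other is (as I understand it so far) the core of what BI does, just at a much larger scale.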
What’s there to know about BI?
I’ve already mentioned some buzzwords, including OLAP, cubes, and pivot tables. That’s just scratching the surface. There are also KPIs, reporting services, decision support systems, data mining, data warehousing, and a number of others that I haven’t thought of at this point (if you have any suggestions, please feel free to add them in the comments section below). Other than including the Wikipedia definition links, I won’t delve too deeply into them now, especially since I’m still trying to learn about these myself.
So why bother learning about BI?
I have my reasons for learning more about BI. Among other things…
- It is a way to keep myself technically relevant. I’ve written before about how difficult it is to stay up-to-date with technology. (For further reading regarding this, I highly recommend Eugene Meidinger’s article about keeping up with technology, as well as his related SQL Saturday presentation.) I feel that BI is a subject I’m able to grasp, learn about, and contribute to. By learning about BI, I can continue making myself technically valuable, even as my other technical skills become increasingly obsolete. Speaking of which…
- It’s another adjustment. Again, I’ve written before about making adjustments to keep myself professionally relevant. If there’s one thing I’ve learned, it’s that if you want to survive professionally, you need to learn to adjust to your environment.
- It is a subject that interests me. I’m sure that many of you, as kids, had “imaginary friends.” (I’ll bet some adults have, too; just look at Lieutenant Kije and Captain Tuttle.) When I was a kid, I actually had an imaginary baseball team. I went as far as to create an entire roster full of fictitious ballplayers, even coming up with full batting and pitching statistics for them. My star player was a power-hitting second baseman who had won MVP awards in both the National and American leagues, winning several batting titles (including a Triple Crown) and leading my imaginary team to three World Series championships. I figured, if my interest in statistics went that far back, there must be something behind it. Granted, now that I’ve grown older, I’m not as passionate about baseball statistics as I was as a kid, but some level of interest still remains, nevertheless.
- It is a baseline for learning new things. I’ve seen an increasing number of SQL Saturday presentations related to BI, covering tools such as Power BI, reporting services, and R. I’m recognizing that these potentially have value for my workplace. But before I learn more about them, I also need to understand the fundamentals on which they are built. I feel that I need to learn the “language” of BI before I can learn about the tools that support it.
So, hopefully, this article makes a good introduction (for both you and me) to talking about BI. I’ll try to write more as I learn new things. We’ll see where this journey goes, and I hope you enjoy coming along for the ride.