#BI101: An introduction to BI using baseball

Edit: This is the first of a series of articles (I hope!) in which I’m trying to teach myself about BI.  Any articles I write that are related to this, starting with this one, will be preceded with “#BI101” in the title.

As I stated in a previous article, one topic about which I’m interested in learning more is business intelligence (BI).  For those of you who are new to BI, it is a broad topic.  In a nutshell, it can probably be described as “consuming and interpreting data so it can be used for business decisions and/or applications.”

I’ll admit that I don’t know a lot about BI (at least the fine details, anyway).  I did work a previous job where I touched upon it; I was tasked with performing some data analysis, and I was introduced to concepts such as OLAP cubes and pivot tables.  I’ve gotten better at creating pivot tables — I’ve done a few of them using MS Excel — but I’ll admit that I’m still not completely comfortable with building cubes.  I suppose that’ll come as I delve further into this.

A while back, my friend, Paresh Motiwala, suggested that I submit a presentation for Boston SQL Saturday BI edition.  At the time, I said to him, “the only thing I know about BI is how to spell it!”  He said to me (something like), “hey, you know how to spell SQL, don’t you?”  Looking back at the link, I might have been able to submit (I didn’t realize, at the time, that they were running a professional development track).  That said, Paresh did indeed have a point.  As I often tell people, I am not necessarily a SQL expert (I know enough SQL to be dangerous); nevertheless, that does not stop me from applying to speak at SQL Saturday.  Likewise, as I dive further into this topic, I’m finding that I probably know more about BI than I’ve led myself to believe.  Still, there is always room for improvement.

To tackle this endeavor, once again, I decided to jump into this using a subject that I enjoy immensely: baseball.  Baseball is my favorite sport, and it is a great source of data for stat-heads, mathematicians, and data geeks.  I’ve always been of the opinion that if I’m going to learn something new, I should make it fun!

Besides, the use of statistical analysis in baseball has exploded.  Baseball analytics has been a big deal ever since Bill James introduced sabermetrics (there is some debate as to whether James has enhanced or ruined baseball).  So what better way to introduce myself to BI concepts?

For starters, I came across some articles (listed below, for my own reference as much as anything else):

I also posted a related question in the SSC forums.  We’ll see what kind of responses (if any) I get to my query.

Let’s start with the basics — what is BI?

Since I’m using baseball to drive this concept, let’s use a baseball example to illustrate this.

Let’s say you’re (NY Yankees manager) Aaron Boone.  You’re down by a run with two outs in the bottom of the 9th.  You have Brett Gardner on first, Aaron Judge at bat, and you’re facing Craig Kimbrel on the mound.

What do you do?  How does BI come into play here?

Let’s talk a little about what BI is.  You have all these statistics available — Judge’s batting average, Kimbrel’s earned run average, Gardner’s stolen base percentage, and so on.  In years BS — “before sabermetrics” — a manager likely would have “gone with his gut,” decided that Judge is your best bet to hit the game-winning home run, and let him swing away.  But is this the best decision to make?

Let’s put this another way.  You have a plethora of data available at your fingertips.  BI represents the ability to analyze all this data and provide information that allows you to make a good decision.

If Aaron Boone (theoretically) had this data available at his fingertips (to my knowledge, Major League Baseball bans the use of electronic devices in the dugout during games), he could use the data to consider Kimbrel’s pitching tendencies, Judge’s career numbers against Kimbrel, and so on.  BI enables Boone to make the best possible decision based upon the information he has at hand.

I do want to make one important distinction.  In the above paragraphs, I used the words data and information.  These two words are not interchangeable.  Data refers to the raw numbers that are generated by the players.  Information refers to the interpretation of that data.  Therein lies the heart of what BI is — it is the process of generating information based upon data.
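To make the distinction a little more concrete, here is a minimal Python sketch.  The plate-appearance records are the data; the batting average computed from them is the information.  The player names, the records, and the function name are all made up for illustration (they are not real statistics), and the sketch assumes the usual definition of batting average as hits divided by at-bats, where walks, hit-by-pitches, and sacrifices do not count as at-bats.

```python
# Raw DATA: one record per plate appearance.
# These values are invented for illustration; they are not real statistics.
plate_appearances = [
    {"batter": "Hitter A", "result": "single"},
    {"batter": "Hitter A", "result": "strikeout"},
    {"batter": "Hitter A", "result": "walk"},
    {"batter": "Hitter A", "result": "home_run"},
    {"batter": "Hitter A", "result": "groundout"},
    {"batter": "Hitter B", "result": "strikeout"},
    {"batter": "Hitter B", "result": "double"},
    {"batter": "Hitter B", "result": "flyout"},
]

# Results that count as hits, and results that do not count as at-bats.
HITS = {"single", "double", "triple", "home_run"}
NOT_AT_BATS = {"walk", "hit_by_pitch", "sacrifice"}


def batting_average(records, batter):
    """Turn raw plate-appearance data into INFORMATION: hits / at-bats."""
    at_bats = 0
    hits = 0
    for record in records:
        if record["batter"] != batter:
            continue
        if record["result"] in NOT_AT_BATS:
            continue  # walks and the like are not at-bats
        at_bats += 1
        if record["result"] in HITS:
            hits += 1
    # A batter with no at-bats has no meaningful average; return 0.0 here.
    return hits / at_bats if at_bats else 0.0


print(f"{batting_average(plate_appearances, 'Hitter A'):.3f}")  # 0.500
```

The list of records by itself doesn’t tell a manager much.  The single number that comes out of the function is something he can actually act upon.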

What’s there to know about BI?

I’ve already mentioned some buzzwords, including OLAP, cubes, and pivot tables.  That’s just scratching the surface.  There are also KPIs, reporting services, decision support systems, data mining, data warehousing, and a number of others that I haven’t thought of at this point (if you have any suggestions, please feel free to add them in the comments section below).  Other than including the Wikipedia definition links, I won’t delve too deeply into them now, especially when I’m trying to learn about these myself.
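Just to give a flavor of one of those buzzwords: as I understand it, a pivot table is a summary of raw rows, grouped by one field down the side and another across the top.  The sketch below is a rough approximation in plain Python, not how Excel or any BI tool actually implements it.  The rows, the player names, and the pivot_sum function are all hypothetical.

```python
from collections import defaultdict

# Made-up rows of (player, season, home_runs); not real statistics.
rows = [
    ("Hitter A", 2017, 30),
    ("Hitter A", 2018, 25),
    ("Hitter B", 2017, 12),
    ("Hitter B", 2018, 18),
    ("Hitter B", 2018, 2),
]


def pivot_sum(records):
    """Group rows by player (row label) and season (column label), summing values."""
    table = defaultdict(lambda: defaultdict(int))
    for player, season, value in records:
        table[player][season] += value
    # Convert the nested defaultdicts to plain dicts for easier reading.
    return {player: dict(seasons) for player, seasons in table.items()}


pivot = pivot_sum(rows)
for player, seasons in pivot.items():
    print(player, seasons)
```

Each player becomes a row, each season becomes a column, and each cell holds the total for that combination.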

So why bother learning about BI?

I have my reasons for learning more about BI.  Among other things…

  • It is a way to keep myself technically relevant.  I’ve written before about how difficult it is to stay up-to-date with technology.  (For further reading regarding this, I highly recommend Eugene Meidinger’s article about keeping up with technology; he also has a related SQL Saturday presentation that I also highly recommend.)  I feel that BI is a subject I’m able to grasp, learn about, and contribute to.  By learning about BI, I can continue making myself technically valuable, even as my other technical skills become increasingly obsolete.  Speaking of which…
  • It’s another adjustment.  Again, I’ve written before about making adjustments to keep myself professionally relevant.  If there’s one thing I’ve learned, it’s that if you want to survive professionally, you need to learn to adjust to your environment.
  • It is a subject that interests me.  I’m sure that many of you, as kids, had “imaginary friends.”  (I’ll bet some adults have, too; just look at Lieutenant Kije and Captain Tuttle.)  When I was a kid, I actually had an imaginary baseball team.  I went as far as to create an entire roster full of fictitious ballplayers, even coming up with full batting and pitching statistics for them.  My star player was a power-hitting second baseman who had won MVP awards in both the National and American leagues, winning several batting titles (including a Triple Crown) and leading my imaginary team to three World Series championships.  I figured, if my interest in statistics went that far back, there must be something behind it.  Granted, now that I’ve grown older, I’m not as passionate about baseball statistics as I was as a kid, but some level of interest still remains.
  • It is a baseline for learning new things.  I’ve seen an increasing number of SQL Saturday presentations related to BI, such as Power BI, reporting services, and R.  I’m recognizing that these potentially have value for my workplace.  But before I learn more about them, I also need to understand the fundamental baseline that they support.  I feel that I need to learn the “language” of BI before I can learn about the tools that support it.

So, hopefully, this article makes a good introduction (for both you and myself) for talking about BI.  I’ll try to write more as I learn new things.  We’ll see where this journey goes, and I hope you enjoy coming along for the ride.


Want to learn about a topic? Try writing about it

Every now and then, I’ll peruse the forums on SSC.  In addition to people answering questions about SQL Server, there also tends to be some banter, which is probably not unusual in many forums of this nature.  One of the comments I’ve seen time and again is something like, “I learn more about subjects by answering people’s comments on these forums.”

There is more truth to this statement than people realize.  In my experience in writing about technology, I often find that I end up learning about the technology about which I’m writing — sometimes to the point of becoming a subject matter expert.

Several years ago, I taught part-time at a local business school (roughly equivalent to community college level).  I taught primarily general mathematics and a few computer classes.  I was once asked to fill in for another instructor who taught statistics.  My problem: I didn’t know much about statistics.  So, I read up on it (along with the syllabus that I would be teaching that day).  I wanted to at least be able to sound like I knew what I was talking about.  As it turned out, by teaching that class, I actually learned something about it.  The students were not the only ones who got an education that day.

The other day, I began writing a draft article regarding a subject about which I’d like to learn more: BI (edit: the now-finished article can be seen here).  I’ve dabbled in BI a little bit; I worked a previous job where I was asked to perform some data analysis (which is how I learned about cubes and pivot tables), and I took a course in decision-support systems in grad school.  I’m seeing more SQL Saturday presentations about BI; indeed, there are even SQL Saturday conferences dedicated to BI topics (usually indicated by the words “BI Edition”).  It is a topic about which I do have some interest, and it’s something about which I’d like to learn more.

My friend, Paresh Motiwala, who was one of the organizers for SQL Saturday Boston BI Edition a while back, encouraged me to apply to speak at the event.  I said to him at the time, “the only thing I know about BI is how to spell it!”  (His response: “Hey, you know how to spell SQL!”)  In hindsight, I probably should have applied; it turns out that even BI Edition conferences accept professional development topics, under which nearly all of my presentations (so far) are categorized.

So if I claim to know so little about BI, why did I decide to start writing about it?  Well, I’m trying to learn about it, and I’d like to pass along what I learn.  But, I want to place a greater emphasis on the first part of that statement: I’m trying to learn about it.  Writing about it makes me learn something about it a little more in-depth.  And by doing so, I discover that I have a better grasp of the topic.

Hopefully, relatively soon, you’ll see an article from me about BI.  Hopefully, I’ll have learned enough writing about it that you’ll be able to learn something from me.  And hopefully, I’ll have demonstrated that I’m learning something new, and improving myself in the process.

If you want to learn something new, try teaching it or writing about it.  You’ll be surprised how much you, yourself, learn in the process.

Unite the world

“Hey you, don’t tell me there’s no hope at all; together we stand; divided, we fall…”
— Pink Floyd, Hey You

“An eye for an eye only makes the world blind.”
— Gandhi

“You may say I’m a dreamer, but I’m not the only one…”
— John Lennon, Imagine

“I have a dream…”
— Martin Luther King Jr.

Just for this one article, I am breaking my silence on all things political.

As is much of the country, I am outraged by what has been happening at America’s southern border.  I have my opinions regarding the current administration, and what is happening to our country and around the world.

However, that is not the point of this article.  I am not going to write about my politics, my opinions, or my outrage.  Today, I want to write about something else.

It occurred to me this morning that, more than ever, we are being divided.  We are identified by our divisions: Democrat, Republican, liberal, conservative, and so on.  And that is the problem.

Studies have been performed showing that individuals identify closely with the groups to which they belong.  In these cases, people will defend their groups, no matter what the groups are doing, and regardless of whether the groups’ actions are perceived as being good or bad, right or wrong.

I am not a psychologist, so I won’t pretend that I know anything about these studies (disclosure: I did do research on groupthink when I was in grad school).  Nevertheless, what they seem to reveal is that we identify strongly with the groups to which we belong.  And we will defend our groups, no matter how right or wrong the groups’ actions are.

I do understand the effects of group dynamics.  I say this because I am a sports fan, and few things test our group loyalties more than sports.  I root for the Yankees, Syracuse, and RPI.  As a result, I stand firmly behind my teams, and I tend to hold some contempt for the Red Sox, Mets, Georgetown, Boston College, Union, and Clarkson.  Many of my friends are Red Sox fans (heck, I’m married to one!), Mets fans, and Union College and Clarkson University alumni.  Yes, it is true that we will occasionally trash-talk each other when our teams face off against one another, but at the end of the day, they are just games and entertainment.  I will still sit down with them over a drink and pleasant conversation.

Likewise, I have many friends who are on both sides of the (major party) political aisle.  I have friends of many races, religions (or even atheists), cultures, and creeds.  However, no matter where they stand on their viewpoints, I respect each and every one of them.  And there, I believe, is the difference.  No matter where we stand, we need to listen to and respect the other side.  One of the issues regarding group identification is that we do not listen to the other side.  We completely lose respect and empathy for anyone who is our “opponent.”  That is where communication breaks down, and that is where divisions occur.

What we need is something that unites us.  We are not Democrats, Republicans, Christians, Jews, Muslims, Americans, Canadians, Europeans, Africans, Asians, white, black, yellow, or brown.

What we are is human.

Nelson Mandela united a divided South Africa behind rugby, a story depicted in the movie Invictus.  What will be our uniting moment?  For those of us in North America, I was thinking about something like the 2026 World Cup, but that is a long way off.

I don’t know what that something is, but we need to find it, and fast.  We are being torn apart by our divisions, and it could potentially kill us.  If you don’t believe me, take a look at our past history regarding wars and conflicts.  The American Civil War comes to mind.

I don’t know how much of a difference writing this article will make.  I am just one voice in the wilderness.  But if writing this contributes to changing the world for the better, then I will have accomplished something.

We now return you to your period of political silence.

SQL Saturday #741, Albany, NY, July 28 — the schedule is out

The schedule for SQL Saturday #741 in Albany is out!  (My presentation is scheduled for the first session of the morning.  Ugh!)

I will be doing a brand-new presentation (so new, in fact, that as of this article, my presentation slides are not yet finished!).

My new presentation is titled: “Networking: it isn’t just for breakfast anymore.”  It is based on my ‘blog article of the same name.  We will discuss networking, what it is, and why it’s important.  We’ll discuss where and how to network, and ways you can break the ice.  We’ll even have an opportunity to network within the confines of our room.  (I suppose an alternate presentation title could be, “Networking for beginners.”)

If you’re looking for networking opportunities or looking for ways to improve upon your networking skills, come check out my session!  Click this link to register for SQL Saturday #741, and join us in Albany, NY on Saturday, July 28!

See you there!

Better Comments

This is a reblog of a post by my friend, Steve Jones. I’ve often said that commenting code is a form of documentation, and needs to be done more.

Voice of the DBA

I assume most of you comment your code.

Well, you probably comment code most of the time.

I’d bet your comments have quite a bit of detail.

And you do this completely inconsistently.

That’s what I’d think, or maybe just what I want. Even the best developers I know will not consistently comment code. You can drift through any project on GitHub and see this. Those projects on GitHub might even be better documented because people know they are public. In most corporate environments I have worked in, I’ll find that when people get busy, or distracted, or even when they’re experimenting to find a solution, they don’t write detailed comments. Usually only when someone fixes a bug, with a solution found quickly, do I get a really useful comment.

There are all sorts of ways that people think about commenting their code. I ran across a post from…


Always ask someone to test your product

This morning, one of my colleagues posted this message to our Slack channel:

please ask someone else to test your code before pushing it

It brought to mind an important thought (and more ‘blog article fodder): any time you produce something, regardless of what it is — a software application, documentation, a presentation, a music composition, a dish you cooked, etc. — always ask someone else to test it out before you send it out for public consumption.

That testing could take several different forms — it could be an end user trying your application, somebody reading your document, listening to your presentation or your music, trying your dish, and so on.  Testing results in feedback, which results in improvements to your product.

Whenever we produce something, we have our own vision (and our own biases) as to how the product should come out.  We expect our products to reflect our visions perfectly, and we expect (and demand) that consumers adhere to those visions and to how we expect the products to be viewed or interpreted.

Unfortunately, we are blinded by our biases.  The world does not share our visions.  People who use our products will never, ever, perfectly interpret how our products should be consumed.  More often than not, we’ll find that what we produce will be used or interpreted in ways that never occurred to us.

Even in my own workplace, I write and edit a lot of online documentation.  Much of what I write comes from other sources, often about topics about which I know little (or, sometimes, nothing).  I try to write material based on the information I have at hand.  Very often, I come across gaps that need to be filled.  I’ll do my best to ask original authors what was intended, or to dig for information to fill those gaps.  But in the absence of those resources, I end up making assumptions and using my own intuition to fill in the blanks.  Those assumptions might not necessarily be correct, and what I write could end up being different from what was originally intended.  It is for this reason that I am constantly asking my colleagues, “take a look at what I wrote.  I want to make sure what I wrote is accurate.”

In a manner of speaking, creating products is a form of communication: what we produce results from an idea in our heads, and the end users (the consumers) are the ones “listening” to the communication, which in this case is the end product.  If you are familiar with the basic communication model, a sender creates a message, a receiver interprets the message, and the receiver reacts to the message in the form of feedback.  Producing a product works in exactly the same way: a producer creates a product, a consumer uses the product, and the consumer reacts to the product, generating feedback.  In between the sender and the receiver is “noise” that degrades the message or the product (it is not literally noise; the “noise” can simply be the fact that the sender’s and receiver’s interpretations of the message are not one and the same).

So, any time you create some kind of product, always ask someone else to try it out.  You’ll find that the person’s feedback will result in tweaks to your product.  And you will end up with a better product.

Humble beginnings

Once again, the Facebook “On This Day” memory feature shows it can be a curious thing.  And again, this is one I wanted to share.

The picture you see above showed up on my Facebook memories feed this morning.  Three years ago today, I gave a presentation at my local SQL Server user group meeting.  I had come up with a presentation idea that I thought would be of interest to my user group, as well as other technical professionals.  I jotted down some notes, put them into a presentation, and presented it at my local user group.

About a month later, I gave this very same presentation at our local SQL Saturday.  It was my first SQL Saturday presentation!

I was curious as to how other events would take to my presentation.  Later that year, I submitted it to, and was accepted at, another SQL Saturday.  It was my second time speaking at SQL Saturday, my first time speaking at an event in “foreign territory,” and my first SQL Saturday — speaking or attending — outside of New York State.

Since that humble beginning, I’ve spoken at 13 (soon to be 14) SQL Saturdays in seven different cities around the northeastern United States.  Thanks to this endeavor, I’ve traveled around the region, met a lot of great people, expanded my professional profile, started a ‘blog (that you’re reading right now!), enhanced my career, gained more confidence, improved my presentation skills, and become a better person.  This all came about because of these conferences and from this simple start three years ago.

I hope I’ll be doing many more!  Happy three-year anniversary to me!