(Intermediate Statistics) 01. It Is Not About Statistics Anymore

12 Mar 2022 Math-Post Data Statistics

Intermediate Stats

Introduction

I have been meaning to build this series for a long time: a more advanced set of articles to help you learn what lies beyond the basics. Because you’re reading it, I expect that you’re already familiar with the basics of statistics and ready to take it to the next level. That next level involves using what you know, picking up a few more tools and techniques at the intermediate level, and finally putting it all to use to answer more realistic questions with real data.

Statistically speaking, you should be ready to enter the world of data analysis. This world is an exciting one, with many options to explore and many tools available. But, as you may have guessed, you have to navigate it very carefully, choosing the right method for each situation and watching out for the pitfalls.

In this series (and be sure it is a long one indeed 🤯), I include the underlying theories and ideas behind the methods. I am sure this will help you make good decisions, rather than just falling into the point-and-click mode that today’s software packages offer.

In this part of the series (actually the next three 😅), you review the terms involved in statistics as they relate to data analysis at the intermediate level. You get a glimpse of the impact that your results can have by seeing what these analysis techniques can do. You also gain insight into some of the common misuses of data analysis and their effects.

Now, let’s get started 😋😋😋.

It Is Not About Statistics Anymore

It used to be that statisticians were the only ones who really analyzed data. The only computer programs available then were very complicated to use, requiring a great deal of knowledge about statistics to set up and carry out. The calculations were tedious and at times unpredictable, and they required a thorough understanding of the theories and methods behind them to get correct and reliable answers.

Today, anyone who wants to analyze data can do it easily. Many user-friendly statistical software packages are made expressly for that purpose:

  • Microsoft Excel
  • Minitab
  • SAS
  • SPSS

And those are just a few. Free online programs are even available, such as Stat Crunch, which helps you do just what its name says: crunch your numbers and get an answer.

As you see in this section, the modern easy-to-use statistical packages are good in some ways, and not-so-good in other ways. The most important idea when applying statistical techniques to analyze data is to know what’s going on behind the number crunching, so you (not the computer) are in control of the analysis. That’s why knowledge of intermediate statistics is so critical.

Remembering the old days

In the old days (and some of us still work this way today), to determine whether different methods gave different results, you had to write a computer program to do it, using code that you had to take a class to learn.

You had to type in your data in the specific way that the computer program demanded. After that, you had to submit your program to a mainframe computer and wait for the printer to print out your results. For many of us this was the day-to-day process, and it was time-consuming and a general all-around pain.

Statistical software packages have undergone an incredible evolution in the last 10 to 15 years, to the point where you can now enter your data quickly and easily in almost any format.

Moreover, the choices for data analysis are well organized and listed in pull-down menus. Now almost anyone can quickly find the necessary procedure and tell the computer what to do. The results come back instantly, and you can cut and paste them into a word-processing document without blinking an eye. For example, comparing the weight loss for people on different weight-loss programs now takes fewer than three clicks of the mouse, which is great news for folks like me.
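To make the weight-loss example concrete, here is a minimal sketch of what those few clicks might be computing behind the scenes, assuming the comparison is a one-way ANOVA. The numbers are made up for illustration, and the function name `one_way_anova_f` is hypothetical; it is not taken from any of the packages mentioned here.

```python
from statistics import mean

# Hypothetical weight loss (kg) for three made-up programs; not real data.
groups = {
    "program_a": [2.0, 3.0, 4.0],
    "program_b": [4.0, 5.0, 6.0],
    "program_c": [6.0, 7.0, 8.0],
}


def one_way_anova_f(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for s in samples for x in s]
    grand_mean = mean(all_values)
    k = len(samples)      # number of groups
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova_f(list(groups.values()))
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
# prints: F = 12.00 with (2, 6) degrees of freedom
```

The F statistic is the ratio of the variation between programs to the variation within them. A library routine such as SciPy's `scipy.stats.f_oneway` reports the same statistic together with a p-value, but the number only means something if the conditions behind the test hold for your data.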

Many very useful and efficient statistical software packages exist, including:

  • SAS
  • SPSS
  • Data Desk
  • Stat Crunch
  • MS Excel
  • Minitab

Each one has its own advantages and disadvantages (and its own fans and critics). My tool of choice, and the one I reference throughout the whole series, is either R or Python (programming languages, not software packages 🤣). Why? Simply because they are very easy to use, the results are correct, the output is clear and professional looking, and they come loaded with all the data-analysis techniques that are used in intermediate statistics as well as in this series.

The downside of statistical software

You may be wondering where the downside is in all of this. Is it too good to be true that what was once a tedious, complicated process for analyzing data has now become as easy as checking your e-mail on your cell phone? Yes and no.

Yes, it’s too good to be true that the software practically does everything for you, if you don’t pay attention to what the programs are really doing. Yes, it’s too good to be true if you don’t understand that conditions need to be checked in every situation before an analysis is applied. Yes, it’s too good to be true if you take all the results as complete and utter gospel (as too many statistician wannabes do).

Bottom line: Today’s software packages are too good to be true if you don’t have a clear and thorough understanding of the intermediate-level statistics that lies underneath them. Here’s the good news, though. By following along throughout the series, you gain the understanding you need to set yourself up for success.

You get enough of the underlying intermediate statistical concepts to be empowered, but not dangerous. You find out what conditions need to be checked on the data before applying an analysis and how to check them. You get a good feel for which analyses to use to answer your question (and which ones can cause you trouble), and you become aware of the kinds of results you can expect. Most importantly, you discover what’s possible and appropriate to conclude from your analysis and what limitations and caveats you need to state.
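As a small taste of what checking a condition can look like, here is a minimal sketch of one common rule of thumb: before comparing group means, check that the largest sample standard deviation is less than about twice the smallest. The data, the function name `sd_ratio`, and the cutoff of 2 are all assumptions made for illustration, not a prescription from this series.

```python
from statistics import stdev

# Hypothetical samples, made up for illustration only.
samples = {
    "program_a": [2.0, 3.0, 4.0],
    "program_b": [4.0, 5.0, 6.0],
    "program_c": [1.0, 7.0, 13.0],
}


def sd_ratio(groups):
    """Ratio of the largest to the smallest sample standard deviation."""
    sds = [stdev(values) for values in groups.values()]
    return max(sds) / min(sds)


ratio = sd_ratio(samples)
# Rule-of-thumb threshold (an assumption): a ratio under 2 suggests similar spreads.
spreads_look_similar = ratio < 2

print(f"largest SD / smallest SD = {ratio:.1f}")
print("spreads look similar" if spreads_look_similar else "spreads differ: be careful")
```

Here the third group is far more spread out than the other two, so the check fails. A point-and-click package would still happily produce an answer, which is exactly why the check is worth doing yourself.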

What Now 🤔

This part of the series ends here, but the series continues. Stay tuned for the next part, if it is not uploaded yet 😅. And if you have any feedback, you can reach me by email or on Twitter.

See You Soon, Hisham.
