### Skills to Demonstrate

- Concepts of central limit theorem
- Ability to use the **RANDBETWEEN** and **INDEX** functions in Excel

### Problem Statement

You are a Data Analyst with GlobalMart, an e-commerce giant in the business of selling Technology, Office Supplies and Furniture products.

As part of your job, you are required to carry out a number of statistics-related tasks on the data you have. This involves both descriptive and inferential statistics.

Given the vast size of the data your organization generates, it is not practical to read all the sales data to compute any statistic. This means the parameters of the distribution of your sales data can't be calculated exactly. You can only arrive at a close estimate of the population mean.

What would you do?

Recall the Central Limit Theorem. What does it say?

Now, look at the distribution of the existing sales data you have:

Looks like it's highly positively skewed!

Now go ahead and use Excel to check whether what the Central Limit Theorem postulates actually holds!

Download the starter workbook here

Upload your work here

### My Solution

This article helped me a lot in thinking about how to implement the Central Limit Theorem in Excel.

But hold on! Did you make an attempt to solve the case? I'd strongly suggest making a submission above before looking at my solution.

Ready to proceed? Just click on the triangle icon below:

**Watch Solution**
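If you'd like to cross-check the Excel result outside the workbook, here is a minimal Python sketch of the same idea. It is not the workbook solution: the `population` list is a hypothetical, synthetic stand-in for the positively skewed sales column, and the helper names `sample_means` and `skewness` are made up for illustration. The random index plays the role of **RANDBETWEEN** and the list lookup plays the role of **INDEX**.

```python
import random
import statistics


def sample_means(population, sample_size, num_samples, rng):
    """Draw num_samples random samples (with replacement) and return their means."""
    n = len(population)
    means = []
    for _ in range(num_samples):
        # rng.randint(0, n - 1) picks a random row, like RANDBETWEEN;
        # population[i] fetches the value in that row, like INDEX.
        sample = [population[rng.randint(0, n - 1)] for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


def skewness(values):
    """Third standardized moment; roughly 0 for a symmetric distribution."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)


rng = random.Random(42)

# Assumption: synthetic, positively skewed values standing in for sales data.
population = [rng.expovariate(1 / 200) for _ in range(10_000)]

means = sample_means(population, sample_size=50, num_samples=2_000, rng=rng)

print("population mean:         ", round(statistics.fmean(population), 2))
print("mean of sample means:    ", round(statistics.fmean(means), 2))
print("population skewness:     ", round(skewness(population), 2))
print("skewness of sample means:", round(skewness(means), 2))
```

If the theorem holds for this data, the mean of the sample means should land close to the population mean, and the skewness of the sample means should be much closer to zero than the skewness of the raw values.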

### Points to Ponder

Before I close this case, I'd like to leave you with some questions for which I am still looking for convincing answers. I'd be happy to connect over our community and share ideas.

- What should we do to perform statistical analysis if the data doesn't follow the CLT?

Here's an important caveat to keep in mind to keep errors in check while using the CLT:

Source: http://www.statisticalengineering.com/central_limit_theorem_fineprint.htm

- How do we determine the sample size for a given dataset?

There's no consensus on the right or wrong sample size. Thinking from first principles, the variance of the sample mean shrinks as the sample size grows (it equals the population variance divided by the sample size), so larger samples give tighter estimates but cost more to collect and process. On the other hand, very small samples may not represent the population well enough.

Here's a good reference on ways to choose your sample size:
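One way to explore how sample size affects the spread of the sample mean is to compare the theoretical standard error, sigma / sqrt(n), with a simulated one. The sketch below is illustrative only: the `population` is synthetic (an assumption, not the GlobalMart data), and the function names and the sample sizes 5, 30 and 100 are arbitrary choices.

```python
import math
import random
import statistics


def standard_error(population_sd, sample_size):
    """Theoretical standard deviation of the sample mean: sigma / sqrt(n)."""
    return population_sd / math.sqrt(sample_size)


def empirical_standard_error(population, sample_size, num_samples, rng):
    """Standard deviation of the means of many samples drawn with replacement."""
    means = [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(num_samples)
    ]
    return statistics.pstdev(means)


rng = random.Random(7)

# Assumption: synthetic, positively skewed values standing in for sales data.
population = [rng.expovariate(1 / 200) for _ in range(10_000)]
population_sd = statistics.pstdev(population)

results = {}
for n in (5, 30, 100):
    theory = standard_error(population_sd, n)
    observed = empirical_standard_error(population, n, 2_000, rng)
    results[n] = (theory, observed)
    print(f"n={n:>3}  theoretical SE={theory:7.2f}  simulated SE={observed:7.2f}")
```

The simulated values should track the theoretical ones and fall as n grows, which is one way to reason about how large a sample you need for a given precision.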

https://www.statisticshowto.com/probability-and-statistics/find-sample-size/
