Before diving into statistics, it's important to know the difference between descriptive statistics and statistical inference. Descriptive statistics is concerned with the procedures for collecting, summarizing, and presenting quantitative data so that we can describe the relationships among the variables of a problem.

Statistical inference procedures are used to make decisions on the basis of computed measures taken from samples. The connection between descriptive and inferential statistics is simple: the computed measures used in the inference process are themselves descriptive measures (e.g. the average or the variance). Easy so far, isn't it?
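To make the two descriptive measures mentioned above concrete, here is a minimal sketch that computes the average and the variance of a sample using Python's standard `statistics` module. The sample values and the `describe` helper are made up for illustration; they are not real metro data.

```python
import statistics

# Hypothetical sample: passengers counted in ten one-minute intervals.
# These numbers are invented purely for illustration.
sample = [12, 15, 9, 14, 11, 13, 10, 16, 12, 8]


def describe(values):
    """Return the sample mean and the sample variance (n - 1 denominator)."""
    return statistics.mean(values), statistics.variance(values)


mean, variance = describe(sample)
print(f"mean = {mean}, variance = {variance:.2f}")
```

An inference procedure would take measures like these, computed from a sample, and use them to decide something about the whole population of arrivals.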

Let's lay out another point that shows the difference between the two methods. It's time to talk about the workflow of the statistical data analysis process; in simpler words, how do statisticians and data analysts tackle a given problem? Assume that we are data analysts working for the Cairo metro, and we have been asked to check whether the number of trains scheduled to run every hour is sufficient to meet the expected passenger demand with the highest possible level of passenger comfort. We will lay out this problem step by step and use our skills in statistical data analysis to do the task. But how shall we start, and what steps shall we go through?

The answer is depicted in the following diagram:

Now let's proceed with the above-mentioned method to analyze the data. The target of the current phase is just to understand some basic characteristics of arrival behavior:

Gathering Data:

For our case study, we gather passenger arrival data at every station on a metro line, thanks to the smart ticketing systems and metro gates deployed in most metro networks, which make such data available. Assume the data is stored in database tables, where each table holds one day of passenger arrival data, and arrivals are summed per minute rather than recorded as individual passenger arrival events. The table columns are the line's stations, and there is a row for every minute.
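The table layout described above can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions, not the actual database schema: the station names, the `build_daily_table` helper, and the event format (a minute of the day paired with a station) are all hypothetical.

```python
from collections import Counter

# Hypothetical station names; a real line would list its own stations.
STATIONS = ["Station A", "Station B", "Station C"]
MINUTES_PER_DAY = 24 * 60


def build_daily_table(events):
    """Aggregate individual arrival events into per-minute counts.

    `events` is an iterable of (minute_of_day, station) pairs, one pair
    per passenger. The result is a list with one row per minute of the
    day; each row maps a station name to the arrivals in that minute.
    """
    counts = Counter(events)
    return [
        {station: counts[(minute, station)] for station in STATIONS}
        for minute in range(MINUTES_PER_DAY)
    ]


# Four invented arrival events around 07:00 (minute 420 of the day).
events = [
    (420, "Station A"),
    (420, "Station A"),
    (420, "Station B"),
    (421, "Station C"),
]
table = build_daily_table(events)
print(table[420])
```

Each row corresponds to one minute and each key to one station, so a full day always has 1,440 rows, with zeros for minutes in which nobody arrived.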
