# Topic A1/A2 — Introducing Statistics + Tables & Graphs

## Topic A1 — Introducing Statistics

### Difference between descriptive and inferential statistics

Descriptive statistics, well… describe a population or a sample. They tell you about the features of the data. Examples of descriptive statistics?

• Climate change
• [Econ example] Tax data

Inferential stats start from a sample and try to generalize to the population. They also try to draw relationships between variables, test hypotheses and even make predictions.

Bottom line: to generalize from a sample to the population, the sample needs to be representative.

### Data types

A good way to think about data types is to think in terms of two dimensions: time and individuals/subjects.

It’s a very simple model: time and subject can each take the value ‘1’ or ‘many’. So at the start, you have 1 time point and 1 individual. But you can’t say much with that, can you? That’s just anecdotal evidence. So let’s see what we get when we expand these dimensions.

### Cross-sectional data

A bunch of individuals, at one period in time

• [Class ex] Mortality rate
• Guinness consumption in this class in September

Like a snapshot: you observe 100 people at time T.

The key is that it only looks at one specific time point (or a period, summarized as a point).

### Time series data

One individual over time. It’s like a tracker measuring a specific variable over time. Like the step counter on your phone or smartwatch, or your personal daily consumption of Guinness over the month.

### Panel data

It’s a tracker on a bunch of people. So it would be getting all of you guys’ step counters and appending them together to form one dataset: a panel data set. In terms of Guinness consumption, it’s your and your classmates’ daily consumption over the month, for example.

[Show examples of how it appears in a table, stat software]
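
One possible way to picture the three layouts as tables, sketched in plain Python rather than any particular stat software. The names and every number below are made up for illustration; this is not class data.

```python
# Hypothetical pints-per-day records: (person, day, pints). All values are made up.
panel = [
    ("Aoife", 1, 2), ("Aoife", 2, 0), ("Aoife", 3, 1),
    ("Brian", 1, 1), ("Brian", 2, 3), ("Brian", 3, 0),
    ("Ciara", 1, 0), ("Ciara", 2, 1), ("Ciara", 3, 2),
]

# Cross-section: many individuals, one time point (here, day 1).
cross_section = [(person, pints) for person, day, pints in panel if day == 1]

# Time series: one individual, many time points (here, Aoife).
time_series = [(day, pints) for person, day, pints in panel if person == "Aoife"]

# Panel: many individuals, many time points, one row per (person, day).
print("person   day  pints")
for person, day, pints in panel:
    print(f"{person:<7} {day:>4} {pints:>6}")

print("cross-section (day 1):", cross_section)
print("time series (Aoife):  ", time_series)
```

The panel is the "long" table: filter on one day and you get a cross-section, filter on one person and you get a time series.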

### Quantitative vs qualitative variables

Quantitative is basically stuff you can measure with numbers. Any examples?

• CO2 concentration in the atmosphere (continuous)
• Liters of Guinness (continuous)
• Pints of Guinness, grouped in classes (0 to 2; more than 2 to 4; etc.)
• Number of brothers and sisters (discrete, can’t slice it in smaller pieces)
• Elevators in a building (discrete)

Qualitative is describing a certain state or feature. Any examples?

• Computer is working / not working (state, nominal)
• Member of political party (nominal)
• Able to speak Irish (nominal)
• “Agree with this statement… (1) mostly not (2) not really (3) indifferent (4) a bit (5) mostly yes” (ordinal)

A two-state qualitative variable is usually coded as what is called a dummy variable: 1 if the feature is present, 0 if not.
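
A minimal sketch of dummy coding, assuming a yes/no question like "Able to speak Irish". The answers are made up and the variable names are only illustrative.

```python
# Hypothetical answers to "Able to speak Irish?" (made-up values).
answers = ["yes", "no", "no", "yes", "yes"]

# Dummy variable: 1 if the feature is present, 0 otherwise.
speaks_irish = [1 if answer == "yes" else 0 for answer in answers]

# Handy side effect: the mean of a dummy is the share of "yes" in the data.
share = sum(speaks_irish) / len(speaks_irish)

print(speaks_irish)
print(share)
```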

### Interval vs Ratio

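Interval: the difference (interval) between two values is meaningful, but zero is only a convention.

Ratio: there is a clear definition of zero of “it”. There is none (no quantity) when the variable equals 0, so ratios between values are meaningful too.

Temperature: °C and °F are interval scales, kelvin is a ratio scale.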
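
A quick check of the temperature example, with two illustrative temperatures (10 °C and 20 °C). The difference is the same on both scales, but the ratio only makes sense in kelvin.

```python
def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvin."""
    return c + 273.15

cold_c, warm_c = 10.0, 20.0  # illustrative values

# Intervals agree on both scales: the two temperatures are 10 degrees apart.
diff_c = warm_c - cold_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)

# Ratios do not agree: 20 °C is not "twice as hot" as 10 °C.
ratio_c = warm_c / cold_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(diff_c, round(diff_k, 2))
print(ratio_c, round(ratio_k, 3))
```

The Celsius ratio comes out as 2, while the kelvin ratio is only slightly above 1, which is why "twice as hot" is not a valid statement on an interval scale.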

## Topic A2 — Tables & Graphs

In this topic we divide quantitative data into classes. The class limits can be arbitrary, or based on some existing classification.
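
A small sketch of sorting values into classes, assuming classes of the form "0 to 2; more than 2 to 4; etc." (upper limit included). The data, the class limits and the helper `class_of` are all made up for illustration.

```python
# Made-up pints-per-week values.
pints = [0, 1.5, 2, 2.5, 4, 5, 7]

# Classes: 0 to 2, more than 2 to 4, more than 4 to 6, more than 6 to 8.
upper_limits = [2, 4, 6, 8]
labels = ["0 to 2", ">2 to 4", ">4 to 6", ">6 to 8"]

def class_of(value):
    """Return the index of the first class whose upper limit is >= value."""
    for i, upper in enumerate(upper_limits):
        if value <= upper:
            return i
    raise ValueError(f"{value} is above the last class")

classified = [labels[class_of(p)] for p in pints]

for p, label in zip(pints, classified):
    print(p, "->", label)
```

Whether a boundary value like 2 goes in the lower or the upper class is a convention; what matters is that every value lands in exactly one class.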

### Absolute frequency

It’s the absolute number of times this category appears in your data.

### Relative frequency

It’s the absolute frequency divided by the total number of observations.

### Cumulative frequency

When categories are ordered, you can “stack” the absolute frequencies: each class gets its own count plus the counts of all the classes before it (the last one equals the total number of observations).

### Cumulative relative frequency

Same as above but with relative frequencies (total 100%).
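
The four frequencies can be put side by side in one table. Here is a minimal sketch on 10 made-up observations that are already sorted into three ordered classes; the class labels are illustrative.

```python
from collections import Counter
from itertools import accumulate

# Made-up data: 10 observations in three ordered classes.
classes = ["0 to 2", ">2 to 4", ">4 to 6"]
data = ["0 to 2"] * 5 + [">2 to 4"] * 3 + [">4 to 6"] * 2

counts = Counter(data)
n = len(data)

absolute = [counts[c] for c in classes]              # how many times each class appears
relative = [f / n for f in absolute]                 # share of all observations
cumulative = list(accumulate(absolute))              # running total of absolute frequencies
cumulative_relative = [cf / n for cf in cumulative]  # running total as a share, ends at 1

print(f"{'class':<8} {'abs':>4} {'rel':>5} {'cum':>4} {'cum rel':>8}")
for c, f, r, cf, cr in zip(classes, absolute, relative, cumulative, cumulative_relative):
    print(f"{c:<8} {f:>4} {r:>5.2f} {cf:>4} {cr:>8.2f}")
```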

### Histogram

Visual representation of a frequency distribution. Two main characteristics:

• Height: frequency
• Width: class width
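
A rough text-only sketch of those two characteristics, assuming equal class widths and a made-up frequency table. The helper `histogram_bars` is hypothetical; a real plot would come from your stat software.

```python
# Made-up frequency table: (lower limit, upper limit, frequency).
table = [(0, 2, 5), (2, 4, 3), (4, 6, 2)]

def histogram_bars(rows):
    """Return (label, width, height) per class: width = class width, height = frequency."""
    return [(f"{lower}-{upper}", upper - lower, freq) for lower, upper, freq in rows]

bars = histogram_bars(table)

# Crude rendering: one row per class, bar length = frequency.
for label, width, height in bars:
    print(f"{label:<4} (width {width}) | " + "#" * height)
```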

### Polygon

Same, but linking the midpoints of the tops of the bars together with a line.
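
The points to connect are simply (class midpoint, frequency). A minimal sketch with a made-up frequency table:

```python
# Made-up frequency table: (lower limit, upper limit, frequency).
table = [(0, 2, 5), (2, 4, 3), (4, 6, 2)]

# One point per class: x = class midpoint, y = frequency.
polygon_points = [((lower + upper) / 2, freq) for lower, upper, freq in table]

print(polygon_points)
```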

### Ogive

Plots the cumulative relative frequency. The x-axis represents the upper limit of each class.
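
A sketch of the points that make up the curve, using a made-up frequency table. Starting the curve at the lower limit of the first class with a value of 0 is a common convention, assumed here.

```python
from itertools import accumulate

# Made-up frequency table: (lower limit, upper limit, frequency).
table = [(0, 2, 5), (2, 4, 3), (4, 6, 2)]

n = sum(freq for _, _, freq in table)
upper_limits = [upper for _, upper, _ in table]
cumulative_relative = [cf / n for cf in accumulate(freq for _, _, freq in table)]

# x = upper limit of each class, y = cumulative relative frequency.
# Assumed convention: the curve starts at the first lower limit with y = 0.
ogive_points = [(table[0][0], 0.0)] + list(zip(upper_limits, cumulative_relative))

print(ogive_points)
```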

### Scatter plots

It plots two variables against each other, one on each axis, and shows how they relate.

When the hour of the day varies, does Guinness consumption vary? If you put time on the x-axis and pints on the y-axis, can we draw the scatter plot?