Statistics — Key Concepts & Formulas

35 cards · Mathematics

All Terms (35)

Mean

The average of a data set, calculated by dividing the sum of all values by the number of values.

Median

The middle value in a data set when the values are arranged in ascending order; with an even number of values, the average of the two middle values.

Mode

The value that appears most frequently in a data set.
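
The three measures of central tendency above can be computed with Python's standard `statistics` module. This is a minimal sketch; the `scores` list is hypothetical data chosen only for illustration.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of 30 divided by 6 values
median = statistics.median(scores)  # even count: average of the middle values 3 and 5
mode = statistics.mode(scores)      # 3 appears twice, more often than any other value

print(mean, median, mode)
```

Note that the three measures differ here (5, 4.0 and 3) because the data are skewed toward larger values.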

Standard Deviation

A measure of how much the values in a data set vary around the mean, equal to the square root of the variance.

Variance

The square of the standard deviation, representing the average of the squared differences from the mean. For a sample, the sum of squared differences is divided by n - 1 rather than n.
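
To make the relationship between variance and standard deviation concrete, the sketch below computes both by hand and then checks them against the standard `statistics` module. The `values` list is hypothetical.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

mu = sum(values) / len(values)                    # mean = 5.0
squared_diffs = [(x - mu) ** 2 for x in values]   # squared differences from the mean

pop_variance = sum(squared_diffs) / len(values)   # population form: divide by n
pop_std_dev = math.sqrt(pop_variance)             # standard deviation = sqrt(variance)

sample_variance = sum(squared_diffs) / (len(values) - 1)  # sample form: divide by n - 1

# The library functions should agree with the hand computation.
print(pop_variance, statistics.pvariance(values))
print(pop_std_dev, statistics.pstdev(values))
print(sample_variance, statistics.variance(values))
```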

Normal Distribution

A probability distribution that is symmetric about the mean, with a bell-shaped curve.

What is a Z-score?

A measure of how many standard deviations an element is from the mean.
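
As a formula, the definition above is z = (x - mean) / standard deviation. A minimal sketch follows; the function name `z_score` and the example numbers are illustrative, not from the card.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (x - mean) / std_dev

# Hypothetical example: a score of 85 where the mean is 70 and the
# standard deviation is 10 lies 1.5 standard deviations above the mean.
z = z_score(85, mean=70, std_dev=10)
print(z)
```

A negative result means the value lies below the mean.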

Central Limit Theorem

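The theorem stating that the sampling distribution of the sample mean approaches a normal distribution as the sample size becomes large, regardless of the shape of the population distribution, provided the observations are independent and the population variance is finite.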
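
A small simulation can illustrate the theorem. The sketch below draws repeated samples from an exponential distribution (strongly skewed, with mean 1 and standard deviation 1) and looks at the distribution of the sample means. The sample size, number of samples and seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50     # observations per sample (arbitrary)
NUM_SAMPLES = 2000   # number of samples drawn (arbitrary)

def sample_mean(n):
    # expovariate(1.0) draws from an exponential distribution with mean 1
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

center = statistics.fmean(sample_means)   # should be close to the population mean, 1
spread = statistics.stdev(sample_means)   # should be close to sigma / sqrt(n)
expected_spread = 1 / SAMPLE_SIZE ** 0.5

print(center, spread, expected_spread)
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the underlying exponential distribution is far from symmetric.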

What does correlation measure?

The strength and direction of a linear relationship between two variables.

Pearson Correlation Coefficient

A measure of the linear correlation between two variables, ranging from -1 to 1.
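
The coefficient can be computed directly from its definition: the sum of the products of the deviations, divided by the square root of the product of the two sums of squared deviations. A minimal sketch, with `pearson_r` as an illustrative name and made-up data:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (numerator) and sums of squares (denominator).
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical data: y doubles x exactly, so the relationship is perfectly linear.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # perfect positive correlation
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))   # perfect negative correlation
```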

Sample Space

The set of all possible outcomes in a probability experiment.

P-value

The probability of obtaining test results at least as extreme as the observed results, assuming that the null hypothesis is true.
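
As one concrete case, the sketch below computes a two-sided p-value for a z statistic, assuming the test statistic follows a standard normal distribution under the null hypothesis. The function name `two_sided_p_value` is illustrative; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. a two-sided z-test p-value."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail  # both tails count as "at least as extreme"

# A z statistic of 1.96 sits right at the conventional 5% threshold.
print(round(two_sided_p_value(1.96), 4))
```

Other tests use other reference distributions, but the idea is the same: the p-value is the tail area beyond the observed statistic.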

Type I Error

The error of rejecting a true null hypothesis (a false positive).

Type II Error

The error of failing to reject a false null hypothesis (a false negative).

Hypothesis Testing

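A method of deciding, from sample data, whether there is enough evidence to reject a null hypothesis in favor of an alternative hypothesis. The data may come from a controlled experiment or an observational study.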

Confidence Interval

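A range of values, computed from sample data, that is likely to contain the population parameter. The confidence level (for example, 95%) is the proportion of such intervals that would contain the parameter if the sampling were repeated many times.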
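
A minimal sketch of one common construction: an interval for a population mean built as sample mean plus or minus a critical value times the standard error. It assumes a normal approximation (a z critical value); for small samples a t critical value is usually more appropriate. The function name and data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(data)
    mean = statistics.fmean(data)
    std_err = statistics.stdev(data) / math.sqrt(n)   # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)    # about 1.96 for 95%
    margin = z * std_err
    return mean - margin, mean + margin

measurements = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]  # hypothetical data
low, high = mean_confidence_interval(measurements)
print(low, high)
```

Raising the confidence level widens the interval, since a larger critical value is needed to cover the parameter more often.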

Regression Analysis

A statistical process for estimating the relationships among variables.
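
The simplest case is fitting a straight line y = intercept + slope * x by ordinary least squares. The sketch below is one illustrative instance of regression, not the only form; `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    if sum_xx == 0:
        raise ValueError("x values must not all be identical")
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x   # the fitted line passes through the means
    return slope, intercept

# Hypothetical data lying exactly on y = 1 + 2x.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```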

What is an outlier?

An observation that lies far from the other observations in a data set. It may reflect natural variability in the measurements or indicate an experimental error.
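
The definition above does not say how far is "distant". One widely used convention, assumed here and not stated on the card, flags values more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. A minimal sketch with hypothetical data:

```python
import statistics

def iqr_outliers(data, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (the 1.5 x IQR convention)."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

readings = [10, 12, 11, 13, 12, 11, 95]  # hypothetical data; 95 looks suspicious
print(iqr_outliers(readings))
```

Different quartile methods and different multipliers can change which points are flagged, so the result is a screening aid rather than a verdict.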

Binomial Distribution

A discrete probability distribution of the number of successes in a fixed number of independent trials, each with the same probability of success.
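
The probability of exactly k successes in n trials with success probability p is C(n, k) * p^k * (1 - p)^(n - k). A minimal sketch, with `binomial_pmf` as an illustrative name:

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical example: exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))

# The probabilities over all possible counts add up to 1.
print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))
```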

Poisson Distribution

A discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space, when events occur independently at a constant average rate.
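
With an average of `rate` events per interval, the probability of exactly k events is rate^k * e^(-rate) / k!. A minimal sketch, with `poisson_pmf` as an illustrative name and made-up numbers:

```python
import math

def poisson_pmf(k, rate):
    """Probability of exactly k events when the average per interval is rate."""
    if k < 0 or rate <= 0:
        raise ValueError("k must be non-negative and rate must be positive")
    return rate ** k * math.exp(-rate) / math.factorial(k)

# Hypothetical example: a help desk averages 2 calls per hour.
print(poisson_pmf(0, 2.0))  # chance of a quiet hour with no calls
print(poisson_pmf(3, 2.0))  # chance of exactly 3 calls
```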

Cumulative Distribution Function (CDF)

A function that represents the probability that a random variable is less than or equal to a certain value.
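
For a standard normal variable, the CDF is available as `NormalDist().cdf` in Python's standard `statistics` module. The sketch below evaluates it at a few points and uses a difference of two CDF values to get the probability of an interval.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

p_below_zero = std_normal.cdf(0)       # half the distribution lies below the mean
p_below_196 = std_normal.cdf(1.96)     # roughly 0.975

# P(a < X <= b) is the difference of the CDF at the two endpoints.
p_within_one_sd = std_normal.cdf(1) - std_normal.cdf(-1)

print(p_below_zero, p_below_196, p_within_one_sd)
```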

What is the purpose of a scatter plot?

To visualize the relationship between two quantitative variables and identify potential correlations.

Chi-square Test

A statistical test used to determine whether there is a significant association between categorical variables.
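
The test statistic for a contingency table sums (observed - expected)^2 / expected over all cells, where each expected count is row total * column total / grand total. The sketch below computes only the statistic; turning it into a p-value requires the chi-square distribution, which is not included here. The tables are hypothetical.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a contingency table given as a list of rows.

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

print(chi_square_statistic([[10, 20], [20, 40]]))  # rows proportional: no association
print(chi_square_statistic([[20, 10], [10, 20]]))  # rows differ: larger statistic
```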

ANOVA (Analysis of Variance)

A statistical method used to compare means of three or more samples.

What is the Law of Large Numbers?

A principle stating that, as the number of trials increases, the experimental probability of an event gets closer to its theoretical probability.
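
A coin-flip simulation shows the effect. In this sketch the theoretical probability of heads is 0.5, and the observed proportion is printed for increasing numbers of flips. The seed and the flip counts are arbitrary choices.

```python
import random

rng = random.Random(1)  # dedicated generator with a fixed seed for repeatability

def heads_proportion(num_flips):
    """Simulate fair coin flips and return the observed proportion of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips

for n in (10, 100, 10_000, 100_000):
    print(n, heads_proportion(n))
```

Small runs can land far from 0.5, while the large runs typically settle close to it.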

Bayes' Theorem

A formula that describes how to update the probabilities of hypotheses when given evidence.
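
In symbols, P(H | E) = P(E | H) * P(H) / P(E). The sketch below applies this to a hypothetical screening test; the prevalence, sensitivity and specificity figures are made up for illustration.

```python
def posterior(prior, sensitivity, specificity):
    """P(condition | positive test) via Bayes' theorem."""
    true_positive = sensitivity * prior                # P(positive and condition)
    false_positive = (1 - specificity) * (1 - prior)   # P(positive and no condition)
    return true_positive / (true_positive + false_positive)

# Hypothetical numbers: 1% prevalence, 95% sensitivity, 90% specificity.
p = posterior(prior=0.01, sensitivity=0.95, specificity=0.90)
print(round(p, 4))
```

Even with a fairly accurate test, the updated probability stays below 10% here because the condition is rare to begin with.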

What is a random variable?

A variable whose possible values are numerical outcomes of a random phenomenon.

Skewness

A measure of the asymmetry of the probability distribution of a real-valued random variable about its mean.

Kurtosis

A measure of the 'tailedness' of the probability distribution of a real-valued random variable.
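
Skewness and kurtosis can both be expressed through central moments: skewness is the third central moment divided by the variance to the power 1.5, and kurtosis is the fourth central moment divided by the squared variance. The sketch below uses these population (moment-based) forms; other sample-adjusted variants exist. A normal distribution has kurtosis 3 under this definition.

```python
def central_moment(data, k):
    """Average of (x - mean)**k over the data."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    return central_moment(data, 4) / central_moment(data, 2) ** 2

symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_skewed = [1, 1, 1, 1, 10]   # hypothetical data with a long right tail

print(skewness(symmetric), skewness(right_skewed))
print(kurtosis(symmetric))
```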

What is the difference between descriptive and inferential statistics?

Descriptive statistics summarize the data in a sample using measures such as the mean or standard deviation, while inferential statistics use sample data, which are subject to random variation, to draw conclusions about a population.

What is the difference between a population and a sample?

A population includes every element of the group being studied, while a sample consists of one or more observations drawn from that population.

What is a probability density function (PDF)?

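A function that describes the relative likelihood of a continuous random variable taking on a particular value. The probability that the variable falls within an interval is the area under the function over that interval.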
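
A PDF value is a density, not a probability, so it can exceed 1. The sketch below shows this with a narrow normal distribution (standard deviation 0.1, an arbitrary choice) from Python's standard `statistics` module, and then gets an actual probability as the area over an interval.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=0.1)  # hypothetical, tightly concentrated distribution

density_at_zero = narrow.pdf(0)   # a density; here it is well above 1

# An actual probability is the area under the PDF over an interval, which
# can be obtained as a difference of CDF values.
prob_near_zero = narrow.cdf(0.1) - narrow.cdf(-0.1)

print(density_at_zero, prob_near_zero)
```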

What is a t-test?

A statistical test used to determine if there is a significant difference between the means of two groups.
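
The core of the test is a t statistic: the difference between the two group means divided by its standard error. The sketch below uses Welch's form, which does not assume equal variances; that choice is an assumption here, since the card does not name a variant. It computes only the statistic, not the p-value, and the data are hypothetical.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    mean_a = statistics.fmean(group_a)
    mean_b = statistics.fmean(group_b)
    var_a = statistics.variance(group_a)   # sample variance (divides by n - 1)
    var_b = statistics.variance(group_b)
    std_err = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / std_err

treated = [5, 6, 7]   # hypothetical measurements
control = [1, 2, 3]

print(welch_t(treated, control))
```

A statistic far from zero suggests the means differ by more than sampling noise alone would explain; the p-value quantifies this using a t distribution.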

What is heteroscedasticity?

A condition in which the variance of the errors (residuals) in a regression model is not the same across all levels of an independent variable.

What is multicollinearity?

A situation in which two or more independent variables in a multiple regression model are highly correlated.
