9   Hypothesis testing

In scientific studies, you’ll often see phrases like “the results are statistically significant”. This points to a technique called hypothesis testing, where we use p-values, a type of probability, to test our initial assumption or hypothesis.

In hypothesis testing, rather than providing an estimate of the parameter we’re studying, we provide a probability that serves as evidence supporting or contradicting a specific hypothesis. The hypothesis usually involves whether a parameter is different from a predetermined value (often 0).

Hypothesis testing is used when you can phrase your research question in terms of whether a parameter differs from this predetermined value. It’s applied in various fields, asking questions such as: Does a medication extend the lives of cancer patients? Does an increase in gun sales correlate with more gun violence? Does class size affect test scores?

Take, for instance, the previously used example with colored beads. We might not be concerned about the exact proportion of blue beads, but instead ask: Are there more blue beads than red ones? This could be rephrased as asking if the proportion of blue beads is more than 0.5.

The initial hypothesis that the parameter equals the predetermined value is called the “null hypothesis”. It’s popular because it allows us to focus on the data’s properties under this null scenario. Once data is collected, we estimate the parameter and calculate the p-value, which is the probability of the estimate being as extreme as observed if the null hypothesis is true. If the p-value is small, it indicates the null hypothesis is unlikely, providing evidence against it.

We will see more examples of hypothesis testing in Chapter 17.

9.1 p-values

Suppose we take a random sample of \(N=100\) and we observe \(52\) blue beads, which gives us \(\bar{X} = 0.52\). This seems to be pointing to the existence of more blue than red beads since 0.52 is larger than 0.5. However, we know there is chance involved in this process and we could get a 52 even when the actual \(p=0.5\). We call the assumption that \(p = 0.5\) a null hypothesis. The null hypothesis is the skeptic’s hypothesis.

We have observed a random variable \(\bar{X} = 0.52\), and the p-value is the answer to the question: How likely is it to see a value this large when the null hypothesis is true? If the p-value is small enough, we reject the null hypothesis and say that the results are statistically significant.

A p-value of 0.05 is conventionally used as the threshold for statistical significance in many areas of research. A cutoff of 0.01 is also used to define high significance. The choice of 0.05 is somewhat arbitrary and was popularized by the British statistician Ronald Fisher in the 1920s. We do not recommend using these cutoffs without justification and recommend avoiding the phrase statistically significant.

To obtain a p-value for our example, we write:

\[\mbox{Pr}(\mid \bar{X} - 0.5 \mid > 0.02 ) \]

assuming \(p=0.5\). Under the null hypothesis we know that:

\[ \sqrt{N}\frac{\bar{X} - 0.5}{\sqrt{0.5(1-0.5)}} \]

is standard normal. We, therefore, can compute the probability above, which is the p-value.

\[\mbox{Pr}\left(\sqrt{N}\frac{\mid \bar{X} - 0.5\mid}{\sqrt{0.5(1-0.5)}} > \sqrt{N} \frac{0.02}{ \sqrt{0.5(1-0.5)}}\right)\]

In this case, there is actually a large chance of seeing 52 or larger under the null hypothesis.
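As a rough illustration, the probability above can be evaluated numerically. The sketch below uses only Python's standard library and the normal approximation described in the text; the variable names are illustrative and not part of the original.

```python
from math import sqrt
from statistics import NormalDist

N = 100          # sample size
x_bar = 0.52     # observed proportion of blue beads
p_null = 0.5     # value of p under the null hypothesis

# Standardize the observed distance from the null value.
z = sqrt(N) * abs(x_bar - p_null) / sqrt(p_null * (1 - p_null))

# Two-sided p-value: probability that a standard normal exceeds z in absolute value.
p_value = 2 * (1 - NormalDist().cdf(z))

print(f"z = {z:.2f}, p-value = {p_value:.3f}")  # z = 0.40, p-value = 0.689
```

A p-value of about 0.69 means that a deviation of 0.02 or more from 0.5 is entirely ordinary when \(p=0.5\).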

Keep in mind that there is a close connection between p-values and confidence intervals. If a 95% confidence interval of the spread does not include 0, we know that the p-value must be smaller than 0.05.

To learn more about p-values, you can consult any statistics textbook. However, in general, we prefer reporting confidence intervals over p-values because it gives us an idea of the size of the estimate. If we just report the p-value, we provide no information about the significance of the finding in the context of the problem.

We can show mathematically that if a \((1-\alpha)\times 100\)% confidence interval does not contain the null hypothesis value, the null hypothesis is rejected with a p-value as small as or smaller than \(\alpha\). So statistical significance can be determined from confidence intervals. However, unlike the confidence interval, the p-value does not provide an estimate of the magnitude of the effect. For this reason, we recommend avoiding p-values whenever you can compute a confidence interval.
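To make the connection concrete, here is a minimal sketch that builds a 95% confidence interval for the bead example (\(\bar{X} = 0.52\), \(N = 100\)) and checks whether it contains the null value 0.5. It assumes the usual normal approximation with the estimated standard error, which differs slightly from the standard error under the null used in the test, so the correspondence is approximate for proportions.

```python
from math import sqrt
from statistics import NormalDist

N, x_bar, p_null, alpha = 100, 0.52, 0.5, 0.05

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
se_hat = sqrt(x_bar * (1 - x_bar) / N)         # estimated standard error
ci = (x_bar - z_crit * se_hat, x_bar + z_crit * se_hat)

contains_null = ci[0] <= p_null <= ci[1]
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f}); contains 0.5: {contains_null}")
```

The interval contains 0.5, which agrees with a p-value larger than 0.05, and it also shows the range of plausible values for \(p\), which the p-value alone does not.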

Pollsters are not successful at providing correct confidence intervals, but rather at predicting who will win. When we took a 25 bead sample size, the confidence interval for the spread included 0. If this were a poll and we were forced to make a declaration, we would have to say it was a “toss-up”.

One problem with our poll results is that, given the sample size and the value of \(p\) , we would have to sacrifice the probability of an incorrect call to create an interval that does not include 0.

This does not mean that the election is close. It only means that we have a small sample size. In statistical textbooks, this is called lack of power . In the context of polls, power is the probability of detecting spreads different from 0.

By increasing our sample size, we lower our standard error, and thus, have a much better chance of detecting the direction of the spread.
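The effect of sample size on power can be sketched numerically. The function below is an illustration under stated assumptions: a two-sided test of \(p = 0.5\) at the 0.05 level, the normal approximation for the sample proportion, and a hypothetical true proportion of 0.52 (a spread of 0.04). The function name and defaults are not from the original.

```python
from math import sqrt
from statistics import NormalDist

def power(N, p, p_null=0.5, alpha=0.05):
    """Approximate probability of rejecting p = p_null when the true proportion is p."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se_null = sqrt(p_null * (1 - p_null) / N)   # standard error assumed by the test
    se_true = sqrt(p * (1 - p) / N)             # actual standard error of the sample proportion
    upper = p_null + z_crit * se_null           # reject when the sample proportion is above this
    lower = p_null - z_crit * se_null           # ... or below this
    sampling = NormalDist(mu=p, sigma=se_true)  # CLT approximation for the sample proportion
    return (1 - sampling.cdf(upper)) + sampling.cdf(lower)

for N in (25, 100, 1000, 10000):
    print(N, round(power(N, p=0.52), 3))
```

With a small sample the power is barely above the significance level, while very large samples detect the same small spread almost every time.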

9.3 Exercises

  1. Generate a sample of size \(N=1000\) from an urn model with 50% blue beads, then compute a p-value to test if \(p=0.5\). Repeat this 10,000 times and report how often the p-value is lower than 0.05. How often is it lower than 0.01?

  2. Make a histogram of the p-values you generated in exercise 1. Which of the following seems to be true?
    • The p-values are all 0.05.
    • The p-values are normally distributed; CLT seems to hold.
    • The p-values are uniformly distributed.
    • The p-values are all less than 0.05.

  3. Demonstrate, mathematically, why we see the histogram we see in exercise 2.

  4. Generate a sample of size \(N=1000\) from an urn model with 52% blue beads, then compute a p-value to test if \(p=0.5\). Repeat this 10,000 times and report how often the p-value is larger than 0.05. Note that you are computing 1 - power.

  5. Repeat exercise 4 but for the following values:

Plot power as a function of \(N\) with a different color curve for each value of \(p\).

Hypothesis Testing – A Deep Dive into Hypothesis Testing, The Backbone of Statistical Inference

  September 21, 2023

Explore the intricacies of hypothesis testing, a cornerstone of statistical analysis. Dive into methods, interpretations, and applications for making data-driven decisions.

In this Blog post we will learn:

  • What is Hypothesis Testing?
  • Steps in Hypothesis Testing
    • 2.1. Set up Hypotheses: Null and Alternative
    • 2.2. Choose a Significance Level (α)
    • 2.3. Calculate a test statistic and P-Value
    • 2.4. Make a Decision
  • Example : Testing a new drug.
  • Example in python

1. What is Hypothesis Testing?

In simple terms, hypothesis testing is a method used to make decisions or inferences about population parameters based on sample data. Imagine being handed a die and asked if it’s biased. By rolling it a few times and analyzing the outcomes, you’d be engaging in the essence of hypothesis testing.

Think of hypothesis testing as the scientific method of the statistics world. Suppose you hear claims like “This new drug works wonders!” or “Our new website design boosts sales.” How do you know if these statements hold water? Enter hypothesis testing.

2. Steps in Hypothesis Testing

  • Set up Hypotheses: Begin with a null hypothesis (H0) and an alternative hypothesis (H1).
  • Choose a Significance Level (α): Typically 0.05, this is the probability of rejecting the null hypothesis when it’s actually true. Think of it as the chance of accusing an innocent person.
  • Calculate Test statistic and P-Value: Gather evidence (data) and calculate a test statistic. The p-value is the probability of observing data at least as extreme as what was actually observed, given that the null hypothesis is true. A small p-value (typically ≤ 0.05) suggests the data is inconsistent with the null hypothesis.
  • Decision Rule: If the p-value is less than or equal to α, you reject the null hypothesis in favor of the alternative.

2.1. Set up Hypotheses: Null and Alternative

Before diving into testing, we must formulate hypotheses. The null hypothesis (H0) represents the default assumption, while the alternative hypothesis (H1) challenges it.

For instance, in drug testing, H0: “The new drug is no better than the existing one,” H1: “The new drug is superior.”

2.2. Choose a Significance Level (α)

You collect and analyze data to test the hypotheses H0 and H1. Based on your analysis, you decide whether to reject the null hypothesis in favor of the alternative, or to fail to reject (accept) the null hypothesis.

The significance level, often denoted by $α$, represents the probability of rejecting the null hypothesis when it is actually true.

In other words, it’s the risk you’re willing to take of making a Type I error (false positive).

Type I Error (False Positive) :

  • Symbolized by the Greek letter alpha (α).
  • Occurs when you incorrectly reject a true null hypothesis . In other words, you conclude that there is an effect or difference when, in reality, there isn’t.
  • The probability of making a Type I error is set by the significance level of a test. Commonly, tests are conducted at the 0.05 significance level, which means that when the null hypothesis is true, there’s a 5% chance of making a Type I error.
  • Commonly used significance levels are 0.01, 0.05, and 0.10, but the choice depends on the context of the study and the level of risk one is willing to accept.

Example : If a drug is not effective (truth), but a clinical trial incorrectly concludes that it is effective (based on the sample data), then a Type I error has occurred.
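A small simulation can illustrate what a 5% Type I error rate means in practice. The sketch below is an assumption-laden illustration, not part of the original post: it repeatedly flips a fair coin (so the null hypothesis "p = 0.5" is true by construction), runs a two-sided z-test each time, and counts how often the null is wrongly rejected.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)  # fixed seed so the sketch is reproducible

def p_value_fair_coin(n):
    """Two-sided z-test p-value for H0: p = 0.5, using n simulated fair coin flips."""
    heads = sum(random.random() < 0.5 for _ in range(n))
    z = sqrt(n) * abs(heads / n - 0.5) / 0.5
    return 2 * (1 - NormalDist().cdf(z))

alpha = 0.05
trials = 2000
false_positives = sum(p_value_fair_coin(100) <= alpha for _ in range(trials))
type_1_rate = false_positives / trials
print(f"Rejected a true null in {type_1_rate:.1%} of {trials} trials")
```

The observed rate should land close to α. It will not match exactly, both because of simulation noise and because the number of heads is discrete.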

Type II Error (False Negative) :

  • Symbolized by the Greek letter beta (β).
  • Occurs when you accept a false null hypothesis. This means you conclude there is no effect or difference when, in reality, there is.
  • The probability of making a Type II error is denoted by β. The power of a test (1 - β) represents the probability of correctly rejecting a false null hypothesis.

Example : If a drug is effective (truth), but a clinical trial incorrectly concludes that it is not effective (based on the sample data), then a Type II error has occurred.

Balancing the Errors :

In practice, there’s a trade-off between Type I and Type II errors. Reducing the risk of one typically increases the risk of the other. For example, if you want to decrease the probability of a Type I error (by setting a lower significance level), you might increase the probability of a Type II error unless you compensate by collecting more data or making other adjustments.

It’s essential to understand the consequences of both types of errors in any given context. In some situations, a Type I error might be more severe, while in others, a Type II error might be of greater concern. This understanding guides researchers in designing their experiments and choosing appropriate significance levels.
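The trade-off can be made concrete with a small calculation. The sketch below assumes a one-sided z-test of "mean = 0" with known standard deviation and a hypothetical true effect of 0.5; the function name, effect size, and sample sizes are illustrative choices, not values from the post.

```python
from math import sqrt
from statistics import NormalDist

def type_2_error(alpha, n, effect=0.5, sigma=1.0):
    """Beta for a one-sided z-test of H0: mu = 0 when the true mean is `effect`."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Under the alternative, the z statistic is normal with this mean and unit variance.
    shift = effect * sqrt(n) / sigma
    return NormalDist().cdf(z_crit - shift)

for n in (25, 100):
    for alpha in (0.10, 0.05, 0.01):
        print(f"n={n:3d}  alpha={alpha:.2f}  beta={type_2_error(alpha, n):.3f}")
```

For a fixed sample size, lowering α raises β; collecting more data lowers β at every α.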

2.3. Calculate a test statistic and P-Value

Test statistic : A test statistic is a single number that helps us understand how far our sample data is from what we’d expect under a null hypothesis (a basic assumption we’re trying to test against). Generally, the larger the test statistic, the more evidence we have against our null hypothesis. It helps us decide whether the differences we observe in our data are due to random chance or if there’s an actual effect.

P-value: The P-value tells us how likely we would get our observed results (or something more extreme) if the null hypothesis were true. It’s a value between 0 and 1.

  • A smaller P-value (typically below 0.05) means that the observation is rare under the null hypothesis, so we might reject the null hypothesis.
  • A larger P-value suggests that what we observed could easily happen by random chance, so we might not reject the null hypothesis.

2.4. Make a Decision

Relationship between $α$ and P-Value

When conducting a hypothesis test:

  • We first choose a significance level ($α$), which sets a threshold for making decisions.

We then calculate the p-value from our sample data and the test statistic.

Finally, we compare the p-value to our chosen $α$:

  • If p-value ≤ $α$: We reject the null hypothesis in favor of the alternative hypothesis. The result is said to be statistically significant.
  • If p-value > $α$: We fail to reject the null hypothesis. There isn’t enough statistical evidence to support the alternative hypothesis.

3. Example : Testing a new drug.

Imagine we are investigating whether a new drug is effective at treating headaches faster than a placebo.

Setting Up the Experiment: You gather 100 people who suffer from headaches. Half of them (50 people) are given the new drug (let’s call this the ‘Drug Group’), and the other half are given a sugar pill, which doesn’t contain any medication (the ‘Placebo Group’).

  • Set up Hypotheses: Before starting, you make a prediction:
    • Null Hypothesis (H0): The new drug has no effect. Any difference in healing time between the two groups is just due to random chance.
    • Alternative Hypothesis (H1): The new drug does have an effect. The difference in healing time between the two groups reflects a real effect and is not just due to chance.
  • Choose a Significance Level (α): Typically 0.05, this is the probability of rejecting the null hypothesis when it’s actually true.
  • Calculate Test statistic and P-Value: After the experiment, you analyze the data. The “test statistic” is a number that helps you understand the difference between the two groups in terms of standard units.

For instance, let’s say:

  • The average healing time in the Drug Group is 2 hours.
  • The average healing time in the Placebo Group is 3 hours.

The test statistic helps you understand how significant this 1-hour difference is. If the groups are large and the spread of healing times in each group is small, then this difference might be significant. But if there’s a huge variation in healing times, the 1-hour difference might not be so special.

Imagine the P-value as answering this question: “If the new drug had NO real effect, what’s the probability that I’d see a difference as extreme (or more extreme) as the one I found, just by random chance?”

For instance:

  • P-value of 0.01 means there’s a 1% chance that the observed difference (or a more extreme difference) would occur if the drug had no effect. That’s pretty rare, so we might consider the drug effective.
  • P-value of 0.5 means there’s a 50% chance you’d see this difference just by chance. That’s pretty high, so we might not be convinced the drug is doing much.
  • If the P-value is less than $α$ (0.05): the results are “statistically significant,” and we reject the null hypothesis, concluding that the new drug has an effect.
  • If the P-value is greater than $α$ (0.05): the results are not statistically significant, and we don’t reject the null hypothesis, remaining unsure if the drug has a genuine effect.

4. Example in python

For simplicity, let’s say we’re using a t-test (common for comparing means). Let’s dive into Python:
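A minimal sketch of such a test follows. It is an illustration under stated assumptions rather than a definitive implementation: the healing times are simulated (means of 2 and 3 hours as in the example, with a hypothetical standard deviation of 1 hour), the helper `welch_t_test` is written here for the purpose, and the p-value uses a normal approximation to the t distribution, which is reasonable with about 50 people per group.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance

def welch_t_test(a, b):
    """Return (t statistic, approximate two-sided p-value) for two independent samples."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))  # standard error of the difference
    t_stat = (mean(a) - mean(b)) / se
    # Normal approximation to the t distribution (assumption: groups are reasonably large).
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value

random.seed(42)  # hypothetical, simulated healing times in hours
drug_group = [random.gauss(2, 1) for _ in range(50)]
placebo_group = [random.gauss(3, 1) for _ in range(50)]

t_stat, p_value = welch_t_test(drug_group, placebo_group)
alpha = 0.05
print(f"t = {t_stat:.2f}, p-value = {p_value:.4g}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```

If SciPy is available, `scipy.stats.ttest_ind(drug_group, placebo_group, equal_var=False)` computes the same statistic with an exact t-based p-value.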

Making a Decision: If the p-value is below 0.05, we’d say, “The results are statistically significant! The drug seems to have an effect!” If not, we’d say, “Looks like the drug isn’t as miraculous as we thought.”

5. Conclusion

Hypothesis testing is an indispensable tool in data science, allowing us to make data-driven decisions with confidence. By understanding its principles, conducting tests properly, and considering real-world applications, you can harness the power of hypothesis testing to unlock valuable insights from your data.

Statistics - Hypothesis Testing

Hypothesis testing is a formal way of checking if a hypothesis about a population is true or not.

Hypothesis Testing

A hypothesis is a claim about a population parameter .

A hypothesis test is a formal procedure to check if a hypothesis is true or not.

Examples of claims that can be checked:

The average height of people in Denmark is more than 170 cm.

The share of left handed people in Australia is not 10%.

The average income of dentists is less than the average income of lawyers.

The Null and Alternative Hypothesis

Hypothesis testing is based on making two different claims about a population parameter.

The null hypothesis (\(H_{0} \)) and the alternative hypothesis (\(H_{1}\)) are the claims.

The two claims need to be mutually exclusive, meaning only one of them can be true.

The alternative hypothesis is typically what we are trying to prove.

For example, we want to check the following claim:

"The average height of people in Denmark is more than 170 cm."

In this case, the parameter is the average height of people in Denmark (\(\mu\)).

The null and alternative hypothesis would be:

Null hypothesis : The average height of people in Denmark is 170 cm.

Alternative hypothesis : The average height of people in Denmark is more than 170 cm.

The claims are often expressed with symbols like this:

\(H_{0}\): \(\mu = 170 \: cm \)

\(H_{1}\): \(\mu > 170 \: cm \)

If the data supports the alternative hypothesis, we reject the null hypothesis and accept the alternative hypothesis.

If the data does not support the alternative hypothesis, we keep the null hypothesis.

Note: The alternative hypothesis is also referred to as (\(H_{A} \)).

The Significance Level

The significance level (\(\alpha\)) is the uncertainty we accept when rejecting the null hypothesis in the hypothesis test.

The significance level is the probability of accidentally rejecting the null hypothesis when it is actually true.

Typical significance levels are:

  • \(\alpha = 0.1\) (10%)
  • \(\alpha = 0.05\) (5%)
  • \(\alpha = 0.01\) (1%)

A lower significance level means that the evidence in the data needs to be stronger to reject the null hypothesis.

There is no "correct" significance level - it only states the uncertainty of the conclusion.

Note: A 5% significance level means that when the null hypothesis is true:

We expect to reject it 5 out of 100 times.


The Test Statistic

The test statistic is used to decide the outcome of the hypothesis test.

The test statistic is a standardized value calculated from the sample.

Standardization means converting a statistic to a well known probability distribution .

The type of probability distribution depends on the type of test.

Common examples are:

  • Standard Normal Distribution (Z): used for Testing Population Proportions
  • Student's T-Distribution (T): used for Testing Population Means

Note: You will learn how to calculate the test statistic for each type of test in the following chapters.

The Critical Value and P-Value Approach

There are two main approaches used for hypothesis tests:

  • The critical value approach compares the test statistic with the critical value of the significance level.
  • The p-value approach compares the p-value of the test statistic with the significance level.

The Critical Value Approach

The critical value approach checks if the test statistic is in the rejection region .

The rejection region is an area of probability in the tails of the distribution.

The size of the rejection region is decided by the significance level (\(\alpha\)).

The value that separates the rejection region from the rest is called the critical value .

If the test statistic is inside this rejection region, the null hypothesis is rejected .

For example, if the test statistic is 2.3 and the critical value is 2 for a significance level (\(\alpha = 0.05\)):

We reject the null hypothesis (\(H_{0} \)) at 0.05 significance level (\(\alpha\))

The P-Value Approach

The p-value approach checks if the p-value of the test statistic is smaller than the significance level (\(\alpha\)).

The p-value of the test statistic is the area of probability in the tails of the distribution from the value of the test statistic.

If the p-value is smaller than the significance level, the null hypothesis is rejected .

The p-value directly tells us the lowest significance level where we can reject the null hypothesis.

For example, if the p-value is 0.03:

We reject the null hypothesis (\(H_{0} \)) at a 0.05 significance level (\(\alpha\))

We keep the null hypothesis (\(H_{0}\)) at a 0.01 significance level (\(\alpha\))

Note: The two approaches are only different in how they present the conclusion.
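The agreement between the two approaches can be checked with a short sketch. It assumes a right-tailed Z-test and a hypothetical test statistic of 2.3; the exact critical value depends on the distribution and on whether the test is one- or two-tailed, so the numbers here are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()
z = 2.3  # hypothetical test statistic for a right-tailed Z-test

p_value = 1 - std_normal.cdf(z)  # area in the right tail beyond z

decisions = {}
for alpha in (0.05, 0.01):
    critical_value = std_normal.inv_cdf(1 - alpha)  # boundary of the rejection region
    reject_by_critical_value = z > critical_value
    reject_by_p_value = p_value < alpha
    decisions[alpha] = (reject_by_critical_value, reject_by_p_value)
    print(f"alpha={alpha}: critical value={critical_value:.3f}, "
          f"p-value={p_value:.4f}, reject={reject_by_critical_value}")
```

At both significance levels the two approaches reach the same decision: the null hypothesis is rejected at 0.05 and kept at 0.01.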

Steps for a Hypothesis Test

The following steps are used for a hypothesis test:

  • Check the conditions
  • Define the claims
  • Decide the significance level
  • Calculate the test statistic

One condition is that the sample is randomly selected from the population.

The other conditions depend on what type of parameter you are testing the hypothesis for.

Common parameters to test hypotheses are:

  • Proportions (for qualitative data)
  • Mean values (for numerical data)

You will learn the steps for both types in the following pages.

Why You Should "Accept" the Null Hypothesis When Hypothesis Testing

Yuzheng Sun, PhD

The statement “you can never accept the null hypothesis; you can only fail to reject it” is widely circulated but fundamentally flawed.

This misconception has caused confusion among many, even seasoned statisticians, about the nature of hypothesis testing.

In this article, let's clear up the confusion by exploring:  

1) Where does this misunderstanding come from,  

2) Why people believe you shouldn't accept the null hypothesis, and  

3) Why and when you actually should accept it.

Where does this misconception come from?

The root cause is that modern textbooks often mix Fisher’s significance testing with Neyman-Pearson’s hypothesis testing, without explaining the key differences ( ref ).

It’s like blending the rules of two different sports. Can you touch the ball with your hands? It depends—are you playing basketball or soccer? Saying it’s illegal to touch the ball with your hands in basketball makes no sense, but that’s essentially what happens when people declare “accepting the null hypothesis” as wrong under the hypothesis testing framework.

So, can you "accept" the null hypothesis? In Fisher’s framework, no. But in Neyman-Pearson’s framework, yes—you can, and you must.

Why is “accepting the null hypothesis” bad under Fisher’s framework?

First, we can never definitively "prove" a hypothesis based on observations alone, because for any given observation, there are infinitely many possible hypotheses that could be true, each with different probabilities.

Second, in the strict Fisher p-value framework, there is no alternative hypothesis. While we may set a threshold for rejecting the null hypothesis (e.g., p < 0.05), there isn't a similarly clear rule for which alternative hypothesis we should “accept” if we fail to reject the null. This contrasts with the Neyman-Pearson framework, where there is a specific alternative hypothesis and its specific beta.

Third, Fisher’s original stance was that we shouldn't assume the null hypothesis is true with complete certainty. In fact, he didn’t support the idea of a fixed significance level (like 0.05) but instead suggested that the p-value should be seen as a continuous measure of evidence against the null hypothesis.

In sum, the dangers of using the term "accepting a hypothesis" in the p-value framework are:

Many people mistakenly interpret "accepting" the null as "proving" it, which is incorrect.

"Accepting the null hypothesis" isn't a rigorously defined concept and overlooks the core purpose of the test, which is to decide whether to reject the null hypothesis given our observations.

Therefore, within Fisher’s p-value framework, calling something "accepting the null hypothesis" is essentially invalid. But note that terms like alternative hypothesis, alpha, beta, power, and minimum detectable effect (MDE) are also out of place in this context. If someone uses these terms while calling accepting a hypothesis illegal, they are being inconsistent.

Why is “accepting the null hypothesis” necessary under the Neyman-Pearson framework?

In the Neyman-Pearson framework, "accepting" both the null and alternative hypotheses is not only allowed but necessary. However, "accepting" a hypothesis doesn’t mean you believe it; it simply means you act as though it’s true.

In hypothesis testing, remember how we can outline the null and alternative hypotheses and precisely calculate alpha and beta? This process isn't possible unless we temporarily assume one of the hypotheses is true.

The Neyman-Pearson framework, often referred to as the hypothesis testing framework, is a mathematically consistent approach for linking observations, hypotheses, and decision rules, all while calculating the relevant probabilities. I have a 15-minute video that walks you through this framework with helpful visualizations. I highly recommend watching this video if you have been confused by textbooks.

Optional: Fisher’s Framework vs Neyman-Pearson’s Framework

If this is your first time learning that these two are different frameworks, please forget what you’ve learned for a minute and treat these two frameworks as basketball and soccer. Below is a comparison between these two frameworks.

Fisher's framework:

Focus on significance testing: Fisher introduced the concept of significance testing, where the primary goal is to assess whether the observed data provide strong enough evidence to reject the null hypothesis.

Null hypothesis as a default assumption: In Fisher's approach, the null hypothesis (\(H_0\)) represents a default position that there is no effect or no difference. It is a specific hypothesis that is tested without necessarily specifying an alternative hypothesis.

Use of p-values: The p-value is central in Fisher's framework. It measures the probability of observing data as extreme as (or more extreme than) the observed data, assuming the null hypothesis is true. A smaller p-value indicates stronger evidence against \(H_0\).

No fixed significance level: Fisher did not advocate for a fixed significance level (like 0.05) but suggested that the p-value should be interpreted as a continuous measure of evidence against \(H_0\).

Neyman-Pearson framework:

Focus on hypothesis testing as decision-making: Neyman and Pearson formalized hypothesis testing as a decision-making process between two competing hypotheses: the null hypothesis (\(H_0\)) and the alternative hypothesis (\(H_1\)).

Null and alternative hypotheses treated symmetrically: Both \(H_0\) and \(H_1\) are explicitly defined, and tests are designed to decide between them based on the data.

Control of error rates: The framework introduces Type I error (rejecting \(H_0\) when it is true) and Type II error (failing to reject \(H_0\) when \(H_1\) is true). Significance levels (\(\alpha\)) and power (\(1 - \beta\)) are predetermined to control these error rates.

Use of critical regions: Instead of p-values, the Neyman-Pearson approach uses critical values and regions to decide whether to reject \(H_0\), based on the likelihood ratio or test statistic.
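A small sketch may help show how these pieces fit together as a decision rule. The setup is entirely hypothetical (a one-sided test of a mean with known standard deviation, and made-up values for both hypotheses and the sample size); it only illustrates how fixing alpha determines a critical region, which in turn determines beta.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical setup: H0: mu = 0 versus H1: mu = 0.5, known sigma = 1, n = 25.
mu_0, mu_1, sigma, n = 0.0, 0.5, 1.0, 25
alpha = 0.05  # chosen before seeing the data

se = sigma / sqrt(n)
# Critical value for the sample mean: the boundary of the critical region.
critical_mean = NormalDist(mu_0, se).inv_cdf(1 - alpha)
# Beta: probability of landing outside the critical region when H1 is true.
beta = NormalDist(mu_1, se).cdf(critical_mean)

def decide(sample_mean):
    """Act as if H1 is true inside the critical region, otherwise act as if H0 is true."""
    return "accept H1" if sample_mean > critical_mean else "accept H0"

print(f"critical mean = {critical_mean:.3f}, beta = {beta:.3f}, power = {1 - beta:.3f}")
print(decide(0.2), "|", decide(0.4))
```

Both error rates are properties of the rule, fixed before any data arrive; the observed sample mean only selects which action the rule prescribes.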

Main differences in defining the null hypothesis:

Purpose and interpretation:

Fisher: The null hypothesis is a provisional assumption to be tested. It is not necessarily meant to be accepted or rejected definitively but used to measure the strength of evidence against it.

Neyman-Pearson: The null hypothesis is one of two competing hypotheses, and the testing procedure is designed to make a clear decision to accept or reject \(H_0\) based on controlled error probabilities.

Role of the alternative hypothesis:

Fisher: The alternative hypothesis is often implicit or not formally specified. The focus is on assessing evidence against \(H_0\).

Neyman-Pearson: The alternative hypothesis (\(H_1\)) is explicitly defined, and tests are constructed to distinguish between \(H_0\) and \(H_1\).

Decision-making vs. evidence assessment:

Fisher: Emphasizes measuring evidence against \(H_0\) without necessarily making a final decision.

Neyman-Pearson: Emphasizes making a decision between \(H_0\) and \(H_1\), incorporating the long-run frequencies of errors.

In summary:

Fisher's Null Hypothesis: A unique, specific hypothesis tested to see if there is significant evidence against it, using p-values as a measure of evidence.

Neyman-Pearson's Null Hypothesis: One of two explicitly defined hypotheses in a decision-making framework, where tests are designed to control error rates and decide between \(H_0\) and \(H_1\).


Fisher, R.A. (1925). Statistical Methods for Research Workers.

Neyman, J., & Pearson, E.S. (1933). "On the Problem of the Most Efficient Tests of Statistical Hypotheses." Philosophical Transactions of the Royal Society A , 231(694-706), 289-337.

The hypothesis is a common term in Machine Learning and data science projects. As we know, machine learning is one of the most powerful technologies across the world, which helps us to predict results based on past experiences. Moreover, data scientists and ML professionals conduct experiments that aim to solve a problem. These ML professionals and data scientists make an initial assumption for the solution of the problem.

This assumption in Machine learning is known as Hypothesis. In Machine Learning, at various times, Hypothesis and Model are used interchangeably. However, a Hypothesis is an assumption made by scientists, whereas a model is a mathematical representation that is used to test the hypothesis. In this topic, "Hypothesis in Machine Learning," we will discuss a few important concepts related to a hypothesis in machine learning and their importance. So, let's start with a quick introduction to Hypothesis.

A hypothesis is a guess that is based on some known facts but has not yet been proven. A good hypothesis is testable, which results in either true or false.

Example: Let's understand the hypothesis with a common example. Some scientist claims that ultraviolet (UV) light can damage the eyes, so it may also cause blindness.

In this example, a scientist just claims that UV rays are harmful to the eyes, but we assume they may cause blindness. However, it may or may not be possible. Hence, these types of assumptions are called hypotheses.

The hypothesis is one of the commonly used concepts of statistics in Machine Learning. It is specifically used in Supervised Machine learning, where an ML model learns a function that best maps the input to corresponding outputs with the help of an available dataset.

There are some common methods to find a possible hypothesis from the hypothesis space, where the hypothesis space is represented by (H) and a hypothesis by (h). These are defined as follows:

Hypothesis space (H): It is used by supervised machine learning algorithms to determine the best possible hypothesis to describe the target function, that is, the one that best maps input to output.

It is often constrained by the choice of the framing of the problem, the choice of model, and the choice of model configuration.

Hypothesis (h): It is primarily based on data as well as bias and restrictions applied to data.

Hence hypothesis (h) can be concluded as a single hypothesis that maps input to proper output and can be evaluated as well as used to make predictions.

The hypothesis (h) can be formulated in machine learning as follows:

\[ y = mx + c \]

where:

y: range

m: slope of the line, that is, the change in y divided by the change in x

x: domain

c: intercept (constant)
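As an illustration of picking one hypothesis out of a hypothesis space, the sketch below fits the slope m and intercept c of a line to a few data points by ordinary least squares. Least squares is one common selection method and is an assumption here, as are the function name `fit_line` and the toy data; the text does not prescribe a particular method.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept c for h(x) = m*x + c."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    spread = sum((x - x_mean) ** 2 for x in xs)
    m = covariance / spread
    c = y_mean - m * x_mean
    return m, c

# Hypothetical training data lying exactly on y = 2x + 1.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

m, c = fit_line(xs, ys)

def h(x):
    """The selected hypothesis: one specific line out of the space of all lines."""
    return m * x + c

print(m, c, h(4))  # 2.0 1.0 9.0
```

Here the hypothesis space is the set of all lines, and the fitted pair (m, c) identifies the single hypothesis h used to make predictions.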

Example: Let's understand the hypothesis (h) and hypothesis space (H) with a two-dimensional coordinate plane showing the distribution of data.

Hypothesis space (H) is the composition of all legal best possible ways to divide the coordinate plane so that it best maps input to proper output.

Further, each individual best possible way is called a hypothesis (h).

Similar to the hypothesis in machine learning, a hypothesis in statistics is also an assumption about the outcome. However, it is falsifiable, which means it can be rejected in the presence of sufficient evidence.

Unlike in machine learning, we can never fully accept a hypothesis in statistics, because conclusions are based on probability. Before starting work on an experiment, we must be aware of two important types of hypotheses:

Null hypothesis: A null hypothesis is a statistical hypothesis stating that no statistically significant effect exists in the given set of observations. It is also known as a conjecture and is used in quantitative analysis to test theories about markets, investment, and finance.

Alternative hypothesis: An alternative hypothesis directly contradicts the null hypothesis, which means that if one of the two hypotheses is true, the other must be false. In other words, it states that some significant effect exists in the given set of observations.

The significance level must be set before starting an experiment. It defines the tolerance for error and the level at which an effect can be considered significant. A common choice is a 95% confidence level, which corresponds to a significance level of 5% (α = 0.05). The significance level also serves as the threshold against which the p-value is compared. For example, if the confidence level is set to 98%, the significance level is 0.02.

The p-value in statistics measures the strength of evidence against a null hypothesis. More precisely, the p-value is the probability, assuming the null hypothesis is true, of obtaining data at least as extreme as the data actually observed.

The smaller the p-value, the stronger the evidence against the null hypothesis, and the more readily it can be rejected. It is usually written in decimal form, such as 0.035.

Whenever a statistical test is carried out on sample data to obtain a p-value, the result is judged against the significance level. If the p-value is less than the significance level, the effect is considered significant and the null hypothesis can be rejected. If it is higher than the significance level, there is no significant effect and we fail to reject the null hypothesis.
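
The decision rule can be sketched with the bead example from the start of the chapter (100 draws, 52 blue). This is only an illustration under assumed choices: a one-sided exact binomial test via `scipy.stats.binomtest` and a significance level of 0.05, neither of which is prescribed by the text.

```python
from scipy.stats import binomtest

n_beads = 100      # sample size from the bead example
n_blue = 52        # observed number of blue beads
alpha = 0.05       # significance level (assumed)

# H0: p = 0.5; H1: p > 0.5 (one-sided)
result = binomtest(n_blue, n_beads, p=0.5, alternative="greater")
p_value = result.pvalue

# Reject H0 only if the p-value falls below the significance level
reject_null = p_value < alpha
print(f"p-value = {p_value:.3f}, reject H0: {reject_null}")
```

Here the p-value is well above 0.05, so observing 52 blue beads out of 100 is not enough evidence to reject the hypothesis that the proportion is 0.5.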

In supervised machine learning, where instances of inputs are mapped to outputs, the hypothesis is a very useful concept that helps to approximate a target function. Hypothesis testing is used across analytics domains and is an important factor in deciding whether a change should be introduced or not. It applies to everything from training data sets to the efficiency and performance of models.

Hence, in this topic, we have covered various important concepts related to the hypothesis in machine learning and statistics and some important parameters such as p-value, significance level, etc., to understand hypothesis concepts in a better way.

The statistical practice of hypothesis testing is widespread not only in statistics but also throughout the natural and social sciences. When we conduct a hypothesis test there are a couple of things that could go wrong. There are two kinds of errors, which by design cannot be avoided, and we must be aware that these errors exist. The errors are given the quite pedestrian names of type I and type II errors. What are type I and type II errors, and how do we distinguish between them? Briefly:

  • Type I errors happen when we reject a true null hypothesis
  • Type II errors happen when we fail to reject a false null hypothesis

We will explore more background behind these types of errors with the goal of understanding these statements.

Hypothesis Testing

The process of hypothesis testing can seem to be quite varied with a multitude of test statistics. But the general process is the same. Hypothesis testing involves the statement of a null hypothesis and the selection of a level of significance. The null hypothesis is either true or false and represents the default claim for a treatment or procedure. For example, when examining the effectiveness of a drug, the null hypothesis would be that the drug has no effect on a disease.

After formulating the null hypothesis and choosing a level of significance, we acquire data through observation. Statistical calculations tell us whether or not we should reject the null hypothesis.

In an ideal world, we would always reject the null hypothesis when it is false, and we would not reject the null hypothesis when it is indeed true. But there are two other scenarios that are possible, each of which will result in an error.

Type I Error

The first kind of error that is possible involves the rejection of a null hypothesis that is actually true. This kind of error is called a type I error and is sometimes called an error of the first kind.

Type I errors are equivalent to false positives. Let’s go back to the example of a drug being used to treat a disease. If we reject the null hypothesis in this situation, then our claim is that the drug does, in fact, have some effect on a disease. But if the null hypothesis is true, then, in reality, the drug does not combat the disease at all. The drug is falsely claimed to have a positive effect on a disease.

Type I errors can be controlled. The value of alpha, which is the level of significance that we selected, has a direct bearing on type I errors. Alpha is the maximum probability that we have a type I error. For a 95% confidence level, the value of alpha is 0.05. This means that, when the null hypothesis is true, there is a 5% probability that we will reject it. In the long run, one out of every twenty hypothesis tests of a true null hypothesis performed at this level will result in a type I error.
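
A small simulation can make the long-run reading of alpha concrete. The sketch below assumes normally distributed data, a one-sample t-test, and arbitrary choices for the number of tests and sample size; it repeatedly tests a null hypothesis that is true and counts how often it is rejected.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
alpha = 0.05
n_tests, n_obs = 2000, 30        # assumed simulation sizes

# Every sample is drawn with true mean 0, so H0: mean = 0 is true each time
samples = rng.normal(loc=0.0, scale=1.0, size=(n_tests, n_obs))
p_values = stats.ttest_1samp(samples, popmean=0.0, axis=1).pvalue

# Any rejection here is a type I error
type_1_rate = float(np.mean(p_values < alpha))
print(f"observed type I error rate: {type_1_rate:.3f}")
```

The observed rate should land close to alpha, with some run-to-run variation that shrinks as the number of simulated tests grows.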

Type II Error

The other kind of error that is possible occurs when we do not reject a null hypothesis that is false. This sort of error is called a type II error and is also referred to as an error of the second kind.

Type II errors are equivalent to false negatives. If we think back again to the scenario in which we are testing a drug, what would a type II error look like? A type II error would occur if we failed to reject the hypothesis that the drug had no effect on a disease, but in reality, it did.

The probability of a type II error is given by the Greek letter beta. This number is related to the power or sensitivity of the hypothesis test, denoted by 1 – beta.

How to Avoid Errors

Type I and type II errors are part of the process of hypothesis testing. Although the errors cannot be completely eliminated, we can minimize one type of error.

Typically when we try to decrease the probability of one type of error, the probability of the other type increases. We could decrease the value of alpha from 0.05 to 0.01, corresponding to a 99% level of confidence. However, if everything else remains the same, then the probability of a type II error will nearly always increase.
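
The trade-off can be illustrated with a closed-form calculation. The sketch below assumes a two-sided one-sample z-test with known standard deviation; the function name `type_2_error` and the effect size, standard deviation, and sample size are hypothetical values chosen only for illustration.

```python
from scipy.stats import norm

def type_2_error(alpha, effect, sigma, n):
    """Beta for a two-sided one-sample z-test with known sigma."""
    z_crit = norm.ppf(1 - alpha / 2)          # critical value for this alpha
    shift = effect / (sigma / n ** 0.5)       # true mean shift in standard-error units
    # Probability that the z statistic lands inside the non-rejection region
    return norm.cdf(z_crit - shift) - norm.cdf(-z_crit - shift)

beta_05 = type_2_error(alpha=0.05, effect=0.5, sigma=1.0, n=25)
beta_01 = type_2_error(alpha=0.01, effect=0.5, sigma=1.0, n=25)
print(f"alpha=0.05: beta={beta_05:.3f}, power={1 - beta_05:.3f}")
print(f"alpha=0.01: beta={beta_01:.3f}, power={1 - beta_01:.3f}")
```

With everything else held fixed, the stricter alpha of 0.01 gives a larger beta, which is the increase in type II error described above.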

Many times the real world application of our hypothesis test will determine if we are more accepting of type I or type II errors. This will then be used when we design our statistical experiment.


P-Value: Comprehensive Guide to Understand, Apply, and Interpret

A p-value is a statistical metric used to assess a hypothesis by comparing it with observed data.

This article delves into the concept of p-value, its calculation, interpretation, and significance. It also explores the factors that influence p-value and highlights its limitations.

What is the P-value?

The p-value, or probability value, is a statistical measure used in hypothesis testing to assess the strength of evidence against a null hypothesis. It represents the probability of obtaining results as extreme as, or more extreme than, the observed results under the assumption that the null hypothesis is true.

In simpler words, it helps decide whether to reject or retain the null hypothesis during hypothesis testing. In data science, it gives valuable insight into the statistical significance of an independent variable in predicting the dependent variable.

Calculating the p-value typically involves the following steps:

  • Formulate the Null Hypothesis (H0) : Clearly state the null hypothesis, which typically states that there is no significant relationship or effect between the variables.
  • Choose an Alternative Hypothesis (H1) : Define the alternative hypothesis, which proposes the existence of a significant relationship or effect between the variables.
  • Determine the Test Statistic : Calculate the test statistic, which is a measure of the discrepancy between the observed data and the expected values under the null hypothesis. The choice of test statistic depends on the type of data and the specific research question.
  • Identify the Distribution of the Test Statistic : Determine the appropriate sampling distribution for the test statistic under the null hypothesis. This distribution represents the expected values of the test statistic if the null hypothesis is true.
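  • Calculate the p-value : Based on the observed test statistic and the sampling distribution, find the probability of obtaining the observed test statistic or a more extreme one, assuming the null hypothesis is true. Equivalently, find the critical value of the test statistic for the chosen significance level.
  • Interpret the results: Compare the p-value with the significance level, or the test statistic with the critical value. If the p-value is smaller than the significance level (equivalently, the test statistic is more extreme than the critical value), there is evidence to reject the null hypothesis; otherwise we fail to reject it.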

Its interpretation depends on the specific test and the context of the analysis. Several popular test statistics are used in p-value calculations:

  • Z-test : Used when dealing with large sample sizes or when the population standard deviation is known. A small p-value (smaller than 0.05) indicates strong evidence against the null hypothesis, leading to its rejection.
  • T-test : Appropriate for small sample sizes or when the population standard deviation is unknown. The interpretation is similar to the Z-test.
  • Chi-square test : Used for tests of independence or goodness-of-fit. A small p-value indicates that there is a significant association between the categorical variables, leading to the rejection of the null hypothesis.
  • F-test : Commonly used in Analysis of Variance (ANOVA) to compare variances between groups. A small p-value suggests that at least one group mean is different from the others, leading to the rejection of the null hypothesis.
  • Correlation test : Measures the strength and direction of a linear relationship between two continuous variables. A small p-value indicates that there is a significant linear relationship between the variables, leading to rejection of the null hypothesis that there is no correlation.

In general, a small p-value indicates that the observed data is unlikely to have occurred by random chance alone, which leads to the rejection of the null hypothesis. However, it’s crucial to choose the appropriate test based on the nature of the data and the research question, as well as to interpret the p-value in the context of the specific test being used.

The possible outcomes of a hypothesis test, and the errors that can occur, are summarized below.

  • Null hypothesis is true and we fail to reject it: correct decision based on the given p-value
  • Null hypothesis is true and we reject it: Type I error
  • Null hypothesis is false and we fail to reject it: Type II error
  • Null hypothesis is false and we reject it: correct decision based on the given p-value

Type I error: Incorrect rejection of the null hypothesis. Its probability is denoted by α (the significance level). Type II error: Incorrect failure to reject the null hypothesis. Its probability is denoted by β; the power of the test is 1 - β.

Let’s consider an example to illustrate the process of calculating a p-value for Two Sample T-Test:

A researcher wants to investigate whether there is a significant difference in mean height between males and females in a population of university students.

Suppose we have the following data:

  • \overline{x_1} = 175, s_1 = 5, n_1 = 30 (first sample)
  • \overline{x_2} = 168, s_2 = 6, n_2 = 35 (second sample)

The steps for calculating the p-value are as follows.

Step 1 : Formulate the Null Hypothesis (H0):

H0: There is no significant difference in mean height between males and females.

Step 2 : Choose an Alternative Hypothesis (H1):

H1: There is a significant difference in mean height between males and females.

Step 3 : Determine the Test Statistic:

The appropriate test statistic for this scenario is the two-sample t-test, which compares the means of two independent groups.

The t-statistic is a measure of the difference between the means of two groups relative to the variability within each group. It is calculated as the difference between the sample means divided by the standard error of the difference. It is also known as the t-value or t-score.

t = \frac{\overline{x_1} - \overline{x_2}}{ \sqrt{\frac{(s_1)^2}{n_1} + \frac{(s_2)^2}{n_2}}}

  • s1 = First sample’s standard deviation
  • s2 = Second sample’s standard deviation
  • n1 = First sample’s sample size
  • n2 = Second sample’s sample size

\begin{aligned}t &= \frac{175 - 168}{\sqrt{\frac{5^2}{30} + \frac{6^2}{35}}}\\&= \frac{7}{\sqrt{0.8333 + 1.0286}}\\&= \frac{7}{\sqrt{1.8619}}\\& \approx  \frac{7}{1.364}\\& \approx 5.13\end{aligned}

So, the calculated two-sample t-test statistic (t) is approximately 5.13.
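
The arithmetic above can be checked from the summary statistics alone. This is a sketch using `scipy.stats.ttest_ind_from_stats`; passing `equal_var=False` is a choice made here so that the standard error matches the unpooled formula used in this step.

```python
from scipy.stats import ttest_ind_from_stats

# equal_var=False uses the unpooled standard error, matching the formula above
result = ttest_ind_from_stats(
    mean1=175, std1=5, nobs1=30,
    mean2=168, std2=6, nobs2=35,
    equal_var=False,
)
t_stat = result.statistic
print(f"t = {t_stat:.2f}")
```

The statistic agrees with the hand calculation of roughly 5.13.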

Step 4 : Identify the Distribution of the Test Statistic:

The t-distribution is used for the two-sample t-test. The degrees of freedom for the t-distribution are determined by the sample sizes of the two groups.

The t-distribution is a probability distribution with tails that are thicker than those of the normal distribution.

df = (n_1+n_2)-2

  • where, n1 is total number of values for 1st category.
  • n2 is total number of values for 2nd category.

df= (30+35)-2=63

The degrees of freedom (63) reflect the amount of independent information available to estimate the population variance. Higher degrees of freedom give a more precise estimate, which affects the shape of the t-distribution. Note that df = n_1 + n_2 - 2 is the formula for the pooled (equal-variance) t-test; it is used here as a simple approximation alongside the unpooled standard error from Step 3.

The t-distribution is symmetric and bell-shaped, similar to the normal distribution. As the degrees of freedom increase, the t-distribution approaches the shape of the standard normal distribution. Practically, it affects the critical values used to determine statistical significance and confidence intervals.

Step 5 : Calculate Critical Value.

To find the critical t-value for a chosen significance level and 63 degrees of freedom, we can either consult a t-table or use statistical software. The critical value is then compared with the t-statistic of 5.13.

We can use the scipy.stats module in Python to find the critical t-value.
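
A minimal sketch of that lookup follows. It assumes a two-sided test at a significance level of 0.05, which the text does not state explicitly at this point, and it also computes the corresponding p-value for comparison.

```python
from scipy import stats

alpha = 0.05        # significance level (assumed; two-sided test)
df = 63             # degrees of freedom from Step 4
t_stat = 5.13       # test statistic from Step 3

# Two-sided critical value: the point leaving alpha/2 in the upper tail
t_crit = stats.t.ppf(1 - alpha / 2, df)

# Two-sided p-value: probability of |T| >= |t_stat| under H0
p_value = 2 * stats.t.sf(abs(t_stat), df)

print(f"critical t-value: {t_crit:.3f}")
print(f"p-value: {p_value:.2e}")
print("reject H0" if abs(t_stat) > t_crit else "fail to reject H0")
```

The critical value comes out close to 2.0, so both routes (critical value and p-value) lead to the same decision.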

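Comparing with T-Statistic:

For a two-sided test at α = 0.05 with 63 degrees of freedom, the critical t-value is approximately 2.0, which is well below the t-statistic of 5.13. The larger t-statistic suggests that the observed difference between the sample means is unlikely to have occurred by random chance alone. Therefore, we reject the null hypothesis.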


How to interpret p-value

  • p ≤ (α = 0.05) : Reject the null hypothesis. There is sufficient evidence to conclude that the observed effect or relationship is statistically significant, meaning it is unlikely to have occurred by chance alone.
  • p > (α = 0.05) : Fail to reject the null hypothesis. The observed effect or relationship does not provide enough evidence to reject the null hypothesis. This does not necessarily mean there is no effect; it simply means the sample data does not provide strong enough evidence to rule out the possibility that the effect is due to chance.

In case the significance level is not specified, consider the below general inferences while interpreting your results.

  • If p > .10: not significant
  • If p ≤ .10: slightly significant
  • If p ≤ .05: significant
  • If p ≤ .001: highly significant

Graphically, the p-value is the area in the tails of the sampling distribution beyond the observed test statistic. [As shown in fig 1]

Fig 1: Graphical Representation

What influences p-value?

The p-value in hypothesis testing is influenced by several factors:

  • Sample Size : Larger sample sizes tend to yield smaller p-values, increasing the likelihood of detecting significant effects.
  • Effect Size: A larger effect size results in smaller p-values, making it easier to detect a significant relationship.
  • Variability in the Data : Greater variability often leads to larger p-values, making it harder to identify significant effects.
  • Significance Level : The chosen significance level does not change the p-value itself, but a lower level makes the threshold for calling a p-value significant stricter.
  • Choice of Test: Different statistical tests may yield different p-values for the same data.
  • Assumptions of the Test : Violations of test assumptions can impact p-values.

Understanding these factors is crucial for interpreting p-values accurately and making informed decisions in hypothesis testing.
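
The effect of sample size can be sketched by reusing the summary statistics of the height example and changing only the number of observations. The helper name `welch_p` and the small sample sizes of 5 per group are hypothetical, introduced only for this comparison.

```python
from scipy.stats import ttest_ind_from_stats

def welch_p(n1, n2):
    """Two-sided p-value for the height example at given sample sizes."""
    return ttest_ind_from_stats(
        mean1=175, std1=5, nobs1=n1,
        mean2=168, std2=6, nobs2=n2,
        equal_var=False,
    ).pvalue

p_small = welch_p(5, 5)      # hypothetical small study
p_large = welch_p(30, 35)    # sample sizes from the worked example
print(f"n=5,5:   p = {p_small:.4f}")
print(f"n=30,35: p = {p_large:.2e}")
```

The same difference in means and the same spread give a much smaller p-value once the samples are larger.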

Significance of P-value

  • The p-value provides a quantitative measure of the strength of the evidence against the null hypothesis.
  • Decision-making in hypothesis testing: the p-value serves as a guide for interpreting the results of a statistical test. A small p-value suggests that the observed effect or relationship is statistically significant, but it does not necessarily mean that it is practically or clinically meaningful.

Limitations of P-value

  • The p-value is not a direct measure of the effect size, which represents the magnitude of the observed relationship or difference between variables. A small p-value does not necessarily mean that the effect size is large or practically meaningful.
  • The p-value is influenced by various factors, such as sample size and variability in the data, so it should not be interpreted in isolation.

The p-value is a crucial concept in statistical hypothesis testing, serving as a guide for making decisions about the significance of the observed relationship or effect between variables.

Let’s consider a scenario where a tutor believes that the average exam score of their students is equal to the national average (85). The tutor collects a sample of exam scores from their students and performs a one-sample t-test to compare it to the population mean (85).

  • The code performs a one-sample t-test to compare the mean of a sample data set to a hypothesized population mean.
  • It utilizes the scipy.stats library to calculate the t-statistic and p-value. SciPy is a Python library that provides efficient numerical routines for scientific computing.
  • The p-value is compared to a significance level (alpha) to determine whether to reject the null hypothesis.
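
A sketch of such a test is shown here. The exam scores in `sample_scores` are an assumption, since the text does not list the tutor's data; they were chosen to be consistent with a t-statistic of about -0.39 and a p-value of about 0.706.

```python
from scipy import stats

# Assumed sample of exam scores (not given in the text above)
sample_scores = [78, 82, 88, 95, 79, 92, 85, 88, 75, 80]
population_mean = 85   # national average from the scenario
alpha = 0.05           # significance level

# One-sample t-test of H0: mean score equals the national average
t_stat, p_value = stats.ttest_1samp(sample_scores, population_mean)

print("t-statistic:", t_stat)
print("p-value:", p_value)
if p_value < alpha:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
```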

Since 0.7059 > 0.05, we fail to reject the null hypothesis. This means that, based on the sample data, there isn't enough evidence to claim a significant difference in the exam scores of the tutor's students compared to the national average. The tutor would retain the null hypothesis: the average exam score of their students is statistically consistent with the national average.

  • Feature selection (forward selection and backward elimination): When fitting a model (say a Multiple Linear Regression model), we use the p-value to find the variables that contribute most significantly to predicting the output.
  • Effects of drugs: It is widely used in medical research to determine whether the constituents of a drug have the desired effect on humans.

The p-value is a useful statistical tool in hypothesis testing. It informs important decisions such as making a business intelligence inference or determining whether a drug should be used on humans.

The p-value is a crucial concept in statistical hypothesis testing, providing a quantitative measure of the strength of evidence against the null hypothesis. It guides decision-making by comparing the p-value to a chosen significance level, typically 0.05. A small p-value indicates strong evidence against the null hypothesis, suggesting a statistically significant relationship or effect. However, the p-value is influenced by various factors and should be interpreted alongside other considerations, such as effect size and context.

Frequently Asked Questions (FAQs)

Can a p-value be greater than 1?

A p-value is a probability, and probabilities must be between 0 and 1. Therefore, a p-value greater than 1 is not possible.

What does p = 0.01 mean?

It means that the observed test statistic is unlikely to occur by chance if the null hypothesis is true: there is a 1% chance of observing the test statistic or a more extreme one under the null hypothesis.

Is 0.9 a good p-value?

A p-value of 0.9 is far above the usual 0.05 threshold, so it provides no evidence against the null hypothesis. A p-value less than or equal to 0.05 is typically taken to indicate that the observed relationship or effect is statistically significant.

What is p-value in a model?

It is a measure of the statistical significance of a parameter in the model. It represents the probability of obtaining the observed value of the parameter or a more extreme one, assuming the null hypothesis is true.

Why is p-value so low?

A low p-value means that the observed test statistic is unlikely to occur by chance if the null hypothesis is true. It suggests that the observed relationship or effect is statistically significant and not due to random sampling variation.

How Can You Use P-value to Compare Two Different Results of a Hypothesis Test?

Compare p-values: Lower p-value indicates stronger evidence against null hypothesis, favoring results with smaller p-values in hypothesis testing.
