4. Hypothesis Testing & Software Development
Objectives
Understand the concepts and processes of hypothesis testing.
Use hypothesis testing to evaluate and interpret the performance of machine learning models.
Understand the software development process and reproducibility.
Learn to use GitHub for version control and collaboration.
Expected time to complete: 3 hours
Machine learning is a powerful tool for decision making and scientific discovery. However, the results of machine learning models are often less straightforward to interpret than they seem, and we need to be careful when using them to make decisions or draw conclusions. For example, a model may have high accuracy without being useful: on a data set where 95% of the samples belong to one class, a model that always predicts that class reaches 95% accuracy.
Machine learning models are typically implemented in software, and different software development processes can lead to different results and costs. We therefore need to pay attention to both the development process and the reproducibility of the results. Moreover, software development today is often collaborative, so we need to learn how to work with others while keeping the software reproducible and maintainable.
In this chapter, we will learn about the concepts and processes of hypothesis testing and how to use hypothesis testing to evaluate and interpret the performance of machine learning models. We will also learn about the software development process and reproducibility, and how to use GitHub for version control and collaboration.
Process transparency: hypothesis testing
Starting point: One or multiple groups of data to make a decision or draw a conclusion
Define the null hypothesis and the alternative hypothesis
Compute a test statistic that summarises how far the observed data deviate from what the null hypothesis predicts
Compute a \(p\)-value: the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed
Decide whether to reject the null hypothesis by comparing the \(p\)-value with a significance level chosen in advance
End point: Report the decision or conclusion
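The steps above can be sketched in code. The example below is a minimal illustration, not a prescribed method: it compares two models on the same cross-validation folds with an exact paired sign-flip permutation test, chosen here because it needs only the Python standard library. The function name `paired_sign_flip_test`, the per-fold accuracies, and the significance level of 0.05 are all assumptions made up for illustration.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_test(scores_a, scores_b):
    """Exact two-sided paired permutation (sign-flip) test on the mean difference.

    Returns the observed mean difference (the test statistic) and the p-value.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired: one value per fold for each model")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = mean(diffs)

    # Under the null hypothesis the two models perform equally well, so each
    # paired difference is as likely to be positive as negative. Enumerate
    # every sign assignment and count those at least as extreme as observed.
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        permuted = mean(s * d for s, d in zip(signs, diffs))
        total += 1
        if abs(permuted) >= abs(observed) - 1e-12:  # tolerance for float error
            extreme += 1
    return observed, extreme / total


# Starting point: two groups of data (hypothetical accuracies on 8 shared folds).
model_a = [0.81, 0.79, 0.84, 0.80, 0.83, 0.82, 0.78, 0.85]
model_b = [0.78, 0.77, 0.80, 0.79, 0.80, 0.80, 0.77, 0.81]

# Null hypothesis: the mean accuracy difference is zero.
# Alternative hypothesis: the mean accuracy difference is not zero.
alpha = 0.05  # significance level, fixed before looking at the data

observed, p_value = paired_sign_flip_test(model_a, model_b)
reject_null = p_value < alpha

# End point: report the decision.
print(f"mean difference = {observed:.4f}, p-value = {p_value:.4f}")
print("reject the null hypothesis" if reject_null else "fail to reject the null hypothesis")
```

Enumerating every sign assignment is only feasible for a small number of pairs (here 2^8 = 256 combinations); with more pairs, a random sample of sign assignments or a parametric test such as a paired t-test would be the usual alternative.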
Process transparency: software development
Starting point: A problem to solve
Initiate the project with rationale, scope, and vision
Define the project’s objectives and requirements
Build the software
Test and evaluate the software in a controlled environment
Deploy the software in a production environment
End point: Deployed software working as expected
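As a small illustration of the "build" and "test in a controlled environment" steps, the sketch below pairs a function with an automated check that it behaves reproducibly. The names `split_indices` and `test_split_is_reproducible`, the default test fraction, and the seed values are hypothetical and chosen only for this example.

```python
import random


def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Return (train, test) index lists; the same seed always gives the same split."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = random.Random(seed)  # local generator: global random state is untouched
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def test_split_is_reproducible():
    """Check the requirements: a repeatable, complete, non-overlapping split."""
    first = split_indices(100, seed=42)
    second = split_indices(100, seed=42)
    assert first == second  # same seed, same result
    train, test = first
    assert len(test) == 20
    assert sorted(train + test) == list(range(100))  # no sample lost or duplicated


test_split_is_reproducible()
print("all checks passed")
```

Passing the seed as an explicit argument, rather than relying on global random state, is one simple way to make a result repeatable across runs and collaborators; a test like this can also be run automatically before deployment.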