So this is the goal of training. If the model works well on 20% of data, you can conclude it will also predict well on future data which comes with unknown values. The trained model on 80% is unaware of 20% test data, right? So, the values of 20% of data (such as ball possession, number of shots, corners, etc) are similar to future unknown data for your model. Therefore if it works well on that, you can claim it will work well in the future as well. This is the essential goal of machine learning.