First time here? Checkout the FAQ!
x
+1 vote
711 views
asked in Machine Learning by (180 points)  

So far, I have modeled on known historical data. What if there are variables known only after the fact?
Let me give you an example. I want to predict the outcome of the match, win, lose or draw. I use variables from previous games such as ball possession, number of shots, corners, etc. Let's say the Chelsea-Arsenal game is approaching Saturday. How am I supposed to build a model and predict the result if this data is not yet available for my event? What to do in such cases, is it possible to forecast such data?

 
  

1 Answer

0 votes
answered by (116k points)  
selected by
 
Best answer
Your answer is actually based on what we always do in machine learning. We collect datasets, split to training and testing set, we train using the training set, and evaluate performance based on the testing set.

Assume you have 100 matches with all statistics and parameters you want to use in the training (such as ball possession, number of shots, corners, etc). You can take 80 of these matches for training and the rest of 20 matches for evaluating the model you created based on 80% of data simply because you already know that "future" statistics and outcome to compare with the output of your model to check the performance.

I hope this answers your question.
commented by (180 points)  
Thank you for your response. I understand the part with data preparation and model evaluation. The question is how to use this model to predict unknown data?
commented by (116k points)  
So this is the goal of training. If the model works well on 20% of data, you can conclude it will also predict well on future data which comes with unknown values. The trained model on 80% is unaware of 20% test data, right? So, the values of 20% of data (such as ball possession, number of shots, corners, etc) are similar to future unknown data for your model. Therefore if it works well on that, you can claim it will work well in the future as well. This is the essential goal of machine learning.
...