
Hi. I have a question about model-based predictions when the data only becomes available after the fact. Let me give you an example. I am trying to predict the result of a match (HOME, AWAY or DRAW) based on data such as the number of shots, ball possession, number of fouls, etc.

| TARGET | TEAM 1 | TEAM 2 | possession team 1 | possession team 2 | shots team 1 | shots team 2 | fouls team 1 | fouls team 2 |
|---|---|---|---|---|---|---|---|---|
| HOME | Arsenal | Chelsea | 60 | 40 | 12 | 8 | 5 | 7 |

Let's say I have already trained the model and I want to predict an upcoming match. The match is a few days away, but I want the model's prediction today. I understand that if the match had already taken place and I had the data, I could run it through the model and get a result. The goal, however, is for the model to predict what will happen before the match.

Is this possible at all? What are my options? Should I select only pre-match variables, for example form in the last game, the match referee, etc.? Or should I aggregate the variables and include average possession, average shots and average number of fouls from recent matches?


1 Answer


My recommendation:

Talk to a football fan, or think like one (I am obviously not that kind of person :)

Try to work out what an expert would use to predict the next game's result, then collect that data (and any other relevant data available) to feed into your model.

For example, the results of all previous matches between Arsenal and Chelsea might be valuable to your model. The most recent games each team played might also have an effect on the next match's result.
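A head-to-head record like the one mentioned above could be turned into a feature roughly as follows. This is only a minimal sketch: the dictionary keys (`home`, `away`, `target`), the `head_to_head` helper and the toy results are all made up for illustration, and `past_matches` is assumed to contain only matches played before the one you want to predict.

```python
from collections import Counter

def head_to_head(matches, team_a, team_b):
    """Count results of past meetings between two teams, seen from team_a's side."""
    record = Counter(win=0, draw=0, loss=0)
    for m in matches:
        if {m["home"], m["away"]} != {team_a, team_b}:
            continue  # not a meeting between these two teams
        if m["target"] == "DRAW":
            record["draw"] += 1
        elif (m["target"] == "HOME") == (m["home"] == team_a):
            record["win"] += 1   # team_a was the side that won
        else:
            record["loss"] += 1
    return dict(record)

# Made-up past results, for illustration only.
past_matches = [
    {"home": "Arsenal", "away": "Chelsea", "target": "HOME"},
    {"home": "Chelsea", "away": "Arsenal", "target": "DRAW"},
    {"home": "Chelsea", "away": "Arsenal", "target": "AWAY"},
    {"home": "Arsenal", "away": "Spurs", "target": "HOME"},
]

arsenal_record = head_to_head(past_matches, "Arsenal", "Chelsea")
chelsea_record = head_to_head(past_matches, "Chelsea", "Arsenal")
```

The three counts (or ratios derived from them) are known before kick-off, so they can be used both for training and for the upcoming match.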

As you suggested in your question, use pre-match variables to predict the next game's result.
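One way to get pre-match variables out of the statistics in the question is to average each team's possession, shots and fouls over its previous matches, which is the aggregation the question proposes. The sketch below assumes a hypothetical list-of-dicts layout with ISO date strings; the `build_prematch_features` helper, the `window` parameter and all numbers except the Arsenal vs Chelsea row are invented for illustration. Each match's own statistics are added to the history only after its features are built, so nothing from the match being predicted leaks into its inputs.

```python
from collections import defaultdict, deque

STATS = ("possession", "shots", "fouls")

def build_prematch_features(matches, window=3):
    """Attach each team's average stats over its previous `window` matches."""
    history = defaultdict(lambda: deque(maxlen=window))  # team -> recent stats
    rows = []
    for match in sorted(matches, key=lambda m: m["date"]):
        row = {"date": match["date"], "home": match["home"],
               "away": match["away"], "target": match.get("target")}
        for side in ("home", "away"):
            past = history[match[side]]
            for stat in STATS:
                values = [p[stat] for p in past]
                # None when the team has no earlier match to average over.
                row[f"{side}_avg_{stat}"] = sum(values) / len(values) if values else None
        rows.append(row)
        # Only played matches (those with a known result) update the history.
        if match.get("target") is not None:
            for side in ("home", "away"):
                history[match[side]].append({s: match[f"{side}_{s}"] for s in STATS})
    return rows

# Toy data: the first row is from the question, the rest is made up.
matches = [
    {"date": "2020-10-03", "home": "Arsenal", "away": "Chelsea", "target": "HOME",
     "home_possession": 60, "away_possession": 40, "home_shots": 12,
     "away_shots": 8, "home_fouls": 5, "away_fouls": 7},
    {"date": "2020-10-17", "home": "Chelsea", "away": "Arsenal", "target": "DRAW",
     "home_possession": 50, "away_possession": 50, "home_shots": 10,
     "away_shots": 10, "home_fouls": 9, "away_fouls": 11},
    {"date": "2020-10-31", "home": "Arsenal", "away": "Chelsea"},  # upcoming match
]

features = build_prematch_features(matches)
upcoming = features[-1]  # averages available today, before the match is played
```

The model would then be trained on the rows with a known `target` and asked to predict the row for the upcoming match, since both kinds of row contain only information available before kick-off.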

Another model could be:

You could take the features from your question for the first t minutes of the match and try to predict the result from them. For example, use the data from the first half of the match to predict the second half's result.

On the other hand, your current approach can still be useful for exploratory analysis, for example to see which features have the most impact on winning a game.
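As a very rough first look at that kind of exploratory question, you could compare the average value of a feature across the three outcomes. The sketch below is an assumption-laden illustration, not a proper importance measure: the `mean_by_outcome` helper, the `shot_diff` feature (team 1 shots minus team 2 shots) and the numbers are all made up.

```python
from collections import defaultdict
from statistics import mean

def mean_by_outcome(rows, feature):
    """Average one feature separately for each match outcome."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["target"]].append(row[feature])
    return {outcome: mean(values) for outcome, values in groups.items()}

# Made-up post-match rows, for illustration only.
rows = [
    {"target": "HOME", "shot_diff": 4},
    {"target": "HOME", "shot_diff": 2},
    {"target": "DRAW", "shot_diff": 0},
    {"target": "AWAY", "shot_diff": -3},
    {"target": "AWAY", "shot_diff": -1},
]

summary = mean_by_outcome(rows, "shot_diff")
```

A feature whose averages differ clearly between HOME, DRAW and AWAY is a candidate for closer study; a trained model's own importance scores would be the next step.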

Hope this helps. I look forward to seeing other answers and the results of your analysis.

ia
