Breakdown of NBA Win Probability Predictor
Here's a breakdown of how the NBA Live Win Probability Predictor works. You can find the code here @app.py or visit the website here.
To populate the home page, I send a request to the NBA website for the JSON data on tonight's games, store each game in a list, and pass that list to index.html for rendering.
@app.route('/')
def home():
    # Get the data from the NBA's JSON feed
    url = 'https://data.nba.com/data/5s/v2015/json/mobile_teams/nba/2019/scores/00_todays_scores.json'
    response = requests.get(url)
    data = response.json()
    # Store the data for each game in a list of tuples
    games = data['gs']['g']
    game_data = []
    for i, g in enumerate(games):
        game_data.append((g["stt"], i, g["cl"], g["v"]["s"], g["h"]["s"], g["v"]["ta"], g["h"]["ta"]))
    return flask.render_template('index.html', game_data=game_data)
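The flattening step in the loop above can be tried without hitting the network. Here's a minimal sketch, assuming a payload shaped like the gs/g structure that home() reads. The helper name extract_games and the sample values are hypothetical and are not part of app.py.

```python
def extract_games(data):
    """Flatten the scoreboard payload into one tuple per game.

    Tuple layout matches the one built in home():
    (status, index, clock, visitor_score, home_score, visitor_abbr, home_abbr)
    """
    return [
        (g["stt"], i, g["cl"], g["v"]["s"], g["h"]["s"], g["v"]["ta"], g["h"]["ta"])
        for i, g in enumerate(data["gs"]["g"])
    ]


# Hypothetical sample payload with made-up values, for illustration only.
sample = {
    "gs": {
        "g": [
            {"stt": "7:30 pm ET", "cl": None,
             "v": {"s": 0, "ta": "BOS"}, "h": {"s": 0, "ta": "LAL"}},
            {"stt": "2nd Qtr", "cl": "05:32.0",
             "v": {"s": 41, "ta": "MIA"}, "h": {"s": 45, "ta": "NYK"}},
        ]
    }
}

games = extract_games(sample)
print(games[1])
```

The index stored in each tuple is what the template later uses to build the /results/<id> link, so the order of the list matters.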
My index.html looks like this:
<div class="row">
{% for time,id,time_remaining,visitor_score,home_score,visitor,home in game_data %}
<div class="card text-center col-sm-6" onclick="location.href = '/results/{{ id }}';" style="padding: 10px;">
<div class="card-header">
<h3 class="card-title" id="game_status_{{ id }}">Game Status: {{ time }}</h3>
<h3 class="card-title" id="game_time_{{ id }}">Time Remaining: {{ time_remaining }}</h3>
</div>
<div class="card-body">
<h4 class="card-text">
<img src="{{url_for('static', filename='images/'+ visitor|string + '_logo.svg')}}" alt="Team Logo" height="80" width="80">
VS.
<img src="{{url_for('static', filename='images/'+ home|string + '_logo.svg')}}" alt="Team Logo" height="80" width="80">
</h4>
<h4 class="card-text" id="game_score_{{ id }}">
Score: {{ visitor_score }} - {{ home_score }}
</h4>
</div>
</div>
{% endfor %}
</div>
As for the JavaScript I've written for index.html, it sends a GET request to an API on my server. The API returns a JSON string, which is parsed into an array containing the data needed to update index.html. The setTimeout(update_data, 10000); call at the end of update_data schedules the function to run again after 10 seconds, so the page refreshes every 10 seconds.
function update_data(){
var url = '/get_data';
var xhReq = new XMLHttpRequest();
xhReq.open("GET", url, false);
xhReq.send(null);
var data = JSON.parse(xhReq.responseText);
for (var i = 0; i < data.length; i++){
if(data[i][0].toString().includes("ET")){
document.getElementById("game_status_"+i.toString()).innerHTML = "Game Time: " + data[i][0].toString();
}else{
document.getElementById("game_status_"+i.toString()).innerHTML = "Game Status: " + data[i][0].toString();
};
if(data[i][2] == null){
document.getElementById("game_time_"+i.toString()).style.display = "none"
}else{
document.getElementById("game_time_"+i.toString()).innerHTML = "Time Remaining: " + data[i][2];
};
document.getElementById("game_score_"+i.toString()).innerHTML = "Score: " + data[i][3].toString() + " - " + data[i][4].toString();
}
setTimeout(update_data, 10000);
};
update_data();
json.dumps() serializes the list into a JSON string so that JavaScript can parse it into an array. Here's the code that exposes the API from the Flask app:
# API to change home page live
@app.route('/get_data', methods=['GET'])
def get_json_data():
    # Get the data from the NBA's JSON feed
    url = 'https://data.nba.com/data/5s/v2015/json/mobile_teams/nba/2019/scores/00_todays_scores.json'
    response = requests.get(url)
    data = response.json()
    # Store the data for each game in a list of tuples
    games = data['gs']['g']
    game_data = []
    for i, g in enumerate(games):
        game_data.append((g["stt"], i, g["cl"], g["v"]["s"], g["h"]["s"], g["v"]["ta"], g["h"]["ta"]))
    # Serialize the list to a JSON string for the browser
    json_data = json.dumps(game_data, ensure_ascii=False)
    return json_data
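To see what the browser actually receives, here's a small sketch of the serialization step on its own. The row is made-up sample data in the same layout as game_data, not real output from the NBA feed.

```python
import json

# Hypothetical row in the same layout as game_data; the values are made up.
game_data = [("7:30 pm ET", 0, None, 0, 0, "BOS", "LAL")]

payload = json.dumps(game_data, ensure_ascii=False)
print(payload)  # [["7:30 pm ET", 0, null, 0, 0, "BOS", "LAL"]]

# Round trip: tuples come back as lists, and null comes back as None.
restored = json.loads(payload)
print(restored[0][2] is None)  # True
```

Python tuples become JSON arrays and None becomes null, which is why the JavaScript can index into data[i] and compare data[i][2] against null.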
As for the steps to generate predictions, here's the breakdown:
The <int:game_id> part of the route identifies which game to predict. Flask converts it to an integer and passes it into the function as game_id, which is the game's index in the list of tonight's games. The regex matches one or two digits, a ":", and two more digits, which extracts the minutes and seconds from the game clock. The data obtained is used to populate the predictions page, and then JavaScript updates the page the same way it updates index.html.
# Prediction page
@app.route('/results/<int:game_id>')
def result(game_id):
    url = 'https://data.nba.com/data/5s/v2015/json/mobile_teams/nba/2019/scores/00_todays_scores.json'
    response = requests.get(url)
    data = response.json()
    # Store the data for the selected game in variables
    game = data['gs']['g'][game_id]
    game_status = game["stt"]
    time_remaining = game["cl"]
    if time_remaining is None:
        time_remaining = "None"
    else:
        # Keep only the MM:SS part of the clock
        time_remaining = re.findall(r'[0-9]{1,2}:[0-9]{2}', time_remaining)[0]
    visitor_score = game["v"]["s"]
    home_score = game["h"]["s"]
    visitor = game["v"]["ta"]
    home = game["h"]["ta"]
    game_data = []
    games = data['gs']['g']
    for i, g in enumerate(games):
        game_data.append((g["stt"], i, g["cl"], g["v"]["s"], g["h"]["s"], g["v"]["ta"], g["h"]["ta"]))
    return flask.render_template('game_prediction.html', game_id=game_id, game_status=game_status, time_remaining=time_remaining,
                                 visitor_score=visitor_score, home_score=home_score, visitor=visitor, home=home, game_data=game_data)
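Here's a quick sketch of what the clock regex does on its own. The helper parse_clock is hypothetical (it is not in app.py), and the sample clock strings are assumptions about what the feed can send. Unlike the [0] indexing above, it falls back to a default when nothing matches instead of raising an IndexError.

```python
import re

# Same pattern as in result(): one or two digits, a colon, then two digits.
CLOCK_PATTERN = re.compile(r"[0-9]{1,2}:[0-9]{2}")


def parse_clock(clock, default="00:00"):
    """Return the MM:SS part of a clock string, or default if none is found."""
    if clock is None:
        return default
    match = CLOCK_PATTERN.search(clock)
    return match.group(0) if match else default


print(parse_clock("05:32.0"))  # 05:32
print(parse_clock("9:07"))     # 9:07
print(parse_clock(None))       # 00:00
print(parse_clock(""))         # 00:00
```

The default is a parameter because the two routes in this post use different placeholders for a missing clock: "None" on the predictions page and "00:00" in the predictions API.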
The JavaScript is included directly in game_prediction.html because Jinja2 only fills in references to the objects passed from the function above inside the template it renders. Those references will not work if the JavaScript lives in a separate file that is served as-is. Here's the code for plotting the graph (I used Google Charts to plot the graph):
// for plotting graph
google.charts.load('current', {'packages':['line']});
google.charts.setOnLoadCallback(drawChart);
// used to store data for drawChart() function to plot
var tableData = [[0, 50, 50]]
function drawChart() {
var data = new google.visualization.DataTable();
data.addColumn('number', 'Minutes Played');
data.addColumn('number', '{{ home }} Win Probability');
data.addColumn('number', '{{ visitor }} Win Probability');
data.addRows(
tableData
);
var options = {
chart: {
title: '{{ home }} vs {{ visitor }}',
subtitle: 'Win Probability'
},
width: 900,
height: 500,
hAxis: {
title: 'Minutes Played',
ticks: [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
},
vAxis: {
title: 'Win Probability',
ticks: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
},
colors: ['#a52714', '#097138']
};
//Reference to where you want the graph. In my case: <div id="chart_div"></div>
var chart = new google.charts.Line(document.getElementById('chart_div'));
chart.draw(data, options);
var url = '/get_predictions/{{game_id}}';
var xhReq = new XMLHttpRequest();
xhReq.open("GET", url, false);
xhReq.send(null);
var json_data = JSON.parse(xhReq.responseText);
//if the game hasn't started, change game status to game time
if(json_data[0][0].toString().includes("ET")){
document.getElementById("game_status").innerHTML = "Game Time: " + json_data[0][0].toString();
document.getElementById("home_win_percentage").innerHTML = "{{home}}" + " Win Percentage: 50%";
document.getElementById("visitor_win_percentage").innerHTML = "{{visitor}}" + " Win Percentage: 50%";
}else{
document.getElementById("game_status").innerHTML = "Game Status: " + json_data[0][0].toString();
document.getElementById("home_win_percentage").innerHTML = "{{home}}" + " Win Percentage: " + json_data[0][7].toString() + "%";
document.getElementById("visitor_win_percentage").innerHTML = "{{visitor}}" + " Win Percentage: " + json_data[0][8].toString() +"%";
};
if(json_data[0][1] == null){
document.getElementById("game_time").style.display = "none";
}else{
document.getElementById("game_time").innerHTML = "Time Remaining: " + json_data[0][1].toString();
};
document.getElementById("game_score").innerHTML = "Score: " + json_data[0][2].toString() + " - " + json_data[0][3].toString();
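// Only add a data point when the clock is available and does not contain "out".
// Calling toString() on a null clock would throw, and an empty row would not match the three chart columns.
if (json_data[0][1] != null && !json_data[0][1].toString().includes("out")){
tableData.push([ json_data[0][6], json_data[0][7], json_data[0][8]])
};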
setTimeout(drawChart, 10000);
};
The line var url = '/get_predictions/{{game_id}}'; serves the same purpose as before: it calls an API, except this time the response also includes the win probability predictions. The code and a further breakdown are below.
To use your model for prediction, you have to include these:
from tensorflow.keras.models import load_model
model = load_model('model.h5')
In my case, the trained neural network is saved as model.h5. Here's the API that uses it:
# API to update game data and predictions
@app.route('/get_predictions/<int:game_id>', methods=['GET'])
def get_win_percentage(game_id):
    # Get the data from the NBA's JSON feed
    url = 'https://data.nba.com/data/5s/v2015/json/mobile_teams/nba/2019/scores/00_todays_scores.json'
    response = requests.get(url)
    data = response.json()
    # Store the data for the selected game in variables
    game = data['gs']['g'][game_id]
    game_data = []
    game_status = game["stt"]
    time_remaining = game["cl"]
    # Sometimes the game time (game["cl"]) is not available, so I used an if/else to check for it
    if game["cl"] is None:
        time_remaining = "00:00"
    else:
        time_remaining = re.findall(r'[0-9]{1,2}:[0-9]{2}', time_remaining)[0]
    visitor_score = int(game["v"]["s"])
    home_score = int(game["h"]["s"])
    # These if statements are the same ones I used to extract the time played for each quarter when scraping the data:
    # minutes at the end of the current period minus the time remaining on the clock
    if "1st Qtr" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,12) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "2nd Qtr" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,24) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "3rd Qtr" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,36) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "4th Qtr" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,48) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "OT 1" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,53) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "OT 2" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,58) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "OT 3" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,1,3) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "OT 4" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,1,8) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "OT 5" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,1,13) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif game_status == "Halftime":
        time_played = np.round((DT.datetime(1900,1,1,0,24) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    else:
        time_played = 0
    # One row of features: minutes played, home score, visitor score
    prediction_data = np.array([[time_played, home_score, visitor_score]])
    # Cast to a built-in float so json.dumps can serialize it (a NumPy float32 is not JSON serializable)
    home_win_percentage = round(float(model.predict(prediction_data)[0][0]) * 100, 2)
    visitor_win_percentage = round(100 - home_win_percentage, 2)
    game_data.append((game["stt"], game["cl"], game["v"]["s"], game["h"]["s"], game["v"]["ta"], game["h"]["ta"], time_played, home_win_percentage, visitor_win_percentage))
    json_data = json.dumps(game_data, ensure_ascii=False)
    return json_data
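The if/elif chain above always does the same arithmetic: minutes at the end of the current period minus the time left on the clock. Here's a minimal sketch of that calculation as a lookup table. The names PERIOD_END_MINUTES and minutes_played are hypothetical and not part of app.py, and this version uses only the standard library instead of NumPy. One small difference: it matches "Halftime" as a substring, whereas the code above compares it for equality.

```python
from datetime import datetime

# Minutes elapsed by the END of each period:
# 12-minute quarters, then 5-minute overtimes.
PERIOD_END_MINUTES = {
    "1st Qtr": 12, "2nd Qtr": 24, "3rd Qtr": 36, "4th Qtr": 48,
    "OT 1": 53, "OT 2": 58, "OT 3": 63, "OT 4": 68, "OT 5": 73,
    "Halftime": 24,
}


def minutes_played(game_status, time_remaining):
    """Return minutes played so far, given the status text and an MM:SS clock."""
    for label, end_minute in PERIOD_END_MINUTES.items():
        if label in game_status:
            clock = datetime.strptime(time_remaining, "%M:%S")
            remaining = clock.minute + clock.second / 60
            return round(end_minute - remaining, 2)
    # Game not started, finished, or an unrecognized status
    return 0


print(minutes_played("2nd Qtr", "05:30"))  # 18.5
print(minutes_played("OT 1", "02:00"))     # 51.0
print(minutes_played("Final", "00:00"))    # 0
```

Adding another overtime period then means adding one entry to the dictionary rather than another elif branch.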
Conclusion
While doing this project, I tried to use the Celery package to run a background process (to update the frontend), but it didn't work the way I wanted. I also tried sending a GET request to data.nba.com directly from JavaScript, but that failed with a CORS (cross-origin resource sharing) error because data.nba.com is a different origin from my server. The solution is to build your own API on the server and then call it from JavaScript.
Some future improvements to the app would be to include team Elo ratings and to find a way to account for which players will play in a particular game, because a superstar sitting out for the night will affect the predictions greatly!