Breakdown of NBA Win Probability Predictor

Here's a breakdown of how the NBA Live Win Probability Predictor works. You can find the code in app.py or visit the website.

First, to populate the home page, I send a request to the NBA website for JSON data on the games happening tonight. I then store one entry per game in a list and pass it to index.html to render.

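# Imports used by the snippets in this post
import datetime as DT
import json
import re

import flask
import numpy as np
import requests

app = flask.Flask(__name__)

@app.route('/')
def home():
    # Get tonight's scoreboard from the NBA's JSON feed
    url = 'https://data.nba.com/data/5s/v2015/json/mobile_teams/nba/2019/scores/00_todays_scores.json'
    response = requests.get(url)
    data = response.json()
    # Keep only the fields the home page needs, one tuple per game
    games = data['gs']['g']
    game_data = []

    for i, g in enumerate(games):
        # (status, index, clock, visitor score, home score, visitor, home)
        game_data.append((g["stt"], i, g["cl"], g["v"]["s"], g["h"]["s"], g["v"]["ta"], g["h"]["ta"]))

    return flask.render_template('index.html', game_data=game_data)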
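
The step that flattens the feed into one tuple per game is repeated in several routes, so it could be pulled into a helper. Here is a minimal sketch of that idea. The function name summarize_games and the sample payload are hypothetical, and it assumes the feed keeps the gs/g layout used above.

```python
def summarize_games(data):
    """Flatten the scoreboard JSON into one tuple per game.

    Tuple layout matches what index.html unpacks:
    (status, index, clock, visitor_score, home_score, visitor, home)
    """
    games = data["gs"]["g"]
    return [
        (g["stt"], i, g["cl"], g["v"]["s"], g["h"]["s"], g["v"]["ta"], g["h"]["ta"])
        for i, g in enumerate(games)
    ]


# Hypothetical payload with the same keys as the NBA feed (values are made up).
sample = {
    "gs": {
        "g": [
            {"stt": "7:30 pm ET", "cl": None,
             "v": {"s": 0, "ta": "BOS"}, "h": {"s": 0, "ta": "LAL"}},
            {"stt": "2nd Qtr", "cl": "05:43.0",
             "v": {"s": 41, "ta": "MIA"}, "h": {"s": 38, "ta": "NYK"}},
        ]
    }
}

game_data = summarize_games(sample)
print(game_data[1])
```

Each route could then call summarize_games(response.json()) instead of repeating the loop.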

My index.html looks like this:

<div class="row">
      {% for time,id,time_remaining,visitor_score,home_score,visitor,home in game_data %}
      <div class="card text-center col-sm-6" onclick="location.href = '/results/{{ id }}';" style="padding: 10px;">
        <div class="card-header">
          <h3 class="card-title" id="game_status_{{ id }}">Game Status: {{ time }}</h3>
          <h3 class="card-title" id="game_time_{{ id }}">Time Remaining: {{ time_remaining }}</h3>
        </div>
        <div class="card-body">
          <h4 class="card-text">
            <img src="{{url_for('static', filename='images/'+ visitor|string + '_logo.svg')}}" alt="Team Logo" height="80" width="80">
              VS.
            <img src="{{url_for('static', filename='images/'+ home|string + '_logo.svg')}}" alt="Team Logo" height="80" width="80">
          </h4>
          <h4 class="card-text" id="game_score_{{ id }}">
            Score: {{ visitor_score }} - {{ home_score }}
          </h4>
        </div>
      </div>
      {% endfor %}
    </div>

The JavaScript for index.html sends a GET request to an API on my server. The API returns a JSON string, which is parsed into an array holding the data needed to update index.html. The call setTimeout(update_data, 10000); at the end of the function makes update_data run again every 10 seconds.

function update_data(){
  var url = '/get_data';
  var xhReq = new XMLHttpRequest();
  // The third argument (false) makes the request synchronous
  xhReq.open("GET", url, false);
  xhReq.send(null);
  var data = JSON.parse(xhReq.responseText);

  // Declare i locally so it does not leak into the global scope
  for (var i = 0; i < data.length; i++){
    // A status containing "ET" is a start time, so the game hasn't started yet
    if(data[i][0].toString().includes("ET")){
      document.getElementById("game_status_"+i.toString()).innerHTML = "Game Time: " + data[i][0].toString();
    }else{
      document.getElementById("game_status_"+i.toString()).innerHTML = "Game Status: " + data[i][0].toString();
    }

    if(data[i][2] == null){
      document.getElementById("game_time_"+i.toString()).style.display = "none";
    }else{
      // Show the element again in case it was hidden on an earlier update
      document.getElementById("game_time_"+i.toString()).style.display = "";
      document.getElementById("game_time_"+i.toString()).innerHTML = "Time Remaining: " + data[i][2];
    }

    document.getElementById("game_score_"+i.toString()).innerHTML = "Score: " + data[i][3].toString() + " - " + data[i][4].toString();
  }
  // Run again in 10 seconds
  setTimeout(update_data, 10000);
}

update_data();

json.dumps() serializes the list into a string so that JavaScript can parse it into an array. Here's the code for the API in the Flask app:

# API to change home page live
@app.route('/get_data', methods=['GET'])
def get_json_data():
    # get the data from nba's json
    url = 'https://data.nba.com/data/5s/v2015/json/mobile_teams/nba/2019/scores/00_todays_scores.json'
    response = requests.get(url)
    data = response.json()
    #Store the data for each game into a variable
    games = data['gs']['g']
    game_data = []
    for i, g in enumerate(games):
        game_data.append((g["stt"], i, g["cl"], g["v"]["s"], g["h"]["s"], g["v"]["ta"], g["h"]["ta"]))
    json_data = json.dumps(game_data, ensure_ascii=False)
    return json_data
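
To make the serialization step concrete, here is a small sketch of what json.dumps() produces for one row of game_data. The row values are made up for illustration. Note that Python tuples become JSON arrays and None becomes null, which is why the JavaScript compares the clock value against null.

```python
import json

# Hypothetical row in the same shape as game_data (values are made up).
game_data = [("7:30 pm ET", 0, None, 0, 0, "BOS", "LAL")]

# Serialize the list of tuples into a JSON string for the browser
payload = json.dumps(game_data, ensure_ascii=False)
print(payload)

# Round trip: tuples come back as lists, null comes back as None
decoded = json.loads(payload)
print(decoded[0][2])
```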

As for the steps to generate predictions, here's the breakdown:

The <int:game_id> part of the route identifies which game to predict, and Flask passes it into the function as game_id. The regex matches one or two digits, a ":", and then two digits, which pulls the minutes-and-seconds clock out of the time string. The data obtained here populates the predictions page, and JavaScript then updates the page the same way it updates index.html.

# Prediction page
@app.route('/results/<int:game_id>')
def result(game_id):
    url = 'https://data.nba.com/data/5s/v2015/json/mobile_teams/nba/2019/scores/00_todays_scores.json'
    response = requests.get(url)
    data = response.json()
    #Store the data for each game into a variable
    game = data['gs']['g'][game_id]
    game_status = game["stt"]
    time_remaining = game["cl"]
    if time_remaining is None:
        time_remaining = "None"
    else:
        time_remaining = re.findall(r'[0-9]{1,2}:[0-9]{2}',time_remaining)[0]
    visitor_score = game["v"]["s"]
    home_score = game["h"]["s"]
    visitor = game["v"]["ta"]
    home = game["h"]["ta"]
    game_data = []

    games = data['gs']['g']
    for i, g in enumerate(games):
        game_data.append((g["stt"], i, g["cl"], g["v"]["s"], g["h"]["s"], g["v"]["ta"], g["h"]["ta"]))

    return flask.render_template('game_prediction.html', game_id=game_id, game_status=game_status, time_remaining=time_remaining,
    visitor_score=visitor_score, home_score=home_score, visitor=visitor, home=home, game_data=game_data)
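
The clock extraction can be tried on its own. Below is a minimal sketch using the same pattern. The helper name extract_clock and the sample strings are hypothetical, since the exact format of the feed's clock field is not shown here. Unlike re.findall(...)[0], this version falls back to a default instead of raising an IndexError when nothing matches.

```python
import re

# One or two digits, a colon, then exactly two digits (for example "5:43" or "05:43")
CLOCK_PATTERN = re.compile(r"[0-9]{1,2}:[0-9]{2}")


def extract_clock(raw_clock, default="None"):
    """Return the first MM:SS-like substring, or default if there is none."""
    if raw_clock is None:
        return default
    match = CLOCK_PATTERN.search(raw_clock)
    return match.group(0) if match else default


print(extract_clock("05:43.0"))
print(extract_clock(None))
```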

The JavaScript is written inline in game_prediction.html because Jinja2 only fills in references such as {{ home }} inside the template it renders. If the script lived in a separate file, the values passed from the function above would not be substituted. Here's the code for plotting the graph (I used Google Charts):

// for plotting graph
  google.charts.load('current', {'packages':['line']});
  google.charts.setOnLoadCallback(drawChart);

// used to store data for drawChart() function to plot
  var tableData = [[0, 50, 50]]
  function drawChart() {
    var data = new google.visualization.DataTable();
    data.addColumn('number', 'Minutes Played');
    data.addColumn('number', '{{ home }} Win Probability');
    data.addColumn('number', '{{ visitor }} Win Probability');
    data.addRows(
      tableData
    );
    var options = {
      chart: {
        title: '{{ home }} vs {{ visitor }}',
        subtitle: 'Win Probability'
      },
      width: 900,
      height: 500,
      hAxis: {
        title: 'Minutes Played',
        ticks: [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75]
      },
      vAxis: {
        title: 'Win Probability',
        ticks: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
      },
      colors: ['#a52714', '#097138']
    };
    //Reference to where you want the graph. In my case: <div id="chart_div"></div>
    var chart = new google.charts.Line(document.getElementById('chart_div'));
    chart.draw(data, options);
    var url = '/get_predictions/{{game_id}}';
    var xhReq = new XMLHttpRequest();
    xhReq.open("GET", url, false);
    xhReq.send(null);
    var json_data = JSON.parse(xhReq.responseText);

    // If the game hasn't started, show the game time instead of the game status
    if(json_data[0][0].toString().includes("ET")){
      document.getElementById("game_status").innerHTML = "Game Time: " + json_data[0][0].toString();
      document.getElementById("home_win_percentage").innerHTML = "{{home}}" + " Win Percentage: 50%";
      document.getElementById("visitor_win_percentage").innerHTML = "{{visitor}}" + " Win Percentage: 50%";
    }else{
      document.getElementById("game_status").innerHTML = "Game Status: " + json_data[0][0].toString();
      document.getElementById("home_win_percentage").innerHTML = "{{home}}" + " Win Percentage: " + json_data[0][7].toString() + "%";
      document.getElementById("visitor_win_percentage").innerHTML = "{{visitor}}" + " Win Percentage: " + json_data[0][8].toString() +"%";
    };

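    if(json_data[0][1] == null){
      document.getElementById("game_time").style.display = "none";
    }else{
      // Show the element again in case it was hidden on an earlier update
      document.getElementById("game_time").style.display = "";
      document.getElementById("game_time").innerHTML = "Time Remaining: " + json_data[0][1].toString();
    }

    document.getElementById("game_score").innerHTML = "Score: " + json_data[0][2].toString() + " - " + json_data[0][3].toString();
    // Skip the data point when the clock text contains "out".
    // The null check avoids calling toString() on a missing clock value.
    var clock_has_out = json_data[0][1] != null && json_data[0][1].toString().includes("out");
    if (!clock_has_out){
      tableData.push([ json_data[0][6], json_data[0][7], json_data[0][8]]);
    }
    // Redraw with fresh data in 10 seconds
    setTimeout(drawChart, 10000);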

  };

You may have noticed the line var url = '/get_predictions/{{game_id}}'; in the code above. It serves the same purpose as before, calling an API on the server, except that this time the response also includes the win probability predictions. The code and a further breakdown are below:

To use your model for prediction, you have to include these:

from tensorflow.keras.models import load_model
model = load_model('model.h5')

In my case, I have a trained neural network saved as model.h5.

# API to update game data and predictions
@app.route('/get_predictions/<int:game_id>', methods=['GET'])
def get_win_percentage(game_id):
    # get the data from nba's json
    url = 'https://data.nba.com/data/5s/v2015/json/mobile_teams/nba/2019/scores/00_todays_scores.json'
    response = requests.get(url)
    data = response.json()
    #Store the data for each game into a variable
    game = data['gs']['g'][game_id]
    game_data = []
    game_status = game["stt"]
    time_remaining = game["cl"]

    # Sometimes the game time (game["cl"]) is not available, so I use an if/else to check for it
    if game["cl"] is None:
        time_remaining = "00:00"
    else:
        time_remaining = re.findall(r'[0-9]{1,2}:[0-9]{2}',time_remaining)[0]
    visitor_score = int(game["v"]["s"])
    home_score = int(game["h"]["s"])

    # These if statements work out the time played, the same way I did for each quarter when scraping the training data
    if "1st Qtr" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,12) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "2nd Qtr" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,24) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "3rd Qtr" in game_status :
        time_played = np.round((DT.datetime(1900,1,1,0,36) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "4th Qtr" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,48) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "OT 1" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,53) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "OT 2" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,0,58) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "OT 3" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,1,3) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "OT 4" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,1,8) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif "OT 5" in game_status:
        time_played = np.round((DT.datetime(1900,1,1,1,13) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    elif game_status == "Halftime":
        time_played = np.round((DT.datetime(1900,1,1,0,24) - DT.datetime.strptime(time_remaining,'%M:%S')).total_seconds()/60,2)
    else:
        time_played = 0

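    # One sample with three features: time played, home score, visitor score
    prediction_data = np.array([[time_played, home_score, visitor_score]])
    # float() turns the NumPy scalar into a plain Python float, which json.dumps can always serialize
    home_win_percentage = round(float(model.predict(prediction_data)[0][0])*100,2)
    visitor_win_percentage = round(100-home_win_percentage,2)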

    game_data.append((game["stt"],game["cl"],game["v"]["s"],game["h"]["s"],game["v"]["ta"],game["h"]["ta"],time_played,home_win_percentage,visitor_win_percentage))
    json_data = json.dumps(game_data, ensure_ascii=False)
    return json_data
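
The chain of if statements above follows one rule: take the minute at which the current period ends and subtract the time remaining. Here is a compact sketch of the same arithmetic using only the standard library. The function name minutes_played and the lookup table are my own naming, and it assumes the status strings match the ones checked above.

```python
# Minute mark at which each period ends: 12-minute quarters, 5-minute overtimes.
PERIOD_END_MINUTE = {
    "1st Qtr": 12, "2nd Qtr": 24, "3rd Qtr": 36, "4th Qtr": 48,
    "OT 1": 53, "OT 2": 58, "OT 3": 63, "OT 4": 68, "OT 5": 73,
}


def minutes_played(game_status, time_remaining):
    """Return minutes elapsed in the game, rounded to 2 decimals.

    time_remaining is a "MM:SS" string, as produced by the regex step.
    """
    minutes, seconds = time_remaining.split(":")
    remaining = int(minutes) + int(seconds) / 60
    if game_status == "Halftime":
        return round(24 - remaining, 2)
    for label, end_minute in PERIOD_END_MINUTE.items():
        if label in game_status:
            return round(end_minute - remaining, 2)
    # Game not started yet (or an unrecognized status)
    return 0


# 5:30 left in the 2nd quarter: 24 - 5.5 = 18.5 minutes played
print(minutes_played("2nd Qtr", "05:30"))
```

For example, with 2:00 left in the first overtime the function returns 53 - 2 = 51 minutes.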

Conclusion

While working on this project, I tried using the Celery package to run a background process (to update the frontend), but it didn't work the way I wanted. I also tried sending a GET request to data.nba.com directly from JavaScript, but that fails with a CORS (Cross-Origin Resource Sharing) error because the request goes to a different origin than my server. The solution is to make your own API and then call it from JavaScript.

Future improvements to the app include adding team Elo ratings and finding a way to account for which players will play in a particular game, because a superstar sitting out for the night will affect the predictions greatly!