The Cloud & Microsoft Azure, Part II

Front Matter

October 20th 2020 - Version 1.0.0

Contact Details

  • Dr. James Percival

  • Room 4.85 RSM building

  • email: j.percival@imperial.ac.uk

  • Teams: @Percival, James R in #ACSE1 or #General, or DM me.

Learning Objectives

By the end of this lecture you should:

  • Understand the basic concepts of HTTP communication and RESTful APIs.

  • Be able to code a simple app in Flask.

  • Be able to serve that app from Azure.

Azure Web services & Web Apps

One of the key questions with cloud services is which protocol to use to access them. For Azure services there are three major options:

  • Remote Desktop Protocol (RDP), to access Windows (and some Linux) virtual machines and use them in the same manner as a desktop.

  • Secure Shell (SSH), to access a terminal on VMs (or apps on Linux through X forwarding).

  • Hypertext Transfer Protocol (HTTP/HTTPS), to access services via the web, whether through a browser or another application.

Let's look further at how each of these works:

RDP

RDP should be familiar, either from your time with an Azure lab, or from the exercises yesterday. It allows a user to connect to a GUI on a remote machine.

SSH

SSH (the secure shell) should again be familiar to most of you from the exercises yesterday.

HTTP/HTML

HTTP & HTTPS (i.e. secure HTTP) addresses will be familiar to you from your experience on the web. They are an example of a uniform resource locator (URL), which takes the form

https://user:password@www.imperial.ac.uk:8000/example/example/example.html?val1=abc&val2=123.4

This address can be broken down into several sections:

Protocol

The leftmost part defines the protocol (think of it as an agreed language) being used. With HTTPS, encryption must be agreed between the user agent (e.g. the browser) and the server.

Authentication

Usernames and passwords can be provided in the URL, although this is not frequently done with HTTP (due to the relative lack of security in that service).

Server

The server name provides a human-readable mnemonic for the IP address of the remote server being connected to. It is looked up, working from right to left, by contacting helper “name servers” (aka DNS servers) to find the machine you request. In this case, a global (root) server is contacted to find a .uk DNS server, which points us towards the .ac.uk and then the imperial.ac.uk servers, which in turn point us at the web server dealing with requests to www.

Port number

Ports can be thought of as individual communication addresses on a single machine. Only one type of communication can happen on a given port at any one time, although multiple users can be served. Some protocols have standard ports which they default to if no specific port number is given (for example, 80 for HTTP and 443 for HTTPS).

End point

The portion from the end of the server name to the beginning of the parameters is passed to the remote application connected to the port to decide what response to give. For a simple static http server this might be a directory path to a specific file. For a dynamic server, this might be a more complicated incantation.

Parameters

The text beyond the ? consists of a set of parameters, encoded in a key=value format, which is again passed on to the server application to control its output.
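
The breakdown above can be reproduced with the standard library module urllib.parse. The following is a minimal sketch rather than a full URL validator: the function name describe_url and the DEFAULT_PORTS table are invented for illustration, and the URL is the example given earlier.

```python
from urllib.parse import urlsplit, parse_qs

URL = ("https://user:password@www.imperial.ac.uk:8000"
       "/example/example/example.html?val1=abc&val2=123.4")

# Standard ports assumed when a URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url):
    """Break a URL into the sections discussed in the text."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "server": parts.hostname,
        # Fall back to the protocol's standard port if none is given.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "end_point": parts.path,
        # parse_qs maps each key to a list, since keys may be repeated.
        "parameters": parse_qs(parts.query),
    }


if __name__ == "__main__":
    for key, value in describe_url(URL).items():
        print(f"{key}: {value}")
```

Note that every parameter value arrives as text, so a value such as 123.4 must be converted with float() by the receiving application if a number is wanted.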

RESTful APIs

You may remember the script to look up TfL line statuses (e.g. on the tube), which we introduced last week:

status.py:

from urllib.request import urlopen
import json
import argparse

parser = argparse.ArgumentParser()

parser.add_argument("mode", nargs='*',
                    help="transport modes to consider: eg. tube, bus or dlr.",
                    default=("tube", "overground"))
parser.add_argument("-l", "--lines", nargs='+',
                    help="specific lines/bus routes to list: eg. Circle, 73.")

args = parser.parse_args()

if args.lines:
    url = "https://api.tfl.gov.uk/line/%s/status"%','.join(args.lines)
else:
    url = "https://api.tfl.gov.uk/line/mode/%s/status"%','.join(args.mode)
    
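# Decode the response as UTF-8 (the usual encoding for JSON) before parsing.
status = json.loads(urlopen(url).read().decode('utf-8'))

# Map each line name to the description of its first reported status.
short_status = {s['name']: s['lineStatuses'][0]['statusSeverityDescription']
                for s in status}

for name, description in short_status.items():
    print('%s: %s' % (name, description))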

The information provided by the Transport For London Unified API is an example of a RESTful API. In general, these work by providing a set of http-based URLs (i.e., web addresses) which respond to requests by returning relevant database information. Fuller documentation is available at

For example, sending an http GET request to https://api.tfl.gov.uk/Occupancy/BikePoints/BikePoints_187 (e.g. by trying to open it in your browser) will receive a response like

[{"$type":"Tfl.Api.Presentation.Entities.BikePointOccupancy, Tfl.Api.Presentation.Entities","id":"BikePoints_187","name":"Queen's Gate (South), South Kensington","bikesCount":3,"emptyDocks":21,"totalDocks":25}]

This is an example of JSON, a data format derived from JavaScript, whose syntax is similar to that of Python dictionary and list literals. Other, slightly less common formats include XML, YAML and CSV.

The Python standard library module urllib.request can be used to handle transmitting requests and receiving responses, although third-party packages with more features are also available, and usually recommended when accessible (the most famous is probably the requests package). The json, xml and csv modules can also be used to provide basic data processing on responses although, for large data sets, a package such as pandas may be more appropriate.
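
As a small illustration of the processing step, the sketch below parses the sample bike point response shown above with the json module. The response is stored as a string so that no network connection is needed; with the live service the same text would come from urlopen(url).read(). The function name summarise is invented for this example.

```python
import json

# The sample response from the text, kept as a string so no network is needed.
RESPONSE_TEXT = '''[{"$type":"Tfl.Api.Presentation.Entities.BikePointOccupancy, Tfl.Api.Presentation.Entities","id":"BikePoints_187","name":"Queen's Gate (South), South Kensington","bikesCount":3,"emptyDocks":21,"totalDocks":25}]'''


def summarise(response_text):
    """Return one human-readable line per bike point in a response."""
    records = json.loads(response_text)  # a list of dictionaries
    return [f"{r['name']}: {r['bikesCount']} bikes, "
            f"{r['emptyDocks']} of {r['totalDocks']} docks empty"
            for r in records]


if __name__ == "__main__":
    for line in summarise(RESPONSE_TEXT):
        print(line)
```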

Connect to some other RESTful APIs.

Some examples include:

but there are many more out there.

More on JSON

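JSON data is very similar to Python script, with only a few key differences, along with some minor variations in terminology. For example, JSON writes true, false and null where Python writes True, False and None, and JSON calls its dictionaries “objects” and its lists “arrays”. The Python built-in json module can be used for automatic translation, and many third-party packages (such as the data processing package pandas, which you’ll learn about later in the week) can also read and write JSON.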
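
To make the differences concrete, here is a short round trip through the json module. The record itself is made up for illustration; the point is what changes between the Python object, the JSON text and the object read back.

```python
import json

# A made-up Python record using features JSON does not share exactly.
record = {"name": "Circle", "open": True, "delay": None,
          "stations": ("Paddington", "Victoria"), 73: "bus route"}

text = json.dumps(record)      # Python object -> JSON text
restored = json.loads(text)    # JSON text -> Python object

if __name__ == "__main__":
    print(text)
    print(restored)
    # True/None become true/null in the text, the tuple comes back as a
    # list, and the integer key 73 comes back as the string "73".
    print(restored == record)
```

The round trip is therefore not always exact, which is worth remembering when using JSON to save and reload Python data.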

Flask - Python Web Apps

There are now many frameworks for creating web services, ranging from the simple and lightweight to the complicated but powerful. We will introduce a Python framework called Flask, originally created as an April Fool’s joke, which is on the lightweight end of the spectrum, but makes it very easy to create single-file web apps driven by form processing.

A “Hello World” Flask program

As a Python package, we can install Flask using pip or conda using a command like:

pip install flask

With Flask installed, we can write a short example program and give it the “magic” name app.py.

app.py

from flask import Flask

app = Flask(__name__)


@app.route("/hello")
def root():
    return "<b>Hello</b> World!"

In this program we create a Flask application, and write a short function root, which we assign using a Python decorator to be called whenever an HTTP request is made to the end point /hello.

We can test run this on our local system with the command

flask run

inside the directory with the app.py file, which starts a web server on the local host on port 5000. We can then point our browser at the full URL <http://localhost:5000/hello> to see the final result.
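
The same pattern extends to URL parameters. The sketch below is one possible variation, not part of the original app: the end point /greet and the parameter called name are invented for illustration. It uses Flask's request object to read the key=value pairs after the ? in the URL, and Flask's built-in test client so that it can be tried without starting a server.

```python
from flask import Flask, request
from markupsafe import escape

app = Flask(__name__)


@app.route("/greet")
def greet():
    # request.args holds the key=value parameters after the "?" in the URL.
    name = request.args.get("name", "World")
    # escape() stops user-supplied text being interpreted as HTML.
    return f"<b>Hello</b> {escape(name)}!"


if __name__ == "__main__":
    # The test client sends requests to the app without a real web server.
    client = app.test_client()
    print(client.get("/greet?name=Ada").get_data(as_text=True))
```

Served with flask run, the same response would be available from a URL of the form http://localhost:5000/greet?name=Ada.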

Using Azure App Services to serve apps

Azure App Service delivers HTTP-based apps (including Flask-based ones) directly from a GitHub repository.

Exercise

Log in to the Azure portal and create a web app from some of your Flask code stored on GitHub.

Local Python GUIs

There exist a number of GUI toolkits compatible with Python, including Tk, GTK+ and Qt 5. We’ll give an example of the use of the last one, since it interacts well with Anaconda.

The following requires the qtpy package.

from qtpy import QtWidgets, QtCore
import sys


class MainWindow(QtWidgets.QMainWindow):

    def __init__(self, parent=None):

        super().__init__(parent)
        self.setWindowTitle("Hello world!")

        # A QMainWindow needs a central widget to hold the layout.
        widget = QtWidgets.QWidget()
        self.setCentralWidget(widget)

        # Attach the layout to the central widget, not to the main window.
        layout = QtWidgets.QVBoxLayout(widget)

        self.label = QtWidgets.QLabel("A qt GUI", self)
        self.label.setAlignment(QtCore.Qt.AlignCenter)
        layout.addWidget(self.label)

        self.greet_button = QtWidgets.QPushButton("Greet", self)
        self.greet_button.clicked.connect(self.greet)
        layout.addWidget(self.greet_button)

        self.close_button = QtWidgets.QPushButton("Close", self)
        self.close_button.clicked.connect(self.close)
        layout.addWidget(self.close_button)

    def greet(self, checked=False):
        # The clicked signal passes the button's "checked" state.
        print("Greetings!")


if __name__ == "__main__":
    app = QtWidgets.QApplication(sys.argv)
    win = MainWindow()
    win.show()
    sys.exit(app.exec_())

When run, this script creates a basic window with two buttons. The “Greet” button prints a greeting to your console; the “Close” button closes the window. Although small, this toy example demonstrates the use of Python to generate and control a widget, and can easily be extended.

Note that this code is written to work locally in a terminal. If you are attempting to run it in a Jupyter session then:

  1. The session will have to be running on a local system or one which you connect to via a windowing system (e.g. RDP, or with a suitable SSH connection with X forwarding).

  2. You will need to use the %gui qt IPython magic (or whichever is appropriate for your choice of GUI toolkit).

Security and the Cloud

Firewalls

In general, computers and services connected to the internet for a significant time should expect to be attacked by malicious users. Attackers may want to gain illicit access to the system to suborn it to their own purposes, or to deny it to others via denial of service attacks, launched either from a single location or from a distributed network. One protection against this is to use firewalls, which limit access to a system to requests coming from approved IP addresses.

Azure in particular provides controls on network interfaces to limit the ports and services which are available over the network. The default option (and the safest one) is usually to deny access unless it is specifically permitted.
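
The “deny unless specifically permitted” logic can be sketched in a few lines of Python using the standard library module ipaddress. This only illustrates the decision rule: a real firewall (including Azure's network controls) applies it at the network level, not in application code. The address ranges, port list and function name below are all hypothetical.

```python
import ipaddress

# Hypothetical allow-list, using documentation and private address ranges.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24"),
                    ipaddress.ip_network("10.0.0.0/8")]
# Hypothetical open ports: SSH and HTTPS only.
ALLOWED_PORTS = {22, 443}


def is_permitted(source_ip, port):
    """Default-deny: allow only listed ports from listed networks."""
    if port not in ALLOWED_PORTS:
        return False
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


if __name__ == "__main__":
    for ip, port in [("192.0.2.17", 443), ("192.0.2.17", 80),
                     ("203.0.113.5", 443)]:
        print(ip, port, is_permitted(ip, port))
```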

Authentication & Authorization

Single Sign On (SSO)

Understanding of how to deal with passwords has improved over the years, but it is still very easy to make a mistake. On the other hand, as a technically trained person it’s possible that it’s something you will one day be asked to organize (or manage). Current best practice is to follow at least the steps below:

  1. Use HTTPS for your initial communication.

  2. When a user picks a password, add a “salt” to it, and then apply a cryptographic hashing algorithm.

  3. Store the salt & hashed password along with your immutable user key (not necessarily username) as your password database. Forget the clear text password as soon as possible.

  4. When a user logs in (sending the clear text password), apply the same algorithm as in step 2, using the stored salt, and then compare the result with the stored hash.

  5. Regardless, secure your database and only grant access on a need to know basis.
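
Steps 2 to 4 above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not a production recipe: PBKDF2 with SHA-256 is just one suitable hashing algorithm, the iteration count is an assumed work factor, and the dictionary standing in for the password database and the user key user-0001 are invented for the example.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor; higher is slower but safer


def hash_password(password, salt=None):
    """Step 2: salt the password and apply a cryptographic hash."""
    if salt is None:
        salt = secrets.token_bytes(16)  # a fresh random salt per user
    hashed = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, ITERATIONS)
    return salt, hashed


def verify_password(password, salt, stored_hash):
    """Step 4: repeat the hash with the stored salt and compare."""
    _, hashed = hash_password(password, salt)
    # compare_digest takes the same time whether or not the values match.
    return hmac.compare_digest(hashed, stored_hash)


if __name__ == "__main__":
    # Step 3: store only the salt and hash against an immutable user key.
    database = {"user-0001": hash_password("correct horse battery staple")}
    print(verify_password("correct horse battery staple",
                          *database["user-0001"]))
    print(verify_password("guess", *database["user-0001"]))
```

Because each user has a different salt, two users with the same password end up with different stored hashes.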

In terms of password strength

All this is complicated, both for you and the user, and it would often be easier to make it someone else’s problem. Single Sign On (SSO) makes this possible by redirecting authentication requests to a single large provider, who then responds with short lived “tokens” which assert the user’s identity to the third party website. The full path of communication is shown in the image below.

There are many providers of SSO services, including famous names such as Google, Facebook, Twitter & Weibo. Many of these use a common framework called Open Authorization version 2 (also known as OAuth2).
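
To give a feel for the first leg of an OAuth2 exchange, the sketch below builds the address to which a user would be redirected on the Microsoft identity platform, and checks the state value that comes back. It is a simplified illustration: helper packages normally do this for you, the function names are invented, and the tenant, client ID and redirect address in the usage example are placeholders.

```python
import secrets
from urllib.parse import urlencode, urlsplit, parse_qs


def build_authorization_url(tenant_id, client_id, redirect_uri, state):
    """Build the URL that sends a user to the provider to log in."""
    base = (f"https://login.microsoftonline.com/{tenant_id}"
            "/oauth2/v2.0/authorize")
    query = urlencode({
        "client_id": client_id,
        "response_type": "code",   # ask for a short-lived code
        "redirect_uri": redirect_uri,
        "scope": "openid profile",
        "state": state,            # random value echoed back to us
    })
    return f"{base}?{query}"


def state_matches(callback_url, expected_state):
    """Check the provider echoed back the state value we sent."""
    returned = parse_qs(urlsplit(callback_url).query).get("state", [""])[0]
    return secrets.compare_digest(returned.encode(), expected_state.encode())


if __name__ == "__main__":
    state = secrets.token_urlsafe()
    # Placeholder tenant, client ID and redirect address.
    print(build_authorization_url("example-tenant", "example-client-id",
                                  "https://localhost:5050/login", state))
```

Checking the state value guards against forged callbacks: a response which does not carry the value we generated did not originate from our own redirect.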

A variety of SSO helper packages exist for Python. For Azure & Microsoft Active Directory, the relevant package is called msal. An example use case, leveraging another package called flask-login, looks something like the following:

login.py:


import os
import secrets

import msal

from flask import Flask, request, flash, redirect,\
    url_for, render_template, session
from flask_login import LoginManager, current_user, UserMixin,\
    login_user, logout_user, login_required

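app = Flask(__name__)
# Sessions (used by flash and flask-login) need a secret key. The SECRET_KEY
# variable name is an assumption; a random key is used if it is not set.
app.secret_key = os.environ.get('SECRET_KEY', secrets.token_hex())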

__all__ = ['login', 'logout']

login_manager = LoginManager()
login_manager.login_view = 'login'
login_manager.init_app(app)

client_id = os.environ.get('CLIENT_ID', None)
client_secret = os.environ.get('CLIENT_SECRET', None)
tenant_id = os.environ.get('TENANT_ID', None)

csrf_token = secrets.token_urlsafe()

authority = f'https://login.microsoftonline.com/{tenant_id}'

aad = msal.ConfidentialClientApplication(client_id,
                                         client_secret,
                                         authority)

class User(UserMixin):

    def __init__(self, user_id):
        global aad
        self.id = user_id
        print('account', aad.token_cache._cache)

    @property
    def username(self):
        return self.id.split('@')[0]
            
    @property
    def is_authenticated(self):
        global aad
        account = aad.get_accounts(self.id)
        print('is_authenticated', account)
        if account:
            return 'access_token' in aad.acquire_token_silent([], account[0])
        return False

@login_manager.user_loader
def load_user(user_id):
    print(user_id)
    return User(user_id)

@app.route('/login')
def login():

    if current_user.is_authenticated:
        return redirect(url_for('index'))

    
    code = request.args.get('code')
    if code:
        if request.args.get('state') != csrf_token:
            flash('CSRF error!')
            return redirect(url_for('login'))
        response = aad.acquire_token_by_authorization_code(code,
                                                           [])
        if response and 'access_token' in response:
            user = User(response['id_token_claims']['preferred_username'])
            login_user(user)
            flash('Logged in successfully via AAD.')
            return redirect(url_for('index'))
        
    return redirect(aad.get_authorization_request_url([], state=csrf_token))

@app.route('/logout')
def logout():
    global aad
    account = aad.get_accounts(current_user.get_id())
    if account:
        aad.remove_account(account[0])
    logout_user()

    ms_uri = 'https://login.microsoftonline.com/common/oauth2/v2.0/logout'
    site = 'https://localhost:5050'
    
    return redirect(ms_uri+f'?post_logout_redirect_uri={site}'+url_for('index'))

To use this pattern we must create an application secret inside the Active Directory blade in the Azure portal, as well as looking up the relevant Tenant ID (the identifier which specifies which user directory we are going to be using). These are read from the local environment using the os.environ object. This is a very common pattern to use for secret data which should never be stored inside code repositories.
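
One possible refinement of the environment variable pattern is to fail early, with a clear message, when a required value is missing, instead of carrying None into later calls. The helper name require_env is invented for this sketch; the variable names match those read in login.py.

```python
import os


def require_env(name):
    """Return an environment variable, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


if __name__ == "__main__":
    try:
        for name in ("CLIENT_ID", "CLIENT_SECRET", "TENANT_ID"):
            require_env(name)
        print("All required settings found.")
    except RuntimeError as error:
        print(error)
```

The values themselves are set outside the code, for example in the shell before starting the app or in the application settings of the hosting service.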

Multifactor Authentication (MFA)

Currently the gold standard for authentication involves two-factor authentication (or more). Under this philosophy, a user needs to present at least two responses, drawn from different categories out of:

  1. Something you know (e.g. a password)

  2. Something you have (e.g. your phone)

  3. Something you are (e.g. your fingerprint).

The idea is that a bad actor needs to steal several things from you in order to obtain unauthorised access. The most common implementation on the web uses a passcode system sent via text message. On cost and convenience grounds it is frequently only used when additional security is required (for dangerous behaviour or when permanently modifying profiles).
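
The server side of a text message passcode can be sketched as follows. This is an illustration of the idea only: actually sending the message needs a third-party service and is left out, and the six-digit length, five-minute lifetime and function names are all assumptions.

```python
import hmac
import secrets
import time

CODE_LIFETIME = 300  # seconds a passcode stays valid (assumed value)


def issue_code(now=None):
    """Create a random six-digit passcode and its expiry time."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"
    return code, now + CODE_LIFETIME


def check_code(submitted, issued, expires_at, now=None):
    """Accept the passcode only if it matches and has not expired."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    return hmac.compare_digest(submitted.encode(), issued.encode())


if __name__ == "__main__":
    code, expires_at = issue_code()
    print("Code to send to the user's phone:", code)
    print("Accepted:", check_code(code, code, expires_at))
```

The passcode is generated with the secrets module, not the random module, since it must be unpredictable to an attacker.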

Additional Services

Azure Functions

Azure Functions is a service which allows a Python function to be accessed directly from the web via parameters passed through a URL. An example will be shown in the lecture.

Data

Azure has several systems available to store data, depending on its format. This might be unstructured binary data, structured databases or something in between.

Blob Storage

To quote Microsoft, blob storage is designed for:

  • Serving images or documents directly to a browser.

  • Storing files for distributed access.

  • Streaming video and audio.

  • Writing to log files.

  • Storing data for backup and restore, disaster recovery, and archiving.

  • Storing data for analysis by an on-premises or Azure-hosted service.

The data is accessed via a network interface, with charges depending on how frequent access is expected to be and the volume of data transferred. In general a URL is assigned to each item, which can be used in multiple ways, including those listed above, to access the blob object.

SQL

Azure provides a number of ways to access data in databases. Most of them are built around the SQL database language. SQL, which dates back to 1974, follows a hierarchical approach, with a database server holding databases, each of which can hold multiple tables holding records, each of which has multiple values in multiple columns. A useful mental reference is a collection of spreadsheet files (e.g. Excel), each containing multiple sheets with rows of data in multiple columns. However, as so often with scriptable text interfaces, SQL access is more powerful than a spreadsheet, although more difficult for newcomers.

Python comes with inbuilt support for SQL in SQLite format, in which individual databases are stored in local files, via the standard library module sqlite3. To use a full-fat SQL server on Azure, appropriate additional software should be installed, e.g. the MySQL connector. However, the basic syntax to connect to, read and update individual databases remains similar.

import sqlite3

#Connect to/create db file
conn = sqlite3.connect('my_db.sqlite')

cur = conn.cursor()
try:
    cur.execute("CREATE TABLE fruit(id INTEGER PRIMARY KEY AUTOINCREMENT, name VARCHAR(50), price INTEGER)")
    print("Table created")
except sqlite3.OperationalError:
    print("Table exists")

# Write some data
cur.execute("INSERT INTO fruit (name, price) VALUES (?,?);", ("apple", 300))

# Read some data
cur.execute("SELECT * FROM fruit;")
rows = cur.fetchall()

for row in rows:
    print(row)

# Parameters must be passed as a sequence, here a one-element tuple.
cur.execute("SELECT price FROM fruit WHERE id=?;", (1,))
row = cur.fetchone()
print('Price:', row[0])

conn.commit()
cur.close()
conn.close()
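
The hierarchy of database, tables and columns described earlier can also be inspected from Python. The sketch below is one way to do so for SQLite, using an in-memory database so that no file is created. The supplier table and the function name describe_database are invented for illustration; the fruit table matches the script above.

```python
import sqlite3


def describe_database(conn):
    """Map each table name in a database to its list of column names."""
    cur = conn.cursor()
    # sqlite_master lists the tables; skip SQLite's own bookkeeping tables.
    cur.execute("SELECT name FROM sqlite_master WHERE type='table' "
                "AND name NOT LIKE 'sqlite_%' ORDER BY name;")
    tables = [row[0] for row in cur.fetchall()]
    layout = {}
    for table in tables:
        # PRAGMA table_info gives one row per column; item 1 is its name.
        cur.execute(f'PRAGMA table_info("{table}");')
        layout[table] = [row[1] for row in cur.fetchall()]
    cur.close()
    return layout


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fruit(id INTEGER PRIMARY KEY AUTOINCREMENT, "
                 "name VARCHAR(50), price INTEGER)")
    conn.execute("CREATE TABLE supplier(id INTEGER PRIMARY KEY, "
                 "name VARCHAR(50))")
    print(describe_database(conn))
    conn.close()
```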

For complicated interactions, packages such as pandas or SQLAlchemy, which map Python types onto SQL more closely, may be more useful.

Summary

You should now:

  • Understand the basic concepts of HTTP communication and RESTful APIs.

  • Be able to code a simple app in Flask.

  • Be able to serve that app from Azure.

Further Reading