Commit ceacc4b4 authored by Jakob Moser

Merge branch 'backend' into 'master'

Add backend in PHP, write proper architecture documentation

See merge request !1
parents c6a3834b 98a923ad
+158 −0
@@ -2,6 +2,7 @@

Maintainer: Jakob Moser <moser@cl.uni-heidelberg.de>

[🔗 Website](https://status.fsco.li) that shows the status of several web services hosted by the Fachschaft Computerlinguistik. Consists of a JavaScript-powered HTML frontend and a PHP-powered backend which communicate using an API.

## Run locally

@@ -17,6 +18,159 @@ sudo docker compose up

The site is then available at http://localhost:8080

## Architecture

The application consists of two parts: a client-side part, written in HTML, CSS and JavaScript, and a server-side part, written in PHP. They communicate using an API.

### The client side

```
.
├── 📄 index.html
├── 📁 css
│   └── 🎨 style.css
├── 📁 img
│   ├── 🖼️ dropdown.png
│   ├── 🖼️ favicon.png
│   ├── 🖼️ header.jpg
│   ├── 🖼️ logo.png
│   ├── 🖼️ news.png
│   ├── 🖼️ rocket.png
│   └── 🖼️ ticker.png
└── 📁 js
    └── 🧩 index.mjs
```

The client-side code consists mainly of `index.html` and `index.mjs`, accompanied by some CSS and images for styling. 

All these files are _static_, meaning that the server does not interpret them in any way, but just sends them to the client (i.e. your browser) as is. This is generally nice for scenarios where load balancing is important: static files can be cached and easily distributed, and there is no central database that could become a bottleneck.

Static sites can also be nicely (and freely) hosted using [GitLab Pages](https://docs.gitlab.com/ee/user/project/pages/) or [GitHub Pages](https://pages.github.com/). Several people use static sites for blogging. “But wait, isn't a blog dynamic? After all, new blog posts are added all the time”, you might wonder, and you wouldn't really be mistaken: A blog _feels_ very dynamic, but actually only needs to be changed whenever you want to add a blog post. For the rest of the time, it is static. As long as you are comfortable with editing an HTML file (or some other kind of text file) when posting, a blog is actually a pretty static thing.

Back to the status page. “Hey now, a status page is definitely dynamic. How should this work with a static website?”, you might wonder, and you wouldn't be mistaken: With an entirely static website, it doesn't (unless you are comfortable with editing the status page manually every time there is an update).

This is why the client side here actually does not contain any data. The JavaScript program `index.mjs` makes HTTPS requests to the server side (which is very dynamic) over an API. The API returns the data, which the program then integrates into the website for display in your browser.
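
A minimal sketch of what such a client could look like, not the actual contents of `index.mjs`: the names `renderServices`, `loadServices` and `escapeHtml`, as well as the generated markup, are assumptions made for illustration. Only the endpoint URL is taken from this README. The `fetchImpl` parameter exists so the sketch can be tried without a network connection.

```javascript
// Hypothetical sketch; the real index.mjs may be structured differently.
const API_URL = "https://status.fsco.li/api/v1/services";

// Escape text before placing it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn the list of service objects into an HTML list (the markup is an assumption).
function renderServices(services) {
  const items = services.map(
    (s) => `<li>${escapeHtml(s.name)} (${escapeHtml(s.host)}): ${escapeHtml(s.status)}</li>`
  );
  return `<ul>${items.join("")}</ul>`;
}

// Request the data from the API and return the rendered HTML.
async function loadServices(fetchImpl = fetch) {
  const response = await fetchImpl(API_URL);
  if (!response.ok) {
    throw new Error(`API request failed with status ${response.status}`);
  }
  return renderServices(await response.json());
}
```

In a browser, the returned string could then be assigned to the `innerHTML` of whatever element is meant to hold the list. Which element that is depends on `index.html` and is not assumed here.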

### The API

So what do those requests look like? And what do the replies from the server look like?

A request looks like this:

`GET` https://status.fsco.li/api/v1/services

And a response like this:

```json
[
  {
    "name": "Website",
    "host": "fachschaft.cl.uni-heidelberg.de",
    "status": 200,
    "category": "public"
  },
  {
    "name": "Tickets",
    "host": "tickets.fachschaft.cl.uni-heidelberg.de",
    "status": 404,
    "category": "public"
  }
]
```

The response is a list of objects in the JSON format (very similar to a list of dictionaries in Python). For every service, it lists the name, the host, the status and a category that the JavaScript script will use to build the webpage that is ultimately displayed by your browser.
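
To make that shape concrete, a small check could look like the following sketch. The function names `isServiceEntry` and `isServiceList` are hypothetical and not part of the project. The four fields and their types are taken from the example response above.

```javascript
// Hypothetical helper: checks that a value has the fields shown in the example response.
function isServiceEntry(value) {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof value.name === "string" &&
    typeof value.host === "string" &&
    Number.isInteger(value.status) &&
    typeof value.category === "string"
  );
}

// The whole response is valid if it is an array of such entries.
function isServiceList(value) {
  return Array.isArray(value) && value.every(isServiceEntry);
}

const example = JSON.parse(`[
  {"name": "Website", "host": "fachschaft.cl.uni-heidelberg.de", "status": 200, "category": "public"},
  {"name": "Tickets", "host": "tickets.fachschaft.cl.uni-heidelberg.de", "status": 404, "category": "public"}
]`);

console.log(isServiceList(example)); // true
```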

You might wonder if there is a more standardized way to specify what an API looks like than giving some examples in running text. And you would be right, there is a way:

```
.
├── 📁 api
│   └── 📁 v1
│       ├── 📄 index.html
│       └── 📋 openapi-spec.yaml
└── 📁 lib
    └── 📁 redoc
        ├── 📄 LICENSE
        ├── 🧩 redoc.standalone.js
        └── 📄 redoc.standalone.js.LICENSE.txt
```

The file `api/v1/openapi-spec.yaml` contains a description of the API in a machine-readable format (namely the OpenAPI format). This can be rendered by various tools, e.g., the Swagger Editor or Redoc, or the GitLab-integrated viewer.

Of course, displaying them is not the only thing you can do with OpenAPI files. You could also validate them, check them for consistency, or even automatically generate clients and servers for the API described in such a file.

The file `api/v1/index.html` and everything in `lib/redoc` are static code that renders the OpenAPI specification for a visitor. This means you can comfortably look at the API at https://status.fsco.li/api/v1.

* [Swagger Editor](https://editor.swagger.io/)
* [Redoc](https://redocly.github.io/redoc/)

### The server side

```
.
└── 📁 api
    ├── 🔑 .htaccess
    └── 📁 v1
        ├── 🔑 .htaccess
        └── 🐘 services.php
```

The main part of the server-side code is `api/v1/services.php`. Whenever a client makes a request to the server at this path, the server executes the PHP code. The PHP code can then output things (using `echo`) which are sent back to the client. Unlike static files (which are sent to the client verbatim), the actual code contained in the PHP file is never sent back to the client[^1].

The code itself sends HTTPS requests to the different services to see if they are up and crafts the JSON response as described in the OpenAPI specification.
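
The approach (start all requests at once, wait for all of them, then attach each status code to its service) can be sketched as follows. This is an illustration in JavaScript, not the project's PHP code: `checkServices` is a hypothetical name, and reporting `0` when no response arrives at all is a choice made for this sketch, mirroring what cURL reports when no response code was received.

```javascript
// Hypothetical sketch of the concurrent status check, not the project's PHP code.
async function checkServices(services, fetchImpl = fetch) {
  // Start all requests first so they run concurrently.
  const pending = services.map(async (service) => {
    try {
      const response = await fetchImpl(`https://${service.host}`);
      return { ...service, status: response.status };
    } catch {
      // No response at all (DNS failure, timeout, ...): report 0 (assumption of this sketch).
      return { ...service, status: 0 };
    }
  });
  // Wait until every request has finished; the order matches the input list.
  return Promise.all(pending);
}
```

Note that `fetch` follows redirects by default, so the codes it reports can differ from those of a check that does not follow redirects.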

The `.htaccess` files are configuration for the Apache webserver:

* `api/.htaccess` redirects every request to `/api` to `/api/v1` (so you can enter https://status.fsco.li/api in your browser and are automatically directed to https://status.fsco.li/api/v1). This is just a comfort feature.
* `api/v1/.htaccess` is more complex: It rewrites every request to `/api/v1/services` to go to `/api/v1/services.php`. <!-- TODO: Why is this necessary? A: Query parameters --> 
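
The effect of the second rule can be modelled roughly like this. The sketch is a simplification, not a description of how Apache works internally: `rewrite` is a hypothetical function, and it assumes that the pattern `services$` from `api/v1/.htaccess` is matched against the request path relative to `/api/v1/`, without the query string.

```javascript
// Hypothetical model of the rule `RewriteRule services$ /api/v1/services.php`.
// `relativePath` is the request path relative to /api/v1/, without the query string.
function rewrite(relativePath) {
  return /services$/.test(relativePath) ? "/api/v1/services.php" : null;
}

console.log(rewrite("services"));     // /api/v1/services.php
console.log(rewrite("services.php")); // null (already the PHP file, rule does not match)
console.log(rewrite("index.html"));   // null
```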

### Docker

```
.
├── 🐋 docker-compose.yml
└── 🐋 Dockerfile
```

Only for development purposes, this repository contains a Docker configuration, so you can quickly start an Apache webserver locally to preview the site.

Why only for development? If you have a look at the `Dockerfile`, you see that it actually doesn't `COPY` any code. It just sets an Apache configuration option and then calls it a day. The `docker-compose.yml` is where we dynamically mount the contents of this repository into the container. This is very common for development: Any change you make to the repository is immediately reflected inside the container, so you don't need to rebuild the image every time you make a change.

While very common for development, it is very uncommon for production to only mount the code: The idea of an image is precisely to be self-contained, so it should contain all necessary code by itself (also allowing you to quickly switch between versions of the code by switching between images). To build a production-ready setup, you should add a `COPY . /var/www/html` instruction to the `Dockerfile` and remove the `volumes` section from `docker-compose.yml`.

## Deployment

<!-- TODO -->

## Questions & Answers

### The PHP code doesn't look very idiomatic, does it?

You're probably right. PHP is often mixed together with HTML to form some kind of templating system where the `.php` file directly produces HTML that is sent to and displayed in the browser.

This is also how PHP is explained in many tutorials, e.g. this one:

* [PHP: Your first PHP-enabled page - Manual](https://www.php.net/manual/en/tutorial.firstpage.php)

However, nothing about PHP (the language) mandates that it has to output HTML. If you prefer to have a clear separation of concerns (server code only supplies data in a machine-readable format, client code deals with presenting that to the user), you can write a “modern-style” API returning JSON in any language, including PHP.

One goal of this project is to show that this is possible.

### Why the convoluted directory structure for the API?

Why is the PHP code located in `/api/v1/services.php` and not simply in a file `/services.php`?

First, it is common to separate the API routes of a web application from the routes that e.g. serve the frontend, therefore `/api/`.

Second, you should version your API, e.g. using Semantic Versioning (version numbers in the format `MAJOR.MINOR.PATCH`, e.g. `1.0.0`). It is then common to add the major version number to the API path. If you make incompatible changes to your API, you increase the major version number. This allows you to easily keep the old version of the API as long as some clients still need it. Therefore `v1/`.
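
As an illustration of this convention, the path prefix can be derived from a version string. `apiPrefix` is a hypothetical helper written for this example and does not exist in the repository.

```javascript
// Hypothetical helper: derive the API path prefix from a Semantic Versioning string.
function apiPrefix(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) {
    throw new Error(`Not a MAJOR.MINOR.PATCH version: ${version}`);
  }
  // Only the major version ends up in the path.
  return `/api/v${match[1]}/`;
}

console.log(apiPrefix("1.0.0")); // /api/v1/
console.log(apiPrefix("1.4.2")); // /api/v1/
console.log(apiPrefix("2.0.0")); // /api/v2/
```

Minor and patch releases keep the same prefix, so existing clients are unaffected. Only an incompatible change moves the API to a new path.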

This is why we have chosen a more convoluted-looking directory structure instead of just placing the file at root level.

* [Semantic Versioning](https://semver.org/)

### Couldn't we have done this entirely on the client?

<!-- CORS, maybe link LiveOverflow -->

## License

@@ -28,3 +182,7 @@ Note that this does not apply to third party files (those found in `lib/*`), nam
* fscoli-next Theme (`lib/fscoli-next/*`), which is licensed under the [GNU General Public License v3.0](lib/fscoli-next/style.css)

For details and authorship information, see the linked license files.

## Footnotes

[^1]: That is, unless you make a configuration error or run a server without PHP capabilities. This means: Yes, you can technically store secrets in PHP files, since they should not be sent to the client. But if you ever do that, make sure you verify that everything works as it should.

api/v1/.htaccess

0 → 100644
+2 −0
RewriteEngine on
RewriteRule services$ /api/v1/services.php
+17 −19
<!DOCTYPE html>
<!doctype html>
<html lang="en">

    <head>
        <title>Status API | Fachschaft Computerlinguistik</title>

    <link rel="shortcut icon" href="/img/favicon.png" />
        <link rel="shortcut icon" href="/lib/fscoli-next/img/favicon.png" />

        <meta charset="utf-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1" />
@@ -20,5 +19,4 @@
        <redoc spec-url="./openapi-spec.yaml"></redoc>
        <script src="/lib/redoc/redoc.standalone.js"></script>
    </body>

</html>
+77 −80
@@ -2,7 +2,7 @@
openapi: 3.0.2
info:
    title: fscoli Status
  version: 1.0.0
    version: 0.0.1
    description: |
        Monitor the status of services of the Fachschaft Computerlinguistik, Universität Heidelberg
paths:
@@ -62,11 +62,8 @@ components:
                    type: string
                    readOnly: true
                status:
          description: If the service is reachable or not.
          enum:
            - up
            - down
          type: string
                    description: The HTTP response code one gets when sending a simple HTTP GET request to the service.
                    type: integer
                    readOnly: true
                host:
                    description: |-
@@ -83,5 +80,5 @@ components:
            example:
                name: Website
                host: fachschaft.cl.uni-heidelberg.de
        status: up
                status: 200
                category: public

api/v1/services.php

0 → 100644
+41 −0
<?php

$services = [
  ["name" => "Website", "host" => "fachschaft.cl.uni-heidelberg.de", "category" => "public"],
  ["name" => "Tickets", "host" => "tickets.fachschaft.cl.uni-heidelberg.de", "category" => "public"],
  ["name" => "Finanzen", "host" => "finanzen.fachschaft.cl.uni-heidelberg.de", "category" => "down"],
  ["name" => "Todo", "host" => "todo.fachschaft.cl.uni-heidelberg.de", "category" => "public"],
  ["name" => "Framadate", "host" => "framadate.fachschaft.cl.uni-heidelberg.de", "category" => "public"],
  ["name" => "Automation", "host" => "automation.fachschaft.cl.uni-heidelberg.de", "category" => "auth"],
  ["name" => "Traefik", "host" => "traefik.fachschaft.cl.uni-heidelberg.de", "category" => "auth"],
  ["name" => "Grafana", "host" => "grafana.fachschaft.cl.uni-heidelberg.de", "category" => "down"],
  ["name" => "Planet", "host" => "planet.fachschaft.cl.uni-heidelberg.de", "category" => "down"],
];

// The browser expects to see a Content-Type indication in the HTTP headers, so we
// send one. If we hadn't written this, the content type would be text/html.
header("Content-Type: application/json; charset=utf-8");

$multi_handle = curl_multi_init();
$handles = [];

foreach($services as $service) {
  $ch = curl_init("https://{$service['host']}");
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  $handles[] = $ch;

  curl_multi_add_handle($multi_handle, $ch);
}

$active_requests = -1;
do {
  curl_multi_exec($multi_handle, $active_requests);
  curl_multi_select($multi_handle);
} while($active_requests > 0);

foreach($handles as $index => $handle) {
  $services[$index]["status"] = curl_getinfo($handle, CURLINFO_RESPONSE_CODE);
}

// Everything that is written using `echo` is sent to the client (i.e. the browser)
echo json_encode($services);