Contributors: Alicia Wang, Conner Swenberg


How do other people use our application?

As we learned in the first lecture, requests are made to servers. For our application to be networked, we too need to get our code running on a publicly accessible server. This process of publishing code into a real, running application is called deployment.

Outline of deployment process:

  1. Package code into a production environment

  2. Prepare the production environment so it is ready to run

  3. Spin up server(s)

  4. Download & run prepared production environment on server(s)

Last time we discussed containerization, which covers the first two steps in this process. Now we will continue and review the final two steps of actually running our application code so that it is a publicly accessible API. Let us first revisit servers.


As you may recall from the first chapter on Routes, a server is a piece of software or hardware that accepts and responds to requests made over a network (e.g. the internet or a local network), while a client is a device or program that makes a request and receives a response from the server.
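To make the request/response relationship concrete, here is a minimal sketch using only Python's standard library. The handler name `HelloHandler`, the helper `request_once`, and the response text are made up for illustration; the server only listens on the local machine, so it is not a deployed server.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: accept a GET request and send back a response."""

    def do_GET(self):
        body = b"hello from the server"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging.
        pass


def request_once():
    """Start a local server, act as the client once, then shut down."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Client side: make a request and read the response.
        connection = HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        result = (response.status, response.read().decode())
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Here both roles live in one program, but the same exchange happens when the client and the server are different machines on a network.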

Some Examples of Servers

Don't worry about these details; they're just here as examples to help you get your head around what a server could look like!

Web server

  • Web servers show pages and run apps through web browsers

  • The client is the web browser, like Chrome, Firefox, Safari

Email server

  • Email servers manage sending and receiving emails

  • An email client can be a desktop application (Microsoft Outlook, iOS Mail) or simply a web-based application that accesses email on a server through a browser (Gmail page in the Mozilla Firefox browser)

  • The email client on your computer connects to an IMAP or POP server to download emails, and an SMTP server to send emails

Identity server

  • Identity servers manage logins and security roles for authorized users

  • A good example is logging into Cornell's student center but making a pit stop at CUWebLogin with your NetID and password -- that process of authenticating a Cornell student is done through making a request on an identity server and getting a valid response back

How a server connects to a network

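An internet server's IP address comes out of a pool that is allocated hierarchically: the Internet Assigned Numbers Authority (IANA) hands blocks of addresses to regional internet registries, which allocate them to internet service providers and web hosts (companies that run the servers behind web pages, ex. GoDaddy). In practice, your server gets its IP address from the web host or cloud provider you rent it from. In the 1990s this allocation was handled by the Internet Network Information Center (InterNIC, established by the US National Science Foundation).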

This IP address is what distinguishes the server from the others on a network when the server connects to a router or a switch. A router is a hardware device that receives, analyzes, and moves data to another network, and a common example is the Wireless (Wi-Fi) router.

How users connect to a server

Users connect to a server by using its domain name (ex. google.com), which a DNS (Domain Name System) resolver translates into the server's IP address. For example, when you type "google.com", your computer first asks a DNS server for the IP address that the name points to, and then sends its request to that address.
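As a rough illustration of this lookup, the sketch below asks the operating system's resolver for the addresses behind a name, using `socket.getaddrinfo` from Python's standard library. The helper name `resolve` is hypothetical, and the addresses you get back depend on your network and on when you run it.

```python
import socket


def resolve(hostname):
    """Return the unique IP addresses the system resolver reports for hostname."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for family, socktype, proto, canonname, sockaddr in results:
        ip = sockaddr[0]  # sockaddr starts with the IP address for IPv4 and IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling `resolve("google.com")` on a machine with internet access returns a list of one or more IP addresses. Passing something that is already an IP address simply returns that address, since there is nothing to translate.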

Where are servers stored?

Servers range from your own computer to equipment that companies keep stored away in closets. Servers are often remote--meaning that they are located in data centers managed by some company. Later, when you "create" your own server, you'll be getting it from Google Cloud, which runs on data centers managed by Google.


Running an Application: Our Needs

For users to be able to use our application at any time, our server needs to always be running and open to receiving requests. The server should also be publicly accessible on the web. So far we've been running our servers locally, which is impractical if we need to scale up our application. Instead, we can use more powerful hardware that is managed by a third party to run our server(s).

Cloud services are able to run our containerized software exactly as we would locally. Popular services include Amazon Web Services, Google Cloud, and Microsoft Azure. These companies will handle the physical maintenance of the server and will provide tools to automate security, scaling, and crash handling.

Accessing a Server

To access our server remotely, we use the Secure Shell (SSH) network protocol. This protocol gives users a secure way to access another computer, where they can open and view resources, execute commands, install packages, and update software. For security, users must provide credentials in the form of keys or passwords. SSH can either use automatically generated public-private key pairs to encrypt the connection and then ask the user for a password, or users can use a manually generated public-private key pair to log in without a password.

In short, SSH ensures secure transfer of information between the host and the client. Host refers to the remote server you are trying to access, while the client is the computer you are using to access the host.

>>> ssh user@server-ip
>>> ssh user@domain.com

If you have never connected to this server before, SSH will show you the server's host key fingerprint and ask you to confirm the connection. Once you accept, your computer adds the server's host key to the known_hosts file in the .ssh folder of your home directory so that it remembers and trusts the server on later connections. Your computer then authenticates you with either your key pair or your password. If successful, you will be logged in to the server and given a command prompt.
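The "remember and trust" behavior of known_hosts can be modeled in a few lines. This is a simplified sketch of the idea only, not how an SSH client is actually implemented: the class `KnownHosts`, its messages, and the string keys are all hypothetical, and a real client stores keys in a file and asks the user before saving a new one.

```python
class KnownHosts:
    """Toy model of a known_hosts list: remember each host's key on first contact."""

    def __init__(self):
        self._keys = {}  # maps host name or IP -> the key we saw the first time

    def check(self, host, presented_key):
        saved_key = self._keys.get(host)
        if saved_key is None:
            # First connection: nothing to compare against, so record the key.
            self._keys[host] = presented_key
            return "new host: key saved"
        if saved_key == presented_key:
            # Same key as before: the host is the one we already trust.
            return "known host: key matches"
        # A different key may mean we are talking to a different machine.
        return "warning: key changed, refusing to connect"
```

The useful property is in the last branch: once a host's key is remembered, a server that presents a different key for the same address is flagged instead of silently trusted.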

Google Cloud's Web Terminal

Instead of SSHing into our server from the command line, we can also use Google Cloud's web terminal to access the server. We are automatically authenticated by logging in with our Google account when we pull up the web command prompt interface. The functionality is exactly the same as if we were to SSH through our own command line. The web terminal just simplifies the process for newcomers.



For any given application, a server may have to balance many different roles within the application. Typical role examples are dedicating a port to run a web application (what we have been doing), operating a relational database management system (RDBMS), or handling the storage of more freeform data like random files. In industry, service providers like Amazon Web Services (AWS) have different servers dedicated to each aspect of this system. You have EC2 instances for running application code, RDS instances for managing databases, and S3 buckets for storing the raw content of files (images, documents, etc.).

The goal of clustering is to create a group of servers that are connected so that they behave as one system. By delineating the specific role of each server, cloud computing services like AWS and Google Cloud can optimize each server class for its task. When the servers are connected properly and can communicate freely, the resulting system performs better than a single instance handling every role of running an application.


The increased modularity of our system gives our application more resiliency and finer-grained control. We can back up versions of our database at different times and add downtime-handling methods specific to each server, all to improve the resiliency of our data and application. By separating these roles, we also make these systems easier to manage. Need a database capable of more operations/second or more total storage? Increase our database server specs. Want to add a new aspect of the application that demands storing a new file type? Add a new subdirectory within a bucket or create a new bucket entirely. Need to handle increased user traffic? Increase our web server specs. By separating our application's needs across multiple places, each piece can be changed or scaled independently of the others.

Load Balancing

Another cool extension of clustering is load balancing. The basic idea with load balancing is to have many servers running the same web app. Behind the exposed IP that accepts requests, we have a server whose only duty is to reroute each request to one of the many servers available. For example, with one load balancer that has three web servers available to route to, the load on each server is lowered by roughly a factor of three.
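One simple way to split requests is round robin, where the load balancer hands each new request to the next server in line. The sketch below assumes that strategy; real load balancers offer several strategies, and the class `RoundRobinBalancer` and the server names are made up for illustration.

```python
from collections import Counter


class RoundRobinBalancer:
    """Hand each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("a load balancer needs at least one server")
        self._servers = list(servers)
        self._next_index = 0

    def route(self, request):
        # Pick the next server, then advance the rotation for the following request.
        server = self._servers[self._next_index % len(self._servers)]
        self._next_index += 1
        return server


def count_requests_per_server(balancer, number_of_requests):
    """Route numbered dummy requests and count how many each server received."""
    return Counter(balancer.route(f"request {i}") for i in range(number_of_requests))
```

With three servers and nine requests, each server ends up handling three of them, which is the "factor of three" reduction in load per server.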

In the case of a steep increase in user traffic, we can automate a process of spinning up new servers to give the load balancer more options to split the load, still allowing us to give speedy responses to clients.
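The scaling decision can be sketched as simple arithmetic: divide the current traffic by what one server can handle and round up. The function `servers_needed` and the numbers used with it are hypothetical; cloud providers expose their own autoscaling settings, which this does not model.

```python
def servers_needed(requests_per_second, capacity_per_server):
    """Return how many servers are needed so none exceeds its capacity.

    Both arguments are whole numbers of requests per second.
    """
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Integer ceiling division: round up so leftover traffic gets its own server.
    needed = (requests_per_second + capacity_per_server - 1) // capacity_per_server
    # Keep at least one server running even when there is no traffic.
    return max(1, needed)
```

If each server can handle 100 requests per second and traffic jumps from 250 to 900 requests per second, the count goes from 3 servers to 9, and the load balancer spreads requests across the new servers as they come online.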
