SFTP as CaaS on Cloud

An SFTP setup in a Docker container that works with cloud storage out of the box (like Azure Blob Storage or AWS S3)

Photo by Cameron Venti on Unsplash

FTP, or File Transfer Protocol, is used to transfer files between a server and a client. For authentication, FTP sends the username and password over the internet in clear text.

SFTP and FTPS, on the other hand, encrypt the credentials and files before sending them over the internet. SFTP uses Secure Shell (SSH) for encryption, while FTPS uses SSL/TLS-based encryption (the same mechanism used by HTTPS).

However, when creating an SFTP server on the cloud, there aren’t many options available natively from the popular cloud providers. The most obvious option is to set up a virtual machine and host an SFTP server on it. But since a VM is an IaaS solution, it may not be the best fit: it involves significant maintenance and security work, can cost more, and may even be ruled out by your company’s policy.

This leaves us with SaaS-based services, such as an MFT server, which usually require licensing. Although such a service can be a better way to transfer heavy loads of files between servers, its cost makes little sense for small to medium workloads.

Some cloud providers already offer SFTP solutions for their infrastructure. AWS provides a service called AWS Transfer Family that transfers files between a client and an AWS S3 bucket. Other cloud providers, such as Microsoft Azure, don’t provide an SFTP service for their storage accounts. According to their customer support, SFTP support is a work in progress. There are a few third-party solutions available on Azure Marketplace, but they can cost a lot per month.

In this article, we’ll focus on setting up SFTP on Azure that saves files to Azure Blob Storage at a low cost. The solution is not limited to Azure, though: it can be adapted to AWS, Google Cloud, or any other cloud provider by changing one or more steps; the overall concept stays the same. So let’s get started.

What we’ll need:

  • Azure Storage Account (To access Azure Blob Storage), or any other storage
  • Docker (To run the solution locally)
  • A little bit of knowledge of Linux

Let’s first create a storage account on Azure

A deep dive into Azure Storage Accounts is out of scope for this article, but here is a brief idea of what they are. An Azure Storage Account provides storage for many types of data objects, such as blobs, file shares, queues, tables, and disks, at a very low cost per GB of data.

Microsoft explains the procedure to create a storage account here. This can be done via the Portal, PowerShell, the CLI, or predefined templates. Whichever method you use, after creating a storage account you’ll be offered 4 kinds of data storage:

  • Blob Storage Container: This service is optimized for storing massive amounts of unstructured data, from small images to massive database backups. This is also called Azure Blob Storage.
  • File Shares: As the name says, this service supports data sharing between multiple resources. It supports protocols such as Server Message Block (SMB) and Network File System (NFS).
  • Queues: This service is used to store queued messages.
  • Tables: This service is a low-cost, cloud-based NoSQL database that offers a schema-less design for storing structured or non-relational data.

After creating a storage account, go ahead and create a Blob storage container with any name you prefer. Once the container is created, we’ll get a URL for it. We’ll also need a Shared Access Signature (SAS) token; you can get your SAS token by following the steps mentioned here.

Your Blob Storage URL with SAS token should look like this:

https://{STORAGE_ACCOUNT}.blob.core.windows.net/{STORAGE_CONTAINER}?{SAS_TOKEN}
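If you prefer the command line, a SAS token can also be generated with the Azure CLI. Here is a minimal sketch, assuming a (hypothetical) storage account mystorageaccount and container sftp-uploads:

# Generate a SAS token for the container, valid until the given expiry date
# Permissions: (a)dd, (c)reate, (w)rite, (l)ist
az storage container generate-sas \
  --account-name mystorageaccount \
  --name sftp-uploads \
  --permissions acwl \
  --expiry 2022-12-31T23:59Z \
  --account-key "<STORAGE_ACCOUNT_KEY>" \
  --output tsv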

Let’s now set up the server locally; later we’ll move it to Azure

Let’s first look at the architecture we’re going to create: an SFTP client connects to an SFTP server running inside a Docker container; a watcher inside the container detects newly uploaded files and pushes them to Azure Blob Storage using AzCopy.

To set up the solution locally, we need to first install Docker. So go ahead, download and install it from here.

After installing Docker, create a new folder locally to hold your code and create a Dockerfile in it. This Dockerfile will contain the instructions that set up and start our SFTP server. For the SFTP server itself, we’ll use an existing container image, atmoz/sftp, which provides an easy-to-use SFTP server based on OpenSSH.

Use the following content for the Dockerfile:

FROM atmoz/sftp

# Configure user and storage for SFTP
RUN mkdir /etc/sftp
COPY users.conf /etc/sftp
COPY storage.conf /etc/sftp

# Create setup dir
RUN mkdir /setup

# Copy and exec setup script
COPY setup/setup.sh /setup
RUN chmod +x /setup/setup.sh
RUN /setup/setup.sh

# Copy start script
COPY setup/start.sh /setup
RUN chmod +x /setup/start.sh

# Copy script which will push files to blob storage
COPY setup/copy-to-storage.sh /setup
RUN chmod +x /setup/copy-to-storage.sh

# Scripts in /etc/sftp.d execute automatically on container startup,
# so create a script in that dir that runs /setup/start.sh in a separate process.
# NOTE: the & at the end of "/setup/start.sh &" is what makes the script run as a separate process
RUN mkdir /etc/sftp.d
RUN echo "/setup/start.sh &" > /etc/sftp.d/try-start.sh
RUN chmod +x /etc/sftp.d/try-start.sh

# Exposing port 22 as it is SFTP's default port
EXPOSE 22

In the same folder, create two new files:

  • users.conf: This will store the username and password of the users that will be configured for SFTP. Use the following content for the file:
username:password:::upload

Replace username and password with your own choice of credentials. The general syntax for defining users is as follows (see the example after this list):

user:pass[:e][:uid[:gid[:dir1[,dir2]...]]]
  • storage.conf: This will store the configuration for the destination storage where the files will be pushed. Use this file to store the URL and SAS token you created above. The reason we keep this in a separate file is to give us the flexibility to update the SAS token without redeploying or restarting the container: to change it, open a shell in the container, edit the file with a text editor (like nano or vim), and save it. It should look similar to the line below:
https://{STORAGE_ACCOUNT}.blob.core.windows.net/{STORAGE_CONTAINER}?{SAS_TOKEN}
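As a purely illustrative example of the users.conf syntax above, the following defines two users: the one used throughout this article, and a second (hypothetical) user with an explicit UID/GID and two directories:

# user:pass[:e][:uid[:gid[:dir1[,dir2]...]]]
username:password:::upload
alice:alicepass:1001:100:upload,archive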

After creating these files, let’s create a folder “setup” in the same directory, and create 3 files as below:

  • setup.sh: This will be a shell script and will be used to set up all required dependencies in the Linux image. Use the following content for the file:
#!/bin/bash

# Fetch the metadata for all installable packages
apt-get update

# Install wget, inotify-tools (used to watch for new files), and nano
apt-get install -y wget inotify-tools nano

# Download AzCopy
wget https://aka.ms/downloadazcopy-v10-linux -P /setup

# Expand Archive
tar -xvf /setup/downloadazcopy-v10-linux

# Move AzCopy to the destination you want to store it
cp ./azcopy_linux_amd64_*/azcopy /usr/bin/
  • start.sh: This shell script starts the watch over files transferred via SFTP (a short walkthrough of its output handling follows this list):
#!/bin/bash

# Run an infinite loop: if the inotifywait watch below ever exits, restart it so we keep listening for events
while true; do
  inotifywait -m /home/username/upload -e close_write | \
    while read FileName
    do
      Src=$(echo $FileName | sed 's/ CLOSE_WRITE,CLOSE //')
      Dest=$(cat /etc/sftp/storage.conf)

      echo Detected new file: $Src as $FileName
      /setup/copy-to-storage.sh "$Src" "$Dest" </dev/null
    done;
done;
  • copy-to-storage.sh: This shell script will transfer our files from SFTP to our Azure Blob storage container:
#!/bin/bash

echo "New files found. Initializing file copy to blob storage command."

if [[ -z $1 ]]; then
    echo "Error: Source is not provided"
    exit 1
fi

if [[ -z $2 ]]; then
    echo "Error: Destination is not provided"
    exit 1
fi

# Quote the arguments: the destination URL contains '?' and '&' from the SAS token
azcopy cp "$1" "$2" --recursive=true

# Delete the source file only if azcopy succeeded
if [ $? -eq 0 ]; then
    rm "$1"
fi
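To make the start.sh logic a bit clearer: with -m, inotifywait prints one line per event in the form “<watched-dir> <EVENTS> <filename>”, and the sed command strips the event field to rebuild the full file path. You can also sanity-check the copy script from a shell inside the running container; a rough sketch (the file name is hypothetical):

# inotifywait -m emits lines like:
#   /home/username/upload/ CLOSE_WRITE,CLOSE report.csv
# which start.sh turns into /home/username/upload/report.csv via sed.

# Manually push a single file to Blob Storage using the same script:
/setup/copy-to-storage.sh /home/username/upload/report.csv "$(cat /etc/sftp/storage.conf)"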

Let’s create the last file that will be used to start our Docker container. Create a file with the name “docker-compose.yml” in the same folder as your Dockerfile and use the following content:

version: "3.9"
services:
  sftp:
    image: sftp-debian:0.3
    ports:
      - "22:22"

We’re all set to go now.

Open a command line, change directory to your project folder, and run the following command:

docker build . --tag="sftp-debian:0.3" -f .\Dockerfile

After Docker builds the image “sftp-debian:0.3”, let’s now start the container:

docker-compose up

Opening Docker Desktop now will show the sftp-debian container running. Likewise, running docker-compose in PowerShell prints the SFTP server’s startup output in the terminal.
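If you’re not using Docker Desktop, you can also verify the container from the command line (the service name sftp comes from the docker-compose.yml above):

# List running containers
docker ps

# Follow the logs of the sftp service
docker-compose logs -f sftp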

Now let’s test our solution:

To test, we’ll use FileZilla to connect to our SFTP server, and Azure Storage Explorer to explore our Blob storage container.

In FileZilla, use the Quick Connect bar at the top and fill in the following info:

  • Host: sftp://localhost
  • Username: username
  • Password: password
  • Port: 22

After connecting, you should see a folder “upload”. Use this folder to upload files to the SFTP server. After adding files to this folder, you’ll notice they get removed from the folder almost immediately and show up in the Blob storage container.
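If you prefer the command line over FileZilla, a quick test with the OpenSSH sftp client might look like the session below (using the username and password from users.conf; the file name is hypothetical):

# Connect to the local SFTP server (you'll be prompted for the password)
sftp -P 22 username@localhost

# Inside the sftp session, upload a file into the watched folder
sftp> cd upload
sftp> put report.csv
sftp> exit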

This happens because when we started the container using docker-compose, our start.sh script was executed, which starts a watch on the folder /home/username/upload/ using inotifywait. When a new file is detected in the folder, AzCopy is executed: the script first fetches the destination URL from storage.conf and uses it to upload the file to our Blob Storage. If AzCopy uploads the file successfully, the script then deletes it from the SFTP server’s local storage.

Now if everything works fine for you, let’s move this image to Azure in the next step.

Moving the solution to Azure

We’ll be using 2 types of resources in Azure to publish our solution:

  • Azure Container Registry: Create an Azure Container Registry resource by following the steps here. You may use any registry to host the image. After creating the container registry, execute the following command to build and push the image to the registry:
az acr build --image sftp-debian:0.3 --registry {REGISTRY_RESOURCE_NAME} --file Dockerfile .

The above command uses the Azure CLI; if it is not already installed, install it from here. After installing, remember to log in from your command line to access your Azure account, as shown below.
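A minimal sketch of the login step (the subscription name is a placeholder):

# Sign in to your Azure account (opens a browser for authentication)
az login

# Optionally select the subscription that holds your container registry
az account set --subscription "<SUBSCRIPTION_NAME_OR_ID>"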

  • Azure Container Instance: Now go to the Azure Portal and create an ACI resource using the image we uploaded above. Either search for ACI on Azure or click the “+ Create a resource” button, select “Container” from the categories on the left, and click “Container Instances”.

A new form will appear. Under the “Basics” tab, select your subscription and resource group, provide a container name and region, and under “Image source” select “Azure Container Registry”:

Now go to the “Networking” tab, set “Networking type” to Public, and provide a valid DNS name label for your resource. This is an important step: it’s how you get an FQDN after the container is created. Under the “Ports” section on the same page, expose port 22 with the TCP protocol.

Then click “Review + create” to create the ACI instance. If you prefer the CLI over the portal, an equivalent command is sketched below.
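For reference, here is a rough CLI equivalent of those portal steps; all resource names are hypothetical, and the registry credentials come from your Azure Container Registry’s access keys:

az container create \
  --resource-group my-resource-group \
  --name my-sftp-container \
  --image myregistry.azurecr.io/sftp-debian:0.3 \
  --os-type Linux \
  --ports 22 \
  --protocol TCP \
  --ip-address Public \
  --dns-name-label my-sftp-dns \
  --registry-login-server myregistry.azurecr.io \
  --registry-username "<REGISTRY_USERNAME>" \
  --registry-password "<REGISTRY_PASSWORD>"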

Now that the resource is deployed, let’s test it using the FQDN provided by Azure: point FileZilla at that FQDN (instead of localhost) to connect to your SFTP server and start uploading files, just as we did locally.

I have tested this solution by transferring many files at once (around 20) as well as large files (around 500 MB); each file was copied to Azure Blob Storage in less than a second after it arrived on the container via SFTP.

Not every solution is perfect

This solution also has its drawbacks. The major one, which I readily admit, is that it doesn’t provide a complete SFTP solution for your Azure Blob Storage: it only transfers files from the SFTP server running on Docker to your Blob Storage, not in the other direction.

Conclusion

Given that Azure doesn’t offer any resource that adds SFTP support to a Storage Account, this is a simple, working solution for anyone who just needs to ingest files into their Blob Storage.

While searching for a solution, I also came across another way to connect to Azure Blob Storage: Blobfuse. Blobfuse uses the Linux FUSE interface to mount Blob Storage inside the Docker container. That would be by far the best way to run SFTP on Docker, because it works for both ingress and egress of blobs from Blob Storage. The only problem is that Blobfuse requires extra permissions when running the Docker container, which Azure ACI doesn’t allow us to configure in any way. If you’re interested in an article on that, just comment below and I’ll be happy to share it too.

Azure also provides SFTP on Azure, but that is just a custom implementation built on ACI and Azure File Share. If you want to set it up with Azure File Share, they also provide a directly deployable ARM template that you can use. Using Blob storage to store the files does have one extra benefit you don’t get with Azure File Share, though: blob triggers for Azure Functions. If you’re uploading files via SFTP and want to process those files right away, Azure Functions provides a serverless way to process them as they’re uploaded.

Nor does this solution work only on Azure: it can also work with AWS S3 or any other storage, as long as that storage has a mechanism for uploading files via a shell script.

I’ll be more than happy if you have any feedback or suggested improvements for this solution.
