AWS & satellite data: a primer

Massive thanks to Annekatrien for her original post on this topic.

This post is mainly a reminder to myself of the steps I took (following the blog post above) to access Sentinel-2 data on AWS (a dataset developed and managed by Sinergise).

I use an Ubuntu 16.04 Mate VM with shared folders to the host machine as my processing/dev box. I use the Anaconda Python 3.5 distribution and the following has been tested and works on that system.

Python setup

I installed the AWS command-line package from the conda-forge repository:
conda install -c conda-forge awscli=1.10.44

and tested the installation using
aws help

which returned the man pages.

AWS Setup

I am currently on the free tier, which is fine for what is listed in this post.

Sign in and go to the Amazon Console.

Go to the Services tab at the top of the console then select:
Security & Identity > IAM > Users > Create a new user

Ensure that the check box to generate a new key is ticked, add your user name and click Create. Download your user credentials BEFORE clicking Close.

To access the data on Amazon S3, change the permissions using the Console by clicking on
Services > Security & Identity > IAM > Users

again and then clicking on the name of your new user. This will open a summary page where you can manage a user. Click on
Permissions > Attach policy

and choose AmazonEC2FullAccess and AmazonS3FullAccess (and any others you want) before clicking Attach. The IAM user should now be set up.


In the bash terminal, type:
aws configure

and, when prompted, enter the access key ID and secret access key from the credentials file downloaded earlier. For region, use the appropriate value from the table below, based on the data you want to access. For output format, use json.

Dataset       Region
Landsat       us-west-2
Sentinel-2    eu-central-1
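
If you switch between the two datasets, the region can also be passed per command with the CLI's global --region option instead of re-running aws configure. A minimal sketch, assuming the bucket names used later in this post; the function name region_for_bucket is made up for illustration:

```shell
# Map a bucket name to the region listed in the table above.
# region_for_bucket is a hypothetical helper, not part of the AWS CLI.
region_for_bucket() {
    case "$1" in
        landsat-pds)     echo "us-west-2" ;;
        sentinel-s2-l1c) echo "eu-central-1" ;;
        *) echo "unknown bucket: $1" >&2; return 1 ;;
    esac
}

# Example (needs configured credentials and network access, so left commented out):
# aws s3 ls sentinel-s2-l1c --region "$(region_for_bucket sentinel-s2-l1c)"
```

The default region set by aws configure stays untouched; --region only overrides it for that one call.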

Accessing data

To list the data available for Landsat use the following Terminal commands:
aws s3 ls landsat-pds

and for Sentinel use
aws s3 ls sentinel-s2-l1c

As Annekatrien says in her blog post, ‘to go deeper into the storage, and see separate images, you have to know what you’re looking for’.

In general, you will use the following structure for Landsat:
aws s3 ls landsat-pds/L8/<path>/<row>/<image name>/

and this for Sentinel-2:
aws s3 ls sentinel-s2-l1c/tiles/<UTM zone number>/<grid number>/<subgrid number>/<year>/<month>/<day>/
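
To avoid typing the Sentinel-2 structure by hand, the prefix can be assembled from its parts. This is only a sketch: s2_prefix is a made-up name, and the assumption that month and day are not zero-padded is taken from the example download path used in this post (2016/4/13).

```shell
# Build a Sentinel-2 prefix from its components (hypothetical helper).
# $1 UTM zone number, $2 grid, $3 subgrid, $4 year, $5 month, $6 day
s2_prefix() {
    # ${5#0} and ${6#0} strip one leading zero, so "04" becomes "4".
    printf 'sentinel-s2-l1c/tiles/%s/%s/%s/%s/%s/%s/\n' \
        "$1" "$2" "$3" "$4" "${5#0}" "${6#0}"
}

# Example (needs credentials and network access, so left commented out):
# aws s3 ls "$(s2_prefix 30 U YC 2016 04 13)"
```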

To download an image use one of the following commands:
aws s3 cp s3://landsat-pds/L8/201/024/LC82010242016111LGN00/ ~/Downloads/ --recursive
aws s3 cp s3://sentinel-s2-l1c/tiles/30/U/YC/2016/4/13/0/ ~/Downloads/ --recursive

More effort is then required to sort the downloads (if you fetch more than one image at a time) into a file structure on the local computer, as all images are downloaded to the same directory.
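
One way around the sorting problem is to give every image its own local directory that mirrors the S3 path. The sketch below is one possible approach, not something from the original post; local_dir_for and download_image are hypothetical names.

```shell
# Turn an S3 URI into a local directory under a base folder (hypothetical helper).
# $1 S3 URI, $2 local base directory
local_dir_for() {
    s3_key="${1#s3://}"     # drop the s3:// scheme
    s3_key="${s3_key%/}"    # drop a trailing slash, if any
    printf '%s/%s\n' "${2%/}" "$s3_key"
}

# Download one image into its own directory (needs credentials and network access).
download_image() {
    dest="$(local_dir_for "$1" "$2")"
    mkdir -p "$dest" && aws s3 cp "$1" "$dest/" --recursive
}

# Example, left commented out:
# download_image s3://sentinel-s2-l1c/tiles/30/U/YC/2016/4/13/0/ ~/Downloads
```

With this layout the example image would end up in ~/Downloads/sentinel-s2-l1c/tiles/30/U/YC/2016/4/13/0, so several downloads no longer overwrite each other.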


Another nice little command-line tool that I’ve recently come across is stat.

More details can be found in its man page (man stat), but basically it returns the status of a file or filesystem: size, permissions, ownership and timestamps.
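
A couple of quick examples, assuming the GNU coreutils version of stat that ships with Ubuntu (the BSD/macOS version uses different options). The function names are made up for illustration:

```shell
# Print only the size of a file in bytes (GNU stat: -c sets the output format).
file_size() {
    stat -c '%s' "$1"
}

# Print name, size and last modification time on one line.
file_summary() {
    stat -c '%n: %s bytes, modified %y' "$1"
}

# Examples, left commented out:
# file_summary ~/.bashrc
# stat -f /        # with GNU stat, -f reports on the filesystem instead of the file
```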


A little trick to help test the CPU load on a multi-core machine.

Open a Terminal and type the following and press Enter.

yes > /dev/null &

Each instance keeps one core busy, so repeat the command once per core to load the CPU fully. To kill all the ‘yes’ processes (yes writes a string continuously until stopped) use:

killall yes

And use it at your own risk 🙂
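
Rather than repeating the command by hand, a small loop can start one ‘yes’ per core and stop them again after a set time. This is a sketch assuming GNU coreutils (nproc and timeout); stress_cores is a made-up name.

```shell
# Start one 'yes' per core and stop them all after a number of seconds.
# $1 number of cores (default: all, via nproc), $2 duration in seconds (default: 10)
stress_cores() {
    cores="${1:-$(nproc)}"
    seconds="${2:-10}"
    i=0
    while [ "$i" -lt "$cores" ]; do
        # timeout ends each 'yes' by itself, so no killall is needed afterwards
        timeout "$seconds" yes > /dev/null &
        i=$((i + 1))
    done
    wait    # block until every background job has finished
}

# Examples, left commented out:
# stress_cores          # all cores for 10 seconds
# stress_cores 2 30     # two cores for 30 seconds
```

The same warning applies: it will make the machine sluggish while it runs.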

Shared folders update

This is an update on an earlier post, regarding read/write access to a shared folder on a host machine from a Linux guest in VirtualBox:

mount -t vboxsf -o uid=1000,gid=1000 <folder name given in VirtualBox> /home/<user>/where/ever/you/want

To set the ownership and mount the share automatically at boot in Ubuntu, add the mount command above to the /etc/rc.local file, before the exit 0 line.
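
To get the line right before pasting it into /etc/rc.local, it can be generated and checked first. A minimal sketch; vboxsf_mount_line is a hypothetical helper, and the uid/gid default of 1000 is simply the value used in the command above (id -u and id -g show the values for your own user).

```shell
# Print the mount line for a VirtualBox shared folder (hypothetical helper).
# $1 folder name given in VirtualBox, $2 mount point, $3 uid, $4 gid
vboxsf_mount_line() {
    printf 'mount -t vboxsf -o uid=%s,gid=%s %s %s\n' \
        "${3:-1000}" "${4:-1000}" "$1" "$2"
}

# Example, left commented out ("shared" is a placeholder folder name):
# vboxsf_mount_line shared "$HOME/shared" "$(id -u)" "$(id -g)"
```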

List installed Python modules/libs

This is simple, but incredibly useful. From inside IPython, just type ‘import’ (without the quotation marks), add a space and press Tab.

Alternatively, install pip and type:

pip freeze
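
The output of pip freeze is one name==version line per package, so it is easy to filter when you only care about one module. A small sketch; find_pkg is a made-up name:

```shell
# Filter 'pip freeze' output (read from stdin) for one package, ignoring case.
find_pkg() {
    grep -i "^$1=="
}

# Example, left commented out:
# pip freeze | find_pkg numpy
```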