Category Archives: Data

AWS & satellite data: a primer

Massive thanks to Annekatrien for her original post on this topic which should be available using this link:

This is mainly a post to remind me of the steps I took (following the blog post above) to access the Sentinel-2 archive on AWS (the archive is developed and managed by Sinergise).

I use an Ubuntu 16.04 Mate VM with shared folders to the host machine as my processing/dev box. I use the Anaconda Python 3.5 distribution and the following has been tested and works on that system.

Python setup

I installed the AWS command-line package from the conda-forge repository:
conda install -c conda-forge awscli=1.10.44

and tested the installation using
aws help

which returned the man pages.

AWS Setup

I am currently on the free tier, which is fine for what is listed in this post.

Sign in and go to the Amazon Console.

Go to the Services tab at the top of the console then select:
Security&Identity > IAM > Users > Create a new user

Ensure that the check box to generate a new key is ticked, add your user name and click Create. Download your user credentials BEFORE clicking Close.

To access the data on Amazon S3, change the permissions using the Console by clicking on
Services > Security&Identity > IAM > Users

again and then clicking on the name of your new user. This will open a summary page where you can manage a user. Click on
Permissions > Attach policy

and choose AmazonEC2FullAccess and AmazonS3FullAccess (and any others you want) before clicking Attach. The IAM user should now be set up.


In the bash terminal, type:
aws configure

and type in your access key ID and secret access key from the file downloaded earlier, when prompted. For region use the appropriate value from the table below based on the data you want access to. For output format, use json.

Dataset       Region
Landsat       us-west-2
Sentinel-2    eu-central-1
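
If you later script these steps, the region can be kept next to the bucket name in a small lookup. This is only a sketch: `DATASETS` and `aws_ls_command` are names made up for illustration, the bucket names are the ones used in the listing commands in this post, and `--region` is the AWS CLI's global option for overriding the configured region for a single call.

```python
# Hypothetical lookup: region and bucket for each public dataset in this post.
DATASETS = {
    "landsat": {"region": "us-west-2", "bucket": "landsat-pds"},
    "sentinel-2": {"region": "eu-central-1", "bucket": "sentinel-s2-l1c"},
}


def aws_ls_command(dataset, prefix=""):
    """Build the argument list for `aws s3 ls` against the dataset's bucket."""
    info = DATASETS[dataset.lower()]
    target = info["bucket"]
    if prefix:
        target = target + "/" + prefix.lstrip("/")
    # --region overrides the default region set by `aws configure` for this call.
    return ["aws", "s3", "ls", target, "--region", info["region"]]


print(" ".join(aws_ls_command("landsat")))
print(" ".join(aws_ls_command("sentinel-2", "tiles/30/U/YC/")))
```

The list form can be passed straight to `subprocess.call` if you want Python to run the command rather than just print it.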

Accessing data

To list the data available for Landsat use the following Terminal commands:
aws s3 ls landsat-pds

and for Sentinel use
aws s3 ls sentinel-s2-l1c

As Annekatrien says in her blog post, ‘to go deeper into the storage, and see separate images, you have to know what you’re looking for’.

In general you will use the following structure for Landsat:
aws s3 ls landsat-pds/L8/<path>/<row>/<image name>/

and this for Sentinel-2:
aws s3 ls sentinel-s2-l1c/tiles/<UTM zone number>/<grid number>/<subgrid number>/<year>/<month>/<day>/
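
The two layouts above can be wrapped in small helpers so the prefix is built the same way every time. This is a rough sketch: `landsat_prefix` and `sentinel2_prefix` are hypothetical names, the three-digit zero padding for Landsat and the unpadded month and day for Sentinel-2 are inferred from the download examples in this post, and the optional trailing `sequence` number is likewise taken from the Sentinel-2 download example.

```python
def landsat_prefix(path, row, image_name=None):
    """Return the landsat-pds prefix for a path/row, zero-padded to 3 digits."""
    prefix = "landsat-pds/L8/{:03d}/{:03d}/".format(int(path), int(row))
    if image_name:
        prefix += image_name + "/"
    return prefix


def sentinel2_prefix(utm_zone, grid, subgrid, year, month, day, sequence=None):
    """Return the sentinel-s2-l1c prefix for one tile and acquisition date."""
    # Month and day are not zero-padded in the download example (2016/4/13).
    prefix = "sentinel-s2-l1c/tiles/{}/{}/{}/{}/{}/{}/".format(
        int(utm_zone), grid.upper(), subgrid.upper(), int(year), int(month), int(day)
    )
    if sequence is not None:
        prefix += "{}/".format(int(sequence))
    return prefix


print("aws s3 ls " + landsat_prefix(201, 24))
print("aws s3 ls " + sentinel2_prefix(30, "U", "YC", 2016, 4, 13))
```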

To download an image use one of the following commands:
aws s3 cp s3://landsat-pds/L8/201/024/LC82010242016111LGN00/ ~/Downloads/ --recursive
aws s3 cp s3://sentinel-s2-l1c/tiles/30/U/YC/2016/4/13/0/ ~/Downloads/ --recursive

If you download more than one image at a time, more effort is then required to sort the downloads into a file structure on the local computer, as all images are downloaded to the same directory.
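
One way around the sorting problem is to give each image its own destination that mirrors the S3 path. The sketch below only builds the command; `local_destination` and `copy_command` are hypothetical helper names, and it assumes the AWS CLI creates missing local directories during the copy, which is worth confirming on your own setup.

```python
import os


def local_destination(s3_url, download_root):
    """Map an s3:// URL to a matching sub-directory under download_root."""
    if not s3_url.startswith("s3://"):
        raise ValueError("expected an s3:// URL, got %r" % s3_url)
    # Drop the scheme, then split bucket and key on forward slashes.
    parts = [p for p in s3_url[len("s3://"):].split("/") if p]
    return os.path.join(download_root, *parts)


def copy_command(s3_url, download_root):
    """Build the `aws s3 cp --recursive` argument list with a per-image target."""
    target = local_destination(s3_url, download_root) + os.sep
    return ["aws", "s3", "cp", s3_url, target, "--recursive"]


url = "s3://sentinel-s2-l1c/tiles/30/U/YC/2016/4/13/0/"
print(" ".join(copy_command(url, os.path.expanduser("~/Downloads"))))
```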

LiDAR processing

sudo apt-get install liblas-bin

Install the liblas library and you are good to go in terms of understanding what you have in your .las file.

A .las file is a standard binary format for LiDAR instrument data. LiDAR provides high-quality, very dense topographic data, usually represented as an unstructured point cloud. Each point has X, Y and Z coordinates that place it in 3D space, and can carry further attributes such as return intensity, point classification and return number. From these it is possible to infer facts about the target being observed, such as whether it is vegetation and how dense the canopy might be. The last return (or the only return, if there is just one) is usually taken to be the ground surface (or the roof of a building).
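
As a rough illustration of the last-return idea, the sketch below keeps only points whose return number equals the number of returns for their pulse. The `Point` record and the sample values are invented for the example; real LAS points carry many more attributes.

```python
from collections import namedtuple

# Hypothetical minimal point record; real LAS points carry more attributes.
Point = namedtuple("Point", "x y z return_number number_of_returns")


def last_returns(points):
    """Keep points whose return number equals the pulse's number of returns."""
    return [p for p in points if p.return_number == p.number_of_returns]


points = [
    Point(0.0, 0.0, 12.5, 1, 2),  # first of two returns, e.g. canopy top
    Point(0.0, 0.0, 3.1, 2, 2),   # last of two returns
    Point(1.0, 0.0, 3.0, 1, 1),   # single return
]
for p in last_returns(points):
    print(p.x, p.y, p.z)
```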

liblas provides access to some useful command line utilities:

1) lasinfo

lasinfo options:
  -h [ --help ]         produce help message
  -i [ --input ] arg    input LAS file
  -v [ --verbose ]      Verbose message output
  --no-vlrs             Don't show VLRs
  --no-schema           Don't show schema
  --no-check            Don't scan points
  --xml                 Output as XML
  -p [ --point ] arg    Display a point with a given id.

2) las2txt

las2txt options:
  -h [ --help ]         produce help message
  -i [ --input ] arg    input LAS file.
  -o [ --output ] arg   output text file.  Use 'stdout' if you want it written
                        to the standard output stream
  --parse arg           The '--parse txyz' flag specifies how to format each
                        each line of the ASCII file. For example, 'txyzia'
                        means that the first number of each line should be the
                        gpstime, the next three numbers should be the x, y, and
                        z coordinate, the next number should be the intensity
                        and the next number should be the scan angle.

                         The supported entries are:
                           x - x coordinate as a double
                           y - y coordinate as a double
                           z - z coordinate as a double
                           X - x coordinate as unscaled integer
                           Y - y coordinate as unscaled integer
                           Z - z coordinate as unscaled integer
                           a - scan angle
                           i - intensity
                           n - number of returns for given pulse
                           r - number of this return
                           c - classification number
                           C - classification name
                           u - user data
                           p - point source ID
                           e - edge of flight line
                           d - direction of scan flag
                           R - red channel of RGB color
                           G - green channel of RGB color
                           B - blue channel of RGB color
                           M - vertex index number

  --precision arg       The number of decimal places to use for x,y,z,[t]
                         --precision 7 7 3
                         --precision 3 3 4 6
                        If you don't specify any precision, las2txt uses the
                        implicit values defined by the header's scale value
                        (and a precision of 8 is used for any time values.)
  --delimiter arg       The character to use for delimiting fields in the
                         --delimiter ","
                         --delimiter ""
                         --delimiter " "
  --labels              Print row of header labels
  --header              Print header information
  -v [ --verbose ]      Verbose message output
  --xml                 Output as XML -- no formatting given by --parse is
                        respected in this case.
  --stdout              Output data to stdout

3) las2ogr

las2ogr options:
    -h print this message
    -i <infile>     input ASPRS LAS file
    -o <outfile>    output file
    -f <format>     OGR format for output file
    -formats        list supported OGR formats

Together these tools can help get the information contained in a .las file into a GIS-ready format that can then be taken into a desktop software package such as SAGA or QGIS. Interpolating the Z values for the relevant return number will then create an elevation model that can be used in subsequent analyses.
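
To make the las2txt step concrete, here is a minimal sketch that reads text assumed to have been written with `--parse xyzrn` and `--delimiter ","`, then averages Z per grid cell for last returns only. The function names and sample lines are made up, only a subset of the `--parse` letters is handled, and the per-cell mean is a crude stand-in for proper interpolation in SAGA or QGIS.

```python
import math
from collections import defaultdict

# Field letters taken from the las2txt --parse help; only a subset is handled.
FIELD_TYPES = {"x": float, "y": float, "z": float,
               "i": int, "r": int, "n": int, "c": int}


def parse_line(line, parse="xyzrn", delimiter=","):
    """Turn one las2txt output line into a dict keyed by the --parse letters."""
    values = line.strip().split(delimiter)
    if len(values) != len(parse):
        raise ValueError("expected %d fields, got %d" % (len(parse), len(values)))
    return {key: FIELD_TYPES[key](value) for key, value in zip(parse, values)}


def mean_z_grid(records, cell_size):
    """Average z per square cell, using last returns only (r == n)."""
    cells = defaultdict(list)
    for rec in records:
        if rec["r"] == rec["n"]:
            key = (int(math.floor(rec["x"] / cell_size)),
                   int(math.floor(rec["y"] / cell_size)))
            cells[key].append(rec["z"])
    return {key: sum(zs) / len(zs) for key, zs in cells.items()}


# Invented sample lines in x,y,z,return number,number of returns order.
lines = ["0.5,0.5,10.0,1,2", "0.5,0.5,2.0,2,2", "0.7,0.2,4.0,1,1", "1.5,0.5,3.0,1,1"]
records = [parse_line(line) for line in lines]
print(mean_z_grid(records, cell_size=1.0))
```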

More information can be found here:


Geoserver install

So that I remember, should I ever need to do this again:

1) download the binary files from the geoserver website – You want the ones that are labeled Binary (OS independent)

2) unzip the file (in this case version 2.4, RC1)


3) change directory into the binary directory of the unzipped file

cd geoserver-2.4-RC1/bin

4) run the startup script


5) if there is a Java error then check whether Java is installed and on your PATH using

which java

if nothing is installed then use

sudo apt-get install openjdk-7-jdk

check where it is installed using the ‘which’ command again and then set the JAVA_HOME variable

export JAVA_HOME=/usr

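Note that the export only applies to the terminal session it was typed in. If you exit the terminal and start a new one, set JAVA_HOME again (or add the export line to ~/.bashrc first), then navigate back to the bin directory of the extracted geoserver folder.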

6) run geoserver


7) When the command line output halts and you see ‘Started SelectChannelConnector’ then go to a local browser and type


choose the second option to start the geoserver admin panel and log in with the default details user=admin, passwd=geoserver
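
For step 5, the link between the output of ‘which java’ and JAVA_HOME can be sketched in Python. By convention JAVA_HOME is the directory that contains bin/java, so /usr/bin/java gives /usr as above. `java_home_from` is a hypothetical helper name, and `shutil.which` needs Python 3.3 or later.

```python
import os
import shutil


def java_home_from(java_path):
    """Return the directory two levels above the java executable (.../bin/java)."""
    return os.path.dirname(os.path.dirname(java_path))


java_path = shutil.which("java")  # returns None if java is not on PATH
if java_path is None:
    print("java not found on PATH; install a JDK first")
else:
    print("export JAVA_HOME=" + java_home_from(java_path))
```

This deliberately does not follow symbolic links, so it matches the value used in step 5; passing the path through `os.path.realpath` first would point at the actual JDK directory instead.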


Python file handling

A lot of Python tinkering today, mainly in terms of file naming, management and extracting the contents. First off, how to create sensible file names: Date, Time, Details, Extension. So the code for this looks as follows:

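from time import localtime

# Build a sortable stamp (date, then time) from the first six fields of localtime().
datetime_stamp = '%4d-%02d-%02dT%02d-%02d-%02d' % localtime()[:6]
title = "TestLogFile"
ext = "log"
# print() with a single argument behaves the same in Python 2 and 3.
print("Unique filename: %s-%s.%s" % (datetime_stamp, title, ext))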

Which gives the following type of output:

Unique filename: 2011-10-31T10-09-33-TestLogFile.log
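
The same naming scheme can also be written with `datetime.strftime`, which some may find easier to read than the `%` formatting of `localtime()`. This is an alternative sketch rather than the code used above; `unique_filename` is a hypothetical helper, and the fixed date is only there to make the output reproducible.

```python
from datetime import datetime


def unique_filename(title, ext, when=None):
    """Return 'YYYY-MM-DDTHH-MM-SS-title.ext' for the given (or current) time."""
    when = when or datetime.now()
    return "{}-{}.{}".format(when.strftime("%Y-%m-%dT%H-%M-%S"), title, ext)


print(unique_filename("TestLogFile", "log", datetime(2011, 10, 31, 10, 9, 33)))
# prints: 2011-10-31T10-09-33-TestLogFile.log
```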

