Category Archives: RS

Notes on processing large files with GDAL

Context

I have recently had a project where I was provided with a reasonable number of large aerial image files. These were 600-800MB each and there were 10-15 of them. I needed to cut out the area of interest (which covered some part of all of the images), mosaic the images into a single representation of the site and find a way to share the data with non-specialists.

The workflow in my head was Clip – Mosaic – Webmap

Things to note

The images were too slow to render for me to mess about trying to do this using a GUI system, and the GDAL functions in QGIS3 (for me at least) don’t seem to be up and running properly (milage may vary).

Certain things that I have found when doing this in GDAL are that it’s best to define a data type for the bands, and to set the data structure as cloud optimised. The easiest way to create cloud optimised geotiffs is to use the following:

gdal_translate in.tif out.tif -co TILED=YES -co COPY_SRC_OVERVIEWS=YES -co COMPRESS=DEFLATE

Initially I was going to use gdal_merge.py to mosaic the clipped images as that is what is used in QGIS. However, gdal_merge.py reads everything into memory and I quickly found out that 16GB of RAM wasn’t enough as my swap partition started to be used and the whole thing ground to a halt.

The trick is to use the CPU if you can. First make a virtual raster using gdalbuildvrt, and then use gdal_translate (which utilises the CPU) to change the vrt file to whatever format you want (e.g. cloud optimised geotiff). This was actually really fast, and used the structure found in the following commands:

gdalbuildvrt output.vrt /path/to/folder/of/*.tif
gdal_translate -of GTiff output.vrt mosaic.tif

This ends up with an image that will be relatively large (1.2 GB in my case) which is still tricky to share. But fear not! We can use gdal2tiles.py to create a whole load of tiles and automatically generate some leaflet code to allow you to see the data. Then it’s just a question of moving that folder onto a web server and sending your contacts the link.

A quick thing to note/remember is that the webmap zoom factors relate to the following:

  • 0 represents the whole world (1:500,000,000)
  • 19 is very close up (1:1000)

Workflow

The workflow that I used in the end was as follows:

First crop each aerial image to the site shapefile

maindir="/path/to/folder/of/incoming/aerial/files"
cd ${maindir}

for f in *.tif

do
gdalwarp -ot Byte -of GTiff -cutline siteBoundary.shp 
-crop_to_cutline -dstnodata 0.0 -overwrite ${f} 
outputfolder/${f}_cropped.tif-co TILED=YES -co COPY_SRC_OVERVIEWS=YES
 -co COMPRESS=DEFLATE
done

Then build the virtual image

gdalbuildvrt output.vrt /path/to/folder/of/*.tif

Output the virtual image to a cloud optimised geotiff

gdal_translate -ot Byte -of GTiff -co TILED=YES 
-co COPY_SRC_OVERVIEWS=YES -co COMPRESS=DEFLATE output.vrt mosaic.tif

Create the web tiles and map file using the following command

gdal2tiles.py -s EPSG:27700 -z 11-19 mosaic.tif

Sources

https://lostingeospace.blogspot.com/2011/04/rapid-mosaicking-with-gdal.html

http://www.cogeo.org/

Advertisements
Tagged , , ,

ECW

I’ve been looking at getting ecw support in gdal on Ubuntu 16.04. It looks as if the last supported version using ubuntugis was 12.04.

Then I found this: https://gis.stackexchange.com/questions/94917/how-do-i-add-and-view-ecw-raster-images-in-qgis-2-2-0-on-ubuntu-14-04-lts/200532#200532

I was going to try and compile my own gdal (honest, I was) but it was quicker and easier to install gvSig – which just works.

Tagged ,

AWS & satellite data: a primer

Massive thanks to Annekatrien for her original post on this topic which should be available using this link: http://www.digital-geography.com/accessing-landsat-and-sentinel-2-on-amazon-web-services/

This is mainly a post to remind me of the steps I undertook (following the blog post above) to access Sentinel 2 data on AWS (developed and managed by Sinergise).

I use an Ubuntu 16.04 Mate VM with shared folders to the host machine as my processing/dev box. I use the Anaconda Python 3.5 distribution and the following has been tested and works on that system.

Python setup

I installed the AWS commandline packages using the conda-forge repository
conda install -c conda-forge awscli=1.10.44

and tested the installation using
aws help

which returned the man pages.

AWS Setup

I am currently on the free tier (https://aws.amazon.com/free/) which is fine for what is listed in this post.

Sign in and go to the Amazon Console.

Go to the Services tab at the top of the console then select:
Security&Identity > IAM > Users > Create a new user

Ensure that the check box to generate a new key is ticked, add your user name and click Create. Download your user credentials BEFORE clicking Close.

To access the data on Amazon S3, change the permissions using the Console by clicking on
Services > Security&Identity > IAM > Users

again and then clicking on the name of your new user. This will open a summary page where you can manage a user. Click on
Permissions > Attach policy

and choose AmazonEC2FullAccess and AmazonS3FullAccess (and any others you want) before clicking Attach. The IAM user should now be set up.

AWSCLI Config

In the bash terminal, type:
aws configure

and type in your access key ID and secret access key from the file downloaded earlier, when prompted. For region use the appropriate value from the table below based on the data you want access to. For output format, use json.

Landsat Sentinel-2
us-west-2 eu-central-1

Accessing data

To list the data available for Landsat use the following Terminal commands:
aws s3 ls landsat-pds

and for Sentinel use
aws s3 ls sentinel-s2-l1c

As Annakatrien says in her blog post, ‘to go deeper into the storage, and see separate images, you have to know what you’re looking for’.

In general you will use the following structure for landsat:
aws s3 ls landsat-pds/L8/<path>/<row>/<image name>/

and this for Sentinel-2:
aws s3 ls sentinel-s2-l1c/tiles/<UTM zone number>/<grid number>/<subgrid number>/<year>/<month>/<day>/

To download an image use one of the following commands:
aws s3 cp s3://landsat-pds/L8/201/024/LC82015242016111LGN00/ ~/Downloads/ --recursive
aws s3 cp s3://sentinel-s2-l1c/tiles/30/U/YC/2016/4/13/0/ ~/Downloads/ --recursive

More effort is then required to sort the downloads (if more than one image at a time) into a file structure on the local computer, as all images are download to the same directory.

Geoserver notes

This post is mainly for my notes. On a clean install of Ubuntu 14.04 server I went through the following.

1) Java

sudo apt-get update 

Then, check if Java is already installed:

java -version 

If it returns “The program java can be found in the following packages”, Java hasn’t been installed yet, so execute the following to install the standard runtime environment:

sudo apt-get install default-jre 

2) Geoserver

These installation notes are based on those found on the Geoserver website.

  • Select the Stable version of GeoServer to download.

Select Platform Independent Binary on the download page.

  • Download the archive and unpack to the directory where you would like the program to be located (assumed to be /usr/local/geoserver for these notes).
wget http://sourceforge.net/projects/geoserver/files/GeoServer/2.7.0/geoserver-2.7.0-bin.zip
sudo mv ~/geoserver-2.7.0-bin.zip .
sudo unzip geoserver-2.7.0-bin.zip
  • Add an environment variable to save the location of GeoServer by typing the following command:
echo "export GEOSERVER_HOME=/usr/local/geoserver" >> ~/.profile
. ~/.profile
  • Make yourself the owner of the geoserver folder using the following command replacing USER_NAME with your own username :
sudo chown -R USER_NAME /usr/local/geoserver/
  • Start GeoServer by changing into the directory geoserver/bin and executing the startup.shscript:
cd geoserver/bin
sh startup.sh

If you see the GeoServer logo, then GeoServer is successfully installed. If you get a message about Java then it may be that you need to run the following:

echo "export JAVA_HOME="/usr/lib/jvm/java-6-sun-1.6.0.07" >> ~/.profile
. ~/.profile
Tagged , ,

ARCSI

At the end of 2014 I used ARCSI (Atmospheric and Radiometric Correction of Satellite Imagery) for the first time. ARCSI is an open software project that provides a command line tool for the atmospheric correction of Earth Observation imagery. It provides a pretty much automatic way of running 6S.

Details of the software can be found here: https://bitbucket.org/petebunting/arcsi

At tutorial can be found here: https://spectraldifferences.wordpress.com/2014/05/27/arcsi/ 

This post is based largely on the tutorial linked to above but I also try to pull together some of the tips I read about in the help forums here: https://groups.google.com/forum/#!forum/rsgislib-support

Set up

All of the following instructions are run on an installation of the Ubuntu 14.04 operating system.

First, download the Anaconda 3.4 python distribution: http://docs.continuum.io/anaconda/

Install the Anaconda version of Python and the conda package manager using the command line. Once the bash script has run then install ARCSI and TuiView (a fast image viewer) using conda. If a warning comes up regarding gdal then install the gdal-data package.

bash Anaconda3-2.1.0-Linux-x86_64.sh

conda install -c https://conda.binstar.org/osgeo arcsi tuiview
conda update rsgislib arcsi

conda install -c jjhelmus gdal-data

For a successful installation I needed to change the default installation path for Anaconda from ~/anaconda3 to ~/anaconda

You should now be able to check your installation using the following command:


arcsi.py -h | less

To finalise the set up you need t point the GDAL driver path to the KEA drivers. Run the following to set up the path:


export GDAL_DRIVER_PATH=~/anaconda/gdalplugins:$GDAL_DRIVER_PATH

GDAL_DATA="/home/username/anaconda/share/gdal"

export GDAL_DATA

where username is the user account on the Ubuntu installation.

Run the code

When you run ARCSI, you’ll enter a command similar to the following:


arcsi.py -s ls8 -f KEA --stats -p RAD TOA SREF --aeropro NoAerosols --atmospro Tropical --aot 0.25 -o dir/to/outputs -i metadatafile_MTL.txt

To break this down a bit it first calls the arcsi.py script, passing to it the following parameters:

  • Sensor (-s) – landsat 8
  • Output image format (-f) – KEA
  • Parameters to compute (-p) – RAD Radiance conversion, TAO Top of atmosphere, SREF surface reflectance using 6S (for the full range please consult the official documentation)
  • Output directory (-o) – dir/to/outputs, into which the three computed outputs will be saved
  • Information (-i) – Landsat metadata file, using the format provided by the images available through Earth Explorer

 

Further information of all the parameters can be found in the help forums, the arcsi help command and in the code.

If an error is reported when you first run the ARCSI command, it might be that LIBGFORTRAN.SO.3 cannot be found. This will lead to the 6S model failing. You will need to install the required files using the following command:


sudo apt-get install libgfortran3

Dealing with outputs

The best way to view the output is to start the supplied viewer using the  ‘tuiview‘  command. This is an intuitive to use and very responsive viewer that handles the KEA format natively. To transform the KEA format outputs into a format readable by a wide range of GIS software, use the GDAL Translate command.


gdal_translate -of GTiff keafile.kea outputfile.tif

Tagged , , , ,
Advertisements