Python : Texas Terrain Harvesting

Written by Andy Carter, PE | September 28, 2020


Video Walkthrough (11 minutes)




For every civil engineering project, there must be topographic terrain data for the site.  In the early days of my career, I would have to twiddle my thumbs waiting for the surveyors to go to the field and acquire the required survey shots.  No work could commence until I got their survey and built an existing ground surface model.

Now, LiDAR flights are made regularly throughout the United States, scanning the landscape.  With my work currently focused in Texas, I am always on the 'hunt' for current terrain data, whether it is for a small site or a large drainage study.

Texas now has LiDAR coverage for the entire state.  While some data are still in processing, I suspect that within a year (2021) Texas will have complete coverage served via the Texas Natural Resources Information System (TNRIS).


There are multiple flights, thousands of tiles and petabytes of data in the form of both bare earth terrain digital elevation models (DEM) and LiDAR point clouds (LAS).  For every project the task is to:

  1. Determine which tiles are needed for the project
  2. Download and unzip the tiles
  3. Merge the tiles into one (1) single DEM
  4. If needed, clip the terrain to the requested boundary limits

Wouldn't it be awesome if this entire process could be automated?  This is what I accomplished via a Python script.

All the files needed to replicate this workflow on your machine are on GitHub.

Download Link (21 Mb): 


Step 1:  Determine the Tiles that are needed

First, we must create an ESRI shapefile of the requested area.  For this example, we are going to grab the entire limits of the Walnut Creek watershed in Austin, Texas.  We will use a polygon (area) shapefile named "Walnut_AOI_AR_2277.shp".  Note that this file is in EPSG: 2277 (Texas Central State Plane, US ft).

Via the Python code, this shapefile is translated with GDAL and intersected with a TNRIS-provided tile index shapefile (EPSG: 4269).  A list of the required tiles is returned and URLs are programmatically constructed to prepare for downloading.

Inputs needed:

  • Quarter Quad Index - StratMap_QQ_Index_AR_4269.shp
  • Texas available tiles  - TNRIS_LiDAR_Available_AR_4269.shp


Note that for this example, ten (10) quarter quadrangles (QQ) will be downloaded to cover the requested area.  When unzipped, each QQ contains sixteen (16) quarter-quarter-quarter-quads (QQQQ).  We will need 79 of these tiles (QQQQ) to cover the area. [These counts assume no buffer]
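Below is a minimal sketch of this tile-selection step.  It assumes the geopandas library is available (the notebook itself drives GDAL directly), and the 'TILE_ID' attribute field is a placeholder -- inspect the index shapefile's attribute table and the notebook for the real field names.

# Sketch of Step 1, assuming geopandas (the notebook works through GDAL directly)
import geopandas as gpd

# Requested boundary (EPSG: 2277) and the TNRIS quarter-quad index (EPSG: 4269)
aoi = gpd.read_file("Walnut_AOI_AR_2277.shp")
qq_index = gpd.read_file("StratMap_QQ_Index_AR_4269.shp")

# Re-project the boundary to the index CRS, then keep only intersecting tiles
aoi_geom = aoi.to_crs(qq_index.crs).unary_union
needed_qq = qq_index[qq_index.intersects(aoi_geom)]
print(f"{len(needed_qq)} quarter quads intersect the requested boundary")

# Build download URLs -- the 'TILE_ID' field is a placeholder; pull the real
# tile identifier from the index attributes
base = "https://s3.amazonaws.com/data.tnris.org"
urls = [
    f"{base}/0549d3ba-3f72-4710-b26c-28c65df9c70d/resources/"
    f"stratmap17-50cm-central-texas_{row.TILE_ID}_dem.zip"
    for row in needed_qq.itertuples()
]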


Step 2:  Download the Tiles

TNRIS allows download of bare earth LiDAR DEMs through Amazon Web Services (AWS).  Here is an example URL for one (1) of the QQ tiles.

https://s3.amazonaws.com/data.tnris.org/0549d3ba-3f72-4710-b26c-28c65df9c70d/resources/stratmap17-50cm-central-texas_3097342_dem.zip

This is for one of the tiles in the 'stratmap-2017-50cm-central-texas' data bucket on AWS.  Using multi-threading, the routine downloads multiple tiles (QQ) at once using the Python requests library.  Still... download speeds may vary.  Once unzipped, this is 911 MB of data.
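A minimal sketch of this download step is below.  It pairs requests with a ThreadPoolExecutor from the standard library; the output folder and worker count are assumptions, and 'urls' is the list built in Step 1.

# Sketch of Step 2: multi-threaded download and unzip with 'requests'.
# 'urls' comes from Step 1; the output folder is an assumption.
import os
import zipfile
import requests
from concurrent.futures import ThreadPoolExecutor

out_dir = r"C:\Terrain_Harvest\downloads"   # change to a folder on your machine
os.makedirs(out_dir, exist_ok=True)

def fetch_and_unzip(url):
    """Download one QQ zip archive and extract its QQQQ DEM tiles."""
    local_zip = os.path.join(out_dir, url.split("/")[-1])
    with requests.get(url, stream=True, timeout=300) as r:
        r.raise_for_status()
        with open(local_zip, "wb") as f:
            for chunk in r.iter_content(chunk_size=1024 * 1024):
                f.write(chunk)
    with zipfile.ZipFile(local_zip) as z:
        z.extractall(out_dir)

# Grab several quarter quads at once -- speeds will still vary
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(fetch_and_unzip, urls))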



Step 3:  Merge the Tiles

Using the Python 'rasterio' library, the downloaded tiles are filtered to those QQQQ tiles within the requested boundary.  All other tiles (QQQQ) are deleted.  Then the remaining tiles are merged into a single DEM. 
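Here is a sketch of the merge itself, assuming the unzipped QQQQ rasters are sitting in the download folder; the file pattern and output name are mine, not necessarily the notebook's.

# Sketch of Step 3: merge the remaining QQQQ tiles into one DEM with rasterio.
import glob
import os
import rasterio
from rasterio.merge import merge

out_dir = r"C:\Terrain_Harvest\downloads"
tile_paths = glob.glob(os.path.join(out_dir, "*.img"))   # or "*.tif", per flight

sources = [rasterio.open(p) for p in tile_paths]
mosaic, out_transform = merge(sources)

# Write the mosaic using the metadata of the first tile as a template
profile = sources[0].profile
profile.update(driver="GTiff", height=mosaic.shape[1], width=mosaic.shape[2],
               transform=out_transform)

with rasterio.open(os.path.join(out_dir, "merged_dem.tif"), "w", **profile) as dst:
    dst.write(mosaic)

for src in sources:
    src.close()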




Step 4:  Clip and Re-project

Finally, the merged raster is clipped to the requested boundary limits (plus a buffer).  This DEM is projected to the same coordinate system as the requested boundary shapefile.  For this example, that is EPSG: 2277 (Texas Central State Plane, US foot).
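A rough sketch of the clip and re-projection, again with rasterio (plus geopandas for the boundary), is below; the 100-unit buffer and file names are illustrative only.

# Sketch of Step 4: clip to the buffered boundary, then re-project to the
# boundary's CRS (EPSG: 2277).  Buffer distance and file names are assumptions.
import geopandas as gpd
import rasterio
from rasterio.mask import mask
from rasterio.warp import calculate_default_transform, reproject, Resampling

aoi = gpd.read_file("Walnut_AOI_AR_2277.shp")

# --- Clip the merged DEM to the boundary plus a buffer ---
with rasterio.open("merged_dem.tif") as src:
    shapes = aoi.to_crs(src.crs).geometry.buffer(100)   # buffer in DEM units
    clipped, clip_transform = mask(src, shapes, crop=True)
    profile = src.profile

profile.update(height=clipped.shape[1], width=clipped.shape[2],
               transform=clip_transform)
with rasterio.open("clipped_dem.tif", "w", **profile) as dst:
    dst.write(clipped)

# --- Re-project the clipped DEM to EPSG: 2277 ---
dst_crs = "EPSG:2277"
with rasterio.open("clipped_dem.tif") as src:
    transform, width, height = calculate_default_transform(
        src.crs, dst_crs, src.width, src.height, *src.bounds)
    profile = src.profile
    profile.update(crs=dst_crs, transform=transform, width=width, height=height)
    with rasterio.open("walnut_dem_2277.tif", "w", **profile) as dst:
        reproject(source=rasterio.band(src, 1),
                  destination=rasterio.band(dst, 1),
                  src_transform=src.transform, src_crs=src.crs,
                  dst_transform=transform, dst_crs=dst_crs,
                  resampling=Resampling.bilinear)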



What is needed... Anaconda + Environment

To use this routine on your own machine, you will need to do the following.

  • Install Anaconda.  https://www.anaconda.com/
  • Open the Anaconda command prompt.  In the Start menu, type in 'Anaconda' and select the Anaconda Prompt (Anaconda3) application.



My machine is named FN34GZ1.  Your path will point to a different location.

  • Copy the GitHub-provided specifications file (terrainharvest_speclist.txt) to the same folder seen in the Anaconda command prompt. (Mine is 'C:\Users\FN34GZ1')
  • In the Anaconda command line, create the 'harvest' environment by entering the following command.
conda create --name harvest --file terrainharvest_speclist.txt


  • After installation is complete, activate the 'harvest' environment.
conda activate harvest

 


Notice that the environment changed from 'base' to 'harvest'.  That means that the 'harvest' Anaconda environment is now active.
  • With the 'harvest' environment active, launch jupyter notebook from the command prompt.
jupyter notebook



This will open Jupyter Notebook... which runs inside of a web browser.


Jupyter Notebook

  • Copy the 'TNRIS_Terrain_Harvest.ipynb' notebook from the GitHub resource to a folder named 'Terrain_Harvest' under the Anaconda root directory.  My location is 'C:\Users\FN34GZ1\Terrain_Harvest'.  Your path will vary.

  • In Jupyter, select the folder that you created.
  • Select and open the file named 'TNRIS_Terrain_Harvest.ipynb'.

  • This notebook script needs to run with the 'harvest' environment that was previously created.  In Jupyter, change the kernel to 'Python [conda env: harvest]'.


Jupyter Notebook - Inputs and Outputs

  • There are three (3) input shapefiles that are needed to run this notebook.  The first two are indices of the TNRIS tile scheme.  The third shapefile is a requested boundary of the area needed.  All of these shapefiles must be polygons.  The requested boundary will only utilize the first found polygon.  You cannot request multiple areas at once.
  • Revise the paths to the tile indices and to the requested boundary shapefile in the "2.0 Input Data" code block (a sketch of what that block might look like follows this list).

  • Change the output directory to a location that exists on your machine.
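For illustration, the input block might look something like the sketch below.  The variable names are hypothetical -- use whatever names the notebook's "2.0 Input Data" cell actually defines and just swap in your own paths.

# Hypothetical layout of the "2.0 Input Data" cell -- variable names are
# illustrative; only the paths need to point at files on your machine.
str_qq_index_shp    = r"C:\Users\FN34GZ1\Terrain_Harvest\StratMap_QQ_Index_AR_4269.shp"
str_avail_tiles_shp = r"C:\Users\FN34GZ1\Terrain_Harvest\TNRIS_LiDAR_Available_AR_4269.shp"
str_boundary_shp    = r"C:\Users\FN34GZ1\Terrain_Harvest\Walnut_AOI_AR_2277.shp"
str_output_dir      = r"C:\Users\FN34GZ1\Terrain_Harvest\output"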



Jupyter Notebook - Run the Notebook

  • To run the notebook, under the Kernel menu, select "Restart & Run All".  Depending on the size of the request and the speed of your internet connection, the procedure could take several minutes.


Ultimately, the routine returns a merged DEM whose units are native to the tiles that are served by TNRIS.  This means that the completed tile is likely in UTM and the vertical data are in meters.
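If you want to confirm what you received, a quick rasterio check of the coordinate system and elevation range looks like this (the file name is an assumption; use whatever your output DEM is called):

# Quick sanity check of the returned DEM -- file name is an assumption.
import rasterio

with rasterio.open("merged_dem.tif") as src:
    print(src.crs)                        # horizontal coordinate system
    elev = src.read(1, masked=True)
    print(elev.min(), elev.max())         # vertical values, likely in meters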




