Tutorials on making web app using python (Introduction)

Recently, we have joined hackathons and tried making web applications using Python. We intend to make tutorials on making these web apps. In this introduction, we present what we have done.

I. Ferryman


The website: https://ferry-man.herokuapp.com/

Github source code: https://github.com/IEScoders/ObservingEarthWithData

More information: https://www.ait.org.tw/results-of-taipei-2018-nasa-international-space-apps-challenge-hackathon/

Skills/Packages have been used:

  • Handling with NASA satellite data using read, write HDF and HDF5 file (pyhdf).

  • Calculations with spatial data using pandas, numpy and multiprocessing.

  • Plot data with matplotlib.

  • Create the website using dash.

II. Biozone

The website: https://biozone.us-east.mybluemix.net/

Github source code: https://github.com/IEScoders/BioZone

Skills/Packages have been used:

  • Crawl weather data from Central Weather Bureau (Taiwan)

  • Calculations with spatial data using pandas and numpy.

  • Plot interactive map using Folium

  • Create the website using Flask and Javascript.

Plot geochemical data using python: An example of analyzing Adakite rock data from GEOROC

We download the precompiled data of adakite from GEOROC database.
For a simple impression of adakite, the wikipedia page gives some clue: Adakites are volcanic rocks of intermediate to felsic composition that have geochemical characteristics of magma thought to have formed by partial melting of altered basalt that is subducted below volcanic arcs.

In this example, we demonstrate how to use python to simplify the data, discard the null data, classify and plot the geochemical properties.

First, let’s look at the data, it is quite large and differs in the available data: some elements are there, some are not.

Screen Shot 2018-09-24 at 3.50.03 PM

Now we can start by importing some useful packages:

import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
import warnings
plt.rcParams['figure.figsize'] = (20, 10)

We use pandas to import the data, the encoding is used to solve some font conflicts.

df = pd.read_csv("ADAKITE.csv", encoding="iso-8859-1")

We use dropna() method to discard columns with less than 50% data available.

df = df.dropna(axis=1, thresh=int(0.5*len(df)))

To have a look at the sample occurrence, let’s plot the data. We select the lon, lat column, drop null value and change all data to numeric format.

m = Basemap(lon_0=180,projection='hammer')
lon = df["LONGITUDE MIN"].dropna()
lat = df["LATITUDE MIN"].dropna()
lon = pd.to_numeric(lon, errors='ignore');
lat = pd.to_numeric(lat, errors='ignore');
lon_ = [];lat_ = []
for x, y in zip(lon,lat):
        xx, yy = m(float(x),float(y))
m.scatter(lon_, lat_, marker = "o" ,s=15, c="r" , edgecolors = "k", alpha = 1)
plt.title('Adakite rocks sample')


We make a function called plot_harker() to plot Harker’s diagram:

def plot_harker(x,xlabel,y,ylabel,title=None,xlim=[40,80],ylim=None,color = "b",label=None):
    plt.scatter(x=x,y=y,marker="o", c=color,s=8,label = label)
    if title != None:

… and then plot some elements using that function:

plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["AL2O3(WT%)"],ylabel=(r'$Al_2O_3$ (wt%)'),title=r'$SiO_2$ vs $Al_2O_3$')
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["MGO(WT%)"],ylabel=(r'$MgO$ (wt%)'),title=r'$SiO_2$ vs $MgO$')
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["FEOT(WT%)"],ylabel=(r'$FeOt$ (wt%)'),title=r'$SiO_2$ vs $FeOt$')
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["TIO2(WT%)"],ylabel=(r'$TiO_2$ (wt%)'),title=r'$SiO_2$ vs $TiO_2$')
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["NA2O(WT%)"],ylabel=(r'$Na_2O$ (wt%)'),title=r'$SiO_2$ vs $Na_2O$')
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["K2O(WT%)"],ylabel=(r'$MgO$ (wt%)'),title=r'$SiO_2$ vs $K_2O$')
plt.suptitle(r'Harker diagram of Adakite vs $SiO_2$',y=0.92,fontsize=15)

download (1).png

We can try to see the tectonic settings of the rock:

tec = df['TECTONIC SETTING'].dropna()
tec_counts = tec.value_counts()
plt.title('Tectonic settings of Adakite')

download (2)

The following code demonstrates how to create new columns and divide the data. We divide the data in High Silica Adakite (SiO2 > 60%) and Low Silica Adakite (SiO2 < 60%)

df['SR/Y'] = (df["SR(PPM)"]/df["Y(PPM)"])
df['CAO+NA2O'] = df['CAO(WT%)'] + df['NA2O(WT%)']
df['CR/NI'] = df['CR(PPM)'] + df['NI(PPM)']
df_hsa = df[df["SIO2(WT%)"] > 60]
df_lsa = df[df["SIO2(WT%)"] < 60]

download (3).png

Bonus: Let’s see the publish about adakite in the GEOROC database by year:

cite = [df.CITATIONS[x] for x in range(0,len(df)) if len(df.CITATIONS[x]) > 20 and df.CITATIONS[x].count('[') < 3]
year = []
for i in range(0,len(cite)):

download (4)

The python code below is the full program of this example:

Nguyen Cong Nghia

Plot seismogram (SAC file), events, stations in Python (Part 1)

Here is an example of plotting SAC files in Python. The sample SAC files can be downloaded here and the Jupyter notebook can be downloaded here.

First, import some useful packages, including obspy, pandas, numpy and Basemap. By the way, they are all great packages (obspy is amazing for anyone who uses seismic data)

from obspy import read
import pandas as pd
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np

#Ignore warnings due to python 2 and 3 conflict
import warnings

Let’s read the sample Z component using read from obspy

stream = read("2015.0529.0700/*Z.sac")

In here, we use the header from SAC file using tr.stats.sac.(SAC_header)

# setup mercator map projection.
m = Basemap(lon_0=180,projection='hammer')
evlat = stream[0].stats.sac.evla; evlon = stream[0].stats.sac.evlo

#Plot the event
xx,yy = m(evlon,evlat)
m.scatter(xx, yy, marker = "*" ,s=150, c="r" , edgecolors = "k", alpha = 1)

for tr in stream:
    stlat = tr.stats.sac.stla; stlon = tr.stats.sac.stlo 
    xx,yy = m(stlon,stlat)
    m.scatter(xx, yy, marker = "^" ,s=150, c="g" , edgecolors = "k", alpha = 1)
plt.title("Event-station map")

I used a simple trick to plot the seismogram with distance by make the y:
y = data + dist*weight_factor
with data is the amplitude of seismic trace, dist: distance in km (SAC header) and weight_factor = 0.01

The red line indicate the predicted P arrival time that I have calculated and store in SAC header t3

for tr in stream:
    dist = tr.stats.sac.dist
plt.ylabel("x100 km")    
Plot the same seismogram but using filled colors (which is more suitable to plot other kind of seismic traces)
for tr in stream:
    dist = tr.stats.sac.dist*0.01
    x = tr.times()
    y = tr.data+dist
    plt.fill_between(x,y, dist, y > dist, color='r', alpha = 0.8)
    plt.fill_between(x,y, dist, y < dist, color='b', alpha = 0.8)
plt.ylabel("x100 km")    
Nguyen Cong Nghia

Introduction to Genetics Algorithm (GA) (Part 2)

To find a basic introduction of GA, the first part can be found here.

III. Examples using Genetics Algorithm

In these examples, we will use Matlab and its function ga to apply GA for the optimization problem. For the manual of using this function, you can find it at https://www.mathworks.com/help/gads/ga.html or type in Matlab:

help ga

III.1. Optimization of a polynomial function

Problem: find the integer x  value that minimizes function:

y = 0.2x2 + 50/x

First, we define the function:

function y = simple_fitness(x)
 y = 0.2*x.^2+(50./x);

And use Matlab to finish the job:

clear; close all; clc


rng(1,'twister') % for reproducibility
[x,fval] = ga(@simple_fitness,1,[],[],[],[],lb

xlabel('Size of cup');
ylabel('Objective Function Value');
title(sprintf('Minimum of the function is %.2f at x = %.2f',min(y), x))
grid on

And here is the result:


III.2. Find the best location of the earthquakes

Problem: we have a set of 5 seismic stations with coordinates: [-2 3 0; 1 3 0; -2 -1 0; 0 -3 0; 2 -2 0]

The travel time of seismic ray from EQ to stations is calculated from:


To simplify the problem, we choose v = 6 km

clear; close all; clc
stnx=[-2 1 -2 0 2];
stny=[3 3 -1 -3 -2];
stnz=[0 0 0 0 0];
EQ=[2 2 -2];

rms = @(mp) sum(sqrt((1./(length(to)-4)).*((sqrt((mp(1)-stnx).^2+(mp(2)-stny).^2+(mp(3)-stnz).^2)/6)-to).^2));
% rms([2 2 -2])
ub=[10 10 0]; %upper bound
lb=[0 0 -10]; %lower bound
options = gaoptimset('Display','iter','PlotFcn',@gaplotbestf,'Generations',150)
% options = gaoptimset('Display','iter','PlotFcn',@gaplotbestf,'TolCon',1e-3);

mp = ga(rms,3,[],[],[],[],lb,ub,[],options)

figure; hold on; grid on;
scatter3(stnx, stny, stnz, 'b', 'filled', '^');
scatter3(mp(1), mp(2), mp(3), 'r', '*');
scatter3(EQ(1), EQ(2), EQ(3), 'g', '*');
view(3); xlabel('X axis'); ylabel('Y axis'); zlabel('Z axis');
legend('Stations','EQ location from GA','Real EQ location');


Utpal Kumar



Introduction to Genetics Algorithm (GA) (Part 1)

I. Introduction

In daily life as well as in doing research, we might come to problems that require a lowest/highest value of variables, e.g.:  find the shortest way from home to work, buying household items with a fixed amount of money, etc. These problems could be called “optimization” and today we will introduce an algorithm to solve these kinds of problem: the Genetic Algorithm.

Genetic Algorithm has been developed by Prof. John Holland in 1975, which search algorithm that mimics the process of evolution. This method is based on the “Survival of the Fittest” concept (Darwinian Theory), i.e.: successive generations are becoming better and better.

II. Algorithm

  1. Initialization
  2. Fitness Calculation
  3. Selection
  4. Crossing Over
  5. Mutation
  6. Repetition of the steps from 2-5 for the new population generated.


The pdf version of this introduction could be found here: Genetic Algorithm

The 2nd part can be found here.

Utpal Kumar


Best-fit quadratic surface from given points in 3D using Matlab

In Earth Science research, sometimes we need to construct 3D surfaces from given points, for example: creating the fault surface, locating a subducting slab from earthquake hypocenters, etc.  in a region of interest in X-Y plane.

In this example, we will show how to create a best-fit quadratic surface from given points in 3D using Matlab. The code is written in the following steps:

  1. Input the data 3D points: x, y, z
  2. Calculate the best fitting curve, store the parameters in C matrix.
  3. Define a region of interest xx, yy and using C to get zz.
  4. Plot the xx, yy, zz surface with x, y, z data.
  5. Make comparison (probabilistic density function of misfit between z and zz).

We use getPolygonGrid.m from mathworks.com.

Main program:


% We start with some some 3d points
data = mvnrnd([0 0 0], [1 -0.5 0.8; -0.5 1.1 0; 0.8 0 1], 50);

x = data(:,1); y = data(:,2); z = data(:,3);

% Make a best-fit quadratic curve from the given points
C = x2fx(data(:,1:2), 'quadratic') \ data(:,3);

% Define an area of interest in X-Y plane by polyon nodes
xv = [-3 -3 0 3 3 0];
yv = [ 0 3 3 0 -3 -3];

% Create a area of interest in x, y plane
inpoints = getPolygonGrid(xv, yv, 1/0.1);
xx = inpoints(:,1);
yy = inpoints(:,2);

% Create the corresponding zz value from the above grid
zz = [ones(numel(xx),1) xx(:) yy(:) xx(:).*yy(:) xx(:).^2 yy(:).^2] * C;
zz = reshape(zz, size(xx));

% plot points and surface
hold on;

% Plot the points
line(data(:,1), data(:,2), data(:,3), 'LineStyle','none', 'Marker','.', 'MarkerSize',25, 'Color','r')

% Make a triangular surface plot from xx, yy, zz
tri = delaunay(xx,yy); %x,y,z column vectors
trisurf(tri,xx,yy,zz, 'FaceColor','interp', 'EdgeColor','b', 'FaceAlpha',0.2);

% Change view propeties
title('Fitting surface');
grid on; axis tight equal;
xlabel x; ylabel y; zlabel z;

% Calculate the misfit between surface and each points:
zfit = [ones(numel(x),1) x y x.*y x.^2 y.^2] * C;
h = -2: 0.01: 2;
misfit = z - zfit;
mu = mean(misfit); sigma = std(misfit);
f = exp(-(h-mu).^2./(2*sigma^2))./(sigma*sqrt(2*pi));
title('Probability function of misfit');
hold on

Simple 1D velocity model inversion from P arrival time

Refer to Chapter 5, Introduction to Seismology, Shearer 2009.


From the P-wave travel time data below (note that the reduction velocity of 8km/s), inverse for the 1D velocity model using T(X) curve fitting (fit the T(X) curve with lines, then invert for the ray parameter p and delay time τ(p), then solve for the velocity and depth).


Fig 1. P-wave travel time from a seismic experiment.



We construct the T(X) plot from the data: y = T  – X/8 => T = y + X/8 and then divide the T(X) curve into fitting lines (Fig. 2).

Fig 2. T(X) plot with lines fitting to data points.

The ray parameter and delay is calculated for each line by using:

p = dT/dX

τ = T – pX

These parameters p and τ are slope and intercept of a line, respectively.

The τ(p) can be written as:

Screenshot from 2016-12-27 17-14-52.png

as in the matrix form of:

Screenshot from 2016-12-27 17-14-57.png

or, in short,

We can invert for the thickness of each layer using least square inversion:

If we assume ray slowness u(i) = p(i) for each line, we can solve the h and v for the 1D velocity model.

Here is the result of the inversion:


The sharp increase in Vp from ~6.6 km/s to 7.7 km/s at ~ 30km depth suggest for the Moho discontinuity.

The codes for the problem is written in python and can be downloaded: source codeτ inversion code, data points.

Nguyen Cong Nghia, IESAS


Ray tracing through a 1-D velocity model

Refer to Chapter 4 of Shearer, Introduction to Seismology.


For a ray piercing through Earth, the ray parameter (or horizontal slowness) p is defined by several expressions:

Screenshot from 2016-12-21 16-38-07.png

where u = 1/v is the slowness, θ is the ray incidence angle, T is the travel time, X is the horizontal range and utp is the slowness at the ray turning point.

Screenshot from 2016-12-21 16-37-05.png

The vertical slowness is defined as:


and integral expressions for the surface-to-surface travel time are:

Screenshot from 2016-12-21 16-41-26.png

Screenshot from 2016-12-21 16-41-32.png

With these equations we can calculate the travel time (T) and horizontal distance (X) for a given ray with ray parameter p and velocity model v(z).

Apply these to a problem (Exercise 4.8 Shearer):

(COMPUTER) Consider MARMOD, a velocity-versus-depth model, which is typical of much of the oceanic crust (Table 4.1). Linear velocity gradients are assumed to exist at intermediate depths in the model; for example, the P velocity at 3.75 km is 6.9 km/s. Write a computer program to trace rays through this model and produce a P-wave T(X) curve, using 100 values of the ray parameter p equally spaced between 0.1236 and 0.2217 s/km. You will find it helpful to use subroutine LAYERXT (provided in Fortran in Appendix D and in the supplemental web material as a Matlab script), which gives dx and dt as a function of p for layers with linear velocity gradients. Your program will involve an outer loop over ray parameter and an inner loop over depth in the model. For each ray, set x and t to zero and then, starting with the surface layer and proceeding downward, sum the contributions, dx and dt, from LAYERXT for each layer until the ray turns. This will give x and t for the ray from the surface to the turning point. Multiply by two to obtain the total surface-to-surface values of X(p) and T(p). Now produce plots of: (a) T(X) plotted with a reduction velocity of 8 km/s, (b) X(p), and (c) τ(p). On each plot, label the prograde and retrograde branches. Where might one anticipate that the largest amplitudes will occur?

Screenshot from 2016-12-21 21-32-25.png

Using the LAYERXT subroutine (Appendix D of Shearer), the FORTRAN and MATLAB can be downloaded from here. In this example, we will use python to calculate and plot the result.




The main program is as follow:

from ttcal import layerxt
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

v_model = pd.read_csv('v_model.dat')

p1 = 0.1236
p2 = 0.2217
n = 100

result = [['p','X','T']]
for p in np.linspace(p1, p2, n):
    X = 0; T = 0;
    for i in range(0, len(v_model)-1):
            p = p
            h = v_model.h[i+1] - v_model.h[i]
            utop = 1/v_model.vp[i]
            ubot = 1/v_model.vp[i+1]
            dx, dt, irtr = layerxt(p,h,utop,ubot)
            X += dx; T += dt
        #If the ray reflected at layer the dx, dt calculation stops
            if irtr == 2:

result_df = pd.DataFrame(result, columns=result.pop(0))
result_df = result_df[::-1]

#Multiply by 2 to get total surface-to-surface value ot X(p) and T(p)
result_df['T'] = 2*result_df['T']
result_df['X'] = 2*result_df['X']

result_df['RT'] = result_df['T']-result_df['X']/8
result_df.plot(x='X', y='RT', legend=False)
plt.title('Reduction time')
plt.ylabel('Reduce time T - X/8 (s)')
plt.xlabel('Distance (km)')
plt.text(20,0.80,'Prograde', rotation =40)
plt.text(40,1.6,'Retrograde', rotation =35)
plt.text(80,1.00,'Prograde', rotation =0)

result_df.plot(x='p', y='X', legend=False)
plt.ylabel('Distance X (km)')
plt.xlabel('Ray parameter (km/s)')

result_df['Taup'] = result_df['T'] - result_df['p']*result_df['X']
result_df.plot(x='p', y='Taup', legend=False)
plt.ylabel('Tau(p) (s)')
plt.xlabel('Ray parameter (km/s)')

v_model.h = v_model.h * (-1)
v_model.plot(x='vp', y='h', legend=False)
plt.ylabel('Depth (km)')
plt.xlabel('Velocity (km/s)')


LAYERXT subroutine written in python:

from math import sqrt, log

def layerxt(p,h,utop,ubot):
    ''' LAYERXT calculates dx and dt for ray in layer with linear velocity gradient
        Inputs:   p     =  horizontal slowness
             h     =  layer thickness
             utop  =  slowness at top of layer
             ubot  =  slowness at bottom of layer
        Returns:  dx    =  range offset
             dt    =  travel time
             irtr  =  return code
                   = -1,  zero thickness layer
                   =  0,  ray turned above layer
                   =  1,  ray passed through layer
                   =  2,  ray turned within layer, 1 segment counted in dx,dt'''
# ray turned above layer
    if p >= utop:
        dx = 0
        dt = 0
        irtr = 0
        return dx,dt,irtr

#Zero thickness layer
    elif h == 0:
        dx = 0
        dt = 0
        irtr = -1
        return dx,dt,irtr

#Calculate some parameters
    u1 = utop
    u2 = ubot
    v1 = 1.0/u1
    v2 = 1.0/u2
#Slope of velocity gradient
    b = (v2 - v1)/h
    eta1 = sqrt(u1**2 - p**2)

#Constant velocity layer
    if b == 0:
        dx = h*p/eta1
        dt = h*u1**2/eta1
        irtr = 1
        return dx,dt,irtr

    x1 = eta1/(u1*b*p)

#Ray turned within layer, no contribution to integral from bottom point
    if p >= ubot:
        dx = x1
        dtau = tau1
        dt = dtau + p*dx
        irtr = 2    
        return dx,dt,irtr



    return dx,dt,irtr

def flat(zsph,vsph):
    ''' Calculate the flatten transformation from a spherical Earth.
        zf, vf = flat(zsph,vsph)'''
# Radius of the Earth
    a = 6371.0

    rsph = a - zsph
    zf = -a*log(rsph/float(a))
    vf = (a/float(rsph))*vsph
    return zf, vf

Input velocity model:


The codes can be downloaded here: main program, python LAYERXT subroutine, velocity model.

You need to install numpy, pandas and matplotlib package to run this. Normally it can be installed in the terminal by (presume python has been installed):

pip install numpy pandas matplotlib

Nguyen Cong Nghia – IESAS




Modeling a wave on a string using finite differences

Based on problem 3.7 chapter 3 of Introduction to Seismology (Shearer)

(COMPUTER) In the case of plane-wave propagation in the x direction within a uniform medium, the homogeneous momentum equation (3.9) for shear waves can be expressed as


,where u is the displacement. Write a computer program that uses finite differences to solve this equation for a bar 100 km in length, assuming β = 4 km/s. Use dx = 1 km for the length spacing and dt = 0.1 s for the time spacing. Assume a source-time function  at u (50 km) of the form:


Apply a stress-free boundary condition at u(0 km) and a fixed boundary condition at u (100 km).  Approximate the second derivatives using the finite difference scheme:


Plot u(x) at 4 s intervals from 1 to 33 s. Verify that the pulses travel at velocities of 4 km/s. What happens to the reflected pulse at each endpoint? What happens when the pulses cross?

Solutions can be downloaded here.

!* *
!* 1-D Plane wave propagation with fixed and free boundaries * 
!* *

!----initialzie t, dx, dt, tlen, beta, and u1, u2, u3 arrays----c

implicit none
 integer:: i, counter, nx, nx1
 integer:: it, ntmax
 real(8):: t, dt, dt2, dx, dx2, beta, beta2, tlen, tmax, rhs
 real*8, allocatable:: u1(:), u2(:), u3(:)
 character(10):: data
 character(4):: nf
!----parameters ----c
! dt must be less than (dx/beta) for numerical stability

nx = 100 ! number of grid
 dt = 0.10d0 ! time interval
 dx = 1.0d0 ! grid interval
 beta = 4.0d0 ! wave velocity
 tlen = 5.0d0 ! time length of wave source
 ntmax = 1000 ! maximum time step (finish at it = ntmax, or)
 tmax = 33.0d0 ! maximum calculation time (finish at t< tmax)

!---- allocate dimension variables ----c

nx1 = nx + 1
 allocate (u1(nx1), u2(nx1), u3(nx1))

!---- initialize t, counter, u1, u2, u3 ----c

t = 0.0d0
 counter = 1
 do i = 1, nx1
 u1(i) = 0.0d0
 u2(i) = 0.0d0
 u3(i) = 0.0d0
 end do

!---- calculate c^2, dt^2, dx^2 ----c
 beta2 = beta**2
 dt2 = dt**2
 dx2 = dx**2

! ============= Time marching loop ===========
 do it = 1, ntmax

t = t + dt

!---- calculate u3(i) ----c

do i= 2, nx
 end do

!---- free boundary (du/dx=0) ----c
!---- fixed boundary (u=0) ----c
!---- source time function c----c

if (t.le.tlen) then
 u3(51) = sin(3.1415927*t/tlen)**2
 end if

!---- change new and old variables ----c

do i=1,nx1
 u1(i) = u2(i)
 u2(i) = u3(i)
 end do
!---- make data file ----c
 write(nf,'(i3.3)') counter
 data = "data"//nf

 do i=1, nx1
 write(7,*) i, u2(i)
 end do
!---- output u2 at desired intervals, stop when t is big enough

write(6,'(a5,1x,i3,1x,f6.3)') 'loop', counter, t
 if (t.gt.tmax) exit

counter = counter + 1

end do


The following is used to plot the results:

gnuplot << eof
set term gif animate
set output "animate.gif"
n=9 #n frames
set xrange [0:102]
set yrange [-2:2]
list = system('ls data*')
do for [file in list] {
 plot file w lines lt 1 lw 1.5 t sprintf("t=%i sec",i/10)
 i = i + 1
set output


Nguyen Cong Nghia – IESAS