The “ABC” of writing or editing a scientific manuscript

If you’re doing science, you will inevitably need to write a manuscript. I have read and collected, from various sources, some important points about writing in science. The source I recommend most is the “Writing in Science” course by Stanford University. I would like to share my collection with my readers. Don’t worry, I will keep it short and concise so that you don’t need to read it like a novel but can always refer to it while writing.

The aim of the writer should be to make the manuscript clear, elegant, and stylish. It should have a streamlined flow of ideas, for which the writer must follow a set of rules. After completing each sentence, writers must ask themselves whether it is readable, understandable, and enjoyable to read.

Editing the manuscript:

Sentence Level Editing:

Writing a manuscript always starts with editing at the lowest level, which is a sentence. For editing the manuscript at the sentence level, the writer should keep in mind a few things:

  1. Use active voice (Subject + Verb + Object). It is livelier and easier to read.
  2. Write with verbs (instead of nouns).
    • Use strong verbs
    • Avoid turning verbs into nouns
    • Don’t bury the main verb
    • Pick the right verb
    • Use “to be” verbs purposefully and sparingly.
    • Don’t turn spunky verbs into clunky nouns.
    • Use “compared to” to point out similarities between different things. Use “compared with” to point out differences between similar things (often used in science).
    • Use “that” for defining clauses and “which” for non-defining clauses.
  3. Avoid using “his/her”. Instead, use “their”.
  4. Cut unnecessary words and phrases ruthlessly. Get rid of
    • Dead-weight words and phrases. E.g., as it is well known, as it has been shown.
    • Empty words and phrases. E.g., basic tenets of, methodologic, important.
    • Long words or phrases.
    • Unnecessary jargon and acronyms
    • Repetitive words or phrases
    • Adverbs. E.g., very, really, quickly, basically, generally, etc.
  5. Eliminate negatives and superfluous uses of “there are/there is”.
  6. Omit needless prepositions. Change, “They agreed that it was true” to “they agreed it was true”.
  7. Experiment with punctuation (comma, colon, dash, parentheses, semicolon, period). Use them to vary sentence structure.
    • Semicolon connects two independent clauses.
    • Colon to separate items in a list, quote, explanation, conclusion or amplification.
    • Parentheses to insert an afterthought/explanation.
    • Dash to add emphasis, or to insert an abrupt definition or description, join or condense. Don’t overuse it, or it loses its impact.
  8. Pairs of ideas joined by “and”, “or”, or “but” should be written in parallel form. E.g., The velocity decreased by 50% but the pressure decreased by only 10%.

Paragraph level Editing:

  1. 1 paragraph = 1 idea
  2. Give away the punch line early.
  3. Keep a logical flow of ideas: general -> specific -> take-home message. Logical arguments: if a then b; a, therefore b.
  4. Use parallel sentence structure.
  5. Use transition words only if necessary.
  6. Put the emphasis at the end.
  7. Vary sentence length: long, short, long.
  8. Follow: Arguments, counter-arguments, rebuttals

Writing Process:

Many writers are not sure how to start and how to organize their work. Here are some tips.

  1. Prewriting: give 70% time
    • Get Organized first
      • Arrange key facts and citations from the literature into a crude road map; think in paragraphs and sections.
      • Like ideas should be grouped; like paragraphs should be grouped.
  2. Writing the first draft: give 10% time
    • Don’t be a perfectionist: get the ideas down in complete sentences, in order.
    • Focus on logical organization
    • Write it quickly and efficiently
  3. Revision: give 20% time
    • Read your work out loud: the brain processes the spoken word differently.
    • Do a verb check: underline the main verb in each sentence and watch for lackluster verbs, passive verbs, and buried verbs.
    • Cut clutter. Watch out for dead-weight words, empty words, long words and phrases, unnecessary jargon and acronyms, repetitive words or phrases, and adverbs.
    • Do an organizational review: tag each paragraph with a phrase or sentence that sums up the main point.
    • Get feedback from others: ask someone outside your department to read your manuscript. They should be able to grasp: the main findings, take-home messages, and significance of your work. Ask them to point out particularly hard-to-read sentences and paragraphs.
    • Get editing help: find a good editor to edit your work.

 

Checklist for Final Draft

  1. Check for consistency: the values of any variable (such as the mean temperature of your data) used in different sentences, paragraphs or sections should be the same.
  2. Check for numerical consistency:
    • Numbers in your abstract should match the numbers in your tables/figures/text,
    • Numbers in the text should match those in tables/figures.
  3. Check for references:
    • Does that information really exist in that paper?
    • Always cite/go back to primary source.

The Original Manuscript:

Recommended order of writing:

  1. Tables and Figures: Very important
    • Should stand-alone and tell a complete story. The reader should not need to refer back to the text.
    • Use fewest figures/tables
    • Do not present the same data in both table and figure.
    • Figures: Visual impact, show trends and patterns, tell a quick story, tell a whole story, highlight particular result
      • Keep it simple (If it’s too complex then maybe it belongs to the table)
      • Make easy to distinguish the group.
    • Tables: give precise values.
      • Use superscript symbols to identify footnotes and give footnotes to explain experimental details.
      • Use three horizontal lines for table format.
      • Make sure everything lines up and looks professional
      • Use a reasonable number of significant figures
      • Give units
      • Omit unnecessary columns
  2. Results:
    • Summarize what the data show
    • Point out simple relationships
    • Describe big picture trends
    • Cite figures or tables that present supporting data.
    • Avoid repeating numbers that are already available in tables or figures.
    • Break into subsections with headings, if necessary.
    • Complement the information that is already in tables and figures
    • Give precise values that are not available in the figure
    • Report the percent change or percent difference if the absolute values are given in tables.
    • Don’t forget to talk about negative results.
    • Reserve information about what you did for the methods section
    • Reserve comments on the meaning of your results for the discussion section.
    • Use past tense for completed actions:
      • We found that…
      • Women were more likely to…
      • Men smoked more cigarettes than…
    • Use the present tense for assertions that continue to be true, such as what the tables show, what you believe, and what the data suggest:
      • Figure 1 shows…
      • The findings confirm….
      • The data suggest…
      • We believe that this shows…
    • Use the active voice
  3. Methods:
    • Give a clear overview of what was done.
    • Give enough information to replicate the study.
    • Be complete but make life easy for your reader:
      • Break into smaller subsections with subheadings
      • Cite a reference for commonly used methods
      • Display a flow diagram or table where possible.
      • May use jargon and the passive voice more liberally
    • Use past tense to report methods (“we measured”)
    • Use present tense to describe how data are presented in the paper (“data are summarized as means ± SD”)
  4. Introduction:
    • Typically 3 paragraphs long (recommended range 2-5)
    • Should focus on the specific hypothesis/aim of your study
    • Information comes in Cone format:
      • What’s known: Paragraph 1
      • What’s unknown: limitations and gaps in previous studies: paragraph 2
      • Your burning question/hypothesis/aim: paragraph 3
      • Experimental approach: paragraph 3
      • Why your experimental approach is new, different and important: paragraph 3
    • Keep paragraphs short
    • Write for the general audience (clear, concise, non-technical)
    • Do not answer the research questions.
    • Summarize at a high level
  5. Discussion:
    • Information comes in inverted cone format
    • Answer the questions asked
    • Support your conclusion
    • Defend your conclusion
    • Give a big-picture take-home message: what do my results mean and why should anyone care. Make sure your take-home message is clear and consistent.
    • Use active voice
    • Tell it like a story
    • Don’t travel too far from your data
    • Focus on limitations that matter
    • Verb Tense:
      • Past when referring to study details, results, analyses, and background research. E.g., we found that…, Subjects may have experienced…, Miller et al. found…
      • Present when talking about what the data suggest. E.g., the greater weight loss suggests…, the explanation for this difference is not clear, potential explanations include…
  6. Abstract:
    • Overview of the main story
    • Gives highlights from each section of the paper
    • Limited length (100-300 words)
    • Stands on its own
      • Background
      • Question/aim/hypothesis
      • Experiment
      • Results
      • Conclusion
      • Implication, speculation or recommendation

Plagiarism:

In the end, it is profoundly important to stay away from plagiarism. Do not pass off other people’s writing (or tables and figures) as your own.

What is plagiarism:

  1. Cutting or pasting sentences or even phrases
  2. Slightly rewriting or re-arranging others’ words. It is unlikely that two people will independently come up with the exact same string of 7-8 words in a sentence.

-Utpal Kumar

Institute of Earth Sciences, Academia Sinica

Plot geochemical data using Python: An example of analyzing Adakite rock data from GEOROC

We download the precompiled adakite data from the GEOROC database.
For a first impression of adakite, the Wikipedia page gives some clues: Adakites are volcanic rocks of intermediate to felsic composition that have geochemical characteristics of magma thought to have formed by partial melting of altered basalt that is subducted below volcanic arcs.

In this example, we demonstrate how to use Python to simplify the data, discard the null data, and classify and plot the geochemical properties.

First, let’s look at the data. The table is quite large, and the available measurements differ between samples: some elements are reported, others are not.


Now we can start by importing some useful packages:

import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
import warnings
warnings.filterwarnings("ignore")
plt.rcParams['figure.figsize'] = (20, 10)
plt.style.use('default')

We use pandas to import the data; the encoding argument avoids character-decoding errors.

df = pd.read_csv("ADAKITE.csv", encoding="iso-8859-1")

We use the dropna() method to discard columns with less than 50% of their data available.

df = df.dropna(axis=1, thresh=int(0.5*len(df)))
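To make the effect of thresh concrete, here is a minimal sketch on a small made-up table. The column values are hypothetical and only borrow the GEOROC naming style; the call itself is the same as above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table: 4 samples, three measured quantities
toy = pd.DataFrame({
    "SIO2(WT%)": [55.2, 61.0, 63.4, 58.9],      # 4 of 4 values present
    "MGO(WT%)": [4.1, np.nan, 2.2, 3.0],        # 3 of 4 values present
    "NB(PPM)": [np.nan, np.nan, np.nan, 7.5],   # 1 of 4 values present
})

# Keep only the columns that have at least 50% non-null values (here: at least 2)
min_count = int(0.5 * len(toy))
cleaned = toy.dropna(axis=1, thresh=min_count)

print(list(cleaned.columns))  # ['SIO2(WT%)', 'MGO(WT%)']
```

The sparsely populated NB(PPM) column is dropped, while the two better-populated columns survive even though one of them still contains a missing value.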

To have a look at where the samples occur, let’s plot the data on a map. We select the longitude and latitude columns, convert them to numeric format, and drop the rows where either coordinate is missing.

plt.figure(figsize=(10,10))
m = Basemap(lon_0=180,projection='hammer')
# Convert both columns to numbers (non-numeric entries become NaN) and drop
# incomplete rows together, so that each longitude stays paired with its latitude
coords = df[["LONGITUDE MIN", "LATITUDE MIN"]].apply(pd.to_numeric, errors='coerce').dropna()
# Project geographic coordinates (degrees) to map coordinates
lon_, lat_ = m(coords["LONGITUDE MIN"].values, coords["LATITUDE MIN"].values)
m.scatter(lon_, lat_, marker = "o" ,s=15, c="r" , edgecolors = "k", alpha = 1)
m.drawcoastlines()
plt.title('Adakite rock samples')
plt.show()


We make a function called plot_harker() to plot a Harker diagram:

def plot_harker(x,xlabel,y,ylabel,title=None,xlim=(40,80),ylim=None,color = "b",label=None):
    # Scatter one quantity (y) against SiO2 (x); the default x-range is 40-80 wt%
    plt.scatter(x=x,y=y,marker="o", c=color,s=8,label = label)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.xlim(xlim)
    # Set the y-range only when one is given; otherwise keep the automatic limits
    if ylim is not None:
        plt.ylim(ylim)
    if title is not None:
        plt.title(title)

… and then plot some elements using that function:

plt.figure(figsize=(12,12))
plt.subplot(321)
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["AL2O3(WT%)"],ylabel=(r'$Al_2O_3$ (wt%)'),title=r'$SiO_2$ vs $Al_2O_3$')
plt.subplot(322)
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["MGO(WT%)"],ylabel=(r'$MgO$ (wt%)'),title=r'$SiO_2$ vs $MgO$')
plt.subplot(323)
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["FEOT(WT%)"],ylabel=(r'$FeOt$ (wt%)'),title=r'$SiO_2$ vs $FeOt$')
plt.subplot(324)
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["TIO2(WT%)"],ylabel=(r'$TiO_2$ (wt%)'),title=r'$SiO_2$ vs $TiO_2$')
plt.subplot(325)
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["NA2O(WT%)"],ylabel=(r'$Na_2O$ (wt%)'),title=r'$SiO_2$ vs $Na_2O$')
plt.subplot(326)
plot_harker(x=df["SIO2(WT%)"],xlabel=r'$SiO_2$ (wt%)',
            y=df["K2O(WT%)"],ylabel=(r'$K_2O$ (wt%)'),title=r'$SiO_2$ vs $K_2O$')
plt.suptitle(r'Harker diagram of Adakite vs $SiO_2$',y=0.92,fontsize=15)
plt.subplots_adjust(hspace=0.3)
plt.show()


We can try to see the tectonic settings of the rock:

plt.figure(figsize=(8,8))
tec = df['TECTONIC SETTING'].dropna()
tec = tec.replace('ARCHEAN CRATON (INCLUDING GREENSTONE BELTS)','ARCHEAN CRATON')
tec_counts = tec.value_counts()
tec_counts.plot(kind="bar",fontsize=10)
plt.title('Tectonic settings of Adakite')
plt.ylim([0,500])
plt.show()


The following code demonstrates how to create new columns and divide the data. We divide the data into High Silica Adakite (SiO2 > 60 wt%) and Low Silica Adakite (SiO2 < 60 wt%).

df['SR/Y'] = (df["SR(PPM)"]/df["Y(PPM)"])
df['CAO+NA2O'] = df['CAO(WT%)'] + df['NA2O(WT%)']
df['CR/NI'] = df['CR(PPM)'] / df['NI(PPM)']
df_hsa = df[df["SIO2(WT%)"] > 60]
df_lsa = df[df["SIO2(WT%)"] < 60]
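The same pattern can be tried on a tiny made-up table before running it on the full dataset. This is only a sketch: the numbers are hypothetical, and the names toy, toy_hsa and toy_lsa are introduced here for illustration.

```python
import pandas as pd

# Hypothetical values, using the same column names as the GEOROC table
toy = pd.DataFrame({
    "SIO2(WT%)": [56.0, 60.0, 64.5, 68.2],
    "SR(PPM)": [900.0, 750.0, 620.0, 480.0],
    "Y(PPM)": [18.0, 15.0, 10.0, 8.0],
})

# New column computed element-wise from two existing columns
toy["SR/Y"] = toy["SR(PPM)"] / toy["Y(PPM)"]

# Boolean masks select the rows of each group
toy_hsa = toy[toy["SIO2(WT%)"] > 60]   # high silica
toy_lsa = toy[toy["SIO2(WT%)"] < 60]   # low silica

print(len(toy_hsa), len(toy_lsa))  # 2 1
```

Note that a sample with exactly 60 wt% SiO2 falls into neither group with these strict inequalities; use >= on one side if such samples should be kept.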


Bonus: Let’s see the number of publications about adakite in the GEOROC database by year:

# Keep entries that cite a single reference: such strings contain exactly two
# bracketed fields, and the second one is read as the publication year
cite = [c for c in df["CITATIONS"].dropna() if len(c) > 20 and c.count('[') == 2]
year = [int(c.split('[')[2].split(']')[0]) for c in cite]
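Once the years are extracted, they still need to be counted per year. A minimal sketch with the standard library follows; the citation strings are invented, and their "[id] AUTHOR: TITLE [year]" layout is an assumption inferred from the parsing code above, not a documented GEOROC format.

```python
from collections import Counter

# Hypothetical citation strings in the assumed "[id] AUTHOR: TITLE [year]" layout
citations = [
    "[1001] AUTHOR A.: AN EXAMPLE STUDY OF ADAKITES [2004]",
    "[1002] AUTHOR B.: ANOTHER EXAMPLE STUDY [2004]",
    "[1003] AUTHOR C.: A THIRD EXAMPLE STUDY [2010]",
    # Two references in one entry: skipped, because the year would be ambiguous
    "[1004] AUTHOR D.: FIRST OF TWO [2011] [1005] AUTHOR E.: SECOND OF TWO [2012]",
]

# Keep only entries with exactly two bracketed fields (id and year)
single = [c for c in citations if len(c) > 20 and c.count("[") == 2]
# The text between the second "[" and the following "]" is taken as the year
years = [int(c.split("[")[2].split("]")[0]) for c in single]

per_year = Counter(years)
print(sorted(per_year.items()))  # [(2004, 2), (2010, 1)]
```

The resulting counts can be passed to a bar plot, for example with plt.bar(list(per_year.keys()), list(per_year.values())).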


The python code below is the full program of this example:

Nguyen Cong Nghia
IESAS

Using Git and GitHub to store our programs (tutorial)

Git & GitHub tutorial

Git is a version control system for tracking changes in computer files. It was initially created by Linus Torvalds (the creator of Linux) in 2005.
We can use it to store any kind of program. It is a distributed version control system, which means that many developers can work on the same project without being on the same network. It tracks every change made to the files in the project. The user can revert any file to an earlier state at any time, provided that state had been committed to the repository. We can even look at snapshots of the code at any particular time in its history, and we can upload (or push) the files to a remote repository.

Initializing local git repository

git init

Configuring the Username and Email

git config --global user.name 'utpalkumar'
git config --global user.email 'utpalkumar50@gmail.com'

We can have a look at the configuration using the following command

git config --list

If you need any help, you can just type:

git help

Now, to add files to the index (the staging area), we use the git add command:

git add filename.py

To add all the files in the current directory (first command) or only the files matching a pattern (second command):

git add .
git add *.html

We can get information about the tracked and untracked files. Tracked files are those that Git already knows about, that is, files that have been added to the index or committed.

git status

If we make any changes to the files, we can inspect the changes using the diff command

git diff

If we want to remove the file from the index (untrack the file), then we can simply type:

git rm --cached filename.py

To remove the files from the index and the working tree:

git rm filename.txt

If we want to rename the file then we can do it using the git command too. We don’t need to untrack the file and then rename the original file and add it again

git mv filename.txt newfilename.txt

Now, we can commit the staged files to record them in our local repository; they reach the repository on GitHub only after a push.

There are two ways of doing that:
1. The first way opens vi (or the default editor) on the local computer, and the user is prompted to enter the message. It is good practice to enter meaningful messages because they help track the changes made to the files.
git commit
2. The user can also enter the message using the -m flag
git commit -m 'made some changes'

To modify the most recent commit (its message or its files), use the --amend flag

git commit --amend

If we don’t want Git to track some of the files in the current directory, we can list their names in the .gitignore file.

touch .gitignore

We can obtain the log of the git actions

git log --pretty=oneline
We can make this better formatted
git log --pretty=format:"%h : %an : %ar : %s"
For all commits within a week
git log --since=1.weeks
For all commits since some given date
git log --since="2014-01-12"
All the commits of a given author
git log --author="utpalkumar"
All commits before a given date
git log --before="2014-04-30"
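The init, add, commit and log steps above can also be scripted. The sketch below drives the same git commands from Python in a throwaway directory. The helper names run_git and demo_workflow and the "Example User" identity are hypothetical, and the script assumes the git executable is on the PATH (it returns None otherwise).

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_git(args, cwd):
    """Run one git command inside cwd and return its standard output as text."""
    result = subprocess.run(["git", *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout


def demo_workflow():
    """Initialize a repository, commit one file, and return the log lines."""
    if shutil.which("git") is None:
        return None  # git is not installed; nothing to demonstrate
    with tempfile.TemporaryDirectory() as tmp:
        run_git(["init"], tmp)
        # Set a placeholder identity for this repository only (no --global)
        run_git(["config", "user.name", "Example User"], tmp)
        run_git(["config", "user.email", "user@example.com"], tmp)
        run_git(["config", "commit.gpgsign", "false"], tmp)
        # Create a file, stage it, and commit it with a message
        Path(tmp, "filename.py").write_text("print('hello')\n")
        run_git(["add", "filename.py"], tmp)
        run_git(["commit", "-m", "made some changes"], tmp)
        # Same idea as the formatted log above: author name and subject
        return run_git(["log", "--pretty=format:%an : %s"], tmp).splitlines()


log_lines = demo_workflow()
print(log_lines)
```

Working in a temporary directory keeps the experiment away from real projects, and setting the identity without --global leaves the user's own Git configuration untouched.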

If we are a group of developers, we may not wish to add any changes to the main line of the repository before a particular sub-project is finished. We can avoid that by working in a branch.

To create a branch other than the main branch (master)
git branch mybranch
To switch to the new branch
git checkout mybranch

We can make any changes in this branch. To merge these changes into the master branch, we first need to switch to master

git checkout master

and then merge the changes from the branch into master

git merge mybranch -m "added mybranch"

To add files to a remote Git repository, we need to create an account on the GitHub website, start a new repository there, name it, and fill in the required details. Then we can add files from our local repository to the remote repository using the following commands:

git remote add origin https://repositoryaddress.git
Replace the above placeholder URL with the URL of the repository you created on the GitHub website. The first push also sets the upstream branch:
git push -u origin master

If we want to add the same files to another repository, we need to remove the added remote origin using the command

git remote rm origin # to remove the remote origin

We can download a Git repository by simply using the git clone command

git clone https://repositoryaddress.git

Plotting spiral pattern using MATLAB

MATLAB Codes:

 

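clear; close all; clc

% Parametric spiral: the radius grows with the angle t while z rises linearly
t = linspace(0,10*pi,1000);
x = t.*cos(t);
y = t.*sin(t);
z = linspace(0,2*pi,1000);

plot3(x,y,z,'LineWidth',8)
axis tight, grid on, view(35,30)

% Color the curve by point index: draw it as a degenerate surface
% whose edges take their color from the colormap
c = 1:numel(t);
surface([x(:), x(:)], [y(:), y(:)], [z(:), z(:)], ...
    [c(:), c(:)], 'EdgeColor','flat', 'FaceColor','none','LineWidth',8);
colormap(jet(numel(t)))
colorbar
saveas(gcf,'spiralPlot.png')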


Calculating the curvature of a curve


The curvature of a curve is the amount by which it deviates from being a straight line. At a given point, it is defined as the reciprocal of the radius of the best-fitting circle. So, the curvature of a straight line is zero, and the curvature of a circle of radius R is 1/R: large if R is small, and vice versa.
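These statements can be checked numerically. The sketch below uses the standard formula for a parametric plane curve (x(t), y(t)), κ = |x'y'' − y'x''| / (x'² + y'²)^(3/2), with derivatives estimated by finite differences. The function name curvature and the test curves are choices made here for illustration, and the notation may differ from the one used in the original derivation.

```python
import numpy as np


def curvature(x, y, t):
    """Curvature of the sampled plane curve (x(t), y(t)) via finite differences."""
    dx, dy = np.gradient(x, t), np.gradient(y, t)        # first derivatives
    ddx, ddy = np.gradient(dx, t), np.gradient(dy, t)    # second derivatives
    return np.abs(dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5


t = np.linspace(0, 2 * np.pi, 2001)

# Circle of radius R: the curvature should be close to 1/R everywhere
R = 2.0
kappa_circle = curvature(R * np.cos(t), R * np.sin(t), t)

# Straight line y = 2x + 1: the curvature should be zero
kappa_line = curvature(t, 2 * t + 1, t)

# Skip a few end points, where one-sided differences are less accurate
print(kappa_circle[10:-10].mean(), kappa_line.max())
```

Halving R doubles the computed curvature, in line with the reciprocal relation described above.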
