Archive | July, 2012

Blue Jay and Scrub Jay : Using rvertnet to check the distributions in R

30 Jul

As part of my Google Summer of Code, I am also working on another package for R called rvertnet. This package is a wrapper in R for VertNet websites. Vertnet is a vertebrate distributed database network consisting of FishNet2MaNISHerpNET, and ORNIS. Out of that currently Fishnet, HerpNET and ORNIS have their v2 portals serving data. rvertnet has functions now to access this data and import them into R data frames.

Some of my lab mates faced a difficulty in downloading data for Scrub Jay (Aphelocoma spp. ) due to large number of records (220k+) on ORNIS, so decided to try using rvertnet package which was still in development that time. The package really helpe and got the results quickly. So while ecploring that data I came up with this case study.

So here to get data for Blue Jay (Cyanocitta cristata) which is distributed in eastern USA, we use vertoccurrence function and specify taxon we are looking for as Blue Jay with t=”Cyanocitta cristata” and we need to specify this is bird species with grp=”bird” since currently we have to access the data form three different websites of VertNet for Fishes, Birds and Herps(Reptiles and Amphibians). This fetches us all the records for Blue Jay. Now we want to get discard the records without Latitude and Longitude values so we use subset function on the data with Latitude !=0 & Longitude != 0. This gives us all geocoded records for Blue jay which then map using maps and ggplot packages like we did in earlier post.

library(rvertnet)
bluej1=vertoccurrence(t="Cyanocitta cristata",grp="bird")
bluej2=subset(bluej1,Latitude !=0 & Longitude != 0)

library(maps)
library(ggplot2)
world  = map_data("world")
ggplot(world, aes(long, lat)) +
  geom_polygon(aes(group = group), fill = "white",
               color = "gray40", size = .2) +
  geom_jitter(data = bluej2,
              aes(Longitude, Latitude), alpha=0.6, size = 4,
              color = "red") +
                opts(title = "Cyanocitta cristata (Blue Jay)")

The final output of the code snippet is as following.

Now let us put Blue Jay and Scrub Jay side by side on the map to see how they are distributed in North America.  This is same as the earlier code except that we get data for both Jays and while plotting we use additional geom_jitters for the other Jay with a different color to distinguish the two. Also not the reduction in the size from 4 to 1 in order to make the points clearly visible

library(rvertnet)
bluej1=vertoccurrence(t="Cyanocitta cristata",grp="bird")
bluej2=subset(bluej1,Latitude !=0 & Longitude != 0)
scrubj1=vertoccurrence(t="Aphelocoma",grp="bird")
scrubj2=subset(scrubj1,Latitude !=0 & Longitude != 0)

library(maps)
library(ggplot2)
world = map_data("world")
ggplot(world, aes(long, lat)) +
  geom_polygon(aes(group = group), fill = "white", color = "gray40",
               size = .2) +
  geom_jitter(data = bluej2,
              aes(Longitude, Latitude), alpha=0.6, size = 1,
              color = "blue") +
                opts(title = "Blue Jay and Scrub Jay") +
  geom_jitter(data = scrubj2,
              aes(Longitude, Latitude), alpha=0.6, size = 1,
              color = "brown")

The final output of the code snippet is as following.

Map biodiversity records with rgbif and ggmap packages in R

23 Jul

When I attended usrR! 2012 last month, there was an interesting presentation by Dr. David Kahle about the package ggmap. It is a package built over ggmap2 and helps us map spatial data over online maps like Google maps or Open Street Maps. I decided to give ggmap package a try with biodiversity data.

So first let us create a map for the Plain Tiger or the African Monarch Butterfly (Danaus chrysippus). We use occurrencelist from rgbif package again like previous post.

We use qmap function from ggmap package to quickly pull up the base map from Google Maps. So in essence the qmap function eliminates two step process of getting map data using map_data function and then setting up map display using ggplot function into one step. We use geom_jitter function to plot the occurrence points in the specified size(size = 4) and color(color = “red”).

library(rgbif)
Dan_chr=occurrencelist(sciname = 'Danaus chrysippus',
                       coordinatestatus = TRUE,
                       maxresults = 1000,
                       latlongdf = TRUE, removeZeros = TRUE)
library(ggmap)
library(ggplot2)
wmap1 = qmap('India',zoom=2)
wmap1 +
      geom_jitter(data = Dan_chr,
                  aes(decimalLongitude, decimalLatitude),
                  alpha=0.6, size = 4, color = "red") +
                    opts(title = "Danaus chrysippus")

Here is the opuput map of the code snippet:

Though in earlier code we have used geom_jitter, high density of the points in some regions are not clearly seen. If we want to get better idea about the number of points we can try two dimensional density maps using the stat_density2d function. It just adds density lines on the map showing higher density with darker circles.

library(rgbif)
Dan_chr=occurrencelist(sciname = 'Danaus chrysippus',
                       coordinatestatus = TRUE,
                       maxresults = 1000,
                       latlongdf = TRUE, removeZeros = TRUE)
library(ggmap)
library(ggplot2)
wmap1 = qmap('India',zoom=2)
wmap1 +
  stat_density2d(aes(x = decimalLongitude, y = decimalLatitude,
                     fill = ..level.., alpha = ..level..),
                 size = 4, bins = 6,
                 data = Dan_chr, geom = 'line') +
      geom_jitter(data = Dan_chr,
                  aes(decimalLongitude, decimalLatitude),
                  alpha=0.6, size = 4, color = "red") +
                    opts(title = "Danaus chrysippus :: Density Plot")

Map biodiversity records with rgbif and dismo packages in R

16 Jul

In the earlier post we generated maps from GBIF biodiversity records using maps and ggplot2 packages. We used world map with country borders for that. Now we will generate maps with google maps as base layer using dismo package.

Like earlier we download data for Danaus chrysippus from GBIF using occurrencelist function into a data frame Dan_chr.

Then use dismo package which has function gmap to quickly download base layer maps form google and display it using plot function. We can specify the extent of map range we need to download using extent function and specifying Latitude and Longitude range. We plot the points first by converting them into Mercator system using points.

library(rgbif)
Dan_chr=occurrencelist(sciname = 'Danaus chrysippus',
                       coordinatestatus = TRUE,
                       maxresults = 1000,
                       latlongdf = TRUE, removeZeros = TRUE)
library(dismo)
e = extent( -179 , 179 , -80 , 80 )
r = gmap(e)
plot(r, interpolate=TRUE, main="Map")
xy1=cbind(Dan_chr$decimalLongitude,Dan_chr$decimalLatitud)
points(Mercator(xy1) , col='blue', pch=20)
text(160,0, "\n\n\nDanaus \nchrysippus", adj = c(0,1),
     col="red")

The output of the code snippet is as follows:

Map of Danaus chrysippus

Map biodiversity records with rgbif, maps and ggplot2 packages in R

9 Jul

Global Biodiversity Information Facility or GBIF is an international consortium working towards making Biodiversity information available through single portal to everyone.  GBIF with its partners are working towards mobilizing data, developing data and metadata standards, developing distributed database system and making the data accessible through APIs. At this point this largest single window data source covering wide spectrum of taxa and geographic range.

rgbif is a package in R to download the data from GBIF data portal using its API. Once the data is available as data frame in R, we can use several functions and packages to analyse and visualize it. [ Cran, rOpenSci, my work ]

Here first we use occurrencelist function to download 1000 records (maxresults = 1000) for “Danaus plexipus”  (sciname = ‘Danaus plexippus’) which is Monarch Butterfly. We specify that we want records that have been geo-coded (coordinatestatus=TRUE), we want it to be stored in data frame (latlongdf=TRUE) and we want to remove any records that have zeros in Latitude and Longitude values (removeZeros = TRUE). This command will result in a data frame dan_ple with monarch occurrence records.

Now we use map_data function form maps package to get world map. Use ggplot function form ggplot2 package to plot the world map as polygon layer. On top of that we plot the Monarch butterfly occurrence points using decimalLatitude and decimalLongitude columns in red color. and specify the title of the map.

library(rgbif)
dan_ple=occurrencelist(sciname = 'Danaus plexippus', 
                       coordinatestatus = TRUE, maxresults = 1000, 
                       latlongdf = TRUE, removeZeros = TRUE)
library(maps)
library(ggplot2)
world = map_data("world")
ggplot(world, aes(long, lat)) +
geom_polygon(aes(group = group), fill = "white", 
              color = "gray40", size = .2) +
geom_jitter(data = dan_ple,
aes(decimalLongitude, decimalLatitude), alpha=0.6, 
             size = 4, color = "red") +
opts(title = "Danaus plexippus")

The final output of the code snippet is as following.

Map of Danaus plexippus

More maps using other packages to come soon.