bdvis development version available for early feedback

31 Jul

Google Summer of Code 2013 is halfway through, and midterm evaluations are underway. I thought this would be a logical point to share what we have been doing in the Biodiversity Data Visualizations in R project and to open up the package for testing and some early feedback. We have named the package bdvis. The package is on GitHub, and I would appreciate it if you could install and test it. Feedback may be given in the comments here, through issues on GitHub, or by Twitter or email.

Getting data

The data was obtained from the data portal of the Global Biodiversity Information Facility (http://data.gbif.org). The data set we are looking for is the iNaturalist research-grade records. We accessed the datasets page at http://data.gbif.org/datasets/ and selected iNaturalist.org from the alphabetic list, which leads to http://data.gbif.org/datasets/provider/407. Once on this page, use the link Explore: Occurrences, and on the next page click Download: Spreadsheet of results. On that page, make sure Comma separated values is selected and then press the Download Now button. The website may take a few minutes to prepare your download. Once it is ready, a download link will be provided. Typically the file will be named something like occurrence-search-12345.zip, where the number may run to as many as 40 digits. Use the link to download the .zip file and then extract the data file occurrence-search-12345.csv into the working directory of R. Since this file has a long name, let us rename it to inat.csv for convenience.
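
The extract-and-rename step can also be done from within R. The following is only a sketch using base R's unzip() and file.rename(); the helper name extract_csv is made up for this post and is not part of bdvis, and the zip file name is the placeholder from above, so substitute the name of your own download.

```r
# Hypothetical helper (not part of bdvis): pull the first .csv out of a
# zip archive and give it a shorter name in the working directory.
extract_csv <- function(zip_path, new_name = "inat.csv") {
  if (!file.exists(zip_path)) {
    message("Download not found: ", zip_path)
    return(FALSE)
  }
  listing <- unzip(zip_path, list = TRUE)   # data frame with a Name column
  csv_name <- grep("\\.csv$", listing$Name, value = TRUE)[1]
  if (is.na(csv_name)) {
    message("No .csv file inside ", zip_path)
    return(FALSE)
  }
  unzip(zip_path, files = csv_name, exdir = ".")
  file.rename(csv_name, new_name)           # TRUE on success
}

# Placeholder file name from the text; use the name of your own download.
extract_csv("occurrence-search-12345.zip")
```

The function returns FALSE with a message instead of stopping when the archive is missing, so it is safe to run before the download has finished.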

Now we are ready to load our data.

inat = read.csv("inat.csv")
dim(inat)

If it shows something like

[1] 66581    47

we are on the right track. Our data is loaded into R. For the time being, this package handles only the data format provided by GBIF, but built-in functions for getting user-generated biodiversity data into this format are being worked out.

Package installation

Now let us install the bdvis package. First we need the devtools package, which lets us install packages from GitHub (rather than CRAN).

install.packages("devtools")
require(devtools)

# Recent devtools versions take the repository as "username/repo"
install_github("vijaybarve/bdvis")
require(bdvis)

If this produces something like

Loading required package: bdvis

Attaching package: ‘bdvis’

The following object(s) are masked from ‘package:base’:

summary

we are on the right track. Our package is installed and loaded into R.

Package functions

1. summary

Let us start playing with the functions now. We have the data loaded in the inat data frame.

bdvis::summary(inat)

This should produce something like:

Total no of records = 66581
Date range of the records from  1710-02-26  to  2012-12-31
Bounding box of records  -77.89309 , -177.37895  -  78.53431 , 179.2615
Taxonomic summary...
No of Families :  1394
No of Genus :  5089
No of Species :  11299

What does this tell us about our data?

  • We have 66581 records in the data set.
  • The date range is from 1710 to 2012. (Do we really have a record from 1710? Looks like we have a problem there.)
  • The bounding box covers almost the whole world. Yes, this is a global data set.
  • The data set represents 1394 families, 5089 genera and 11299 species.
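
One way to chase down the suspicious 1710 record is to list everything older than some cutoff. Here is a small sketch on toy data; the column names id and obs_date and the 1900 cutoff are my assumptions for illustration, not the actual GBIF column headers or anything bdvis does.

```r
# Toy data; obs_date is a hypothetical column name, not the GBIF header.
toy <- data.frame(
  id = 1:4,
  obs_date = as.Date(c("1710-02-26", "1999-07-14", "2011-05-02", "2012-12-31"))
)

# Arbitrary threshold chosen for this sketch.
cutoff <- as.Date("1900-01-01")

# Keep only the records with a known date earlier than the cutoff.
suspect <- toy[!is.na(toy$obs_date) & toy$obs_date < cutoff, ]
print(suspect)
```

On the real data the same filter would need the date column of the GBIF file, converted with as.Date() first.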

I have two questions here:

  1. What more would you like to get in the summary?
  2. Should I rename the function summary to something else, so that it does not clash with the usual data frame summary function?
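
On the second question, one option that avoids both the rename and the masking is an S3 method on a dedicated class, as suggested in the comments on this post. The following is only a sketch of that idea: the class name biodiv comes from that suggestion, while the helper as_biodiv and the species column are hypothetical and this is not how bdvis is currently written.

```r
# Hypothetical constructor: tag a data frame with an extra class.
as_biodiv <- function(df) {
  stopifnot(is.data.frame(df))
  class(df) <- c("biodiv", class(df))
  df
}

# S3 method: the generic summary() dispatches here for "biodiv" objects,
# so base::summary does not need to be overridden.
summary.biodiv <- function(object, ...) {
  out <- list(
    records = nrow(object),
    species = length(unique(object$species))  # hypothetical column name
  )
  cat("Total no of records =", out$records, "\n")
  cat("No of Species : ", out$species, "\n")
  invisible(out)
}

toy <- as_biodiv(data.frame(species = c("a", "b", "a")))
res <- summary(toy)   # uses summary.biodiv
```

Plain data frames keep the usual summary, because dispatch only reaches summary.biodiv for objects that carry the biodiv class.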

2. mapgrid

Now let us generate a heat map of the records in this data set. This map will show us the density of records in different parts of the world. To generate this map:

mapgrid(inat,ptype="species")

mapgrid output for iNaturalist data

ptype could be records if we need the map of raw records rather than records aggregated to species. Again, the questions:

  • What more options would you like to see here?
  • The ability to zoom in on a certain region?
  • Control over the color palette?
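
To make the difference between counting records and counting species per grid cell concrete, here is a rough sketch on toy data. It is not the mapgrid code: the one-degree cell size and the column names lat, lon and species are assumptions made only for this example.

```r
# Toy records; lat, lon and species are hypothetical column names.
toy <- data.frame(
  lat     = c(10.2, 10.7, 10.9, -33.4),
  lon     = c(76.1, 76.8, 77.3, 151.2),
  species = c("A", "A", "B", "C")
)

cell <- 1  # cell size in degrees, an assumption for this sketch

# Snap each record to the south-west corner of its grid cell.
toy$cell_lat <- floor(toy$lat / cell) * cell
toy$cell_lon <- floor(toy$lon / cell) * cell

# Raw record count per cell.
records_per_cell <- aggregate(species ~ cell_lat + cell_lon,
                              data = toy, FUN = length)

# Number of distinct species per cell.
species_per_cell <- aggregate(species ~ cell_lat + cell_lon, data = toy,
                              FUN = function(s) length(unique(s)))
```

In the toy data the cell at latitude 10, longitude 76 holds two records of a single species, so the two counts differ there.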

3. tempolar

Coming now to temporal visualizations, the function tempolar makes polar plots of temporal data at daily, weekly and monthly scales. The code and sample plots are as follows:

tempolar(inat, color = "green", title = "iNaturalist daily",
         plottype = "r", timescale = "d")
tempolar(inat, color = "blue", title = "iNaturalist weekly",
         plottype = "p", timescale = "w")
tempolar(inat, color = "red", title = "iNaturalist monthly",
         plottype = "r", timescale = "m")

Daily plot of temporal data. Each line represents the records on one day of the year.

Weekly plot of temporal data. The polygon plot type is used here.

Monthly plot of temporal data. Each line represents the records in one month.

The function provides options to control color, title, plottype and, of course, timescale.
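
For readers wondering what the three time scales mean in terms of the data, here is a small base R sketch of how dates can be binned by day of year, week and month. It runs on made-up dates and is not the tempolar implementation; in particular, the week definition used here (consecutive blocks of seven days from 1 January) is an assumption.

```r
# Made-up observation dates for illustration.
dates <- as.Date(c("2012-01-01", "2012-01-15", "2012-02-03", "2011-02-20"))

# Day of the year (1 to 366), pooled across years.
day_of_year <- as.integer(format(dates, "%j"))

# Week of the year as seven-day blocks starting on 1 January (an assumption).
week <- (day_of_year - 1L) %/% 7L + 1L

# Calendar month (1 to 12).
month <- as.integer(format(dates, "%m"))

# Number of records falling in each month.
month_counts <- table(month)
print(month_counts)
```

Counts like these, one value per day, week or month, are the kind of series a polar plot would draw around the circle.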

We are less than halfway through our original proposal and will continue to actively build this package. As I build more functionality, I will post more information on the blog. Until then, keep the feedback flowing and tell us what more you would like to see in this package.

2 Responses to “bdvis development version available for early feedback”

  1. Tal Galili July 31, 2013 at 4:46 pm #

    Hi there,
    I think your package is quite lovely.
    One note regarding “summary”: since it is an S3 generic, it would be better to start by defining a new class (say, biodiv) and to write methods for that class, such as summary.biodiv.
    This is quite simple to understand and do, but if you are not experienced with it, go and google for S3 methods in R and read a bit.

    There is NO REASON in the world for you to override “summary” – using S3 methods is the way to go (and would also make your chances to get on CRAN better).

    With regards,
    Tal

    • vijaybarve August 1, 2013 at 3:05 am #

      Thanks Tal, I am glad you liked the package.

      I agree with you, I will make the necessary changes.

      Regards,

      Vijay
