The time has come: setting up a DB with MySQL and R

The recent decision by Yahoo to screw with their API for financial data (and in the process disabling all packages/functions in various programing languages obtaining EOD (end-of-day) data, at least temporarily) shows us two important things:

  1. Nothing is free and reliable forever
  2. It’s a good idea to have a database set up

So for everyone in the “why do i need a DB/I don’t know how to set it up/I’m too lazy” camp, this post will be a quick tutorial (you should be up and running in 30mins max) on how to set up a MySQL Database under Windows for storing price data and interfacing it with R.

Why only Windows and R you ask? That’s why.

So let’s get started. If you are proficient with MySQL and already have a database set up, click here to get to the R part directly.

First, let’s get the MySQL installer from here. Once downloaded, execute the installer and you will be prompted with a screen like this:

post4_1

I would recommend going for the Developer Default installation, it might have way more features than you will ever need but you never know what you want to do in the future (quickly trying some tutorial you saw on a blog for example). For this tutorial, however, it is enough to install the MySQL Server and the MySQL Workbench.

You can add both when selecting the Custom installation option. After selecting what you want to install, hit next.

post4_2

You will see an overview of what you have selected, and pressing execute will start the installation.

post4_3

Once everythings installed, we need to configure our server.

post4_4

post4_5

post4_6

On the next screen you can set the password for the Root Account (superuser) as well as add users that will be able to access your server.

post4_7

Now we just need to give our server a name and we are good to go!

post4_8

post4_9

After finishing the installation, you can start your MySQL Workbench. You will see the following screen:

post4_10

We will now set up a connection to the server. Click on the + sign next to MySQL Connections. Here you need to specify the username and password you selected before. Clicking Test Connection will show you if everything works as intended.

post4_11

After pressing Ok, you have successfully connected!

post4_12

Now we will create our price database. On the left hand side under schemas (the pricedb will not be there yet for you), right-click and go Create Schema, enter a name you like (in this case pricedb) and click Apply.

Now open the dropdown under your new schema, right-click on Tables and go Create Table. You will now see a menu where you can create a new table, set up all the columns and data types and after that, we’re good to go.

The way I set it up for this example is the following:

post4_13

It is the classic OHLCV layout we know from Yahoo Finance (RIP) using Date and Ticker as primary keys (unique identifiers, requires them to be NN, not null) and also requiring the AdjustedClose column to be non-empty. Once you’re done, click Apply. You can now try query for your newly created table (for some basic MySQL commands see here or a quick google search).

post4_14

Unfortunately, our table is still empty.

Let’s move over to R and start filling it!

The package we will be using to interface with our database is RMySQL. It’s simple to use, fast and we can easily write wrappers around the main functions to use our price table.

Firstly, we need to connect to the database. We will write a wrapper around the RMySQL::dbConnect function whose main purpose it to hold the default values for connecting.

require(xts)
require(RMySQL)
dbCon <- function(db='pricedb',host='localhost',user='username',pw='pw'){
    db <- dbConnect(MySQL(), user=user, password=pw, dbname=db, host=host)
    return(db)
}

This function will return a connection object to the database that we will use for all other functions.

Next, assuming you have access to some timeseries that you want to write (to my knowledge, Joshua Ulrich already updated quantmod and it should be working for the time being), we need functionality to write it to our table.

dbWritePrices <- function(data,connection){
    ticker <- deparse(substitute(data))
    date <- index(data)
    res <- cbind.data.frame(rep(ticker,nrow(data)),date,as.matrix(data))
    colnames(res) <- c("Ticker","Date","Open","High","Low","Close","AdjustedClose","Volume")
    dbWriteTable(connection, value = res, name = "prices", row.names=FALSE, overwrite=TRUE)
}

The input for this function is a standard xts table like this:

image

as well as a connection object created by dbCon. It will format and arrange the data according to the table layout in the database and then write to it. If we now go back to the workbench and repeat our query we will actually see some data there!

post4_15

So now only the last part of the cycle is missing, retrieving data from the database and move it to R.

dbGetPrices <- function(ticker,connection){
    data <- dbSendQuery(connection,paste("SELECT * FROM pricedb.prices WHERE Ticker='",ticker,"'",sep=""))
    data <- fetch(data,n=-1)
    data <- xts(data[,c("Open","High","Low","Close","AdjustedClose","Volume")],as.Date(data[,"Date"]))
    assign(ticker,data,envir=.GlobalEnv)
}

This function allows us to query the database for a ticker, retrieve it and assign it to the global environment.

So there you have it, a quick way to store and fetch price data in a database using R.

As always, if you have any questions, comments or critique, please get in contact with me either through the comment section or drop me a PM.

Until next time,

QUANTBEAR

One thought on “The time has come: setting up a DB with MySQL and R

  1. Pingback: Quantocracy's Daily Wrap for 05/22/2017 | Quantocracy

Leave a comment