{"id":178,"date":"2015-07-30T21:30:51","date_gmt":"2015-07-30T21:30:51","guid":{"rendered":"http:\/\/rinscience.com\/?p=178"},"modified":"2017-10-30T15:38:17","modified_gmt":"2017-10-30T19:38:17","slug":"how-to-create-data-frames-in-r","status":"publish","type":"post","link":"https:\/\/datascienceplus.com\/how-to-create-data-frames-in-r\/","title":{"rendered":"How to create Vectors, Factors, Lists, Matrices and Datasets with R Programming"},"content":{"rendered":"<p>In this post, we will show how to create vectors, factors, lists, matrices and datasets in R.<\/p>\n<h2>Vectors<\/h2>\n<p>The vector is a fundamental data structure in R programming. Through vectors, we create matrices and data frames.<br \/>\nVectors can have numeric, character and logical values. The function <code>c()<\/code> is used to create vectors in R programming.<\/p>\n<p>For example, let's create a numeric vector:<\/p>\n<pre>\r\n# numeric\r\nx <- c(1, 3, 2, 5.2, -4, 5, 12)\r\nx\r\n<em>1  3  2  5.2 -4  5 12<\/em>\r\n<\/pre>\n<p>We can also create a character vector:<\/p>\n<pre>\r\n# character\r\ny <- c(\"red\", \"blue\", \"green\", \"no color\")\r\ny\r\n<em>\"red\" \"blue\" \"green\" \"no color\"<\/em>\r\n<\/pre>\n<p>Finally, we can create a logical vector:<\/p>\n<pre>\r\n# logical\r\nz <- c(TRUE, TRUE, FALSE)\r\nz\r\n<em>TRUE TRUE FALSE<\/em>\r\n<\/pre>\n<p>Additionally, you can combine numeric and character values in one vector; R then converts every element to character. 
We can also check whether a vector is numeric or character.<\/p>\n<pre>\r\n# numeric and character\r\nx <- c(1, 2.2, \"blue\")\r\nx\r\n# check if it is numeric\r\nis.numeric(x)\r\n# check if it is character\r\nis.character(x)\r\n<em>\"1\" \"2.2\" \"blue\"\r\nFALSE\r\nTRUE<\/em><\/pre>\n<p>Sometimes we want to know how many elements a vector has, in other words, the length of the vector.<\/p>\n<pre>\r\nx <- c(1,2,6,4,7)\r\nlength(x)\r\n<em>5<\/em>\r\n<\/pre>\n<p>A simple way to generate a vector whose values form an arithmetic progression is to use the <code>seq()<\/code> function.<\/p>\n<pre>\r\nx <- seq(from=2, to=10, by=2)\r\nx\r\n<em>2 4 6 8 10<\/em>\r\n<\/pre>\n<h2>Factors<\/h2>\n<p>Factors are similar to vectors in R, but they represent categorical data: every value belongs to one of a fixed set of levels. Categorical variables are widely used in medical research. For example, smoking status can have 3 levels: never smoker, former smoker, and current smoker. When we code smoking status, we can write 0, 1, 2 for never, former and current smoker, respectively. To create a factor, the function <code>factor()<\/code> is used.<\/p>\n<p>Create a vector with 6 elements:<\/p>\n<pre>\r\ns <- c(0, 1, 2, 1, 0, 0)\r\ns\r\n<em>0 1 2 1 0 0<\/em><\/pre>\n<p>To turn this vector into a factor, use the function <code>factor()<\/code>:<\/p>\n<pre>\r\nsf <- factor(s)\r\nsf\r\n<em>0 1 2 1 0 0\r\nLevels: 0 1 2<\/em><\/pre>\n<p>When you conduct your analysis, make sure that you have coded your factors correctly.<\/p>\n<h2>Lists<\/h2>\n<p>Lists are vectors, but unlike ordinary vectors, lists can combine different types of objects. For example, let's suppose that we want to create a list that holds a medical record. A medical record contains the diagnosis, age, and treatment of a patient.<\/p>\n<p>The function to create lists is <code>list()<\/code>. 
<\/p>\n<pre>\r\nx <- list(diagnosis=\"Gastritis\", age=79, medication=TRUE)\r\nx\r\n<em>$diagnosis\r\n\"Gastritis\"\r\n$age\r\n79\r\n$medication\r\nTRUE<\/em><\/pre>\n<p>Now that you have created a list, let's see how we can work with it. You may want to access an individual element of the list.<\/p>\n<pre>\r\nx$age\r\nx$medication\r\n<em>79\r\nTRUE<\/em><\/pre>\n<p>Sometimes you may want to know the size of the list; for this, use the function <code>length()<\/code>.<\/p>\n<pre>\r\nlength(x)\r\n<em>3<\/em><\/pre>\n<h2>Matrices<\/h2>\n<p>Matrices are vectors with two dimensions, so they have rows and columns. To define the number of rows and columns, use the arguments <code>nrow<\/code> and <code>ncol<\/code> of the <code>matrix()<\/code> function, respectively. Like vectors, matrices can hold numeric, character, or logical values.<\/p>\n<pre>\r\n# create matrix with 6 elements\r\ny <- matrix(1:6, nrow=3, ncol=2)\r\ny\r\n<em>     [,1] [,2]\r\n[1,]    1    4\r\n[2,]    2    5\r\n[3,]    3    6<\/em>\r\n<\/pre>\n<p>Or, more simply, you can give only one dimension and let R work out the other.<\/p>\n<pre>\r\n# create matrix with 10 elements\r\ny <- matrix(1:10, nrow=2)\r\n# the number of rows is 2, so the number of columns will be 5\r\ny\r\n<em>     [,1] [,2] [,3] [,4] [,5]\r\n[1,]    1    3    5    7    9\r\n[2,]    2    4    6    8   10<\/em><\/pre>\n<p>Another way of creating matrices is to use the column-binding function <code>cbind()<\/code> or the row-binding function <code>rbind()<\/code>.<\/p>\n<pre>\r\n# create vectors\r\nx <- 2:5\r\ny <- 9:12\r\n# bind by rows\r\nrbind(x,y)\r\n# bind by columns\r\ncbind(x,y)\r\n<em>  [,1] [,2] [,3] [,4]\r\nx    2    3    4    5\r\ny    9   10   11   12\r\n\r\n     x  y\r\n[1,] 2  9\r\n[2,] 3 10\r\n[3,] 4 11\r\n[4,] 5 12<\/em><\/pre>\n<p>You can also create a matrix by defining the vector of values and the names of the columns and rows.<\/p>\n<pre>\r\n# create matrix with 4 elements\r\ncells <- c(2,5,12,30)\r\ncolname <- c(\"Jan\", \"Feb\")\r\nrowname <- c(\"Apple\", \"Orange\")\r\ny <- matrix(cells, 
nrow=2, ncol=2, byrow=TRUE, dimnames=list(rowname, colname))\r\ny\r\n<em>       Jan Feb\r\nApple    2   5\r\nOrange  12  30<\/em><\/pre>\n<p>As you see above, the argument <code>byrow=TRUE<\/code> fills the matrix row by row; set it to <code>FALSE<\/code> (the default) to fill the matrix column by column.<\/p>\n<h2>Datasets<\/h2>\n<p>Datasets (data frames) are similar to matrices, but unlike a matrix, a data frame can contain columns of different types. Therefore, a data frame can have one column with numbers and another column with character values. The function used to create data frames is <code>data.frame()<\/code>.<\/p>\n<p>Let&#8217;s create a simple dataset.<\/p>\n<pre>\r\nhospital <- c(\"New York\", \"California\")\r\npatients <- c(150, 350)\r\ndf <- data.frame(hospital, patients)\r\ndf\r\n<em>hospital   patients\r\nNew York        150\r\nCalifornia      350<\/em>\r\n<\/pre>\n<p>Frequently we are interested in the structure of the dataset we use, and for this we use the function <code>str()<\/code>:<\/p>\n<pre>\r\nstr(df)\r\n<em>'data.frame':\t2 obs. of  2 variables:\r\n $ hospital: Factor w\/ 2 levels \"California\",\"New York\": 2 1\r\n $ patients: num  150 350<\/em><\/pre>\n<p>The output above comes from a version of R earlier than 4.0.0. Since R 4.0.0, <code>data.frame()<\/code> no longer converts character columns to factors by default, so <code>hospital<\/code> is reported as <code>chr<\/code> instead.<\/p>\n<p>Here we end this post. Post a comment if you have any questions.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this post, we will show how to create vectors, factors, lists, matrices and datasets in R. Vectors The vector is a fundamental data structure in R programming. Through vectors, we create matrices and data frames. Vectors can have numeric, character and logical values. The function c() is used to create vectors in R programming. 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":181,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6],"tags":[14,232],"class_list":["post-178","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-introduction","tag-data-frames","tag-rstats"],"views":80850,"_links":{"self":[{"href":"https:\/\/datascienceplus.com\/wp-json\/wp\/v2\/posts\/178","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datascienceplus.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datascienceplus.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datascienceplus.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/datascienceplus.com\/wp-json\/wp\/v2\/comments?post=178"}],"version-history":[{"count":0,"href":"https:\/\/datascienceplus.com\/wp-json\/wp\/v2\/posts\/178\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/datascienceplus.com\/wp-json\/wp\/v2\/media\/181"}],"wp:attachment":[{"href":"https:\/\/datascienceplus.com\/wp-json\/wp\/v2\/media?parent=178"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datascienceplus.com\/wp-json\/wp\/v2\/categories?post=178"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/datascienceplus.com\/wp-json\/wp\/v2\/tags?post=178"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}