Everyone of us can face with a situation, when we need to handle with some statistics. Me too, I was going to make a short review on how to build active charts (diagrams), and in the long run I built quite actual statistical graphs, which initially are presented as CSV files. I found this statistics here, where I chose a couple of interesting tables: Marital Status and Educational Attainment. I developed the script that is able to select data from one or even multiple tables (of a CSV file), drawing various graphs for analyzing. In today’s article I will show you an interesting method of parsing CSV files, and displays the results in graphs using Highcharts.
Ok, let’s download the source files and start coding !
Step 1. HTML
In the most beginning we have to include all necessary styles scripts in the header section:
2 | <link href="css/main.css" rel="stylesheet" type="text/css" /> |
6 | <script src="js/highcharts.js"></script> |
7 | <script src="js/skies.js"></script> |
8 | <script src="main.js"></script> |
We included jQuery and highcharts libraries, one small CSS file to stilize our page, and one main.js file with our own custom code. Rest of the code can looks like this:
03 | <button class="switcher" id="2">Get stat #2 (Both)</button> |
04 | <button class="switcher" id="19">Get stat #19 (Males)</button> |
05 | <button class="switcher" id="36">Get stat #36 (Females)</button> |
06 | <button class="switcher" id="multiple">Range</button> |
10 | <div id="container" class="chart"></div> |
There are just several buttons (to evoke different events) and the container for chart object.
Step 2. CSS
css/main.css
All we need is to add few styles for our buttons:
06 | background: none repeat scroll 0 0 #E3E3E3; |
07 | border: 1px solid #BBBBBB; |
08 | border-radius: 3px 3px 3px 3px; |
09 | box-shadow: 0 0 1px 1px #F6F6F6 inset; |
15 | text-shadow: 0 1px 0 #FFFFFF; |
19 | background: none repeat scroll 0 0 #D9D9D9; |
20 | box-shadow: 0 0 1px 1px #EAEAEA inset; |
25 | background: none repeat scroll 0 0 #D0D0D0; |
26 | box-shadow: 0 0 1px 1px #E3E3E3 inset; |
Step 3. JS
Now, we can overview the final JS code and I will explain how it works:
main.js
002 | $(document).ready(function() { |
005 | var chart = new Highcharts.Chart({ |
007 | renderTo: 'container', |
012 | text: 'Marital Status (United States: 2011)', |
015 | text: 'By Script Tutorials' |
029 | $('.switcher').click(function () { |
030 | var id = $(this).attr('id'); |
033 | chart.showLoading('Getting stat data ....'); |
035 | if (id != 'multiple') { |
038 | $.getJSON('data.php?id=' + id, function(aData) { |
041 | while (chart.series.length) { |
042 | chart.series[0].remove(); |
047 | $.each(aData.categories, function(idx, res) { |
048 | categories.push(res); |
050 | chart.xAxis[0].setCategories(categories); |
051 | chart.yAxis[0].axisTitle.attr({ |
052 | text: 'Amount of ' + aData.name + 's (thousands)' |
056 | var seriesValData = []; |
057 | $.each(aData.values, function(idx, res) { |
058 | seriesValData.push([res.name, parseFloat(res.val)]); |
068 | var seriesPerData = []; |
069 | $.each(aData.percentages, function(idx, res) { |
070 | seriesPerData.push([res.name, parseFloat(res.val)]); |
073 | var seriesPercentages = { |
074 | name: aData.name + ' (%)', |
078 | center: ['60%', '30%'], |
084 | chart.addSeries(seriesValues, false); |
085 | chart.addSeries(seriesPercentages, false); |
091 | $.getJSON('data2.php', function(aData) { |
094 | while (chart.series.length) { |
095 | chart.series[0].remove(); |
100 | $.each(aData.categories, function(idx, res) { |
101 | categories.push(res); |
103 | chart.xAxis[0].setCategories(categories); |
104 | chart.yAxis[0].axisTitle.attr({ |
111 | $.each(aData.series, function(idx, ser) { |
114 | var seriesValData = []; |
115 | $.each(ser.values, function(idx, res) { |
116 | seriesValData.push([res.name, parseFloat(res.val)]); |
125 | chart.addSeries(seriesValues, false); |
In the beginning (on document ready) our script prepares an empty Highcharts object (where we can put initial params like various titles and height. After, I added onclick event handler for our buttons. First three buttons are to generate sumple charts (using only one table from CSV file). The last one button is to obtain range (which is predefined by me). This range is a range between tables 3 and 17. I will explain the structure of tables soon (in next step). Well, when we click to the one of buttons, we send a request ($.getJSON) to data.php file, this file parses CSV file and returns a JSON response to us. And now we can: remove all existing series of our Chart object (series – means some data here), then we parse all categories from server response and add own custom categories for X axis, then we parse all stat data in order to prepare a new object (of series) for Chart. In case if we request for a single table, our script (data.php) returns categories, values (amounts of people) and percentages. For values, I use ‘column’ chart, for percentages – ‘pie’. In case when we request for a whole range, we age going to use only percentages (for every used table).
Now, I think that I have to explain a bit about structure of provided CSV files. I downloaded two files (2011gender_table10.csv and 2011gender_table11.csv) from census.gov website. I’m pretty sure that this is official statistics of US. Every file, consists of some comments (I noticed, that first 4 rows as usual – comments, which explain content of this file) and various ‘tables’ with necessary content. Few weeks ago I located the very good PHP parser for CSV files (http://code.google.com/p/parsecsv-for-php/). So, I decided to use it for today’s demos. And, the result surpassed all expectations! This library was able to skip all these comments in the beginning of CSV file (using ‘offset’ param). In the result, I got a set of various tables (after it parses CSV files). In case of selected CSV files, I noticed, that few two tables contain set of ‘categories’ (like Married, Widowed, Divorced, Separated and Never married) and types of information for next tables. All other tables contain actual statistics.
As example, in ‘Marital status’ table (2011gender_table10.csv) you can find information about Marital Status of the Population 15 Years and Over by Sex and Age (US, 2011).
In ‘Educational Attainment’ table (2011gender_table11.csv) you can find information about Educational Attainment of the Population 15 Years and Over by Sex and Age (US, 2011).
Step 4. PHP
Now, the final step – where I will tell you how to make a server-side script to prepare data for our JS code:
data.php
01 | function splitByPairs($array) { |
04 | $both = array(&$even, &$odd); |
05 | array_walk($array, function($v, $k) use ($both) { $both[$k % 2][] = $v; }); |
06 | return array($odd, $even); |
09 | require_once('parsecsv.lib.php'); |
12 | $sFilename = (isset($_GET['extra'])) ? '2011gender_table11.csv' : '2011gender_table10.csv'; |
15 | $csv = new parseCSV($sFilename, 4); |
19 | foreach ($csv->data[0] as $s) { |
26 | $i = (int)$_GET['id']; |
29 | $sTitle = $csv->data[$i]['Sex and age']; |
30 | $aDataSlc = array_slice($csv->data[$i], 3); |
32 | foreach ($aDataSlc as $s) { |
37 | list($aPerc, $aVals) = splitByPairs($aData); |
41 | $aValRows = $aPercRows = array(); |
42 | foreach ($aCats as $s) { |
43 | $fValue = str_replace(',', '.', $aVals[$i]); |
44 | $fValue = ((float)$fValue) ? (float)$fValue : 0; |
45 | $aValRows[] = array('name' => $s, 'val' => $fValue); |
47 | $fPercent = str_replace(',', '.', $aPerc[$i]); |
48 | $fPercent = ((float)$fPercent) ? (float)$fPercent : 0; |
49 | $aPercRows[] = array('name' => $s, 'val' => $fPercent); |
56 | $aJson['name'] = trim($sTitle); |
57 | $aJson['categories'] = $aCats; |
58 | $aJson['values'] = $aValRows; |
59 | $aJson['percentages'] = $aPercRows; |
60 | echo json_encode($aJson); |
In the beginning of this script, you can see that we use ‘parsecsv.lib.php’ library to parse the one of CSV files. Then we walk (foreach) through first table in order to obtain categories (of statistics). After, we extract a requested table (by ID) and gather all it’s fields (except for the first three fields, They contain title of table, total amount and total percentage). Then, we have to split result array with data to odd and even values (all because there are mix of Valies and Percentages). I made the special function to separate arrays: splitByPairs. Finally, we can prepare final separated arrays with values (amounts of people) and percentages for our JS script.
Now please pay attention that I use another one ‘data2.php’ file to prepare data for range of tables, here it is:
data2.php
01 | function splitByPairs($array) { |
04 | $both = array(&$even, &$odd); |
05 | array_walk($array, function($v, $k) use ($both) { $both[$k % 2][] = $v; }); |
06 | return array($odd, $even); |
09 | require_once('parsecsv.lib.php'); |
12 | $sFilename = (isset($_GET['extra'])) ? '2011gender_table11.csv' : '2011gender_table10.csv'; |
15 | $csv = new parseCSV($sFilename, 4); |
19 | foreach ($csv->data[0] as $s) { |
30 | for ($x = $iStart; $x <= $iEnd; $x++) { |
33 | $sTitle = $csv->data[$x]['Sex and age']; |
34 | $aDataSlc = array_slice($csv->data[$x], 3); |
36 | foreach ($aDataSlc as $s) { |
41 | list($aPerc, $aVals) = splitByPairs($aData); |
46 | foreach ($aCats as $s) { |
47 | $fPercent = str_replace(',', '.', $aPerc[$i]); |
48 | $fPercent = ((float)$fPercent) ? (float)$fPercent : 0; |
49 | $aPercRows[] = array('name' => $s, 'val' => $fPercent); |
55 | 'name' => trim($sTitle), |
56 | 'values' => $aPercRows |
62 | $aJson['categories'] = $aCats; |
63 | $aJson['series'] = $aSeries; |
64 | echo json_encode($aJson); |
This is similar script as our first ‘data.php’. But, we don’t extract data from only one table, but we use a range of tables (from 3 to 17 tables). And, we don’t use values (amounts), but only Percentages. I think that it is enough for our demo.
Conclusion
That is all for today. We have just created the really interesting and actual charts (with Highcharts) of real marital and educational statistics of 2011 (for US). I’m sure that this material will be very useful for you. Good luck and welcome back