<body>

The Bayesian Conspiracy

The Bayesian Conspiracy is a multinational, interdisciplinary, and shadowy group of scientists. It is rumored that at the upper levels of the Bayesian Conspiracy exist nine silent figures known only as the Bayes Council.

This blog is produced by James Durbin, a Bayesian Initiate.

gcsvsql

I have packaged the ideas from the last post into a single script, gcsvsql, that lets you run SQL queries on CSV files from the command line, as though those CSV files were tables in an existing database such as MySQL.

With gcsvsql, you can do things like:


gcsvsql "select * from people.csv where age > 40"
gcsvsql "select name,score from people.csv where age > 40"
gcsvsql "select name,score from people.csv where age < 50 and score > 100"
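
To make the general idea concrete, here is a minimal Python sketch of running SQL against CSV data by loading it into an in-memory SQLite table. This is not the gcsvsql code itself; the `load_csv` helper, the sample rows, and the choice of SQLite are all assumptions made for illustration.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Load CSV text (header row first) into a new SQLite table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    # NUMERIC affinity lets SQLite store numeric-looking text as numbers,
    # so a comparison such as "age > 40" behaves numerically.
    columns = ", ".join(f'"{name}" NUMERIC' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)


# Hypothetical sample data standing in for people.csv.
PEOPLE_CSV = """name,age,score
Alice,45,120
Bob,32,95
Carol,51,80
"""

conn = sqlite3.connect(":memory:")
load_csv(conn, "people", PEOPLE_CSV)
result = conn.execute("select name, score from people where age > 40").fetchall()
print(result)
```

With the sample rows above, only the people older than 40 come back, each with a name and a score.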

Full path names should work fine:
 
gcsvsql "select * from /users/data/people.csv where age > 40"
gcsvsql "select people.name from /users/data/people.csv where age > 40"
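
Note that in the second query the table is called people even though the file is given with a full path. One plausible way to get that behavior, sketched here in Python, is to replace every `.csv` token in the query with the file's base name and remember which path it came from. The `split_query` function and its regular expression are hypothetical and may not match what gcsvsql actually does.

```python
import re
from pathlib import PurePosixPath

# Matches tokens ending in .csv, with an optional directory part.
CSV_TOKEN = re.compile(r"[\w./-]+\.csv\b")


def split_query(query):
    """Return (rewritten_query, {table_name: csv_path})."""
    tables = {}

    def to_table(match):
        path = match.group(0)
        # "/users/data/people.csv" -> "people"
        table = PurePosixPath(path).stem
        tables[table] = path
        return table

    return CSV_TOKEN.sub(to_table, query), tables


rewritten, tables = split_query(
    "select people.name from /users/data/people.csv where age > 40"
)
print(rewritten)
print(tables)
```

The rewritten query refers to a plain table name, and the mapping says which file would have to be loaded under that name before the query is run.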

You can also run aggregate queries, such as sum and average:

gcsvsql "select sum(score) from people.csv where age < 40"

If children.csv is a file with the same key column (name) as people.csv, then you can run a join query like:
  
gcsvsql "select people.name,children.child from people.csv,children.csv where people.name=children.name"
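
For readers who want to see what such a join returns, here is a small Python sketch that runs the same kind of query over two CSV sources loaded into in-memory SQLite tables. The file contents are invented sample data, and SQLite is only a stand-in; the source does not say which engine gcsvsql uses.

```python
import csv
import io
import sqlite3

# Hypothetical contents for people.csv and children.csv;
# both share a "name" column that serves as the join key.
FILES = {
    "people": "name,age\nAlice,45\nBob,32\n",
    "children": "name,child\nAlice,Dana\nBob,Eli\nBob,Finn\n",
}

conn = sqlite3.connect(":memory:")
for table, text in FILES.items():
    header, *data = csv.reader(io.StringIO(text))
    columns = ", ".join(f'"{col}" NUMERIC' for col in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    marks = ", ".join("?" * len(header))
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)

pairs = conn.execute(
    "select people.name, children.child "
    "from people, children "
    "where people.name = children.name"
).fetchall()
print(sorted(pairs))
```

Each person appears once per matching row in the children data, so a person with two children shows up twice in the result.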

You can also spread the query over multiple lines (the leading > is the shell's continuation prompt, not part of the query):
 
gcsvsql "
> select people.name,children.child
> from people.csv,children.csv
> where people.name=children.name and people.age < 40"

If this sounds interesting or useful to you, you can get more details and download the script at the Google Code project for gcsvsql.
