Archive
Software Carpentry
“Software Carpentry is a community of volunteer instructors who teach short workshops and develop lessons which empower researchers of all disciplines to learn about and improve the ways in which they create software and collaborate.” (source)
I found them today: https://github.com/swcarpentry/swcarpentry . Looks good!
[mysql] rows are deleted but the database still has the same size
Problem
Our cache grew too big, so we wanted to remove rows that are older than X days. However, after removing the majority of the rows, the database still had the same size.
Solution
I found the solution here. Deleting rows only marks their space as reusable inside the table; it does not shrink the files on disk. To reclaim the space, run “OPTIMIZE TABLE <tablename>;”. You can also do this in the graphical interface of phpMyAdmin (“tick the checkbox next to the table name you want to decrease, and in the ‘With selected’ drop-down under the list of tables, choose ‘Optimize’”).
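The whole cleanup (delete the old rows, then optimize the table) can be sketched as two SQL statements built in Python. The table name `cache`, the column name `created_at`, and the helper `cleanup_statements` are hypothetical and only serve as illustration; the resulting statements would be run through whatever MySQL client you use.

```python
import re

# Identifiers are interpolated into the SQL text, so restrict them to a safe pattern.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def cleanup_statements(table, date_column, max_age_days):
    """Return the SQL to delete old rows and then reclaim the freed space.

    `table` and `date_column` are hypothetical names supplied by the caller.
    """
    for name in (table, date_column):
        if not _IDENTIFIER.match(name):
            raise ValueError("unsafe identifier: %r" % name)
    days = int(max_age_days)
    return [
        # Step 1: remove rows older than the given number of days.
        "DELETE FROM `%s` WHERE `%s` < NOW() - INTERVAL %d DAY;"
        % (table, date_column, days),
        # Step 2: rebuild the table so that the freed space can be reclaimed.
        "OPTIMIZE TABLE `%s`;" % table,
    ]


for statement in cleanup_statements("cache", "created_at", 30):
    print(statement)
```

Building the statements as plain strings keeps the sketch independent of any particular MySQL driver.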
Some notes on MongoDB and PyMongo
“MongoDB is an open source, high-performance, schema-free, document-oriented database, written in C++. It manages collections of BSON documents that can be nested in complex hierarchies and still be easy to query and index, which allows many applications to store data in a natural way that matches their native data types and structures. Development of MongoDB began in October 2007 by 10gen. The first public release was in February 2009.” (source)
Recently I heard a lot about MongoDB so I gave it a try. I like it :) It’s easy to set up, fast, and pretty easy to use. It has bindings to several languages.
Installation #01 (living on the edge)
The version in the Ubuntu repos is usually out of date. If you want to use the latest stable version, install MongoDB from the developers' own repository. In short:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10
# edit /etc/apt/sources.list and add this line:
deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen
sudo apt-get update
sudo apt-get install mongodb-10gen
Installation #02 (from the Ubuntu repos)
I suggest installing MongoDB with the previous method. Anyway, here is the classical way:
sudo apt-get install mongodb-server mongodb-clients
Normally the installation starts the server, which has to run in the background if you want to test MongoDB on your local machine (just like MySQL, for instance). If the server didn’t start for some reason, try this:
sudo service mongodb start
If you get an error saying that the server cannot be started, remove the file /var/lib/mongodb/mongod.lock.
When you use SQLite, you have a database.sqlite file that stores your database. The database files of MongoDB are situated in the directory /var/lib/mongodb (by default). The log file of the server is at /var/log/mongodb/mongodb.log. These settings can be configured in /etc/mongodb.conf.
Python client
My language of choice is Python, so let’s see how to use MongoDB from a Python script. First, you need to install the PyMongo package:
sudo pip install pymongo
Let’s see if it works:
#!/usr/bin/env python
"""
This example is from the book "MongoDB and Python"
by Niall O'Higgins.
"""
import sys
from datetime import datetime
from pymongo import Connection
from pymongo.errors import ConnectionFailure

def main():
    """ Connect to MongoDB """
    try:
        c = Connection(host="localhost", port=27017)
    except ConnectionFailure, e:
        sys.stderr.write("Could not connect to MongoDB: %s" % e)
        sys.exit(1)
    # Get a Database handle to a database named "mydb"
    dbh = c["mydb"]
    user_doc = {
        "username" : "janedoe",
        "firstname" : "Jane",
        "surname" : "Doe",
        "dateofbirth" : datetime(1974, 4, 12),
        "email" : "janedoe74@example.com",
        "score" : 0
    }
    dbh.users.insert(user_doc, safe=True)
    print "Successfully inserted document: %s" % user_doc

#############################################################################

if __name__ == "__main__":
    main()
For the connection we use the default values. Using SQL terminology, here is what happens: “mydb” is the name of the database that we access via the handle dbh. user_doc is a row (a document in MongoDB terms), and dbh.users.insert(user_doc, safe=True) means that inside the database (“mydb”), we insert the row user_doc in the “users” table (a collection in MongoDB terms). It is recommended to use safe=True for write operations (insert, update, remove, and findAndModify), otherwise MongoDB doesn’t check for errors :(
As you can see, you have to create neither the database “mydb” nor the table “users”. When you insert something into them, MongoDB creates them if they don’t exist.
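The listing above targets Python 2 and the PyMongo releases of that time. PyMongo 3 and later no longer provide the `Connection` class or the `safe` argument; `MongoClient` and `insert_one` take their place, and writes are acknowledged by default. Here is a rough equivalent as a sketch, assuming PyMongo 3 or later and a server on the default port. The helper names `build_user_doc` and `insert_user` and the two-second timeout are my own choices, not something from the book.

```python
from datetime import datetime


def build_user_doc():
    """Return the same sample document as in the listing above."""
    return {
        "username": "janedoe",
        "firstname": "Jane",
        "surname": "Doe",
        "dateofbirth": datetime(1974, 4, 12),
        "email": "janedoe74@example.com",
        "score": 0,
    }


def insert_user(user_doc, host="localhost", port=27017):
    """Insert user_doc into mydb.users and return the new document's _id.

    Assumes PyMongo 3 or later and a MongoDB server listening on host:port.
    """
    # Imported here so the module can be loaded even without PyMongo installed.
    from pymongo import MongoClient
    from pymongo.errors import ConnectionFailure

    # MongoClient connects lazily; the timeout value is an arbitrary choice.
    client = MongoClient(host=host, port=port, serverSelectionTimeoutMS=2000)
    try:
        # Force a round trip so that an unreachable server is reported here.
        client.admin.command("ping")
    except ConnectionFailure as e:
        raise SystemExit("Could not connect to MongoDB: %s" % e)

    # Writes are acknowledged by default, so there is no safe=True argument.
    result = client["mydb"].users.insert_one(user_doc)
    return result.inserted_id
```

With a server running locally, `insert_user(build_user_doc())` should return the `_id` that MongoDB assigned to the new document.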
Troubleshooting (20130414)
After installing “pymongo”, I couldn’t import it. As it turned out it conflicted with the “bson” package. Solution:
sudo pip uninstall pymongo
sudo pip uninstall bson
sudo apt-get remove python-bson
sudo apt-get remove python-gridfs
sudo pip install pymongo -U
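To see where such a conflict comes from, it can help to check which files Python would actually load. PyMongo ships its own `bson` and `gridfs` packages, so if these resolve to a different location than `pymongo` itself, a separately installed copy is probably in the way. A small diagnostic sketch, assuming Python 3 (the helper name `module_origin` is made up):

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# Print the location of each package without importing it.
for name in ("pymongo", "bson", "gridfs"):
    print("%-8s -> %s" % (name, module_origin(name)))
```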
Visualization
Now we have a database, a table, and a row in that table. It’d be nice to visualize the database. There is a very nice PHP-based administration GUI tool called RockMongo. This is what we’ll use. Requirements:
sudo apt-get install php-pear
sudo pecl install mongo
Then put rockmongo in your public_html directory and open it in your browser. It’ll warn you to add a line to your php.ini file (located at /etc/php5/apache2/php.ini). Don’t forget to restart the webserver. For logging in, use “admin” and “admin” as username and password. The usage of rockmongo is completely intuitive. (For setting up PHP on your machine, check out this post.)
Reducing database size
MongoDB is quite aggressive in allocating disk space for databases. To reduce the database size, you can add the following lines to /etc/mongodb.conf:
# disable data file preallocation
noprealloc = true
# use smaller files
smallfiles = true
This is not recommended for production, but for a small project on your local machine it can be useful. Restart the MongoDB server for the changes to take effect. These settings are applied to new databases only.
Troubleshooting
If you have problems starting MongoDB, refer to this link.
Essential links
- MongoDB HQ (10gen, creators of MongoDB)
- PyMongo (Python library for MongoDB)
- RockMongo (PHP-based admin tool)
- JSON Visualization
Books / Docs
- MongoDB and Python (good for starting, 53 pages)
- The Little MongoDB Book (a nice little book, 33 pages)
- more books
- SQL to Mongo Mapping Chart
- MongoDB Quick Reference Cards
- Manual
Further links
- mailing list (e-mail: mongodb-user@googlegroups.com)
- a quick reference on pymongo (blog post; don’t forget the safe=True modifier)
- MongoDB Gotchas
- The MongoDB Collection (mongly.com; there is a nice interactive tutorial here)
- Things I wish I knew about MongoDB a year ago (update 20121021)
Connect to sqlite3 databases and make queries
Problem
You have a binary sqlite3 database file and you want to make some queries on it: find out what tables it has, look at the content of the tables, etc.
Solution
There is a command called “sqlite3” which is a client for sqlite3 databases. Let’s say our database is stored in a file called “database.sqlite”. (Here “.sqlite” is the file extension.)
# open database.sqlite with the client:
sqlite3 database.sqlite
# let's get the list of tables in this database:
.tables
# Say it has a table called "images". Let's see its content:
select * from images;
To learn more about the commands, just type “.help” in the client. You can dump a database in SQL text format, you can get the schema of a table, etc.
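The same inspection can be done from Python with the standard sqlite3 module. The sketch below uses an in-memory database in place of database.sqlite so that it runs on its own; the columns of the “images” table and the helper names are made up for illustration.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables, like the client's .tables command."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def table_schema(conn, table):
    """Return the CREATE statement of a table, like .schema <table>."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row[0] if row else None


# An in-memory database stands in for database.sqlite; the columns are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT)")
conn.execute("INSERT INTO images (path) VALUES ('cat.png')")

print(list_tables(conn))
print(table_schema(conn, "images"))
print(conn.execute("SELECT * FROM images").fetchall())
```

For a real file, replace ":memory:" with the path to the database, for example "database.sqlite".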
Tip
Although .sqlite database files are binary files, you can open them with a text editor too. At the top you can see the schemas of the tables in text format. Just be careful not to modify it.
Create a database from a schema (update 20111120)
sqlite3 database.sqlite < schema.sql
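A Python counterpart of this one-liner, as a sketch: `executescript` runs all statements of a schema in one go. The schema below is a hypothetical stand-in for the contents of schema.sql, inlined so the example is self-contained, and `create_database` is a made-up helper name.

```python
import sqlite3

# Hypothetical contents of schema.sql, inlined to keep the example self-contained.
SCHEMA_SQL = """
CREATE TABLE images (
    id   INTEGER PRIMARY KEY,
    path TEXT NOT NULL
);
CREATE INDEX images_path ON images (path);
"""


def create_database(db_path, schema_sql):
    """Create (or open) the database at db_path and run every statement in schema_sql."""
    conn = sqlite3.connect(db_path)
    conn.executescript(schema_sql)
    conn.commit()
    return conn


conn = create_database(":memory:", SCHEMA_SQL)
names = [row[0] for row in conn.execute("SELECT name FROM sqlite_master ORDER BY name")]
print(names)
```

To work with files instead, pass "database.sqlite" as the path and the text read from schema.sql as the schema.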
