Archive
[webdev] sortable table
Problem
If you create a table with HTML, it’s static. It would be great if you could sort it by various columns. How to do that?
Solution
Create the static table (like before) and integrate it with a Javascript library that will make it sortable. I found a great solution for this called tablesorter. You can also find it on GitHub, though the first link contains more documentation and examples.
[webdev] add a stylish Javascript image gallery
Problem
I wanted to add a modern Javascript image gallery to a website. I had some thumbnails and I wanted the following: (1) clicking on a thumbnail should display the full image, and (2) the gallery must be browsable in both directions.
Solution
The Lightbox script worked for me like a charm. I’m not a Javascript expert but I could integrate it in 5 minutes. Awesome script!
Check out the project’s home page for a demo.
webapp analyzer
Wappalyzer is a browser extension that identifies software on websites.
It’s very useful for learning about new Javascript frameworks, for instance :)
Codecademy – learn HTML, CSS, Javascript
“Codecademy is the easiest way to learn how to code. It’s interactive, fun, and you can do it with your friends.”
YesScript: disable Javascript on a given site
Problem
Today I visited a site and it blew the following message in my face: “if you want to use this site, disable AdBlock”. What da… ?
Solution
Install YesScript and put the given site on its blacklist. From now on, Javascript is disabled on that site.
Raphaël — JavaScript Library
“Raphaël is a small JavaScript library that should simplify your work with vector graphics on the web. If you want to create your own specific chart or image crop and rotate widget, for example, you can achieve it simply and easily with this library.” (source)
See for instance the color picker demo.
Scraping AJAX web pages (Part 3)
Don’t forget to check out the rest of the series too!
In Part 2 we saw how to download an Ajax-powered webpage. However, there was a problem with that approach: sometimes it terminated too quickly, so it fetched only part of the page. The problem with Ajax is that we cannot tell for sure when a page is completely downloaded.
So, the solution is to integrate some waiting mechanism in the script. That is, we need the following: “open a given page, wait X seconds, then get the HTML source”. Hopefully all Ajax calls will be finished in X seconds. It is you who decides how many seconds to wait. Or, you can analyze the partially downloaded HTML and if something is missing, wait some more.
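The “check, and if something is missing, wait some more” idea can be sketched as a small polling helper with an upper time limit. This is only an illustration: `wait_until`, its parameters, the `FOOTER` marker, and the fake page source in the usage part are hypothetical names, not part of any library.

```python
import time


def wait_until(condition, timeout=60.0, interval=1.0):
    """Call condition() repeatedly until it returns True or the timeout expires.

    Returns True if the condition was met, False if we gave up.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage with a stand-in for a page that fills in gradually (assumption:
# in real use, get_html would return the browser's current HTML source).
chunks = iter(["<html>", "<html>header", "<html>header FOOTER"])
state = {"html": ""}


def get_html():
    # Advance to the next fake snapshot; keep the last one once exhausted.
    state["html"] = next(chunks, state["html"])
    return state["html"]


found = wait_until(lambda: "FOOTER" in get_html(), timeout=5, interval=0.01)
print(found)
```

The timeout is a design choice: without it, a page that never delivers the expected content would keep the script waiting forever.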
Here I will use Splinter for this task. It opens a browser window that you can control from Python. Thanks to the browser, it can interpret Javascript. The only disadvantage is that the browser window is visible.
Example
Let’s see how to fetch the page CP002059.1. If you open it in a browser, you’ll see a status bar at the bottom that indicates the download progress. For me it takes about 20 seconds to fully get this page. By analyzing the content of the page, we can notice that the string “ORIGIN” appears just once, at the end of the page. So we’ll check its presence in a loop and wait until it arrives.
#!/usr/bin/env python3

from time import sleep

from splinter.browser import Browser

url = 'http://www.ncbi.nlm.nih.gov/nuccore/CP002059.1'


def main():
    browser = Browser()    # opens a visible browser window
    browser.visit(url)

    # variation A: poll until the marker at the end of the page shows up
    while 'ORIGIN' not in browser.html:
        sleep(5)

    # variation B:
    # sleep(30)   # if you think everything arrives in 30 seconds

    # save the source in a file
    with open("/tmp/source.html", "w") as f:
        print(browser.html, file=f)

    browser.quit()
    print('__END__')

#############################################################################

if __name__ == "__main__":
    main()
You might be tempted to check for the presence of ‘</html>’. However, don’t forget that the browser first downloads the plain source, from ‘<html><body>…’ to ‘</body></html>’. Then it starts to interpret the source, and if it finds some Ajax calls, it executes them, and these calls expand something in the body of the HTML. So ‘</html>’ is there right from the beginning.
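To make the difference concrete, here is a tiny simulation. The snapshots are made-up stand-ins for what `browser.html` might return at three moments in time, and `first_match` is a hypothetical helper written just for this illustration.

```python
# Hypothetical snapshots of the page source over time: the static skeleton
# arrives first, then Ajax calls fill in the body.
snapshots = [
    "<html><body><div id='content'></div></body></html>",
    "<html><body><div id='content'>LOCUS</div></body></html>",
    "<html><body><div id='content'>LOCUS ORIGIN</div></body></html>",
]


def first_match(marker, pages):
    """Return the index of the first snapshot containing marker, or None."""
    for index, page in enumerate(pages):
        if marker in page:
            return index
    return None


print(first_match("</html>", snapshots))  # 0: present before any Ajax call
print(first_match("ORIGIN", snapshots))   # 2: present only once the content is in
```

The closing tag matches the very first snapshot, so it says nothing about whether the Ajax content has arrived; a marker that lives inside the dynamically loaded part does.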
Future work
This is not bad, but I’m still not fully satisfied. I’d like something like this but without any browser window. If you have a headless solution, let me know. I think it’s possible with PhantomJS and/or Zombie.js, but I haven’t had time to investigate them yet.
D3: A JavaScript visualization library for HTML and SVG
“D3 allows you to bind arbitrary data to a Document Object Model (DOM), and then apply data-driven transformations to the document. As a trivial example, you can use D3 to generate a basic HTML table from an array of numbers. Or, use the same data to create an interactive SVG bar chart with smooth transitions and interaction.” (source)
See also D3 on GitHub.
I haven’t used it yet; this is just a “good-to-know-about-it” note.
Zombie.js, PhantomJS
This is not a real post, just a reminder for me. I should look at these projects in detail in the future.
“Zombie.js is a fast, headless full-stack testing using Node.js. Zombie.js is a lightweight framework for testing client-side JavaScript code in a simulated environment. No browser required.” Here is a Python driver for it called python-zombie.
“PhantomJS is a headless WebKit with JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG. PhantomJS is an optimal solution for headless testing of web-based applications, site scraping, pages capture, SVG renderer, PDF converter and many other use cases.”

