Sikuli, the graphical scripting language from MIT

Sikuli is a new graphical scripting language created at MIT. The word “Sikuli” means “God’s eye” in the language of Mexico’s Huichol Indians and symbolizes the power to see and understand things unknown.

It is amazing because it’s a paradigm shift in the way we can code interactions with any GUI. It enables programming using GUI screenshots rather than lines of code.

As the article on the MIT site says: a programmer using Sikuli doesn’t need to know anything about the code underlying a GUI. In fact, Sikuli doesn’t know anything about it either. Instead, it uses computer vision algorithms to analyze what’s happening on-screen. “It’s a software agent that looks at the screen the way humans do.”

Let’s take the example of a weather map maintained by some meteorological department, which uses a certain color (say, dark gray) to show that it is going to rain in a particular region. A programmer using Sikuli can specify that a message should be sent to his mobile when that region changes to the rain color. Instead of using language (as in terminology) to describe the color or the region, the programmer can simply plug screenshots into the script: when this (the region) becomes this color (the color of rain, dark gray), send me a text message. This is a hypothetical scenario I came up with, similar to the bus example on the Sikuli website.
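
To make the weather map idea a bit more concrete, here is a minimal sketch of the "when this region becomes this color, tell me" logic in plain Python. It does not use Sikuli's API at all: the function names (`mean_color`, `region_matches_color`, `watch_region`), the tolerance value, and the tiny fake "screenshots" are all assumptions made up for illustration, and the text message is replaced by a simple callback.

```python
def mean_color(pixels, region):
    """Average (r, g, b) over region = (x, y, width, height) of a row-major pixel grid."""
    x, y, w, h = region
    totals = [0, 0, 0]
    for row in pixels[y:y + h]:
        for r, g, b in row[x:x + w]:
            totals[0] += r
            totals[1] += g
            totals[2] += b
    count = w * h
    return tuple(t / count for t in totals)


def region_matches_color(pixels, region, target, tolerance=30.0):
    """True when the region's average color is within `tolerance`
    (Euclidean distance in RGB space) of the target color."""
    avg = mean_color(pixels, region)
    distance = sum((a - t) ** 2 for a, t in zip(avg, target)) ** 0.5
    return distance <= tolerance


def watch_region(frames, region, target, notify):
    """Check each captured frame in turn and notify once, on the first match."""
    for index, pixels in enumerate(frames):
        if region_matches_color(pixels, region, target):
            notify("Rain color detected in frame %d" % index)
            return index
    return None


# Two tiny fake 4x4 "screenshots" (hypothetical data): a clear map,
# then the same map with a dark-gray patch in the watched region.
LIGHT, DARK_GRAY = (230, 230, 230), (70, 70, 70)
clear_map = [[LIGHT] * 4 for _ in range(4)]
rainy_map = [row[:] for row in clear_map]
for y in range(1, 3):
    for x in range(1, 3):
        rainy_map[y][x] = DARK_GRAY

messages = []  # stands in for the text message gateway
hit = watch_region([clear_map, rainy_map], (1, 1, 2, 2), DARK_GRAY, messages.append)
```

In a real script the frames would come from repeated screen captures and `notify` would call whatever SMS or mail service you use; both are left out here on purpose.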

Sikuli finds the on-screen targets that receive commands using a ‘computer vision engine’ that searches for the best-matching region. Scripts are written in Jython, a Python implementation on the Java VM, so you can use virtually any pure-Python module with them. There is a learning curve, but a basic understanding of Python is probably all that’s required.
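
The idea of "finding the best-matching region" can be illustrated with a toy template search. The sketch below is not Sikuli's actual engine; it is a brute-force search over a small grayscale grid that scores every position by the sum of squared differences and keeps the lowest score. The function name `find_best_match` and the sample data are assumptions for illustration only.

```python
def find_best_match(screen, template):
    """Slide `template` over `screen` (both 2D lists of grayscale values)
    and return ((x, y), score) for the position with the lowest
    sum of squared differences."""
    screen_h, screen_w = len(screen), len(screen[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best_pos, best_score = None, None
    for y in range(screen_h - tmpl_h + 1):
        for x in range(screen_w - tmpl_w + 1):
            score = 0
            for dy in range(tmpl_h):
                for dx in range(tmpl_w):
                    diff = screen[y + dy][x + dx] - template[dy][dx]
                    score += diff * diff
            if best_score is None or score < best_score:
                best_pos, best_score = (x, y), score
    return best_pos, best_score


# A fake 5x4 "screen" with a bright 2x2 "button" at column 2, row 1.
screen = [
    [10, 10, 10, 10, 10],
    [10, 10, 200, 90, 10],
    [10, 10, 90, 200, 10],
    [10, 10, 10, 10, 10],
]
# The "screenshot" of the button, slightly different from what is on screen,
# as a real capture would be after rendering or compression noise.
button = [
    [205, 88],
    [92, 198],
]

position, score = find_best_match(screen, button)
print(position, score)
```

Because the match is scored rather than exact, the search still lands on the button even though the captured image differs slightly from what is on screen, which is the property that makes screenshot-driven scripting workable.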

You can watch a YouTube video demo of the language in action here.

The first release of Sikuli contains Sikuli Script, a visual scripting API for Jython, and Sikuli IDE, an integrated development environment for easily writing visual scripts with screenshots. Sikuli Script automates anything you see on the screen without needing internal API support. You can programmatically control a web page, a desktop application running on Windows/Linux/Mac OS X, or even an iPhone application running in an emulator.

Data Sources:

http://web.mit.edu/newsoffice/2010/screen-shots-0120.html

Tutorials:

Tutorials on the Sikuli site

Videos:

YouTube video : Script for automating a Coda/Firefox workflow

[youtube=http://www.youtube.com/watch?v=6OtmMKYhEjg]

YouTube video : Automatically setting IP on Mac OS X

[youtube=http://www.youtube.com/watch?v=FxDOlhysFcM]

YouTube video : Tracking a Panda on a webcam

[youtube=http://www.youtube.com/watch?v=vGC9AJqJUqA&feature=related]

Downloads: Sikuli-related downloads are available here





2 Comments

  1. Shivshanker

    While this is a nice concept, its application seems more like a macro. Could this be used for automation testing? Maybe I need to understand more of its application as published by the authors. Unique concept for sure.

  2. nutsyputsy

    Shiv, at present, yes, it does look like a macro.

    It would be a cool concept to use it for automation testing, in that a QA person could write UI test cases based just on the visual changes on the webpage, without bothering about the backend workings. This could open up automation to the ‘non coding’ QA crowd.

    I think, though, that Sikuli’s uniqueness stems from the fact that you don’t have to know the code behind the UI; the visual changes in the UI alone can be used to trigger events. Wouldn’t it be cool if you had a page tracking the stock market graph and you (without even having any widget from that site) could add a horizontal line and put in a few lines of Sikuli code saying: alert me if the stock graph intersects the added line?
