Sunday 30 June 2013

GSoC - Week 2

This week we started looking at another module, SIMBAD. So far there has only been preliminary discussion and review, but the code skeleton is available in this PR.

One issue I am working on goes like this: we would like to display an HTML table (this one) to the user, to let her decide which VOTable fields to set. The first step is to scrape the table using BeautifulSoup (bs4). Next we'll store the results as JSON, which in Python takes only a couple of lines of code. What I am looking for is a good way to then display the data interactively in Python/IPython.
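As a rough sketch of the scraping and JSON steps: the HTML snippet, the field names and the table_to_records helper below are all made up for illustration, and the real page has its own layout that may need different handling.

```python
import json

from bs4 import BeautifulSoup

# Made-up stand-in for the real page, which has its own columns and markup.
HTML = """
<table>
  <tr><th>Field</th><th>Description</th></tr>
  <tr><td>field_a</td><td>First example field</td></tr>
  <tr><td>field_b</td><td>Second example field</td></tr>
</table>
"""


def table_to_records(html):
    """Turn the first <table> in html into a list of dicts keyed by header."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find("table").find_all("tr")
    headers = [cell.get_text(strip=True) for cell in rows[0].find_all("th")]
    records = []
    for row in rows[1:]:
        cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
        records.append(dict(zip(headers, cells)))
    return records


records = table_to_records(HTML)

# Storing the result as JSON really is just a line or two.
as_json = json.dumps(records, indent=2)
print(as_json)
```

Keeping each row as a dict keyed by the header text means the JSON stays readable on its own, without having to remember the column order.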

The obvious way to do this would be to use something like a table. Good news - Astropy already has this functionality and it is *truly fully loaded*. With astropy.table.Table you can page through the rows (much like the Unix more command), and in an IPython Notebook the table is displayed as an HTML table and looks very impressive. I spent quite a few hours poring over the source code and I was really inspired :)!

Then I tried my hand at adding rows/columns and creating a table from the HTML I had scraped, and I came across a funny unicode error. This was just a false alarm :) - an astropy table uses a numpy array underneath, and in a numpy array, if you need to enter a string of variable length, you should also specify a datatype for it; otherwise numpy tries to parse it with its default datatype and errors ensue! (This then led to a tour of numpy datatypes, which in turn led to an enjoyable diversion to the SciPy 2013 web page, and I spent a few minutes(?) looking excitedly at the plans - Astropy is presenting there (and it's at UT Austin too!).) So what is needed is dtypes=['object'] for a variable-length string. There are still some points to be smoothed out, which is one of the things I am looking at.
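To illustrate the numpy side of this, here is a small sketch using plain numpy arrays rather than an astropy table. It does not reproduce the unicode error I hit; it only shows the related behaviour that a default string dtype has a fixed width, while an object dtype holds strings of any length. The field names are made up.

```python
import numpy as np

# With no dtype given, numpy picks a fixed-width string type that is just
# wide enough for the longest initial value ("dec", 3 characters).
fixed = np.array(["ra", "dec"])
print(fixed.dtype.kind, fixed.dtype.itemsize)

# Assigning a longer string does not fail; it is silently cut to the width.
fixed[0] = "coordinates"
print(fixed[0])

# With dtype=object each element is an ordinary Python string,
# so the length can vary freely.
flexible = np.array(["ra", "dec"], dtype=object)
flexible[0] = "coordinates"
print(flexible[0])
```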

P.S. The code I include via a gist doesn't appear in the Feeder (this post doesn't have any but the June-16th one did) - looking into this.


Sunday 23 June 2013

GSoC - Week 1

This week was mainly spent refactoring the astroquery.irsa_dust module. A detailed discussion of the entire implementation can be found at this pull request. As it currently stands, the API for this module is almost finalized, except for some tweaks that may come up in the final review.

Apart from this, the order in which the remaining modules of astroquery will be refactored may be found here.

Another thing tackled this week was monkeypatching the remote tests that most commonly resulted in network errors/timeouts. Again, more details here.

Currently SIMBAD is at the top of our to-do list, so I have been trying to familiarize myself with this web service and have also made some preliminary attempts at rewriting the code to be consistent with the finalized API.

P.S. Anyone with an astronomical inclination could fork the revised irsa_dust module, use it and give feedback!

Sunday 16 June 2013

GSoC - Coding period starts!


The GSoC coding period starts on 17th June (today!). The last week has been quite enjoyable. The best part of GSoC is that it offers you the opportunity to learn many, many new and useful things. This week I have mostly been trying my hand at refactoring the irsa_dust module in astroquery as per the API finalized last week.

One very useful piece of software that I had heard about but never used myself is the IPython Notebook. It is great for creating examples to demonstrate functionality, and notebooks can be shared using gists and nbviewer. It has very handy features like auto-completion - and I am glad that I have it installed and running now!

Another very useful thing I learnt is testing in Python. Astropy uses pytest as its main testing framework, and it is very simple to write tests with it. In fact, its catch line is "No API!". Given its superb test discovery rules and features, it sure is fun to get started writing tests using pytest.

One very useful trick my mentor showed me is to use the @pytest.mark.parametrize decorator. This is very useful if, say, you want to test a function for several different values of one or more parameters. Rather than writing redundant tests you could do something like:
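The gist that went here does not show up in this copy of the post, so what follows is only a sketch of the idea and not the actual astroquery test. The helper format_query_args, its dictionary keys and the test values are all hypothetical.

```python
import pytest


def format_query_args(coordinate, radius):
    """Hypothetical helper: build request parameters for a query."""
    args = {"target": coordinate}
    if radius is not None:
        args["radius_deg"] = radius
    return args


# Each tuple is one test case; pytest runs the test once per tuple.
@pytest.mark.parametrize(
    ("coordinate", "radius", "expected"),
    [
        ("m31", None, {"target": "m31"}),
        ("m31", 2.0, {"target": "m31", "radius_deg": 2.0}),
        ("00h42m44.3s +41d16m9s", 5.0,
         {"target": "00h42m44.3s +41d16m9s", "radius_deg": 5.0}),
    ],
)
def test_format_query_args(coordinate, radius, expected):
    assert format_query_args(coordinate, radius) == expected
```

Each case is reported separately in the pytest output, so a failure points straight at the offending combination of values.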

This will then test the function for each value of the parameters coordinate and radius and check that the output is as expected.

Another feature that I learnt about is monkeypatching. This can be used to create dummy objects that mimic the behavior of a real resource, for instance HTTP response and request objects. In astroquery it can be used to create patches that replace actual requests to, and responses from, the remote server.
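A minimal sketch of the idea, using pytest's monkeypatch fixture with the standard library's urllib.request as the "remote" call. The MockResponse class, the fetch_text helper and the URL are hypothetical and are not taken from astroquery.

```python
import urllib.request


class MockResponse:
    """Dummy object that mimics the part of an HTTP response we use."""

    def __init__(self, content):
        self.content = content

    def read(self):
        return self.content


def fetch_text(url):
    """Hypothetical helper that would normally contact the remote server."""
    response = urllib.request.urlopen(url)
    return response.read().decode("utf-8")


def test_fetch_text(monkeypatch):
    # Replace the real network call with a function returning canned data.
    def fake_urlopen(url, *args, **kwargs):
        return MockResponse(b"canned reply")

    # pytest supplies the monkeypatch fixture and undoes the patch afterwards.
    monkeypatch.setattr(urllib.request, "urlopen", fake_urlopen)
    assert fetch_text("http://example.invalid/query") == "canned reply"
```

Because the patch is undone when the test finishes, other tests still see the real urlopen, and this test never touches the network.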

Also this week I received a one-year student membership to ACM (which includes a subscription to the digital library) as a part of GSoC!


Saturday 8 June 2013

GSoC - week 2 of interim period


This week I have mainly been reading through the Astroquery code base as well as relevant portions of Astropy. I also looked into some advanced Python techniques like decorators, descriptors, generators and context managers, and OOP in Python. I found some really helpful resources on the web, which I will list here:

This is definitely a must-read:
http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html

http://www.pgbovine.net/python-idioms.htm

This may require signing in, but it has some really good links:
https://www.udacity.com/wiki/cs212/python_learning_aids

http://wiki.python.org/moin/PythonDecoratorLibrary#Property_Definition

And also an ebook:
http://www.linuxtopia.org/online_books/programming_books/python_programming/

Finally, the best compilation of all aspects of Python in one place - The Hitchhiker's Guide to Python!:
http://docs.python-guide.org/en/latest/

Also, the API for Astroquery is almost finalized! The revised version may be found here:
http://astroquery.readthedocs.org/en/latest/astroquery/api.html

That's all for now!