Get a Coordinates object from an astronomical name by adrn · Pull Request #556 · astropy/astropy

adrn · 2012-12-12T16:05:45Z

I threw this together in ~20 minutes, but I thought it'd be killer to have a feature that would let us get a Coordinates object from an astronomical name without any extra work. Right now, this looks like:

import astropy.coordinates as coord
m42_coords = coord.ICRSCoordinates.resolve_name("M42")
m42_gal_coords = coord.GalacticCoordinates.resolve_name("M42")

It just does a query to Sesame, and parses the returned text (see: http://cdsweb.u-strasbg.fr/doc/sesame.htx). The service lets you specify a database to search, e.g. SIMBAD, NED, Vizier, or all, so you can specify that like:

castor_coords = coord.ICRSCoordinates.resolve_name("castor", database="simbad")

but the default is to just search all.

Anyway, it's a pretty dumb-simple implementation so if anyone has ideas, let me know.
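
As a rough illustration of the query-and-parse approach described above, the sketch below builds a Sesame request URL and pulls decimal-degree coordinates out of a text reply with a regular expression. The base URL is the one quoted later in this thread. The one-letter database codes, the layout of the `%J` line, the helper names, and the sample reply are all assumptions made for illustration, not the code in this PR.

```python
import re
from urllib.parse import quote

SESAME_URL = "http://cdsweb.u-strasbg.fr/cgi-bin/nph-sesame"

# Assumed mapping from a database name to a one-letter Sesame code.
DATABASE_CODES = {"all": "A", "simbad": "S", "ned": "N", "vizier": "V"}

# Assumed layout of the coordinate line: "%J <ra_deg> <dec_deg> = ..."
J2000_LINE = re.compile(r"%J\s*([0-9.]+)\s*([+\-.0-9]+)")


def build_query_url(name, database="all"):
    """Return the URL used to resolve `name` in the given database."""
    code = DATABASE_CODES[database.lower()]
    return "{0}/{1}?{2}".format(SESAME_URL, code, quote(name))


def parse_response(text):
    """Return (ra_deg, dec_deg) as floats, or None if no %J line is found."""
    match = J2000_LINE.search(text)
    if match is None:
        return None
    ra, dec = match.groups()
    return float(ra), float(dec)


# Hand-written stand-in for a reply; the values are illustrative only.
sample = "# M42\n#=S=Simbad\n%J 83.82208 -5.39111 = 05:35:17.3 -05:23:28\n"
print(build_query_url("M 42", "simbad"))
print(parse_response(sample))
```

Keeping URL construction and parsing in separate functions means the parser can be exercised against canned text without touching the network.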

astrofrog · 2012-12-12T16:35:13Z

This is very cool!

Now to the main comment I have: I wonder whether we should host an intermediate page that will ensure that there is no breakage if the query URL or the format of the output changes? This is an issue with any other online querying (e.g. the survey querying functionality we discussed with @demitri at the meeting). Maybe this requires having a page on astropy.org with some javascript acting as the intermediate? Then if the Sesame page moves or changes output, we just update the javascript. Otherwise users would have to wait for the next stable release for things to work again.

demitri · 2012-12-12T16:46:03Z

It may be more work, but I'm still a strong advocate of hosting our own data and database. Unless you are using a solid, stable, and funded(!) API service, the only way you can be sure to serve data reliably is to do it yourself. It's easy to do, and I think web scraping should not be considered as an option. (If that's what you're doing - I've not looked.)

Also, yes, very cool idea. I'd argue there's no reason for the method name from_name:

a = coord.ICRSCoordinates("M42")
a = coord.ICRSCoordinates("12h45m23s 23.4313")

It's just another string we can parse.

astrofrog · 2012-12-12T16:49:10Z

@demitri - sesame is not web-scraping, it is a proper API, so I think it would be safe enough to rely on them. I'm just suggesting adding a middle-man that can translate if their API ever changes, which would be sufficient, and could probably be done with a simple intermediate page. What you are suggesting would be great, but it's not something we can set up overnight.

adrn · 2012-12-12T16:49:24Z

@astrofrog Yea, I agree -- just wanted an initial implementation to get the idea out there :)

@demitri Not web scraping, it just provides a GET interface to querying these databases and returns a structured response. Actually I see now that one of the options is XML, which may be better than what this currently does (regex matching)
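
To make the XML idea concrete, here is a minimal sketch of reading coordinates from a structured reply with the standard library instead of a regex. The element names (`Target`, `Resolver`, `jradeg`, `jdedeg`) and the sample document are assumptions for illustration; the real Sesame schema should be checked before relying on them.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for an XML reply; element names are assumed.
SAMPLE_XML = """<Sesame>
  <Target>
    <name>M42</name>
    <Resolver name="S=Simbad">
      <jradeg>83.82208</jradeg>
      <jdedeg>-5.39111</jdedeg>
    </Resolver>
  </Target>
</Sesame>"""


def parse_xml_response(xml_text):
    """Return (ra_deg, dec_deg) from the first resolver entry, or None."""
    root = ET.fromstring(xml_text)
    resolver = root.find("./Target/Resolver")
    if resolver is None:
        return None
    ra = resolver.findtext("jradeg")
    dec = resolver.findtext("jdedeg")
    if ra is None or dec is None:
        return None
    return float(ra), float(dec)


print(parse_xml_response(SAMPLE_XML))
```

Looking elements up by name tends to survive cosmetic changes in the reply (extra whitespace, reordered fields) that could break a regex.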

demitri · 2012-12-12T16:53:25Z

OK, that's better. I do worry about depending on others, and I'm worried that a (particularly javascript) middleware would be a bottleneck.

And no, such a database can't be set up overnight. But pretty close to it - it's a very simple thing. For things like common names (even many thousands of them), the external service is not really providing any added value.

astrofrog · 2012-12-12T17:06:43Z

Just a thought - we could always have the middleware as a fallback, then no performance issues by default.
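
The fallback idea can be sketched as a loop over candidate endpoints that returns the first reply that succeeds. The function name, the `fetch` callable, and the two URLs below are hypothetical; the point is only that the intermediate service is contacted when the primary one fails, so it adds no cost in the normal case.

```python
def fetch_with_fallback(urls, fetch):
    """Try each URL in order and return the first successful response.

    `fetch` is any callable that takes a URL and returns the response
    text, raising OSError on failure (urllib's URLError is a subclass).
    """
    errors = []
    for url in urls:
        try:
            return fetch(url)
        except OSError as exc:
            errors.append("{0}: {1}".format(url, exc))
    raise RuntimeError("all services failed: " + "; ".join(errors))


# Demonstration with a fake fetcher, so no network access is needed.
def fake_fetch(url):
    if "primary" in url:
        raise OSError("connection refused")
    return "reply from " + url


urls = ["http://primary.invalid/sesame", "http://fallback.invalid/sesame"]
print(fetch_with_fallback(urls, fake_fetch))
```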

demitri · 2012-12-12T17:20:52Z

I'd feel much better about that.

perwin · 2012-12-12T17:32:40Z

This is a really nice demonstration!

As for middleware -- fallback sounds like a good idea, given that you don't know when someone is going to decide to run a search on several thousand objects at once -- how much bandwidth and traffic is this hypothetical middleware server going to handle?

@demitri -- by querying via Sesame, you're not running queries on "thousands of names", you're gaining access to tens of millions of names. (Well, NED claims to have ~ 190 million unique objects, so "tens of millions" is an underestimate.) I think it's a little overly ambitious to try duplicating all of that functionality.

demitri · 2012-12-12T17:39:46Z

@perwin Thousands of names, a few tens of millions... with today's databases and hardware, there's hardly a difference. Can Sesame handle queries of several thousand objects at once? If we point to many different databases, how do we communicate the different limitations of each? At the moment, we're only talking about name translation. I don't know all of what Sesame does and certainly don't want to replicate all that it does, but I think there is low-hanging fruit.

taldcroft · 2012-12-12T18:17:57Z

@demitri - I think you are vastly underestimating the effort required to maintain a name server like what is provided through Sesame. It's not just a one-time ingest, the names and data are continually being updated as well.

adrn · 2012-12-12T20:03:10Z

@demitri @taldcroft Yea, I agree -- that is extremely ambitious and seems unnecessary.

I'll see if I can massage this into the string parsing stuff (get rid of the classmethod), but at first glance it seems like it might be kind of kludgy to get right. Will look into it more tonight.

astrofrog · 2012-12-12T20:28:46Z

I feel like a = coord.ICRSCoordinates("M42") may be slightly too magical, especially since there may be cases where source names start to look a lot more like coordinates. So I'm ok with the current class method approach - though maybe we could call it resolve_name instead of from_name?

demitri · 2012-12-12T20:52:00Z

a = coord.ICRSCoordinates(name="M42")

adrn · 2012-12-12T21:08:50Z

I'm fine with either WhateverCoordinates.resolve_name("m42") or WhateverCoordinates(name="M42") -- anyone else have an opinion?

astrofrog · 2012-12-12T21:49:01Z

Using a separate method allows additional arguments without interfering with the main __init__ ones. I would personally go with the class method for now.

adrn · 2012-12-12T21:49:43Z

Ah, right, that's why I made it a classmethod -- so the user could specify a search database.
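
The argument for a class method can be shown with a toy class: the alternate constructor takes its own keyword arguments (here `database`) while `__init__` stays limited to the coordinate values. The class, the `resolver` parameter, and the lookup table are hypothetical stand-ins, not astropy's implementation; the method name `from_name` follows the commit messages in this PR.

```python
class Coordinates:
    """Toy stand-in for a coordinates class (not astropy's)."""

    def __init__(self, ra_deg, dec_deg):
        # __init__ only knows about coordinate values.
        self.ra_deg = ra_deg
        self.dec_deg = dec_deg

    @classmethod
    def from_name(cls, name, resolver, database="all"):
        """Build an instance by resolving `name` through `resolver`.

        `resolver` is a callable (name, database) -> (ra_deg, dec_deg).
        Using `cls` means subclasses get instances of their own type.
        """
        ra_deg, dec_deg = resolver(name, database)
        return cls(ra_deg, dec_deg)


# Fake resolver with illustrative values, so the sketch runs offline.
def fake_resolver(name, database):
    table = {"m42": (83.82208, -5.39111)}
    return table[name.lower()]


m42 = Coordinates.from_name("M42", fake_resolver, database="simbad")
print(m42.ra_deg, m42.dec_deg)
```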

wkerzendorf · 2012-12-12T22:22:08Z

That looks really cool - however I would separate this from coordinates. Sesame is not only linked to coordinates, but also to other data types like object type, reference count, etc. How about the other way round: you resolve an astronomical object and you get back a Python object with a coordinate object as one of its properties (which can then be converted to anything). Maybe it can live in vo.

astrofrog · 2012-12-12T22:32:42Z

@wkerzendorf - I agree that the core sesame functionality can live in e.g. utils or vo, but on the other hand there's no reason there can't be a shortcut in the coordinates sub-package.
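
One way to picture the two proposals together: a general resolver returns an object that carries several properties, and a thin shortcut exposes only the coordinates. Everything below (the class, function names, and the in-memory catalog with its placeholder values) is hypothetical and exists only to show the layering.

```python
from dataclasses import dataclass


@dataclass
class ResolvedObject:
    """Hypothetical result of a name lookup, carrying more than coordinates."""
    name: str
    object_type: str
    ra_deg: float
    dec_deg: float

    @property
    def coordinates(self):
        return (self.ra_deg, self.dec_deg)


# Placeholder catalog standing in for a remote service.
_FAKE_CATALOG = {"m42": ("HII region", 83.82208, -5.39111)}


def resolve_object(name):
    """General-purpose lookup (could live in a utils or vo module)."""
    object_type, ra_deg, dec_deg = _FAKE_CATALOG[name.lower()]
    return ResolvedObject(name, object_type, ra_deg, dec_deg)


def coordinates_from_name(name):
    """Shortcut for callers who only want the coordinates."""
    return resolve_object(name).coordinates


print(resolve_object("M42"))
print(coordinates_from_name("M42"))
```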

adrn · 2012-12-12T22:34:36Z

Yea I agree this should probably go elsewhere, but the API doesn't have to change if we move it somewhere else.

embray · 2012-12-12T22:38:30Z

I'll just add: I really like the existing classmethod approach. But I'm not the intended audience :)

taldcroft · 2012-12-13T03:13:22Z

+1 on class method, -1 on name in the coordinates __init__. There will very likely need to be additional args which will clutter up the coordinates __init__. @adrn - this is a nice bit of shiny-ness!

mwcraig · 2012-12-13T04:33:02Z

One suggestion--wrap your tests in a check to see whether sesame is up. I wasted almost an hour a couple weeks ago when a similar lookup I wrote had tests that suddenly started to fail because sesame wasn't responding to requests.

adrn · 2012-12-13T05:09:03Z

@mwcraig Good idea! Thanks for the tip

embray · 2012-12-13T19:24:03Z

If Sesame is down, would the idea be to just mark the tests as skipped?
I forget, but when py.test skips a test is there a way to output a message as to why it was skipped? I would want to know that too. Or at least output a warning.

adrn · 2012-12-14T19:23:31Z

I think you can control an output message, for example see here: http://pytest.org/latest/skipping.html#evaluation-of-skipif-xfail-conditions

I think I just have to add something like this to the test:

if urllib.urlopen("http://cdsweb.u-strasbg.fr/cgi-bin/nph-sesame").getcode() != 200:
    pytest.skip("SESAME appears to be down, skipping test_database_specify.py:test_names()...")

I've added that to the tests, and pushed it up.
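
For readers on Python 3, a comparable check might look like the sketch below. It uses `urllib.request.urlopen` with a timeout and treats any connection error as "down". The helper name `service_is_up` and the injectable `opener` argument are assumptions added so the check can be exercised without network access; they are not part of this PR.

```python
from urllib.request import urlopen

SESAME_URL = "http://cdsweb.u-strasbg.fr/cgi-bin/nph-sesame"


def service_is_up(url=SESAME_URL, opener=urlopen, timeout=5):
    """Return True if `url` answers with HTTP 200 within `timeout` seconds."""
    try:
        response = opener(url, timeout=timeout)
        try:
            return response.getcode() == 200
        finally:
            response.close()
    except OSError:
        # URLError and HTTPError both derive from OSError on Python 3.
        return False


def test_names():
    import pytest  # imported here so the helper above works without pytest

    if not service_is_up():
        pytest.skip("SESAME appears to be down, skipping test_names()")
    # ... assertions against the live service would go here ...
```

The message passed to `pytest.skip` is shown in the skip summary when pytest is run with `-rs`, which addresses the question of knowing why a test was skipped.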

embray · 2012-12-14T19:34:35Z

Great--I think that's important for something like this, where I could hypothetically run the tests multiple times in a row without changing a thing and see different numbers of tests being skipped. I would want to know why so that I don't go crazy.

adrn · 2012-12-14T22:05:41Z

Huh. The build seems to be failing for Python 3.2, but not where I expected. I didn't realize that urllib2 got merged into urllib in 3.2, so why isn't the build failing at the import urllib2 line?

embray · 2012-12-14T23:13:05Z

I'm pretty sure 2to3 will convert urllib2 -> urllib.

embray · 2012-12-14T23:14:31Z

The error you're getting is because reading from a web site returns bytes by default--you have to decode them before passing them through a regular expression meant for text.
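
A small sketch of the failure and the fix on Python 3: a `str` regular expression cannot search a `bytes` object, so the payload is decoded first. The byte string below stands in for what reading the HTTP response would return, and UTF-8 is an assumed encoding for the reply.

```python
import re

PATTERN = re.compile(r"%J\s*([0-9.]+)\s*([+\-.0-9]+)")

# Stand-in for the bytes returned by reading the HTTP response.
raw = b"%J 83.82208 -5.39111 = 05:35:17.3 -05:23:28\n"

# Searching bytes with a text pattern raises TypeError on Python 3.
try:
    PATTERN.search(raw)
    error = None
except TypeError as exc:
    error = str(exc)

# Decode first (UTF-8 assumed), then apply the text pattern.
text = raw.decode("utf-8")
ra, dec = PATTERN.search(text).groups()
print(error)
print(ra, dec)
```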

adrn · 2012-12-15T04:03:05Z

Yea, I see that, this must be a 3.0 thing?

eteq · 2013-01-25T20:54:48Z

The tests are passing except for one that hung, so I went ahead and merged this (with some minor modifications blessed by @adrn). backport when ready, @iguananaut!

eteq · 2013-01-25T20:55:20Z

Oh, and I reassigned this to v0.2 so that your script will see it, @iguananaut

adrn added 21 commits January 25, 2013 11:46

add timeout to urlopen, fix returns docstring

b49217d

update tests with method name change

379c63b

remove urlerror, define custom exception

d9ac756

add custom NAmeResolveError

614af99

fix docstring at top of page

026b194

add documentation of from_name to main coordinates doc page

4a4d50f

make default database and timeout configurationitems

cd4881c

update configuration items with erik's suggestions

8eb3081

fix typo

80bbe6e

fix a few more typos..

1feb1f2

database can only be changed through the configuration item

19365a7

change timeout to 5 seconds

924aba6

loop through urls until one works

525120b

mark test functions with remote_data

671752a

change SESAME_URL to be a list

89ad552

clean up syntax with eriks suggestions

6922e61

reference get_icrs_coordinates docstring in from_name

58e821b

add line to CHANGES explaining name resolve feature

f938312

added whats new entry for name resolve code

b05b8d2

import name resolve in coordinates __init__

d6a19d0

fix indentation of docstrings

b5616d7

eteq added a commit that referenced this pull request Jan 25, 2013

Merge pull request #556 from adrn/coordinates-ned (with small modification from eteq/coordinates-ned)

1fcd472

eteq merged commit b5616d7 into astropy:master Jan 25, 2013

eteq added a commit that referenced this pull request Jan 25, 2013

Merge pull request #556 from adrn/coordinates-ned (with small modification from eteq/coordinates-ned)

531504d

eteq mentioned this pull request Jan 25, 2013

Add a detailed example describing how to define custom coordinate system #645

Closed

keflavich pushed a commit to keflavich/astropy that referenced this pull request Oct 9, 2013

Merge pull request astropy#556 from adrn/coordinates-ned (with small modification from eteq/coordinates-ned)

53fb672

jeffjennings pushed a commit to jeffjennings/astropy that referenced this pull request Jul 2, 2025

Merge pull request astropy#556 from hamogu/coco_no_affil

d3586bb

Remove affiliated package decisions from Coco

9 participants