In [1]:
__author__ = 'Robert Nikutta, Stéphanie Juneau, Knut Olsen & the NOAO Data Lab team <datalab@noao.edu>'
__version__ = '20170603' # yyyymmdd

(Re)Discovering the Hydra II dwarf galaxy

Robert Nikutta, Stéphanie Juneau, Knut Olsen & the NOAO Data Lab team

Data retrieval

We will retrieve data from Field 169 of the SMASH catalog (Nidever et al. (subm.)) and look for overdensities of blue objects.

The required columns are RA, Dec, and the g, r, i magnitudes.

Detection

We will convolve the spatial distribution of our dataset with a pair of Gaussian kernels and subtract the results, as done by e.g. Stanford et al. (2005, ApJ, 634, 2, L129) (galaxy clusters), or Koposov et al. (2008, ApJ, 686, 279) (MW satellites). This has the effect of convolving the spatial distribution with a Mexican hat filter, which is useful for detecting objects at a desired spatial scale.
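To see why subtracting the two convolutions is equivalent to a single Mexican hat (Ricker) filter, one can compare the difference of two Gaussian kernels against astropy's built-in Mexican hat kernel. A minimal sketch (kernel widths chosen for illustration only, not the values used below):

import numpy as np
from astropy.convolution import Gaussian2DKernel, MexicanHat2DKernel

sigma = 2.0                                      # width of the positive core, in pixels
small = Gaussian2DKernel(sigma, x_size=51, y_size=51)
big = Gaussian2DKernel(5*sigma, x_size=51, y_size=51)
dog = small.array - big.array                    # difference of Gaussians
ricker = MexicanHat2DKernel(sigma, x_size=51, y_size=51).array

# both profiles show a positive core surrounded by a shallow negative moat
print dog[25, 20:31].round(4)
print ricker[25, 20:31].round(4)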

Imports & initialization

In [2]:
# std lib
import sys
from getpass import getpass

# 3rd party
import numpy as np
from astropy import utils, io, convolution, stats
from photutils import find_peaks
import pylab as plt
import matplotlib
%matplotlib inline
from IPython.display import FileLink

# Data Lab
from dl import authClient as ac, queryClient as qc, helpers

# plots default setup
plt.rcParams['figure.figsize'] = (7, 6)

Log in

First, obtain a Data Lab authentication token, which needs to be passed along to all Data Lab server-side operations.

In [3]:
token = ac.login(raw_input('Enter user name: '),getpass('Enter password: ')) # username here is 'anonymous'
Enter user name: anonymous
Enter password: ········

Query the SMASH DR1 database

We will query the averaged photometry table of the SMASH catalog and select field #169. We will restrict the query to objects with photometry that includes long exposures (depthflag > 1), point-like shapes (|sharp| < 0.5), blue colors (-0.4 < g-r < 0.4), and g magnitudes between 9 and 25. We will also exclude objects with fewer than 4 detections in each of the g and r bands, to improve the spatial signal-to-noise ratio.

Construct the query string

In [4]:
field = 169 # SMASH field number to query

# Create the query string; SQL keyword capitalized for clarity
#   depthflag > 1 = no short exposures please
#   ndetr, ndetg > 3 = more than 3 detections in r & g bands
#   abs(sharp) < 0.5 = avoid broad objects
query =\
"""SELECT ra,dec,gmag,rmag,imag
   FROM smash_dr1.object
   WHERE fieldid = '%d' AND
         depthflag > 1 AND
         ndetr > 3 AND ndetg > 3 AND
         abs(sharp) < 0.5 AND
         gmag BETWEEN 9 AND 25 AND
         (gmag-rmag) BETWEEN -0.4 AND 0.4""" % field

Submit the query

Running the query in synchronous mode is easy with the queryClient module (imported above as qc).

In [5]:
%%time
response = qc.query(token,query) # response is by default a CSV-formatted string
CPU times: user 180 ms, sys: 116 ms, total: 296 ms
Wall time: 23.9 s
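For longer-running queries, the same call can be submitted asynchronously. A minimal sketch, assuming this version of queryClient's asynchronous interface (the async keyword and the status/results calls; names may differ in other releases):

import time

jobId = qc.query(token, query, async=True)     # submit; returns a job ID instead of data
while qc.status(token, jobId) != 'COMPLETED':  # poll until the job finishes
    time.sleep(5)
response = qc.results(token, jobId)            # retrieve the CSV result string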

We can use a helper function to convert the query result into a data structure. Let's convert to a Pandas dataframe:

In [6]:
R = helpers.convert(response,'pandas') # R is a pandas dataframe
print "Number of objects:", R.shape[0]
print R.head()
Returning Pandas dataframe
Number of objects: 104973
           ra        dec       gmag       rmag       imag
0  184.876674 -32.873511  24.746605  24.838743  24.185682
1  184.876606 -32.870861  24.156397  24.068817  23.074945
2  184.875853 -32.867214  24.084047  24.028061  23.630045
3  184.877080 -32.869780  24.482061  24.446104  23.858896
4  184.878492 -32.866905  24.678942  24.714973  24.624266
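If the helpers module is unavailable, the CSV string can also be parsed directly with pandas; a sketch (StringIO is the Python 2 location; use io.StringIO in Python 3):

import pandas as pd
from StringIO import StringIO

R2 = pd.read_csv(StringIO(response)) # same dataframe as above
print R2.shape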

Make a figure of the spatial distribution

You might spot some overdensities already.

In [7]:
plt.hexbin(R['ra'], R['dec'],gridsize=200)
plt.xlabel('RA'); plt.ylabel('Dec')
plt.colorbar(label='number of objects per spatial bin');

The Dwarf Filter

Here we define the dwarf filter as a differential convolution of a two-dimensional image using two Gaussian kernels; this has the effect of convolution with a Mexican hat filter. The default kernel shapes look for objects on the scale of a few arcmin. The output includes a clipped array of the convolved spatial distribution, which we will use for peak detection.

In [8]:
def dwarf_filter(ra, dec, fwhm_small=2.0, fwhm_big=20):

    """Differential convolution with 2D Gaussian kernels.
    
       Based on Koposov et al. (2008).
       Code by Ken Mighell and Mike Fitzpatrick. Minor edits by RN.
       
       Parameters
       ----------
       ra, dec : float or array
           RA & Dec in degrees.
    
       fwhm_small, fwhm_big : float
           Full-width half maximum sizes of the small and big Gaussian kernels
           to use in convolution, in arcminutes.
    """
    
    x, y = ra, dec

    print "Computing differential convolution .... ",
    sys.stdout.flush()

    # Information about declination (y) [degrees]
    ymean = (y.min() + y.max()) / 2.0
    ydiff_arcmin = (y.max() - y.min()) * 60.0 # convert from degrees to arcmin

    # Information about right ascension (x) [degrees]:
    # scale the RA extent by cos(Dec) to convert it to a true angular extent
    xdiff_angular = (x.max() - x.min()) * np.cos(ymean*(np.pi/180.0))

    # convert from degrees to arcmin
    xdiff_angular_arcmin = xdiff_angular * 60.0

    # Get the number of one-arcmin pixels in the X and Y directions:
    nx = np.rint(xdiff_angular_arcmin).astype('int')
    ny = np.rint(ydiff_arcmin).astype('int')

    print nx,ny
    
    # Create a two-dimensional histogram of the raw counts:
    Counts, xedges, yedges  = np.histogram2d (x, y, (nx,ny) )
    extent = [xedges[0], xedges[-1], yedges[0], yedges[-1]]
    raw_hist = np.rot90(Counts).copy() # rotate into image orientation (rows = Dec)

    # Make the small and big Gaussian kernels; Gaussian2DKernel takes the
    # standard deviation sigma = FWHM/2.35, in units of the 1-arcmin pixels.
    kernel_small = convolution.Gaussian2DKernel(fwhm_small/2.35,factor=1)
    kernel_big = convolution.Gaussian2DKernel(fwhm_big/2.35,factor=1)

    # Compute the differential convolution kernels.
    conv_big = convolution.convolve(raw_hist, kernel_big)
    conv_small = convolution.convolve(raw_hist, kernel_small)
    conv_delta = conv_small - conv_big
    delta = conv_delta.copy()

    # Compute statistics and the clipping floor
    mean = np.mean(delta, dtype='float64')
    sigma = np.std(delta, dtype='float64')
    sigmaRaw = np.std(raw_hist,dtype='float64')
    floor = mean # clip the filtered image at its mean

    print 'dwarf_filter: mean = %g  sigma = %g sigmaRaw = %g' % (mean, sigma, sigmaRaw)

    # Clip to specified limits.
    clipped = delta.copy()
    clipped[ delta < floor ] = floor

    # Return the computed fields.
    return raw_hist, extent, delta, clipped, sigma

Run the dwarf filter

We'll use the default convolution kernels of 2 and 20 arcminutes in size.

In [9]:
%%time
small_k, big_k = 2., 20.  # kernel sizes in arcminutes
raw, extent, delta, clipped, dsigma = dwarf_filter(R['ra'],R['dec'],fwhm_small=small_k,fwhm_big=big_k)
Computing differential convolution ....  152 144
dwarf_filter: mean = 0.0893917  sigma = 1.79452 sigmaRaw = 5.3359
CPU times: user 592 ms, sys: 12 ms, total: 604 ms
Wall time: 604 ms

Plot the convolved 2D histogram

In [10]:
fig, ax = plt.subplots()
im = plt.imshow(clipped)
plt.xlabel('pixel')
plt.ylabel('pixel')
plt.colorbar(label='relative spatial density after convolution');

Some peaks are visible; let's locate them automatically...

Identify peaks

We'll use the photutils package to identify peaks in the clipped, filtered image, using a detection threshold 10 counts above the sigma-clipped median.

In [11]:
# find peaks
mean, median, std = stats.sigma_clipped_stats(clipped,sigma=3.0,iters=5)    
tbl = find_peaks(clipped,median+10,box_size=small_k*2)

# add ra & dec positions of peaks found
a, b = extent[:2]
xvec = np.arange(a,b,(b-a)/clipped.shape[1])
a, b = extent[2:]
yvec = np.arange(a,b,(b-a)/clipped.shape[0])

tbl['ra'] = xvec[tbl['x_peak']]
tbl['dec'] = yvec[-tbl['y_peak']-1] # y axis was flipped by np.rot90 in dwarf_filter
print tbl
x_peak y_peak   peak_value        ra           dec      
------ ------ ------------- ------------- --------------
    86     89 11.3119628554 185.410559533 -31.9760034474
    34    100 11.5127808121 184.393480413 -32.1595418034
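As a sanity check, the first peak can be compared against the published position of Hydra II (approximately RA = 185.43 deg, Dec = -31.99 deg; Martin et al. 2015, ApJ, 804, L5); a short sketch with astropy coordinates:

from astropy.coordinates import SkyCoord
import astropy.units as u

hydra2 = SkyCoord(185.43*u.deg, -31.99*u.deg)            # approximate literature position
peak = SkyCoord(tbl['ra'][0]*u.deg, tbl['dec'][0]*u.deg) # first detected peak
print peak.separation(hydra2).arcmin                     # expect a separation of order an arcminute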

Show the identified density peaks

In [12]:
ecs = ['w','y'] # frame colors of the two boxes
ax.scatter(tbl['x_peak'],tbl['y_peak'],marker='s',s=tbl['peak_value']*40,c='none',edgecolors=ecs,lw=3) # draw on the axes created above
fig  # re-display the (updated) figure
Out[12]:

Inspect the image cutouts around the peaks

Simple Image Access service

Data Lab comes with batteries included. Image cutout and download services are built in.

We'll just write two little functions:

  • one to download the deepest stacked images available in the given bands at a given sky position,
  • and one to plot several images side by side.
In [13]:
# set up SIA
from pyvo.dal import sia
DEF_ACCESS_URL = "http://zeus1.datalab.noao.edu/siapv1"
#DEF_ACCESS_URL = "http://datalab.noao.edu/sia/smash"
svc = sia.SIAService(DEF_ACCESS_URL)

# a little func to download the deepest stacked images
def download_deepest_images(ra,dec,fov=0.1,bands=list('gri')):
    imgTable = svc.search((ra,dec), (fov/np.cos(dec*np.pi/180), fov), verbosity=2).votable.to_table()
    print "The full image list contains", len(imgTable), "entries"
    
    sel0 = (imgTable['proctype']=='Stacked') & (imgTable['prodtype']=='image') # basic selection
    images = []
    for band in bands:
        print "Band %s:" % band,

        sel = sel0 & (imgTable['obs_bandpass']==band) # add 'band' to selection
        Table = imgTable[sel] # select
        
        row = Table[np.argmax(Table['exptime'].data.data.astype('float'))] # pick image with longest exposure time
#preview        url = row['preview'] # get the download URL
        url = row['access_url'] # get the download URL

        print 'downloading deepest stacked image...'
#preview        img = plt.imread(utils.data.download_file(url,cache=True,show_progress=False,timeout=120))
        img = io.fits.getdata(utils.data.download_file(url,cache=True,show_progress=False,timeout=120))
        images.append(img)
        
    print "Downloaded %d images." % len(images)
    return images

# multi panel image plotter
def plot_images(images,geo=None,panelsize=5,bands=list('gri'),cmap=matplotlib.cm.gray_r):
    n = len(images)
    if geo is None: geo = (n,1)
        
    fig = plt.figure(figsize=(geo[0]*panelsize,geo[1]*panelsize))
    for j,img in enumerate(images):
        ax = fig.add_subplot(geo[1],geo[0],j+1)
        ax.imshow(img,origin='lower',interpolation='none',cmap=cmap,norm=matplotlib.colors.PowerNorm(0.1,vmin=1.01*img.min(),vmax=img.max()))
        ax.set_title('%s band' % bands[j])
        ax.xaxis.set_visible(False)
        ax.yaxis.set_visible(False)
        
    fig.subplots_adjust(wspace=0.1)

Get images for the "left yellow" box

In [14]:
%%capture --no-stdout --no-display
# capturing output b/c of ugly pyvo warnings
bands = list('gri')
idx = 1
images = download_deepest_images(tbl['ra'][idx], tbl['dec'][idx], fov=0.1, bands=bands) # FOV in deg
plot_images(images,bands=bands)
The full image list contains 2604 entries
Band g: downloading deepest stacked image...
Band r: downloading deepest stacked image...
Band i: downloading deepest stacked image...
Downloaded 3 images.

Looks like a galaxy cluster!

Now the "white center box" object

In [15]:
%%capture --no-stdout --no-display
# capturing output b/c of ugly pyvo warnings
idx = 0
images = download_deepest_images(tbl['ra'][idx], tbl['dec'][idx],fov=0.1,bands=bands)
plot_images(images)
The full image list contains 2713 entries
Band g: downloading deepest stacked image...
Band r: downloading deepest stacked image...
Band i: downloading deepest stacked image...
Downloaded 3 images.
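If you want to keep any of these cutouts, they can be written back out as FITS files with astropy (a sketch; the filenames are made up, and the overwrite keyword requires astropy >= 1.3, where older versions used clobber):

for band, img in zip(bands, images):
    io.fits.writeto('cutout_%s.fits' % band, img, overwrite=True) # hypothetical filenames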