pycf3 tutorial

Interactive Version

Launch Binder for an interactive version of this tutorial!

Introduction

This tutorial is a guide on how to interact with the two Cosmicflows-3 Distance-Velocity Calculators from Python code. These calculators are featured in Kourkchi et al. 2020, AJ, 159, 67.

pycf3 presents an object-oriented API that gives access to the same capabilities as http://edd.ifa.hawaii.edu/NAMcalculator/ and http://edd.ifa.hawaii.edu/CF3calculator/ . Please be gentle with the server.

The first required step is to import the project:

[2]:
import pycf3

Example 1: Sending a request to the NAM D-V calculator

First, let’s create the NAM Client

[3]:
nam = pycf3.NAM()
nam
[3]:
NAM(calculator='NAM', cache_dir='/home/ehsan/pycf3_data/_cache_', cache_expire=None)

The basic functionality of the NAM client is provided by two methods:

  • calculate_distance(velocity, <coordinates>) to calculate a distance based on a given velocity.

  • calculate_velocity(distance, <coordinates>) to calculate a velocity based on a given distance.

The <coordinates> parameters can be expressed in ra, dec (equatorial), glon, glat (galactic) or sgl, sgb (supergalactic) coordinates. You need to provide both components of a coordinate. Please note that multiple coordinate systems cannot be mixed.
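The coordinate rule above (one system only, both components required) can be sketched as a small validation function. This is only an illustration: pick_coordinate_system is a hypothetical helper, not part of the pycf3 API, and it does not claim to reproduce how pycf3 checks its arguments internally.

```python
def pick_coordinate_system(ra=None, dec=None, glon=None, glat=None,
                           sgl=None, sgb=None):
    """Return (system_name, first_angle, second_angle) for one complete pair.

    Hypothetical helper for illustration; not part of the pycf3 API.
    """
    systems = {
        "equatorial": (ra, dec),
        "galactic": (glon, glat),
        "supergalactic": (sgl, sgb),
    }
    # Keep only the systems for which at least one component was given.
    given = {name: pair for name, pair in systems.items()
             if any(value is not None for value in pair)}

    # Zero systems means nothing was provided; two or more means mixing.
    if len(given) != 1:
        raise ValueError(
            "Provide exactly one coordinate system, got: "
            f"{sorted(given) or 'none'}")

    name, (first, second) = next(iter(given.items()))
    if first is None or second is None:
        raise ValueError(f"Both components of the {name} pair are required")
    return name, first, second


print(pick_coordinate_system(sgl=102, sgb=-2))  # ('supergalactic', 102, -2)
```

Calling the helper with only ra, or with ra together with glat, raises a ValueError, which mirrors the two restrictions stated above.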

Now, let’s assume we want to generate the same query as it appears in the following Figure

image2.png

Here is how to send the same request from Python:

[4]:
result = nam.calculate_distance(velocity=1000, sgl=102, sgb=-2)
result
[4]:
Result - NAM(velocity=1000, sgl=102, sgb=-2)
Observed Distance (Mpc) [ 8.08088613 18.78629089 22.09785028]
Velocity (Km/s) 1000.0
Show/Hide Raw

Note: You can click Show/Hide Raw to check the raw response from the server

The provided coordinates, expressed in all three coordinate systems, can be extracted as follows:

[5]:
result.calculated_at_
[5]:
CalculatedAt(ra=187.7891703346409, dec=13.333860121247609, glon=282.9654677357161, glat=75.4136002414933, sgl=102.0, sgb=-2.0)

and the corresponding distance/velocity values are available at:

[6]:
result.observed_velocity_
[6]:
1000.0
[7]:
result.observed_distance_
[7]:
array([ 8.08088613, 18.78629089, 22.09785028])

In addition, the entire raw response, as returned by the EDD server, is packaged in the json_ attribute in the form of a Python dictionary:

[8]:
result.json_
[8]:
{'message': 'Success',
 'RA': 187.7891703346409,
 'Dec': 13.333860121247609,
 'Glon': 282.9654677357161,
 'Glat': 75.4136002414933,
 'SGL': 102.0,
 'SGB': -2.0,
 'velocity': 1000.0,
 'distance': [8.08088612690689, 18.786290885088945, 22.097850275812398]}
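Because json_ is a plain dictionary, it can be processed with ordinary Python. The sketch below copies the response shown above into a literal and summarizes the distance solutions. The summarize function is a hypothetical helper, and treating a 'message' other than 'Success' as an error is an assumption based on this single example response.

```python
# Raw response copied from the output above.
raw = {
    "message": "Success",
    "RA": 187.7891703346409,
    "Dec": 13.333860121247609,
    "Glon": 282.9654677357161,
    "Glat": 75.4136002414933,
    "SGL": 102.0,
    "SGB": -2.0,
    "velocity": 1000.0,
    "distance": [8.08088612690689, 18.786290885088945, 22.097850275812398],
}


def summarize(response):
    """Return the number of distance solutions and their extremes (Mpc)."""
    # Assumption: any message other than "Success" signals a problem.
    if response.get("message") != "Success":
        raise RuntimeError(
            f"Unexpected server message: {response.get('message')!r}")
    distances = sorted(response["distance"])
    return {
        "n_solutions": len(distances),
        "nearest": distances[0],
        "farthest": distances[-1],
    }


summary = summarize(raw)
print(summary["n_solutions"])  # 3
```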

Example 2: How to extract the radial velocity of an object at a given distance in NAM

[9]:
nam.calculate_velocity(distance=30, sgl=102, sgb=-2)
[9]:
Result - NAM(distance=30, sgl=102, sgb=-2)
Observed Distance (Mpc) [30.]
Velocity (Km/s) 1790.9019256321444
Show/Hide Raw

Example 3: Sending a request to the Cosmicflows-3 D-V calculator (d < 200 Mpc)

We are trying to reproduce this query in Python

image3.png

As before, we first need to create a Cosmicflows-3 client

[10]:
cf3 = pycf3.CF3()
cf3
[10]:
CF3(calculator='CF3', cache_dir='/home/ehsan/pycf3_data/_cache_', cache_expire=None)

Let’s calculate the distance

[11]:
result = cf3.calculate_distance(velocity=9000, glon=283, glat=75)
result
[11]:
Result - CF3(velocity=9000, glon=283, glat=75)
Observed Distance (Mpc) [136.90134347]
Velocity (Km/s) 9000.0
Adjusted Distance (Mpc) [134.26214472]
Velocity (Km/s) 9000.0
Show/Hide Raw

Similar to NAM, coordinates can be accessed with result.calculated_at_ and velocity/distance with result.observed_velocity_ and result.observed_distance_ respectively.

In addition, CF3 provides adjusted versions of the distance and velocity, which are available in result.adjusted_distance_ and result.adjusted_velocity_ . For further details on the adjustments you can visit Here.

[12]:
result.adjusted_distance_
[12]:
array([134.26214472])
[13]:
result.adjusted_velocity_
[13]:
9000.0

Note: The NAM client also has the adjusted_ values but all are None.
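To get a feeling for the size of the adjustment in Example 3, the observed and adjusted distances can be compared directly. The numbers below are copied from the outputs above; the variable names are only illustrative.

```python
# Values copied from the Example 3 outputs above (in Mpc).
observed_distance = 136.90134347   # result.observed_distance_[0]
adjusted_distance = 134.26214472   # result.adjusted_distance_[0]

difference = observed_distance - adjusted_distance
relative = difference / observed_distance

print(f"difference: {difference:.2f} Mpc ({relative:.1%})")
# difference: 2.64 Mpc (1.9%)
```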

Example 4: How to obtain the radial velocity at a given distance with CF3

[14]:
cf3.calculate_velocity(distance=180, ra=187, dec=13)
[14]:
Result - CF3(distance=180, ra=187, dec=13)
Observed Distance (Mpc) [180.]
Velocity (Km/s) 12515.699706446017
Adjusted Distance (Mpc) [180.]
Velocity (Km/s) 12940.58481990226
Show/Hide Raw

PyCF3 Cache system

By default, every client instance is created with a cache that prevents the same request from being sent twice. For example, let’s make the same query twice and measure the calculation time.

Note: We are using the pycf3.CF3 client as an example, but any other pycf3 calculator offers the same functionality.

[15]:
cf3 = pycf3.CF3()
[17]:
%%time
cf3.calculate_velocity(distance=180, ra=187, dec=13)
CPU times: user 6.53 ms, sys: 1.02 ms, total: 7.56 ms
Wall time: 11.6 s
[17]:
Result - CF3(distance=180, ra=187, dec=13)
Observed Distance (Mpc) [180.]
Velocity (Km/s) 12515.699706446017
Adjusted Distance (Mpc) [180.]
Velocity (Km/s) 12940.58481990226
Show/Hide Raw
[18]:
%%time
cf3.calculate_velocity(distance=180, ra=187, dec=13)
CPU times: user 3.67 ms, sys: 0 ns, total: 3.67 ms
Wall time: 2.86 ms
[18]:
Result - CF3(distance=180, ra=187, dec=13)
Observed Distance (Mpc) [180.]
Velocity (Km/s) 12515.699706446017
Adjusted Distance (Mpc) [180.]
Velocity (Km/s) 12940.58481990226
Show/Hide Raw

Evidently, the first execution takes about 10 seconds, while the second takes on the order of milliseconds. This is achieved by storing the results on the local hard drive, which avoids sending the same request again (by default, a folder called pycf3_data is created in the user's home directory).

As expected, if the query is altered by asking for another declination, the process is going to be slow again:

[19]:
%%time
cf3.calculate_velocity(distance=180, ra=187, dec=13.5)
CPU times: user 3.98 ms, sys: 4.03 ms, total: 8.01 ms
Wall time: 12.2 s
[19]:
Result - CF3(distance=180, ra=187, dec=13.5)
Observed Distance (Mpc) [180.]
Velocity (Km/s) 12512.725297090401
Adjusted Distance (Mpc) [180.]
Velocity (Km/s) 12937.406568949371
Show/Hide Raw

and the execution of the same query is fast again

[20]:
%%time
cf3.calculate_velocity(distance=180, ra=187, dec=13.5)
CPU times: user 1.68 ms, sys: 357 µs, total: 2.04 ms
Wall time: 1.37 ms
[20]:
Result - CF3(distance=180, ra=187, dec=13.5)
Observed Distance (Mpc) [180.]
Velocity (Km/s) 12512.725297090401
Adjusted Distance (Mpc) [180.]
Velocity (Km/s) 12937.406568949371
Show/Hide Raw

In addition, it can be useful to reuse only those results that are no older than a given expiration time. To do so, set how many seconds your local data will remain valid by adding the parameter cache_expire when you create the CF3 client.

[21]:
cf3 = pycf3.CF3(cache_expire=2)

At this point the new cf3 instance shares the same default cache as the previous one. If we now execute any of the previous requests, the process will be fast: as seen below, it takes only a few milliseconds.

[22]:
%%time
cf3.calculate_velocity(distance=180, ra=187, dec=13)
CPU times: user 2.19 ms, sys: 0 ns, total: 2.19 ms
Wall time: 1.65 ms
[22]:
Result - CF3(distance=180, ra=187, dec=13)
Observed Distance (Mpc) [180.]
Velocity (Km/s) 12515.699706446017
Adjusted Distance (Mpc) [180.]
Velocity (Km/s) 12940.58481990226
Show/Hide Raw

You can remove all the cached data by calling the command

[23]:
cf3.cache.clear()
[23]:
2

Now we can send the same original request, and the result will be available for 2 seconds before it expires.

[24]:
%%time
cf3.calculate_velocity(distance=180, ra=187, dec=13.5)

import time
time.sleep(3)  # sleep for 3 seconds so that the cached result expires
CPU times: user 6.2 ms, sys: 3.32 ms, total: 9.52 ms
Wall time: 14.5 s

Because we waited too long, the next query will be slow again

[25]:
%%time
cf3.calculate_velocity(distance=180, ra=187, dec=13.5)
CPU times: user 6.5 ms, sys: 5.01 ms, total: 11.5 ms
Wall time: 11.7 s
[25]:
Result - CF3(distance=180, ra=187, dec=13.5)
Observed Distance (Mpc) [180.]
Velocity (Km/s) 12512.725297090401
Adjusted Distance (Mpc) [180.]
Velocity (Km/s) 12937.406568949371
Show/Hide Raw

However, if we repeat the query before the cached result expires, the process is quick.

[26]:
%%time
cf3.calculate_velocity(distance=180, ra=187, dec=13.5)
CPU times: user 0 ns, sys: 2.3 ms, total: 2.3 ms
Wall time: 1.53 ms
[26]:
Result - CF3(distance=180, ra=187, dec=13.5)
Observed Distance (Mpc) [180.]
Velocity (Km/s) 12512.725297090401
Adjusted Distance (Mpc) [180.]
Velocity (Km/s) 12937.406568949371
Show/Hide Raw
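The expiration behaviour demonstrated above can be sketched as a tiny cache whose entries carry a timestamp. This is a minimal illustration, not pycf3's implementation (pycf3 delegates caching to DiskCache); the ExpiringCache class and its injectable clock are hypothetical and exist only so that the example runs without sleeping.

```python
import time


class ExpiringCache:
    """Tiny in-memory cache whose entries expire after `expire` seconds."""

    def __init__(self, expire=None, clock=time.monotonic):
        self.expire = expire   # None means entries never expire
        self.clock = clock     # injectable so the example can fake time
        self._store = {}

    def set(self, key, value):
        # Remember the value together with the time it was stored.
        self._store[key] = (value, self.clock())

    def get(self, key, default=None):
        if key not in self._store:
            return default
        value, stored_at = self._store[key]
        if self.expire is not None and self.clock() - stored_at > self.expire:
            del self._store[key]   # too old: drop it and report a miss
            return default
        return value


# Fake clock so the demonstration does not need to sleep.
now = [0.0]
cache = ExpiringCache(expire=2, clock=lambda: now[0])

cache.set("query", "result")
now[0] = 1.0
fresh = cache.get("query")   # 1 s old: still valid
now[0] = 3.0
stale = cache.get("query")   # 3 s old: expired
print(fresh, stale)  # result None
```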

Changing the cache backend

The entire cache backend of pycf3 was created with DiskCache (http://www.grantjenks.com/docs/diskcache/)

You can change your cache location (to store different datasets for example) by providing another diskcache.Cache or diskcache.FanoutCache instance.

import diskcache as dcache

cache = dcache.FanoutCache(
    directory="my/cache/directory")

# let's make our data valid for 24 hours
cf3 = pycf3.CF3(cache=cache, cache_expire=86400)

Finally, to deactivate the cache system completely, set cache to a NoCache instance.

[27]:
cf3 = pycf3.CF3(cache=pycf3.NoCache())
cf3
[27]:
CF3(calculator='CF3', cache_dir='', cache_expire=None)

PyCF3 Retry

By default, every calculator instance tries up to 3 times to perform a request. If you want to customize the number of attempts, you need to change the default session of the instance.

Note: We are using the pycf3.CF3 client as an example, but any other pycf3 calculator offers the same functionality.

For example if you want to try 2 times:

[28]:
session = pycf3.RetrySession(retries=2)
cf3 = pycf3.CF3(session=session)
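As a rough picture of what trying a request several times means, here is a generic retry loop in plain Python. It is only a sketch: how pycf3.RetrySession works internally is not shown in this tutorial, and call_with_retries and flaky_request are hypothetical names. The sketch treats retries as the total number of attempts, following the wording above.

```python
import time


def call_with_retries(func, retries=3, wait=0.0):
    """Call `func` up to `retries` times; re-raise the last error if all fail."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == retries:
                raise          # no attempts left: propagate the error
            time.sleep(wait)   # optional pause before the next attempt


attempts = []


def flaky_request():
    """Stand-in for a request that fails once and then succeeds."""
    attempts.append(1)
    if len(attempts) < 2:
        raise ConnectionError("temporary failure")
    return "ok"


print(call_with_retries(flaky_request, retries=2), len(attempts))  # ok 2
```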

You can also limit how long to wait for the server to respond by adding timeout=<SECONDS> to any request. If the server does not answer within that time on any of the attempts, an exception is raised, as the following traceback shows.

[30]:
# no more than 5 seconds between every request
cf3.calculate_velocity(distance=180, ra=187, dec=13.5, timeout=5)
---------------------------------------------------------------------------
timeout                                   Traceback (most recent call last)
~/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    425                     # Otherwise it looks like a bug in the code.
--> 426                     six.raise_from(e, None)
    427         except (SocketTimeout, BaseSSLError, SocketError) as e:

~/anaconda3/lib/python3.8/site-packages/urllib3/packages/six.py in raise_from(value, from_value)

~/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    420                 try:
--> 421                     httplib_response = conn.getresponse()
    422                 except BaseException as e:

~/anaconda3/lib/python3.8/http/client.py in getresponse(self)
   1346             try:
-> 1347                 response.begin()
   1348             except ConnectionError:

~/anaconda3/lib/python3.8/http/client.py in begin(self)
    306         while True:
--> 307             version, status, reason = self._read_status()
    308             if status != CONTINUE:

~/anaconda3/lib/python3.8/http/client.py in _read_status(self)
    267     def _read_status(self):
--> 268         line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
    269         if len(line) > _MAXLINE:

~/anaconda3/lib/python3.8/socket.py in readinto(self, b)
    668             try:
--> 669                 return self._sock.recv_into(b)
    670             except timeout:

timeout: timed out

During handling of the above exception, another exception occurred:

ReadTimeoutError                          Traceback (most recent call last)
~/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    669             # Make the request on the httplib connection object.
--> 670             httplib_response = self._make_request(
    671                 conn,

~/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    427         except (SocketTimeout, BaseSSLError, SocketError) as e:
--> 428             self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
    429             raise

~/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py in _raise_timeout(self, err, url, timeout_value)
    334         if isinstance(err, SocketTimeout):
--> 335             raise ReadTimeoutError(
    336                 self, url, "Read timed out. (read timeout=%s)" % timeout_value

ReadTimeoutError: HTTPConnectionPool(host='edd.ifa.hawaii.edu', port=80): Read timed out. (read timeout=5)

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
~/anaconda3/lib/python3.8/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    438             if not chunked:
--> 439                 resp = conn.urlopen(
    440                     method=request.method,

~/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    753             )
--> 754             return self.urlopen(
    755                 method,

~/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    753             )
--> 754             return self.urlopen(
    755                 method,

~/anaconda3/lib/python3.8/site-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    725
--> 726             retries = retries.increment(
    727                 method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]

~/anaconda3/lib/python3.8/site-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    445         if new_retry.is_exhausted():
--> 446             raise MaxRetryError(_pool, url, error or ResponseError(cause))
    447

MaxRetryError: HTTPConnectionPool(host='edd.ifa.hawaii.edu', port=80): Max retries exceeded with url: /CF3calculator/api.php (Caused by ReadTimeoutError("HTTPConnectionPool(host='edd.ifa.hawaii.edu', port=80): Read timed out. (read timeout=5)"))

During handling of the above exception, another exception occurred:

ConnectionError                           Traceback (most recent call last)
<ipython-input-30-a5a25eb08020> in <module>
      1 # no more than 5 seconds between every request
----> 2 cf3.calculate_velocity(distance=180, ra=187, dec=13.5, timeout=5)

/media/Data/Home/PanStarrs/Jan/HI/augment/test_TFRcatal/pycf3/pycf3.py in calculate_velocity(self, distance, ra, dec, glon, glat, sgl, sgb, **get_kwargs)
    902             ra=ra, dec=dec, glon=glon, glat=glat, sgl=sgl, sgb=sgb
    903         )
--> 904         response = self._search(
    905             coordinate_system=coordinate_system,
    906             alpha=alpha,

/media/Data/Home/PanStarrs/Jan/HI/augment/test_TFRcatal/pycf3/pycf3.py in _search(self, coordinate_system, alpha, delta, distance, velocity, **get_kwargs)
    727             response = cache.get(key, default=dcache.core.ENOVAL, retry=True)
    728             if response == dcache.core.ENOVAL:
--> 729                 response = self.session.get(
    730                     self.URL, json=payload, **get_kwargs
    731                 )

~/anaconda3/lib/python3.8/site-packages/requests/sessions.py in get(self, url, **kwargs)
    541
    542         kwargs.setdefault('allow_redirects', True)
--> 543         return self.request('GET', url, **kwargs)
    544
    545     def options(self, url, **kwargs):

~/anaconda3/lib/python3.8/site-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    528         }
    529         send_kwargs.update(settings)
--> 530         resp = self.send(prep, **send_kwargs)
    531
    532         return resp

~/anaconda3/lib/python3.8/site-packages/requests/sessions.py in send(self, request, **kwargs)
    641
    642         # Send the request
--> 643         r = adapter.send(request, **kwargs)
    644
    645         # Total elapsed time of the request (approximately)

~/anaconda3/lib/python3.8/site-packages/requests/adapters.py in send(self, request, stream, timeout, verify, cert, proxies)
    514                 raise SSLError(e, request=request)
    515
--> 516             raise ConnectionError(e, request=request)
    517
    518         except ClosedPoolError as e:

ConnectionError: HTTPConnectionPool(host='edd.ifa.hawaii.edu', port=80): Max retries exceeded with url: /CF3calculator/api.php (Caused by ReadTimeoutError("HTTPConnectionPool(host='edd.ifa.hawaii.edu', port=80): Read timed out. (read timeout=5)"))