Commit 5ca9d6a

Merge pull request #1 from AlSaeed/tests-conflict-resolution
Tests conflict resolution
2 parents 4e56dcb + 32b2603 commit 5ca9d6a

20 files changed, with 567 additions and 272 deletions

Diff for: README.md

+3-3
Original file line number | Diff line number | Diff line change
@@ -5,15 +5,15 @@
55
# About
66

77
This is the home of [Delphi](https://delphi.cmu.edu/)'s epidemiological data
8-
API. See our [API documentation](docs/api/README.md) for details on the
9-
available data sets, APIs, and clients.
8+
API. See our [API documentation](https://cmu-delphi.github.io/delphi-epidata/)
9+
for details on the available data sets, APIs, and clients.
1010

1111
# COVID-19 Notice
1212

1313
We are working on collecting several new data sources that may be useful for
1414
nowcasting and forecasting ILI during the COVID-19 pandemic. We will
1515
make each of these available as soon as it is ready, through our [covidcast
16-
endpoint](docs/api/covidcast.md).
16+
endpoint](https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html).
1717

1818
For a list of many other data sources relevant to COVID-19, publicly available
1919
through external sites, we have compiled a simple

Diff for: docs/api/covidcast.md

+28-15
@@ -22,13 +22,15 @@ terms; we encourage academic users to [cite](README.md#citing) the data if they
2222
use it in any publications. Further documentation on Delphi's APIs is available
2323
in the [API overview](README.md).
2424

25-
**For users:** Delphi operates a [mailing
26-
list](https://lists.andrew.cmu.edu/mailman/listinfo/delphi-covidcast-api) for
27-
users of the COVIDcast API. We will use the list to announce API changes,
28-
corrections to data, and new features; API users may also use the mailing list
29-
to ask general questions about its use. If you use the API, we strongly
30-
encourage you to
31-
[subscribe](https://lists.andrew.cmu.edu/mailman/listinfo/delphi-covidcast-api).
25+
<div style="background-color:#FCC; padding: 10px 30px;"><strong>For
26+
users:</strong> Delphi operates a <a
27+
href="https://lists.andrew.cmu.edu/mailman/listinfo/delphi-covidcast-api">mailing
28+
list</a> for users of the COVIDcast API. We will use the list to announce API
29+
changes, corrections to data, and new features; API users may also use the
30+
mailing list to ask general questions about its use. If you use the API, we
31+
strongly encourage you to <a
32+
href="https://lists.andrew.cmu.edu/mailman/listinfo/delphi-covidcast-api">subscribe</a>.
33+
</div>
3234

3335
## Accessing the API
3436

@@ -38,7 +40,7 @@ the appropriate client for your programming language, accessing data is as easy
3840
as (in [R](https://www.r-project.org/)):
3941

4042
```r
41-
library(covidcastR)
43+
library(covidcast)
4244

4345
data <- covidcast_signal("fb-survey", "smoothed_cli", start_day = "20200501",
4446
end_day = "20200507")
@@ -82,6 +84,8 @@ and lists.
8284

8385
### Parameters
8486

87+
#### Required
88+
8589
| Parameter | Description | Type |
8690
| --- | --- | --- |
8791
| `data_source` | name of upstream data source (e.g., `doctor-visits` or `fb-survey`; [see full list](covidcast_signals.md)) | string |
@@ -96,9 +100,16 @@ The current set of signals available for each data source is returned by the
96100

97101
#### Optional
98102

99-
The default API behavior is to return the most recently issued value for each `time_value` selected.
103+
Estimates for a specific `time_value` and `geo_value` are sometimes updated
104+
after they are first published. Many of our data sources issue corrections or
105+
backfill estimates as data arrives; see the [documentation for each
106+
source](covidcast_signals.md) for details.
107+
108+
The default API behavior is to return the most recently issued value for each
109+
`time_value` selected.
100110

101-
We also provide access to previous versions of data using the optional parameters below.
111+
We also provide access to previous versions of data using the optional query
112+
parameters below.
102113

103114
| Parameter | Description | Type |
104115
| --- | --- | --- |
@@ -109,7 +120,7 @@ We also provide access to previous versions of data using the optional parameter
109120
Use cases:
110121

111122
* To pretend like you queried the API on June 1, such that the returned results
112-
do not include any updates which became available after June 1, use
123+
do not include any updates that became available after June 1, use
113124
`as_of=20200601`.
114125
* To retrieve only data that was published or updated on June 1, and exclude
115126
records whose most recent update occurred earlier than June 1, use
@@ -121,10 +132,12 @@ Use cases:
121132
* To retrieve only data that was published or updated exactly 3 days after the
122133
underlying events occurred, use `lag=3`.
123134

124-
NB: Each issue in the versioning system contains only the records that were
125-
added or updated during that time unit; we exclude records whose values remain
126-
the same as a previous issue. If you have a research problem that would require
127-
knowing when an unchanged value was last confirmed, please get in touch.
135+
You should specify only one of these three parameters in any given query.
136+
137+
**Note:** Each issue in the versioning system contains only the records that
138+
were added or updated during that time unit; we exclude records whose values
139+
remain the same as a previous issue. If you have a research problem that would
140+
require knowing when an unchanged value was last confirmed, please get in touch.
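
The rule that only one of the three versioning parameters should appear in a query can be checked on the client side before a request is sent. The following is a minimal sketch, not part of any Delphi client: `versioning_params` is a hypothetical helper, and the parameter name `issues` is taken from the public COVIDcast API documentation rather than from the lines shown in this diff.

```python
from typing import Dict, Optional


def versioning_params(as_of: Optional[int] = None,
                      issues: Optional[str] = None,
                      lag: Optional[int] = None) -> Dict[str, str]:
    """Return the optional versioning query parameters as strings.

    Hypothetical helper for illustration only. It enforces the rule that
    at most one of `as_of`, `issues`, and `lag` is given per query.
    """
    given = {
        name: value
        for name, value in (("as_of", as_of), ("issues", issues), ("lag", lag))
        if value is not None
    }
    if len(given) > 1:
        raise ValueError(
            "specify only one of as_of, issues, lag; got "
            + ", ".join(sorted(given))
        )
    return {name: str(value) for name, value in given.items()}


# Ask for the data as it would have looked on June 1.
print(versioning_params(as_of=20200601))
```

With no arguments the helper returns an empty dict, which corresponds to the default behavior of returning the most recently issued value.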
128141

129142
### Response
130143

Diff for: docs/api/covidcast_clients.md

+3-4
@@ -8,9 +8,8 @@ nav_order: 1
88

99
Dedicated COVIDcast clients are available for several languages:
1010

11-
* [covidcast-py](https://cmu-delphi.github.io/covidcast/covidcast-py/html/) for
12-
Python users
13-
* [covidcastR](https://cmu-delphi.github.io/covidcast/covidcastR/) for R users
11+
* R: [covidcast](https://cmu-delphi.github.io/covidcast/covidcastR/)
12+
* Python: [covidcast](https://cmu-delphi.github.io/covidcast/covidcast-py/html/)
1413

1514
These packages provide a convenient way to obtain COVIDcast data as a data frame
1615
ready to be used in further analyses. For installation instructions and
@@ -81,7 +80,7 @@ print(res['result'], res['message'], len(res['epidata']))
8180

8281
### R
8382

84-
**Note:** For COVIDcast usage, R users should prefer the [covidcastR
83+
**Note:** For COVIDcast usage, R users should prefer the [covidcast
8584
package](https://cmu-delphi.github.io/covidcast/covidcastR/); these instructions
8685
are for advanced users who want access to the entire Epidata API, including data
8786
on influenza, dengue, and norovirus.

Diff for: docs/api/covidcast_meta.md

+5-2
@@ -33,13 +33,16 @@ See [this documentation](README.md) for details on specifying epiweeks, dates, a
3333
| `epidata[].signal` | signal name | string |
3434
| `epidata[].time_type` | temporal resolution of the signal (e.g., `day`, `week`) | string |
3535
| `epidata[].geo_type` | geographic resolution (e.g. `county`, `hrr`, `msa`, `dma`, `state`) | string |
36-
| `epidata[].min_time` | minimum time (e.g., 20200406) | integer |
37-
| `epidata[].max_time` | maximum time (e.g., 20200413) | integer |
36+
| `epidata[].min_time` | minimum observation time (e.g., 20200406) | integer |
37+
| `epidata[].max_time` | maximum observation time (e.g., 20200413) | integer |
3838
| `epidata[].num_locations` | number of distinct geographic locations with data | integer |
3939
| `epidata[].min_value` | minimum value | float |
4040
| `epidata[].max_value` | maximum value | float |
4141
| `epidata[].mean_value` | mean of value | float |
4242
| `epidata[].stdev_value` | standard deviation of value | float |
43+
| `epidata[].max_issue` | most recent date data was issued (e.g., 20200710) | integer |
44+
| `epidata[].min_lag` | smallest lag from observation to issue, in `time_type` units | integer |
45+
| `epidata[].max_lag` | largest lag from observation to issue, in `time_type` units | integer |
4346
| `message` | `success` or error message | string |
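
To make the new `max_issue`, `min_lag`, and `max_lag` fields concrete, here is a small sketch of how such summaries could be derived from `(time_value, issue)` pairs for a signal whose `time_type` is `day`. The function names are hypothetical and the server may compute these values differently; the sketch only illustrates what the fields mean.

```python
from datetime import date
from typing import Dict, Iterable, Tuple


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20200406 into a date."""
    return date(value // 10000, (value // 100) % 100, value % 100)


def lag_in_days(time_value: int, issue: int) -> int:
    """Number of days between the observation and its issue."""
    return (parse_yyyymmdd(issue) - parse_yyyymmdd(time_value)).days


def summarize_issues(rows: Iterable[Tuple[int, int]]) -> Dict[str, int]:
    """Summarize (time_value, issue) pairs; assumes daily data."""
    rows = list(rows)
    lags = [lag_in_days(time_value, issue) for time_value, issue in rows]
    return {
        "max_issue": max(issue for _, issue in rows),
        "min_lag": min(lags),
        "max_lag": max(lags),
    }


summary = summarize_issues([
    (20200228, 20200228),
    (20200228, 20200229),
    (20200229, 20200301),
    (20200301, 20200303),
])
print(summary)
```

For these four pairs the lags are 0, 1, 1, and 2 days, so the summary reports `min_lag` 0, `max_lag` 2, and `max_issue` 20200303.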
4447

4548
## Example URLs

Diff for: docs/symptom-survey/index.md

+2-3
@@ -13,9 +13,8 @@ economic and health impacts they have experienced as a result of the pandemic. A
1313
high-level overview of the survey is posted [on the COVIDcast
1414
website](https://covidcast.cmu.edu/survey.html).
1515

16-
Aggregate data from this survey is available through the [COVIDcast
17-
API](../api/covidcast.md) as the [`fb-survey` data
18-
source](../api/covidcast-signals/fb-survey.md).
16+
Aggregate data from this survey is available through the [COVIDcast API](../api/covidcast.md)
17+
as the [`fb-survey` data source](../api/covidcast-signals/fb-survey.md).
1918

2019
This documentation is for users who have a signed Data Use Agreement to receive
2120
individual response data from the survey. It describes the survey items, data

Diff for: integrations/acquisition/covidcast/test_covidcast_meta_caching.py

+3-3
@@ -66,14 +66,14 @@ def test_caching(self):
6666
self.cur.execute('''
6767
insert into covidcast values
6868
(0, 'src', 'sig', 'day', 'state', 20200422, 'pa',
69-
123, 1, 2, 3, 456, 1, 20200422, 0, False),
69+
123, 1, 2, 3, 456, 1, 20200422, 0, 1, False),
7070
(0, 'src', 'sig', 'day', 'state', 20200422, 'wa',
71-
789, 1, 2, 3, 456, 1, 20200423, 1, False)
71+
789, 1, 2, 3, 456, 1, 20200423, 1, 1, False)
7272
''')
7373
self.cur.execute('''
7474
insert into covidcast values
7575
(100, 'src', 'wip_sig', 'day', 'state', 20200422, 'pa',
76-
456, 4, 5, 6, 789, -1, 20200422, 0, True)
76+
456, 4, 5, 6, 789, -1, 20200422, 0, 1, True)
7777
''')
7878

7979
self.cnx.commit()

Diff for: integrations/acquisition/covidcast/test_direction_updating.py

+61-13
@@ -51,6 +51,8 @@ def test_uploading(self):
5151
"""Update rows having a stale `direction` field and serve the results."""
5252

5353
# insert some sample data
54+
# src, sig1, 1111:
55+
# direction should be updated to None as there is no historical data for (src, sig1, state).
5456
# CA 20200301:
5557
# timeline should be x=[-2, -1, 0], y=[2, 6, 5] with direction=1
5658
# FL 20200517:
@@ -60,31 +62,77 @@ def test_uploading(self):
6062
# wrong) is fresh
6163
self.cur.execute('''
6264
insert into covidcast values
65+
(0, 'src', 'sig1', 'day', 'state', 20201028, '1111',
66+
123, 2, 0, 0, 0, -1, 20201028, 0, 1, False),
67+
(0, 'src', 'sig1', 'day', 'state', 20201029, '1111',
68+
123, 6, 0, 0, 0, 0, 20201029, 0, 1, False),
69+
(0, 'src', 'sig1', 'day', 'state', 20201030, '1111',
70+
123, 5, 0, 0, 0, 1, 20201030, 0, 1, False),
6371
(0, 'src', 'sig', 'day', 'state', 20200228, 'ca',
64-
123, 2, 0, 0, 0, NULL, 20200228, 0, False),
72+
123, 2, 0, 0, 0, NULL, 20200228, 0, 1, False),
6573
(0, 'src', 'sig', 'day', 'state', 20200229, 'ca',
66-
123, 6, 0, 0, 0, NULL, 20200229, 0, False),
74+
123, 6, 0, 0, 0, NULL, 20200229, 0, 1, False),
6775
(0, 'src', 'sig', 'day', 'state', 20200301, 'ca',
68-
123, 5, 0, 0, 0, NULL, 20200301, 0, False),
76+
123, 5, 0, 0, 0, NULL, 20200301, 0, 1, False),
6977
(0, 'src', 'sig', 'day', 'state', 20200511, 'fl',
70-
123, 1, 0, 0, 0, NULL, 20200511, 0, False),
78+
123, 1, 0, 0, 0, NULL, 20200511, 0, 1, False),
7179
(0, 'src', 'sig', 'day', 'state', 20200512, 'fl',
72-
123, 2, 0, 0, 0, NULL, 20200512, 0, False),
80+
123, 2, 0, 0, 0, NULL, 20200512, 0, 1, False),
7381
(0, 'src', 'sig', 'day', 'state', 20200517, 'fl',
74-
123, 2, 0, 0, 0, NULL, 20200517, 0, False),
82+
123, 2, 0, 0, 0, NULL, 20200517, 0, 1, False),
7583
(0, 'src', 'sig', 'day', 'state', 20200615, 'tx',
76-
123, 9, 0, 0, 456, NULL, 20200615, 0, False),
84+
123, 9, 0, 0, 456, NULL, 20200615, 0, 1, False),
7785
(0, 'src', 'sig', 'day', 'state', 20200616, 'tx',
78-
123, 5, 0, 0, 456, NULL, 20200616, 0, False),
86+
123, 5, 0, 0, 456, NULL, 20200616, 0, 1, False),
7987
(0, 'src', 'sig', 'day', 'state', 20200617, 'tx',
80-
123, 1, 0, 0, 456, 1, 20200617, 0, False)
88+
123, 1, 0, 0, 456, 1, 20200617, 0, 1, False)
8189
''')
8290
self.cnx.commit()
8391

8492
# update direction (only 20200417 has enough history)
85-
args = None
93+
args = get_argument_parser().parse_args('')
8694
main(args)
8795

96+
# verify the quick fix: `direction` is reset to None for every (src, sig1) row
97+
response = Epidata.covidcast(
98+
'src', 'sig1', 'day', 'state', '20200101-20201231', '*')
99+
100+
self.assertEqual(response, {
101+
'result': 1,
102+
'epidata': [{
103+
'time_value': 20201028,
104+
'geo_value': '1111',
105+
'value': 2,
106+
'stderr': 0,
107+
'sample_size': 0,
108+
'direction': None,
109+
'issue': 20201028,
110+
'lag': 0
111+
},
112+
{
113+
'time_value': 20201029,
114+
'geo_value': '1111',
115+
'value': 6,
116+
'stderr': 0,
117+
'sample_size': 0,
118+
'direction': None,
119+
'issue': 20201029,
120+
'lag': 0
121+
},
122+
{
123+
'time_value': 20201030,
124+
'geo_value': '1111',
125+
'value': 5,
126+
'stderr': 0,
127+
'sample_size': 0,
128+
'direction': None,
129+
'issue': 20201030,
130+
'lag': 0
131+
},
132+
],
133+
'message': 'success',
134+
})
135+
88136
# request data from the API
89137
response = Epidata.covidcast(
90138
'src', 'sig', 'day', 'state', '20200101-20201231', '*')
@@ -190,9 +238,9 @@ def test_uploading(self):
190238
# verify secondary timestamps were updated
191239
self.cur.execute('select direction_updated_timestamp from covidcast order by id asc')
192240
timestamps = [t for (t,) in self.cur]
193-
for t in timestamps[:6]:
194-
# first 6 rows had `direction` updated
241+
for t in timestamps[:9]:
242+
# first 9 rows had `direction` updated
195243
self.assertGreater(t, 0)
196-
for t in timestamps[6:]:
244+
for t in timestamps[9:]:
197245
# last 3 rows were not updated
198246
self.assertEqual(t, 456)
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,94 @@
1+
"""Integration tests for covidcast's is_latest_issue filling."""
2+
3+
# standard library
4+
import unittest
5+
6+
# third party
7+
import mysql.connector
8+
9+
# first party
10+
from delphi.epidata.client.delphi_epidata import Epidata
11+
import delphi.operations.secrets as secrets
12+
13+
# py3tester coverage target (equivalent to `import *`)
14+
__test_target__ = 'delphi.epidata.acquisition.covidcast.fill_is_latest_issue'
15+
16+
17+
class FillIsLatestIssueTests(unittest.TestCase):
18+
"""Tests filling is_latest_issue column"""
19+
20+
def setUp(self):
21+
"""Perform per-test setup."""
22+
23+
# connect to the `epidata` database and clear the `covidcast` table
24+
cnx = mysql.connector.connect(
25+
user='user',
26+
password='pass',
27+
host='delphi_database_epidata',
28+
database='epidata')
29+
cur = cnx.cursor()
30+
cur.execute('truncate table covidcast')
31+
cnx.commit()
32+
cur.close()
33+
34+
# make connection and cursor available to test cases
35+
self.cnx = cnx
36+
self.cur = cnx.cursor()
37+
38+
# use the local instance of the epidata database
39+
secrets.db.host = 'delphi_database_epidata'
40+
secrets.db.epi = ('user', 'pass')
41+
42+
# use the local instance of the Epidata API
43+
Epidata.BASE_URL = 'http://delphi_web_epidata/epidata/api.php'
44+
45+
def tearDown(self):
46+
"""Perform per-test teardown."""
47+
self.cur.close()
48+
self.cnx.close()
49+
50+
def test_fill_is_latest_issue(self):
51+
"""Fill `is_latest_issue` so that only the newest issue of each observation is flagged."""
52+
53+
self.cur.execute('''
54+
insert into covidcast values
55+
(0, 'src', 'sig', 'day', 'state', 20200228, 'ca',
56+
123, 2, 5, 5, 5, NULL, 20200228, 0, 1, False),
57+
(0, 'src', 'sig', 'day', 'state', 20200228, 'ca',
58+
123, 2, 0, 0, 0, NULL, 20200229, 1, 1, False),
59+
(0, 'src', 'sig', 'day', 'state', 20200229, 'ca',
60+
123, 6, 0, 0, 0, NULL, 20200301, 1, 1, False),
61+
(0, 'src', 'sig', 'day', 'state', 20200229, 'ca',
62+
123, 6, 9, 9, 9, NULL, 20200229, 0, 1, False),
63+
(0, 'src', 'sig', 'day', 'state', 20200301, 'ca',
64+
123, 5, 0, 0, 0, NULL, 20200303, 2, 1, False),
65+
(0, 'src', 'sig', 'day', 'state', 20200301, 'ca',
66+
123, 5, 5, 5, 5, NULL, 20200302, 1, 1, False),
67+
(0, 'src', 'sig', 'day', 'state', 20200301, 'ca',
68+
123, 5, 9, 8, 7, NULL, 20200301, 0, 1, False)
69+
''')
70+
self.cnx.commit()
71+
72+
# fill is_latest_issue
73+
main()
74+
75+
self.cur.execute('''select * from covidcast''')
76+
result = list(self.cur)
77+
expected = [
78+
(1, 'src', 'sig', 'day', 'state', 20200228, 'ca',
79+
123, 2, 5, 5, 5, None, 20200228, 0, bytearray(b'0'), bytearray(b'0')),
80+
(2, 'src', 'sig', 'day', 'state', 20200228, 'ca',
81+
123, 2, 0, 0, 0, None, 20200229, 1, bytearray(b'1'), bytearray(b'0')),
82+
(3, 'src', 'sig', 'day', 'state', 20200229, 'ca',
83+
123, 6, 0, 0, 0, None, 20200301, 1, bytearray(b'1'), bytearray(b'0')),
84+
(4, 'src', 'sig', 'day', 'state', 20200229, 'ca',
85+
123, 6, 9, 9, 9, None, 20200229, 0, bytearray(b'0'), bytearray(b'0')),
86+
(5, 'src', 'sig', 'day', 'state', 20200301, 'ca',
87+
123, 5, 0, 0, 0, None, 20200303, 2, bytearray(b'1'), bytearray(b'0')),
88+
(6, 'src', 'sig', 'day', 'state', 20200301, 'ca',
89+
123, 5, 5, 5, 5, None, 20200302, 1, bytearray(b'0'), bytearray(b'0')),
90+
(7, 'src', 'sig', 'day', 'state', 20200301, 'ca',
91+
123, 5, 9, 8, 7, None, 20200301, 0, bytearray(b'0'), bytearray(b'0'))
92+
]
93+
94+
self.assertEqual(result, expected)
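
The expected rows above can be reproduced with a short in-memory sketch of the flagging rule. It assumes that "latest" means the largest `issue` among rows sharing the same source, signal, time type, geo type, time value, and geo value; `flag_latest_issues` is a hypothetical name, and the real script works against the MySQL table rather than Python tuples.

```python
from typing import Dict, List, Tuple

# (source, signal, time_type, geo_type, time_value, geo_value, issue)
Row = Tuple[str, str, str, str, int, str, int]


def flag_latest_issues(rows: List[Row]) -> List[int]:
    """Return 1 for rows holding the newest issue of their observation, else 0."""
    newest: Dict[tuple, int] = {}
    for row in rows:
        key, issue = row[:6], row[6]
        # Track the largest issue seen so far for this observation.
        if issue > newest.get(key, -1):
            newest[key] = issue
    return [1 if row[6] == newest[row[:6]] else 0 for row in rows]


# Same observations and issues as the rows inserted by the test, in order.
rows = [
    ("src", "sig", "day", "state", 20200228, "ca", 20200228),
    ("src", "sig", "day", "state", 20200228, "ca", 20200229),
    ("src", "sig", "day", "state", 20200229, "ca", 20200301),
    ("src", "sig", "day", "state", 20200229, "ca", 20200229),
    ("src", "sig", "day", "state", 20200301, "ca", 20200303),
    ("src", "sig", "day", "state", 20200301, "ca", 20200302),
    ("src", "sig", "day", "state", 20200301, "ca", 20200301),
]
flags = flag_latest_issues(rows)
print(flags)  # [0, 1, 1, 0, 1, 0, 0]
```

The printed flags match the `is_latest_issue` column of the expected rows, where `bytearray(b'1')` marks the newest issue of each observation.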
