Update Google Docs Meta Data #1546
It turns out that there are still "extended ASCII" characters in here (they are actually Unicode characters). They are findable by running:

    from collections import defaultdict
    highchars = defaultdict(int)
    with open('db_signals.csv') as f:
        for line in f:
            for char in line:
                val = ord(char)
                if val >= 127:
                    highchars[val] += 1

The current output:

    >>> highchars
    defaultdict(<class 'int'>, {8220: 9, 8217: 30, 8221: 9})
    >>> chr(8220)
    '“'
    >>> chr(8221)
    '”'
    >>> chr(8217)
    '’'

I am not going to simply replace them in the file itself because of escaping concerns, so after merging this PR, I will replace them in the Google spreadsheet and then run the CSV sync utility (GH action) again.
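Since the fix is to be made by hand in the spreadsheet, it may help to know which cells hold the offending characters rather than only how many there are. The following is a minimal sketch, not part of the PR: `locate_non_ascii` and `find_in_csv` are hypothetical helper names, and reading the file as UTF-8 is an assumption. It keeps the same `>= 127` threshold as the snippet above.

```python
import csv

def locate_non_ascii(rows):
    """Yield (row_index, column_index, char) for each character with code point >= 127."""
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            for ch in cell:
                if ord(ch) >= 127:
                    yield r, c, ch

def find_in_csv(path):
    # Assumption: the file is UTF-8 encoded. newline="" lets the csv module
    # handle embedded newlines inside quoted cells.
    with open(path, newline="", encoding="utf-8") as f:
        return list(locate_non_ascii(csv.reader(f)))
```

Calling `find_in_csv('db_signals.csv')` would return zero-based row and column indexes, which map onto spreadsheet cells once the header row and one-based numbering are accounted for.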
In case it helps someone in the future, here's some ugly code that I used to help compare the two versions of these files:

    import csv

    # load both versions of the file as lists of rows
    dev = []
    with open('dev__db_signals.csv') as f:
        for r in csv.reader(f):
            dev.append(r)
    new = []
    with open('new__db_signals.csv') as f:
        for r in csv.reader(f):
            new.append(r)

    def compare_rows(a, b):
        # print each pair of cells that differ, with newlines stripped
        if len(a) != len(b):
            print("length mismatch")
        for i in range(min(len(a), len(b))):
            if a[i] != b[i]:
                print(" ", i, a[i].replace("\n", ""))
                print(" ", i, b[i].replace("\n", ""))

    # iterate over the new file so the last rows are compared too
    for i in range(len(new)):
        offset = 0
        if i in (7, 8):
            # skip added rows
            continue
        if i > 8:
            # account for added rows
            offset = 2
        n = new[i][:10] + new[i][11:]  # skip added column @ index 10
        d = dev[i-offset]
        if n != d:
            print(i)
            compare_rows(n, d)
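The same comparison can be written without hard-coded offsets, which may be handy if a later sync adds different rows or columns. This is only a sketch under assumptions: `align_rows` and `diff_cells` are hypothetical names, not part of the repository, and the inputs are lists of rows as produced by `csv.reader`.

```python
def align_rows(old, new, added_rows=(), added_cols=()):
    """Pair each row of `old` with its counterpart in `new`, ignoring additions."""
    added_rows, added_cols = set(added_rows), set(added_cols)
    kept = [
        [cell for j, cell in enumerate(row) if j not in added_cols]
        for i, row in enumerate(new)
        if i not in added_rows
    ]
    # zip stops at the shorter input, so a row-count mismatch is not reported here
    return list(zip(old, kept))

def diff_cells(old, new, added_rows=(), added_cols=()):
    """Return (old_row_index, column_index, old_value, new_value) for each changed cell."""
    changes = []
    for i, (o, n) in enumerate(align_rows(old, new, added_rows, added_cols)):
        for j, (a, b) in enumerate(zip(o, n)):
            if a != b:
                changes.append((i, j, a, b))
    return changes
```

For the files above, the call would be `diff_cells(dev, new, added_rows=(7, 8), added_cols=(10,))`, matching the two added rows and the added column at index 10.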
Updating Google Docs Meta Data
"Signal Set" column
chng signals: 7dav_inpatient_covid and 7dav_outpatient_covid
The signal name for "covid_naat_pct_positive_7dav" was lost in an apparent accidental paste, but I fixed it here with a commit to the PR branch, and manually in the spreadsheet.