Skip to content

Commit b3921ec

Browse files
authored
Make pip_smoke_test.sh more robust against false positives (#801)
This makes `pip_smoke_test.sh` robust against two types of false positives that I realized it might produce: 1. Testing for the post-pip-package-installation presence of a tensorboard binary with `which tensorboard` could falsely return true if tensorboard was anywhere on the path (e.g. installed to the system outside the virtualenv), meaning it didn't actually guarantee that the tensorboard command used later in the script would be the one from the pip package. 2. Testing for whether URLs are accessible at `localhost:6006` (by default) could falsely succeed if there was another pre-existing tensorboard instance running locally on port 6006 (which is common for a developer machine since that's the default tensorboard port), since the accesses don't really have a way to check that they're talking to the freshly launched tensorboard. Note that the newly launched tensorboard actually dies on startup in this case (since it can't bind to the requested port) but it's hidden since it's been backgrounded. This fixes the first one by checking directly for a `tensorboard` binary inside the virtualenv's `bin` directory, and fixes the second one by ensuring that we wait for the `tensorboard` binary to start up successfully and print its serving URL before proceeding with the test. That also means we can extract the serving URL automatically, which makes it possible to set the default port to 0 (aka requesting an unused local port from the kernel) which should eliminate the risk of interference in the default case.
1 parent c04ee3a commit b3921ec

File tree

1 file changed

+36
-9
lines changed

1 file changed

+36
-9
lines changed

tensorboard/pip_package/pip_smoke_test.sh

+36-9
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,7 @@ die() {
3030
}
3131

3232
PY_VERSION=2
33-
TEST_PORT=6006
33+
TEST_PORT=0
3434
NUM_RETRIES=20
3535
while [[ "$#" -gt 0 ]]; do
3636
if [[ "$1" == "--python3" ]]; then
@@ -89,7 +89,8 @@ echo "Activating virtualenv at ${VENV_TMP_DIR}"
8989
echo
9090

9191
export VIRTUAL_ENV="${VENV_TMP_DIR}"
92-
export PATH="${VENV_TMP_DIR}/bin:${PATH}"
92+
export VENV_BIN_DIR="${VENV_TMP_DIR}/bin"
93+
export PATH="${VENV_BIN_DIR}:${PATH}"
9394
unset PYTHON_HOME
9495

9596
echo
@@ -122,17 +123,42 @@ elif [[ "${PY_VERSION}" == 3 ]]; then
122123
fi
123124

124125
# Check tensorboard binary path.
125-
TB_BIN_PATH=$(which tensorboard)
126-
if [[ -z ${TB_BIN_PATH} ]]; then
127-
die "ERROR: Cannot find tensorboard binary path after installing tensorboard pip package."
126+
TB_BIN_PATH="${VENV_BIN_DIR}/tensorboard"
127+
if ! [[ -x "${TB_BIN_PATH}" ]]; then
128+
die "ERROR: No tensorboard binary found after installing tensorboard pip package."
128129
fi
129130

131+
# Start TensorBoard running in the background
132+
TMP_LOG_OUTPUT=${PIP_TMP_DIR}/output.log
130133
TMP_LOGDIR=$(mktemp -d --suffix _tensorboard_logdir)
131-
tensorboard --port="${TEST_PORT}" --logdir="${TMP_LOGDIR}" &
134+
tensorboard --host=localhost --port="${TEST_PORT}" --logdir="${TMP_LOGDIR}" \
135+
>${TMP_LOG_OUTPUT} 2>&1 &
132136
TB_PID=$!
133137

134138
echo
135-
echo "tensorboard binary should be running at pid ${TB_PID}"
139+
echo "waiting for tensorboard binary to start up..."
140+
echo
141+
142+
# Wait until the binary has printed its serving URL so we know that it's
143+
# accessible and which port it's running on.
144+
while true; do
145+
if ! ps -p $TB_PID >/dev/null 2>&1; then
146+
echo
147+
echo "TensorBoard exited unexpected, printing logs:"
148+
echo "============================================="
149+
cat ${TMP_LOG_OUTPUT}
150+
echo "============================================="
151+
exit 1
152+
fi
153+
TB_URL=$(grep -o -m 1 -E 'http://localhost:[0-9]+' ${TMP_LOG_OUTPUT} || true)
154+
if [[ -n "${TB_URL}" ]]; then
155+
break
156+
fi
157+
sleep 1
158+
done
159+
160+
echo
161+
echo "tensorboard binary (pid ${TB_PID}) running at ${TB_URL}"
136162
echo
137163

138164
test_access_url() {
@@ -174,14 +200,15 @@ test_access_url() {
174200
}
175201

176202
TEST_URL_FAILED=0
177-
test_access_url "http://localhost:${TEST_PORT}/data/logdir" || TEST_URL_FAILED=1
178-
test_access_url "http://localhost:${TEST_PORT}" || TEST_URL_FAILED=1
203+
test_access_url "${TB_URL}/data/logdir" || TEST_URL_FAILED=1
204+
test_access_url "${TB_URL}" || TEST_URL_FAILED=1
179205

180206
echo
181207
echo "Terminating tensorboard binary at pid ${TB_PID}"
182208
echo
183209

184210
kill -9 "${TB_PID}"
211+
wait "${TB_PID}" 2>/dev/null || true # Wait to suppress "Killed" message.
185212

186213
echo
187214
if [[ "${TEST_URL_FAILED}" == 0 ]]; then

0 commit comments

Comments
 (0)