
Commit e4df29d

Add pre-commit with ruff, pyproject.toml, gh lint/test actions (#124)
* Add pre-commit with ruff and related gh action
* Split lint/unit test into separate jobs
* Fix gh action syntax
* Fix action version
* Remove un-needed packages, switch to pyproject.toml
* Tweak pyproject.toml
* Update README, run pre-commit
1 parent af5e438 commit e4df29d

16 files changed: +438, -305 lines

.github/workflows/gh-actions.yml (+46)

```diff
@@ -0,0 +1,46 @@
+name: gh-actions
+
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+
+jobs:
+  pre-commit:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+
+      - name: Run pre-commit
+        uses: pre-commit/[email protected]
+
+  unit-tests:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.12'
+          cache: 'pipenv'
+
+      - name: Install pipenv
+        run: curl https://raw.githubusercontent.com/pypa/pipenv/master/get-pipenv.py | python
+
+      - name: Install dependencies
+        run: pipenv install --system --deploy --dev
+
+      - name: Run pytest
+        run: |
+          pytest -v -s
```
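
The unit-tests job can be approximated locally before pushing. A minimal sketch, assuming pipenv is already installed on your machine; unlike CI, it installs into a virtualenv rather than the system interpreter, so the `--system --deploy` flags are dropped:

```python
# local_ci.py -- hypothetical helper mirroring the unit-tests job above
import subprocess

# Install runtime + dev dependencies from the Pipfile
# (CI additionally passes --system --deploy)
subprocess.run(["pipenv", "install", "--dev"], check=True)

# Run the test suite with the same flags as the workflow
subprocess.run(["pipenv", "run", "pytest", "-v", "-s"], check=True)
```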

.gitignore (-1)

```diff
@@ -174,4 +174,3 @@ data/
 
 # ignore local docker settings
 docker-compose-maas.yml
-
```

.pre-commit-config.yaml (+18)

```diff
@@ -0,0 +1,18 @@
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v5.0.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-yaml
+      - id: check-added-large-files
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.9.9
+    hooks:
+      - id: ruff
+        name: lint with ruff
+      - id: ruff
+        name: sort imports with ruff
+        args: [--select, I, --fix]
+      - id: ruff-format
+        name: format with ruff
```
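
The `ruff` hook is listed twice on purpose: the first run lints, while the second restricts ruff to its isort-compatible `I` rules with `--fix` so import ordering is corrected automatically; `ruff-format` then takes over the job black used to do. A sketch of the import-sorting hook's effect, assuming ruff treats the repo's top-level `connectors` package as first-party:

```python
# Before the "sort imports with ruff" hook, imports might be interleaved:
#
#   import flask
#   import os
#   from connectors.llm import interface
#   import json
#
# After `ruff check --select I --fix`, they are grouped into stdlib,
# third-party, and first-party blocks, alphabetized within each group:
import json
import os

import flask

from connectors.llm import interface
```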

.vscode/launch.json (+1 -1)

```diff
@@ -37,4 +37,4 @@
             "justMyCode": false
         }
     ]
-}
\ No newline at end of file
+}
```

Dockerfile (-1)

```diff
@@ -53,4 +53,3 @@ EXPOSE 8000
 
 ENV PATH="$APP_ROOT/.venv/bin:$PATH"
 CMD ["flask", "run", "--host=0.0.0.0", "--port=8000"]
-
```

Pipfile (+2 -3)

```diff
@@ -30,11 +30,10 @@ pypdf2 = "*"
 scikit-learn = "*"
 
 [dev-packages]
-black = "*"
-isort = "*"
-flake8 = "*"
 ipython = "*"
 pytest = "*"
+ruff = "*"
+pre-commit = "*"
 
 [requires]
 python_version = "3.12"
```

Pipfile.lock (+300 -255)

Generated file; diff not rendered by default.

README.md (+56 -32)

````diff
@@ -20,6 +20,11 @@ Each agent is intended to answer questions related to a set of documents known a
 - [With Docker Compose](#with-docker-compose)
   - [Using huggingface text-embeddings-inference server to host embedding model (deprecated)](#using-huggingface-text-embeddings-inference-server-to-host-embedding-model-deprecated)
 - [Without Docker Compose](#without-docker-compose)
+- [Developer Guide](#developer-guide)
+  - [Install development packages](#install-development-packages)
+  - [Using pre-commit](#using-pre-commit)
+  - [Debugging in VSCode](#debugging-in-vscode)
+  - [Mac Development Tips](#mac-development-tips)
 - [Synchronizing Documents from S3](#synchronizing-documents-from-s3)
   - [Continuous synchronization](#continuous-synchronization)
 - [Deploying to OpenShift](#deploying-to-openshift)
@@ -142,52 +147,32 @@ A development/test environment can be set up with or without docker compose. In
 
 The docker compose file offers an easy way to spin up all components. [ollama](https://ollama.com) is used to host the LLM and embedding model. For utilization of your GPU, refer to the comments in the compose file to see which configurations to uncomment on the 'ollama' container. Postgres persists the data, and pgadmin allows you to query the database.
 
-You will need Docker version 27.5.1 on Fedora 40 and 41 to be able to use docker compose (not docker-compose) and for that You will need to reinstall latest docker version from the [fedora docker repo](https://docs.docker.com/engine/install/fedora/#install-using-the-repository) or follow the instructions here.
+1. First, install Docker: [Follow the official guide for your OS](https://docs.docker.com/engine/install/)
 
-Docker 27.5.1 is confirmed working with macOS 15.3.
+   - NOTE: Currently, the compose file does not work with `podman`.
 
-To get the correct version of docker, add the repo:
+2. On Linux, be sure to run through the [postinstall steps](https://docs.docker.com/engine/install/linux-postinstall/)
 
-```text
-sudo dnf -y install dnf-plugins-core
-sudo dnf-3 config-manager --add-repo https://download.docker.com/linux/fedora/docker-ce.repo
-```
-
-Install the packages:
-
-```text
-sudo dnf install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
-```
-
-Enable:
-
-```text
-sudo systemctl enable --now docker
-```
-
-Run through the postinstall steps https://docs.docker.com/engine/install/linux-postinstall/
-
-
-1. Create the directory which will house the local environment data:
+3. Create the directory which will house the local environment data:
 
    ```text
   mkdir data
   ```
 
-1. Invoke docker compose (postgres data will persist in `data/postgres`):
+4. Invoke docker compose (postgres data will persist in `data/postgres`):
 
   ```text
   docker compose up --build
   ```
 
-1. Pull the mistral LLM and nomic embedding model (data will persist in `data/ollama`):
+5. Pull the mistral LLM and nomic embedding model (data will persist in `data/ollama`):
 
   ```text
   docker exec tangerine-ollama ollama pull mistral
   docker exec tangerine-ollama ollama pull nomic-embed-text
   ```
 
-1. Access the API on port `8000`
+6. Access the API on port `8000`
 
   ```sh
   curl -XGET 127.0.0.1:8000/api/agents
@@ -196,7 +181,7 @@ Run through the postinstall steps https://docs.docker.com/engine/install/linux-p
   }
   ```
 
-1. (optional) Follow these steps to start the [tangerine-frontend](https://github.com/RedHatInsights/tangerine-frontend#with-docker-compose)
+7. (optional) Follow these steps to start the [tangerine-frontend](https://github.com/RedHatInsights/tangerine-frontend#with-docker-compose)
 
 Note: You can access pgadmin at localhost:5050.
 
@@ -315,14 +300,53 @@ to use this to test different embedding models that are not supported by ollama,
 
 1. (optional) Follow these steps to start the [tangerine-frontend](https://github.com/RedHatInsights/tangerine-frontend#without-docker-compose)
 
-## Debugging in VSCode
+## Developer Guide
+
+### Install development packages
+
+If desiring to make contributions, be sure to install the development packages:
+
+```sh
+pipenv install --dev
+```
+
+### Using pre-commit
+
+This project uses pre-commit to handle formatting and linting.
+
+- Before pushing a commit, you can run:
+
+  ```sh
+  pre-commit run --all
+  ```
+
+  and if it fails, check for changes the tool has made to your files.
+
+- Alternatively, you can add pre-commit as a git hook with:
+
+  ```sh
+  pre-commit install
+  ```
+
+  and pre-commit will automatically be invoked every time you create a commit.
+
+### Debugging in VSCode
 
 Run postgres and ollama either locally or in containers. Don't run the backend container. Click on "Run & Debug" in the left menu and then run the "Debug Tangerine Backend" debug target. You can now set breakpoints and inspect runtime state.
 
 There's a second debug target for the unit tests if you want to run those in a debugger.
 
 ## Mac Development Tips
-Ollama running in Docker on Apple Silicon cannot make use of hardware acceleration. That means the LLM will be very slow to respond running in Docker, even on a very capable machine. However, running the model locally does make use of acceleration and is quite fast. If you are working on a Mac the best setup is to run the model through ollama locally and then the other deps like the database in Docker. The way the compose file is set up, the networking is all seemless. If you stop the ollama container and then ollama serve locally it will all just work together. You'll have the best local development setup if you combine the model running locally and tangerine-backend running in a debugger in VSCode with postgres and pgadmin running in Docker!
+
+Ollama running in Docker on Apple Silicon cannot make use of hardware acceleration. That means the LLM will be very slow to respond running in Docker, even on a very capable machine.
+
+However, running ollama outside of Docker does make use of acceleration and is quite fast. If you are working on a Mac the best setup is to run the model through ollama locally and continue to run the other components (like the database) in Docker. The way the compose file is set up, the networking should allow this to work without issue.
+
+Comment out `ollama` from the compose file, or stop the ollama container. Invoke `ollama serve` in your shell. For an optimal developer experience:
+
+- run tangerine-backend in a debugger in VSCode
+- run ollama directly on your host
+- run postgres/pgadmin in Docker.
 
 ## Synchronizing Documents from S3
 
@@ -350,15 +374,15 @@ To do so you'll need to do the following:
   echo 'BUCKET=mybucket' >> .env
   ```
 
-5. Create an `s3.yaml` file that describes your agents and the documents they should ingest. See [s3-example.yaml](s3-example.yaml) for an example.
+1. Create an `s3.yaml` file that describes your agents and the documents they should ingest. See [s3-example.yaml](s3-example.yaml) for an example.
 
   If using docker compose, copy this config into your container:
 
   ```text
   docker cp s3.yaml tangerine-backend:/opt/app-root/src/s3.yaml
   ```
 
-6. Run the S3 sync job:
+1. Run the S3 sync job:
 
   - With docker compose:
````

connectors/llm/interface.py (+3 -3)

```diff
@@ -81,13 +81,13 @@ def _build_context(search_results: list[Document], content_char_limit: int = 0):
             }
         )
 
-        context += f"\n<<Search result {i+1}"
+        context += f"\n<<Search result {i + 1}"
         if "title" in metadata:
             title = metadata["title"]
             context += f", document title: '{title}'"
         limit = content_char_limit if content_char_limit else len(page_content)
         search_result = page_content[0:limit]
-        context += ">>\n\n" f"{search_result}\n\n" f"<<Search result {i+1} END>>\n"
+        context += f">>\n\n{search_result}\n\n<<Search result {i + 1} END>>\n"
 
     return context, search_metadata
 
@@ -185,7 +185,7 @@ def api_response_generator():
         for data in llm_response:
             yield f"data: {json.dumps(data)}\r\n"
         # final piece of content returned is the search metadata
-        yield f"data: {json.dumps({"search_metadata": search_metadata})}\r\n"
+        yield f"data: {json.dumps({'search_metadata': search_metadata})}\r\n"
 
     if stream:
         log.debug("streaming response...")
```
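
Both edits in this file are mechanical results of the new `ruff-format` hook (the pinned 0.9.x releases format expressions inside f-strings): spacing around operators in interpolations is normalized, implicit concatenation of adjacent f-strings is collapsed into one literal, and same-quote nesting is rewritten, since reusing the outer quote character inside an f-string is a syntax error before Python 3.12 (PEP 701). A standalone sketch of the before/after behavior:

```python
import json

search_metadata = {"pages": 3}  # stand-in value for illustration

# Single quotes inside a double-quoted f-string keep the line valid on
# Python < 3.12, where f"...{json.dumps({"key": ...})}..." would not parse:
print(f"data: {json.dumps({'search_metadata': search_metadata})}\r\n")

# Operator spacing inside interpolations is normalized ({i+1} -> {i + 1}),
# and adjacent f-string fragments are merged into a single f-string:
for i in range(2):
    print(f">>\n\nresult body\n\n<<Search result {i + 1} END>>\n")
```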

file_upload_cli.py (+2 -2)

```diff
@@ -41,9 +41,9 @@ def upload_files(source, directory_path, url, agent_id, html, bearer_token):
         )
 
         if response.status_code == 200:
-            print(f"Batch {i+1}/{num_batches} uploaded successfully.")
+            print(f"Batch {i + 1}/{num_batches} uploaded successfully.")
         else:
-            print(f"Error uploading batch {i+1}/{num_batches}: {response.text}")
+            print(f"Error uploading batch {i + 1}/{num_batches}: {response.text}")
 
 
 if __name__ == "__main__":
```

json/quality_detection_training.json (+1 -1)

```diff
@@ -48,4 +48,4 @@
   {"text": "[toc]", "label": "junk"},
   {"text": "", "label": "junk"}
 
-]
\ No newline at end of file
+]
```

pgadmin/pgadmin-servers.json (+1 -1)

```diff
@@ -12,4 +12,4 @@
       "ConnectNow": true
     }
   }
-}
\ No newline at end of file
+}
```

pgadmin/pgpassfile (+1 -1)

```diff
@@ -1 +1 @@
-postgres:5432:citrus:citrus:citrus
\ No newline at end of file
+postgres:5432:citrus:citrus:citrus
```

pyproject.toml (+7)

```diff
@@ -0,0 +1,7 @@
+[tool.pytest.ini_options]
+addopts = ["--ignore=data/"]
+
+[tool.ruff]
+line-length = 100
+indent-width = 4
+target-version = "py312"
```
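
pytest reads `[tool.pytest.ini_options]` from pyproject.toml automatically, which is what allows pytest.ini to be deleted below. A quick in-process sanity check, as a sketch (not part of the commit):

```python
# check_pytest_config.py -- hypothetical check that the pyproject.toml
# options are picked up: with addopts = ["--ignore=data/"], nothing under
# data/ should appear in the collected test list.
import pytest

exit_code = pytest.main(["--collect-only", "-q"])
raise SystemExit(exit_code)
```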

pytest.ini (-2)

This file was deleted.

setup.cfg (-2)

This file was deleted.
