Commit 7314a78

ci: Automatically check for typos (#2855)
1 parent 2d566b8 commit 7314a78

16 files changed: +35 -23 lines

.pre-commit-config.yaml (+8)

@@ -5,6 +5,9 @@ ci:
   skip:
   - uv-lock
 
+default_language_version:
+  python: python3.13
+
 repos:
 - repo: https://github.com/pre-commit/pre-commit-hooks
   rev: v5.0.0
@@ -65,3 +68,8 @@ repos:
   hooks:
   - id: uv-lock
   - id: uv-sync
+
+- repo: https://github.com/codespell-project/codespell
+  rev: v2.4.1
+  hooks:
+  - id: codespell
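
For anyone wanting to exercise the new hook locally, a minimal sketch (assuming `pre-commit` is installed and the hook environments have been built) might look like this; the hook ID matches the `- id: codespell` entry added above:

```python
import subprocess

# Run only the newly added codespell hook against every file in the repository,
# mirroring what pre-commit CI does on each push.
result = subprocess.run(
    ["pre-commit", "run", "codespell", "--all-files"],
    check=False,  # a non-zero exit code just means the hook flagged typos
)
print("codespell hook exit code:", result.returncode)
```

The same check can be run straight from a shell with `pre-commit run codespell --all-files`.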

CHANGELOG.md (+5 -5)

@@ -40,7 +40,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [#2482](https://github.com/meltano/sdk/issues/2482) Allow SQL tap developers to auto-skip certain schemas from discovery
 - [#2784](https://github.com/meltano/sdk/issues/2784) Added a new built-in setting `activate_version` for targets to optionally disable processing of `ACTIVATE_VERSION` messages
 - [#2780](https://github.com/meltano/sdk/issues/2780) Numeric values are now parsed as `decimal.Decimal` in REST and GraphQL stream responses
-- [#2775](https://github.com/meltano/sdk/issues/2775) Log a stream's bookmark (if it's avaiable) when its sync starts
+- [#2775](https://github.com/meltano/sdk/issues/2775) Log a stream's bookmark (if it's available) when its sync starts
 - [#2703](https://github.com/meltano/sdk/issues/2703) Targets now emit record count from the built-in batch file processor
 - [#2774](https://github.com/meltano/sdk/issues/2774) Accept a `maxLength` limit for VARCHARs
 - [#2769](https://github.com/meltano/sdk/issues/2769) Add `versioning-strategy` to dependabot config of Cookiecutter templates
@@ -210,7 +210,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### ✨ New
 
 - [#2432](https://github.com/meltano/sdk/issues/2432) Developers can now customize the default logging configuration for their taps/targets by adding `default_logging.yml` to their package
-- [#2531](https://github.com/meltano/sdk/issues/2531) The `json` module is now avaiable to stream maps -- _**Thanks @grigi!**_
+- [#2531](https://github.com/meltano/sdk/issues/2531) The `json` module is now available to stream maps -- _**Thanks @grigi!**_
 - [#2529](https://github.com/meltano/sdk/issues/2529) Stream sync context is now available to all instances methods as a `Stream.context` attribute
 
 ### 🐛 Fixes
@@ -330,7 +330,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### 📚 Documentation Improvements
 
 - [#2239](https://github.com/meltano/sdk/issues/2239) Linked reference docs to source code
-- [#2231](https://github.com/meltano/sdk/issues/2231) Added an example implemetation of JSON schema validation that uses `fastjsonschema`
+- [#2231](https://github.com/meltano/sdk/issues/2231) Added an example implementation of JSON schema validation that uses `fastjsonschema`
 - [#2219](https://github.com/meltano/sdk/issues/2219) Added reference docs for tap & target testing helpers
 
 ## v0.35.0 (2024-02-02)
@@ -748,7 +748,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### ✨ New
 
-- [#1262](https://github.com/meltano/sdk/issues/1262) Support string `"__NULL__"` whereever null values are allowed in stream maps configuration
+- [#1262](https://github.com/meltano/sdk/issues/1262) Support string `"__NULL__"` wherever null values are allowed in stream maps configuration
 
 ### 🐛 Fixes
 
@@ -1286,7 +1286,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Changes
 
-- Target SDK: Improved performance for Batch Sinks by skipping extra drain operations when newly recieved STATE messages are unchanged from the prior received STATE (#172, !125) -- _Thanks, **[Pat Nadolny](https://gitlab.com/pnadolny13)**!_
+- Target SDK: Improved performance for Batch Sinks by skipping extra drain operations when newly received STATE messages are unchanged from the prior received STATE (#172, !125) -- _Thanks, **[Pat Nadolny](https://gitlab.com/pnadolny13)**!_
 
 ### Fixes

docs/CONTRIBUTING.md (+1 -1)

@@ -161,7 +161,7 @@ Sphinx will automatically generate class stubs, so be sure to `git add` them.
 
 ## Semantic Pull Requests
 
-This repo uses the [semantic-prs](https://github.com/Ezard/semantic-prs) GitHub app to check all PRs againts the conventional commit syntax.
+This repo uses the [semantic-prs](https://github.com/Ezard/semantic-prs) GitHub app to check all PRs against the conventional commit syntax.
 
 Pull requests should be named according to the conventional commit syntax to streamline changelog and release notes management. We encourage (but do not require) the use of conventional commits in commit messages as well.

docs/dev_guide.md (+1 -1)

@@ -291,7 +291,7 @@ That command will produce a `result.json` file which you can explore with the `v
 $ poetry run vizviewer result.json
 ```
 
-Thet output should look like this
+The output should look like this
 
 ![SDK Flame Graph](https://gitlab.com/meltano/sdk/uploads/07633ba1217de6eb1bb0e018133c608d/_write_record_message.png)

docs/implementation/at_least_once.md (+3 -3)

@@ -18,7 +18,7 @@ According to the Singer spec, bookmark comparisons are performed on the basis of
 
 [Replication Key Signposts](./state.md#replication-key-signposts) are an internal and automatic feature of the SDK. Signposts are necessary in order to deliver the 'at least once' delivery promise for unsorted streams and parent-child streams. The function of a signpost is to ensure that bookmark keys do not advance past a point where we may have not synced all records, such as for unsorted or reverse-sorted streams. This feature also enables developers to override `state_partitioning_key`, which reduces the number of bookmarks needed to track state on parent-child streams with a large number of parent records.
 
-In all applications, the signpost prevents the bookmark's value from advancing too far and prevents records from being skipped in future sync operations. We _intentionally_ do not advance the bookmark as far as the max replication key value from all records we've synced, with the knowlege that _some_ records with equal or lower replication key values may have not yet been synced. It follows then, that any records whose replication key is greater than the signpost value will necessarily be re-synced in the next execution, causing some amount of record duplication downstream.
+In all applications, the signpost prevents the bookmark's value from advancing too far and prevents records from being skipped in future sync operations. We _intentionally_ do not advance the bookmark as far as the max replication key value from all records we've synced, with the knowledge that _some_ records with equal or lower replication key values may have not yet been synced. It follows then, that any records whose replication key is greater than the signpost value will necessarily be re-synced in the next execution, causing some amount of record duplication downstream.
 
 ### Cause #3: Stream interruption
 
@@ -32,11 +32,11 @@ There are two generally recommended approaches for dealing with record duplicati
 
 Assuming that a primary key exists, most target implementation will simply use the primary key to merge newly received records with their prior versions, eliminating any risk of duplication in the destination dataset.
 
-However, this approach will not work for streams that lack primary keys or in implentations running in pure 'append only' mode. For these cases, some amount of record duplication should be expected and planned for by the end user.
+However, this approach will not work for streams that lack primary keys or in implementations running in pure 'append only' mode. For these cases, some amount of record duplication should be expected and planned for by the end user.
 
 ### Strategy #2: Removing duplicates using `dbt` transformations
 
-For cases where the destination table _does not_ use primary keys, the most common way of resolving duplicates after they've landed in the downstream dataset is to apply a `ROW_NUMBER()` function in a tool like [dbt](https://www.getdbt.com). The `ROW_NUMBER()` function can caculate a `dedupe_rank` and/or a `recency_rank` in the transformation layer, and then downstream queries can easily filter out any duplicates using the calculated rank. Users can write these transformations by hand or leverage the [deduplicate-source](https://github.com/dbt-labs/dbt-utils#deduplicate-source) macro from the [dbt-utils](https://github.com/dbt-labs/dbt-utils) package.
+For cases where the destination table _does not_ use primary keys, the most common way of resolving duplicates after they've landed in the downstream dataset is to apply a `ROW_NUMBER()` function in a tool like [dbt](https://www.getdbt.com). The `ROW_NUMBER()` function can calculate a `dedupe_rank` and/or a `recency_rank` in the transformation layer, and then downstream queries can easily filter out any duplicates using the calculated rank. Users can write these transformations by hand or leverage the [deduplicate-source](https://github.com/dbt-labs/dbt-utils#deduplicate-source) macro from the [dbt-utils](https://github.com/dbt-labs/dbt-utils) package.
 
 #### Sample dedupe implementation using `dbt`:

docs/implementation/cli.md (+1 -1)

@@ -99,7 +99,7 @@ The SDK automatically applies selection logic as described by the
 
 Selection rules are applied at three levels:
 
-1. **Streams** are filtered out if they are deselected or ommitted in the input catalog.
+1. **Streams** are filtered out if they are deselected or omitted in the input catalog.
 2. **RECORD messages** are filtered based upon selection rules in the input catalog.
 3. **SCHEMA messages** are filtered based upon selection rules in the input catalog.

docs/partitioning.md (+1 -1)

@@ -4,7 +4,7 @@ The Tap SDK supports stream partitioning, meaning a set of substreams
 which each have their own state and their own distinct queryable domain.
 
 You can read more about state partitioning in the
-[State Implemetation](./implementation/state.md#partitioned-state) explanation
+[State Implementation](./implementation/state.md#partitioned-state) explanation
 document.
 
 ## If you do not require partitioning

docs/stream_maps.md (+3 -3)

@@ -175,7 +175,7 @@ to expressions using the `config` dictionary.
 ### Constructing Expressions
 
 Expressions are defined and parsed using the
-[`simpleval`](https://github.com/danthedeckie/simpleeval) expression library. This library
+[`simpleeval`](https://github.com/danthedeckie/simpleeval) expression library. This library
 accepts most native python expressions and is extended by custom functions which have been declared
 within the SDK.
 
@@ -499,7 +499,7 @@ faker_config:
   locale: en_US
 ```
 
-Remember, these expressions are evaluated by the [`simpleval`](https://github.com/danthedeckie/simpleeval) expression library, which only allows a single python expression (which is the reason for the `or` syntax above).
+Remember, these expressions are evaluated by the [`simpleeval`](https://github.com/danthedeckie/simpleeval) expression library, which only allows a single python expression (which is the reason for the `or` syntax above).
 
 This means if you require more advanced masking logic, which cannot be defined in a single python expression, you may need to consider a custom stream mapper.
 
@@ -749,7 +749,7 @@ excluded at the tap level, then the stream will be skipped exactly as if it were
 in the catalog metadata.
 
 If a stream is specified to be excluded at the target level, or in a standalone mapper
-between the tap and target, the filtering occurs downstream from the tap and therefor cannot
+between the tap and target, the filtering occurs downstream from the tap and therefore cannot
 affect the selection rules of the tap itself. Except in special test cases or in cases where
 runtime is trivial, we highly recommend implementing stream-level exclusions at the tap
 level rather than within the downstream target or mapper plugins.
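
The `docs/stream_maps.md` hunks above correct the library name to `simpleeval`. As a rough, hypothetical illustration of the single-expression constraint mentioned there (field names invented for the example, evaluated outside the SDK):

```python
from simpleeval import simple_eval

# simpleeval evaluates exactly one Python expression against a set of names,
# roughly how a stream map expression sees a record's fields. Multi-statement
# logic isn't possible, hence the docs' `or` idiom or a custom stream mapper.
record = {"first_name": "Ada", "last_name": "Lovelace"}
full_name = simple_eval("first_name + ' ' + last_name", names=record)
print(full_name)  # Ada Lovelace
```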

pyproject.toml (+4)

@@ -421,3 +421,7 @@ max-args = 9
 
 [tool.uv]
 required-version = ">=0.5.19"
+
+[tool.codespell]
+skip = "*.csv,samples/aapl/*.json,samples/*/schemas/*.json"
+ignore-words-list = "fo,intoto"

singer_sdk/connectors/sql.py (+1 -1)

@@ -919,7 +919,7 @@ def get_object_names( # pragma: no cover
         view_names = []
         return [(t, False) for t in table_names] + [(v, True) for v in view_names]
 
-    # TODO maybe should be splitted into smaller parts?
+    # TODO maybe should be split into smaller parts?
     def discover_catalog_entry(
         self,
         engine: Engine,  # noqa: ARG002

singer_sdk/contrib/filesystem/stream.py (+1 -1)

@@ -60,7 +60,7 @@ def __init__(
 
         super().__init__(tap, schema=None, name=name)
 
-        # TODO(edgarrmondragon): Make this None if the filesytem does not support it.
+        # TODO(edgarrmondragon): Make this None if the filesystem does not support it.
         self.replication_key = SDC_META_MODIFIED_AT
         self._sync_start_time = utc_now()
         self._partitions = [{SDC_META_FILEPATH: path} for path in self._filepaths]

singer_sdk/helpers/capabilities.py (+2 -2)

@@ -65,7 +65,7 @@
     description=(
         "Config for the [`Faker`](https://faker.readthedocs.io/en/master/) "
         "instance variable `fake` used within map expressions. Only applicable if "
-        "the plugin specifies `faker` as an addtional dependency (through the "
+        "the plugin specifies `faker` as an additional dependency (through the "
         "`singer-sdk` `faker` extra or directly)."
     ),
 ),
@@ -340,7 +340,7 @@ class PluginCapabilities(CapabilitiesEnum):
     #: Support :doc:`inline stream map transforms</stream_maps>`.
     STREAM_MAPS = "stream-maps"
 
-    #: Support schema flattening, aka denesting of complex properties.
+    #: Support schema flattening, aka unnesting of complex properties.
     FLATTENING = "schema-flattening"
 
     #: Support the

singer_sdk/mapper.py (+1 -1)

@@ -667,7 +667,7 @@ def _init_faker_instance(self) -> Faker | None:
 
 
 class PluginMapper:
-    """Inline map tranformer."""
+    """Inline map transformer."""
 
     def __init__(
         self,

singer_sdk/plugin_base.py (+1 -1)

@@ -257,7 +257,7 @@ def initialized_at(self) -> int:
     def capabilities(self) -> list[CapabilitiesEnum]:  # noqa: PLR6301
         """Get capabilities.
 
-        Developers may override this property in oder to add or remove
+        Developers may override this property in order to add or remove
         advertised capabilities for this plugin.
 
         Returns:

singer_sdk/sinks/sql.py (+1 -1)

@@ -91,7 +91,7 @@ def schema_name(self) -> str | None:
         Returns:
             The target schema name.
         """
-        # Look for a default_target_scheme in the configuraion fle
+        # Look for a default_target_scheme in the configuration file
         default_target_schema: str = self.config.get("default_target_schema", None)
         parts = self.stream_name.split("-")

tests/core/test_streams.py (+1 -1)

@@ -444,7 +444,7 @@ def records_jsonpath(cls): # noqa: N805
     {
         "link": [
             {
-                "releation": "previous",
+                "relation": "previous",
                 "url": "https://myapi.test/6"
             },
             {
