Skip to content

gh-129603: Don't segfault if sqlite3.Row description is None #129604

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

erlend-aasland
Copy link
Contributor

@erlend-aasland erlend-aasland commented Feb 3, 2025

@erlend-aasland erlend-aasland force-pushed the sqlite/row-description-is-none branch from ae502d7 to 6f029cd Compare February 3, 2025 00:47
@erlend-aasland erlend-aasland added needs backport to 3.12 only security fixes needs backport to 3.13 bugs and security fixes labels Feb 3, 2025
@erlend-aasland erlend-aasland linked an issue Feb 3, 2025 that may be closed by this pull request
@erlend-aasland
Copy link
Contributor Author

Alternatively, we could disallow instantiation of Row objects if description is None, but I'm not sure it is a better approach.

@erlend-aasland
Copy link
Contributor Author

Alternatively, we could disallow instantiation of Row objects if description is None, but I'm not sure it is a better approach.

Expanding on this: adding this check to the tp_new slot may impact performance, but I would expect the added pointer check to be negligible.

@erlend-aasland
Copy link
Contributor Author

Any other opinions on this, @picnixz?

@picnixz
Copy link
Member

picnixz commented Feb 9, 2025

Expanding on this: adding this check to the tp_new slot may impact performance, but I would expect the added pointer check to be negligible.

Are there situations where one needs to create an instance of an sqlite row from a cursor with no description? (even creating rows manually by passing a cursor feels a bit weird to me, but there are maybe cases when it's needed, I don't know, probably for libs interfacing even more with sqlite3). If so, this may be better to keep the current PR (which can be backported probably) and then do a follow-up PR and:

  • Solution 1: raise if None is passed as the description.
  • Solution 2: emit a deprecation warning saying that in the future rows without a description cannot be instantiated (i.e., the cursor must therefore have a description); this will not be really inefficient as it would only add one check but it's a bit annoying for users and for us to maintain.

If it doesn't make sense to instantiate a row with a cursor that has no description (and I think it doesn't make sense to do this but I may be mistaken), then we can just have this additional check at construction time. I doubt it will slow down since it's just an additional is None check. The row creation looks is already fast by itself and we're already doing an if (self == NULL) check, which is roughly the same cost of if (cursor->description == Py_None) { ... }.

Now, if rows with None descriptions were instantiated manually, and even if they don't make sense or reflect bad code practice, they were already broken when keys were being used but we could still use them with numeric indices. So there might be some code that is not entirely broken out there. If you want to be super-conservative, let's go down the deprecation road (but it's annoying for both the users and us to maintain), so I think we can just forbid instantiating such rows in 3.14+. For 3.12 and 3.13, we can leave it as is just to avoid possible reports.

@erlend-aasland
Copy link
Contributor Author

As I wrote in the issue OP, I don't think any real world code instantiates Row objects manually, but you never know. This PR is probably the most conservative: we avoid crashing, but apart from that, there are no change in behaviour. Changing __new__ to disallow missing descriptions would be a breaking change in theory, but probably not in practice. I'm not sure if it is worth it to disallow missing descriptions, though.

@picnixz
Copy link
Member

picnixz commented Feb 9, 2025

Changing __new__ to disallow missing descriptions would be a breaking change in theory

Yeah, that's why I suggested the (annoying) deprecation road. The PR doesn't look super complicate and I don't think we have other places where we need to be careful with the type of the description (maybe you could add a comment in the object structure saying that it can be None and thus should be carefully checked?).

@picnixz
Copy link
Member

picnixz commented Feb 9, 2025

Note: __getitem__ gets a tiny tiny tiny tiny slowdown since we have an additional is not None check every time we use a key (but this is not the dominant step of the call).

@erlend-aasland
Copy link
Contributor Author

Note: __getitem__ gets a tiny tiny tiny tiny slowdown since we have an additional is not None check every time we use a key (but this is not the dominant step of the call).

Hm, I'm not sure we even need that check. I'm strangely unable to trigger a crash for that path.

@erlend-aasland
Copy link
Contributor Author

Ah, it's using PyTuple_Size, not PyTuple_GET_SIZE.

@erlend-aasland
Copy link
Contributor Author

We should be fine with using the fast get-size API for both of the cases this PR touches. That should even out the potential slowdown of the None-check.

@erlend-aasland
Copy link
Contributor Author

Note: __getitem__ gets a tiny tiny tiny tiny slowdown since we have an additional is not None check every time we use a key (but this is not the dominant step of the call).

Slight digression: we could speed up Row objects a little bit by using fast tuple APIs for ->data access. Moreover, the PyUnicode_Checks in equal_ignore_case are redundant: left is already checked in pysqlite_row_subscript, and right is always a unicode object (we build the description tuples in cursor.c).

@erlend-aasland erlend-aasland merged commit 7e6ee50 into python:main Feb 9, 2025
46 checks passed
@erlend-aasland erlend-aasland deleted the sqlite/row-description-is-none branch February 9, 2025 23:27
@miss-islington-app

This comment was marked as outdated.

@miss-islington-app

This comment was marked as outdated.

@miss-islington-app

This comment was marked as outdated.

@bedevere-app
Copy link

bedevere-app bot commented Feb 9, 2025

GH-129923 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Feb 9, 2025
@bedevere-app
Copy link

bedevere-app bot commented Feb 9, 2025

GH-129924 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 only security fixes label Feb 9, 2025
erlend-aasland added a commit to erlend-aasland/cpython that referenced this pull request Feb 9, 2025
erlend-aasland added a commit that referenced this pull request Feb 10, 2025
erlend-aasland added a commit that referenced this pull request Feb 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Segfault if pysqlite_Row->description == PyNone
2 participants