Reoccurrence of 16023 - 403 Failure Performing file_get_contents on URL With No Restrictions #17121
Comments
Have you contacted readthedocs.io? Possibly they limit automated downloads in some way.
I will, but last time they said they didn't do anything and sent me to you ... which worked. The fact that I can't get it to fail locally is worrisome.
FWIW, I created a PR which uses curl rather than file_get_contents for https. That PR was successful. I don't know that adding a requirement for curl is something we want to do. PHPOffice/PhpSpreadsheet#4274
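For readers who have not used the curl extension, the general shape of a curl-based download might look like the sketch below. It is not the code from PHPOffice/PhpSpreadsheet#4274; the function name `fetchWithCurl`, the default user agent string, and the 30-second timeout are assumptions made for illustration.

```php
<?php
// Hypothetical helper, not taken from the PR mentioned above.
// Requires the curl extension to be loaded when it is called.
function fetchWithCurl(string $url, string $userAgent = 'example-client/1.0'): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_USERAGENT => $userAgent, // assumed value, adjust as needed
        CURLOPT_TIMEOUT => 30,           // assumed timeout in seconds
    ]);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    // Treat transport errors and non-2xx responses (such as 403) as failure.
    if ($body === false || $status < 200 || $status >= 300) {
        return null;
    }
    return $body;
}
```

Returning `null` on any non-2xx status keeps the caller's check simple, at the cost of hiding which status was received.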
That 403 hints at something that is actively blocked by the server; possibly too many accesses from a certain IP, or generally they block some IP range. Or maybe it's the missing
I have opened readthedocs/readthedocs.org#11845 with readthedocs.
Okay, let's see what comes out of that report. I'm switching to "need feedback", so this ticket will be open for at least two weeks.
The readthedocs.org website is behind the Cloudflare CDN. Typically I see the Cloudflare WAF blocking requests that look like bots. I find the following helps to make your request look more like a web browser
Adding user_agent seems to work. I am studying what is needed for the Connection header and the Accept header.
Adding Connection and Accept headers did no harm.
Adding Connection and Accept headers correctly (thanks to the comment from @lucasnetau) also did no harm.
An interesting consequence of the Accept header that Chrome uses: an image which I expected to be downloaded as png was instead downloaded as webp. Removing image/webp from the Accept header gets the expected result.
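As a small illustration of that last point, the sketch below drops one media type from an Accept header value. The helper name `withoutMediaType` is hypothetical, and the starting string is an assumption: it is the header from this thread with `image/webp` put back where Chrome is understood to send it.

```php
<?php
// Hypothetical helper: remove a single media type from an Accept header value.
function withoutMediaType(string $accept, string $unwanted): string
{
    $kept = [];
    foreach (explode(',', $accept) as $range) {
        $range = trim($range);
        // Compare only the type/subtype part, ignoring parameters such as ";q=0.8".
        $type = strtolower(trim(explode(';', $range, 2)[0]));
        if ($range !== '' && $type !== strtolower($unwanted)) {
            $kept[] = $range;
        }
    }
    return implode(',', $kept);
}

// Assumed Chrome-style Accept value that still contains image/webp.
$chromeAccept = 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7';
$accept = withoutMediaType($chromeAccept, 'image/webp');
```

Note that `*/*` remains in the list, so a server is still permitted to answer with webp; removing the explicit entry only stops advertising a preference for it.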
I have a PR ready to go. I will wait a day or two in case anyone thinks of something else. Here is my new code:

    $ctx = stream_context_create([
        'ssl' => ['crypto_method' => STREAM_CRYPTO_METHOD_TLSv1_3_CLIENT],
        'http' => [
            'user_agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
            'header' => [
                'Connection: keep-alive',
                // accept header used by chrome without image/webp
                'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
            ],
        ],
    ]);
I don't know. You have to ask the readthedocs people, or possibly Cloudflare. Maybe check what a successful cURL connection sends. Anyhow, this is not a bug in php-src.
Here is the final version of the code as I have implemented it:

    $ctx = null;
    // https://github.com/php/php-src/issues/16023
    // https://github.com/php/php-src/issues/17121
    if (str_starts_with($path, 'https:') || str_starts_with($path, 'http:')) {
        $ctxArray = [
            'http' => [
                'user_agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36',
                'header' => [
                    //'Connection: keep-alive', // unacceptable performance
                    'Accept: image/*;q=0.9,*/*;q=0.8',
                ],
            ],
        ];
        if (str_starts_with($path, 'https:')) {
            $ctxArray['ssl'] = ['crypto_method' => STREAM_CRYPTO_METHOD_TLSv1_3_CLIENT];
        }
        $ctx = stream_context_create($ctxArray);
    }
    $imageContents = @file_get_contents($path, false, $ctx);
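Because the call above suppresses warnings with `@`, a 403 shows up only as a `false` return value. One way to see which status the server actually sent is sketched below, assuming the `ignore_errors` HTTP context option and the `$http_response_header` variable that the http wrapper fills in the calling scope. The helper names `lastHttpStatus` and `fetchWithStatus` are hypothetical and are not part of the implemented code.

```php
<?php
// Hypothetical helper: return the status code of the last status line found.
// With redirects the header list holds several status lines; the last one wins.
function lastHttpStatus(array $headers): ?int
{
    $status = null;
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Hypothetical wrapper: fetch a URL and report the status alongside the body.
function fetchWithStatus(string $url, array $contextOptions = []): array
{
    // Ask the http wrapper to return the body even for 4xx/5xx responses.
    $contextOptions['http']['ignore_errors'] = true;
    $ctx = stream_context_create($contextOptions);
    $body = @file_get_contents($url, false, $ctx);
    // $http_response_header is populated in this function's scope by the wrapper.
    $status = lastHttpStatus($http_response_header ?? []);
    return ['status' => $status, 'body' => $body];
}
```

Calling `fetchWithStatus($path, $ctxArray)` in a CI run would make it possible to log the status when it is not 200, which may help when the failure cannot be reproduced locally.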
Description
This is a repeat of issue #16023. The solution suggested when that issue was closed worked successfully from Sept. 24 through Dec. 10. It no longer works as of Dec. 11 on GitHub (see https://github.com/PHPOffice/PhpSpreadsheet/actions/runs/12277531354/job/34257128156?pr=4272). I have tried using TLSv1_3 as well as TLSv1_2.
I cannot reproduce the problem on a local Windows or Linux machine.
The following code:
Resulted in this output:
But I expected this output instead:
PHP Version
PHP 8.3.14, PHP 8.1.31, PHP 8.4.1
Operating System
Ubuntu 22.04.5 LTS