TarInputStream.GetNextEntry hangs on corrupt archive #762
Comments
Interesting... the archive untars correctly using Running it with archivediag, it looks like the end blocks are not in the expected format. I think it gets stuck because it finds blocks at the end whose specified length is zero, and keeps reading the same blocks over and over again. If that's the case, it is definitely a bug. https://pub.p1k.se/sharpziplib/archivediag/diffpy.srfit-1.3.tar.html |
I added some code to just skip all null bytes between entries, and with that change it reads the archive correctly. I still don't understand why there are this many null bytes in the file, especially since the count doesn't align with the block size (512 bytes):
|
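To make the "skip all null bytes between entries" idea concrete, here is a minimal sketch in plain C# with no SharpZipLib dependency. It is not the change that was made to the library; the class `TarPadding` and its methods are hypothetical names used only for illustration. It shows two things: skipping zero bytes in a way that always terminates (end of stream is treated as a stop condition instead of being retried), and checking whether the skipped run is a multiple of the 512-byte block size.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of SharpZipLib.
public static class TarPadding
{
    // Tar block size, as mentioned in the comment above.
    public const int BlockSize = 512;

    // Reads and discards zero bytes from the current position.
    // Returns how many zero bytes were skipped. firstByte receives the
    // first non-zero byte that was read, or -1 if the stream ended first.
    // Every iteration either consumes one byte or returns, so the loop
    // cannot spin on the same data.
    public static long SkipNullBytes(Stream input, out int firstByte)
    {
        long skipped = 0;
        while (true)
        {
            int b = input.ReadByte(); // -1 signals end of stream
            if (b != 0)
            {
                firstByte = b;
                return skipped;
            }
            skipped++;
        }
    }

    // True when a run of padding ends on a block boundary.
    public static bool IsBlockAligned(long byteCount)
    {
        return byteCount % BlockSize == 0;
    }
}
```

For example, calling `SkipNullBytes` on a `MemoryStream` over a 700-byte array whose only non-zero byte sits at index 600 skips 600 bytes, and `IsBlockAligned(600)` is false, which is the kind of misalignment described above. Note that the first non-zero byte has already been consumed when the method returns, so a real reader would need to seek back or keep that byte when parsing the next header.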
I'm seeing (likely) the same thing in 1.4.0 running in dotnet6. This is the code we're using to extract a .tar.gz:

public static void Extract(string sTarballPath, string sTargetPath)
{
    using (FileStream pInStream = File.OpenRead(sTarballPath))
    {
        using (GZipInputStream pGzipStream = new GZipInputStream(pInStream))
        {
            using (TarArchive pTarArchive = TarArchive.CreateInputTarArchive(pGzipStream, null))
            {
                pTarArchive.ExtractContents(sTargetPath);
            }
        }
    }
}

It hangs somewhere in ExtractContents(). The same file can be extracted using other extractors, so it is likely not corrupted (at least not beyond all help). The tar was created using tar (GNU tar) 1.32. Extracting using the code in this gist works on the same file: https://gist.github.com/mikaeleiman/f9510716621b2a6343252df35c4259a1 |
@mikaeleiman This might be due to #789 if there are no end blocks at the end of the archive. In fact, this issue should be revisited with the new tar fixes... |
Indeed. The file in the original issue unpacks perfectly fine in |
Example file: I looked into our code, and it seems that SharpZipLib 1.3.3 handles the file, but 1.4.0 does not. Not sure why we bumped to 1.4.0, but I suspect it was for the dotnet6 support. Seems like 1.3.3 is working fine for our purposes, though. |
@mikaeleiman Yeah, then it probably is #788. It's fixed in
|
Steps to reproduce
Expected behavior
Exception should be thrown if archive cannot be processed.
Actual behavior
GetNextEntry() call hangs.
Version of SharpZipLib
1.3.3
Obtained from (only keep the relevant lines)
Other notes
The archive is an older Python package hosted on PyPI.
Inspecting archive with 7zip shows "Unexpected end of data" for many files.
Also tried fully reading http response stream into byte array and reading from memory stream, same problem.