python - Downloading images from an HTTPS aspx page with Python

I am trying to download some images from the NASS Case Viewer. One example is

https://www-nass.nhtsa.dot.gov/nass/cds/CaseForm.aspx?xsl=main.xsl&CaseID=149006692

In this case, the link to the image viewer is

https://www-nass.nhtsa.dot.gov/nass/cds/GetBinary.aspx?ImageView&ImageID=497001669&Desc=FRONT&Title=Vehicle+1+-+Front&Version=1&Extend=jpg

It may not be viewable because of HTTPS. In any case, this is just the second Front image.

The actual link to the image is (or should be?)

https://www-nass.nhtsa.dot.gov/nass/cds/GetBinary.aspx?Image&ImageID=497001669&CaseID=149006692&Version=1

This only downloads the aspx binary.

My problem is that I do not know how to store that binary data in a proper jpg file.

The code I have tried is

import requests
test_image = "https://www-nass.nhtsa.dot.gov/nass/cds/GetBinary.aspx?Image&ImageID=497001669&CaseID=149006692&Version=1"
pull_image = requests.get(test_image)

with open("test_image.jpg", "wb+") as myfile:
    myfile.write(str.encode(pull_image.text))

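But this does not produce a valid jpg file. I also inspected pull_image.raw.read() and found that it is empty.

What could be the problem here? Are my URLs incorrect? I put these URLs together with Beautifulsoup and verified them by inspecting the HTML of a few pages.

Am I saving the binary data incorrectly?

Best answer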

.text decodes the response body into a string, so your image file ends up corrupted.
Instead, you should use .content, which holds the binary response content.
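
To see why going through text is lossy, here is a small offline sketch. It assumes the body is decoded as UTF-8 with invalid bytes replaced, which is only an approximation of what .text does (the encoding requests actually picks depends on the response). The names jpeg_header, as_text and round_tripped are made up for illustration.

```python
# First bytes of a typical JPEG file: the SOI marker, an APP0 marker,
# a segment length, and the "JFIF" identifier.
jpeg_header = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]) + b"JFIF\x00"

# Assumption for illustration: the binary body is decoded as UTF-8 and
# invalid byte sequences are replaced with U+FFFD.
as_text = jpeg_header.decode("utf-8", errors="replace")

# Encoding the string again does not bring the original bytes back.
round_tripped = as_text.encode("utf-8")

print(jpeg_header == round_tripped)  # False
```

Once the leading bytes have been replaced, the file no longer starts with a JPEG signature, so image viewers reject it.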

import requests

test_image = "https://www-nass.nhtsa.dot.gov/nass/cds/GetBinary.aspx?Image&ImageID=497001669&CaseID=149006692&Version=1"
pull_image = requests.get(test_image)

# .content holds the undecoded response bytes, so write it in binary mode.
with open("test_image.jpg", "wb") as myfile:
    myfile.write(pull_image.content)
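
If the server answers with an HTML error page instead of an image, the bytes would still be written to a file named .jpg. A minimal sketch of a guard against that, assuming the images really are JPEGs as the Extend=jpg parameter suggests; looks_like_jpeg and save_jpeg are hypothetical helper names, not part of requests:

```python
from pathlib import Path

# JPEG files start with the SOI marker FF D8, followed by another
# marker that begins with FF.
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(data: bytes) -> bool:
    """Return True if the data starts with a JPEG signature."""
    return data.startswith(JPEG_MAGIC)


def save_jpeg(data: bytes, path: str) -> int:
    """Write data to path if it looks like a JPEG; return bytes written."""
    if not looks_like_jpeg(data):
        raise ValueError("data does not start with a JPEG signature")
    return Path(path).write_bytes(data)
```

With the code above this could be called as save_jpeg(pull_image.content, "test_image.jpg"), preferably after pull_image.raise_for_status() so that HTTP errors surface first.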

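.raw.read() also returns bytes, but to use it you must set the stream parameter to True. Without stream=True, requests has already consumed the body by the time get() returns, which is why raw.read() came back empty.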

# stream=True defers reading the body, so .raw still has data to read.
pull_image = requests.get(test_image, stream=True)
with open("test_image.jpg", "wb") as myfile:
    myfile.write(pull_image.raw.read())
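
For larger files, a common alternative to reading .raw in one go is to write the body chunk by chunk. The sketch below is one way to do that under stated assumptions: write_chunks is a hypothetical helper that accepts any iterable of byte chunks, such as the one returned by the response's iter_content method.

```python
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write each non-empty chunk to path and return the total bytes written."""
    total = 0
    with open(path, "wb") as fh:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                total += fh.write(chunk)
    return total


# Possible usage with requests (not executed here):
#   pull_image = requests.get(test_image, stream=True)
#   pull_image.raise_for_status()
#   write_chunks(pull_image.iter_content(chunk_size=8192), "test_image.jpg")
```

One reason to prefer iter_content over .raw.read() is that it decodes gzip and deflate transfer encodings, whereas .raw returns the bytes as they arrived unless decoding is requested explicitly.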

Regarding python - Downloading images from an HTTPS aspx page with Python, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48308973/