我已经设置了3个Google Cloud Storge存储桶和3个功能(每个存储桶一个),这些功能会在将PDF文件上传到存储桶时触发。函数将PDF转换为png图像并进行进一步处理。

当我尝试创建第四个存储桶和类似功能时,奇怪的是它无法正常工作。即使我复制了现有的3个功能之一,它仍然无法正常工作,并且出现了此错误:
Traceback (most recent call last): File "/env/local/lib/python3.7/site-packages/google/cloud/functions_v1beta2/worker.py", line 333, in run_background_function _function_handler.invoke_user_function(event_object) File "/env/local/lib/python3.7/site-packages/google/cloud/functions_v1beta2/worker.py", line 199, in invoke_user_function return call_user_function(request_or_event) File "/env/local/lib/python3.7/site-packages/google/cloud/functions_v1beta2/worker.py", line 196, in call_user_function event_context.Context(**request_or_event.context)) File "/user_code/main.py", line 27, in pdf_to_img with Image(filename=tmp_pdf, resolution=300) as image: File "/env/local/lib/python3.7/site-packages/wand/image.py", line 2874, in __init__ self.read(filename=filename, resolution=resolution) File "/env/local/lib/python3.7/site-packages/wand/image.py", line 2952, in read self.raise_exception() File "/env/local/lib/python3.7/site-packages/wand/resource.py", line 222, in raise_exception raise e wand.exceptions.PolicyError: not authorized/tmp/tmphm3hiezy'@ error/constitute.c/ReadImage/412`

让我感到困惑的是,为什么相同的功能可以在现有存储桶上运行,却不能在新存储桶上运行。

更新:
即使这样也不起作用(出现“缓存资源用尽”错误):

requirements.txt中:

google-cloud-storage
wand

main.py中:
import tempfile

from google.cloud import storage
from wand.image import Image

storage_client = storage.Client()

def pdf_to_img(data, context):
    file_data = data
    pdf = file_data['name']

    if pdf.startswith('v-'):
        return

    bucket_name = file_data['bucket']

    blob = storage_client.bucket(bucket_name).get_blob(pdf)

    _, tmp_pdf = tempfile.mkstemp()
    _, tmp_png = tempfile.mkstemp()

    tmp_png = tmp_png+".png"

    blob.download_to_filename(tmp_pdf)
    with Image(filename=tmp_pdf) as image:
        image.save(filename=tmp_png)

    print("Image created")
    new_file_name = "v-"+pdf.split('.')[0]+".png"
    blob.bucket.blob(new_file_name).upload_from_filename(tmp_png)

上面的代码应该只是创建一个图像文件的副本,然后将其上传到存储桶中。

最佳答案

由于该漏洞已在Ghostscript中修复,但未在ImageMagick中更新,因此将PDF转换为Google Cloud Functions中的图像的解决方法是使用此ghostscript wrapper并直接请求从Ghostscript将PDF转换为png(绕过ImageMagick)。

requirements.txt

google-cloud-storage
ghostscript==0.6

main.py
import locale
import tempfile
import ghostscript

from google.cloud import storage

storage_client = storage.Client()

def pdf_to_img(data, context):
    file_data = data
    pdf = file_data['name']

    if pdf.startswith('v-'):
        return

    bucket_name = file_data['bucket']

    blob = storage_client.bucket(bucket_name).get_blob(pdf)

    _, tmp_pdf = tempfile.mkstemp()
    _, tmp_png = tempfile.mkstemp()

    tmp_png = tmp_png+".png"

    blob.download_to_filename(tmp_pdf)

    # create a temp folder based on temp_local_filename
    # use ghostscript to export the pdf into pages as pngs in the temp dir
    args = [
        "pdf2png", # actual value doesn't matter
        "-dSAFER",
        "-sDEVICE=pngalpha",
        "-o", tmp_png,
        "-r300", tmp_pdf
        ]
    # the above arguments have to be bytes, encode them
    encoding = locale.getpreferredencoding()
    args = [a.encode(encoding) for a in args]
    #run the request through ghostscript
    ghostscript.Ghostscript(*args)

    print("Image created")
    new_file_name = "v-"+pdf.split('.')[0]+".png"
    blob.bucket.blob(new_file_name).upload_from_filename(tmp_png)

无论如何,这可以解决问题,并将所有处理保留在GCF中。希望能帮助到你。但是,您的代码适用于单页PDF。我的用例是this question中的多页pdf转换,ghostscript代码和解决方案。

关于python-3.x - 带有魔杖的Google云功能停止工作,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/53296500/

10-12 13:03