Writing metadata to a PDF using pyobjc

Problem description

I'm trying to write metadata to a PDF file using the following Python code:

from Foundation import *
from Quartz import *

url = NSURL.fileURLWithPath_("test.pdf")
pdfdoc = PDFDocument.alloc().initWithURL_(url)
assert pdfdoc, "failed to create document"

print "reading pdf file"

attrs = {}
attrs[PDFDocumentTitleAttribute] = "THIS IS THE TITLE"
attrs[PDFDocumentAuthorAttribute] = "A. Author and B. Author"

PDFDocumentTitleAttribute = "test"

pdfdoc.setDocumentAttributes_(attrs)
pdfdoc.writeToFile_("mynewfile.pdf")   

print "pdf made"

This appears to work fine (no errors on the console); however, when I examine the metadata of the new file, it is as follows:

PdfID0: 242b7e252f1d3fdd89b35751b3f72d3
PdfID1: 242b7e252f1d3fdd89b35751b3f72d3
NumberOfPages: 4

The original file had the following metadata:

InfoKey: Creator
InfoValue: PScript5.dll Version 5.2.2
InfoKey: Title
InfoValue: Microsoft Word - PROGRESS  ON  THE  GABION  HOUSE Compressed.doc
InfoKey: Producer
InfoValue: GPL Ghostscript 8.15
InfoKey: Author
InfoValue: PWK
InfoKey: ModDate
InfoValue: D:20101021193627-05'00'
InfoKey: CreationDate
InfoValue: D:20101008152350Z
PdfID0: d5fd6d3960122ba72117db6c4d46cefa
PdfID1: 24bade63285c641b11a8248ada9f19
NumberOfPages: 4

So there are two problems: the new metadata is not being added, and the previous metadata is being cleared. What do I need to do to get this to work? My objective is to append metadata that reference management systems can import.

Recommended answer

Mark is on the right track, but there are a few peculiarities that should be accounted for.

First, he is correct that pdfdoc.documentAttributes() returns an NSDictionary containing the document metadata. You want to modify that dictionary, but an NSDictionary is immutable, so you have to convert it to an NSMutableDictionary first:

attrs = NSMutableDictionary.alloc().initWithDictionary_(pdfdoc.documentAttributes())

Now you can modify attrs as you did. There is no need to write PDFDocument.PDFDocumentTitleAttribute as Mark suggested; that won't work. PDFDocumentTitleAttribute is declared as a module-level constant, so just use it as you did in your own code.
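The copy-then-modify idea can be tried without PyObjC. The sketch below models the attributes with plain Python dicts; merge_attributes and the literal keys are hypothetical stand-ins, not PDFKit names. It assumes that setting the document attributes replaces the whole dictionary, which is what the output in the question suggests: a dictionary built from scratch carries none of the old entries, while a copy of the existing one keeps them.

```python
def merge_attributes(existing, updates):
    """Return a new dict holding the `existing` entries overridden by `updates`."""
    merged = dict(existing or {})  # copy, so the original mapping is left untouched
    merged.update(updates)         # new values win over old ones with the same key
    return merged


# Hypothetical sample values modelled on the metadata shown in the question.
original = {
    "Title": "Microsoft Word - old title",
    "Author": "PWK",
    "Producer": "GPL Ghostscript 8.15",
}
updates = {
    "Title": "THIS IS THE TITLE",
    "Author": "A. Author and B. Author",
}

merged = merge_attributes(original, updates)
```

Here merged carries the new Title and Author and still has the Producer entry, whereas updates on its own has no Producer at all.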

Here is the full code that works for me:

from Foundation import *
from Quartz import *

url = NSURL.fileURLWithPath_("test.pdf")
pdfdoc = PDFDocument.alloc().initWithURL_(url)

attrs = NSMutableDictionary.alloc().initWithDictionary_(pdfdoc.documentAttributes())
attrs[PDFDocumentTitleAttribute] = "THIS IS THE TITLE"
attrs[PDFDocumentAuthorAttribute] = "A. Author and B. Author"

pdfdoc.setDocumentAttributes_(attrs)
pdfdoc.writeToFile_("mynewfile.pdf")
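
The same steps can also be wrapped in a function with some basic error checks. This is only a sketch, assuming PyObjC with the Quartz bindings on macOS; the function name set_pdf_metadata and its parameters are hypothetical, and the imports are done inside the function so that the definition itself loads on machines without PyObjC.

```python
def set_pdf_metadata(src_path, dst_path, title, author):
    """Write src_path to dst_path with Title and Author set, keeping other metadata."""
    # Imported lazily: these modules are only available where PyObjC is installed.
    from Foundation import NSURL, NSMutableDictionary
    from Quartz import (PDFDocument, PDFDocumentTitleAttribute,
                        PDFDocumentAuthorAttribute)

    url = NSURL.fileURLWithPath_(src_path)
    pdfdoc = PDFDocument.alloc().initWithURL_(url)
    if pdfdoc is None:
        raise IOError("failed to open %s" % src_path)

    # Start from a mutable copy of the existing attributes (there may be none).
    attrs = NSMutableDictionary.dictionary()
    existing = pdfdoc.documentAttributes()
    if existing is not None:
        attrs.addEntriesFromDictionary_(existing)

    attrs[PDFDocumentTitleAttribute] = title
    attrs[PDFDocumentAuthorAttribute] = author

    pdfdoc.setDocumentAttributes_(attrs)
    if not pdfdoc.writeToFile_(dst_path):
        raise IOError("failed to write %s" % dst_path)

    # Return the attributes as the document now reports them, for inspection.
    return pdfdoc.documentAttributes()
```

It would be called as set_pdf_metadata("test.pdf", "mynewfile.pdf", "THIS IS THE TITLE", "A. Author and B. Author"). Checking the result of writeToFile_ makes a failed write visible instead of passing silently.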

