如何保持关系数据库中大字符串字段的编辑历史

本文介绍了如何保持关系数据库中大字符串字段的编辑历史的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

N.B.我认为答案可能是设计为重点的，因此基本上是实现无关的，但如果有一些特别适合的解决方案使用这些技术，我正在使用Postgres的Java + Hibernate。

我有一个具有特定字段的表格，其中包含大字符串，我们假设博客文章平均为+10000个字符。

在我的应用程序中，您可以根据需要编辑博客帖子，并且最新版本将始终在更新后立即显示。但是，应用需要保留这些修改的完整版本历史记录，以便查看。

一个明显的策略是保留一个单独的表，例如 blog_post_history ，其中博客帖子行在创建和每个后续更新到主'live' blog_post 表时重复插入，增加的版本号，因此，如果将来需要，这些版本都可用。我正在考虑使用像Hibernate Envers这样的东西来设置它。

然而，它存储（和更重要的是传输）字符块的文本，每个之间的唯一的区别可能是固定拼写错误，添加一些单词等。由于编辑的博客帖子的性质，可能会有很多小的增量变化像这样，而不是更少，更大

我在想什么？当在编辑时仅存储当前版本和先前版本之间的差异，然后在请求时可以在客户端上以编程方式从这些增量重新构建版本历史，使得通过线路发送的数据最小化。

我最有可能将最新版本存储为全文，因为我想优化以请求最频繁，然后存储一连串的三角洲

解决方案

我不需要重新编辑当前版本的

t回复关于存储差异或完全更改，即使在我看来，只是性能测试可以实际回复什么解决方案是更好，因为内容的完整日志意味着更大的数据库，但更少的服务器工作。

我想分享的相反，我的经验与postgresql保持历史。
我做的非常成功的工作在服务器上的网站，只是postgresql没有写任何代码。
在Postgresql上使用这组函数，触发器和扩展

它们简单易于实现，您可以忘记代码的历史记录，但只需从日志表中读取即可

所以我的应用程序是使用YII框架编写的，我使用db方案和结构设计的数据结构，只有少数表作为框架本身的服务（用户，角色和一般日志），这很重要，因为如果数据库中的数据结构太复杂，下面总结的方法仍然有效，但更复杂。

安装postgresql扩展tablelog你在这里找到

您可以这样操作：
首先，您必须选择表（mytable）与您需要保留历史记录的内容。
你复制这个mytable（我做了一个新的模式log.mytable）添加一些新的列来跟踪历史（如描述成README到tablelog扩展）。

您必须在pgplsql中的postgresl上创建一些简单函数
现在您必须在mytable上创建一个触发器

CREATE TRIGGER mytable_trg更新或插入或删除mytable
for每个行执行过程table_log（'log.mytable'）;这就是说，在每个INSERT，UPDATE或DELETE，你将跟踪历史，你可以很容易地恢复旧版本与您创建的功能之前因此在您的应用程序代码中只是执行和SQL调用函数本身。

在我的应用程序中，我添加了一个历史图标，在需要的地方，

在表单创建过程中，从log.mytable中选择内容，您可以在表单中选择内容，在我看来，一个函数，提取与所有版本的差异与当前，但它很容易，如果你存储的每个版本的数据库的完整内容，因为相反，可能很难恢复一个版本附近的最后。事实上，如果你保持差异，考虑他们与下一个不与当前的比较。

另一个优点是，所有是服务器端，没有延迟写入额外的数据

下面提供的差异功能也可以是一个pgplsql函数，以避免以这种方式向客户端发送所有版本有时可能很大的内容，但这必须取决于内容的类型，很容易为html的文本和其他类型的内容更复杂。

我的应用程序是相当复杂但是保持这种变化的历史已经做了一个简单和干净，我忘了它的工作后，因为它总是工作顺利。

Luca

N.B. I think answers are likely to be design-focused and therefore basically implementation agnostic, but I'm using Java+Hibernate with Postgres if there's some particularly well-suited solution using those technologies.
I have a table with a particular field which will hold large strings, let's say blog posts which on average are +10000 characters.
In my app, you can edit blog posts as many times as you like, and the latest version will always be displayed instantly after an update. However, the app needs to keep a full version history of these edits, so they can be viewed.
An obvious strategy is to keep a separate table, something like blog_post_history, where blog post rows are inserted in duplicate on creation and each subsequent update to the main 'live' blog_post table, with an incrementing version number, so these versions are all available if needed in future. I was considering using something like Hibernate Envers to set this up.
However it seems remarkably inefficient to store (and - maybe more importantly - transmit), multiple versions of a 10000 character block of text where the only difference between each one might be fixing typos, adding a few words, etc. Due to the nature of edits to blog posts, there are likely to be many small incremental changes like this, rather than fewer, larger changes.
Is there a better way?
I'm thinking something along the lines of storing only deltas between the current and previous version when an edit is made, and then reconstructing the version history from these deltas programatically when it's requested, perhaps on the client so the data sent over the wire is minimised.
I would most likely store the latest version as full-text, as I'd want to optimise for requesting this most frequently, then store a chain of deltas going backwards from the current version to reconstruct historical versions, as and when they are requested.
解决方案
I don't reply about storing diff or full changes even if in my opinion just a performance test can actually reply for what solution is better because full log of content means bigger database but less work for server.
I want to share on the contrary, my experience for keep history with postgresql.I did it very successfully working on server site, just on postgresql without write any code out of it.Using this set of functions, triggers and extension on Postgresql
http://andreas.scherbaum.la/blog/archives/100-Log-Table-Changes-in-PostgreSQL-with-tablelog.html
They are simple and easy to implement and you can forget of history on your code but just you can read from the log table to present difference on the content.
So my application was written in php with YII framework with db schemes and structure designed by me for data, with just few table as service for framework itself (users, roles and general log) and this is important because if the data structure in the db is too complicated the approach summarised below is still valid but more complicated.
After installed the postgresql extension tablelog you find herehttp://pgfoundry.org/projects/tablelog/
You can proceed this way:First you must select the table (mytable) with the content you need to keep history of.You duplicate this mytable (I did into a new schema log.mytable) adding some new columns to keep track of history (as describe into README into tablelog extension).
You must create some simple functions on postgresl in pgplsql
Now you must create a trigger on your mytable as
That's all, on every INSERT, UPDATE or DELETE you will keep track of history and you can easily restore old versions with the function you create before and so in your app code just executing and SQL calling the function itself.
In my app I added an icon for History in several point where needed and with a click I open a dialog with form and a options in table to present all history and select the version you could restore.
In the form creation, selecting content form log.mytable, you could place, in my opinion, a function that extract the difference from all version with the current but it's easy if you store the full content for every version in the db because on contrary it could be difficult to restore a version near to the last. In fact if you keep differences consider that they are compared with the next not with the current.
Another advantage it that all is server side and no delay for writing extra data could be sensed on client side.
The function for presenting just the difference mentioned below could also be a pgplsql function to avoiding, this way, to send to client all version in full content that could be big sometimes but this must depend on type of content, easily for text less for html and more complex for other type of content.
My app was quite complex but keeping the history for changes this way has done a easy and clean and I forgot of it after done because always it worked smoothly.
Luca

这篇关于如何保持关系数据库中大字符串字段的编辑历史的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！