Reentrant caching of a "referentially transparent" IO call

Problem description

Assume we have an IO action such as

lookupStuff :: InputType -> IO OutputType

which could be something simple such as a DNS lookup, or some web-service call against time-invariant data.

Let's further assume that:

The operation never throws any exception and/or never diverges

If not for IO, the function would be pure, i.e. equal inputs always give the same result.

The action is reentrant, i.e. it can be called from multiple threads at the same time safely.

The lookupStuff operation is quite (time-)expensive.

The problem I'm facing is how to properly (and w/o using any unsafe*IO* cheat) implement a reentrant cache, that can be called from multiple threads, and coalesces multiple queries for the same input-parameters into a single request.

I guess I'm after something similar to GHC's blackhole concept for pure computations, but in the IO "calculation" context.

What is the idiomatic Haskell/GHC solution for the stated problem?

Recommended answer

Yeah, basically reimplement the logic. Although it seems similar to what GHC is already doing, that's GHC's choice. Haskell can be implemented on VMs that work very differently, so in that sense it isn't already done for you.

But yeah, just use an MVar (Map InputType OutputType) or even an IORef (Map InputType OutputType) (make sure to modify it with atomicModifyIORef), and just store the cache in there. If this imperative solution seems wrong, the cause is the "if not for the IO, this function would be pure" constraint. If it were just an arbitrary IO action, then the idea that you would have to keep state in order to know what to execute or not would seem perfectly natural. The problem is that Haskell does not have a type for "pure IO" (which, if it depends on a database, is just behaving purely under certain conditions, which is not the same as being hereditarily pure).
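For the IORef option mentioned above, a minimal sketch might look like the following. The name cacheIORef is hypothetical, and the sketch assumes that f never throws. Unlike a lock-based cache, it holds nothing while f runs, so two threads that miss on the same key at the same moment will both call f; it caches results but does not coalesce concurrent requests.

```haskell
import Data.IORef (newIORef, readIORef, atomicModifyIORef')
import qualified Data.Map as Map

-- Hypothetical name: an IORef-backed counterpart of an MVar cache.
-- Assumes f never throws, as stated in the question.
cacheIORef :: Ord a => (a -> IO b) -> IO (a -> IO b)
cacheIORef f = do
    ref <- newIORef Map.empty
    return $ \x -> do
        cacheMap <- readIORef ref
        case Map.lookup x cacheMap of
            Just y  -> return y
            Nothing -> do
                -- No lock is held here: concurrent misses on the same
                -- key each run f x, and the last insert wins.
                y <- f x
                atomicModifyIORef' ref (\m -> (Map.insert x y m, ()))
                return y
```

Because the function is assumed to return the same result for equal inputs, the duplicated work on a concurrent miss costs time but not correctness. The MVar-based version from the answer follows.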

import qualified Data.Map as Map
import Control.Concurrent.MVar

-- takes an IO function and returns a cached version
cache :: (Ord a) => (a -> IO b) -> IO (a -> IO b)
cache f = do
    r <- newMVar Map.empty
    return $ \x -> do
        -- taking the MVar locks the cache until it is put back
        cacheMap <- takeMVar r
        case Map.lookup x cacheMap of
            Just y -> do
                -- hit: restore the map unchanged
                putMVar r cacheMap
                return y
            Nothing -> do
                -- miss: run the action while holding the lock, then store the result
                y <- f x
                putMVar r (Map.insert x y cacheMap)
                return y

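Yeah, it's ugly on the inside. But on the outside, look at that! It's just like the type of a pure memoization function, except that it has IO stained all over it.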
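The cache function above keeps the MVar taken while f runs, so lookups for unrelated keys wait as well. A possible refinement, sketched here under the question's assumption that f never throws or diverges, stores one MVar per key: the first caller for a key runs f, and later callers for the same key block on that key's slot only. The name cacheCoalescing is hypothetical and this is not part of the original answer.

```haskell
import Control.Concurrent.MVar
import qualified Data.Map as Map

-- Hypothetical name: coalesces concurrent lookups of the same key.
-- Assumes f never throws; if it did, waiters on that key would block forever.
cacheCoalescing :: Ord a => (a -> IO b) -> IO (a -> IO b)
cacheCoalescing f = do
    r <- newMVar Map.empty
    return $ \x -> do
        -- Hold the outer lock only long enough to find or create the slot.
        (slot, isOwner) <- modifyMVar r $ \m ->
            case Map.lookup x m of
                Just s  -> return (m, (s, False))
                Nothing -> do
                    s <- newEmptyMVar
                    return (Map.insert x s m, (s, True))
        if isOwner
            then do
                y <- f x          -- runs outside the outer lock
                putMVar slot y    -- fills the slot for this key
                return y
            else readMVar slot    -- blocks until the owner has filled the slot
```

The outside type is the same as that of cache, so it can be swapped in without changing callers. The trade-off is one extra MVar per cached key.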
