功能編程和狀態算法


12

我正在與 Haskell 學習函數式編程。同時,我正在研究自動機理論,由於兩者看起來很融洽,所以我正在編寫一個小型庫來玩自動機。

這是讓我問這個問題的問題。在研究評估狀態可達性的方法時,我想到了一個簡單的遞歸算法效率很低,因為某些路徑可能共享某些狀態,而我可能最終不只一次評估它們。

例如,在這裡,從 a 中評估 g 可達性,我必須排除 f 同時檢查通過 d c 的路徑:

digraph representing an automaton

所以我的想法是,在許多路徑上並行工作並更新排除狀態的共享記錄的算法可能很棒,但這對我來說太過分了。

我已經看到,在一些簡單的遞歸情況下,可以將狀態作為參數傳遞,這就是我在這裡要做的,因為我向前傳遞了要避免循環的狀態列表。但是,是否有一種方法也可以向後傳遞該列表,例如將其與canReach函數的布爾結果一起返回到元組中?(儘管這有點強迫)

除了示例案例的有效性之外,還有哪些其他技術可以解決此類問題?我覺得這些必須足夠通用,因此必須有像fold*map一樣的解決方案。

到目前為止,讀learnyouahaskell.com並沒有發現任何東西,但考慮到我還沒有接觸過monad。

如果有興趣,我將代碼發佈在codereview

2

Here's a simple answer relying on mapConcat.

 mapConcat :: (a -> [b]) -> [a] -> [b]
 -- mapConcat is in the std libs, mapConcat = concat . map
 type Path = []

 isReachable :: a -> Auto a -> a -> [Path a]
 isReachable to auto from | to == from = [[]]
 isReachable to auto from | otherwise = 
    map (from:) . mapConcat (isReachable to auto) $ neighbors auto from

Where neighbors returns the states immediately connected to a state. This returns a series of paths.


16

Functional programming doesn't get rid of state. It only makes it explicit! While it's true that functions like map will often "unravel" a "shared" data structure, if all you want to do is write a reachability algorithm then it's just a matter of keeping track of what nodes you already visited:

import qualified Data.Set as S
data Node = Node Int [Node] deriving (Show)

-- Receives a root node, returns a list of the node keyss visited in a depth-first search
dfs :: Node -> [Int]
dfs x = fst (dfs' (x, S.empty))

-- This worker function keeps track of a set of already-visited nodes to ignore.
dfs' :: (Node, S.Set Int) -> ([Int], S.Set Int)
dfs' ([email protected](Node k ns), s )
  | k  `S.member` s = ([], s)
  | otherwise =
    let (childtrees, s') = loopChildren ns (S.insert k s) in
    (k:(concat childtrees), s')

--This function could probably be implemented as just a fold but Im lazy today...
loopChildren :: [Node] -> S.Set Int -> ([[Int]], S.Set Int)
loopChildren []  s = ([], s)
loopChildren (n:ns) s =
  let (xs, s') = dfs' (n, s) in
  let (xss, s'') = loopChildren ns s' in
  (xs:xss, s'')

na = Node 1 [nb, nc, nd]
nb = Node 2 [ne]
nc = Node 3 [ne, nf]
nd = Node 4 [nf]
ne = Node 5 [ng]
nf = Node 6 []
ng = Node 7 []

main = print $ dfs na -- [1,2,5,7,3,6,4]

Now, I must confess that keeping track of all this state by hand is pretty annoying and error prone (it's easy to use s' instead of s'', it's easy to pass the same s' to more than one computation...). This is where monads come in: they don't add anything you couldn't already do before but they let you pass the state variable around implicitly and the interface guarantees that it happens in a single-threaded manner.


Edit: I'll attempt to give a more reasoning of what I did now: first of all, instead of just testing for reachability I coded a depth-first search. The implementation is going to look pretty much the same but the debugging looks a bit better.

In a stateful language, the DFS would look kind of like this:

visited = set()  #mutable state
visitlist = []   #mutable state
def dfs(node):
   if isMember(node, visited):
       //do nothing
   else:
       visited[node.key] = true           
       visitlist.append(node.key)
       for child in node.children:
         dfs(child)

Now we need to find a way to get rid of the mutable state. First of all we get rid of the "visitlist" variable by making dfs return that instead of void:

visited = set()  #mutable state
def dfs(node):
   if isMember(node, visited):
       return []
   else:
       visited[node.key] = true
       return [node.key] + concat(map(dfs, node.children))

And now comes the tricky part: getting rid of the "visited" variable. The basic trick is to use a convention where we pass the state as an additional parameter to functions that need it and have those functions return the new version of the state as an extra return value if they want to modify it.

let increment_state s = s+1 in
let extract_state s = (s, 0) in

let s0 = 0 in
let s1 = increment_state s0 in
let s2 = increment_state s1 in
let (x, s3) = extract_state s2 in
-- and so on...

To apply this pattern to the dfs, we need to change it to receive the "visited" set as an extra parameter and to return the updated version of "visited" as an extra return value. Additionally, we need to rewrite the code so that we are always passing forward the "most recent" version of the "visited" array:

def dfs(node, visited1):
   if isMember(node, visited1):
       return ([], visited1) #return the old state because we dont want to  change it
   else:
       curr_visited = insert(node.key, visited1) #immutable update, with a new variable for the new value
       childtrees = []
       for child in node.children:
          (ct, curr_visited) = dfs(child, curr_visited)
          child_trees.append(ct)
       return ([node.key] + concat(childTrees), curr_visited)

The Haskell version does pretty much what I did here, except that it goes all the way and uses an inner recursive function instead of mutable "curr_visited" and "childtrees" variables.


As for monads, what they basically accomplish is implicitly passing the "curr_visited" around, instead of forcing you to do it by hand. Not only does this remove clutter from the code, but it also prevents you from making mistakes, such as forking state (passing the same "visited" set to two subsequent calls instead of chaining the state).