# Convergence Rate of Value Iteration

I will answer on the basis of one particular class of MDPs. A few recent articles extensively discuss the use of MDPs for search result diversification and introduce a model known as MDP-DIV (Markov Decision Process for Diversifying); for that I refer you to Xia et al. [1].

As you pointed out, the main causes of slow convergence are identified as data scarcity (sparse transition matrices) and a large action space.
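To see why both factors hurt, consider plain value iteration: each sweep touches every (state, action, next-state) triple, so its per-iteration cost grows directly with the action space, and the transition model itself must be estimated from (possibly scarce) data. A minimal sketch on a toy random MDP (my own illustration, not taken from either paper):

```python
import numpy as np

# Toy illustration: value iteration on a random MDP. Each sweep costs
# O(|S|^2 * |A|), so a large action space slows every iteration, and a
# transition model estimated from sparse data makes P noisy.

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 20, 50, 0.9

# Random transition tensor P[a, s, s'] (rows sum to 1) and reward R[s, a]
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
R = rng.random((n_states, n_actions))

V = np.zeros(n_states)
for it in range(1000):
    # Q[s, a] = R[s, a] + gamma * sum_s' P[a, s, s'] * V[s']
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new
```

Because value iteration is a gamma-contraction, the number of sweeps to a fixed tolerance is governed by the discount factor, but the cost of each sweep is where a large action space bites.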

Two new approaches to accelerating the convergence rate are detailed in Liu et al. [2]. I will quote the relevant parts of the abstract.

> The MDP-DIV-kNN method adopts a $k$ nearest neighbor strategy, i.e., discarding the $k$ nearest neighbors of the recently-selected action (document), to reduce the diversification searching space.
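As I read the quoted strategy, the point is that pruning near-duplicates of an already-chosen document shrinks the candidate set at every ranking step. A hypothetical sketch of that idea (the embeddings, relevance scores, and greedy selection rule here are my own stand-ins, not the paper's actual model):

```python
import numpy as np

# Hypothetical sketch of the k-nearest-neighbour pruning idea: after an
# action (document) is selected, drop its k nearest neighbours in
# embedding space from the candidate set, shrinking the search space.

def knn_pruned_ranking(embeddings, relevance, k=2, length=3):
    candidates = set(range(len(embeddings)))
    ranking = []
    while candidates and len(ranking) < length:
        # Greedy stand-in policy: pick the most relevant remaining document.
        chosen = max(candidates, key=lambda i: relevance[i])
        ranking.append(chosen)
        candidates.discard(chosen)
        # Discard the k nearest neighbours of the chosen document.
        dists = np.linalg.norm(embeddings - embeddings[chosen], axis=1)
        nearest = [i for i in np.argsort(dists) if i in candidates]
        for i in nearest[:k]:
            candidates.discard(i)
    return ranking

rng = np.random.default_rng(1)
docs = rng.random((10, 8))   # 10 documents, 8-dim toy embeddings
rel = rng.random(10)         # stand-in relevance scores
print(knn_pruned_ranking(docs, rel))
```

Each step removes $k + 1$ documents from the candidate set, so the effective action space shrinks much faster than under exhaustive search.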

> The MDP-DIV-NTN employs a pre-trained diversification neural tensor network (NTN-DIV) as the evaluation model, and combines the results with the MDP to produce the final ranking solution.
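The abstract does not spell out how the pre-trained evaluator's scores and the MDP's results are combined; purely as an illustration, one simple scheme is a convex combination of the two score lists (the `alpha` weight and the toy scores below are made up, not the paper's method):

```python
# Illustrative only: blend a pre-trained evaluation model's scores with
# MDP-derived scores via a convex combination, then rank by the blend.

def combine_scores(ntn_scores, mdp_scores, alpha=0.5):
    # alpha = 1.0 trusts the evaluator alone; alpha = 0.0 the MDP alone.
    return [alpha * n + (1 - alpha) * m
            for n, m in zip(ntn_scores, mdp_scores)]

combined = combine_scores([0.9, 0.2, 0.5], [0.4, 0.8, 0.6], alpha=0.5)
ranking = sorted(range(len(combined)), key=lambda i: combined[i],
                 reverse=True)
print(ranking)  # prints [0, 2, 1]
```

The appeal of using a pre-trained evaluation model is that it moves part of the scoring work offline, so the online MDP search has less to learn from scratch.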

> The experiment results demonstrate that the two proposed methods indeed accelerate the convergence rate of the MDP-DIV, by about 3×, while the accuracies produced barely degrade, or are even better.

## References

[1] Xia, L., Xu, J., Lan, Y., Guo, J., Zeng, W., Cheng, X. (2017). Adapting Markov Decision Process for Search Result Diversification. In Proceedings of SIGIR '17, pp. 535–544. ACM, New York, NY, USA.

[2] Liu, F., Tang, R., Li, X., Ye, Y., Guo, H., He, X. (2018). Novel Approaches to Accelerating the Convergence Rate of Markov Decision Process for Search Result Diversification. https://arxiv.org/pdf/1802.08401.pdf.