Notes on the Calculation Direction of Expected Value DP
Overview
In explanations of Expected Value DP, I often see calculations being performed starting from the final state of state transitions, which seems to be in the reverse direction of the transitions. I have also heard that "while calculation from the initial state is sometimes possible, the implementation becomes complex."
Originally, I felt there was a slight gap between the update formula for Expected Value DP and the simple definition of expected value, and I wondered, "I'm formulating this based on intuition, but does it really hold true?"
Therefore, I will study the derivation of the update formula for Expected Value DP and the differences depending on the direction of calculation.
Conclusion
- I was able to transform the simple definition of expected value into the update formula for Expected Value DP.
- It seems more logical to calculate Expected Value DP from the final state, moving in the opposite direction of state transitions.
- The reason for this is the difference in the complexity of probability calculation between the transition destination and the transition source.
- States that a given state transitions to (the set of transition destinations):
  - The sum of their transition probabilities is 1.
- States that transition into a given state (the set of transition sources):
  - The transition probability itself likely needs to be calculated with a separate probability DP.
- Since a probabilistic state transition means "moving from one source state to one of several possible destination states," probability calculations tend to be simpler if we fix the source state and group the destination states for each source.
Definitions
First, I will define notation for probability and state transitions based on graphs.
Basics
- $V$: Set of states
- $E \subseteq V \times V$: Set of directed edges. $(u, v) \in E$ represents the existence of a transition from state $u$ to $v$.
- $w(u, v) \in \R$: Cost incurred when transitioning from $u$ to $v$. Assume $(u, v) \in E$.
- $P(u, v) \in [0, 1]$: Probability of transitioning from $u$ to $v$. Assume $(u, v) \in E$.
- $s \in V$: Initial state
- $t \in V$: Final state (for simplicity, assumed to be a single state)
Extension to Paths
Extend the notation above to paths.

- $\rho = v_0 v_1 \ldots v_k$: A path. Assume $(v_i, v_{i + 1}) \in E$ holds for all $0 \le i < k$.
  - Let $\rho_i := v_i$.
  - Let $|\rho| := k$.
- $w(\rho) \in \R$: Cost incurred when transitioning along path $\rho$ from state $\rho_0$. $w(\rho) = \sum_{i = 0}^{|\rho| - 1} w(\rho_i, \rho_{i + 1})$
- $P(\rho) \in [0, 1]$: Probability of transitioning along path $\rho$ from state $\rho_0$. $P(\rho) = \prod_{i = 0}^{|\rho| - 1} P(\rho_i, \rho_{i + 1})$
  - (Is it necessary to be a Markov chain?)
Others
To handle sets of vertices before and after transitions, define the following:
- $\delta^+(v) = \left \{ u \in V \mid (v, u) \in E \right \}$: Set of states $u$ that can be reached from $v$ in one step.
- $\delta^-(v) = \left \{ u \in V \mid (u, v) \in E \right \}$: Set of states $u$ that can reach $v$ in one step.
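The definitions above can be sketched in code. This is a minimal illustration on a hypothetical toy graph (the 4-state graph and its edge probabilities/costs are my own example, not from the problem setting); edges map $(u, v)$ to the pair $(P(u, v), w(u, v))$.

```python
# Hypothetical toy graph: from state 0 we move to 1 or 2, then to the
# final state 3. Edge data: (u, v) -> (P(u, v), w(u, v)).
edges = {
    (0, 1): (0.5, 2.0),
    (0, 2): (0.5, 1.0),
    (1, 3): (1.0, 1.0),
    (2, 3): (1.0, 3.0),
}

def delta_plus(v):
    """delta^+(v): states reachable from v in one step."""
    return {u for (a, u) in edges if a == v}

def delta_minus(v):
    """delta^-(v): states that can reach v in one step."""
    return {u for (u, b) in edges if b == v}

def path_cost(rho):
    """w(rho): sum of edge costs along the path rho (a list of states)."""
    return sum(edges[(rho[i], rho[i + 1])][1] for i in range(len(rho) - 1))

def path_prob(rho):
    """P(rho): product of edge probabilities along the path rho."""
    p = 1.0
    for i in range(len(rho) - 1):
        p *= edges[(rho[i], rho[i + 1])][0]
    return p
```

For example, `delta_plus(0)` is `{1, 2}` and `path_prob([0, 1, 3])` is `0.5`.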
Note
In this article, I am considering relatively simple problem settings for Expected Value DP.
The following are points to note:

- This covers cases where expected values for source/destination states can be obtained in topological sort order.
- Cases with complex loops are outside the scope. I might add information about this in the future.
  - (Example: https://atcoder.jp/contests/abc350/tasks/abc350_e)
DP Calculation from the Final State
We consider an approach where the DP calculation is performed from the final state, which is the reverse direction of the state transitions.
(Example:
Definitions
- $\mathcal P(v) \subseteq V^*$: The set of all paths from $v \in V$ to the final state $t$. $\mathcal P(v) = \left \{ \rho \in V^* \mid \rho_0 = v \land \rho_{|\rho|} = t \right \}$
- $E[v] \in \R$: The expected cost to reach the final state $t$ from $v \in V$. $E[v] = \sum_{\rho \in \mathcal P(v)} P(\rho) \cdot w(\rho)$
Expected Value Calculation
The expected cost of transitioning from the initial state $s$ to the final state $t$ is $E[s]$.
Below, we start from the simple definition of expected value and transform it into the familiar form used in Expected Value DP.

If $v = t$, the only path in $\mathcal P(v)$ is the empty path, so $E[v] = 0$.

If $v \ne t$, every path $\rho \in \mathcal P(v)$ decomposes into its first edge $(v, u)$ for some $u \in \delta^+(v)$ and a remaining path $\rho' \in \mathcal P(u)$.

Now, let's perform the formula transformation.
Considering that $P(\rho) = P(v, u) \cdot P(\rho')$ and $w(\rho) = w(v, u) + w(\rho')$,

$$
\begin{aligned}
E[v] &= \sum_{\rho \in \mathcal P(v)} P(\rho) \cdot w(\rho) \\
&= \sum_{u \in \delta^+(v)} \sum_{\rho' \in \mathcal P(u)} P(v, u) \cdot P(\rho') \cdot \left( w(v, u) + w(\rho') \right) \\
&= \sum_{u \in \delta^+(v)} P(v, u) \left( w(v, u) \sum_{\rho' \in \mathcal P(u)} P(\rho') + \sum_{\rho' \in \mathcal P(u)} P(\rho') \cdot w(\rho') \right) \\
&= \sum_{u \in \delta^+(v)} P(v, u) \left( w(v, u) \sum_{\rho' \in \mathcal P(u)} P(\rho') + E[u] \right)
\end{aligned}
$$
What Always Holds True
The sum of transition probabilities to all states reachable from $v$ in one step is always 1, i.e., $\sum_{u \in \delta^+(v)} P(v, u) = 1$, because the transitions out of a single state exhaust all possibilities.

In the following, we will further transform the formula for $E[v]$ under additional assumptions.
1. When the final state is always reachable from any state
Cases where a "state from which the final state cannot be reached" is never encountered during transitions.
- (Is the expected value even defined if this doesn't hold?)
In this case, the sum of probabilities for all paths from $u$ to the final state is 1: $\sum_{\rho' \in \mathcal P(u)} P(\rho') = 1$.

Therefore,

$$
E[v] = \sum_{u \in \delta^+(v)} P(v, u) \left( w(v, u) + E[u] \right)
$$
2. When the cost of any transition is 1
Cases where the cost for a single transition is always 1, such as the "number of times" a die is rolled.
Therefore, substituting $w(v, u) = 1$ for every edge,

$$
E[v] = \sum_{u \in \delta^+(v)} P(v, u) \left( \sum_{\rho' \in \mathcal P(u)} P(\rho') + E[u] \right)
$$
When both 1 and 2 hold

$$
E[v] = \sum_{u \in \delta^+(v)} P(v, u) \left( 1 + E[u] \right) = 1 + \sum_{u \in \delta^+(v)} P(v, u) \cdot E[u]
$$

This has become the form commonly seen in typical problems.
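The backward-direction update can be sketched as follows. This is a minimal sketch, assuming case 1 above (the final state $t$ is reachable from every state) and a loop-free graph processed in reverse topological order; the adjacency format `adj[v] = [(u, P(v, u), w(v, u)), ...]`, the function name, and the toy graph are my own assumptions for illustration.

```python
def expected_cost_backward(adj, order, t):
    """E[v] = sum over u in delta^+(v) of P(v, u) * (w(v, u) + E[u]),
    computed in reverse topological order, with E[t] = 0."""
    E = {v: 0.0 for v in order}
    for v in reversed(order):
        if v == t:
            E[v] = 0.0
            continue
        E[v] = sum(p * (w + E[u]) for (u, p, w) in adj[v])
    return E

# Hypothetical toy graph: 0 -> {1, 2} -> 3, with 3 as the final state.
adj = {
    0: [(1, 0.5, 2.0), (2, 0.5, 1.0)],
    1: [(3, 1.0, 1.0)],
    2: [(3, 1.0, 3.0)],
    3: [],
}
E = expected_cost_backward(adj, [0, 1, 2, 3], 3)
# E[0] = 0.5 * (2 + 1) + 0.5 * (1 + 3) = 3.5
```

Note that because probabilities out of each state sum to 1, no extra bookkeeping is needed: each state's value depends only on the already-computed values of its successors.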
DP Calculation from the Initial State
We consider an approach where the DP calculation is performed from the initial state, following the same direction as the state transitions.
(Example:
Definitions
- $\mathcal P(v) \subseteq V^*$: The set of all paths from the initial state $s$ to $v \in V$. $\mathcal P(v) = \left \{ \rho \in V^* \mid \rho_0 = s \land \rho_{|\rho|} = v \right \}$
- $E[v] \in \R$: The expected cost to reach $v \in V$ from the initial state $s$. $E[v] = \sum_{\rho \in \mathcal P(v)} P(\rho) \cdot w(\rho)$
Expected Value Calculation
The expected cost of transitioning from the initial state $s$ to the final state $t$ is $E[t]$.

If $u = s$, the only path in $\mathcal P(u)$ is the empty path, so $E[u] = 0$.

If $u \ne s$, every path $\rho \in \mathcal P(u)$ decomposes into a prefix $\rho' \in \mathcal P(v)$ for some $v \in \delta^-(u)$ and the final edge $(v, u)$.

Now, let's perform the formula transformation.
Considering that $P(\rho) = P(\rho') \cdot P(v, u)$ and $w(\rho) = w(\rho') + w(v, u)$,

$$
\begin{aligned}
E[u] &= \sum_{\rho \in \mathcal P(u)} P(\rho) \cdot w(\rho) \\
&= \sum_{v \in \delta^-(u)} \sum_{\rho' \in \mathcal P(v)} P(\rho') \cdot P(v, u) \cdot \left( w(\rho') + w(v, u) \right) \\
&= \sum_{v \in \delta^-(u)} P(v, u) \left( E[v] + w(v, u) \sum_{\rho' \in \mathcal P(v)} P(\rho') \right)
\end{aligned}
$$
Differences from "DP Calculation from the Final State"
1. $\sum_{v \in \delta^-(u)} P(v, u)$ is not necessarily 1.

In the case of "DP Calculation from the Final State," the analogous term $\sum_{u \in \delta^+(v)} P(v, u)$ appeared, and it is always 1 because it sums the probabilities of all transitions leaving a single state.

On the other hand, $\sum_{v \in \delta^-(u)} P(v, u)$ sums the probabilities of edges leaving several different source states, so there is no constraint forcing it to equal 1.
2. $\sum_{\rho' \in \mathcal P(v)} P(\rho')$ is not necessarily 1.

This is the sum of probabilities of all paths leading from the initial state $s$ to $v$, i.e., the probability of ever visiting $v$. In general this is less than 1, and it must be computed separately.
Reasons Why Implementation Becomes Complex
Unlike calculating in the reverse direction from the final state, there are terms that cannot be inherently simplified, so those terms must be calculated alongside the expected value calculation.
Specifically, the term $\sum_{\rho' \in \mathcal P(v)} P(\rho')$ (the probability of reaching state $v$) must be computed with a separate probability DP that runs alongside the expected value calculation.
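The forward-direction calculation with its companion probability DP can be sketched as follows. This is a minimal sketch under my own assumptions: a loop-free graph in topological order, adjacency format `adj[u] = [(v, P(u, v), w(u, v)), ...]`, and a hypothetical toy graph. `R[v]` carries the reach probability $\sum_{\rho' \in \mathcal P(v)} P(\rho')$ that the backward direction never needed.

```python
def expected_cost_forward(adj, order, s):
    """Forward DP: E[v] = sum over u in delta^-(v) of
    P(u, v) * (E[u] + w(u, v) * R[u]), where R[v] is the probability
    of ever reaching v from s (the companion probability DP)."""
    # Build the reverse adjacency: v -> list of (u, P(u, v), w(u, v)).
    radj = {v: [] for v in order}
    for u in order:
        for (v, p, w) in adj[u]:
            radj[v].append((u, p, w))
    R = {v: 0.0 for v in order}  # reach probability from s
    E = {v: 0.0 for v in order}  # sum over paths of P(rho) * w(rho)
    R[s] = 1.0
    for v in order:
        if v == s:
            continue
        R[v] = sum(p * R[u] for (u, p, _) in radj[v])
        E[v] = sum(p * (E[u] + w * R[u]) for (u, p, w) in radj[v])
    return E, R

# Hypothetical toy graph: 0 -> {1, 2} -> 3, with 0 as the initial state.
adj = {
    0: [(1, 0.5, 2.0), (2, 0.5, 1.0)],
    1: [(3, 1.0, 1.0)],
    2: [(3, 1.0, 3.0)],
    3: [],
}
E, R = expected_cost_forward(adj, [0, 1, 2, 3], 0)
# R[3] = 1.0 and E[3] = 3.5, matching the backward-direction result.
```

The two DP tables must be updated together, which is exactly the extra implementation burden the text describes.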
Discussion