Question
For two arrays $A\left[n\right]$ and $B\left[n\right]$, find the longest common strictly increasing subsequence.
Solution
To implement the dynamic programming algorithm, it could be divided into two steps:
- State modelling: We need two variables to model the state, inspired from the two problems: longest increasing subsequence and longest common subsequence. Donate the two variables $i$ and $j$, where $\mathrm{dp}\left[i,j\right]$ donates the longest common strictly increasing subsequence (LCIS) of arrays $A\left[1\dots i\right]$ and $B\left[1\dots j\right]$ ending with character number $A\left[i\right]$. The reason to include ending number the common sequence is for the state transition.
-
State transition: To find the value of $\mathrm{dp}\left[i,j\right]$, based on whether $A\left[i\right]\ =\ B\left[j\right]$, there are two circumstances
-
If $A\left[i\right]\ =\ B\left[j\right]$, then one possible state that could transfer to the optimal state is the subsequence ending with index $i$ in array $A$ and $j$ in array $B$. Then the second last element in the LCIS could be from the set $\left\{A\left[k\right]:\ k\ <\ l\ \land\ A\left[k\right]\ <\ A\left[i\right] \right\}$. Because $B\left[j\right]$ forms the LCIS, then the second last element $A\left[k\right]$ could be matched with elements in array $B$ with index smaller than $j$. Therefore, the state transition equation could be derived
$$ \begin{align*}\mathrm{dp}\left[i,j\right]\ =\ \max_{0\le k<i:\ A\left[k\right]<A\left[i\right]}\left\{\mathrm{dp}\left[k,j-1\right]\right\}\ +\ 1\end{align*} $$
-
If $A\left[i\right]\ \neq\ B\left[j\right]$, then $A\left[i\right]$ could not be matched with $B\left[j\right]$, which implies that the LCIS of $A\left[1\dots i\right]$ and $B\left[1\dots j\right]$ is the same as $A\left[1\dots i\right]$ and $B\left[1\dots j-1\right]$: $\mathrm{dp}\left[i,j\right]\ =\ \mathrm{dp}\left[i,j-1\right]$.
-
Implementation
A[0]=B[0]=MINV;
vector<vector<int>>dp(N+1,vector<int>(N+1,0));
dp[0][0]=0;
for(int i=1;i<=N;i++)for(int j=0;j<=N;j++){
if(A[i]==B[j]){
for(int k=0;k<i;k++)if(A[k]<A[i])dp[i][j]=max(dp[i][j],dp[k][j-1]+1);
}
dp[i][j]=max(dp[i][j],dp[i][j-1]);
}
int sol=0;
for(int i=1;i<=N;i++)sol=max(sol,dp[i][N]);
cout<<sol<<endl;
Optimization
The difficulty of this problem is to optimize the above code from $O\left(N^3\right)$ to $O\left(N^2\right)$, by removing the loop when $A\left[i\right]\ =\ B\left[j\right]$: for(int k=0;k<i;k++)if(A[k]<A[i])dp[i][j]=max(dp[i][j],dp[k][j-1]+1);
. At first glance, it seems very hard to optimize when the state transition from $\left(i,\ j\right)\ \to\ \left(i,\ j+1\right)$, Due to the constraint check if(A[k]<A[i])
, it might be necessary maintain a sophisticated data structure (with the potential to exceed the memory limit) to query $\max_{0\le k<i:\ A\left[k\right]<A\left[i\right]}\left\{\mathrm{dp}\left[k,j-1\right]\right\}$ in less than linear time complexity.
To tackle the problem, the first step is to change the outer two loops on $i$ and $j$ from for(int i=1;i<=N;i++)for(int j=0;j<=N;j++)
to for(int j=0;j<=N;j++)for(int i=1;i<=N;i++)
. The advantage brought by this modification is that we enumerate $i$ given fixed $j$, only maximum one element $\mathrm{dp}\left[i,j-1\right]$ is inserted into the decision set $\left\{\mathrm{dp}\left[k,j-1\right]:\ k\ <\ l\ \land\ A\left[k\right]\ <\ A\left[i\right] \right\}$ (first forget about the check).
Notice that when putting $j$ as the outer loop, when enumerating $i$, $B\left[j\right]$ is a constant value and we are testing $A\left[i\right]$ against a constant value $T$. Therefore, instead of doing a constraint check when accessing the element, we could check when inserting the element to the set: adding the element $\mathrm{dp}\left[i_1,j-1\right]$ to the decision set when $A\left[i_1\right]$ is smaller than the value $T$, therefore, when querying the set $A\left[i_2\right]\ =\ T$, all the elements in the decision check satisfy the constraint. Due to the fact that the set is monotonically increasing in size and the only operation applied on the set is querying the maximum value, only one number cache
could used to represent the maximum value of the set and below showing the querying the set and insertion an element to set
if(A[i]==T)dp[i][j]=cache+1; // Query the set
if(A[i-1]<T)cache=max(cache,dp[i][j-1]); // Update the set, no need to preserve all elements in the set.
for(int j=0;j<=N;j++){
int cache=0;
for(int i=1;i<=N;i++){
if(A[i]==B[j])dp[i][j]=max(dp[i][j],cache+1);
dp[i][j]=max(dp[i][j],dp[i][j-1]);
if(A[i]<B[j])cache=max(cache,dp[i][j-1]);
}
}