Two-Pointer Problems: String Compression

Contents

Problem
Solution

Problem

Title and source

Title: String Compression

Source: 443. String Compression

Difficulty

Level 4

Problem description

Requirements

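Given a character array $\texttt{chars}$, compress it using the following algorithm:

Begin with an empty string $\texttt{s}$. For each group of consecutive repeating characters in $\texttt{chars}$:

- If the group's length is $\texttt{1}$, append the character to $\texttt{s}$.
- Otherwise, append the character to $\texttt{s}$, followed by the group's length.

The compressed string $\texttt{s}$ should not be returned directly; it must be stored in the input character array $\texttt{chars}$. Note that a group length of $\texttt{10}$ or more is split into multiple characters in $\texttt{chars}$.

After modifying the input array, return the new length of the array.

The algorithm must use only constant extra space.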

Examples

Example 1:

Input: $\texttt{chars = ["a","a","b","b","c","c","c"]}$

Output: Return $\texttt{6}$, and the first $\texttt{6}$ characters of the input array should be $\texttt{["a","2","b","2","c","3"]}$.

Explanation: The groups in the string are $\texttt{"aa"}$, $\texttt{"bb"}$, and $\texttt{"ccc"}$. The compressed result is $\texttt{"a2b2c3"}$.

Example 2:

Input: $\texttt{chars = ["a"]}$

Output: Return $\texttt{1}$, and the first $\texttt{1}$ character of the input array should be $\texttt{["a"]}$.

Explanation: The only group is $\texttt{"a"}$. Since it contains a single character, it is not compressed.

Example 3:

Input: $\texttt{chars = ["a","b","b","b","b","b","b","b","b","b","b","b","b"]}$

Output: Return $\texttt{4}$, and the first $\texttt{4}$ characters of the input array should be $\texttt{["a","b","1","2"]}$.

Explanation: The groups in the string are $\texttt{"a"}$ and $\texttt{"bbbbbbbbbbbb"}$. The compressed result is $\texttt{"ab12"}$.

Constraints

- $\texttt{1} \le \texttt{chars.length} \le \texttt{2000}$
- $\texttt{chars[i]}$ is a lowercase English letter, an uppercase English letter, a digit, or a symbol.

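Solution

Approach and algorithm

For each group of consecutive repeating characters in the string, let $x$ be the length of the group, and compare the length before and after compression.

- If $x = 1$, the length after compression is unchanged.
- If $x > 1$, compression replaces the last $x - 1$ characters of the group with the decimal representation of the number $x$. When $x = 2$, the number of digits of $x$ equals $x - 1$; when $x > 2$, the number of digits of $x$ is less than $x - 1$. So the length after compression is unchanged or smaller, and it is unchanged only when the group's length is $2$.

Therefore, the compressed string is never longer than the original. This is what makes it safe to write the result into $\textit{chars}$ in place: a write never overwrites a character that has not been read yet.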
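
The length argument above can be checked by brute force over every group length allowed by the constraints. The following is only a sketch to illustrate the claim; the class name `GroupLengthCheck` and its helper methods are made up for this example and are not part of the solution.

```java
class GroupLengthCheck {
    // Number of decimal digits of a positive integer.
    static int digits(int x) {
        int d = 0;
        while (x > 0) {
            d++;
            x /= 10;
        }
        return d;
    }

    // Length of a group of x equal characters after compression:
    // one character, plus the digits of x when x > 1.
    static int compressedLength(int x) {
        return x == 1 ? 1 : 1 + digits(x);
    }

    public static void main(String[] args) {
        // 2000 is the maximum array length, hence the maximum group length.
        for (int x = 1; x <= 2000; x++) {
            int after = compressedLength(x);
            if (after > x) {
                throw new IllegalStateException("group grew for x = " + x);
            }
            if (after == x && x > 2) {
                throw new IllegalStateException("unexpected equality for x = " + x);
            }
        }
        System.out.println("no group gets longer after compression");
    }
}
```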

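Following the compression process in the problem statement, the compression can be implemented with two pointers.

Define a fast pointer $\textit{fast}$ and a slow pointer $\textit{slow}$. The fast pointer traverses the character array $\textit{chars}$, and the slow pointer writes the compressed characters into the result array. At any moment, $\textit{fast}$ is the next index to visit and $\textit{slow}$ is the next index to write. Since $\textit{chars}[0]$ is always unchanged by compression, the traversal starts at index $1$, so the initial value of $\textit{fast}$ is $1$; since no compressed character has been written at the start, the initial value of $\textit{slow}$ is $0$.

To compress the string, the traversal must track the number of characters in the current group and the length of that number's decimal representation. Let $\textit{count}$ be the number of characters in the current group; initially $\textit{count} = 1$, which counts the group containing the first character $\textit{chars}[0]$. To track the length of the decimal representation, maintain $\textit{unit}$, the place value of the most significant digit of $\textit{count}$; initially $\textit{unit} = 1$, and $\textit{unit}$ is multiplied by $10$ each time $\textit{count}$ gains a digit.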
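
As an illustration of how $\textit{count}$ and $\textit{unit}$ evolve together, the sketch below grows a group one character at a time and returns the final $\textit{unit}$. The class `UnitTracker` and the method `unitFor` are hypothetical names used only for this example.

```java
class UnitTracker {
    // Simulates counting a group of groupSize equal characters and returns
    // the place value of the leading digit of the final count.
    static int unitFor(int groupSize) {
        int count = 1, unit = 1;
        for (int i = 1; i < groupSize; i++) {
            count++;
            // count has just gained a digit (9 -> 10, 99 -> 100, ...)
            if (count == unit * 10) {
                unit *= 10;
            }
        }
        return unit;
    }

    public static void main(String[] args) {
        int[] sizes = {1, 9, 10, 99, 100, 2000};
        for (int size : sizes) {
            System.out.println(size + " -> unit " + unitFor(size));
        }
    }
}
```

Tracking $\textit{unit}$ incrementally avoids recomputing the number of digits of $\textit{count}$ when the group ends.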

Traverse the array with the fast pointer; the loop continues while the index at $\textit{fast}$ is less than or equal to the length of the array. For each index the fast pointer visits, do the following.

- If $\textit{fast}$ is less than the length of the array and $\textit{chars}[\textit{fast}] = \textit{chars}[\textit{fast} - 1]$, the current character equals the previous one, so both belong to the same group. Do the following.
   1. Increase $\textit{count}$ by $1$.
   2. After the update, if $\textit{count} = \textit{unit} \times 10$, then $\textit{count}$ has gained a digit, so multiply $\textit{unit}$ by $10$.
- If $\textit{fast}$ equals the length of the array or $\textit{chars}[\textit{fast}] \ne \textit{chars}[\textit{fast} - 1]$, the previous character is the last character of the previous group. Do the following.
   1. The first character of the previous group is at index $\textit{fast} - \textit{count}$, namely $\textit{chars}[\textit{fast} - \textit{count}]$. Write that character to $\textit{chars}[\textit{slow}]$, then move $\textit{slow}$ one position to the right.
   2. If $\textit{count} > 1$, every digit of $\textit{count}$ must be written to the array. Go through the digits of $\textit{count}$ from most significant to least significant, where the place value of the current digit is $\textit{unit}$. After obtaining the current digit, subtract its value from $\textit{count}$ and divide $\textit{unit}$ by $10$; write the character for the digit to $\textit{chars}[\textit{slow}]$, then move $\textit{slow}$ one position to the right. If $\textit{count} = 1$, skip this step.
   3. Reset both $\textit{count}$ and $\textit{unit}$ to $1$, since the index at $\textit{fast}$ starts a new group. If $\textit{fast}$ equals the length of the array, there are no more characters, so the result is still correct.
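
Step 2 of the second case, writing the digits of $\textit{count}$ from most significant to least significant, can be isolated as a small helper. This is a sketch for illustration only: the class `DigitWriter` and the method `writeCount` are hypothetical, and the sketch assumes `unit` is already the place value of the leading digit of `count`.

```java
class DigitWriter {
    // Writes the decimal digits of count into buf starting at pos,
    // most significant digit first, and returns the next free position.
    // Assumes unit is the place value of count's leading digit.
    static int writeCount(char[] buf, int pos, int count, int unit) {
        while (unit > 0) {
            int digit = count / unit;   // current leading digit
            count -= digit * unit;      // drop that digit from count
            unit /= 10;                 // move to the next place value
            buf[pos] = (char) ('0' + digit);
            pos++;
        }
        return pos;
    }

    public static void main(String[] args) {
        char[] buf = new char[3];
        int end = writeCount(buf, 0, 120, 100);
        System.out.println(new String(buf, 0, end));
    }
}
```

Because integer division drives $\textit{unit}$ down to $0$ once the ones digit has been written, the loop also handles inner and trailing zeros such as those in $120$.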

After the traversal, the index range $[0, \textit{slow} - 1]$ holds the compressed character array, so the new length of the array is $\textit{slow}$. Return $\textit{slow}$.

Code

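class Solution {
    public int compress(char[] chars) {
        int length = chars.length;
        // fast: next index to read; slow: next index to write
        int fast = 1, slow = 0;
        // count: size of the current group; unit: place value of count's leading digit
        int count = 1, unit = 1;
        // fast == length is visited as well so that the last group gets written
        while (fast <= length) {
            if (fast < length && chars[fast] == chars[fast - 1]) {
                // Same group: extend it, and grow unit when count gains a digit
                count++;
                if (count == unit * 10) {
                    unit *= 10;
                }
            } else {
                // The group ended at fast - 1: write its character first
                char c = chars[fast - count];
                chars[slow] = c;
                slow++;
                if (count > 1) {
                    // Write the digits of count, most significant first
                    while (unit > 0) {
                        int digit = count / unit;
                        count -= digit * unit;
                        unit /= 10;
                        chars[slow] = (char) ('0' + digit);
                        slow++;
                    }
                }
                // Start a new group at index fast
                count = 1;
                unit = 1;
            }
            fast++;
        }
        return slow;
    }
}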
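
For testing, it can help to compare the in-place result with a straightforward reference that follows the problem statement literally and builds the string $\texttt{s}$. The sketch below is such a reference under a clear caveat: it uses a `StringBuilder`, so it does not meet the constant extra space requirement and is not a replacement for the solution above. The class `CompressReference` and the method `compressToString` are hypothetical names.

```java
class CompressReference {
    // Builds the compressed string s exactly as the statement describes.
    // Uses O(n) extra space, so it is only suitable for cross-checking.
    static String compressToString(char[] chars) {
        StringBuilder s = new StringBuilder();
        int i = 0;
        while (i < chars.length) {
            // Find the end (exclusive) of the group starting at i
            int j = i;
            while (j < chars.length && chars[j] == chars[i]) {
                j++;
            }
            s.append(chars[i]);
            if (j - i > 1) {
                s.append(j - i); // appended in decimal, e.g. 12 -> "12"
            }
            i = j;
        }
        return s.toString();
    }

    public static void main(String[] args) {
        System.out.println(compressToString("aabbccc".toCharArray()));       // a2b2c3
        System.out.println(compressToString("a".toCharArray()));             // a
        System.out.println(compressToString("abbbbbbbbbbbb".toCharArray())); // ab12
    }
}
```

For any input, the first $\textit{slow}$ characters of $\textit{chars}$ after the in-place compression should equal the string this reference returns for the original input.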

Complexity analysis

- Time complexity: $O(n)$, where $n$ is the length of the array $\textit{chars}$. Each of the two pointers moves across the array at most once.
- Space complexity: $O(1)$.