Keywords: machine learning, hidden Markov model, HMM, Viterbi algorithm, forward algorithm, sequence labeling, part-of-speech tagging, speech recognition, Python HMM, Java HMM, hand calculation
One-sentence answer: the HMM is a powerful tool for sequential observation data. It assumes an unseen "hidden state sequence" and infers the most likely state path from the observed sequence. Speech recognition, part-of-speech tagging, and biological sequence analysis all depend on it!
If you are searching for any of these topics, this article is for you: from the state transition matrix to optimal-path decoding, without skipping a step.
An HMM describes a double stochastic process:

1. A hidden Markov chain of states that evolves according to a transition matrix and is never observed directly.
2. An emission process that, at each step, generates an observable symbol whose distribution depends only on the current hidden state.
In the classic weather example, the hidden states are Sunny and Rainy, and the observations are Walk, Shop, and Clean.

Initial probabilities π:

| State | π |
|---|---|
| Sunny | 0.6 |
| Rainy | 0.4 |

Transition matrix A:

| Current \ Next | Sunny | Rainy |
|---|---|---|
| Sunny | 0.7 | 0.3 |
| Rainy | 0.4 | 0.6 |

Emission matrix B:

| State \ Observation | Walk | Shop | Clean |
|---|---|---|---|
| Sunny | 0.6 | 0.3 | 0.1 |
| Rainy | 0.1 | 0.4 | 0.5 |
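The two coupled processes become concrete if you sample from the model. A minimal sketch using the parameters from the tables above (the helper names are my own, not from any library):

```python
import random

# Parameters from the tables above
pi = {"Sunny": 0.6, "Rainy": 0.4}
A = {"Sunny": {"Sunny": 0.7, "Rainy": 0.3},
     "Rainy": {"Sunny": 0.4, "Rainy": 0.6}}
B = {"Sunny": {"Walk": 0.6, "Shop": 0.3, "Clean": 0.1},
     "Rainy": {"Walk": 0.1, "Shop": 0.4, "Clean": 0.5}}

def sample(dist, rng):
    """Draw one outcome from a {outcome: probability} dict."""
    return rng.choices(list(dist), weights=list(dist.values()))[0]

def generate(T, seed=0):
    """Run the two coupled random processes for T steps."""
    rng = random.Random(seed)
    states, obs = [], []
    s = sample(pi, rng)                 # draw the initial hidden state from π
    for _ in range(T):
        states.append(s)
        obs.append(sample(B[s], rng))   # emission depends only on the current state
        s = sample(A[s], rng)           # next state depends only on the current state
    return states, obs

states, obs = generate(5)
print(states)
print(obs)
```

Only `obs` would be visible to an observer; `states` is exactly the hidden sequence that decoding tries to recover.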
Goal: given the observation sequence O = [Walk, Shop, Clean], find the most likely weather sequence.
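With only 2 states and 3 time steps there are 2³ = 8 candidate paths, so the answer can be checked by brute-force enumeration before deriving it with Viterbi. A minimal sketch:

```python
from itertools import product

states = ["Sunny", "Rainy"]
pi = {"Sunny": 0.6, "Rainy": 0.4}
A = {("Sunny", "Sunny"): 0.7, ("Sunny", "Rainy"): 0.3,
     ("Rainy", "Sunny"): 0.4, ("Rainy", "Rainy"): 0.6}
B = {("Sunny", "Walk"): 0.6, ("Sunny", "Shop"): 0.3, ("Sunny", "Clean"): 0.1,
     ("Rainy", "Walk"): 0.1, ("Rainy", "Shop"): 0.4, ("Rainy", "Clean"): 0.5}
obs = ["Walk", "Shop", "Clean"]

def joint(path):
    """P(path, obs): chain the initial, transition, and emission probabilities."""
    p = pi[path[0]] * B[(path[0], obs[0])]
    for t in range(1, len(obs)):
        p *= A[(path[t-1], path[t])] * B[(path[t], obs[t])]
    return p

best = max(product(states, repeat=len(obs)), key=joint)
print(best, joint(best))  # best == ('Sunny', 'Rainy', 'Rainy'), probability ≈ 0.01296
```

Brute force costs O(N^T); the Viterbi recursion reduces this to O(TN²) by reusing the best score ending in each state at each step.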
Step 1 (initialization). For each state s, compute:

δ₁(s) = π(s) · B(s, o₁)

and record the path pointer: ψ₁(s) = 0.

Step 2 (recursion). For each time step t and each current state s, compute:

δ_t(s) = max over s' of [δ_{t-1}(s') · A(s', s)] · B(s, o_t)
ψ_t(s) = argmax over s' of [δ_{t-1}(s') · A(s', s)]

Step 3 (termination and backtracking). Take the state with the largest δ_T, then follow the ψ pointers backwards to recover the full path.
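Plugging the table values into these steps by hand:

- t = 1: δ₁(Sunny) = 0.6 × 0.6 = 0.36; δ₁(Rainy) = 0.4 × 0.1 = 0.04
- t = 2 (Shop): δ₂(Sunny) = max(0.36 × 0.7, 0.04 × 0.4) × 0.3 = 0.0756; δ₂(Rainy) = max(0.36 × 0.3, 0.04 × 0.6) × 0.4 = 0.0432
- t = 3 (Clean): δ₃(Sunny) = max(0.0756 × 0.7, 0.0432 × 0.4) × 0.1 = 0.005292; δ₃(Rainy) = max(0.0756 × 0.3, 0.0432 × 0.6) × 0.5 = 0.01296

δ₃(Rainy) = 0.01296 wins, so the last state is Rainy; backtracking through ψ gives Sunny → Rainy → Rainy.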
✅ Optimal hidden sequence: [Sunny, Rainy, Rainy]
💡 Although Walk on day 1 is more likely under Sunny, the later Clean observation strongly suggests Rainy. Via dynamic programming, the HMM finds the globally optimal path rather than a greedy per-day guess.
```python
import numpy as np

class HMM:
    def __init__(self, states, observations, pi, A, B):
        self.states = states
        self.observations = observations
        self.pi = np.array(pi)
        self.A = np.array(A)
        self.B = np.array(B)
        self.state_to_idx = {s: i for i, s in enumerate(states)}
        self.obs_to_idx = {o: i for i, o in enumerate(observations)}

    def viterbi(self, obs_seq):
        T = len(obs_seq)
        N = len(self.states)
        # Initialization
        delta = np.zeros((T, N))
        psi = np.zeros((T, N), dtype=int)
        o0 = self.obs_to_idx[obs_seq[0]]
        delta[0] = self.pi * self.B[:, o0]
        # Recursion
        for t in range(1, T):
            ot = self.obs_to_idx[obs_seq[t]]
            for j in range(N):
                probs = delta[t-1] * self.A[:, j]
                psi[t, j] = np.argmax(probs)
                delta[t, j] = np.max(probs) * self.B[j, ot]
        # Termination
        path = [np.argmax(delta[T-1])]
        # Backtracking
        for t in range(T-1, 0, -1):
            path.insert(0, psi[t, path[0]])
        return [self.states[i] for i in path], np.max(delta[T-1])

    def forward(self, obs_seq):
        """Compute P(O|λ)."""
        T = len(obs_seq)
        N = len(self.states)
        alpha = np.zeros((T, N))
        # Initialization
        o0 = self.obs_to_idx[obs_seq[0]]
        alpha[0] = self.pi * self.B[:, o0]
        # Recursion
        for t in range(1, T):
            ot = self.obs_to_idx[obs_seq[t]]
            for j in range(N):
                alpha[t, j] = np.sum(alpha[t-1] * self.A[:, j]) * self.B[j, ot]
        return np.sum(alpha[T-1])

# Model parameters
states = ["Sunny", "Rainy"]
observations = ["Walk", "Shop", "Clean"]
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.6, 0.3, 0.1],
     [0.1, 0.4, 0.5]]
hmm = HMM(states, observations, pi, A, B)

# Decoding
obs_seq = ["Walk", "Shop", "Clean"]
best_path, prob = hmm.viterbi(obs_seq)
print("Optimal state sequence:", best_path)  # ['Sunny', 'Rainy', 'Rainy']

# Evaluation
p_obs = hmm.forward(obs_seq)
print("P(O|λ) =", p_obs)  # ≈ 0.03564
```

The same logic in Java:

```java
import java.util.*;

public class HMM {
    private String[] states;
    private String[] observations;
    private double[] pi;
    private double[][] A;
    private double[][] B;
    private Map<String, Integer> stateToIdx;
    private Map<String, Integer> obsToIdx;

    public HMM(String[] states, String[] observations, double[] pi, double[][] A, double[][] B) {
        this.states = states;
        this.observations = observations;
        this.pi = pi;
        this.A = A;
        this.B = B;
        this.stateToIdx = new HashMap<>();
        this.obsToIdx = new HashMap<>();
        for (int i = 0; i < states.length; i++) stateToIdx.put(states[i], i);
        for (int i = 0; i < observations.length; i++) obsToIdx.put(observations[i], i);
    }

    public Result viterbi(String[] obsSeq) {
        int T = obsSeq.length;
        int N = states.length;
        double[][] delta = new double[T][N];
        int[][] psi = new int[T][N];
        // Initialization
        int o0 = obsToIdx.get(obsSeq[0]);
        for (int i = 0; i < N; i++) {
            delta[0][i] = pi[i] * B[i][o0];
        }
        // Recursion
        for (int t = 1; t < T; t++) {
            int ot = obsToIdx.get(obsSeq[t]);
            for (int j = 0; j < N; j++) {
                double maxProb = -1;
                int bestPrev = 0;
                for (int i = 0; i < N; i++) {
                    double prob = delta[t-1][i] * A[i][j];
                    if (prob > maxProb) {
                        maxProb = prob;
                        bestPrev = i;
                    }
                }
                psi[t][j] = bestPrev;
                delta[t][j] = maxProb * B[j][ot];
            }
        }
        // Termination
        double maxProb = -1;
        int lastState = 0;
        for (int i = 0; i < N; i++) {
            if (delta[T-1][i] > maxProb) {
                maxProb = delta[T-1][i];
                lastState = i;
            }
        }
        // Backtracking
        int[] pathIdx = new int[T];
        pathIdx[T-1] = lastState;
        for (int t = T-2; t >= 0; t--) {
            pathIdx[t] = psi[t+1][pathIdx[t+1]];
        }
        String[] path = new String[T];
        for (int i = 0; i < T; i++) {
            path[i] = states[pathIdx[i]];
        }
        return new Result(path, maxProb);
    }

    public static class Result {
        public String[] path;
        public double probability;
        public Result(String[] path, double prob) {
            this.path = path;
            this.probability = prob;
        }
    }

    // Test
    public static void main(String[] args) {
        String[] states = {"Sunny", "Rainy"};
        String[] observations = {"Walk", "Shop", "Clean"};
        double[] pi = {0.6, 0.4};
        double[][] A = {{0.7, 0.3}, {0.4, 0.6}};
        double[][] B = {{0.6, 0.3, 0.1}, {0.1, 0.4, 0.5}};
        HMM hmm = new HMM(states, observations, pi, A, B);
        String[] obsSeq = {"Walk", "Shop", "Clean"};
        Result res = hmm.viterbi(obsSeq);
        System.out.println("Optimal state sequence: " + Arrays.toString(res.path));
        // Output: [Sunny, Rainy, Rainy]
        System.out.printf("Maximum probability: %.5f%n", res.probability);  // ≈ 0.01296
    }
}
```

| Pros | Cons |
|---|---|
| ✅ Naturally suited to sequence modeling | ❌ Assumes first-order Markov states (ignores long-range dependencies) |
| ✅ Efficient inference, O(TN²) | ❌ Observations must be discrete (continuous observations need a Gaussian HMM) |
| ✅ Highly interpretable | ❌ Parameters require prior knowledge or lots of labeled data |
| ✅ Supports online inference | ❌ Cannot model complex interactions between states |
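Both implementations above multiply raw probabilities, which underflows to zero for long sequences. A standard remedy (not shown in the original implementations) is to run Viterbi in log space, where products become sums. A minimal NumPy sketch:

```python
import numpy as np

def viterbi_log(pi, A, B, obs_idx):
    """Viterbi in log space: products become sums, avoiding underflow."""
    log_pi, log_A, log_B = np.log(pi), np.log(A), np.log(B)
    T, N = len(obs_idx), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = log_pi + log_B[:, obs_idx[0]]
    for t in range(1, T):
        # scores[i, j] = delta[t-1][i] + log A[i][j], via broadcasting
        scores = delta[t-1][:, None] + log_A
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = np.max(scores, axis=0) + log_B[:, obs_idx[t]]
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.insert(0, int(psi[t, path[0]]))
    return path, float(np.max(delta[-1]))

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.6, 0.3, 0.1], [0.1, 0.4, 0.5]])
path, logp = viterbi_log(pi, A, B, [0, 1, 2])  # Walk, Shop, Clean
print(path)  # [0, 1, 1], i.e. Sunny, Rainy, Rainy
```

The returned score is the log-probability of the best path; exponentiate it only for short sequences, since for long ones the raw probability is exactly what underflows.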
| Model | Strengths | Limitations |
|---|---|---|
| HMM | Simple, efficient, interpretable | Limited expressiveness, strong assumptions |
| CRF (conditional random field) | Global optimization, supports feature engineering | Slow to train, needs labeled data |
| RNN/LSTM | Learns long-range dependencies automatically | Black box, data-hungry |
| Transformer | Parallelizable, captures dependencies at any distance | Computationally expensive |
💡 Rule of thumb: with scarce labeled data and a need for interpretability, start with an HMM or CRF; with abundant data and long-range dependencies, reach for neural models.
With two simple assumptions, the hidden Markov model opened the door to sequence modeling. It may be old, but in settings where labeled data is scarce and interpretability is essential, it remains an irreplaceable Swiss Army knife.
Remember: in the world of AI, the states you cannot see often matter more than the data you can.
By now, you can define an HMM's parameters, decode a sequence with Viterbi by hand, and implement both decoding and evaluation in Python or Java.
Originality statement: this article was published on the Tencent Cloud Developer Community with the author's authorization; reproduction without permission is prohibited.
For infringement concerns, please contact cloudcommunity@tencent.com for removal.