"预测""证明有效"
The Core Role of Causal Attribution in Healthcare AI

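Most healthcare AI only predicts. How does propensity score matching (PSM) help ReHealth AI produce causal evidence that payers accept, so that prevention outcomes can be billed for the first time?

Healthcare AI has a fundamental flaw that has long been overlooked: almost every healthcare AI product only predicts and does not attribute.

Risk prediction is useful. Telling a physician that a patient has a 23% probability of a heart attack within the next three years is valuable information. But prediction alone cannot answer a more important question: "If we intervene, does the risk actually go down?"

The answer to that question determines whether preventive care can be billed.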

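Correlation ≠ Causation: The Fundamental Dilemma of Healthcare AI

Take a classic example: we observe that patients taking a certain antihypertensive drug have a 30% lower heart attack rate than patients who do not take it. Does that mean the drug works?

Not necessarily. Patients who take the drug may already pay more attention to their health, get regular checkups, and keep healthier habits, all of which also lower cardiac risk. The drug's effect and the fact that these patients are healthier to begin with are mixed together, and a simple comparison cannot separate them.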

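❌ Correlation Analysis (Not Enough)

Observed: the treatment group has a lower incidence.

Cannot rule out: the treatment group was healthier and more cooperative with treatment to begin with. Cannot conclude: the drug itself reduced the risk.

✓ Causal Attribution (What We Need)

Shows: the intervention itself changed the outcome.

Control for confounders, construct a counterfactual comparison group, and statistically test the independent effect of the intervention. This is the kind of evidence payers accept.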

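In statistics this is called confounding bias: when we use observational data to evaluate an intervention and the treatment and control groups differ from the start, a simple comparison of their outcomes is misleading.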
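To make confounding bias concrete, here is a small simulation sketch. Every number in it is invented for illustration: a hypothetical "health-conscious" trait makes patients both more likely to take the drug and less likely to have a heart attack, while the drug itself is given no effect at all.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical confounder: whether a patient is health-conscious.
health_conscious = rng.random(n) < 0.5

# Health-conscious patients are more likely to take the drug...
takes_drug = rng.random(n) < np.where(health_conscious, 0.8, 0.2)

# ...and have a lower baseline risk. The drug has NO effect in this simulation.
heart_attack = rng.random(n) < np.where(health_conscious, 0.05, 0.15)

# Naive comparison: the drug looks protective even though its true effect is zero.
naive_diff = heart_attack[takes_drug].mean() - heart_attack[~takes_drug].mean()

# Comparing within each level of the confounder removes the spurious gap.
stratified_diff = np.mean([
    heart_attack[takes_drug & (health_conscious == h)].mean()
    - heart_attack[~takes_drug & (health_conscious == h)].mean()
    for h in (True, False)
])

print(f"naive: {naive_diff:+.3f}, stratified: {stratified_diff:+.3f}")
```

The naive difference comes out clearly negative while the stratified difference sits near zero, which is the gap between correlation and attribution described above.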

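PSM: A Statistical Way to Construct "Parallel Worlds"

PSM (Propensity Score Matching) is the core method for this problem. The idea: if we cannot run a randomized controlled trial (RCT), we can use statistical methods on observational data to construct a comparison group that comes as close as possible to a randomized control.

Core Idea
For each patient who received the intervention, find a "matched pair" among the patients who did not: someone highly similar on all important characteristics. Then compare the two: one received the intervention, the other did not. The difference in their outcomes is an estimate of the effect of the intervention itself.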

PSM Step by Step

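1. Collect confounders

Identify every variable that may influence both whether a patient receives the intervention and the outcome: age, sex, underlying conditions, medication history, lifestyle, socioeconomic status, and so on.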

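2. Compute propensity scores

Use logistic regression or another classifier to predict each patient's probability of receiving the intervention. That probability is the propensity score. Groups of patients with similar scores have similar distributions of the measured characteristics.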

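3. Match on the score

For each patient in the treatment group, find the control patient with the closest propensity score and pair them. After pairing, the two groups have highly similar distributions on all known confounders.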

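4. Compare outcomes after matching

In the matched sample, compare outcomes between the treatment and control groups. Because the two groups are highly similar on the measured characteristics, the difference can be attributed to the intervention itself, to the extent that no important confounder was left out.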
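Step 3 claims that the matched groups end up with similar confounder distributions, and that claim can be checked. One common diagnostic is the standardized mean difference (SMD) per covariate. The sketch below uses made-up ages, and treating an absolute SMD below 0.1 as "balanced" is a widely used rule of thumb rather than anything specified in this article.

```python
import numpy as np

def standardized_mean_difference(treated, control):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2)."""
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
    return (treated.mean() - control.mean()) / pooled_sd

# Made-up ages, for illustration only.
age_treated = [60, 64, 58, 71, 66]
age_control_unmatched = [45, 52, 49, 55, 47]
age_control_matched = [61, 63, 59, 70, 67]

smd_before = standardized_mean_difference(age_treated, age_control_unmatched)
smd_after = standardized_mean_difference(age_treated, age_control_matched)
print(f"SMD before matching: {smd_before:.2f}, after matching: {smd_after:.2f}")
```

A large SMD before matching and a small one after suggests the matching did its job for that covariate. The same check would be repeated for every confounder collected in step 1.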

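PSM Matching Process
Treatment patient A: 60 years old, male, hypertension, smoker
↔ matched with
Control patient B: 61 years old, male, hypertension, smoker
Risk difference = intervention effect
A pair of patients with highly similar characteristics; the only difference is whether they received the intervention.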
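```python
# PSM propensity score calculation (simplified sketch)
from sklearn.linear_model import LogisticRegression

# Confounders: age, BMI, blood pressure, medical history, etc.
covariates = ['age', 'bmi', 'systolic_bp', 'diabetes', 'smoking', 'family_history']

# Estimate each patient's probability of receiving the intervention (the propensity score).
# X is assumed to be a DataFrame of patient features and treatment_indicator a 0/1 array.
model = LogisticRegression()
model.fit(X[covariates], treatment_indicator)
propensity_scores = model.predict_proba(X[covariates])[:, 1]

# Match on the score to build an approximately randomized comparison group.
# nearest_neighbor_matching is a placeholder helper, not a library function.
matched_pairs = nearest_neighbor_matching(propensity_scores, caliper=0.1)

# Estimate the average treatment effect in the matched sample.
# ATT = Average Treatment Effect on the Treated
# estimate_average_treatment_effect is likewise a placeholder helper.
ATT = estimate_average_treatment_effect(matched_pairs)
# If no important confounder is unmeasured, this estimates the effect of the intervention itself.
```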
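The two helpers named in the sketch above are not defined there. Below is one possible shape for them, written with NumPy only so it runs on its own: a greedy 1:1 nearest-neighbor match without replacement, and a mean paired difference. The extra `treated` and `outcome` arguments, the greedy strategy, and the toy data are all assumptions made for illustration and are not ReHealth AI's actual implementation.

```python
import numpy as np

def nearest_neighbor_matching(scores, treated, caliper=0.1):
    """Greedy 1:1 matching without replacement on the propensity score.

    Returns a list of (treated_index, control_index) pairs. A treated
    patient with no control within the caliper is left unmatched.
    """
    scores = np.asarray(scores, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    available = list(np.flatnonzero(~treated))
    pairs = []
    for t in np.flatnonzero(treated):
        if not available:
            break
        gaps = np.abs(scores[available] - scores[t])
        best = int(np.argmin(gaps))
        if gaps[best] <= caliper:
            pairs.append((int(t), int(available.pop(best))))
    return pairs

def estimate_average_treatment_effect(pairs, outcome):
    """Mean outcome difference (treated minus control) over matched pairs."""
    outcome = np.asarray(outcome, dtype=float)
    return float(np.mean([outcome[t] - outcome[c] for t, c in pairs]))

# Toy data: scores would normally come from the fitted propensity model.
scores  = [0.81, 0.42, 0.65, 0.80, 0.40, 0.10, 0.66]
treated = [1,    1,    1,    0,    0,    0,    0]
outcome = [0,    0,    1,    1,    0,    0,    1]   # 1 = event occurred

pairs = nearest_neighbor_matching(scores, treated, caliper=0.05)
att = estimate_average_treatment_effect(pairs, outcome)
print(pairs, round(att, 3))
```

Greedy matching depends on the order in which treated patients are processed. Optimal matching or matching with replacement are common alternatives, and the caliper is often set relative to the spread of the scores rather than as a fixed number.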

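Why PSM Is the Core Method at Seed Stage

We chose PSM as the core causal analysis method for the seed stage for practical reasons. PSM needs relatively little data, which suits an early-stage product. Its methodology is widely accepted by the medical community and by regulators, so the evidence it produces is credible. And its results are relatively easy to explain to insurers and healthcare institutions.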

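Causal Analysis Roadmap
Seed stage: PSM (propensity score matching). Low data requirements and a mature method that can quickly produce clinically acceptable evidence.
Series A: add DID (difference-in-differences). Controls for confounding along the time dimension and suits long-term follow-up data.
Series B+: synthetic control and instrumental variables. Handles more complex causal scenarios and builds a higher grade of evidence.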
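For readers unfamiliar with DID, the estimate itself is simple arithmetic: the change in the treated group minus the change in the control group over the same period. The sketch below uses invented event rates and relies on the standard parallel-trends assumption, namely that the two groups would have moved together without the intervention.

```python
def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """DID = (treated_post - treated_pre) - (control_post - control_pre)."""
    return (treated_post - treated_pre) - (control_post - control_pre)

# Made-up event rates per 1,000 patients, for illustration only.
did = difference_in_differences(
    treated_pre=40.0, treated_post=30.0,
    control_pre=42.0, control_post=39.0,
)
print(did)  # -7.0
```

Here the treated group improved by 10 per 1,000 and the control group by 3, so 7 per 1,000 of the improvement is attributed to the intervention rather than to a trend shared by both groups.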

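How Causal Evidence Becomes a Basis for Billing

Once PSM shows that "patients who received the ReHealth AI intervention program had an X% lower incidence of cardiovascular and cerebrovascular events over three years than comparable patients who did not," that number becomes a causal evidence report backed by statistical significance.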
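What "backed by statistical significance" could look like in practice: the sketch below attaches a percentile bootstrap confidence interval to the mean outcome difference across matched pairs. The data are invented, the bootstrap is one of several reasonable choices for paired data, and the function name is hypothetical. It reports an absolute risk difference; a relative reduction such as "X% lower" would be derived from the two group rates.

```python
import numpy as np

def paired_bootstrap_ci(treated_outcomes, control_outcomes,
                        n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (treated - control)."""
    diffs = (np.asarray(treated_outcomes, dtype=float)
             - np.asarray(control_outcomes, dtype=float))
    rng = np.random.default_rng(seed)
    # Resample matched pairs with replacement, n_boot times.
    idx = rng.integers(0, len(diffs), size=(n_boot, len(diffs)))
    boot_means = diffs[idx].mean(axis=1)
    lower, upper = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return float(diffs.mean()), float(lower), float(upper)

# Made-up outcomes for 100 matched pairs (1 = event within three years).
treated_outcomes = [0] * 90 + [1] * 10
control_outcomes = [0] * 80 + [1] * 20

estimate, lower, upper = paired_bootstrap_ci(treated_outcomes, control_outcomes)
print(f"risk difference: {estimate:+.2f} (95% CI {lower:+.2f} to {upper:+.2f})")
```

An interval that excludes zero is the usual informal reading of "statistically significant" for an estimate like this.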

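The report can be used by insurers deciding whether to cover a preventive intervention program; by healthcare institutions applying to government medical insurance agencies for reimbursement eligibility for preventive programs; and by employers showing employees the real benefit of their health management spending.

For the first time, prevention has a basis for billing.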

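Limitations and Honest Boundaries

PSM is not a cure-all. It can only control for known, measured confounders. If an important confounder was never measured, PSM cannot fully remove the bias. That is why we keep widening the data we collect about each individual (wearable device data, hospital information system (HIS) data, questionnaire data) to reduce the influence of unmeasured confounding as far as possible.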
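One way to report how much unmeasured confounding a result could withstand is the E-value of VanderWeele and Ding. The article does not say ReHealth AI uses it, so treat this as an illustrative sketch. For a risk ratio RR of at least 1, the E-value is RR + sqrt(RR × (RR − 1)); for a protective effect (RR below 1) the ratio is inverted first.

```python
import math

def e_value(risk_ratio):
    """E-value for an observed risk ratio (point estimate only)."""
    if risk_ratio <= 0:
        raise ValueError("risk_ratio must be positive")
    # Protective effects are inverted so the formula applies to RR >= 1.
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1))

# Hypothetical result: the intervention group has 0.7 times the risk of controls.
print(round(e_value(0.7), 2))  # 2.21
```

Read this as: an unmeasured confounder would need to be associated with both the intervention and the outcome by a risk ratio of about 2.2 each, beyond the measured covariates, to fully explain away the observed effect.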

It is also why we describe this as infrastructure that takes time to build, not an AI product that can quickly game a metric.

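ReHealth Core: A Complete System from Prediction to Causal Evidence

We offer trial API access to healthcare institutions, insurers, and enterprise health management teams. We personally review every application.

Apply for API Access →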

From "Prediction" to "Proof":
Causal Attribution's Core Role in Healthcare AI

Most healthcare AI only predicts. How does Propensity Score Matching help ReHealth AI produce payer-accepted causal evidence, making prevention outcomes billable for the first time?

There's a fundamental flaw in healthcare AI that's been ignored for too long: almost all healthcare AI products only predict — they don't attribute causality.

Risk prediction is useful — telling a physician that this patient has a 23% probability of heart attack in the next three years is valuable information. But prediction alone can't answer a more critical question: "If we intervene, does the risk actually decrease?"

The answer to this question determines whether preventive medicine can be billed.

Correlation ≠ Causation: Healthcare AI's Core Dilemma

A classic example: we observe that patients taking a certain antihypertensive have 30% lower heart attack rates than those who don't. Does this mean the drug works?

Not necessarily. Patients who take medication may inherently be more health-conscious and more likely to get regular checkups and maintain healthier lifestyles, all factors that independently reduce cardiac risk. The drug's effect and "patients being healthier" are confounded, and a simple comparison cannot separate them.

❌ Correlation Analysis (Insufficient)

Observed: Treatment group has lower incidence

Can't rule out: Treatment group was healthier to begin with. Can't conclude: The drug itself reduced risk.

✓ Causal Attribution (What We Need)

Proves: The intervention itself changed outcomes

Control confounders, construct counterfactual comparison groups, statistically verify the intervention's independent effect. This is payer-accepted evidence.

PSM: Constructing "Parallel Worlds" Statistically

PSM (Propensity Score Matching) is the core method for solving this problem. Its insight: if we can't run randomized controlled trials (RCTs), we can use statistical methods on observational data to construct a comparison group that approximates random assignment.

Core Insight
For each patient who received the intervention, find a patient in the no-intervention group who is highly similar on all important characteristics. Compare this closely matched pair, one who received the intervention and one who did not: the difference in their outcomes is an estimate of the causal effect of the intervention.

PSM Matching Process
Treatment patient A: 60 years old, male, hypertension, smoker
↔ matched with
Control patient B: 61 years old, male, hypertension, smoker
Risk difference = intervention effect
A highly similar pair; the only difference is whether they received the intervention.

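Why PSM Is Our Core Method at Seed Stage

We chose PSM as our core causal analysis method at seed stage for practical reasons. PSM needs relatively little data, which suits an early-stage product. Its methodology is widely accepted by the medical community and by regulators, so the evidence it produces is credible. And its results are relatively easy to explain to insurers and healthcare institutions.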

Causal Analysis Roadmap
Seed stage: PSM (propensity score matching). Low data requirements and an established methodology that can rapidly generate clinically accepted evidence.
Series A: DID (difference-in-differences). Controls for confounding along the time dimension and suits long-term follow-up data.
Series B+: synthetic control and instrumental variables. Handles more complex causal scenarios and builds higher evidence grades.

How Causal Evidence Becomes a Basis for Billing

When PSM proves that "patients receiving ReHealth AI intervention programs had X% lower cardiovascular incidence over three years compared to similarly characterized non-intervention patients," this number becomes a causal evidence report backed by statistical significance.

This report enables: insurers to evaluate incorporating preventive intervention programs into coverage; hospitals to apply for preventive program reimbursement qualification; enterprises to demonstrate the actual ROI of health management investment to employees.

Prevention has a billing basis for the first time.

ReHealth Core: Complete System from Prediction to Causal Evidence

We open API access to qualified healthcare institutions, insurers, and enterprise health management partners. Every application is personally reviewed.

Apply for API Access →