1000ms

Difficulty: 5

所属公司 : 华为

时间 :2025年9月4日-留学生-AI

算法标签>

模拟

解题思路

1) 关键算法

Scaled Dot-Product Self-Attention 对输入 X (L×D)：
- 线性映射：Q=XWq, K=XWk, V=XWv（均为 L×D）。
- 打分：S = QK^T / sqrt(D)（L×L）。
- 归一化：对 S 每行做 softmax 得到注意力矩阵 A。
- 聚合：Y = A V（L×D）。
全连接(FC)：Y = XW + b，其中 b 对 L 行广播。

2) 实现要点

只涉及矩阵乘、softmax、广播，加上两次拼装即可。
softmax 为数值稳定：对每行先减去行最大值。
按输入顺序读入、按形状 reshape 成矩阵。
输出格式化为两位小数并用英文逗号连接。

3) 正确性说明（简要）

Attention 依公式对每个时刻的表征与所有时刻交互；两层堆叠后再经 FC，完全符合题面给出的结构图与公式，因此结果唯一确定。

4) 复杂度分析

两次注意力各需：
- QK^T：O(L^2·D)
- AV：O(L^2·D)
两次 FC：O(L·D^2) 综合：时间复杂度 O(L^2·D + L·D^2)；空间复杂度 O(L^2 + L·D)（存储注意力矩阵与中间结果）。

参考实现

Python

import sys
import numpy as np
import math

class Solution:
    def analyze_data(self, L: int, D: int,
                     seq: np.ndarray,
                     Wq1: np.ndarray, Wk1: np.ndarray, Wv1: np.ndarray,
                     Wmlp1: np.ndarray, bmlp1: np.ndarray,
                     Wq2: np.ndarray, Wk2: np.ndarray, Wv2: np.ndarray,
                     Wmlp2: np.ndarray, bmlp2: np.ndarray) -> np.ndarray:
        # —— 行 softmax（数值稳定）——
        def softmax_rows(M: np.ndarray) -> np.ndarray:
            mx = np.max(M, axis=1, keepdims=True)   # 每行最大值
            E = np.exp(M - mx)                      # 减去最大值避免上溢
            S = np.sum(E, axis=1, keepdims=True)    # 每行求和
            return E / S

        # —— 单层 Scaled Dot-Product Self-Attention ——
        def attn(X: np.ndarray, Wq: np.ndarray, Wk: np.ndarray, Wv: np.ndarray) -> np.ndarray:
            Q = X @ Wq                  # (L,D)
            K = X @ Wk                  # (L,D)
            V = X @ Wv                  # (L,D)
            S = (Q @ K.T) / math.sqrt(D)  # (L,L)  /sqrt(D) 缩放
            A = softmax_rows(S)         # (L,L)
            return A @ V                # (L,D)

        X = seq                         # (L,D)

        # 第一层 Self-Attention -> FC
        Y1 = attn(X, Wq1, Wk1, Wv1)     # (L,D)
        Z1 = Y1 @ Wmlp1 + bmlp1         # (L,D) + (D,)  行广播

        # 第二层 Self-Attention -> FC
        Y2 = attn(Z1, Wq2, Wk2, Wv2)    # (L,D)
        Z2 = Y2 @ Wmlp2 + bmlp2         # (L,D)

        return Z2                       # 返回最终 (L,D)

if __name__ == "__main__":
    # 固定读取 12 行（与模板保持一致）
    lines = [sys.stdin.readline().strip() for _ in range(12)]
    L, D = map(int, lines[0].split(','))

    def parse_line(idx: int, count: int):
        # 按英文逗号分隔（模板要求），读取指定数量的浮点
        values = list(map(float, lines[idx].split(',')))
        assert len(values) == count, f"Line {idx} expected {count} values"
        return np.array(values, dtype=np.float64), idx + 1

    idx = 1
    seq_flat, idx = parse_line(idx, L * D)
    seq = seq_flat.reshape(L, D)

    Wq1_flat, idx = parse_line(idx, D * D)
    Wk1_flat, idx = parse_line(idx, D * D)
    Wv1_flat, idx = parse_line(idx, D * D)
    Wmlp1_flat, idx = parse_line(idx, D * D)
    bmlp1, idx = parse_line(idx, D)

    Wq2_flat, idx = parse_line(idx, D * D)
    Wk2_flat, idx = parse_line(idx, D * D)
    Wv2_flat, idx = parse_line(idx, D * D)
    Wmlp2_flat, idx = parse_line(idx, D * D)
    bmlp2, idx = parse_line(idx, D)

    # reshape all matrices（名称与模板一致）
    Wq1 = Wq1_flat.reshape(D, D)
    Wk1 = Wk1_flat.reshape(D, D)
    Wv1 = Wv1_flat.reshape(D, D)
    Wmlp1 = Wmlp1_flat.reshape(D, D)

    Wq2 = Wq2_flat.reshape(D, D)
    Wk2 = Wk2_flat.reshape(D, D)
    Wv2 = Wv2_flat.reshape(D, D)
    Wmlp2 = Wmlp2_flat.reshape(D, D)

    # bias 保持 (D,)；numpy 与 (L,D) 相加会做按行广播
    solver = Solution()
    result = solver.analyze_data(L, D, seq,
                                 Wq1, Wk1, Wv1, Wmlp1, bmlp1,
                                 Wq2, Wk2, Wv2, Wmlp2, bmlp2)

    # 输出结果：保留两位小数
    flat = result.flatten()
    print(','.join(f"{x:.2f}" for x in flat))

Java

import java.io.*;
import java.util.*;

public class Main {
    static class FastScanner {
        BufferedInputStream in = new BufferedInputStream(System.in);
        byte[] buf = new byte[1 << 16];
        int ptr = 0, len = 0;

        int read() throws IOException {
            if (ptr >= len) {
                len = in.read(buf);
                ptr = 0;
                if (len <= 0) return -1;
            }
            return buf[ptr++];
        }
        String next() throws IOException {
            StringBuilder sb = new StringBuilder();
            int c;
            // 允许逗号或空格分隔
            while ((c = read()) != -1 && c <= 32 || c == ',');
            if (c == -1) return null;
            do {
                if (c == ',') break;
                sb.append((char)c);
                c = read();
            } while (c != -1 && c > 32 && c != ',');
            return sb.toString();
        }
        int nextInt() throws IOException { return Integer.parseInt(next()); }
        double nextDouble() throws IOException { return Double.parseDouble(next()); }
    }

    // 矩阵乘: A(L×K) * B(K×N) -> C(L×N)
    static double[][] matMul(double[][] A, double[][] B) {
        int L = A.length, K = A[0].length, N = B[0].length;
        double[][] C = new double[L][N];
        for (int i = 0; i < L; i++) {
            for (int k = 0; k < K; k++) {
                double aik = A[i][k];
                for (int j = 0; j < N; j++) {
                    C[i][j] += aik * B[k][j];
                }
            }
        }
        return C;
    }

    // 行softmax（数值稳定）
    static void softmaxRows(double[][] S) {
        int L = S.length, N = S[0].length;
        for (int i = 0; i < L; i++) {
            double mx = -1e300;
            for (int j = 0; j < N; j++) mx = Math.max(mx, S[i][j]);
            double sum = 0.0;
            for (int j = 0; j < N; j++) {
                S[i][j] = Math.exp(S[i][j] - mx);
                sum += S[i][j];
            }
            for (int j = 0; j < N; j++) S[i][j] /= sum;
        }
    }

    static double[][] attn(double[][] X, double[][] Wq, double[][] Wk, double[][] Wv) {
        int L = X.length, D = X[0].length;
        double[][] Q = matMul(X, Wq);
        double[][] K = matMul(X, Wk);
        double[][] V = matMul(X, Wv);

        // S = QK^T / sqrt(D)
        double[][] S = new double[L][L];
        double scale = 1.0 / Math.sqrt(D);
        for (int i = 0; i < L; i++) {
            for (int j = 0; j < L; j++) {
                double sum = 0;
                for (int k = 0; k < D; k++) sum += Q[i][k] * K[j][k];
                S[i][j] = sum * scale;
            }
        }
        softmaxRows(S);
        // A V
        double[][] Y = new double[L][D];
        for (int i = 0; i < L; i++) {
            for (int j = 0; j < L; j++) {
                double a = S[i][j];
                for (int k = 0; k < D; k++) {
                    Y[i][k] += a * V[j][k];
                }
            }
        }
        return Y;
    }

    static double[][] addBias(double[][] X, double[] b) {
        int L = X.length, D = X[0].length;
        double[][] Y = new double[L][D];
        for (int i = 0; i < L; i++)
            for (int j = 0; j < D; j++)
                Y[i][j] = X[i][j] + b[j]; // 行广播
        return Y;
    }

    public static void main(String[] args) throws Exception {
        FastScanner fs = new FastScanner();
        int L = Integer.parseInt(fs.next());
        int D = Integer.parseInt(fs.next());

        // 读入并reshape
        double[][] X = new double[L][D];
        for (int i = 0, p = 0; i < L; i++)
            for (int j = 0; j < D; j++, p++)
                X[i][j] = fs.nextDouble();

        double[][] Wq1 = new double[D][D];
        double[][] Wk1 = new double[D][D];
        double[][] Wv1 = new double[D][D];
        double[][] Wfc1 = new double[D][D];
        double[] bfc1 = new double[D];
        double[][] Wq2 = new double[D][D];
        double[][] Wk2 = new double[D][D];
        double[][] Wv2 = new double[D][D];
        double[][] Wfc2 = new double[D][D];
        double[] bfc2 = new double[D];

        // 工具函数：读 D*D
        java.util.function.Consumer<double[][]> readDD = (M) -> {
            try {
                for (int i = 0; i < D; i++)
                    for (int j = 0; j < D; j++)
                        M[i][j] = Double.parseDouble(fs.next());
            } catch (Exception e) { throw new RuntimeException(e); }
        };

        readDD.accept(Wq1); readDD.accept(Wk1); readDD.accept(Wv1);
        readDD.accept(Wfc1);
        for (int j = 0; j < D; j++) bfc1[j] = fs.nextDouble();
        readDD.accept(Wq2); readDD.accept(Wk2); readDD.accept(Wv2);
        readDD.accept(Wfc2);
        for (int j = 0; j < D; j++) bfc2[j] = fs.nextDouble();

        // SA1 -> FC1
        double[][] Y1 = attn(X, Wq1, Wk1, Wv1);
        double[][] Z1 = matMul(Y1, Wfc1);
        Z1 = addBias(Z1, bfc1);

        // SA2 -> FC2
        double[][] Y2 = attn(Z1, Wq2, Wk2, Wv2);
        double[][] Z2 = matMul(Y2, Wfc2);
        Z2 = addBias(Z2, bfc2);

        // 输出 L*D，英文逗号分隔，保留2位小数
        StringBuilder out = new StringBuilder();
        java.text.DecimalFormat df = new java.text.DecimalFormat("0.00");
        for (int i = 0; i < L; i++)
            for (int j = 0; j < D; j++) {
                if (out.length() > 0) out.append(',');
                out.append(df.format(Z2[i][j]));
            }
        System.out.println(out.toString());
    }
}

C++

#include <bits/stdc++.h>
using namespace std;

using Mat = vector<vector<double>>;

Mat mul(const Mat& A, const Mat& B) {
    int L = (int)A.size(), K = (int)A[0].size(), N = (int)B[0].size();
    Mat C(L, vector<double>(N, 0.0));
    for (int i = 0; i < L; ++i)
        for (int k = 0; k < K; ++k) {
            double aik = A[i][k];
            for (int j = 0; j < N; ++j) C[i][j] += aik * B[k][j];
        }
    return C;
}

void softmax_rows(Mat& S) {
    int L = (int)S.size(), N = (int)S[0].size();
    for (int i = 0; i < L; ++i) {
        double mx = -1e300;
        for (int j = 0; j < N; ++j) mx = max(mx, S[i][j]);
        double sum = 0.0;
        for (int j = 0; j < N; ++j) { S[i][j] = exp(S[i][j] - mx); sum += S[i][j]; }
        for (int j = 0; j < N; ++j) S[i][j] /= sum;
    }
}

Mat attn(const Mat& X, const Mat& Wq, const Mat& Wk, const Mat& Wv) {
    int L = (int)X.size(), D = (int)X[0].size();
    Mat Q = mul(X, Wq), K = mul(X, Wk), V = mul(X, Wv);
    Mat S(L, vector<double>(L, 0.0));
    double scale = 1.0 / sqrt((double)D);
    for (int i = 0; i < L; ++i)
        for (int j = 0; j < L; ++j) {
            double s = 0.0;
            for (int k = 0; k < D; ++k) s += Q[i][k] * K[j][k];
            S[i][j] = s * scale;
        }
    softmax_rows(S);
    Mat Y(L, vector<double>(D, 0.0));
    for (int i = 0; i < L; ++i)
        for (int j = 0; j < L; ++j) {
            double a = S[i][j];
            for (int k = 0; k < D; ++k) Y[i][k] += a * V[j][k];
        }
    return Y;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // 读完整个stdin，逗号替换为空格
    string all, line;
    while (getline(cin, line)) {
        if (!all.empty()) all.push_back('\n');
        all += line;
    }
    for (char &ch : all) if (ch == ',') ch = ' ';
    istringstream in(all);

    auto readMat = [&](int r, int c) {
        Mat M(r, vector<double>(c));
        for (int i = 0; i < r; ++i)
            for (int j = 0; j < c; ++j) in >> M[i][j];
        return M;
    };
    auto readVec = [&](int n) {
        vector<double> v(n);
        for (int i = 0; i < n; ++i) in >> v[i];
        return v;
    };

    int L, D;
    if (!(in >> L >> D)) return 0;

    // X: 按顺序读取 L*D 个数
    Mat X(L, vector<double>(D));
    for (int i = 0; i < L; ++i)
        for (int j = 0; j < D; ++j) in >> X[i][j];

    Mat Wq1 = readMat(D, D), Wk1 = readMat(D, D), Wv1 = readMat(D, D);
    Mat Wfc1 = readMat(D, D); vector<double> bfc1 = readVec(D);
    Mat Wq2 = readMat(D, D), Wk2 = readMat(D, D), Wv2 = readMat(D, D);
    Mat Wfc2 = readMat(D, D); vector<double> bfc2 = readVec(D);

    Mat Y1 = attn(X, Wq1, Wk1, Wv1);
    Mat Z1 = mul(Y1, Wfc1);
    for (int i = 0; i < L; ++i) for (int j = 0; j < D; ++j) Z1[i][j] += bfc1[j];

    Mat Y2 = attn(Z1, Wq2, Wk2, Wv2);
    Mat Z2 = mul(Y2, Wfc2);
    for (int i = 0; i < L; ++i) for (int j = 0; j < D; ++j) Z2[i][j] += bfc2[j];

    cout.setf(std::ios::fixed);
    cout << setprecision(2);
    bool first = true;
    for (int i = 0; i < L; ++i)
        for (int j = 0; j < D; ++j) {
            if (!first) cout << ",";
            first = false;
            cout << Z2[i][j];
        }
    cout << "\n";
    return 0;
}

题目内容

某工业制造企业在其生产线上部署了多台传感器以监控关键设备(如电机、泵、压缩机等)的运行状态。这些传感器周期性地采集设备的多维度运行数据(如温度、振动、压力、电流、转速等)，每隔固定时间窗口会生成一组时序特征数据。为了实现设备早期故障预警，需要对每一组采集到时序数据进行异常检测和评分。工程师们通过人工标记历史数据集，训练出一套多层自注意力 $(Self-Attention)$ +多层全连接层 $(FC)$ 结构的神经网管模型。现在，为了模型的快速部罢与测试，需要根据题目中给定的网络权重参数，编写代码完成端到端推理，输出每一组传感器时序数据的最终导常分数。结构如下图所示:

具体说明如下：

每一组采集数据为一个二维矩阵，尺寸为 $L$ ， $L$ 采样时序长度， $D$ 为每次采样包含的特征数(如: $10$ 个时间点、每点 $5$ 个特征)。
网络结构为:两层 $Self-Attention$ ，每层后接全连接层 $FC$ ，最终输出异常分数。为简化起见，网络中无非线性激活函数。
$Self-Attenton$ 采用 $Dot-product Attention$ ，计算公式如下:$\operatorname{Attention}(Q, K, V)=\operatorname{softmax}\left(\frac{Q K^{T}}{\sqrt{d_{k}}}\right) V$

输入描述

第一行:序列长度 $L∈[1,10]$ 、特征维度 $D∈ [1,10]$

第二行:输入序列， $L×D$ 个数

第 $3$ ~ $5$ 行:第一层 $SeIfAttention$ 参数 $W_{q1}$ ， $W_k1$ ， $W_v1$ 每行 $D×D$ 个数

第 $6$ 行:第-层 $FC$ 参数 $W_{fc1}$ ， $D×D$ 个数

第 $7$ 行:第一层 $FC$ 偏置 $b_{fc1}$ ， $D$ 个数

第 $8$ ~ $10$ 行:第二层 $Self-Attention$ 参数 $W_{q2}$ ， $W_{k2}$ ， $W_{v2}$ ，每行 $D×D$ 个数

第 $11$ 行:第层 $FC$ 参数 $W_{fc2}$ ， $D×D$ 个数

第 $12$ 行:第二层 $FC$ 编置 $b_{fc2}$ ， $D$ 个数

输出描述

一行，即最终 $FC$ 输出， $L×D$ 个数

注:数据间用返号隔开，输出结果均保留 $2$ 位小数

样例1

输入

4,1
1.00,-3.00,9.50,6.50
-0.20
0.45
-0.20
0.60
0.15
0.23
-0.34
0.50
-0.32
0.05

输出

0.04,0.04,0.05,0.05

说明

输入：

首行: $4\ 1$ 表示序列长度 $L=4$ ，特征维度 $D=1$

第二行: $1.00,-3.00,9.50,6.50$

输入序列为 $4×1=4$ ，即 $4$ 个时刻的传感器数据

第 $3$ ~ $5$ 行:第一层 $SelfAttention$ 参数 $W_{q1}$ , $W_{k1}$ ， $W_{v1}$ (每行 $1$ 个数)

第 $6$ 行:第一层 $FC$ 参数 $W_{fc1}$ ( $1$ 个数)

第7行:第一层 $FC$ 偏置 $b_{fc1}$ ( $1$ 个数)

第 $8$ ~ $10$ 行:第二层 $Self-Attention$ 参数 $W_{q2},W_{k2},W_{V2}$ (每行1个数)

第 $11$ 行:第二层 $FC$ 参数 $W_{fc2}$ ( $1$ 个数)

第 $12$ 行:第二层 $FC$ 偏置 $b_{fc2}$ ( $1$ 个数)

输出：

最终 $FC$ 输出的 $4×1=4$ 个数，英文逗号分隔。

样例2

输入

2,2
1.00,2.00,3.00,4.00
0.10,0.20,0.30,0.40
-0.10,-0.20,-0.30,-0.40
0.50,0.60,0.70,0.80
-0.50,-0.60,-0.70,-0.80
0.01,0.02
0.11,0.12,0.13,0.14
-0.11,-0.12,-0.13,-0.14
0.21,0.22,0.23,0.24
-0.21,-0.22,-0.23,-0.24
0.03,0.04

输出

0.66,0.69,0.66,0.69

说明

输入：

首行: $2\ 2$ 表示序列长度 $L=2$ ，特征维度 $D=2$

第二行: $1.00,2.00,3.00,4.00$

输入序列为 $2×2$ ，分别为第 $1$ 时刻 $(1.00,2.00)$ ，第 $2$ 时刻 $(3.00,4.00)$

第 $3$ ~ $5$ 行:第一层 $SelfAttention$ 参数 $W_{q1}$ ， $W_{k1}$ ， $W_{v1}$ (每行 $2×2$ 个数)

第 $6$ 行:第一层 $FC$ 参数 $W_{fc1}$ ( $2×2$ 个数)

第7行:第一层 $FC$ 偏置 $b_{fc1}$ ( $2$ 个数)

第 $8$ ~ $10$ 行:第二层 $Self-Attention$ 参数 $W_{q2},W_{k2},W_{V2}$ (每行 $2×2$ 个数)

第 $11$ 行:第二层 $FC$ 参数 $W_{fc2}$ ( $2×2$ 个数)

第 $12$ 行:第二层 $FC$ 偏置 $b_{fc2}$ ( $2$ 个数)

输出：

最终 $FC$ 输出的 $2×2=4$ 个数，英文逗号分隔。

#P3562. 第3题-传感器数据分析

第3题-传感器数据分析