
general ADaM definitions

basic data structure definitions

analysis datasets and ADaM datasets

compares and contrasts analysis datasets in general versus ADaM datasets in particular.

fundamentals of the ADaM standard

fundamental principles

  • must clearly and unambiguously communicate the content and source of the datasets supporting the statistical analyses in a clinical study
  • must provide traceability to show the source or derivation of a value or a variable
  • must be readily usable with commonly available software tools (SAS, for example)
  • must be associated with metadata to facilitate clear and unambiguous communication; the metadata should be clear and machine-readable
  • must be analysis-ready, that is, ready for statistical analysis as delivered

traceability

ADaM datasets and metadata must show clearly how the ADaM datasets were created. To verify the data used as input to the ADaM datasets, and to keep the data compliant with the CDISC standards (both SDTM and ADaM), traceability is used to make the relationship between SDTM and ADaM explicit.
- metadata traceability
* enables understanding of the relationship of an analysis variable to its source dataset(s) and variable(s), and is required for ADaM compliance
* is established by describing, in metadata, the algorithm used or the steps taken to derive or populate an analysis variable from its immediate predecessor
* is also used to establish the relationship between an analysis result and the ADaM datasets
- datapoint traceability
* points directly to the specific predecessor record(s) and should be implemented if practical and feasible
* very helpful when trying to trace a complex data manipulation path
* the BDS and OCCDS structures are designed to support datapoint traceability back to predecessor data

  • when traceability is successfully implemented, it is possible to identify:

    • information in the ADaM datasets that comes from the SDTM data
    • information that is derived or imputed within the ADaM datasets
    • the method used to create derived or imputed data
    • information used for analysis, as well as information that is not used for analysis but is included to support traceability or future analysis

the ADaM data structures

  • must make it possible to identify clearly the data inputs and the algorithms used to create the derived information
  • must make it possible to understand how to use the ADaM dataset to replicate results or to explore alternative analyses
  • the structure of an ADaM dataset does not necessarily limit the type of analysis that can be done
  • other advantages of a standard structure:

    • it eases the burden of managing dataset metadata, because there is less variability in the types of observations and in the variables included
    • software tools can be developed to support metadata management and data review
    • a predictable structure allows conformance to be checked consistently

ADSL (the ADaM subject-level analysis dataset)

  • ADSL contains 1 record per subject
  • contains variables such as subject-level population flags, planned and actual treatment variables, demographic information, randomization factors, subgrouping variables, stratification factors, and important dates
  • it should be noted that although ADSL contains subject-level variables that are also important in other datasets, there is no requirement that every ADSL variable be present in other analysis datasets
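
The one-record-per-subject rule is easy to check mechanically. The sketch below is only an illustration: it assumes ADSL has already been read into a list of dictionaries, and the function name `duplicate_subjects` and the sample values are made up for the example.

```python
from collections import Counter

def duplicate_subjects(adsl_rows):
    """Return the USUBJID values that appear on more than one ADSL row."""
    counts = Counter(row["USUBJID"] for row in adsl_rows)
    return sorted(usubjid for usubjid, n in counts.items() if n > 1)

# Hypothetical ADSL content; the third row breaks the 1-record-per-subject rule.
adsl = [
    {"STUDYID": "STUDY01", "USUBJID": "STUDY01-001", "SAFFL": "Y", "TRT01P": "Drug A"},
    {"STUDYID": "STUDY01", "USUBJID": "STUDY01-002", "SAFFL": "Y", "TRT01P": "Placebo"},
    {"STUDYID": "STUDY01", "USUBJID": "STUDY01-002", "SAFFL": "N", "TRT01P": "Placebo"},
]
print(duplicate_subjects(adsl))
```

An empty result means the structure rule holds for the rows that were checked.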

BDS (the ADaM basic data structure)

  • a BDS dataset contains 1 or more records per subject, per analysis parameter, per analysis timepoint
  • its central variables include the value being analyzed and the description of the value being analyzed; other variables in the dataset provide more information about the value being analyzed, describe and trace its derivation, or enable its analysis
  • may be derived from findings, events, interventions, and special-purpose SDTM domains, other ADaM datasets, or any combination thereof; it also provides robust and flexible support for the performance and review of most statistical analyses
  • a record in an ADaM dataset can represent an observed, derived, or imputed value required for analysis
  • is flexible in that additional rows and columns can be added to support the analyses and provide traceability

standard ADaM variables

ADaM variable conventions

general variable conventions

timing variable conventions

date and time imputation flag variables (table in IG)

  • variables whose names end in DTF are date imputation flags. A DTF flag records the highest level of imputation applied to the corresponding DT variable, relative to the SDTM DTC variable it is based on
  • variables whose names end in TMF are time imputation flags
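
As an illustration of how a date imputation flag can be populated, the sketch below imputes a partial SDTM DTC date and returns the flag alongside it. The imputation rule (missing day becomes day 1, missing month becomes January) is an assumption chosen for the example, not something these notes prescribe, and the function name is hypothetical. The flag values D (day imputed) and M (month and day imputed) follow the usual DTF convention. Only right-truncated dates such as "2023-05" or "2023" are handled.

```python
from datetime import date

def impute_start_date(dtc):
    """Return (ADT, ADTF) for a complete or right-truncated ISO 8601 date string."""
    parts = dtc.split("T")[0].split("-") if dtc else []
    if len(parts) >= 3:
        # Complete date: nothing imputed, so the flag stays null.
        return date(int(parts[0]), int(parts[1]), int(parts[2])), None
    if len(parts) == 2:
        # Day missing: assumed rule imputes the first of the month.
        return date(int(parts[0]), int(parts[1]), 1), "D"
    if len(parts) == 1:
        # Month and day missing: assumed rule imputes 1 January.
        return date(int(parts[0]), 1, 1), "M"
    # No date collected at all: this sketch leaves it missing rather than imputing.
    return None, None

for dtc in ("2023-05-17", "2023-05", "2023", ""):
    print(repr(dtc), impute_start_date(dtc))
```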

flag variable conventions

  • the terms "flag" and "indicator" are used interchangeably within this document and are referred to simply as flags
  • a population flag must be included in a dataset if the dataset is analyzed by the given population. At least 1 population flag is required for datasets used for analysis
  • character and numeric subject-level population flag names end in FL and FN; parameter-level population flag names end in PFL and PFN
  • for subject-level character population flag variables, the values are N (not included) and Y (included); null values are not allowed
  • for subject-level numeric population flag variables, 0 = N and 1 = Y; null values are not allowed
  • for parameter-level character population flag variables, the values are N and Y; null values are not allowed
  • for parameter-level numeric population flag variables, 0 = N and 1 = Y; null values are not allowed
  • in addition to the population flag variables defined in chapter 3, other population flag variables may be added to ADaM datasets as needed, provided they follow the ADaM conventions
  • for character flags ending in FL that are not population flags, the valid values are Y, N, and null
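
The subject-level value rules above can be turned into a small check. This is a sketch under assumptions: the flag stems SAF and ITT (giving SAFFL/SAFFN and ITTFL/ITTFN) are just example populations, the function name is hypothetical, and only the subject-level rules are covered.

```python
CHAR_TO_NUM = {"Y": 1, "N": 0}

def check_population_flags(row, flag_stems=("SAF", "ITT")):
    """Return a list of problems found in the subject-level population flags of one row."""
    problems = []
    for stem in flag_stems:
        fl, fn = stem + "FL", stem + "FN"
        # Character flag: only Y or N, never null.
        if row.get(fl) not in CHAR_TO_NUM:
            problems.append(f"{fl} must be Y or N, got {row.get(fl)!r}")
        # Numeric flag is optional here, but if present it must be 0 or 1
        # and agree with the character flag.
        if fn in row:
            if row[fn] not in (0, 1):
                problems.append(f"{fn} must be 0 or 1, got {row[fn]!r}")
            elif row.get(fl) in CHAR_TO_NUM and CHAR_TO_NUM[row[fl]] != row[fn]:
                problems.append(f"{fl} and {fn} disagree")
    return problems

print(check_population_flags({"SAFFL": "Y", "SAFFN": 1, "ITTFL": "N", "ITTFN": 0}))
print(check_population_flags({"SAFFL": None, "ITTFL": "Y", "ITTFN": 0}))
```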

variable naming fragments

ADSL variables

the structure of ADSL is 1 record per subject, regardless of the type of clinical trial design
the variable tables are all in the IG; it is worth printing them out and reading them carefully
- timing variables notes

  • a set of analysis timing variables can be included in ADSL only if the definitions of all of the variables in the set are fixed across the study
  • if period timing variables are not included in ADSL, the subperiod timing variables must also be excluded from ADSL
  • if the definition of any variable in the set varies, then none of the variables in the set can be included in ADSL
  • if all of the variable definitions in the set are the same across all datasets, then they can be included in ADSL

BDS (ADaM basic data structure) variables

identifier variables for BDS datasets

- STUDYID  study identifier  req
- USUBJID  unique subject identifier  req
- SUBJID  subject identifier for the study  perm
- SITEID  study site identifier  perm
- ASEQ  analysis sequence number  perm

record-level treatment and dose variables for BDS datasets

tables in IG

timing variables for BDS datasets

  • any SDTM timing variable may be copied into ADaM datasets when it supports data traceability and/or shows how ADaM timing variables contrast with SDTM timing data
  • the names of timing variables that do not directly characterize AVAL should be prefixed by a character string

analysis parameter variables for BDS datasets

analysis parameter variables

table in IG


  • PARAM is created to meet an analysis need, not just because something was collected
  • may describe an analysis value that is highly derived from subject data, drawn from any combination of SDTM domains of any class or classes and/or any ADaM datasets
  • describes what is in AVAL or AVALC

for most parameters, only AVAL or AVALC will be populated, not both

  • when PARAM describes the numeric score of an individual question from a questionnaire, AVAL contains the score and AVALC can be populated with the question answer text. Populating AVALC with the question answer text supports review
  • there is a one-to-one relationship between AVAL and AVALC on the rows on which both are populated
  • when PARAM describes a character-valued response drawn from a set of possible values, the result is contained in AVALC; if AVAL is also populated, AVAL and AVALC must again map one-to-one
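
The one-to-one requirement between AVAL and AVALC can be checked per parameter. The sketch below assumes BDS rows held as dictionaries and treats None or an empty string as "not populated"; the function name and the questionnaire codes QS01 and QS02 are invented for the example.

```python
from collections import defaultdict

def one_to_one_violations(rows):
    """Return PARAMCD values where AVAL and AVALC are not one-to-one on rows populating both."""
    aval_to_avalc = defaultdict(set)
    avalc_to_aval = defaultdict(set)
    for row in rows:
        aval, avalc = row.get("AVAL"), row.get("AVALC")
        if aval is None or avalc in (None, ""):
            continue  # the rule only applies where both are populated
        aval_to_avalc[(row["PARAMCD"], aval)].add(avalc)
        avalc_to_aval[(row["PARAMCD"], avalc)].add(aval)
    bad = set()
    for mapping in (aval_to_avalc, avalc_to_aval):
        for (paramcd, _), partners in mapping.items():
            if len(partners) > 1:
                bad.add(paramcd)
    return sorted(bad)

rows = [
    {"PARAMCD": "QS01", "AVAL": 1, "AVALC": "Mild"},
    {"PARAMCD": "QS01", "AVAL": 2, "AVALC": "Moderate"},
    {"PARAMCD": "QS01", "AVAL": 1, "AVALC": "Mild"},
    {"PARAMCD": "QS02", "AVAL": 1, "AVALC": "Mild"},
    {"PARAMCD": "QS02", "AVAL": 1, "AVALC": "Slight"},  # same score, different text
]
print(one_to_one_violations(rows))
```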

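analysis descriptor variables for BDS datasets

within a given parameter of a BDS dataset, it is important to be able to distinguish the special cases of AVAL/AVALC and to know which method or algorithm was used to populate them; this is what the DTYPE variable is for
the following are some examples of when DTYPE is required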

  • consider a situation where the analysis value for a parameter is populated by copying a value from an SDTM dataset, unless that value is missing, in which case it is filled in by a special-case method
  • consider a situation where the analysis value for a parameter is populated from the subject's corresponding value in another parameter or dataset, unless the value is outside a specified range
  • consider a situation where, in addition to a subject's analysis value for each visit, an additional timepoint called "post-baseline" is to be identified, with the analysis value populated with the average of the values from the subject's on-treatment visits

DTYPE should be populated in the following situations:

  • a new record is created within an existing parameter to support a derived timepoint, with the analysis value computed by a derivation algorithm such as endpoint, minimum, maximum, or average post-baseline
  • a new record is created within an existing parameter to represent an imputed value for a subject's missing value
  • the analysis value on an existing record is modified according to a prespecified algorithm
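
The "average post-baseline" case can be sketched as follows. The rows are plain dictionaries, on-treatment records are assumed to be marked with ONTRTFL = "Y", the AVISIT label "Post-Baseline" and the function name are illustrative choices, and the DTYPE value AVERAGE is used for the derived row.

```python
def add_post_baseline_average(rows):
    """Append one DTYPE='AVERAGE' row per subject and parameter, averaging on-treatment values."""
    groups = {}
    for row in rows:
        if row.get("ONTRTFL") == "Y" and row.get("AVAL") is not None:
            groups.setdefault((row["USUBJID"], row["PARAMCD"]), []).append(row["AVAL"])
    derived = []
    for (usubjid, paramcd), values in groups.items():
        derived.append({
            "USUBJID": usubjid,
            "PARAMCD": paramcd,
            "AVISIT": "Post-Baseline",           # hypothetical label for the derived timepoint
            "AVAL": sum(values) / len(values),
            "DTYPE": "AVERAGE",                  # marks the row as derived, not observed
        })
    return rows + derived

bds = [
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVISIT": "Baseline", "AVAL": 120.0, "ONTRTFL": None, "DTYPE": None},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVISIT": "Week 2", "AVAL": 130.0, "ONTRTFL": "Y", "DTYPE": None},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVISIT": "Week 4", "AVAL": 126.0, "ONTRTFL": "Y", "DTYPE": None},
]
result = add_post_baseline_average(bds)
print(result[-1])
```

The observed rows are left untouched with a null DTYPE, so a reviewer can still see which values fed the average.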

time-to-event variables for BDS datasets

toxicity and range variables for BDS datasets

indicator variables for BDS datasets

important points about the use of flag variables

datapoint traceability variables

  • variables to support datapoint traceability should be included whenever practical and feasible
  • variables used for datapoint traceability may also include any other variables that facilitate transparency and clarity of derivations and analysis
  • if the value of AVAL or AVALC in the ADaM dataset is taken from another ADaM dataset
    • SRCDOM, SRCVAR, and SRCSEQ should contain, respectively, the name of the source ADaM dataset, the name of the source variable, and the ASEQ value of the row where the source datapoint is located
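    • when the source is a supplemental qualifier dataset, use the sequence number (--SEQ, where "--" is the two-letter domain prefix) of the related domain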
  • if all values of AVAL or AVALC in the ADaM dataset are taken from a single SDTM domain
    • all records in the ADaM dataset would have the same values for SRCDOM and SRCVAR
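
A rough sketch of how SRCDOM, SRCVAR, and SRCSEQ let a reviewer walk from an analysis value back to its source datapoint. The data layout (a dictionary of source datasets, each a list of row dictionaries), the function name, and the rule "names starting with AD are ADaM datasets keyed by ASEQ, anything else is an SDTM domain keyed by --SEQ" are assumptions made for the example.

```python
def trace_source(adam_row, sources):
    """Follow SRCDOM/SRCVAR/SRCSEQ from an ADaM row back to the value it was taken from."""
    domain = adam_row["SRCDOM"]
    # Assumed convention: ADaM datasets are keyed by ASEQ, SDTM domains by <domain>SEQ.
    seq_var = "ASEQ" if domain.startswith("AD") else domain + "SEQ"
    for src in sources[domain]:
        if src["USUBJID"] == adam_row["USUBJID"] and src[seq_var] == adam_row["SRCSEQ"]:
            return src[adam_row["SRCVAR"]]
    return None  # no matching source record found

sources = {
    "VS": [
        {"USUBJID": "001", "VSSEQ": 1, "VSTESTCD": "SYSBP", "VSSTRESN": 120.0},
        {"USUBJID": "001", "VSSEQ": 2, "VSTESTCD": "SYSBP", "VSSTRESN": 130.0},
    ]
}
advs_row = {
    "USUBJID": "001", "PARAMCD": "SYSBP", "AVAL": 130.0,
    "SRCDOM": "VS", "SRCVAR": "VSSTRESN", "SRCSEQ": 2,
}
print(trace_source(advs_row, sources) == advs_row["AVAL"])
```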

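Analysis-enabling variables

a class of variables that enable 1 or more of the analyses that the dataset was designed to support

differences between SDTM and ADaM population and baseline flags

  • SDTM includes supplemental qualifier values for subject-level population flags
  • there is a conceptual mapping from those terms to ADaM indicator variables
    • ADaM subject-level population flags may not match their conceptual counterparts in SDTM; it is neither necessary nor justified to change these variables to force agreement. ADaM also supports parameter-level and record-level population flags, which do not exist in SDTM
    • baseline flags may not match either: some baselines are defined only in ADaM, and when analyses use different definitions of baseline, additional flags are used to handle those cases
    • for analysis purposes, the population and baseline flag values to use are the ones in the ADaM datasets, and the flags should be described in the ADaM metadata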

implementation issues, standard solutions, and examples

creation of derived columns versus creation of derived rows

  • provides specific rules to use in building a BDS dataset
  • all analysis-enabling variables and supportive variables are included in a predictable structure, while preventing "horizontalization" of the dataset
  • the rows in an ADaM BDS dataset represent subject data for analysis parameters and timepoints; how the rows are laid out depends on factors such as the number of observations and the baseline
  • contains a central set of columns that represent the data being analyzed: the value being analyzed and the description of that value
  • the BDS is flexible in that derived data can be added to the collected data as additional rows and columns that support the analyses and provide traceability
  • the precise sequence of steps involved in creating a BDS ADaM dataset varies according to operational and study-specific needs
    • create an initial dataset from the source dataset

      • as a first step, create the initial set of rows and columns
    • add additional derived data as needed for the analysis, following the 6 rules below

rules for the creation of rows and columns

Rule 1: A parameter-invariant function of AVAL and BASE on the same row that does not involve a transform of BASE should be added as a new column

  • the function is of AVAL and, optionally, BASE on the same row
  • the function is parameter-invariant
  • the function does not involve a transform of BASE
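
Change from baseline is the usual example of a Rule 1 function: it uses only AVAL and BASE on the same row, has the same form for every parameter, and does not transform BASE. A minimal sketch, assuming rows held as dictionaries and a hypothetical function name; CHG and PCHG are the column names commonly used for change and percent change from baseline.

```python
def add_change_columns(rows):
    """Add CHG and PCHG columns computed from AVAL and BASE on the same row."""
    out = []
    for row in rows:
        new = dict(row)
        aval, base = row.get("AVAL"), row.get("BASE")
        if aval is None or base is None:
            new["CHG"], new["PCHG"] = None, None
        else:
            new["CHG"] = aval - base
            # Percent change is undefined when the baseline is zero.
            new["PCHG"] = 100 * (aval - base) / base if base != 0 else None
        out.append(new)
    return out

rows = [
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVISIT": "Week 2", "AVAL": 150.0, "BASE": 120.0},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVISIT": "Week 4", "AVAL": None, "BASE": 120.0},
]
with_change = add_change_columns(rows)
print(with_change[0]["CHG"], with_change[0]["PCHG"])
```

No rows are added: the result is wider, not longer, which is what distinguishes Rule 1 from the rules that create new parameters or rows.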

Rule 2: A transformation of AVAL that does not meet the conditions of Rule 1 should be added as a new parameter, and AVAL should contain the transformed value

  • the creation of a new parameter results, by definition, in the creation of a new set of rows

Rule 3: A function of one or more rows within the same parameter for the purpose of creating an analysis timepoint should be added as a new row for the same parameter


Rule 4: A function of multiple rows within a parameter should be added as a new parameter

  • Rule 4 is a special case of Rule 2: the functions it covers fail the conditions of Rule 1, specifically the first and third conditions

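Rule 5: A function of more than one parameter should be added as a new parameter

  • it is often necessary to derive, for analysis, a parameter that was not collected; this rule covers deriving a parameter from other parameters that already exist in the dataset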
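
Body mass index is a familiar illustration of Rule 5, since it is computed from two other parameters. The sketch below assumes weight in kg under PARAMCD "WEIGHT" and height in cm under PARAMCD "HEIGHT"; those codes, the PARAM text, and the function name are choices made for the example.

```python
def add_bmi_parameter(rows):
    """Append a BMI row for each subject and visit that has both WEIGHT (kg) and HEIGHT (cm)."""
    by_key = {}
    for row in rows:
        by_key.setdefault((row["USUBJID"], row["AVISIT"]), {})[row["PARAMCD"]] = row["AVAL"]
    derived = []
    for (usubjid, avisit), values in by_key.items():
        if "WEIGHT" in values and "HEIGHT" in values:
            height_m = values["HEIGHT"] / 100
            derived.append({
                "USUBJID": usubjid,
                "AVISIT": avisit,
                "PARAMCD": "BMI",
                "PARAM": "Body Mass Index (kg/m2)",
                "AVAL": values["WEIGHT"] / height_m ** 2,
            })
    return rows + derived

rows = [
    {"USUBJID": "001", "AVISIT": "Baseline", "PARAMCD": "WEIGHT", "PARAM": "Weight (kg)", "AVAL": 80.0},
    {"USUBJID": "001", "AVISIT": "Baseline", "PARAMCD": "HEIGHT", "PARAM": "Height (cm)", "AVAL": 200.0},
    {"USUBJID": "002", "AVISIT": "Baseline", "PARAMCD": "WEIGHT", "PARAM": "Weight (kg)", "AVAL": 70.0},
]
with_bmi = add_bmi_parameter(rows)
print(with_bmi[-1])
```

Subject 002 has no height record, so no BMI row is created for that subject.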

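Rule 6: When there is more than one definition of baseline, each additional definition of baseline requires the creation of its own set of rows

  • under this rule there are multiple sets of rows, each corresponding to a particular definition of baseline
  • a BASETYPE column is required, identifying on each row the baseline definition that the value of BASE corresponds to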

inclusion of all observed and derived records for a parameter versus the subset of records used for analysis

  • discusses whether the ADaM dataset should include all rows of an analysis parameter, or only the subset of rows that are used for analysis
  • a value of AVAL or AVALC for an analysis parameter at a specific timepoint may be observed

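ADaM methodology

  • include all observed and derived rows for a given analysis parameter, including rows not used in the analysis; this approach then requires a way to identify the rows that are used in the analysis
  • advantages: correctness is easier to verify, and traceability and flexibility are preserved
  • it also enables other analyses, including sensitivity analyses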

inclusion of input data that are not analyzed but that support a derivation in the ADaM dataset

  • addresses the broader issue of whether an ADaM dataset should contain the input data used in the derivation of the analysis data as well as the actual data analyzed

    • input data rows and columns that support traceability of the derivation of analyzed rows and columns
    • raw or derived predecessor parameters that are not analyzed themselves but are used to derive an analyzed parameter

ADaM Methodology

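  • makes the intended analyses easier to perform
  • achieves traceability, with the derivation algorithms described in the metadata
  • includes as much supporting data as is practical to ensure traceability
  • contains the analysis data together with the input data, so that the algorithm used to derive the analysis data is clearer and traceable
  • may also contain rows not used in the analysis, rows holding input data, and rows holding intermediate values computed during the derivation of the analysis data
  • flags or other columns are used to distinguish the various types of data and to provide traceability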

identification of records used for analysis

identification of records used in a timepoint imputation analysis

  • considers how to identify records used in a timepoint-related imputation analysis, as well as how to represent data imputation for missing timepoints in an ADaM dataset

ADaM Methodology

  • when an analysis timepoint is missing, the ADaM methodology is to create a new record in the ADaM dataset to represent the missing timepoint and to identify these imputed records by populating the derivation type variable DTYPE
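
Last observation carried forward (LOCF) is one common way such a record is created; it is used here purely as an example of an imputation method, not as the method these notes require. The sketch works on the rows of a single subject and parameter, takes the planned visit order as an argument, and uses a hypothetical function name.

```python
def add_locf_records(rows, visits):
    """For one subject and parameter, fill missing visits by carrying the last observed value forward."""
    by_visit = {row["AVISIT"]: row for row in rows}
    out, last = [], None
    for visit in visits:
        if visit in by_visit:
            last = by_visit[visit]
            out.append(last)
        elif last is not None:
            imputed = dict(last)        # copy the last observed record
            imputed["AVISIT"] = visit   # re-label it as the missing timepoint
            imputed["DTYPE"] = "LOCF"   # identify the record as imputed
            out.append(imputed)
    return out

observed = [
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVISIT": "Week 2", "AVAL": 130.0, "DTYPE": None},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVISIT": "Week 4", "AVAL": 126.0, "DTYPE": None},
]
filled = add_locf_records(observed, ["Week 2", "Week 4", "Week 8"])
print(filled[-1])
```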

identification of baseline records

ADaM Methodology

  • create a baseline flag column to indicate the record used as baseline; no new record is needed when the baseline is an existing record rather than a derived value
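
A sketch of populating a baseline flag. The baseline definition used here (the last non-missing record on or before study day 1, per subject and parameter) is an assumption for the example; the real definition comes from the analysis plan. ABLFL is the usual name of the baseline record flag and ADY the analysis relative day; the function name is hypothetical.

```python
def flag_baseline(rows):
    """Return copies of rows with ABLFL='Y' on the assumed baseline record, null elsewhere."""
    best = {}
    for i, row in enumerate(rows):
        if row["AVAL"] is None or row["ADY"] > 1:
            continue  # missing values and post-dose records cannot be baseline
        key = (row["USUBJID"], row["PARAMCD"])
        if key not in best or row["ADY"] >= rows[best[key]]["ADY"]:
            best[key] = i
    return [
        dict(row, ABLFL="Y" if best.get((row["USUBJID"], row["PARAMCD"])) == i else None)
        for i, row in enumerate(rows)
    ]

rows = [
    {"USUBJID": "001", "PARAMCD": "SYSBP", "ADY": -7, "AVAL": 118.0},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "ADY": 1, "AVAL": 120.0},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "ADY": 15, "AVAL": 130.0},
]
flagged = flag_baseline(rows)
print([r["ABLFL"] for r in flagged])
```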

identification of post-baseline conceptual timepoint records

  • some analyses involve cross-timepoint derivations

ADaM Methodology

  • create a new record with a unique value of AVISIT in cases where the analysis is based on AVISIT; the unique AVISIT value then identifies the record to be used for analysis

identification of records used for analysis - general case

  • it is important to identify the records used in, or excluded from, an analysis

ADaM Methodology

  • use an analysis flag (ANLzzFL) to indicate the records that fulfill specific requirements for 1 or more analyses
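
As an illustration of an analysis flag, the sketch below sets ANL01FL on one record per subject, parameter, and analysis visit. The selection rule (keep the latest record in the visit window, by ADY) is an assumption for the example, and the function name is hypothetical; an actual study would take the rule from its analysis plan and describe it in metadata.

```python
def flag_analysis_records(rows):
    """Return copies of rows with ANL01FL='Y' on the latest record per subject, parameter, and AVISIT."""
    latest = {}
    for i, row in enumerate(rows):
        key = (row["USUBJID"], row["PARAMCD"], row["AVISIT"])
        if key not in latest or row["ADY"] > rows[latest[key]]["ADY"]:
            latest[key] = i
    chosen = set(latest.values())
    return [dict(row, ANL01FL="Y" if i in chosen else None) for i, row in enumerate(rows)]

rows = [
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVISIT": "Week 2", "ADY": 13, "AVAL": 131.0},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVISIT": "Week 2", "ADY": 16, "AVAL": 129.0},
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVISIT": "Week 4", "ADY": 29, "AVAL": 126.0},
]
flagged = flag_analysis_records(rows)
print([r["ANL01FL"] for r in flagged])
```

All three records stay in the dataset; the flag only marks which ones a given analysis should pick up.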

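identification of population-specific analyzed records

  • given that whether a record is analyzed depends on the subject-level population definitions, the question is how best to select records for each analysis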

ADaM Methodology

  • create a single ADaM dataset that can be used to perform multiple analyses, using population flag variables to identify the records that are used for each type of analysis
  • the advantage is that a single dataset serves multiple analyses

identification of records which satisfy a predefined criterion for analysis purpose

criteria are often defined to group results based on the collected value’s relationship to one or more algorithmic conditions

when the criterion has binary responses

ADaM Methodology

  • provides an analysis criterion variable, CRITy, paired with a criterion evaluation result flag, CRITyFL, to identify whether a criterion is met
  • CRITy is populated with a text description defining the conditions necessary to satisfy the criterion
  • CRITyFL is a character indicator of whether the criterion described in CRITy was met
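
The CRITy/CRITyFL pairing can be sketched as below. The criterion text, the threshold of 160, the function name, and the choice to populate CRITy on every row (with CRITyFL left null when AVAL is missing) are all assumptions made for the illustration.

```python
def add_criterion(rows, text, predicate, y=1):
    """Return copies of rows with CRITy set to the criterion text and CRITyFL set to Y, N, or null."""
    crit, flag = f"CRIT{y}", f"CRIT{y}FL"
    out = []
    for row in rows:
        new = dict(row)
        new[crit] = text
        if row["AVAL"] is None:
            new[flag] = None  # the criterion cannot be evaluated without a value
        else:
            new[flag] = "Y" if predicate(row["AVAL"]) else "N"
        out.append(new)
    return out

rows = [
    {"USUBJID": "001", "PARAMCD": "SYSBP", "AVAL": 150.0},
    {"USUBJID": "002", "PARAMCD": "SYSBP", "AVAL": 170.0},
    {"USUBJID": "003", "PARAMCD": "SYSBP", "AVAL": None},
]
with_crit = add_criterion(rows, "Systolic BP > 160 mmHg", lambda aval: aval > 160)
print([(r["CRIT1"], r["CRIT1FL"]) for r in with_crit])
```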

when the criterion has multiple responses

ADaM Methodology

  • provides an analysis criterion variable, MCRITy, paired with a criterion evaluation result variable, MCRITyML, to identify which level of a multiple-response criterion is met
  • MCRITy is populated with a text description of the multi-level criterion
  • MCRITyML identifies which level of the criterion described in MCRITy is met

Other issues to consider

adding records to create a full complement of analysis timepoints for every subject

  • for subjects who did not attend every assessment, records can be created in ADaM for the missing timepoints

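creating multiple datasets to support analysis of the same type of data

  • the SAP often specifies that the same type of data will be analyzed using slightly different methods
  • ADaM provides variables that can be used to identify records for different purposes, but this does not mean that a producer cannot submit multiple similar datasets, each designed for a specific analysis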

size of ADaM datasets

  • consider the size of the datasets, to avoid problems in software when they are transferred and loaded
  • how the datasets are handled should be discussed with the recipients and clearly documented

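traceability when the multiple imputation method is used

  • create multiple datasets, filling in a plausible value for each missing data value
  • analyze each of these datasets with the required statistical procedure to produce statistical estimates
  • combine the resulting estimates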
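
The combining step is commonly done with Rubin's rules. These notes do not name a pooling method, so treating Rubin's rules as the method is an assumption of this sketch. Given m point estimates and their variances, one per imputed dataset, the pooled estimate is their mean and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name is hypothetical.

```python
def pool_estimates(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules; returns (pooled estimate, total variance)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least 2 imputations and one variance per estimate")
    q_bar = sum(estimates) / m                               # pooled point estimate
    u_bar = sum(variances) / m                               # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)   # between-imputation variance
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total

pooled, total_var = pool_estimates([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

For traceability, each imputed dataset and the per-dataset estimates would be kept so that the pooled result can be reproduced.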

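copying values onto a new record

  • when a record is derived from multiple records, the derived record keeps the values of those variables that are constant across all of the original records, provided they have the same meaning on the new record as on the originals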

