Imports a CSV dataset as an IQRdataGENERAL object
The general format of an IQRdataGENERAL dataset is documented in the first table below.
Some columns are required; others are optional and need not be present in the
CSV file. The table defines default values for the optional columns. These optional
columns will be added to the IQRdataGENERAL object with the default settings. For more
information, please see the Details section and visit
https://iqrtools.intiquan.com/doc/book/analysis-dataset-preparation.html for the
most recent version. The version valid for the installed version of IQR Tools is
available from the function call doc_IQRtools().
IQRdataGENERAL(
  input,
  doseNAMES = NULL,
  obsNAMES = NULL,
  cov0 = NULL,
  covT = NULL,
  cat0 = NULL,
  catT = NULL,
  covInfoAdd = NULL,
  catInfoAdd = NULL,
  methodBLLOQ = "M1",
  FLAGforceOverwriteNLMEcols = TRUE,
  FLAGtaskEventsOnly = TRUE,
  FLAGnoNAlocf = FALSE
)
Arguments
- input
Path to a data file, or a data.frame
- doseNAMES
Character string or vector with character strings, defining the names of the events that are to be considered as dose events. These names need to match the entries in the NAME column. If doseNAMES is not provided, doses are identified by ROUTE being not NA.
- obsNAMES
Character string or vector with character strings, defining the names of the events that are to be considered as observation events. These names need to match the entries in the NAME column. Adverse event NAMEs cannot be selected. If obsNAMES is not provided, all records that are not doses will be considered as observations.
- cov0
(Handling covariates that are stored as events in the long format) List, defining the TIME INDEPENDENT CONTINUOUS covariates. Entries in list need to be named by a name that defines the name of the covariate column to create and the value needs to be a character string with the NAME of the event to consider as this covariate. The rule for the definition of these covariates is: If baseline assessments are defined in the BASE column then take the mean. If no baseline defined then use the mean of SCREEN observations. If no SCREEN observations use the mean of all pre-first dose values (TIME<=0). If no pre-first dose values set to NA. Adverse event NAMEs cannot be selected.
- covT
(Handling covariates that are stored as events in the long format) List, defining the TIME DEPENDENT CONTINUOUS covariates. Entries in list need to be named by a name that defines the name of the covariate column to create and the value needs to be a character string with the NAME of the event to consider as this covariate. Carry-backward from first observation is used until first observation. Then carry-forward is used. Adverse event NAMEs cannot be selected.
- cat0
(Handling covariates that are stored as events in the long format) Same as cov0 but for TIME INDEPENDENT CATEGORICAL covariates
- catT
(Handling covariates that are stored as events in the long format) Same as covT but for TIME DEPENDENT CATEGORICAL covariates
- covInfoAdd
(Handling covariates that are stored in additional columns in the dataset) List (no data.frame) with information about additional continuous covariates. The list needs to contain 4 named elements: COLNAME, NAME, UNIT, TIME.VARYING (in this order). Each value of these elements is a vector. COLNAME defines the column names of the additional covariates, NAME the "real" name, UNIT their UNIT, and TIME.VARYING is TRUE or FALSE depending ... Example:
covInfoAdd <- list(COLNAME=c("WT0","AGE0"),NAME=c("Bodyweight at baseline","Age"), UNIT=c("kg","years"),TIME.VARYING=c(FALSE,FALSE))
- catInfoAdd
(Handling covariates that are stored in additional columns in the dataset) List (no data.frame) with information about additional categorical covariates. The list needs to contain 6 named elements: COLNAME, NAME, UNIT, VALUETXT, VALUES, TIME.VARYING (in this order). Each value of these elements is a vector. COLNAME defines the column names of the additional covariates, NAME the "real" name, and UNIT their UNIT. VALUETXT is a string defining a vector in R notation, containing the text version of the different levels. VALUES is a string defining a numeric vector in R notation, containing the numeric version of the different levels (used in the corresponding covariate column), and TIME.VARYING is TRUE or FALSE depending ... Example:
catInfoAdd <- list( COLNAME=c("SEX","FOOD"),NAME=c("Gender","Food taken"), UNIT=c("-","-"),VALUETXT=c("Male,Female", "NO,YES"), VALUES=c("1,2","0,1"),TIME.VARYING=c(FALSE,FALSE))
- methodBLLOQ
Allows specification of the method for handling values below the lower limit of quantification. By default the "M1" method is used. Alternative settings are "M3", "M4", "M5", "M6", and "M7".
- FLAGforceOverwriteNLMEcols
TRUE: if NLME columns (see second table below) are present in the dataset, they will be overwritten with default values. FALSE: they will not be overwritten and are kept as they are. The user then has to ensure that their contents make sense!
- FLAGtaskEventsOnly
For function testing purposes only! Please do not change this setting unless you know EXACTLY what you are doing!
- FLAGnoNAlocf
If FALSE (default) the time dependent covariates will be imputed by a last observation carried forward (LOCF) approach for NA values. If TRUE, then no LOCF imputation will be done and values in the covariate column will be NA.
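The rule given for cov0 (baseline mean, else screening mean, else mean of pre-first-dose values, else NA) can be illustrated with a small base R sketch. The helper `baseline_value` is hypothetical and is not part of IQR Tools; it only mirrors the rule as worded in this page for the records of one subject and one event NAME.

```r
# Hypothetical helper (not an IQR Tools function): derive one
# time-independent covariate value from the records of ONE subject
# and ONE event NAME, following the cov0 rule described above.
baseline_value <- function(d) {
  # Missing BASE/SCREEN columns or NA flags are treated as 0 (assumption).
  base   <- if ("BASE"   %in% names(d)) d$BASE   else rep(0, nrow(d))
  screen <- if ("SCREEN" %in% names(d)) d$SCREEN else rep(0, nrow(d))
  base[is.na(base)]     <- 0
  screen[is.na(screen)] <- 0

  # 1. Mean of baseline assessments, if any are flagged
  if (any(base > 0))   return(mean(d$VALUE[base > 0], na.rm = TRUE))
  # 2. Otherwise mean of screening assessments
  if (any(screen > 0)) return(mean(d$VALUE[screen > 0], na.rm = TRUE))
  # 3. Otherwise mean of all pre-first-dose values (TIME <= 0)
  pre <- !is.na(d$TIME) & d$TIME <= 0
  if (any(pre))        return(mean(d$VALUE[pre], na.rm = TRUE))
  # 4. Otherwise NA
  NA_real_
}

wt <- data.frame(
  TIME   = c(-14, -1, 0, 24),
  VALUE  = c(70, 72, 74, 80),
  BASE   = c(0, 1, 1, 0),
  SCREEN = c(1, 0, 0, 0)
)
baseline_value(wt)  # mean of the two BASE records: 73
```

Setting all BASE flags to 0 in `wt` makes the sketch fall back to the single SCREEN record, and removing both flags falls back to the mean of the three records with TIME <= 0.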
Value
An IQRdataGENERAL object
Details
During import, the following is done:
Sanity checks are made on the dataset and if needed, the user is notified by errors, warnings, and normal messages
Modeling tools related columns are added (see second table below) - if not yet present in the data
Covariate columns might be added (see input arguments cov0, covT, cat0, catT)
Observations below the LLOQ are handled based on the setting of the input argument methodBLLOQ.
Records that are neither dose nor observation records are removed (depending on the setting of the input argument FLAGtaskEventsOnly)
The resulting output argument is of class IQRdataGENERAL.
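For the time-dependent covariates (covT, catT), the arguments describe carry-backward up to the first observation and carry-forward afterwards. The following base R sketch shows that imputation pattern on a single time-ordered vector. `fill_time_varying` is a hypothetical name used for illustration only and is not the IQR Tools implementation.

```r
# Hypothetical helper (not an IQR Tools function).
# x: covariate values of one subject in time order, NA where no value
#    was recorded at that record.
fill_time_varying <- function(x) {
  obs <- which(!is.na(x))
  if (length(obs) == 0) return(x)  # nothing to carry, stays all NA

  # For each position: index (into obs) of the latest observed value at or
  # before that position; 0 means "before the first observation".
  idx <- findInterval(seq_along(x), obs)

  # Before the first observation: carry the first observed value backward.
  idx[idx == 0] <- 1

  # Everywhere else: carry the latest observed value forward.
  x[obs[idx]]
}

fill_time_varying(c(NA, 70, NA, NA, 75, NA))
# 70 70 70 70 75 75
```

With FLAGnoNAlocf = TRUE the documentation states that no LOCF imputation is done, which in this sketch would correspond to returning `x` unchanged.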
Some additional information:
It is good practice to avoid commas in any text entries, as they might interfere with subsequent handling of files in which entries are separated by commas.
The following elements will be interpreted as NA: "."," ","","NA","NaN"
The loaded dataset will be sorted by (STUDY), USUBJID, TIME, (TYPENAME), NAME. Note that this might lead to non-sequential entries in the IXGDF column. The sorting by STUDY and/or TYPENAME is ignored if these columns are not present in the data.
Columns in the imported dataset that do not match column names in the following two tables will be retained in the output argument
In order to identify doses and observations, the user needs to provide the function with additional arguments that identify the NAMEs of doses and observations.
Adverse events (by NAME) cannot be used in the generation of the DV and covariate column information. This might be added in a future version.
STUDYN and TRT will be added automatically as covariate if information available in dataset.
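The NA interpretation and the sorting described in the list above can be mimicked in base R, which may help when preparing or checking a CSV file before import. This is only a sketch using `read.csv` and `order`; the small inline dataset is invented for illustration and the code does not call IQR Tools.

```r
# Invented example data with the different "missing" spellings in VALUE
csv_text <- "USUBJID,TIME,NAME,VALUE
S2,1,PK,.
S1,2,PK,NA
S1,0,Dose,100
S1,2,AGE,
S1,1,PK,NaN"

# Treat the same strings as NA that the documentation lists
d <- read.csv(
  text = csv_text,
  na.strings = c(".", " ", "", "NA", "NaN"),
  stringsAsFactors = FALSE
)

# Record index assigned in file order, before sorting
d$IXGDF <- seq_len(nrow(d))

# Sort by USUBJID, TIME, NAME (STUDY and TYPENAME are absent here)
d <- d[order(d$USUBJID, d$TIME, d$NAME), ]

d$IXGDF  # no longer sequential after sorting
```

The last line illustrates why the IXGDF column can end up non-sequential once the records are sorted.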
Changes in the first data format update in 5 years (Version >= 1.3.0):
During import of a general dataset, the non-required columns are no longer generated with default content. This leads to considerably smaller and more pleasant datasets.
Columns that were essentially never used have been deprecated by removing them from the first table below. The deprecated columns are listed in the third table below. They are still supported to allow backward compatibility.
COLUMN | REQUIRED | DEFAULT | DESCRIPTION |
======== | ======== | =============================== | =============================================================== |
IXGDF | - | 1:nrows | (NUMERIC) Index of record in dataset. Starting from 1, then 2,3,... until last record/row number |
IGNORE | - | NA | Reason/comment related to exclusion of the observation or dose from the analysis. If no entry then event is not ignored |
USUBJID | YES | - | Unique subject identifier |
INDNAME | - | Not included in dataset | Indication name |
IND | - | 1-N for | (NUMERIC) Numeric indication flag |
COMPOUND | - | Not included in dataset | Name of the investigational compound |
STUDY | - | Not included in dataset | Short study name/number |
STUDYN | - | 1-N for | (NUMERIC) Numeric study flag |
TRTNAME | - | Not included in dataset | Name of actual treatment given to subject |
TRT | - | 1-N for | (NUMERIC) Numeric treatment flag |
VISIT | - | Not included in dataset | Visit number |
VISNAME | - | Not included in dataset | Visit name |
BASE | - | 0 | Flag indicating assessments at baseline (0 for non-baseline, 1 for first, 2 for second, ...) |
SCREEN | - | 0 | Flag indicating assessments at screening (0 for non-screening, 1 for first, 2 for second, ...) |
TIME | YES | - | (NUMERIC) Actual time of event relative to first dose administration. |
DURATION | - | 0 | (NUMERIC) Duration of event (-1 if ongoing longer than observation time). |
TIMEUNIT | YES | - | Unit of all numeric time definitions in the dataset ("HOURS", "MINUTES", "DAYS", "SECONDS", "WEEKS", "MONTHS", "YEARS") |
NT | - | Not included in dataset | (NUMERIC) Nominal event time |
PROFNR | - | Not included in dataset | (CHARACTER) Name or number of the profile |
PROFTIME | - | Not included in dataset | (NUMERIC) Profile time - relative to previously given dose |
TYPENAME | - | Not included in dataset | Unique name of type of event. |
NAME | YES | - | Unique short name of event |
VALUE | YES | - | (NUMERIC) Value of event defined by NAME. Not used for Adverse Event records (should be set to 0) |
VALUETXT | - | NA | Text version of value. A VALUETXT can be entered instead of a VALUE. VALUE may also have an entry, but then the same VALUETXT has to be used for the same VALUE within a specific event NAME. If VALUETXT is defined, VALUE can be undefined (NA). VALUETXT only makes sense for categorical information |
UNIT | YES | - | Unit of the value reported in the VALUE column |
OCC | - | Not included in dataset | (NUMERIC) Integer values defining separate occasions for IOV (1-N) |
ULOQ | - | Not included in dataset | (NUMERIC) Upper limit of quantification for event defined by NAME (value only interpreted for observation events) |
LLOQ | - | NA | (NUMERIC) Lower limit of quantification for event defined by NAME (value only interpreted for observation events) |
ROUTE | YES | - | Route of administration ("IV","SUBCUT","ORAL","INHALED","INTRAMUSCULAR","INTRAARTICULAR","RECTAL","TOPICAL","GENERAL_IV","GENERAL_ABS1","GENERAL_ABS0"). If not a dosing record set to: NA |
II | - | 0 | (NUMERIC) Interval of dosing (value only interpreted for dosing events) |
ADDL | - | 0 | (NUMERIC) Number of ADDITIONAL doses given with the specified interval (value only interpreted for dosing events) |
AE | - | Not included in dataset | (NUMERIC) Defines if the record codes an adverse event (0: no, 1: yes) |
AEGRADE | - | NA (if | (NUMERIC) Grade of adverse event |
AESER | - | NA (if | (NUMERIC) Flag (0 or 1) Seriousness of adverse event |
AEDRGREL | - | NA (if | (NUMERIC) Flag (0 or 1) Drug related adverse event or not |
COMMENT | - | Not included in dataset | Additional information for the observation/event |
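As an illustration of the required columns in the table above, the sketch below builds a tiny data.frame containing only the columns marked as required and checks that none is missing. The event names, values, and units are invented. The final call assumes that the package is installed under the name `IQRtools`; whether IQRdataGENERAL accepts exactly this minimal dataset has not been verified here, so that call is guarded and should be read as a usage pattern rather than a tested example.

```r
# Columns marked as required in the first table
required <- c("USUBJID", "TIME", "TIMEUNIT", "NAME", "VALUE", "UNIT", "ROUTE")

# Invented single-subject example: one oral dose and three observations
dat <- data.frame(
  USUBJID  = "S1",
  TIME     = c(0, 1, 2, 4),
  TIMEUNIT = "HOURS",
  NAME     = c("Dose compound X", rep("Plasma concentration X", 3)),
  VALUE    = c(100, 8.1, 5.2, 2.3),
  UNIT     = c("mg", rep("ug/mL", 3)),
  ROUTE    = c("ORAL", NA, NA, NA),  # NA for non-dosing records
  stringsAsFactors = FALSE
)

# Check for missing required columns before attempting the import
missing_cols <- setdiff(required, names(dat))
if (length(missing_cols) > 0) {
  stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
}

# Only attempted if the package is available (assumed package name: IQRtools)
if (requireNamespace("IQRtools", quietly = TRUE)) {
  dataGen <- IQRtools::IQRdataGENERAL(
    dat,
    doseNAMES = "Dose compound X",
    obsNAMES  = "Plasma concentration X"
  )
}
```

Passing doseNAMES and obsNAMES explicitly makes the identification of doses and observations independent of the ROUTE-based default described in the arguments.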
During import the following columns will be added to the dataset. If these columns are already present in the input dataset they can be kept or overwritten (set by the input argument FLAGforceOverwriteNLMEcols).
COLUMN | DESCRIPTION |
======== | =============================================================== |
ID | (NUMERIC) Unique subject ID for modeling software |
TIMEPOS | (NUMERIC) TIME shifted to have TIMEPOS=0 at first event in a subject |
TAD | (NUMERIC) Time since last dose (pre-first-dose values same as TIME). This column does not make a difference between different dose names. It contains the time since last dose, independently of the DOSENAME. If no dose defined in subject it is NA. TAD before the first dose is TIME |
DV | (NUMERIC) Observation value (0 for dosing events). Set to LLOQ if BLOQ handling method is M3 or M4. Set to LLOQ/2 if M5 or M6. Set to 0 if M7. If VALUE undefined but VALUETXT defined, DV will be determined as 1:N for alphabetic ordering of VALUETXT. |
MDV | (NUMERIC) Missing data value column (0 if observation value is defined and IGNORE is NA, 1 for dose records and for NA observation values, 1 for all records that have IGNORE not NA). MDV=1 for values below LLOQ if M1 method. If M6 method then MDV=0 for first DV<LLOQ in a sequence and MDV=1 for the following in a sequence. If the value of an observation or time information is missing, MDV will be set to 1 as well and an entry in IGNORE will be made. |
EVID | (NUMERIC) Event ID. 0 for observations, 1 for dosing records |
CENS | (NUMERIC) Censoring column. Depending on the method for BLLOQ handling this column is set. If M3 or M4 method is chosen then CENS=1 if DV<LLOQ. If M1, M5, M6, or M7 then CENS=0. 0 for all dosing events. |
AMT | (NUMERIC) Dose given at dosing instant (0 for observation records) |
ADM | (NUMERIC) Administration column. 0 for observation events. Number of input for dosing events. Usually defined by the user. Default values if not user defined: If more than one dose is considered, the order of the defined dose NAMEs defines the ADM number (1 for first, 2 for second, ...). If a single dose is considered, then ADM is selected according to the information in ROUTE: 1 for SUBCUT, ORAL, INTRAMUSCULAR, INTRAARTICULAR, RECTAL, INHALED, GENERAL_ABS1; 2 for IV, GENERAL_IV; 3 for TOPICAL, GENERAL_ABS0 |
TINF | (NUMERIC) Infusion time (TIMEUNIT). (0 for observation records, DURATION for dose records) |
RATE | (NUMERIC) Calculated from AMT and TINF |
YTYPE | (NUMERIC) Observation number. 0 for dosing records. 1,2,3,4, ... for observation records. If observations provided in obsNAMES then this order will be used. Non-doses that are not defined in obsNAMES will obtain YTYPE=0 |
DOSE | (NUMERIC) Carry forward of the last defined AMT of a dose event. Values before first dose get the DOSE set to 0. If no dose present in subject DOSE is set to 0. This column does not make a difference between different dose names. It contains the AMT since last dose, independently of the selected doses in doseNAMES |
TADDx | (NUMERIC) Dosing input specific TAD column. Only present if more than one dosing input defined in "doseNAMES". "x" defines the index of the dose NAME in doseNAMES. If a dose NAME does not appear in a subject the value is set to NA. |
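The DV, MDV, and CENS descriptions in the table above imply a different treatment of below-LLOQ observations for each methodBLLOQ setting. The following base R sketch restates those rules for observation records only. `apply_blloq` is a hypothetical helper and not the IQR Tools implementation. Two points are assumptions of this sketch: "below LLOQ" is evaluated on the original value, and a "sequence" for M6 is taken to mean consecutive below-LLOQ records. M3 and M4 are treated identically because the table describes no difference between them for these three columns.

```r
# Hypothetical helper (not an IQR Tools function): DV, MDV and CENS for the
# observation records of one subject and one observation NAME, in time order.
apply_blloq <- function(value, lloq,
                        method = c("M1", "M3", "M4", "M5", "M6", "M7")) {
  method <- match.arg(method)
  lloq   <- rep_len(lloq, length(value))
  below  <- !is.na(value) & !is.na(lloq) & value < lloq

  dv   <- value
  mdv  <- as.numeric(is.na(value))  # NA observations are always MDV = 1
  cens <- rep(0, length(value))

  if (method == "M1") {
    mdv[below] <- 1                 # below-LLOQ values are excluded
  } else if (method %in% c("M3", "M4")) {
    dv[below]   <- lloq[below]      # DV set to LLOQ and flagged as censored
    cens[below] <- 1
  } else if (method %in% c("M5", "M6")) {
    dv[below] <- lloq[below] / 2    # DV set to LLOQ/2
    if (method == "M6") {
      # Keep the first below-LLOQ value of each consecutive run,
      # set MDV = 1 for the ones that directly follow it.
      follows_below <- c(FALSE, head(below, -1))
      mdv[below & follows_below] <- 1
    }
  } else {                          # M7
    dv[below] <- 0
  }
  data.frame(DV = dv, MDV = mdv, CENS = cens)
}

conc <- c(5, 0.4, 0.3, 2, 0.2)
apply_blloq(conc, lloq = 0.5, method = "M6")
```

For the example vector, M6 keeps the first of the two consecutive below-LLOQ values (MDV = 0) and excludes the second (MDV = 1), while the isolated last value starts a new run and is kept.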
The following columns are deprecated. Their usefulness was going towards 0 and no code and functionality relied on them. The general dataset can still include them to ensure backward compatibility.
COLUMN | REQUIRED | DEFAULT | DESCRIPTION |
======== | ======== | =============================== | =============================================================== |
SUBJECT | - | Not included in dataset | Subject number |
CENTER | - | Not included in dataset | Center number |
STUDYDES | - | Not included in dataset | Study title, short description |
PART | - | Not included in dataset | Part of study as defined per protocol (1=part 1, A=part A, ...) |
EXTENS | - | Not included in dataset | Extension of the core study (0=core, 1=extension 1, 2=extension 2, ...) |
TRTNAMER | - | Not included in dataset | Name of treatment to which subject was randomized |
TRTR | - | Not included in dataset | (NUMERIC) Numeric randomized treatment flag |
DATEDAY | - | Not included in dataset | Start date of event (dd-mmm-yyyy) |
DATETIME | - | Not included in dataset | Start time of event (HH:MM:SS) |
See also
Other IQRdataGeneral:
+.IQRdataGENERAL(), IQRcalcTAD(), IQRexpandADDLII(), IQRloadCSVdata(), IQRsaveCSVdata(), addCovariateInfo_IQRdataGENERAL(), addIndivRegressors_IQRdataGENERAL(), addLabel_IQRdataGENERAL(), attributes0(), blloqInfo_IQRdataGENERAL(), blloq_IQRdataGENERAL(), check_IQRdataGENERAL(), clean_IQRdataGENERAL(), combine_IQRdataGENERAL(), convertCat2Text(), covImpute_IQRdataGENERAL(), date2dateday_IQRdataProgramming(), date2datetime_IQRdataProgramming(), date2time_IQRdataProgramming(), exportDEFINE_IQRaedataER(), exportDEFINE_IQRdataGENERAL(), exportDEFINEpdf_IQRdataGENERAL(), exportSYS_IQRdataGENERAL(), export_IQRdataGENERAL(), getLabels_dataframe(), getNAcolNLME_IQRdataGENERAL(), handleSameTimeObs_IQRdataGENERAL(), is_IQRdataGENERAL(), loadATRinfo_csvData(), loadAttributeFile(), load_IQRdataGENERAL(), mapCategoricalCovariate_IQRnlmeProject(), mapCategoricalCovariate_csvData(), mapContinuousCovariate_IQRnlmeProject(), mapContinuousCovariate_csvData(), mutateCov_IQRdataGENERAL(), obfuscate_IQRdataGENERAL(), plot.IQRdataGENERAL(), plotCorCat_IQRdataGENERAL(), plotCorCovCat_IQRdataGENERAL(), plotCorCov_IQRdataGENERAL(), plotCovDistribution_IQRdataGENERAL(), plotDoseSchedule_IQRdataGENERAL(), plotIndiv_IQRdataGENERAL(), plotObsSummarizedByCovCat_IQRdataGENERAL(), plotRange_IQRdataGENERAL(), plotSampleSchedule_IQRdataGENERAL(), plotSpaghetti_IQRdataGENERAL(), print.IQRdataGENERAL(), removeCommata_dataframe(), rmAMT0_IQRdataGENERAL(), rmDosePostLastObs_IQRdataGENERAL(), rmIGNOREd_IQRdataGENERAL(), rmMissingTIMEobsRecords_IQRdataGENERAL(), rmNOobsSUB_IQRdataGENERAL(), rmNonTask_IQRdataGENERAL(), rmPLACEBO_IQRdataGENERAL(), rmSubjects_IQRdataGENERAL(), setIGNORErecords_IQRdataGENERAL(), setMissingDVobsRecordsIGNORE_IQRdataGENERAL(), subset.IQRdataGENERAL(), summary.IQRdataGENERAL(), summaryCat_IQRdataGENERAL(), summaryCov_IQRdataGENERAL(), summaryObservations_IQRdataGENERAL(), transformObs_IQRdataGENERAL(), unlabel_dataframe()