Pipeline for analyzing Electronic Health Records data of the Alzheimer Disease Centers type

Added by Kat Galkina about 2 years ago.

The pipeline would include:

1) (Preferably easy) import of individual-level data
2) Recoding of several codes within a column to missing (for purposes of handling missing data). Example: MMSE uses several codes in the 95-98 range to designate non-compliance, so it should be possible to designate either a range (e.g. above 90) or specific numbers as missing.
3) Filtering on several variables. Example: keep records where a variable is above (>), below (<), or equal to (=) a number X.
4) Run 3 scripts: 1) reshape, 2) add orthogonal variables, 3) run linear regression analysis (these functions are available in R).
The first script reshapes the data from "wide" to "long", so the output has more rows and fewer columns.
The second script adds an additional row of data.
So I believe the output from each script needs to feed into the next.
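
To make the chaining concrete, here is a rough sketch of steps 1 to 4 in plain Python (standard library only), where each function takes the previous function's output. Everything in it is an assumption for illustration: the sample data, the column names (id, age, mmse_v1, mmse_v2), the function names, and the choice of visit number as the regression predictor. It is not the existing R scripts. The "add orthogonal variables" step is left out because the request does not say how those variables are defined.

```python
import csv
import io

# Hypothetical sample: one row per subject, MMSE at two visits ("wide" format).
SAMPLE_CSV = """id,age,mmse_v1,mmse_v2
1,70,28,27
2,82,95,24
3,65,30,98
4,77,22,20
"""


def import_rows(text):
    """Step 1: read individual-level records into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def recode_missing(rows, columns, threshold=90):
    """Step 2: treat any value above `threshold` in `columns` as missing (None)."""
    recoded = []
    for row in rows:
        new_row = dict(row)
        for col in columns:
            value = float(new_row[col])
            new_row[col] = None if value > threshold else value
        recoded.append(new_row)
    return recoded


def filter_rows(rows, column, op, x):
    """Step 3: keep rows where `column` is >, < or = the number x."""
    tests = {
        ">": lambda v: v > x,
        "<": lambda v: v < x,
        "=": lambda v: v == x,
    }
    keep = tests[op]
    # Rows with a missing value in `column` never pass the filter.
    return [r for r in rows if r[column] is not None and keep(float(r[column]))]


def wide_to_long(rows, id_col, value_cols):
    """Step 4, script 1: one output row per (subject, visit); missing values are dropped."""
    long_rows = []
    for row in rows:
        for visit, col in enumerate(value_cols, start=1):
            if row[col] is not None:
                long_rows.append({id_col: row[id_col], "visit": visit, "mmse": row[col]})
    return long_rows


def fit_line(xs, ys):
    """Step 4, script 3: ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Each stage consumes the output of the previous one.
rows = import_rows(SAMPLE_CSV)
rows = recode_missing(rows, ["mmse_v1", "mmse_v2"])
rows = filter_rows(rows, "age", ">", 66)
long_rows = wide_to_long(rows, "id", ["mmse_v1", "mmse_v2"])
intercept, slope = fit_line(
    [r["visit"] for r in long_rows],
    [r["mmse"] for r in long_rows],
)
```

With this sample, the values 95 and 98 become missing, the subject aged 65 is filtered out, and the regression runs on the five remaining long-format rows. Recoding by specific numbers instead of a range would only need a different test inside recode_missing.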

I understand that at least half of this is already done, but it needs to be tested with my data. Thanks, Kat.

