Modelling Complex Survey Data Using R, SAS, SPSS and Stata: A Comparison Using CLSA Datasets
2020 · Open Access · DOI: https://doi.org/10.48550/arxiv.2010.09879
The R software has become popular among researchers due to its flexibility and open-source nature. However, researchers in the fields of public health and epidemiology are more accustomed to commercial statistical software such as SAS, SPSS and Stata. This paper provides a comprehensive comparison of health survey data analysis using the R survey package, SAS, SPSS and Stata. We describe detailed R code, and the corresponding procedures for the other software packages, for commonly encountered statistical analyses, such as estimation of population means and regression analysis, using datasets from the Canadian Longitudinal Study on Aging (CLSA). It is hoped that the paper stimulates interest among health science researchers in carrying out data analysis using R and also serves as a cookbook for statistical analysis using different software packages.
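As a rough illustration of what "estimation of population means" involves for weighted survey data, the sketch below computes a weighted (Hájek-type) mean in Python. It is not the paper's code: the paper works with the R survey package and the commercial packages, and the function name, variable names and toy numbers here are all hypothetical. Design-based variance estimation, which the survey packages also provide, is deliberately left out.

```python
def weighted_mean(values, weights):
    """Weighted (Hajek-type) estimate of a population mean.

    Each weight is read as the number of population units that the
    sampled respondent represents. This is an illustrative assumption,
    not a description of the CLSA weighting scheme.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical toy sample: three respondents with unequal sampling weights.
ages = [50, 60, 70]
sampling_weights = [100.0, 100.0, 300.0]

estimate = weighted_mean(ages, sampling_weights)
unweighted = sum(ages) / len(ages)
print(estimate, unweighted)
```

Here the weighted estimate is pulled toward the third respondent, who stands in for more of the population, which is why ignoring the weights can bias the estimate.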
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2010.09879, https://arxiv.org/pdf/2010.09879
- OA Status: green
- References: 2
- Related Works: 10
- OpenAlex ID: https://openalex.org/W3094241084
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W3094241084 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2010.09879 (Digital Object Identifier)
- Title: Modelling Complex Survey Data Using R, SAS, SPSS and Stata: A Comparison Using CLSA Datasets (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2020 (year of publication)
- Publication date: 2020-10-19 (full publication date if available)
- Authors: Hon Yiu So, Urun Erbas Oz, Lauren Miller Griffith, Susan Kirkland, Jinhua Ma, Parminder Raina, Nazmul Sohel, Mary Thompson, Christina Wolfson, Changbao Wu (list of authors in order)
- Landing page: https://arxiv.org/abs/2010.09879 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2010.09879 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2010.09879 (direct OA link when available)
- Concepts: R package, Software, Computer science, Flexibility (engineering), Statistical software, Data science, Regression analysis, Statistical analysis, Data mining, Software package, Survey data collection, Statistics, Mathematics, Machine learning, Computational science, Programming language (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- References (count): 2 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W3094241084 |
| doi | https://doi.org/10.48550/arxiv.2010.09879 |
| ids.doi | https://doi.org/10.48550/arxiv.2010.09879 |
| ids.mag | 3094241084 |
| ids.openalex | https://openalex.org/W3094241084 |
| fwci | |
| type | preprint |
| title | Modelling Complex Survey Data Using R, SAS, SPSS and Stata: A Comparison Using CLSA Datasets |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10866 |
| topics[0].field.id | https://openalex.org/fields/27 |
| topics[0].field.display_name | Medicine |
| topics[0].score | 0.6305999755859375 |
| topics[0].domain.id | https://openalex.org/domains/4 |
| topics[0].domain.display_name | Health Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2739 |
| topics[0].subfield.display_name | Public Health, Environmental and Occupational Health |
| topics[0].display_name | Nutritional Studies and Diet |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2984074130 |
| concepts[0].level | 2 |
| concepts[0].score | 0.666056752204895 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q73539779 |
| concepts[0].display_name | R package |
| concepts[1].id | https://openalex.org/C2777904410 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6462762355804443 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q7397 |
| concepts[1].display_name | Software |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.6182385683059692 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C2780598303 |
| concepts[3].level | 2 |
| concepts[3].score | 0.6064727306365967 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q65921492 |
| concepts[3].display_name | Flexibility (engineering) |
| concepts[4].id | https://openalex.org/C2987757206 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5942795276641846 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q494735 |
| concepts[4].display_name | Statistical software |
| concepts[5].id | https://openalex.org/C2522767166 |
| concepts[5].level | 1 |
| concepts[5].score | 0.5108830332756042 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2374463 |
| concepts[5].display_name | Data science |
| concepts[6].id | https://openalex.org/C152877465 |
| concepts[6].level | 2 |
| concepts[6].score | 0.46800580620765686 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q208042 |
| concepts[6].display_name | Regression analysis |
| concepts[7].id | https://openalex.org/C2986587452 |
| concepts[7].level | 2 |
| concepts[7].score | 0.4504387080669403 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q938438 |
| concepts[7].display_name | Statistical analysis |
| concepts[8].id | https://openalex.org/C124101348 |
| concepts[8].level | 1 |
| concepts[8].score | 0.44551706314086914 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q172491 |
| concepts[8].display_name | Data mining |
| concepts[9].id | https://openalex.org/C3020440742 |
| concepts[9].level | 3 |
| concepts[9].score | 0.444801926612854 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q1176855 |
| concepts[9].display_name | Software package |
| concepts[10].id | https://openalex.org/C198477413 |
| concepts[10].level | 2 |
| concepts[10].score | 0.43599367141723633 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q7647069 |
| concepts[10].display_name | Survey data collection |
| concepts[11].id | https://openalex.org/C105795698 |
| concepts[11].level | 1 |
| concepts[11].score | 0.3565610647201538 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q12483 |
| concepts[11].display_name | Statistics |
| concepts[12].id | https://openalex.org/C33923547 |
| concepts[12].level | 0 |
| concepts[12].score | 0.15234774351119995 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[12].display_name | Mathematics |
| concepts[13].id | https://openalex.org/C119857082 |
| concepts[13].level | 1 |
| concepts[13].score | 0.11253178119659424 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[13].display_name | Machine learning |
| concepts[14].id | https://openalex.org/C459310 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q117801 |
| concepts[14].display_name | Computational science |
| concepts[15].id | https://openalex.org/C199360897 |
| concepts[15].level | 1 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[15].display_name | Programming language |
| keywords[0].id | https://openalex.org/keywords/r-package |
| keywords[0].score | 0.666056752204895 |
| keywords[0].display_name | R package |
| keywords[1].id | https://openalex.org/keywords/software |
| keywords[1].score | 0.6462762355804443 |
| keywords[1].display_name | Software |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.6182385683059692 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/flexibility |
| keywords[3].score | 0.6064727306365967 |
| keywords[3].display_name | Flexibility (engineering) |
| keywords[4].id | https://openalex.org/keywords/statistical-software |
| keywords[4].score | 0.5942795276641846 |
| keywords[4].display_name | Statistical software |
| keywords[5].id | https://openalex.org/keywords/data-science |
| keywords[5].score | 0.5108830332756042 |
| keywords[5].display_name | Data science |
| keywords[6].id | https://openalex.org/keywords/regression-analysis |
| keywords[6].score | 0.46800580620765686 |
| keywords[6].display_name | Regression analysis |
| keywords[7].id | https://openalex.org/keywords/statistical-analysis |
| keywords[7].score | 0.4504387080669403 |
| keywords[7].display_name | Statistical analysis |
| keywords[8].id | https://openalex.org/keywords/data-mining |
| keywords[8].score | 0.44551706314086914 |
| keywords[8].display_name | Data mining |
| keywords[9].id | https://openalex.org/keywords/software-package |
| keywords[9].score | 0.444801926612854 |
| keywords[9].display_name | Software package |
| keywords[10].id | https://openalex.org/keywords/survey-data-collection |
| keywords[10].score | 0.43599367141723633 |
| keywords[10].display_name | Survey data collection |
| keywords[11].id | https://openalex.org/keywords/statistics |
| keywords[11].score | 0.3565610647201538 |
| keywords[11].display_name | Statistics |
| keywords[12].id | https://openalex.org/keywords/mathematics |
| keywords[12].score | 0.15234774351119995 |
| keywords[12].display_name | Mathematics |
| keywords[13].id | https://openalex.org/keywords/machine-learning |
| keywords[13].score | 0.11253178119659424 |
| keywords[13].display_name | Machine learning |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2010.09879 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2010.09879 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2010.09879 |
| locations[1].id | doi:10.48550/arxiv.2010.09879 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2010.09879 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5021626244 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-0068-4302 |
| authorships[0].author.display_name | Hon Yiu So |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Hon Yiu So |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5072653764 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-1274-5715 |
| authorships[1].author.display_name | Urun Erbas Oz |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Urun Erbas Oz |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5070818605 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-3565-9134 |
| authorships[2].author.display_name | Lauren Miller Griffith |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Lauren Griffith |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5010633402 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-8182-2369 |
| authorships[3].author.display_name | Susan Kirkland |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Susan Kirkland |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5100357283 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-6870-048X |
| authorships[4].author.display_name | Jinhua Ma |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Jinhua Ma |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5112188286 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Parminder Raina |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Parminder Raina |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5029198137 |
| authorships[6].author.orcid | https://orcid.org/0000-0002-0269-4937 |
| authorships[6].author.display_name | Nazmul Sohel |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Nazmul Sohel |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5048199161 |
| authorships[7].author.orcid | https://orcid.org/0000-0001-5110-7236 |
| authorships[7].author.display_name | Mary Thompson |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Mary E. Thompson |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5076597853 |
| authorships[8].author.orcid | https://orcid.org/0000-0002-0213-8711 |
| authorships[8].author.display_name | Christina Wolfson |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Christina Wolfson |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5045980590 |
| authorships[9].author.orcid | https://orcid.org/0000-0001-6122-4225 |
| authorships[9].author.display_name | Changbao Wu |
| authorships[9].author_position | last |
| authorships[9].raw_author_name | Changbao Wu |
| authorships[9].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2010.09879 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Modelling Complex Survey Data Using R, SAS, SPSS and Stata: A Comparison Using CLSA Datasets |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10866 |
| primary_topic.field.id | https://openalex.org/fields/27 |
| primary_topic.field.display_name | Medicine |
| primary_topic.score | 0.6305999755859375 |
| primary_topic.domain.id | https://openalex.org/domains/4 |
| primary_topic.domain.display_name | Health Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2739 |
| primary_topic.subfield.display_name | Public Health, Environmental and Occupational Health |
| primary_topic.display_name | Nutritional Studies and Diet |
| related_works | https://openalex.org/W4389426664, https://openalex.org/W2988220364, https://openalex.org/W2347669405, https://openalex.org/W3203739433, https://openalex.org/W2763172433, https://openalex.org/W2066947074, https://openalex.org/W3126000805, https://openalex.org/W3123518769, https://openalex.org/W1581783538, https://openalex.org/W2057430366 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2010.09879 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2010.09879 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2010.09879 |
| primary_location.id | pmh:oai:arXiv.org:2010.09879 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2010.09879 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2010.09879 |
| publication_date | 2020-10-19 |
| publication_year | 2020 |
| referenced_works | https://openalex.org/W2024766342, https://openalex.org/W2188128005 |
| referenced_works_count | 2 |
| abstract_inverted_index.R | 1, 53, 63, 112 |
| abstract_inverted_index.a | 42, 117 |
| abstract_inverted_index.It | 95 |
| abstract_inverted_index.We | 60 |
| abstract_inverted_index.as | 34, 77, 116 |
| abstract_inverted_index.in | 17 |
| abstract_inverted_index.is | 96 |
| abstract_inverted_index.of | 20, 47, 79 |
| abstract_inverted_index.on | 45, 71, 92 |
| abstract_inverted_index.to | 9, 29, 107 |
| abstract_inverted_index.The | 0 |
| abstract_inverted_index.and | 12, 23, 37, 58, 65, 82, 113 |
| abstract_inverted_index.are | 26 |
| abstract_inverted_index.due | 8 |
| abstract_inverted_index.for | 67, 119 |
| abstract_inverted_index.has | 3 |
| abstract_inverted_index.its | 10 |
| abstract_inverted_index.the | 18, 52, 88, 99 |
| abstract_inverted_index.SAS, | 35, 56 |
| abstract_inverted_index.SPSS | 36, 57 |
| abstract_inverted_index.This | 39 |
| abstract_inverted_index.also | 114 |
| abstract_inverted_index.data | 50, 109 |
| abstract_inverted_index.from | 87 |
| abstract_inverted_index.more | 27 |
| abstract_inverted_index.such | 33, 76 |
| abstract_inverted_index.that | 98 |
| abstract_inverted_index.Aging | 93 |
| abstract_inverted_index.Study | 91 |
| abstract_inverted_index.among | 6, 103 |
| abstract_inverted_index.carry | 108 |
| abstract_inverted_index.codes | 64 |
| abstract_inverted_index.hoped | 97 |
| abstract_inverted_index.means | 81 |
| abstract_inverted_index.other | 68 |
| abstract_inverted_index.paper | 40, 100 |
| abstract_inverted_index.using | 51, 85, 111, 122 |
| abstract_inverted_index.Stata. | 38, 59 |
| abstract_inverted_index.become | 4 |
| abstract_inverted_index.fields | 19 |
| abstract_inverted_index.health | 22, 48, 104 |
| abstract_inverted_index.public | 21 |
| abstract_inverted_index.serves | 115 |
| abstract_inverted_index.survey | 49, 54 |
| abstract_inverted_index.(CLSA). | 94 |
| abstract_inverted_index.nature. | 14 |
| abstract_inverted_index.popular | 5 |
| abstract_inverted_index.science | 105 |
| abstract_inverted_index.studies | 25 |
| abstract_inverted_index.Canadian | 89 |
| abstract_inverted_index.However, | 15 |
| abstract_inverted_index.analysis | 46, 110, 121 |
| abstract_inverted_index.commonly | 72 |
| abstract_inverted_index.cookbook | 118 |
| abstract_inverted_index.datasets | 86 |
| abstract_inverted_index.describe | 61 |
| abstract_inverted_index.detailed | 62 |
| abstract_inverted_index.interest | 102 |
| abstract_inverted_index.package, | 55 |
| abstract_inverted_index.packages | 70 |
| abstract_inverted_index.provides | 41 |
| abstract_inverted_index.software | 2, 69, 124 |
| abstract_inverted_index.analyses, | 75 |
| abstract_inverted_index.analysis, | 84 |
| abstract_inverted_index.customary | 28 |
| abstract_inverted_index.different | 123 |
| abstract_inverted_index.packages. | 125 |
| abstract_inverted_index.softwares | 32 |
| abstract_inverted_index.commercial | 30 |
| abstract_inverted_index.comparison | 44 |
| abstract_inverted_index.estimation | 78 |
| abstract_inverted_index.population | 80 |
| abstract_inverted_index.procedures | 66 |
| abstract_inverted_index.regression | 83 |
| abstract_inverted_index.stimulates | 101 |
| abstract_inverted_index.encountered | 73 |
| abstract_inverted_index.flexibility | 11 |
| abstract_inverted_index.open-source | 13 |
| abstract_inverted_index.researchers | 7, 16, 106 |
| abstract_inverted_index.statistical | 31, 74, 120 |
| abstract_inverted_index.Longitudinal | 90 |
| abstract_inverted_index.comprehensive | 43 |
| abstract_inverted_index.epidemiological | 24 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 10 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/3 |
| sustainable_development_goals[0].score | 0.6100000143051147 |
| sustainable_development_goals[0].display_name | Good health and well-being |
| citation_normalized_percentile | |
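
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each word to the zero-based positions where it occurs. A minimal Python sketch of turning such a map back into running text is given below, assuming the index has already been loaded into a dictionary. The function name `rebuild_abstract` is hypothetical, and the sample dictionary is only an excerpt of the rows above, truncated to the first six positions.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a word -> list-of-positions mapping."""
    word_at = {}
    for word, positions in inverted_index.items():
        for position in positions:
            word_at[position] = word
    # Emit the words in position order; missing positions are skipped.
    return " ".join(word_at[position] for position in sorted(word_at))


# Excerpt of the payload rows (positions 0 to 5 only), for illustration.
sample_index = {
    "The": [0],
    "R": [1],
    "software": [2],
    "has": [3],
    "become": [4],
    "popular": [5],
}

rebuilt = rebuild_abstract(sample_index)
print(rebuilt)  # The R software has become popular
```

Applied to the full set of rows, the same function should recover the abstract text as originally indexed, punctuation included, since tokens such as `SAS,` and `(CLSA).` are stored with their punctuation attached.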